- Jan Vašák, xvasak01@vutbr.cz, 2024
This is a prototype of a regex matcher based on register set automata [1].
To run use:
python rsa-matcher [-h] [-d] [-f FILE] pattern
    pattern    regex to be matched against the input data
    -h         print help and exit
    -d         don't determinise ahead of time (determinise the regex for each input line)
    -f FILE    specify the file to read from
The program then reads lines from stdin (or FILE if specified) and prints out every line that matches the pattern.
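The command line described above could be parsed roughly as follows. This is a minimal sketch using the standard argparse module; the function name build_parser and the attribute names no_predeterminise and file are assumptions made for illustration, and the actual script may be organised differently.

```python
import argparse


def build_parser():
    """Build a parser mirroring: rsa-matcher [-h] [-d] [-f FILE] pattern."""
    parser = argparse.ArgumentParser(
        prog="rsa-matcher",
        description="Print every input line that matches the pattern.",
    )
    # The -h option is added by argparse automatically.
    parser.add_argument(
        "-d",
        dest="no_predeterminise",  # hypothetical attribute name
        action="store_true",
        help="don't determinise ahead of time",
    )
    parser.add_argument(
        "-f",
        dest="file",
        metavar="FILE",
        default=None,
        help="file to read from (stdin if omitted)",
    )
    parser.add_argument("pattern", help="regex to be matched to input data")
    return parser


# Example invocation with an explicit argument list instead of sys.argv.
args = build_parser().parse_args(["-d", "-f", "input.txt", "a*b"])
print(args.no_predeterminise, args.file, args.pattern)
```

Passing an explicit list to parse_args keeps the sketch runnable without a real command line; a script would call parse_args() with no arguments.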
Because rsaregex has methods for drawing the automata using graphviz, it requires graphviz itself (sudo apt-get install graphviz on Linux) and its Python library (pip install graphviz).
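Whether both graphviz prerequisites are in place can be checked with the standard library alone. The sketch below is not part of rsaregex; the function name graphviz_status is a hypothetical helper. It looks for the dot executable that ships with graphviz and for an importable graphviz Python module.

```python
import importlib.util
import shutil


def graphviz_status():
    """Report whether the graphviz binary and its Python library are present."""
    return {
        # `dot` is the layout program shipped with graphviz itself.
        "dot_executable": shutil.which("dot") is not None,
        # `graphviz` is the import name of the Python library from pip.
        "python_library": importlib.util.find_spec("graphviz") is not None,
    }


for name, present in graphviz_status().items():
    print(f"{name}: {'found' if present else 'missing'}")
```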
The rsaregex package implements RsA-based regex matching and also a representation of register (set) automata
as the classes RsA, DRsA, and NRA.
It also provides the function draw_automaton, which draws a specified automaton into a PDF file using graphviz.
For regex matching use either
- drsa = rsaregex.create_rsa(pattern) to create the DRsA and then result = drsa.run_word(input) to match the input against the pattern, or
- result = rsaregex.match(pattern, input) to do the above in one operation (not recommended for repeated matching).
Beware that result might be -1 if the pattern cannot be determinised.
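The calls above fit a loop that builds the automaton once and reuses it for every line. The following is a sketch under stated assumptions, not the package's own code: it assumes that run_word returns a truthy value on a match and a falsy value otherwise, which this README does not specify. So that it runs without rsaregex installed, it uses StandInMatcher, a hypothetical stand-in backed by the re module with full-match semantics (also an assumption); filter_matching is likewise an illustrative helper.

```python
import re


def filter_matching(lines, run_word):
    """Yield the lines that run_word reports as matching.

    Assumption: run_word returns a truthy value on a match, a falsy value
    otherwise, and -1 when the pattern cannot be determinised.
    """
    for line in lines:
        result = run_word(line)
        # -1 is truthy in Python, so test for it before the truth check.
        if result == -1:
            raise ValueError("pattern cannot be determinised")
        if result:
            yield line


class StandInMatcher:
    """Hypothetical stand-in for a DRsA, backed by the re module."""

    def __init__(self, pattern):
        self._compiled = re.compile(pattern)  # built once, reused per line

    def run_word(self, word):
        return self._compiled.fullmatch(word) is not None


# With rsaregex installed, the stand-in would be replaced by:
#   drsa = rsaregex.create_rsa(pattern)
#   matching = list(filter_matching(lines, drsa.run_word))
matcher = StandInMatcher("ab*")
print(list(filter_matching(["a", "abb", "ba"], matcher.run_word)))
```

Checking for -1 explicitly matters because a plain truth test would treat it as a match.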
[1] Gulčíková, S. and Lengál, O. Register Set Automata (Technical Report). arXiv. 2022. DOI: 10.48550/ARXIV.2205.12114. Available at: https://arxiv.org/abs/2205.12114