Simplify regex patterns in magic files to improve performance #10
sachintu47 merged 3 commits into main
🤖 Augment PR Summary
Summary: This PR updates the
Changes:
Technical Notes: The endian fix writes GUID fields in a defined little-endian byte layout and prints them via explicit byte-order reconstruction, which should improve correctness on big-endian platforms.
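To illustrate the idea in the technical note, here is a minimal sketch of writing GUID fields in a fixed little-endian layout and rebuilding the integers byte by byte when printing. It is not the PR's code: the field split (32-bit, 16-bit, 16-bit, 8 raw bytes) is the conventional GUID layout and is assumed here, and the function names are hypothetical.

```python
import struct


def pack_guid_le(data1, data2, data3, data4):
    """Serialize GUID fields with an explicit little-endian layout.

    Assumed layout: u32, u16, u16, then 8 raw bytes.
    """
    return struct.pack("<IHH8s", data1, data2, data3, data4)


def le_uint(buf):
    """Rebuild an unsigned integer from little-endian bytes using shifts.

    Shifting byte by byte gives the same result on any host, unlike
    reinterpreting the buffer in the host's native byte order.
    """
    value = 0
    for i, byte in enumerate(buf):
        value |= byte << (8 * i)
    return value


def format_guid(raw):
    """Format 16 raw bytes (layout as in pack_guid_le) as a GUID string."""
    d1 = le_uint(raw[0:4])
    d2 = le_uint(raw[4:6])
    d3 = le_uint(raw[6:8])
    d4 = raw[8:16]
    return (
        f"{d1:08X}-{d2:04X}-{d3:04X}-"
        f"{d4[:2].hex().upper()}-{d4[2:].hex().upper()}"
    )


raw = pack_guid_le(0x00112233, 0x4455, 0x6677, bytes(range(0x88, 0x90)))
print(format_guid(raw))
```

Because every multi-byte field is reconstructed from individual bytes, the printed value should not depend on whether the host is little-endian or big-endian.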
## Required IF ZOPEN_BUILD_LINE='STABLE'
export ZOPEN_STABLE_URL="http://ftp.astron.com/pub/file/file-${FILE_VERSION}.tar.gz" # Specify the stable build URL (either git or tarball)
-export ZOPEN_STABLE_DEPS="autoconf automake libtool make check_python zlib gzip tar perl m4 openssl" # Specify the stable build dependencies.
+export ZOPEN_STABLE_DEPS="make check_python zlib perl" # Specify the stable build dependencies.
buildenv:19 reduces ZOPEN_STABLE_DEPS to "make check_python zlib perl", but the stable source is a .tar.gz fetched from ZOPEN_STABLE_URL. If the zopen build environment doesn't always provide tar and gzip (and any download tooling) implicitly, this change could make stable builds fail during fetch or extract.
Severity: medium
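One way to confirm the concern is to check, on a clean build host, whether the tools dropped from the dependency list are still reachable on PATH. The sketch below is only an assumption about how such a check could look: `missing_tools` is a hypothetical helper, and `curl` stands in for whatever download tool the environment actually uses.

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the names in `tools` that `which` cannot locate on PATH."""
    return [tool for tool in tools if which(tool) is None]


# Tools assumed (not confirmed by the PR) to be needed to fetch and
# unpack a .tar.gz source archive.
needed = ["tar", "gzip", "curl"]
print("missing:", missing_tools(needed))
```

An empty list would suggest the environment provides these tools implicitly. Anything listed probably needs to stay in ZOPEN_STABLE_DEPS.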
# class name[(base classes,)]: [pass]
0 search/8192 class
->0 regex \^class\ [_[:alpha:]]+(\\(.*\\))?(\ )*:([\ \t]+pass)?$ Python script text executable
+>0 regex \^class\ [_[:alpha:]]+(\([^)]*\))?(\ )*:([\ \t]+pass)?$ Python script text executable
patches/regex.patch:49 changes the Python class regex from a permissive \(.*\) to \([^)]*\), which will no longer match valid bases containing ) (e.g., class A(factory()):). That’s a behavior change in type detection, so it may be worth confirming the reduced coverage is acceptable for the performance win.
Other locations where this applies: patches/regex.patch:10
Severity: low
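The coverage difference can be reproduced with a small sketch. The patterns below are translations into Python's `re` syntax, not the exact magic-file patterns: `[[:alpha:]]` is approximated as `[A-Za-z]` and the magic-file escaping is dropped, so this shows the matching logic only, not libmagic's regex engine.

```python
import re

# Python-`re` translations of the old and new patterns (an approximation).
OLD = re.compile(r"^class [_A-Za-z]+(\(.*\))?( )*:([ \t]+pass)?$", re.MULTILINE)
NEW = re.compile(r"^class [_A-Za-z]+(\([^)]*\))?( )*:([ \t]+pass)?$", re.MULTILINE)

samples = [
    "class A:",
    "class A(Base):",
    "class A(factory()):",  # base list contains a ')'
]

for line in samples:
    old_hit = bool(OLD.search(line))
    new_hit = bool(NEW.search(line))
    print(f"{line!r}: old={old_hit} new={new_hit}")
```

Only the last sample differs: `[^)]*` stops at the first `)`, so the remaining `)` prevents the `:` from matching.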
# But class alone is reduced to avoid beating php (Jens Schleusener)
0 search/8192 class
->0 regex \^[[:space:]]*class[[:space:]]+[[:digit:][:alpha:]:_]+[[:space:]]*\\{(.*[\n]*)*\\}(;)?$ C++ source text
+>0 regex \^[[:space:]]*class[[:space:]]+[[:digit:][:alpha:]:_]+[[:space:]]*\\{[^}]*\\}(;)?$ C++ source text
patches/regex.patch:10: The new C++ pattern uses [^}]*, but libmagic compiles regexes with REG_NEWLINE, where a negated character class won’t match newlines; this may stop matching typical multi-line class { ... } blocks that the previous (...[\n]*...)* form handled. Consider confirming this doesn’t noticeably reduce C++ source detection coverage in practice.
Severity: medium
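A rough emulation of the newline behaviour, for anyone wanting to reason about it without building libmagic. Python's `re` has no REG_NEWLINE flag and its negated classes do match newlines, so the POSIX rule is spelled out here by adding `\n` to the negated class. The character classes are approximated as well, and the pattern names are hypothetical.

```python
import re

HEAD = r"^\s*class\s+[0-9A-Za-z:_]+\s*\{"
TAIL = r"\}(;)?$"

# Python default: a negated class also matches newlines.
PY_DEFAULT = re.compile(HEAD + r"[^}]*" + TAIL, re.MULTILINE)
# Emulated REG_NEWLINE: the negated class must not cross a newline.
POSIX_LIKE = re.compile(HEAD + r"[^}\n]*" + TAIL, re.MULTILINE)

single_line = "class Foo { int x; };"
multi_line = "class Foo {\n  int x;\n};"

for name, text in [("single-line", single_line), ("multi-line", multi_line)]:
    default_hit = bool(PY_DEFAULT.search(text))
    posix_hit = bool(POSIX_LIKE.search(text))
    print(f"{name}: default={default_hit} posix_like={posix_hit}")
```

Under the emulated rule only the single-line class body matches, which is the reduced coverage described above. Running the patched magic file against real multi-line C++ sources would be the authoritative check.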
No description provided.