-
Notifications
You must be signed in to change notification settings - Fork 111
[k2] refactor pcre2 functions #1501
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…atched_as_null> dump_matches
a0b27ae to
5082d72
Compare
5082d72 to
b7c8ac8
Compare
|
|
||
| static constexpr size_t MAX_SUBPATTERNS_COUNT{512}; | ||
|
|
||
| struct compiled_regex { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should not be private as it's used in the public API
| const regex_pcre2_compile_context_t compile_context; | ||
| const regex_pcre2_match_context_t match_context; | ||
| regex_pcre2_match_data_t regex_pcre2_match_data; | ||
| const kphp::pcre2::regex_general_context_t regex_pcre2_general_context; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks odd having these members const while they are used in a non-const contexts, e.g. pcre2_match_8
| const kphp::pcre2::regex_match_context_t match_context; | ||
| kphp::pcre2::regex_match_data_t regex_pcre2_match_data; | ||
|
|
||
| RegexInstanceState() noexcept; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about moving the constructor definition into this header?
| #include "runtime-light/coroutine/task.h" | ||
| #include "runtime-light/coroutine/type-traits.h" | ||
| #include "runtime-light/stdlib/diagnostics/logs.h" | ||
| #include "runtime-light/stdlib/string/regex-include.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add a comment that this include is actually used
| struct info final { | ||
| const string& regex; | ||
| const string& subject; | ||
| string replacement; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reason replacement's type differs from regex and subject ones?
| using reference = kphp::pcre2::group_name; | ||
|
|
||
| group_name_iterator() = delete; | ||
| group_name_iterator(const PCRE2_UCHAR8* current_entry, size_t entry_size) noexcept |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me that:
- this API can be made more safe by at least checking if the pointer is nullptr
- this API should not be public
| m_entry_size{entry_size} {} | ||
|
|
||
| kphp::pcre2::group_name operator*() const noexcept { | ||
| enum class index_bytes { upper, lower, count }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it a really good use case for enums? I think it can be simplified a lot with simple constants
|
|
||
| class group_name_iterator { | ||
| const PCRE2_UCHAR8* m_ptr{nullptr}; | ||
| size_t m_entry_size{}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be marked const I think
| } | ||
|
|
||
| bool operator==(const group_name_iterator& other) const noexcept { | ||
| return m_ptr == other.m_ptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we need to also check m_entry_size?
| } | ||
| }; | ||
|
|
||
| group_name_range names() const noexcept { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe rename to group_names? I think names is meaningless here
In this PR I'm adding pcre2-functions.h and pcre2-functions.cpp, into which I've extracted all PCRE2 specifics.