Spam filter to work with flags from Rspamd and bogofilter#300
Spam filter to work with flags from Rspamd and bogofilter#300Titan-C wants to merge 1 commit intoafewmail:masterfrom
Conversation
Codecov Report
@@ Coverage Diff @@
## master #300 +/- ##
==========================================
- Coverage 47.08% 46.86% -0.22%
==========================================
Files 30 30
Lines 1079 1084 +5
==========================================
Hits 508 508
- Misses 571 576 +5
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
flokli
left a comment
There was a problem hiding this comment.
This also needs updates to the documentation, describing the behaviour, and a test (especially considering this changes HeaderMatchingFilter, which is used by users directly, and in ListMailsFilter too.
| self.header = [self.header] | ||
|
|
||
| try: | ||
| value = next(filter(None, map(message.get_header, self.header))) |
There was a problem hiding this comment.
This is pretty hard to read. I assume it'll call message.get_header() with every header configured in the filter, and filter out results that are an empty string (which can be the case if one of the headers doesn't exist)?
What's the next() in front of everything doing here? Why is value still a single element? What happens if the filter returns multiple elements?
There was a problem hiding this comment.
Yes, you read it right.
value is a single element, the first matching header. This is because the HeaderMatchingFilter only matches a string, it only has a pattern to match a string. Up until now the filter only matches to one header. For spam, as there are many spam flagging tools, you get different headers depending on the tool, thus you need to be able to search for many headers.
I assume, sysadmins don't run their many spam filters at one, they just pick a tool. Thus I pick the first not empty match(that is what next does), which might be the only match. Later I just let the the same logic of the filter run its way and match the pattern to the string. There I use the benefits of the regex. Spamassasing and Rspamd Mark with YES, their spam, bogofilter uses Spam.
There was a problem hiding this comment.
While it makes sense in the scope of SpamFilter, this changes semantics in HeaderMatchingFilter in a non-intuitive and undocumented way.
Can we expose this in HeaderMatchingFilter in a less confusing way, and have it documented?
I started using bogofilter as it is low on resources, and enough for my email load. I realized I was missing other header matching options for spam.
This also Closes #299, for Rspamd
P.S. Happy if you also give this PR a hacktoberfest-accepted label