
Evaluate rules on our own #418

Open
yhabteab wants to merge 12 commits into main from evaluate-rules-on-our-own

Conversation

@yhabteab
Member

@yhabteab yhabteab commented Apr 30, 2026

This PR more or less reverts the changes made in v0.2.0, with a slight twist. We will evaluate the rules on our own again (just like before v0.2.0), but the filter string is expected to have a different format. Icinga Notifications Web will now store the filter as a stringified JSON object instead of the simple filter string format used before. This is allegedly done to make the filter easier to parse, but the previous filter string parser stays, as it is still used by the escalation filters. The new filter format is as follows:

{
  "op": "&/|/!",
  "rules": [
    {
      "column": "service.state",
      "op": "==/!=/>/</~/!~",
      "value": "OK"
    },
    {
      "op": "&/|/!",
      "rules": [
        ...
      ]
    }
  ]
}

The JSON object for filter conditions may include other meta information used by Icinga Notifications Web/Sources, but the only part relevant for us is the format shown above. The op field of a filter chain is the operator used to combine its rules and can be either & (AND), | (OR) or ! (NOT). The rules field is an array whose entries are either simple conditions or nested filter chains. A simple condition consists of a column, an op and a value. The column is a JSONPath expression used to extract values from requests, and the op is the operator used to compare the JSONPath result with the provided value: one of ==, !=, >, <, ~ (regex match) or !~ (regex not match). If the column is not a valid JSONPath expression, Icinga Notifications will ignore that rule and won't load it from the database. This means that existing filter strings in the old format won't be loaded, and you will have to ask the web colleagues whether they provide a migration step for them.
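To make the format described above more concrete, here is a minimal sketch of how such a filter could be decoded and evaluated. It is not the code from this PR: the `Node` type, `Parse`, and `Eval` are hypothetical names, values are assumed to be JSON strings, a plain map lookup stands in for real JSONPath evaluation, and treating `!` as "none of the rules match" is an assumption as well.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"regexp"
	"strconv"
)

// Node is either a filter chain (Rules set) or a simple condition (Column set).
// Unknown meta fields in the JSON are silently ignored by encoding/json.
type Node struct {
	Op     string  `json:"op"`
	Rules  []*Node `json:"rules,omitempty"`
	Column string  `json:"column,omitempty"`
	Value  string  `json:"value,omitempty"` // assumption: values are always strings
}

// Parse decodes a stringified JSON filter into a tree of nodes.
func Parse(raw string) (*Node, error) {
	var n Node
	if err := json.Unmarshal([]byte(raw), &n); err != nil {
		return nil, err
	}
	return &n, nil
}

// Eval evaluates the filter against a flat attribute map.
// The map lookup is a stand-in for evaluating Column as a JSONPath expression.
func (n *Node) Eval(attrs map[string]string) (bool, error) {
	if n.Column != "" {
		return n.evalCondition(attrs)
	}
	switch n.Op {
	case "&": // AND: every rule must match
		for _, r := range n.Rules {
			if ok, err := r.Eval(attrs); err != nil || !ok {
				return false, err
			}
		}
		return true, nil
	case "|": // OR: at least one rule must match
		for _, r := range n.Rules {
			if ok, err := r.Eval(attrs); err != nil || ok {
				return ok, err
			}
		}
		return false, nil
	case "!": // NOT: assumed to mean that none of the rules match
		for _, r := range n.Rules {
			if ok, err := r.Eval(attrs); err != nil || ok {
				return false, err
			}
		}
		return true, nil
	}
	return false, fmt.Errorf("unknown chain operator %q", n.Op)
}

func (n *Node) evalCondition(attrs map[string]string) (bool, error) {
	got, ok := attrs[n.Column]
	if !ok {
		return false, nil // assumption: a missing attribute never matches
	}
	switch n.Op {
	case "==":
		return got == n.Value, nil
	case "!=":
		return got != n.Value, nil
	case "~", "!~":
		matched, err := regexp.MatchString(n.Value, got)
		if err != nil {
			return false, err
		}
		return matched == (n.Op == "~"), nil
	case ">", "<":
		a, errA := strconv.ParseFloat(got, 64)
		b, errB := strconv.ParseFloat(n.Value, 64)
		if err := errors.Join(errA, errB); err != nil {
			return false, err
		}
		if n.Op == ">" {
			return a > b, nil
		}
		return a < b, nil
	}
	return false, fmt.Errorf("unknown comparison operator %q", n.Op)
}

func main() {
	raw := `{"op":"&","rules":[{"column":"service.state","op":"==","value":"OK"},` +
		`{"op":"|","rules":[{"column":"host.name","op":"~","value":"^web"}]}]}`
	filter, err := Parse(raw)
	if err != nil {
		panic(err)
	}
	matched, err := filter.Eval(map[string]string{"service.state": "OK", "host.name": "web-01"})
	fmt.Println(matched, err)
}
```

Using one node type for both chains and conditions keeps the recursive decoding trivial; a real implementation would likely validate each column as a JSONPath expression at load time and reject the rule otherwise, as described above.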

Furthermore, prior to v0.2.0 we evaluated the rules against the Object representation of the request; now they are evaluated against the Event type itself. Specifically, the JSONPath expressions are evaluated against the new relations field of the Event type, which is a map of generic key-value pairs extracted from the request. When a source doesn't include that field with all the necessary information in its requests, some or even all of the configured rules might not match if their object_filter references the missing information. If the source wishes to be told about the missing information, it can set the new X-Icinga-Enable-Attributes-Negotiation HTTP header to true in its requests, and Icinga Notifications will respond with a 422 Unprocessable Entity status code (not final; still up for discussion) and a JSON body containing the missing attributes. This is a new feature, but it is similar to the behavior introduced in v0.2.0, where we responded with a 412 Precondition Failed status code when the used rules_version was outdated. The source can then extend its request with the missing information and retry it.

In the referenced issue, it was mentioned that sources might only re-send the missing information as a follow-up request instead of retrying the entire request. However, it's not worth implementing this behavior just for the sake of this single use case as it would require a lot of additional work and complexity on both the source and Icinga Notifications sides.

Lastly, in a discussion with @nilmerg yesterday, we agreed that Icinga Notifications should abandon all the effects of the ongoing request if it can't successfully notify the source that the request was processed. Currently, the incident starts a database transaction internally and commits it before trying to send the resulting notifications. However, the client will only receive 200 OK if the notifications were sent successfully, and even that response might not reach the source, since the connection might be gone in the meantime (write errors are blatantly ignored).

l.logger.Infow("Successfully processed event", zap.String("event", ev.String()))
w.WriteHeader(http.StatusAccepted)
_, _ = fmt.Fprintln(w, "event processed successfully")
_, _ = fmt.Fprintln(w)

if err = tx.Commit(); err != nil {
	i.logger.Errorw("Cannot commit db transaction", zap.Error(err))
	return err
}
// We've just committed the DB transaction and can safely update the incident muted flag.
i.isMuted = i.Object.IsMuted()
return i.notifyContacts(ctx, ev, notifications)

In that case, the request has been processed but the source was never told so, which is not ideal and will definitely cause issues once we add HA support. Icinga Notifications treats every event as unique and has no mechanism to detect duplicates, so if the source retries the request, it will be processed again and might cause unwanted side effects. To avoid that, we will need to roll back all the changes made by the request if we fail to notify the source about its successful processing. I didn't implement this behavior in this PR, as I see it as a separate task for a follow-up PR, but if you think it should be included here, please let me know and I will add it. On second thought, forget that: this is really something that should be done after or while implementing HA support.

resolves #406

@cla-bot cla-bot Bot added the cla/signed CLA is signed by all contributors of a PR label Apr 30, 2026
@yhabteab yhabteab added this to the 1.0 milestone Apr 30, 2026
@yhabteab
Member Author

yhabteab commented Apr 30, 2026

TODO

@yhabteab yhabteab requested a review from oxzi April 30, 2026 12:54
yhabteab added 12 commits April 30, 2026 16:22
Since the event rules are dependent on their corresponding sources, we
don't need to add another layer of indirection via an extra "source ->
IDs" cache in the RuntimeConfig. Instead, we can directly store the rule
IDs within the Source struct, binding them to the source's lifecycle.
This effectively replaces the previous (prior to v0.2.0) implementation on the `Object` type.
@yhabteab yhabteab force-pushed the evaluate-rules-on-our-own branch from 3a06180 to f2e3457 Compare April 30, 2026 14:22

Labels

cla/signed CLA is signed by all contributors of a PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Evaluate rules on our own again, or in tandem

1 participant