Alerting
Logsenta supports real-time alerting when error patterns are detected in your Kubernetes pod logs. Alerts can be sent via email (SMTP) and/or webhooks (Slack, Microsoft Teams, PagerDuty, or custom endpoints).
- How Alerting Works
- Enabling Alerting
- Threshold Configuration
- Email Alerting
- Webhook Alerting
- Multiple Alert Rules (Pattern-Based Routing)
- Configuration Reference
- Examples
┌─────────────────────────────────────────────────────────────────────────┐
│ Logsenta Alerting Flow │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ Pod Logs ──► Error Pattern ──► Alert Tracker ──► Threshold Check │
│ Detected (count) (reached?) │
│ │ │ │
│ │ ┌────┴────┐ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ Increment No Alert Send Alert │
│ Counter │ │
│ │ │
│ ┌─────────────────┼────┐ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ Email Webhooks │
│ (SMTP) (Slack/Teams/PD) │
│ │
└─────────────────────────────────────────────────────────────────────────┘
The alerting system uses a threshold-based approach to prevent alert fatigue:
- Error Detection: When an error pattern is detected in pod logs
- Tracking: The AlertTracker records the occurrence with a timestamp
- Window Check: Old occurrences outside the time window are removed
- Threshold Check: If occurrences ≥ threshold count within the window, trigger alert
- Cooldown: After alerting, wait for cooldown period before re-alerting the same pattern
Example: with `count: 2`, `windowSeconds: 300`, and `cooldownSeconds: 600`:
- First error at 10:00:00 → No alert (count: 1)
- Second error at 10:03:00 → Alert triggered! (count: 2 within 5 min)
- Third error at 10:05:00 → No alert (in cooldown until 10:13:00)
- Fourth error at 10:15:00 → No alert (cooldown expired, but this is the first error in a new window)
- Fifth error at 10:16:00 → Alert triggered again (count: 2 within the new window)
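The windowed-threshold logic above can be sketched in Python. This is an illustrative model, not Logsenta's actual `AlertTracker` implementation; the class name, method names, and the choice to start a fresh window after alerting are assumptions.

```python
import time

class AlertTracker:
    """Illustrative sliding-window threshold tracker (not Logsenta's real class)."""

    def __init__(self, count=2, window_seconds=300, cooldown_seconds=600):
        self.count = count
        self.window = window_seconds
        self.cooldown = cooldown_seconds
        self.occurrences = {}   # pattern -> list of occurrence timestamps
        self.last_alert = {}    # pattern -> timestamp of last alert sent

    def record(self, pattern, now=None):
        """Record one error occurrence; return True if an alert should fire."""
        now = time.time() if now is None else now
        # Window check: drop occurrences that fell outside the sliding window
        hits = [t for t in self.occurrences.get(pattern, []) if now - t < self.window]
        hits.append(now)
        self.occurrences[pattern] = hits
        # Cooldown: suppress re-alerting the same pattern too soon
        if now - self.last_alert.get(pattern, float("-inf")) < self.cooldown:
            return False
        # Threshold check
        if len(hits) >= self.count:
            self.last_alert[pattern] = now
            self.occurrences[pattern] = []  # start a fresh window (illustrative choice)
            return True
        return False
```

Replaying the example timeline (`count=2`, `window=300`, `cooldown=600`, times in seconds from 10:00:00): errors at 0 s and 180 s trigger an alert, the 300 s error is swallowed by the cooldown, and a second alert needs two fresh errors after the cooldown expires.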
Enable alerting globally in your `values.yaml`:

alerting:
  enabled: true

Or via Helm install:
helm install logsenta ./charts/logsenta-engine \
  --set alerting.enabled=true

| Parameter | Description | Default |
|---|---|---|
| `alerting.threshold.count` | Minimum errors before alerting | `2` |
| `alerting.threshold.windowSeconds` | Time window for counting errors | `300` (5 min) |
| `alerting.threshold.cooldownSeconds` | Cooldown before re-alerting the same pattern | `600` (10 min) |
| `alerting.threshold.groupByNamespace` | Group alerts by namespace | `true` |
| `alerting.threshold.groupByPod` | Group alerts by pod name | `false` |
alerting:
  enabled: true
  threshold:
    count: 2               # Alert after 2 errors
    windowSeconds: 300     # Within 5 minutes
    cooldownSeconds: 600   # Wait 10 min before re-alerting
    groupByNamespace: true
    groupByPod: false

- `groupByNamespace: true` (default): Errors from different namespaces are tracked separately
- `groupByPod: true`: Errors from different pods are tracked separately (more granular)
| groupByNamespace | groupByPod | Behavior |
|---|---|---|
| true | false | Same pattern in same namespace = single alert |
| true | true | Same pattern in same pod = single alert |
| false | false | Same pattern anywhere = single alert |
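The grouping options boil down to which fields are included in the key under which occurrences are counted. A minimal sketch — the key layout is an assumption for illustration, not Logsenta's internal representation:

```python
def alert_group_key(pattern, namespace, pod,
                    group_by_namespace=True, group_by_pod=False):
    """Build the tracking key under which error occurrences are counted.

    Illustrative only: the "/"-joined layout is an assumption, but the
    inclusion rules mirror the table above.
    """
    parts = [pattern]
    if group_by_namespace:
        parts.append(namespace)
    if group_by_pod:
        parts.append(pod)
    return "/".join(parts)
```

With the defaults, the same pattern in two namespaces produces two independent counters; with both flags off, every occurrence anywhere feeds one counter.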
Send alerts via SMTP email.
| Parameter | Description | Default |
|---|---|---|
| `alerting.email.enabled` | Enable email alerts | `false` |
| `alerting.email.smtpHost` | SMTP server hostname | `smtp.gmail.com` |
| `alerting.email.smtpPort` | SMTP server port | `587` |
| `alerting.email.useTls` | Use STARTTLS | `true` |
| `alerting.email.useSsl` | Use SSL (for port 465) | `false` |
| `alerting.email.username` | SMTP username | `""` |
| `alerting.email.password` | SMTP password | `""` |
| `alerting.email.fromAddress` | Sender email address | `""` |
| `alerting.email.toAddresses` | Recipient email addresses | `[]` |
| `alerting.email.ccAddresses` | CC email addresses | `[]` |
| `alerting.email.subjectPrefix` | Email subject prefix | `[Logsenta Alert]` |
alerting:
  enabled: true
  email:
    enabled: true
    smtpHost: "smtp.gmail.com"
    smtpPort: 587
    useTls: true
    username: "alerts@yourcompany.com"
    password: "your-app-password"   # Use an App Password for Gmail
    fromAddress: "logsenta@yourcompany.com"
    toAddresses:
      - "sre-team@yourcompany.com"
      - "oncall@yourcompany.com"
    ccAddresses:
      - "devops-lead@yourcompany.com"
    subjectPrefix: "[PROD Alert]"

For Gmail, you need to use an App Password:
- Enable 2-Factor Authentication on your Google account
- Go to Security → App Passwords
- Generate a new App Password for "Mail"
- Use this password in `alerting.email.password`
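The email settings above map onto a standard STARTTLS mail submission. The following is a hedged sketch of how such an alert could be composed and sent with Python's `smtplib`; the function names and the message layout are illustrative assumptions, not Logsenta's internals:

```python
import smtplib
from email.message import EmailMessage

def build_alert_email(pattern, namespace, pod, error_line,
                      from_addr, to_addrs, subject_prefix="[Logsenta Alert]"):
    """Compose an alert email; the field layout is an illustrative guess."""
    msg = EmailMessage()
    msg["Subject"] = f"{subject_prefix} {pattern} in {namespace}/{pod}"
    msg["From"] = from_addr
    msg["To"] = ", ".join(to_addrs)
    msg.set_content(
        f"Pattern: {pattern}\nNamespace: {namespace}\nPod: {pod}\n\n{error_line}"
    )
    return msg

def send_alert(msg, host="smtp.gmail.com", port=587, username="", password=""):
    """Deliver via SMTP with STARTTLS (corresponds to useTls: true on port 587)."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.starttls()
        smtp.login(username, password)
        smtp.send_message(msg)
```

For `useSsl: true` on port 465, `smtplib.SMTP_SSL` would be used instead of `SMTP` + `starttls()`.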
Send alerts to Slack, Microsoft Teams, PagerDuty, or custom endpoints.
| Parameter | Description | Default |
|---|---|---|
| `name` | Webhook identifier | - |
| `enabled` | Enable this webhook | `false` |
| `type` | Webhook type: `slack`, `teams`, `pagerduty`, `generic` | - |
| `url` | Webhook URL | `""` |
| `timeout` | Request timeout (seconds) | `30` |
| `verifySsl` | Verify SSL certificate | `true` |
| `headers` | Custom HTTP headers | `{}` |
alerting:
  enabled: true
  webhooks:
    - name: "slack"
      enabled: true
      type: "slack"
      url: "https://hooks.slack.com/services/T00/B00/XXXX"
      timeout: 30
      verifySsl: true

Slack Alert Format:
🚨 Logsenta Alert
┌──────────────────────────────────
│ Pattern: NullPointerException
│ Namespace: production
│ Pod: api-server-abc123
│ Container: main
│ Occurrences: 5
│ Time: 2026-04-03 10:15:30
├──────────────────────────────────
│ Error:
│ java.lang.NullPointerException at...
└──────────────────────────────────
Microsoft Teams:

alerting:
  enabled: true
  webhooks:
    - name: "teams"
      enabled: true
      type: "teams"
      url: "https://outlook.office.com/webhook/..."
      timeout: 30
      verifySsl: true

PagerDuty:

alerting:
  enabled: true
  webhooks:
    - name: "pagerduty"
      enabled: true
      type: "pagerduty"
      url: "https://events.pagerduty.com/v2/enqueue"
      timeout: 30
      verifySsl: true
      headers:
        X-Routing-Key: "YOUR_PAGERDUTY_ROUTING_KEY"

For custom alerting endpoints:
alerting:
  enabled: true
  webhooks:
    - name: "custom"
      enabled: true
      type: "generic"
      url: "https://your-alerting-system.com/api/alerts"
      timeout: 30
      verifySsl: true
      headers:
        Authorization: "Bearer YOUR_API_TOKEN"
        X-Custom-Header: "value"

Generic Payload Format:
{
  "alert_type": "logsenta_error",
  "pattern": "NullPointerException",
  "namespace": "production",
  "pod_name": "api-server-abc123",
  "container": "main",
  "error_line": "java.lang.NullPointerException at...",
  "occurrences": 5,
  "timestamp": "2026-04-03T10:15:30Z",
  "log_context": ["line1", "line2", "..."]
}

For advanced use cases, you can define multiple alert rules to route specific error patterns to different recipients with different thresholds.
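As a sketch, a payload in the documented shape could be assembled like this; the field names follow the format above, but the helper function itself is hypothetical:

```python
import json
from datetime import datetime, timezone

def build_generic_payload(pattern, namespace, pod, container, error_line,
                          occurrences, context_lines):
    """Assemble the generic webhook body in the documented field layout."""
    return {
        "alert_type": "logsenta_error",
        "pattern": pattern,
        "namespace": namespace,
        "pod_name": pod,
        "container": container,
        "error_line": error_line,
        "occurrences": occurrences,
        # ISO-8601 UTC timestamp, e.g. "2026-04-03T10:15:30Z"
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "log_context": context_lines,
    }
```

This JSON body is POSTed to the configured `url` with any custom `headers` (such as the `Authorization` header in the example above) attached to the request.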
┌─────────────────────────────────────────────────────────────────────────┐
│ Rule-Based Alert Routing │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ Error Detected ──► Match Rules ──► Found? │
│ │ │ │
│ │ ┌────┴────┐ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ Rule 1 No Match Use Default │
│ Rule 2 ──────────► Global Config │
│ Rule N │
│ │ │
│ ▼ │
│ ┌──────────────┐ │
│ │ Rule Config │ │
│ ├──────────────┤ │
│ │ • Threshold │ │
│ │ • Email │──► Team-specific alerts │
│ │ • Webhooks │ │
│ └──────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────┘
Each rule can have:
| Field | Description | Required |
|---|---|---|
| `name` | Unique rule identifier | Yes |
| `patterns` | List of regex/string patterns to match | Yes |
| `threshold` | Custom threshold for this rule | No |
| `email` | Rule-specific email recipients | No |
| `webhooks` | List of webhook names to trigger | No |
| `namespaces` | Limit rule to specific namespaces | No |
alerting:
  enabled: true

  # Global SMTP configuration (shared by all rules)
  email:
    enabled: true
    smtpHost: "smtp.example.com"
    smtpPort: 587
    useTls: true
    username: "alerts@example.com"
    password: "your-password"
    fromAddress: "logsenta@example.com"

  # Webhooks available for rules to reference
  webhooks:
    - name: "slack"
      enabled: true
      type: "slack"
      url: "https://hooks.slack.com/services/T00/B00/xxx"
    - name: "pagerduty"
      enabled: true
      type: "pagerduty"
      url: "https://events.pagerduty.com/v2/enqueue"
      headers:
        X-Routing-Key: "YOUR_KEY"
    - name: "teams"
      enabled: true
      type: "teams"
      url: "https://outlook.office.com/webhook/..."

  # Pattern-based alert rules
  rules:
    # Critical errors - immediate PagerDuty + Email to on-call
    - name: "critical-errors"
      patterns:
        - "CRITICAL"
        - "FATAL"
        - "OOMKilled"
        - "panic:"
      threshold:
        count: 1               # Alert immediately
        windowSeconds: 60
        cooldownSeconds: 300   # 5 min cooldown
      email:
        enabled: true
        toAddresses:
          - "oncall@example.com"
          - "sre-critical@example.com"
      webhooks:
        - "pagerduty"
        - "slack"

    # Java exceptions - notify Java dev team
    - name: "java-exceptions"
      patterns:
        - "NullPointerException"
        - "IllegalArgumentException"
        - "ClassNotFoundException"
        - "at\\s+[\\w.$]+\\([\\w]+\\.java:\\d+\\)"
      threshold:
        count: 3               # Alert after 3 occurrences
        windowSeconds: 300     # Within 5 minutes
        cooldownSeconds: 600
      email:
        enabled: true
        toAddresses:
          - "java-team@example.com"
      webhooks:
        - "slack"
      namespaces:              # Only for these namespaces
        - "production"
        - "staging"

    # Python errors - notify Python team
    - name: "python-errors"
      patterns:
        - "Traceback"
        - "File\\s+\"[^\"]+\",\\s+line\\s+\\d+"
        - "TypeError"
        - "ValueError"
        - "ImportError"
      threshold:
        count: 2
        windowSeconds: 300
        cooldownSeconds: 600
      email:
        enabled: true
        toAddresses:
          - "python-team@example.com"
      webhooks:
        - "teams"

    # Database errors - DBA team with high priority
    - name: "database-errors"
      patterns:
        - "ConnectionRefused"
        - "connection timeout"
        - "database.*unavailable"
        - "SQLSTATE"
        - "deadlock"
      threshold:
        count: 2
        windowSeconds: 120     # 2 minutes
        cooldownSeconds: 300
      email:
        enabled: true
        toAddresses:
          - "dba-team@example.com"
      webhooks:
        - "pagerduty"

    # Security alerts - Security team
    - name: "security-alerts"
      patterns:
        - "unauthorized"
        - "authentication failed"
        - "access denied"
        - "permission denied"
        - "invalid token"
      threshold:
        count: 5               # Higher threshold to avoid noise
        windowSeconds: 300
        cooldownSeconds: 600
      email:
        enabled: true
        toAddresses:
          - "security-team@example.com"
      webhooks:
        - "slack"

- Pattern Matching: Each error is checked against all rules in order
- Multiple Matches: An error can match multiple rules (all matching rules are triggered)
- Namespace Filter: If `namespaces` is specified, the rule only applies to those namespaces
- No Match: Errors not matching any rule use the global `threshold`/`email`/`webhooks` config
- Independent Tracking: Each rule has its own threshold counter and cooldown
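The matching semantics above can be sketched as follows. This is an illustrative model of the documented behavior, not the engine's actual code; the rule representation (plain dicts) is an assumption:

```python
import re

def match_rules(error_line, namespace, rules):
    """Return the names of every rule that applies to this error.

    Mirrors the documented semantics: rules are checked in order, an error
    may match several rules, and a `namespaces` list restricts a rule.
    An empty result means the global config should be used instead.
    """
    matched = []
    for rule in rules:  # rules are checked in order
        # Namespace filter: skip rules limited to other namespaces
        if rule.get("namespaces") and namespace not in rule["namespaces"]:
            continue
        # Patterns are treated as regexes searched anywhere in the line
        if any(re.search(p, error_line) for p in rule["patterns"]):
            matched.append(rule["name"])
    return matched
```

Each matched rule then fires with its own threshold counter and cooldown, independent of the others.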
| Use Case | Configuration |
|---|---|
| Critical alerts → PagerDuty | `patterns: ["FATAL", "OOMKilled"]`, `webhooks: ["pagerduty"]` |
| Team-specific routing | Different `email.toAddresses` per rule |
| Namespace isolation | `namespaces: ["production"]` to only alert on prod |
| High-severity = low threshold | `threshold.count: 1` for critical patterns |
| Low-severity = high threshold | `threshold.count: 10` for warnings |
| Feature | Global Alerting | Rule-Based Alerting |
|---|---|---|
| Configuration | Single threshold/recipients | Multiple rules with different configs |
| Pattern routing | All patterns → same destination | Pattern → specific destination |
| Threshold flexibility | One threshold for all | Per-rule thresholds |
| Complexity | Simple | More complex but flexible |
| Use when | Small teams, simple needs | Large orgs, multiple teams |
These environment variables are automatically set by the Helm chart:
| Variable | Description |
|---|---|
| `ALERTING_ENABLED` | Enable alerting (`true`/`false`) |
| `ALERTING_THRESHOLD_COUNT` | Error count threshold |
| `ALERTING_THRESHOLD_WINDOW_SECONDS` | Time window in seconds |
| `ALERTING_THRESHOLD_COOLDOWN_SECONDS` | Cooldown period in seconds |
| `ALERTING_GROUP_BY_NAMESPACE` | Group by namespace |
| `ALERTING_GROUP_BY_POD` | Group by pod |
| `ALERTING_EMAIL_ENABLED` | Enable email alerts |
| `ALERTING_EMAIL_SMTP_HOST` | SMTP host |
| `ALERTING_EMAIL_SMTP_PORT` | SMTP port |
| `ALERTING_EMAIL_USE_TLS` | Use TLS |
| `ALERTING_EMAIL_FROM` | From address |
| `ALERTING_EMAIL_TO` | To addresses (comma-separated) |
| `ALERTING_EMAIL_SUBJECT_PREFIX` | Subject prefix |
| `ALERTING_SMTP_USERNAME` | SMTP username (from Secret) |
| `ALERTING_SMTP_PASSWORD` | SMTP password (from Secret) |
| `ALERTING_WEBHOOKS_CONFIG` | Webhooks JSON config |
Sensitive values are stored in Kubernetes Secrets:
| Secret Key | Description |
|---|---|
| `alerting-smtp-username` | SMTP username (base64 encoded) |
| `alerting-smtp-password` | SMTP password (base64 encoded) |
| `alerting-webhook-{name}-url` | Webhook URL (base64 encoded) |
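A sketch of how a consumer of these variables might parse them — variable names come from the table above, but the defaults and parsing details (comma-splitting, JSON decoding) are assumptions:

```python
import json
import os

def load_alerting_env(env=os.environ):
    """Parse the chart-provided environment variables into a config dict."""
    return {
        "enabled": env.get("ALERTING_ENABLED", "false").lower() == "true",
        "count": int(env.get("ALERTING_THRESHOLD_COUNT", "2")),
        "window": int(env.get("ALERTING_THRESHOLD_WINDOW_SECONDS", "300")),
        "cooldown": int(env.get("ALERTING_THRESHOLD_COOLDOWN_SECONDS", "600")),
        # Comma-separated recipient list, empty entries dropped
        "to": [a for a in env.get("ALERTING_EMAIL_TO", "").split(",") if a],
        # Webhook definitions arrive as a JSON document
        "webhooks": json.loads(env.get("ALERTING_WEBHOOKS_CONFIG", "[]")),
    }
```

Secret-backed values (`ALERTING_SMTP_USERNAME`, `ALERTING_SMTP_PASSWORD`, webhook URLs) are injected the same way, already base64-decoded by Kubernetes.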
alerting:
  enabled: true
  threshold:
    count: 3                # Alert after 3 errors
    windowSeconds: 300      # Within 5 minutes
    cooldownSeconds: 1800   # 30 min cooldown (reduce noise)
    groupByNamespace: true
    groupByPod: false
  email:
    enabled: true
    smtpHost: "smtp.sendgrid.net"
    smtpPort: 587
    useTls: true
    username: "apikey"
    password: "SG.xxxx"     # SendGrid API key
    fromAddress: "alerts@yourcompany.com"
    toAddresses:
      - "sre-oncall@yourcompany.com"
    subjectPrefix: "[PROD K8s Alert]"
  webhooks:
    - name: "slack-critical"
      enabled: true
      type: "slack"
      url: "https://hooks.slack.com/services/T00/B00/xxx"

For critical namespaces where every error matters:
alerting:
  enabled: true
  threshold:
    count: 1                # Alert on first error
    windowSeconds: 60       # 1 minute window
    cooldownSeconds: 300    # 5 min cooldown
    groupByNamespace: true
    groupByPod: true        # Per-pod tracking
  webhooks:
    - name: "pagerduty"
      enabled: true
      type: "pagerduty"
      url: "https://events.pagerduty.com/v2/enqueue"
      headers:
        X-Routing-Key: "YOUR_CRITICAL_SERVICE_KEY"

For non-critical environments with reduced alerting:
alerting:
  enabled: true
  threshold:
    count: 10                # High threshold
    windowSeconds: 600       # 10 minute window
    cooldownSeconds: 3600    # 1 hour cooldown
    groupByNamespace: false  # Aggregate all namespaces
  webhooks:
    - name: "slack-dev"
      enabled: true
      type: "slack"
      url: "https://hooks.slack.com/services/T00/B00/dev-channel"

- Check that alerting is enabled:
  kubectl get configmap -n logsenta -o yaml | grep ALERTING_ENABLED
- Check the threshold settings:
  - Is `count` too high?
  - Is `windowSeconds` too short?
  - Is the pattern still in cooldown?
- Check the pod logs:
  kubectl logs -n logsenta deployment/logsenta-engine | grep -i alert
- Verify the SMTP credentials:
  kubectl get secret -n logsenta logsenta-engine-credentials -o jsonpath='{.data.alerting-smtp-password}' | base64 -d
- Check firewall/network:
  - Ensure the pod can reach the SMTP server (port 587 or 465)
- Gmail issues:
  - Use an App Password, not your regular account password
- Verify the webhook URL:
  kubectl get secret -n logsenta logsenta-engine-credentials -o jsonpath='{.data.alerting-webhook-slack-url}' | base64 -d
- Test the webhook manually:
  curl -X POST -H "Content-Type: application/json" \
    -d '{"text":"Test message"}' \
    "YOUR_WEBHOOK_URL"
- Check SSL settings:
  - Set `verifySsl: false` if using self-signed certificates
- Storage Backends - Configure where to store captured logs
- Capacity Planning - Size your deployment appropriately
- Troubleshooting - Common issues and solutions