sievelog

Cut your observability bill by 40%+ in 5 minutes. No migration. No risk.

sievelog is an open-source log filter that sits between your applications and your observability platform (Datadog, Splunk, Grafana, New Relic — anything). It drops the noise before it's ingested, so you pay less without losing signal.

your apps → sievelog → Datadog/Splunk/Grafana
              ↓
         drops 40-60% of noise
         keeps 100% of errors

The problem

You're paying $0.10/GB to ingest logs into Datadog. 90% of those logs are health checks passing, requests completing normally, and cache hits — operational noise nobody looks at unless something breaks. But you pay to ingest, index, and store all of it.

What sievelog does

It reads your log stream, applies a configurable set of rules, and outputs only what matters:

Rule	What it drops	Typical reduction
`drop_levels`	DEBUG/TRACE in production	10-30%
`health_check`	Health/readiness/liveness probes	5-15%
`dedup`	Identical consecutive lines	5-10%
`drop_match`	Known noise patterns (cache hits, pool stats)	5-15%
`field_strip`	Verbose fields (stack traces on INFO, full headers)	5-10% bytes
`rate_limit`	Per-service line rate caps	variable

Errors and warnings always pass through. sievelog never drops ERROR, FATAL, PANIC, or CRITICAL lines regardless of any rule. This is the safety guarantee.

Quick start

# Build from source
go build -o sievelog ./cmd/sievelog/

# Pipe your logs through it
cat your_logs.jsonl | ./sievelog -config sievelog.json > filtered.jsonl

# See what it did
# [sievelog] FINAL lines_in=30 lines_out=15 dropped=15 reduction=43.7%

Example

Input (30 lines of a typical payment service):

{"level":"INFO","msg":"Health check OK","latency_ms":2}
{"level":"INFO","msg":"Request processed","endpoint":"/api/charge","status":200}
{"level":"DEBUG","msg":"Cache hit for customer_id=8827"}
{"level":"INFO","msg":"Health check OK","latency_ms":1}
{"level":"INFO","msg":"Request processed","endpoint":"/api/charge","status":200}
{"level":"ERROR","msg":"DB connection timeout","status":503,"latency_ms":5002}
{"level":"INFO","msg":"Metrics exported successfully"}
...

Output (15 lines — 43.7% reduction):

{"level":"INFO","msg":"Request processed","endpoint":"/api/charge","status":200}
{"level":"ERROR","msg":"DB connection timeout","status":503,"latency_ms":5002}
{"level":"WARN","msg":"Elevated latency detected","latency_ms":312}
...

Dropped: health checks, DEBUG lines, cache hits, metrics noise, pool stats. Kept: all errors, all warnings, all real request traffic.

Config

sievelog uses a JSON config file. Here's the default:

{
  "global": {
    "json_mode": true,
    "level_field": "level",
    "message_field": "msg",
    "passthrough_on_error": true,
    "stats": true
  },
  "rules": [
    {
      "name": "drop_debug",
      "type": "drop_levels",
      "action": "drop",
      "config": { "levels": ["DEBUG", "TRACE"] }
    },
    {
      "name": "drop_health_checks",
      "type": "health_check",
      "action": "drop",
      "config": {
        "patterns": ["health check", "healthcheck", "liveness probe"],
        "endpoints": ["/health", "/healthz", "/ready", "/readyz", "/livez"]
      }
    },
    {
      "name": "dedup",
      "type": "dedup",
      "action": "drop",
      "config": { "window_sec": 60, "min_count": 3 }
    },
    {
      "name": "drop_noise",
      "type": "drop_match",
      "action": "drop",
      "config": {
        "patterns": ["Connection pool stats", "Cache hit for", "Metrics exported"]
      }
    },
    {
      "name": "strip_fields",
      "type": "field_strip",
      "action": "passthrough",
      "config": {
        "fields": ["stack_trace", "full_headers", "request_body"]
      }
    }
  ]
}

Usage with common log shippers

# With Filebeat (output to file, sievelog reads it)
filebeat -e | ./sievelog -config sievelog.json | your-destination

# With Fluentd (pipe through sievelog before forwarding)
<match **>
  @type exec_filter
  command /usr/local/bin/sievelog -config /etc/sievelog.json
</match>

# With Vector (as an external transform)
[transforms.sievelog]
  type = "exec"
  command = ["/usr/local/bin/sievelog", "-config", "/etc/sievelog.json"]

# With Docker
docker logs my-container 2>&1 | ./sievelog -config sievelog.json

# Dry run (process but don't filter — just see stats)
cat production.log | ./sievelog -config sievelog.json -dry-run 2>&1 >/dev/null
# [sievelog] FINAL lines_in=1000000 lines_out=420000 dropped=580000 reduction=58.0%

CLI flags

Flag	Default	Description
`-config`	`sievelog.json`	Path to config file
`-stats`	`false`	Print reduction stats to stderr
`-dry-run`	`false`	Process input, count stats, but forward everything
`-version`		Print version and exit

Safety guarantees

Errors always pass. Lines with level ERROR, FATAL, PANIC, CRIT, or CRITICAL are forwarded regardless of any rule.
Parse failures pass. If passthrough_on_error is true (default), lines that fail JSON parsing are forwarded as-is.
No data loss by default. sievelog only drops what you explicitly configure it to drop. The default config is conservative.

Performance

sievelog is a single Go binary with zero external dependencies. It processes logs line-by-line with minimal memory allocation.

Throughput: 50K+ lines/second on a single core
Memory: ~50MB baseline
Binary size: ~8MB
Startup time: instant

How much will you save?

Run the dry-run mode on a sample of your production logs:

# Grab 1 hour of logs
kubectl logs -l app=your-service --since=1h > sample.log

# See the reduction
cat sample.log | ./sievelog -config sievelog.json -dry-run -stats 2>&1 >/dev/null

If you're sending 1TB/day to Datadog at $0.10/GB and sievelog reduces it by 40%, that's $14,600/month saved.

Roadmap

v0.1: Core rule engine (drop, dedup, health check, field strip, rate limit)
v0.2: Prometheus metrics endpoint (/metrics) for monitoring sievelog itself
v0.3: Helm chart for K8s DaemonSet deployment
v0.4: OTel Collector processor plugin
v0.5: Statistical summaries (replace N identical lines with one summary)
v1.0: ML-based anomaly detection (learn what's "normal" per-service)

License

Apache 2.0

Contributing

Issues and PRs welcome. Start with the sievelog.json config — the best way to help is to add rule patterns that work for your log format.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
pkg		pkg
testdata		testdata
.gitignore		.gitignore
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
go.mod		go.mod
sievelog.json		sievelog.json
training.json		training.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

sievelog

The problem

What sievelog does

Quick start

Example

Config

Usage with common log shippers

CLI flags

Safety guarantees

Performance

How much will you save?

Roadmap

License

Contributing

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 1

Languages

Folders and files

Latest commit

History

Repository files navigation

sievelog

The problem

What sievelog does

Quick start

Example

Config

Usage with common log shippers

CLI flags

Safety guarantees

Performance

How much will you save?

Roadmap

License

Contributing

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 1

Languages

Packages