Hybrid log parser (Drain3-inspired + LLM-ready) for production log analysis. Parse, mine templates, detect anomalies, extract entities, search, and compute statistics across multiple log formats.
                   +-------------------+
                   |   CLI Interface   |
                   |  (Click commands) |
                   +--------+----------+
                            |
         +------------------+------------------+
         |                  |                  |
+--------v-------+   +------v------+ +---------v--------+
|     Parser     |   |    Miner    | | Anomaly Detector |
| (Multi-format) |   |  (Drain3)   | | (Frequency-based)|
+--------+-------+   +------+------+ +---------+--------+
         |                  |                  |
         |     +------------+------------+     |
         |     |                         |     |
+--------v-----v--+            +---------v-----v--+
|     Entity      |            |    Statistics    |
|   Extraction    |            |      Engine      |
+--------+--------+            +---------+--------+
         |                               |
+--------v-------------------------------v--------+
|                   Data Models                   |
|    (ParsedLog, LogTemplate, Anomaly, Stats)     |
+-------------------------------------------------+

Supported Formats:
+----------+ +-----------+ +-----+ +------+
| Syslog   | | JSON      | | CEF | | LEEF |
| RFC3164  | | Structured| |     | |      |
| RFC5424  | |           | |     | |      |
+----------+ +-----------+ +-----+ +------+
+----------------+ +------------+
| Windows Event  | | Plain Text |
| XML            | | (regex)    |
+----------------+ +------------+
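
How the parser tells these formats apart is not spelled out here. As a rough illustration only, a detector could sniff a single line with a few cheap checks. The function name `guess_format`, the returned labels (other than `json`), and the heuristics themselves are assumptions made for this sketch, not logforge's actual implementation.

```python
import json
import re

# Hypothetical heuristics for illustration; the real detector may differ.
_CEF = re.compile(r"\bCEF:\d+\|")                 # e.g. "CEF:0|Vendor|..."
_LEEF = re.compile(r"\bLEEF:\d+(?:\.\d+)?\|")     # e.g. "LEEF:1.0|Vendor|..."
_RFC5424 = re.compile(r"^<\d{1,3}>\d+ ")          # <PRI>VERSION followed by a space
_RFC3164 = re.compile(
    r"^(?:<\d{1,3}>)?[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} "  # optional <PRI>, then "Oct 11 22:14:15 "
)


def guess_format(line: str) -> str:
    """Return a best-guess format label for one log line."""
    text = line.strip()
    if text.startswith("{"):
        try:
            json.loads(text)
            return "json"
        except ValueError:
            pass  # looked like JSON but did not parse; keep checking
    if text.startswith("<Event"):
        return "windows-xml"
    # CEF/LEEF are checked before syslog because they are often
    # wrapped in a syslog header.
    if _CEF.search(text):
        return "cef"
    if _LEEF.search(text):
        return "leef"
    if _RFC5424.match(text):
        return "syslog-rfc5424"
    if _RFC3164.match(text):
        return "syslog-rfc3164"
    return "plain"
```

Checking the more specific markers first and falling back to plain text keeps an unrecognized line from being forced into a structured format.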
- Multi-format parsing -- Syslog RFC3164/5424, JSON structured, CEF, LEEF, Windows Event XML, and plain text with regex-based extraction
- Template mining -- Drain3-inspired algorithm using a fixed-depth parse tree for automatic log template extraction; groups similar log lines and replaces variables with <*> wildcards
- Anomaly detection -- Frequency-based detection of new templates, frequency spikes, rare templates, volume anomalies (z-score), and time gap anomalies
- Entity extraction -- Extracts IPv4/IPv6 addresses, hostnames, emails, timestamps, error codes, MAC addresses, ports, usernames, PIDs, file paths, URLs, stack traces, and Java exceptions
- Search and filter -- Full-text search, severity/source/hostname/time-range/template-id/regex filters, grouping by template/severity/source
- Statistics -- Log volume over time, severity breakdown, source distribution, top templates, error rate computation
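
To make the template mining idea concrete, here is a deliberately simplified Drain-style sketch. It is not the code in `miner.py`: the class name `TinyMiner` is hypothetical, and the parse tree is reduced to two levels (token count, then first token) before a token-by-token similarity check against the threshold.

```python
from collections import defaultdict

WILDCARD = "<*>"


class TinyMiner:
    """Simplified Drain-style miner (illustrative, not logforge's implementation)."""

    def __init__(self, threshold: float = 0.6):
        self.threshold = threshold
        # (token count, first token) -> list of templates, each a list of tokens
        self.buckets = defaultdict(list)

    @staticmethod
    def _similarity(template, tokens):
        # Fraction of positions where template and line hold the same token.
        same = sum(1 for t, tok in zip(template, tokens) if t == tok)
        return same / len(tokens)

    def add(self, line: str) -> str:
        """Assign a line to a template and return the (possibly updated) template."""
        tokens = line.split()
        if not tokens:
            return ""
        # A first token containing digits is likely a variable, so route it
        # through a wildcard branch instead of creating one bucket per value.
        head = WILDCARD if any(c.isdigit() for c in tokens[0]) else tokens[0]
        bucket = self.buckets[(len(tokens), head)]
        best = max(bucket, key=lambda t: self._similarity(t, tokens), default=None)
        if best is not None and self._similarity(best, tokens) >= self.threshold:
            # Merge: keep tokens that agree, wildcard the ones that differ.
            for i, tok in enumerate(tokens):
                if best[i] != tok:
                    best[i] = WILDCARD
            return " ".join(best)
        bucket.append(list(tokens))
        return " ".join(tokens)


miner = TinyMiner(threshold=0.6)
miner.add("Connection from 10.0.0.1 closed")
print(miner.add("Connection from 10.0.0.2 closed"))  # Connection from <*> closed
```

In this sketch a higher threshold means lines must agree on more tokens before they share a template, so it yields more, narrower templates.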
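
The z-score volume check mentioned in the list above can be sketched as follows. This is an assumption about the general technique, not a copy of `anomaly.py`: the function name `volume_anomalies` and its `z_threshold` parameter are hypothetical and should not be confused with the CLI's `--spike-threshold` option.

```python
from collections import Counter
from statistics import mean, pstdev


def volume_anomalies(timestamps, window=60, z_threshold=3.0):
    """Return (window_start, count, z) for windows with unusual log volume.

    timestamps: iterable of epoch seconds; window: bucket size in seconds.
    """
    timestamps = list(timestamps)
    if not timestamps:
        return []
    # Count events per fixed-size window, keyed by the window's start time.
    counts = Counter(int(ts) // window * window for ts in timestamps)
    # Include empty windows between first and last so quiet periods count too.
    start, end = min(counts), max(counts)
    series = [(w, counts.get(w, 0)) for w in range(start, end + window, window)]
    values = [c for _, c in series]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # perfectly flat volume: nothing stands out
    return [
        (w, c, (c - mu) / sigma)
        for w, c in series
        if abs(c - mu) / sigma >= z_threshold
    ]
```

A window is reported when its count lies at least `z_threshold` standard deviations from the mean across all windows, which flags both bursts and unexpected silences.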
pip install .

For development:
pip install -e ".[dev]"

# Parse from file (auto-detects format)
logforge parse /var/log/syslog
# Parse from stdin
cat /var/log/app.log | logforge parse
# Parse with entity extraction
logforge parse --extract-entities /var/log/auth.log
# Force format
logforge parse --format json app-logs.jsonl
# Limit lines
logforge parse --max-lines 1000 huge.log

# Extract templates from logs
logforge mine /var/log/syslog
# Tune similarity threshold (0.0 - 1.0)
logforge mine --threshold 0.6 /var/log/app.log
# Adjust parse tree depth
logforge mine --depth 5 /var/log/app.log

# Full-text search
logforge search -q "connection refused" /var/log/app.log
# Filter by severity
logforge search --severity error /var/log/syslog
# Filter by minimum severity (WARNING and above)
logforge search --severity-min warning /var/log/syslog
# Regex search
logforge search -r "E\d{4}" /var/log/app.log
# Group by severity
logforge search --group-by severity /var/log/syslog
# Combine filters
logforge search -q "timeout" --severity error --source nginx --limit 50 access.log

logforge stats /var/log/syslog
logforge stats --max-lines 10000 /var/log/app.log

# Basic anomaly detection
logforge anomaly /var/log/app.log
# Custom time window (seconds)
logforge anomaly --window 600 /var/log/app.log
# Adjust spike sensitivity
logforge anomaly --spike-threshold 2.0 /var/log/app.log

All commands support -o for file output:
logforge parse -o parsed.json /var/log/syslog
logforge mine -o templates.json /var/log/app.log
logforge stats -o report.json /var/log/syslog

Build and run with Docker:
docker build -t logforge .
docker run --rm -v /var/log:/data/logs logforge parse /data/logs/syslog

Docker Compose (full stack):
docker-compose up -d

Services:
- parser-engine -- Log parsing service
- log-store -- Log storage backend
- api -- API service

# Install dev dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Lint
ruff check src/ tests/
# Run with coverage
pytest --cov=logforge --cov-report=term-missing

logforge/
  src/logforge/
    __init__.py            # Package version
    models.py              # Data classes (ParsedLog, LogTemplate, Anomaly, LogStats)
    parser.py              # Multi-format log parser
    miner.py               # Drain3-inspired template mining
    entities.py            # Entity extraction (IPs, hostnames, errors, etc.)
    anomaly.py             # Anomaly detection engine
    search.py              # Search, filter, and grouping
    stats.py               # Statistics computation
    cli.py                 # Click CLI (parse, mine, search, stats, anomaly)
  tests/                   # 20+ pytest test cases
  pyproject.toml           # Hatchling build config
  Dockerfile
  docker-compose.yml
  .github/workflows/ci.yml
  LICENSE                  # MIT
  README.md

MIT License. Copyright (c) 2026 Corey Wade.