Bidirectional HTTP gateway for PFC cold archives — no DuckDB required.
pfc-gateway makes PFC archives on S3 (or local storage) queryable by any tool — Grafana, Python, curl, PowerBI — through a simple HTTP API. It also receives NDJSON from Fluent Bit, Vector, Telegraf, or any HTTP client and compresses it to .pfc archives automatically.
Part of the PFC Ecosystem.
[Fluent Bit / Vector / Telegraf / curl]
│
▼ POST /ingest — push NDJSON rows
pfc-gateway (this server) ←─────────── also receives data
│
├── .pfc_buffer.jsonl (live buffer)
└── ingest_<ts>.pfc (auto-rotated on size or time)
[Grafana / Python / curl / PowerBI / your own tools]
│
▼ POST /query — HTTP REST, no client library needed
pfc-gateway (this server) ────────────► serves data
│
▼ pfc_jsonl s3-fetch — HTTP Range requests
.pfc archives on S3 / local
│
▼ only ~4% of the archive is read per query
NDJSON stream back to client
One server. Ingest from any tool. Query from any tool. No DuckDB, no custom plugins.
# 1. Install pfc_jsonl binary (required)
# Linux x64:
curl -L https://github.com/ImpossibleForge/pfc-jsonl/releases/latest/download/pfc_jsonl-linux-x64 \
-o /usr/local/bin/pfc_jsonl && chmod +x /usr/local/bin/pfc_jsonl
# macOS (Apple Silicon / M1–M4):
curl -L https://github.com/ImpossibleForge/pfc-jsonl/releases/latest/download/pfc_jsonl-macos-arm64 \
-o /usr/local/bin/pfc_jsonl && chmod +x /usr/local/bin/pfc_jsonl
# 2. Install pfc-gateway
git clone https://github.com/ImpossibleForge/pfc-gateway
cd pfc-gateway
pip install fastapi uvicorn boto3 python-dateutil
# 3. Start the server
PFC_API_KEY=your-secret-key uvicorn pfc_gateway:app --host 0.0.0.0 --port 8765
License note: This tool requires the pfc_jsonl binary. pfc_jsonl is free for personal and open-source use; commercial use requires a separate license. See pfc-jsonl for details.
AWS credentials are read from the standard locations (~/.aws/credentials, environment variables, IAM role). No extra config needed.
Enable ingest by setting PFC_INGEST_DIR. The gateway appends rows to a buffer file
and rotates it to a compressed .pfc file when a size or time threshold is reached.
# Start gateway with ingest enabled
PFC_API_KEY=secret PFC_INGEST_DIR=/data/pfc \
uvicorn pfc_gateway:app --host 0.0.0.0 --port 8765
# JSON array
curl -s -X POST http://localhost:8765/ingest \
-H "X-API-Key: secret" \
-H "Content-Type: application/json" \
-d '[{"ts":"2026-04-21T10:00:00Z","level":"INFO","msg":"server started"}]'
# NDJSON (Fluent Bit json_stream / Vector ndjson)
printf '{"ts":"2026-04-21T10:00:01Z","level":"WARN","msg":"high cpu"}\n' | \
curl -s -X POST http://localhost:8765/ingest \
-H "X-API-Key: secret" \
-H "Content-Type: application/x-ndjson" \
--data-binary @-
[OUTPUT]
    # Format json sends a JSON array; pfc-gateway auto-detects it
    Name    http
    Match   *
    Host    your-server
    Port    8765
    URI     /ingest
    Format  json
    Header  X-API-Key secret
[sinks.pfc_gateway]
type = "http"
inputs = ["your_source"]
uri = "http://your-server:8765/ingest"
encoding.codec = "ndjson"
[sinks.pfc_gateway.request.headers]
X-API-Key = "secret"
curl -s -X POST http://localhost:8765/ingest/flush \
-H "X-API-Key: secret"
# → {"flushed": true, "rows": 4821, "file": "/data/pfc/ingest_20260421T103045.pfc"}
curl -s http://localhost:8765/ingest/status -H "X-API-Key: secret"
# → {"enabled": true, "buffer_rows": 142, "buffer_mb": 0.021,
#    "last_flush_age_sec": 312.4, "rotate_mb": 64, "rotate_sec": 3600, ...}
curl -s \
-H "X-API-Key: your-secret-key" \
-H "Content-Type: application/json" \
-X POST http://localhost:8765/query \
-d '{
"file": "s3://my-archive/pfc/logs_2026-03.pfc",
"from_ts": "2026-03-05T10:00:00Z",
"to_ts": "2026-03-05T12:00:00Z",
"filter": {"level": "ERROR"}
}'
Response: NDJSON stream
{"ts":"2026-03-05T10:14:32Z","level":"ERROR","message":"connection refused","host":"web-03"}
{"ts":"2026-03-05T11:02:19Z","level":"ERROR","message":"disk full","host":"db-01"}
curl -s \
-H "X-API-Key: your-secret-key" \
-X POST http://localhost:8765/query \
-H "Content-Type: application/json" \
-d '{"file":"/data/archive/logs_march.pfc","from_ts":"2026-03-01","to_ts":"2026-04-01"}'
import json

import requests

resp = requests.post(
    "http://localhost:8765/query",
    headers={"X-API-Key": "your-secret-key"},
    json={
        "file": "s3://my-archive/pfc/logs_2026-03.pfc",
        "from_ts": "2026-03-05T10:00Z",
        "to_ts": "2026-03-05T12:00Z",
    },
    stream=True,
)
resp.raise_for_status()  # fail early on auth or request errors
for line in resp.iter_lines():
    if not line:  # skip keep-alive blank lines
        continue
    row = json.loads(line)
    print(row["ts"], row.get("level"), row.get("message"))
curl -s \
-H "X-API-Key: your-secret-key" \
-X POST http://localhost:8765/query/batch \
-H "Content-Type: application/json" \
-d '{
"files": [
"s3://my-archive/pfc/logs_2026-01.pfc",
"s3://my-archive/pfc/logs_2026-02.pfc",
"s3://my-archive/pfc/logs_2026-03.pfc"
],
"from_ts": "2026-01-15T00:00Z",
"to_ts": "2026-03-15T00:00Z"
}'
Files are queried in order. Results stream back as a single combined NDJSON response.
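A batch request can also be issued from Python. The following is a minimal client-side sketch, not part of pfc-gateway itself: `build_batch_payload` and `iter_ndjson` are hypothetical helper names, and passing `filter` to `/query/batch` assumes the batch endpoint accepts the same fields as `/query`.

```python
import json
from typing import Iterable, Iterator, Optional


def build_batch_payload(files, from_ts: str, to_ts: str,
                        row_filter: Optional[dict] = None) -> dict:
    """Build the JSON body for POST /query/batch (hypothetical helper)."""
    payload = {"files": list(files), "from_ts": from_ts, "to_ts": to_ts}
    if row_filter:
        # Assumption: /query/batch accepts the same "filter" field as /query.
        payload["filter"] = row_filter
    return payload


def iter_ndjson(lines: Iterable[bytes]) -> Iterator[dict]:
    """Decode an NDJSON stream line by line, skipping blank lines."""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        yield json.loads(line)
```

With the `requests` library, the payload would be sent as `requests.post(url, json=payload, headers=..., stream=True)` and the rows read with `iter_ndjson(resp.iter_lines())`, so the combined response is never held in memory as a whole.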
If DuckDB with the pfc extension is installed
on the gateway server, you can run full SQL queries against .pfc archives:
curl -X POST http://localhost:8765/query/sql \
-H "x-api-key: secret" \
-H "Content-Type: application/json" \
-d '{
"sql": "SELECT json_extract_string(line, '"'"'$.level'"'"') AS level, COUNT(*) AS cnt FROM pfc_scan('"'"'/var/lib/pfc/logs.pfc'"'"') GROUP BY level ORDER BY cnt DESC"
}'
Supports any DuckDB SQL: GROUP BY, AVG, JOIN across multiple files, window functions.
-- Avg latency per service across the whole archive
SELECT json_extract_string(line, '$.service') AS service,
ROUND(AVG(json_extract(line, '$.latency_ms')::FLOAT), 1) AS avg_ms
FROM pfc_scan('/var/lib/pfc/logs.pfc')
GROUP BY service ORDER BY avg_ms DESC;
Check if SQL mode is available on your gateway instance:
curl http://localhost:8765/ -H "x-api-key: secret"
# {"status":"ok","version":"0.3.0","binary":"...","sql_mode":true}
sql_mode: false means DuckDB is not installed; standard /query still works normally.
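The shell quoting in the curl example (`'"'"'`) is easy to get wrong. Building the request body in Python avoids it. This is a sketch under assumptions: `sql_string_literal`, `build_sql_payload`, and `sql_mode_enabled` are hypothetical helper names, and the query simply mirrors the curl example above.

```python
import json


def sql_string_literal(value: str) -> str:
    """Quote a value as a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_sql_payload(archive_path: str) -> str:
    """Return the JSON body for POST /query/sql, counting rows per level."""
    sql = (
        "SELECT json_extract_string(line, '$.level') AS level, COUNT(*) AS cnt "
        f"FROM pfc_scan({sql_string_literal(archive_path)}) "
        "GROUP BY level ORDER BY cnt DESC"
    )
    # json.dumps handles all JSON escaping, so no manual quoting is needed.
    return json.dumps({"sql": sql})


def sql_mode_enabled(health: dict) -> bool:
    """Interpret the health-check response; a missing key is treated as off."""
    return bool(health.get("sql_mode", False))
```

Checking `sql_mode_enabled` on the health-check response before sending SQL lets a client fall back to the standard `/query` endpoint when DuckDB is not installed.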
Setup:
# Install DuckDB
curl -L https://github.com/duckdb/duckdb/releases/latest/download/duckdb_cli-linux-amd64.gz \
| gunzip > /usr/local/bin/duckdb && chmod +x /usr/local/bin/duckdb
# Install pfc extension
duckdb -c "INSTALL pfc FROM community;"
pfc-gateway implements the Grafana SimpleJSON data source protocol.
Setup (takes 2 minutes):
- In Grafana → Settings → Data Sources → Add data source
- Search for SimpleJSON (install plugin if needed)
- URL: http://your-server:8765/grafana
- Custom HTTP Header: X-API-Key = your secret key
- Save & Test → should show "Data source is working"
In a dashboard panel:
- Target: s3://my-archive/pfc/logs_2026-03.pfc
- Optional filter: s3://my-archive/logs.pfc|{"level":"ERROR"}
Grafana's time range picker controls from_ts and to_ts automatically.
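The target syntax above (archive path, optionally followed by `|` and a JSON filter) could be parsed along these lines. This is only an illustrative sketch: `parse_target` is a hypothetical name, and splitting on the first `|` is an assumption about how the gateway separates the two parts.

```python
import json
from typing import Optional, Tuple


def parse_target(target: str) -> Tuple[str, Optional[dict]]:
    """Split 'path|{json filter}' into (path, filter); filter may be None."""
    # Assumption: the first '|' separates the archive path from the filter.
    path, sep, raw_filter = target.partition("|")
    path = path.strip()
    if not sep or not raw_filter.strip():
        return path, None
    parsed = json.loads(raw_filter)
    if not isinstance(parsed, dict):
        raise ValueError("filter must be a JSON object")
    return path, parsed
```

For example, `parse_target('s3://my-archive/logs.pfc|{"level":"ERROR"}')` yields the path and `{"level": "ERROR"}`, while a bare path yields `None` as the filter.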
Without pfc-gateway: Only the last 30 days (hot live data) visible in Grafana.
With pfc-gateway:
- Panel 1: Live DB data source (last 30 days)
- Panel 2: pfc-gateway data source (months/years of cold PFC archives)
Both panels in the same Grafana dashboard. No re-import. No DuckDB.
| Field | Type | Description |
|---|---|---|
| file | string | Local path or s3://bucket/key.pfc |
| from_ts | string | ISO 8601 start time (inclusive) |
| to_ts | string | ISO 8601 end time (exclusive) |
| filter | object | Optional equality filter, e.g. {"level": "ERROR"} |
| aws_profile | string | Optional AWS profile name |
Returns: application/x-ndjson stream.
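The field semantics (inclusive `from_ts`, exclusive `to_ts`, equality `filter`) can be expressed as a small predicate. This sketch only illustrates which rows a query selects; it is not how the gateway reads archives. `parse_ts` and `row_matches` are hypothetical names, the timestamp field is assumed to be `ts` as in the examples, and timestamps without a zone are assumed to be UTC.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp; a trailing 'Z' means UTC."""
    dt = datetime.fromisoformat(value.replace("Z", "+00:00"))
    # Assumption: timestamps without an explicit zone are treated as UTC.
    return dt if dt.tzinfo else dt.replace(tzinfo=timezone.utc)


def row_matches(row: dict, from_ts: str, to_ts: str,
                row_filter: Optional[dict] = None) -> bool:
    """True if the row lies in [from_ts, to_ts) and equals every filter value."""
    ts = parse_ts(row["ts"])  # assumption: the timestamp field is named "ts"
    if not (parse_ts(from_ts) <= ts < parse_ts(to_ts)):
        return False
    return all(row.get(key) == value for key, value in (row_filter or {}).items())
```

Because the interval is half-open, consecutive queries such as `[10:00, 12:00)` and `[12:00, 14:00)` never return the same row twice.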
POST /query/batch: same as /query, but takes a files: [...] array instead of a single file.
GET /: health check. Returns a JSON object such as {"status": "ok", "version": "0.3.0", "binary": "...", "sql_mode": true}.
POST /ingest accepts rows in three formats (auto-detected):
- JSON array: [{...}, {...}]
- Object with rows key: {"rows": [{...}, ...]}
- Raw NDJSON: {...}\n{...}\n
Returns: {"accepted": N}
Requires PFC_INGEST_DIR to be set (returns 503 otherwise).
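One way to auto-detect the three body formats is to try parsing the whole body as a single JSON document first and fall back to line-by-line NDJSON. The sketch below is an assumption about the approach, not the gateway's actual code; `parse_ingest_body` is a hypothetical name.

```python
import json
from typing import List


def parse_ingest_body(body: bytes) -> List[dict]:
    """Return the rows contained in a JSON array, {"rows": [...]}, or NDJSON body."""
    text = body.decode("utf-8").strip()
    if not text:
        return []
    try:
        doc = json.loads(text)
    except json.JSONDecodeError:
        # Several JSON documents in one body: treat each non-empty line as a row.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    if isinstance(doc, list):
        return doc
    if isinstance(doc, dict) and isinstance(doc.get("rows"), list):
        return doc["rows"]
    if isinstance(doc, dict):
        return [doc]  # a single NDJSON line
    raise ValueError("unsupported ingest body")
```

One ambiguity in this approach: a single NDJSON row that itself has a list-valued `rows` field would be read as the wrapper format, so the sketch treats that key as reserved.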
POST /ingest/flush force-compresses the current buffer to a .pfc file immediately.
Returns: {"flushed": true, "rows": N, "file": "/path/to/ingest_<ts>.pfc"} or {"flushed": false, "reason": "empty"}.
GET /ingest/status returns buffer statistics: row count, byte size, age since last flush, last output file, rotation thresholds.
| Variable | Default | Description |
|---|---|---|
| PFC_API_KEY | (none, auth off) | API key for X-API-Key header |
| PFC_JSONL_BINARY | /usr/local/bin/pfc_jsonl | Path to pfc_jsonl binary |
| PFC_HOST | 0.0.0.0 | Bind address |
| PFC_PORT | 8765 | Port |
| PFC_PRESIGN_TTL | 3600 | Pre-signed URL TTL in seconds |
| AWS_DEFAULT_REGION | eu-central-1 | AWS region for S3 |
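The `PFC_API_KEY` behaviour (auth off when unset, otherwise the `X-API-Key` header must match) might look roughly like the check below. This is a hedged sketch rather than the gateway's implementation: `is_authorized` is a hypothetical name, and the constant-time comparison is a design choice added here.

```python
import hmac
from typing import Mapping, Optional


def is_authorized(headers: Mapping[str, str], api_key: Optional[str]) -> bool:
    """Check the X-API-Key header against the configured key."""
    if not api_key:
        return True  # PFC_API_KEY unset: authentication is off
    supplied = ""
    for name, value in headers.items():
        # HTTP header names are case-insensitive (X-API-Key, x-api-key, ...).
        if name.lower() == "x-api-key":
            supplied = value
            break
    # compare_digest avoids leaking key contents through timing differences.
    return hmac.compare_digest(supplied.encode(), api_key.encode())
```

Note that leaving `PFC_API_KEY` unset on a server bound to 0.0.0.0 exposes both query and ingest endpoints to anyone who can reach the port.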
| Variable | Default | Description |
|---|---|---|
| PFC_INGEST_DIR | (none, ingest off) | Directory for buffer + output .pfc files |
| PFC_INGEST_ROTATE_MB | 64 | Rotate when buffer reaches this size (MB) |
| PFC_INGEST_ROTATE_SEC | 3600 | Rotate when buffer is older than this (seconds) |
| PFC_INGEST_PREFIX | ingest | Output filename prefix: ingest_<ts>.pfc |
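The two rotation thresholds combine as "whichever is reached first". The sketch below shows that decision under stated assumptions: `load_thresholds` and `should_rotate` are hypothetical names, 1 MB is taken as 1024 × 1024 bytes, and an empty buffer is never rotated.

```python
from typing import Mapping, Tuple


def load_thresholds(env: Mapping[str, str]) -> Tuple[float, float]:
    """Read the rotation thresholds, falling back to the documented defaults."""
    rotate_mb = float(env.get("PFC_INGEST_ROTATE_MB", "64"))
    rotate_sec = float(env.get("PFC_INGEST_ROTATE_SEC", "3600"))
    return rotate_mb, rotate_sec


def should_rotate(buffer_bytes: int, buffer_age_sec: float,
                  rotate_mb: float = 64, rotate_sec: float = 3600) -> bool:
    """True when the buffer is non-empty and either threshold is reached."""
    if buffer_bytes <= 0:
        return False  # assumption: nothing to compress, so no rotation
    # Assumption: 1 MB = 1024 * 1024 bytes.
    too_big = buffer_bytes >= rotate_mb * 1024 * 1024
    too_old = buffer_age_sec >= rotate_sec
    return too_big or too_old
```

Passing `os.environ` to `load_thresholds` would pick up the values from the table above. Lowering `PFC_INGEST_ROTATE_SEC` produces smaller, more frequent archives on low-volume sources.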
Standard AWS variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_PROFILE) are respected automatically.
# /etc/systemd/system/pfc-gateway.service
[Unit]
Description=pfc-gateway — PFC cold archive bidirectional gateway
After=network.target
[Service]
Type=simple
User=pfc
WorkingDirectory=/opt/pfc-gateway
ExecStart=/usr/bin/uvicorn pfc_gateway:app --host 0.0.0.0 --port 8765
Restart=on-failure
Environment=PFC_API_KEY=your-secret-key
Environment=AWS_DEFAULT_REGION=eu-central-1
# Omit PFC_INGEST_DIR to disable ingest
Environment=PFC_INGEST_DIR=/data/pfc
Environment=PFC_INGEST_ROTATE_MB=64
Environment=PFC_INGEST_ROTATE_SEC=3600
[Install]
WantedBy=multi-user.target
sudo systemctl enable --now pfc-gateway
docker run -d \
-p 8765:8765 \
-e PFC_API_KEY=your-secret-key \
-e AWS_ACCESS_KEY_ID=... \
-e AWS_SECRET_ACCESS_KEY=... \
--name pfc-gateway \
impossibleforge/pfc-gateway:latest
Your data sources
          │
          ├── pfc-migrate (one-shot export)
          ├── pfc-archiver-* (autonomous daemon)
          ├── pfc-fluentbit (live pipeline)
          └── pfc-gateway (POST /ingest ← NEW) ← this repo
          │
          ▼
.pfc archives (local / S3 / Azure / GCS)
          │
┌─────────┴──────────┐
│                    │
▼                    ▼
pfc-duckdb           pfc-gateway ← this repo
SQL queries          HTTP REST
(DuckDB needed)      (no DuckDB)
│                    │
▼                    ▼
Python / CLI         Grafana / PowerBI / curl / own tools
                     Fluent Bit / Vector / Telegraf (ingest)
| Tool | What | DuckDB needed |
|---|---|---|
| pfc-migrate | One-shot export to .pfc | No |
| pfc-archiver-* | Autonomous archive daemon | No |
| pfc-fluentbit | Live pipeline → .pfc | No |
| pfc-duckdb | SQL queries on PFC files | Yes |
| pfc-gateway | HTTP REST, any tool | No |
- pfc-jsonl — core binary (compress/decompress/query)
- pfc-migrate — one-shot export and archive conversion
- pfc-fluentbit — live Fluent Bit → PFC pipeline
- pfc-duckdb — DuckDB extension for SQL queries on PFC files
- pfc-vector — high-performance Rust ingest daemon for Vector.dev and Telegraf
- pfc-otel-collector — OpenTelemetry OTLP/HTTP log exporter
- pfc-kafka-consumer — Kafka / Redpanda consumer → PFC
- pfc-telegraf — Telegraf HTTP output plugin → PFC
- pfc-grafana — Grafana data source plugin for PFC archives
ImpossibleForge — github.com/ImpossibleForge Contact: info@impossibleforge.com
pfc-gateway (this repository) is released under the MIT License — see LICENSE.
The PFC-JSONL binary (pfc_jsonl) is proprietary software, free for personal and open-source use. Commercial use requires a license: info@impossibleforge.com