RelayCore


A self-hosted event gateway that ingests, verifies, deduplicates, routes, and delivers HTTP webhook events with the same structural discipline a SOC applies to alert pipelines.


Why It's Needed

Modern software stacks generate events from dozens of producers: GitHub pushes, form submissions, calendar changes, payment callbacks. Each destination (Slack, Discord, a database, an email service) expects a different payload shape, and each producer has its own signature scheme, retry logic, and delivery guarantees. The naive solution is point-to-point glue code: a script per integration, no visibility, no retries, no deduplication.

RelayCore replaces that with a single, observable ingestion layer. Every event flows through one gateway, where it is verified and deduplicated atomically, stored immutably, then routed and transformed before delivery. Failed deliveries are retried with exponential backoff and moved to a dead-letter queue once retries are exhausted.


Who It's For

  • Engineering teams running multiple webhook integrations who want one place to monitor, debug, and replay events instead of hunting through individual service logs.
  • Students and researchers building event-driven systems who need a reference implementation of a reliable async pipeline with proper idempotency guarantees.
  • Anyone who has lost a webhook event and had no idea why.

How It Resembles a SOC Pipeline

A Security Operations Center ingests alerts from many sensors, normalises them into a common schema, applies detection rules, routes matches to analysts, and keeps an immutable audit trail. RelayCore does the same thing for application events:

| SOC Concept | RelayCore Equivalent |
| --- | --- |
| Sensor / log source | Source (one URL endpoint per producer) |
| Signature / authenticity check | HMAC-SHA256 verification per source |
| Deduplication / event correlation | Atomic Redis SET NX idempotency check |
| Alert normalisation | Transformer (payload shape per destination) |
| Detection rule / routing rule | Route (source + event type + JSONPath condition) |
| Alert fan-out to analysts | Fan-out delivery to multiple destinations |
| Retry / escalation policy | Exponential backoff → dead-letter queue |
| SIEM audit log | Immutable WebhookDelivery table |
| SOC dashboard | React monitoring dashboard |

The analogy is not cosmetic. The architectural problems are identical: at-least-once delivery, race-free deduplication, observable routing, and graceful degradation when a downstream system is slow or down.


Architecture

┌─────────────────────────────────────────────────────────┐
│                     Producers                           │
│         GitHub · Google Calendar · HTML Forms           │
└───────────────────────┬─────────────────────────────────┘
                        │  POST /webhooks/receive/<slug>/
                        ▼
┌─────────────────────────────────────────────────────────┐
│                  Ingestion Layer (Django)                │
│                                                         │
│  1. Resolve Source by slug                              │
│  2. HMAC-SHA256 signature verification                  │
│  3. Source-level rate limit  (Redis sliding window)     │
│  4. Idempotency check        (Redis atomic SET NX)      │
│  5. Persist WebhookDelivery row  (status = received)    │
│  6. Enqueue Celery task → return 200 immediately        │
└───────────────────────┬─────────────────────────────────┘
                        │
                        ▼
              ┌─────────────────┐
              │  Redis (broker) │
              └────────┬────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────────┐
│              Delivery Worker (Celery)                   │
│                                                         │
│  • find_matching_routes()  — event type + JSONPath      │
│  • Per-route rate limit check                           │
│  • transformer.transform(payload)                       │
│  • httpx.post(destination.url)                          │
│  • On failure: retry with 2^n second backoff (max 5)    │
│  • On max retries: status = dead_lettered               │
└───────────────┬─────────────────────────────────────────┘
                │
    ┌───────────┼───────────┐
    ▼           ▼           ▼
  Slack     Discord      Database / Email / any HTTP

┌─────────────────────────────────────────────────────────┐
│           Celery Beat  (every 60 s)                     │
│  collect_metrics() → MetricPoint rows                   │
│  success rate · queue depth · dead letters ·            │
│  throughput · duplicates · sig failures                 │
└─────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────┐
│           React Dashboard  (localhost:5173)             │
│  Overview · Sources · Destinations · Routes ·           │
│  Deliveries  — live metrics, 10 s auto-refresh          │
└─────────────────────────────────────────────────────────┘
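
The retry policy named in the delivery worker box (2^n second backoff, at most 5 retries, then dead-lettered) can be sketched as below. This is an illustration only: counting n from 1 and the `retrying` status name are assumptions, not confirmed details of RelayCore's task code.

```python
from typing import List

MAX_RETRIES = 5  # "max 5" in the diagram


def backoff_delay(attempt: int) -> int:
    """Seconds to wait before retry number `attempt` (assumed 1-based): 2^attempt."""
    return 2 ** attempt


def retry_schedule(max_retries: int = MAX_RETRIES) -> List[int]:
    """Full list of delays, one per allowed retry."""
    return [backoff_delay(n) for n in range(1, max_retries + 1)]


def status_after_failure(retries_used: int, max_retries: int = MAX_RETRIES) -> str:
    """'retrying' is a hypothetical status name; 'dead_lettered' is from the diagram."""
    return "retrying" if retries_used < max_retries else "dead_lettered"
```

Under these assumptions a permanently failing destination is retried over roughly one minute in total before the delivery is dead-lettered.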

Screenshots

Django Admin Login Dashboard Routes Deliveries


Key Design Decisions

HMAC-SHA256 per source, not per system. Each source can require signature verification — the scheme GitHub uses — so only authenticated producers can inject events into the pipeline. A request that fails verification is logged as sig_failed and dropped before any processing occurs.

Atomic deduplication over GET-then-SET. Two concurrent requests carrying the same idempotency key would both pass a GET check before either writes. SET NX is a single atomic Redis operation — only one caller wins, guaranteed.

200 on duplicates, never 4xx. Returning an error code to a provider like GitHub signals a delivery failure and triggers their retry logic, flooding the system with the exact events you are trying to suppress.

At-least-once, not exactly-once. If the ingestion process crashes after the Redis SET NX succeeds but before the WebhookDelivery row is written, the event is lost: the producer's retry carries the same idempotency key and is treated as a duplicate until the key expires. Exactly-once across two independent systems (Redis + Postgres) requires distributed transactions; the tradeoff is not worth it for application webhooks.

Fan-out, not first-match-wins. All matching routes execute for every delivery. A push event can simultaneously notify Slack and write to Discord. Routes are evaluated in priority order but none short-circuits the rest.
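
A minimal sketch of fan-out matching follows. The `Route` dataclass, the `"*"` wildcard spelling, the "lower priority number first" ordering, and the dotted-path `lookup` helper (a simplified stand-in for the JSONPath conditions RelayCore evaluates with jsonpath-ng) are all assumptions made for illustration. Only the name `find_matching_routes` comes from the project, and its real signature may differ.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

_MISSING = object()  # sentinel so a missing path never equals an expected None


@dataclass
class Route:
    name: str
    event_type: str  # "*" is assumed to match any event type
    priority: int = 0
    conditions: Dict[str, Any] = field(default_factory=dict)  # dotted path -> expected value
    active: bool = True


def lookup(payload: Any, dotted_path: str) -> Any:
    """Walk nested dicts along 'a.b.c'; a stand-in for a JSONPath query."""
    current = payload
    for part in dotted_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current


def matches(route: Route, event_type: str, payload: dict) -> bool:
    if not route.active or route.event_type not in ("*", event_type):
        return False
    # Every condition must hold for the route to match.
    return all(lookup(payload, path) == want for path, want in route.conditions.items())


def find_matching_routes(routes: List[Route], event_type: str, payload: dict) -> List[Route]:
    """Return every matching route in priority order; no match short-circuits the rest."""
    return sorted((r for r in routes if matches(r, event_type, payload)), key=lambda r: r.priority)
```

The caller would then deliver to each returned route in turn, which is what makes a single push event reach both Slack and Discord.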

Auth headers encrypted at rest. Destination.auth_header (Bearer tokens, API keys) is stored using Fernet symmetric encryption. The plaintext value only lives in memory during the Celery delivery task.


Security

RelayCore treats security as a layered concern — each layer is independent so a bypass of one does not compromise the others.

Inbound webhook authenticity

Every source can be configured with an HMAC-SHA256 secret. When a webhook arrives, RelayCore recomputes the signature over the raw request body using that secret and compares it against the X-Hub-Signature-256 header sent by the producer. The comparison uses Python's hmac.compare_digest() rather than ==, which prevents timing attacks — an attacker who measures response time cannot learn how many bytes of a guessed signature are correct.

A request that fails verification is immediately rejected with HTTP 401. A WebhookDelivery row is still written with status=sig_failed so the attempt is auditable, but no Celery task is enqueued and no processing occurs.
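
The verification step described above can be sketched with the standard library alone. The function name `verify_signature` is hypothetical, and the `sha256=<hex digest>` header layout is the one GitHub documents for X-Hub-Signature-256; RelayCore's actual helper may be organised differently.

```python
import hashlib
import hmac
from typing import Optional

PREFIX = "sha256="


def verify_signature(secret: bytes, raw_body: bytes, header_value: Optional[str]) -> bool:
    """Recompute the HMAC-SHA256 of the raw body and compare in constant time."""
    if not header_value or not header_value.startswith(PREFIX):
        return False  # missing or malformed header
    expected = PREFIX + hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # Compare as bytes so non-ASCII header garbage cannot raise TypeError.
    return hmac.compare_digest(expected.encode("utf-8"), header_value.encode("utf-8"))
```

The signature must be computed over the raw request bytes, not over re-serialised JSON, because any change in whitespace or key order produces a different digest.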

Replay attack prevention

Providers like GitHub retry failed deliveries. Without deduplication, a network blip could result in the same event being processed multiple times. RelayCore uses an atomic Redis SET NX EX operation to register each event's idempotency key the first time it is seen. SET NX is a single atomic Redis command — it is not a GET followed by a SET, which would have a race condition where two concurrent requests with the same key could both pass the check. Only one caller wins. A second request carrying the same key within the 24-hour TTL window returns HTTP 200 with {"status": "duplicate"} and is logged but not processed.
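
As a rough sketch of that check: `is_duplicate` below issues one `set(..., nx=True, ex=...)` call, which is the redis-py spelling of SET NX EX. The `idem:` key prefix and the function name are assumptions, and `FakeRedis` is a tiny in-memory stand-in included only so the example runs without a server. It is not atomic across processes the way a real Redis server is.

```python
import time
from typing import Callable, Dict, Optional, Tuple

DEDUP_TTL_SECONDS = 24 * 60 * 60  # the 24-hour window described above


class FakeRedis:
    """In-memory stand-in that mimics only redis-py's set(name, value, ex=, nx=)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._data: Dict[str, Tuple[str, Optional[float]]] = {}
        self._clock = clock

    def set(self, name: str, value: str, ex: Optional[int] = None, nx: bool = False):
        now = self._clock()
        current = self._data.get(name)
        if current is not None and current[1] is not None and current[1] <= now:
            current = None  # the key has expired
        if nx and current is not None:
            return None  # redis-py returns None when NX prevents the write
        self._data[name] = (value, now + ex if ex is not None else None)
        return True


def is_duplicate(client, idempotency_key: str, ttl: int = DEDUP_TTL_SECONDS) -> bool:
    """True if the key was already registered; False if this call registered it."""
    created = client.set(f"idem:{idempotency_key}", "1", nx=True, ex=ttl)
    return not created
```

With a real `redis.Redis` client passed as `client`, the check and the write happen in one server-side command, so two concurrent callers cannot both see `False`.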

Rate limiting

Two independent rate limits protect against volume abuse. At the source level, the ingestion view checks a Redis sliding-window counter before creating any database rows — a producer that exceeds its configured requests-per-minute limit receives HTTP 429 with a Retry-After: 60 header. At the route level, the Celery delivery task checks a separate counter per route — a route that is over its limit is skipped for that delivery but other matching routes still execute, so a misbehaving route does not block fan-out to other destinations.
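
The sliding-window idea can be illustrated in memory as follows. RelayCore keeps its counters in Redis, so this class is only a model of the algorithm; the class name and the per-key deque of timestamps are assumptions for the sketch.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` hits per key within the trailing `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window_seconds
        self._clock = clock
        self._hits: Dict[str, Deque[float]] = {}

    def allow(self, key: str) -> bool:
        now = self._clock()
        hits = self._hits.setdefault(key, deque())
        # Drop timestamps that have slid out of the window.
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # the caller would answer 429 with Retry-After
        hits.append(now)
        return True
```

Keeping one limiter keyed by source and another keyed by route gives the two independent limits described above: a rejected route key does not affect any other route's window.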

SSRF protection

Server-Side Request Forgery is a class of attack where a server is tricked into making HTTP requests to internal infrastructure on behalf of an attacker. In a webhook relay this is a real risk — a user with dashboard access could create a destination pointing to http://169.254.169.254/latest/meta-data/ (the AWS instance metadata endpoint) or http://192.168.1.1/ (an internal router). RelayCore blocks this at two independent points.

The first check runs in Destination.clean() when a destination is saved through the API. The URL's hostname is resolved to an IP address and validated against a blocklist of RFC 1918 private ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16), loopback (127.0.0.0/8), and link-local addresses (169.254.0.0/16, which covers the AWS metadata endpoint). If the resolved IP falls in any blocked range, the save is rejected with a validation error.

The second check runs inside the Celery delivery task immediately before the outbound HTTP POST. This is defence in depth — it catches any destination rows that were inserted directly into the database bypassing the API. A blocked URL raises a ValueError which is treated as a delivery failure and follows the normal retry and dead-letter flow.
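
A sketch of the blocklist check, using only the ranges listed above. The function names are hypothetical, and the real Destination.clean() and delivery-task code may resolve and validate hostnames differently.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",  # RFC 1918 private ranges
        "127.0.0.0/8",                                     # loopback
        "169.254.0.0/16",                                  # link-local, incl. 169.254.169.254
    )
]


def is_blocked_ip(ip: str) -> bool:
    addr = ipaddress.ip_address(ip)
    return any(addr in network for network in BLOCKED_NETWORKS)


def assert_url_allowed(url: str) -> None:
    """Raise ValueError if the URL's host is, or resolves to, a blocked address."""
    host = urlsplit(url).hostname
    if not host:
        raise ValueError(f"URL has no hostname: {url!r}")
    try:
        candidates = [str(ipaddress.ip_address(host))]  # host is already an IP literal
    except ValueError:
        # Check every address the name resolves to, not just the first one.
        candidates = [info[4][0] for info in socket.getaddrinfo(host, None)]
    for ip in candidates:
        if is_blocked_ip(ip):
            raise ValueError(f"destination {host!r} resolves to blocked address {ip}")
```

Calling the same function at save time and again right before the outbound POST mirrors the two checkpoints described above.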

Brute-force login protection

The login endpoint (/api/auth/login/) is protected by django-axes. After 5 consecutive failed login attempts for the same username, that account is locked out for one hour regardless of which IP address the attempts came from. The lockout counter resets automatically on a successful login. Tracking by username rather than IP prevents credential stuffing even from distributed sources where each attempt originates from a different address.
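
For reference, a settings fragment along these lines would produce the behaviour described. The setting names follow the django-axes documentation for recent releases (older releases used a different option for username-only tracking), and the exact values in RelayCore's settings file are an assumption.

```python
from datetime import timedelta

# Lock out after 5 consecutive failures.
AXES_FAILURE_LIMIT = 5

# Keep the lockout in place for one hour.
AXES_COOLOFF_TIME = timedelta(hours=1)

# A successful login clears the failure counter.
AXES_RESET_ON_SUCCESS = True

# Track failures by username only, regardless of source IP.
AXES_LOCKOUT_PARAMETERS = ["username"]
```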

Encrypted credentials at rest

Destination auth_header values (Bearer tokens, API keys) are encrypted before being written to the database using Fernet symmetric encryption from Python's cryptography library. The encryption key is held in the environment variable FIELD_ENCRYPTION_KEY and never touches the database. A complete database dump does not expose any usable credentials — the ciphertext is unreadable without the key. The decryption happens transparently in memory inside the Celery delivery task and the plaintext value is never persisted anywhere.

User action audit log

Every create, update, and delete operation on Sources, Destinations, and Routes is recorded by django-auditlog with the acting user, a UTC timestamp, and a field-level diff showing the exact values that changed. The auth_header field is explicitly excluded from diffs so encrypted credential values never appear in the audit log. The log is viewable in the Audit Log page of the dashboard and is stored in the database as an immutable append-only record.

HTTPS and secure cookies

When DEBUG=False (any non-local deployment), RelayCore activates SECURE_SSL_REDIRECT to redirect all HTTP traffic to HTTPS, sets HSTS headers with a one-year max-age to instruct browsers to always use HTTPS for the domain, and marks both the session cookie and the CSRF cookie as Secure so they are never transmitted over plain HTTP.
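
The four Django settings involved can be grouped as in the sketch below. The helper function is hypothetical (a settings module could just as well use a plain `if not DEBUG:` block); the setting names themselves are standard Django settings.

```python
from typing import Any, Dict

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60


def production_security_settings(debug: bool) -> Dict[str, Any]:
    """Return the HTTPS-related settings to enable when DEBUG is off."""
    if debug:
        return {}
    return {
        "SECURE_SSL_REDIRECT": True,              # redirect HTTP to HTTPS
        "SECURE_HSTS_SECONDS": ONE_YEAR_SECONDS,  # HSTS max-age of one year
        "SESSION_COOKIE_SECURE": True,            # session cookie only over HTTPS
        "CSRF_COOKIE_SECURE": True,               # CSRF cookie only over HTTPS
    }


# In a settings module this could be applied with:
# globals().update(production_security_settings(DEBUG))
```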


Threat Model

What RelayCore protects against

  • Event spoofing: HMAC-SHA256 verification ensures only producers who hold the shared secret can inject events into a source. A forged request fails before any DB write or task is enqueued. Signature comparison uses hmac.compare_digest() to prevent timing attacks.
  • Replay attacks: the idempotency layer detects and drops repeated delivery of the same event. Even if an attacker captures and replays a valid signed request, the second delivery is silently absorbed (logged as duplicate) without re-processing.
  • Abuse / DDoS via high event volume: source-level and route-level rate limits (Redis sliding window) cap events per minute. Requests over the limit receive 429 with a Retry-After header.
  • Credential leakage in storage: destination auth headers (API keys, Bearer tokens) are encrypted at rest with Fernet. A full database dump does not expose usable credentials.
  • SSRF via destination URLs: destination URLs are validated against RFC 1918 private ranges, loopback, and link-local addresses (including the AWS metadata endpoint 169.254.169.254) both at save time and again in the delivery worker before posting. Attempts to route events to internal infrastructure are blocked at both layers.
  • Brute-force login attacks: django-axes locks out a username after 5 failed login attempts for 1 hour, with automatic reset on successful login. Tracking by username rather than IP remains effective against distributed attacks where each attempt uses a different source address.
  • User action accountability: every create, update, and delete on Sources, Destinations, and Routes is recorded via django-auditlog with actor, timestamp, and field-level diff. auth_header is excluded from diffs so encrypted credentials never appear in the log.
  • HTTPS in production: SECURE_SSL_REDIRECT, HSTS, and secure cookie flags activate automatically when DEBUG=False.

What it explicitly does not protect against

  • Compromised HMAC secrets: if a producer's secret is leaked, an attacker can craft valid signatures indefinitely. Rotate secrets via the Sources page; there is no automatic secret rotation.
  • Payload content: RelayCore verifies the envelope (signature, rate limit, idempotency) but does not inspect or sanitise the payload body. Transformers receive raw producer data.
  • Man-in-the-middle on outbound delivery: httpx uses system CA certificates but does not enforce certificate pinning to destinations.

What would be added in a production hardening pass

  • Automatic HMAC secret rotation with overlap window for zero-downtime key changes.
  • mTLS for inbound producer connections where the producer supports client certificates.

Performance

Run the included load test against your deployment:

python load_test.py --url http://localhost --slug your-source-slug --requests 300 --concurrency 20

Results on a development laptop running the full Docker stack (4 gunicorn workers, single Celery worker, Redis and Postgres in containers):

Note: this run used the GitHub source slug.

  • Accepted: 1 (first unique request)
  • Duplicates: 99 (idempotency working, same payload detected and dropped cleanly)
  • Rate limited: 200 (source limit of 100 req/min enforced correctly)

| Metric | Result |
| --- | --- |
| Throughput | 66.9 req/s |
| Wall time | 4.49 s |
| Latency p50 | 271.7 ms |
| Latency p95 | 378.6 ms |
| Latency p99 | 482.0 ms |
| Success rate | 100 % |

Note: the ingestion view returns 200 immediately after enqueuing, so latency here measures the time to accept and persist the event, not the time to deliver it to the destination. The 429 responses are the rate limiter enforcing the 100 req/min source limit, not failures.


Stack

| Component | Technology |
| --- | --- |
| Backend | Django 4.2, Django REST Framework 3.15 |
| Database | PostgreSQL |
| Task queue | Celery 5.3 + Celery Beat |
| Broker / cache / rate limit | Redis 5 |
| Outbound HTTP | httpx |
| Encryption | cryptography (Fernet) |
| JSONPath routing | jsonpath-ng |
| Frontend | React 18, TypeScript, Vite 5 |
| Styling | Tailwind CSS 3, Material Icons Round, Roboto Slab |
| Data fetching | TanStack Query v5, Axios |
| Audit logging | django-auditlog 3.0 |

Docker

The entire stack — Django, Celery worker, Celery Beat, Redis, Postgres, and the React frontend served by nginx — runs with a single command:

# First time or after code changes — rebuilds images
docker compose up --build

# Subsequent runs — uses existing images, starts faster
docker compose up

Open http://localhost in your browser. The nginx container proxies /api/ and /webhooks/ to Django and serves the React SPA for everything else.

First run only — create a superuser after the containers are up:

docker compose exec web python manage.py createsuperuser

Environment variables

Copy .env.example to .env and fill in your values:

cp .env.example .env

Set at minimum:

SECRET_KEY=your-django-secret-key
FIELD_ENCRYPTION_KEY=your-fernet-key
DB_PASSWORD=postgres

The compose file injects DB_HOST, DB_PORT, and REDIS_URL automatically — do not set those in .env when running via Docker.

Pushing to Docker Hub

# Build and tag
docker build -t imann122/relaycore:latest .
docker build -t imann122/relaycore-frontend:latest ./frontend

# Push
docker push imann122/relaycore:latest
docker push imann122/relaycore-frontend:latest

Then update docker-compose.yml to use image: imann122/relaycore:latest instead of build: . for the web, worker, and beat services to pull pre-built images instead of building locally.


Prerequisites

  • Python 3.9+
  • Node.js 18+
  • PostgreSQL 14+
  • Redis 6+

Setup

git clone <repo-url>
cd relaycore_project

python -m venv .venv
.venv\Scripts\activate          # Windows
# source .venv/bin/activate     # macOS / Linux

pip install -r requirements.txt

Create .env in the project root:

SECRET_KEY=your-django-secret-key
DEBUG=True
DATABASE_URL=postgres://user:password@localhost:5432/relaycore
REDIS_URL=redis://localhost:6379/0
FIELD_ENCRYPTION_KEY=your-fernet-key
CORS_ALLOWED_ORIGINS=http://localhost:5173

Generate a Fernet key:

python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"

Then apply the migrations and create a superuser:

python manage.py migrate
python manage.py createsuperuser

cd frontend && npm install && cd ..

Running

Four processes, each in its own terminal with the venv activated:

# 1 — Django
python manage.py runserver

# 2 — Celery worker  (-P solo required on Windows)
celery -A relaycore worker -P solo -l info

# 3 — Celery Beat
celery -A relaycore beat -l info \
  --scheduler django_celery_beat.schedulers:DatabaseScheduler

# 4 — React dev server
cd frontend && npm run dev

Dashboard: http://localhost:5173 — log in with your superuser credentials.


Sending a Webhook

curl -X POST http://localhost:8000/webhooks/receive/<source-slug>/ \
  -H "Content-Type: application/json" \
  -H "X-GitHub-Event: push" \
  -d '{"repository": {"full_name": "you/repo"}, "pusher": {"name": "you"}, "ref": "refs/heads/main", "commits": []}'

Returns {"status": "accepted", "delivery_id": N}. Send the same payload twice to observe deduplication — the second call returns {"status": "duplicate"}.
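
When the source has signature verification enabled, the request also needs an X-Hub-Signature-256 header. A sketch of building such a request in Python follows; the helper name, the example secret, and the example slug in the usage note are placeholders, and the `sha256=<hex digest>` layout is the GitHub convention.

```python
import hashlib
import hmac
import json
from urllib.request import Request


def build_signed_request(url: str, secret: bytes, payload: dict, event: str = "push") -> Request:
    """Build (but do not send) a POST whose signature covers the exact body bytes."""
    body = json.dumps(payload).encode("utf-8")
    signature = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    return Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-GitHub-Event": event,
            "X-Hub-Signature-256": signature,
        },
    )
```

Passing the returned object to `urllib.request.urlopen()` sends it, for example against `http://localhost:8000/webhooks/receive/<source-slug>/` with the same secret configured on the source.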


GitHub → Slack

The primary demo integration: a GitHub push event arrives at RelayCore, gets verified, routed, and transformed into a Slack message delivered via Slack's Incoming Webhooks API.

How Slack Incoming Webhooks work

Slack exposes a unique HTTPS URL per channel (generated at api.slack.com/apps). Any HTTP POST to that URL with a JSON body containing a text field appears as a message in that channel — no OAuth flow, no bot token, no Slack SDK. The GitHubToSlackTransformer produces exactly that shape:

{
  "text": "🚀 *iman* pushed to `refs/heads/main` on *iman/relaycore*\n• `a1b2c3d` docs: add screenshots to README"
}

RelayCore POSTs this to the Slack webhook URL stored on the Destination model (encrypted at rest). Slack renders it with the pusher name bolded, the branch and repo in code formatting, and each commit as a bullet with its short SHA.
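
A sketch of how a push payload could be mapped to that shape is shown below. It is written as a plain function with a hypothetical name rather than as the project's GitHubToSlackTransformer, whose actual field handling may differ; the payload keys used (`pusher.name`, `ref`, `repository.full_name`, `commits[].id`, `commits[].message`) are those of GitHub's push event.

```python
def github_push_to_slack(payload: dict) -> dict:
    """Build a Slack Incoming Webhook body ({"text": ...}) from a GitHub push payload."""
    pusher = payload.get("pusher", {}).get("name", "someone")
    ref = payload.get("ref", "")
    repo = payload.get("repository", {}).get("full_name", "")
    lines = [f"🚀 *{pusher}* pushed to `{ref}` on *{repo}*"]
    for commit in payload.get("commits", []):
        short_sha = commit.get("id", "")[:7]
        message = commit.get("message", "")
        # Only the first line of a multi-line commit message is shown.
        summary = message.splitlines()[0] if message else ""
        lines.append(f"• `{short_sha}` {summary}")
    return {"text": "\n".join(lines)}
```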

Setting it up

  1. Go to api.slack.com/apps → Create New App → From scratch.
  2. Under Features → Incoming Webhooks, toggle on and add a webhook to your target channel.
  3. Copy the webhook URL (https://hooks.slack.com/services/...).
  4. In the RelayCore dashboard: create a Destination with that URL, create a Source with slug github-production and signature scheme github_hmac, create a Route connecting them with transformer github_to_slack.
  5. Point your GitHub repository webhook at https://your-relay-host/webhooks/receive/github-production/ with the same secret you set on the Source.

Every push to that repository will now appear in your Slack channel within seconds, with HMAC verification, deduplication, and retry guarantees handled by RelayCore.

Other built-in transformers

GitHubToDiscordTransformer produces a Discord embed with the same commit information, posted to a Discord channel via its webhook URL — identical setup, different destination.

CalendarToDatabaseTransformer handles Google Calendar push notifications, which carry their metadata in HTTP headers (X-Goog-Channel-ID, X-Goog-Resource-State) rather than the body. RelayCore's ingestion layer extracts these headers and merges them into the payload under __goog_meta before the transformer runs, so the transformer has everything it needs in one dict.
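
The header merge might look roughly like this. Only the `__goog_meta` key and the two header names come from the description above; the function name, the lower-cased key format inside `__goog_meta`, and the choice to copy every `X-Goog-*` header are assumptions.

```python
from typing import Dict, Mapping

GOOG_HEADER_PREFIX = "x-goog-"


def merge_goog_headers(headers: Mapping[str, str], payload: dict) -> dict:
    """Return a copy of payload with X-Goog-* headers stored under '__goog_meta'."""
    meta: Dict[str, str] = {
        name.lower(): value
        for name, value in headers.items()
        if name.lower().startswith(GOOG_HEADER_PREFIX)
    }
    merged = dict(payload)  # leave the caller's dict untouched
    if meta:
        merged["__goog_meta"] = meta
    return merged
```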

FormToEmailTransformer converts an HTML contact form POST into a generic transactional email API shape compatible with SendGrid, Mailgun, and Postmark — swap the destination URL to change the email provider without touching the transformer.


Adding a Transformer

  1. Create apps/transformers/your_transformer.py extending BaseTransformer, implement transform(self, payload: dict) -> dict.
  2. Add an entry to TRANSFORMER_CHOICES in apps/transformers/choices.py.
  3. Add an entry to TRANSFORMER_REGISTRY in apps/transformers/registry.py.
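
The three steps above can be sketched in one place. Everything here is illustrative: the `BaseTransformer` class is a stand-in for the project's own base class, `GitHubToPlainTextTransformer` is a made-up example, and the shapes assumed for `TRANSFORMER_CHOICES` (Django-style choice tuples) and `TRANSFORMER_REGISTRY` (a name-to-class dict) are guesses that should be checked against the real files.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple, Type


class BaseTransformer(ABC):
    """Stand-in for apps/transformers' BaseTransformer (assumed interface)."""

    @abstractmethod
    def transform(self, payload: dict) -> dict:
        ...


# Step 1: the new transformer (hypothetical example).
class GitHubToPlainTextTransformer(BaseTransformer):
    def transform(self, payload: dict) -> dict:
        pusher = payload.get("pusher", {}).get("name", "someone")
        repo = payload.get("repository", {}).get("full_name", "unknown repo")
        return {"message": f"{pusher} pushed to {repo}"}


# Step 2: the entry that would go in choices.py (assumed shape).
TRANSFORMER_CHOICES: List[Tuple[str, str]] = [
    ("github_to_plain_text", "GitHub to plain text"),
]

# Step 3: the entry that would go in registry.py (assumed shape).
TRANSFORMER_REGISTRY: Dict[str, Type[BaseTransformer]] = {
    "github_to_plain_text": GitHubToPlainTextTransformer,
}
```

A route configured with `github_to_plain_text` would then resolve the class through the registry and call `transform()` on the incoming payload.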

Testing

.venv\Scripts\activate          # Windows
# source .venv/bin/activate     # macOS / Linux
pytest

The test suite is split across four focused files plus one end-to-end file.

tests/test_hmac.py — unit tests for HMAC-SHA256 signature verification: valid signature passes, wrong secret fails, tampered body fails, missing header fails, malformed header fails. Then a second class tests the full view — valid request returns 200, invalid returns 401, unknown slug returns 404.

tests/test_idempotency.py: checks the Redis SET NX deduplication semantics. A key seen for the first time returns False (new event); the same key immediately after returns True (duplicate). Also covers TTL expiry: after the window expires, the same key is treated as new again.

tests/test_routing.py — covers find_matching_routes() and evaluate_conditions() exhaustively: exact event type match, wildcard route matches any event, JSONPath condition match and mismatch, multiple conditions where all must pass, multiple routes returned in priority order, inactive routes excluded, routes from a different source excluded.

tests/test_transformers.py: asserts the output shape of all four built-in transformers against known input payloads, ensuring that a GitHub push event produces a Slack text field, a Discord embeds list, and so on.

tests/test_e2e.py — end-to-end pipeline test. It fires a real HTTP POST to the ingestion view, runs the Celery task synchronously in the same process, intercepts the outbound httpx.post call, and asserts the WebhookDelivery row reaches status='delivered'. Four scenarios covered: happy path delivery, duplicate suppression (httpx called exactly once across two identical requests), sig_failed logging, and no-matching-route handling.


Project Structure

relaycore_project/
├── apps/
│   ├── api/            # DRF viewsets, serializers, REST endpoints
│   ├── core/           # Models: Source, Destination, Route, WebhookDelivery, MetricPoint
│   ├── delivery/       # Celery tasks, rate limiter
│   ├── idempotency/    # Redis SET NX deduplication
│   ├── routing/        # JSONPath + event-type route evaluator
│   └── transformers/   # BaseTransformer + registry + 4 built-in transformers
├── frontend/src/
│   ├── api/            # Axios clients per resource
│   ├── components/     # Layout, Modal, UI primitives
│   └── pages/          # Overview, Sources, Destinations, Routes, Deliveries, AuditLog
├── tests/              # pytest suite
├── relaycore/          # Django settings, Celery bootstrap, root URLs
└── requirements.txt
