Facebook crawler — three deployable units (api + web + crawler) backed by Postgres (operational state) and Neo4j (graph).
apps/
api/ Hono on Bun — REST endpoints, webhook dispatcher, serves SPA
web/ React + Vite — dashboard SPA
crawler/ Node 24 LTS — Trigger.dev tasks running Playwright (Local | Browserbase)
packages/
db/ Drizzle schemas + migrations (Postgres)
graph/ Neo4j driver + Cypher schema + repository fns
browser/ BrowserDriver interface + Local + Browserbase impls
crypto/ AES-GCM session encryption
shared/ Zod schemas, types, URL parsers
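The crypto/ package is described only as "AES-GCM session encryption", keyed by the 32-byte hex SESSION_ENCRYPTION_KEY. The following is a minimal sketch of what that could look like with node:crypto, not the package's actual code. The function names (parseKey, encryptSession, decryptSession) and the payload layout (base64 of IV, auth tag, ciphertext) are assumptions made for illustration.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Validate the key early: AES-256 needs exactly 32 bytes, i.e. 64 hex chars.
function parseKey(hex: string): Buffer {
  if (!/^[0-9a-fA-F]{64}$/.test(hex)) {
    throw new Error("SESSION_ENCRYPTION_KEY must be 64 hex characters (32 bytes)");
  }
  return Buffer.from(hex, "hex");
}

// Hypothetical wire format: base64(iv[12] || authTag[16] || ciphertext).
function encryptSession(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12); // fresh 96-bit nonce per message
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag(); // 16 bytes by default
  return Buffer.concat([iv, tag, ciphertext]).toString("base64");
}

function decryptSession(payload: string, key: Buffer): string {
  const raw = Buffer.from(payload, "base64");
  const iv = raw.subarray(0, 12);
  const tag = raw.subarray(12, 28);
  const ciphertext = raw.subarray(28);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  // final() throws if the tag does not match (tampered data or wrong key).
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

In this sketch, a caller would run parseKey(process.env.SESSION_ENCRYPTION_KEY ?? "") once at startup so that a malformed key fails immediately rather than on the first encryption.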
- Runtime: Bun for api and web; Node 24 LTS for crawler (Playwright reliability)
- API: Hono with structured logging (pino), request IDs, rate limiting, graceful shutdown
- Frontend: React + Vite + TanStack Query + Tailwind, error boundary + toasts
- Job orchestration: Trigger.dev v4 (self-hosted at trigger.raqz.link)
- DBs: Postgres 16, Neo4j 5
- ORM: Drizzle (Postgres) + neo4j-driver (Neo4j)
cp .env.example .env
# Generate a 32-byte hex key for SESSION_ENCRYPTION_KEY:
openssl rand -hex 32
# Set APP_SECRET to anything reasonably long.
bun install
bun run dev:db # postgres + neo4j via docker compose
bun --filter @fbeye/db migrate # apply schema
bun run dev:api # :3000
bun run dev:web # :5173 (proxies /api → :3000)
bun run dev:crawler # trigger.dev local dev
Postgres listens on :5434 (avoids :5432 system + :5433 mtcuteweb). Neo4j listens on :7474/:7687.
docker-compose.dev.yml has an app profile that builds Dockerfile.api/Dockerfile.crawler with their dev targets and bind-mounts source for hot reload (Bun --watch, Vite HMR, Trigger.dev dev).
# DBs only (default, identical to bun run dev:db):
docker compose -f docker-compose.dev.yml up -d
# Full stack with hot reload:
docker compose -f docker-compose.dev.yml --profile app up
# api → http://localhost:3000
# web → http://localhost:5173
# crawler logs in foreground; trigger.dev dev needs `npx trigger.dev login`
# at least once on the host (creds are mounted via your shell's home).
Each containerized workspace has its own *-node-modules named volume so the host's bind mount doesn't clobber the container's installed deps.
The repo ships three multi-target Dockerfiles and a production docker-compose.yml orchestrating postgres → migrate (one-shot) → api → optional crawler-runner, all with health gates. Images are pushed to GHCR by CI on every push to main:
- ghcr.io/<owner>/facebook-eye-api:{latest,<sha>}: Hono API + bundled SPA (Bun runtime, non-root)
- ghcr.io/<owner>/facebook-eye-migrate:{latest,<sha>}: one-shot Drizzle migrator
- (ghcr.io/<owner>/facebook-eye-crawler:{latest,<sha>}: Node 24 + Playwright + Trigger.dev SDK; built locally, optional in prod)
openssl rand -hex 32 # SESSION_ENCRYPTION_KEY (must be 32 bytes hex / 64 chars)
openssl rand -hex 24 # APP_SECRET
openssl rand -hex 24 # INTERNAL_API_SECRET
cp .env.compose.example .env.compose
# Fill in everything; required vars use ${VAR:?required} so compose will fail
# fast if any are missing.
cd apps/crawler
npx trigger.dev@latest login --api-url https://trigger.raqz.link
npx trigger.dev@latest init --project-ref <new-project-ref>
# Paste the ref into TRIGGER_PROJECT_ID in .env.compose, and the secret key
# from the Trigger.dev dashboard into TRIGGER_SECRET_KEY.
# Pull pre-built images from GHCR (or omit `pull` to build locally):
docker compose --env-file .env.compose pull
docker compose --env-file .env.compose up -d
docker compose --env-file .env.compose logs -f api
The api will:
- block on postgres healthcheck
- wait for the one-shot migrate container to exit cleanly
- expose /health (always 200) and /ready (200 only when DB is reachable)
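The difference between the two probes can be sketched without any framework. This is an illustration, not the repo's Hono handlers: the names health, ready and pingDb, the 2-second timeout, and the 503 status for the not-ready case are all assumptions (the README only states that /ready returns 200 when the DB is reachable).

```typescript
type ProbeResult = { status: number; body: { ok: boolean; reason?: string } };

// Liveness: answers as long as the process is up; never touches the DB.
function health(): ProbeResult {
  return { status: 200, body: { ok: true } };
}

// Readiness: 200 only if the DB ping resolves before the timeout.
// `pingDb` is a hypothetical callback, e.g. one that runs `SELECT 1`.
async function ready(
  pingDb: () => Promise<unknown>,
  timeoutMs: number = 2000,
): Promise<ProbeResult> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("db ping timed out")), timeoutMs);
  });
  try {
    await Promise.race([pingDb(), timeout]);
    return { status: 200, body: { ok: true } };
  } catch (err) {
    const reason = err instanceof Error ? err.message : "unknown error";
    return { status: 503, body: { ok: false, reason } };
  } finally {
    clearTimeout(timer); // do not keep the event loop alive after the probe
  }
}
```

Keeping /health free of dependencies means an orchestrator restarts the container only when the process itself is stuck, while /ready can gate traffic during a DB outage.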
To run a self-hosted Trigger.dev runner on this same host (e.g. when this host has the residential IP you want crawler traffic to come from), enable the crawler-runner profile:
docker compose --env-file .env.compose --profile crawler-runner up -d
cd apps/crawler
npx trigger.dev@latest deploy
Run a self-hosted Trigger.dev runner on whatever host has the residential IP you want the crawler to use; it'll pick up crawl-target jobs.
- Health/readiness: GET /health (process up), GET /ready (DB reachable)
- Logs: pino JSON; pipe through pino-pretty locally
- Auth: dashboard sends x-app-secret header (stored in localStorage). Worker→api callbacks use x-internal-secret.
- Rate limit: 120 req/min per IP on /api/* (in-memory; replace with redis if scaled out)
- Idempotency: only one pending/running crawl run per target; second POST /api/crawl-runs returns 409 with the existing run id
- Cancel a stuck run: POST /api/crawl-runs/:id/cancel
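The rate-limit and idempotency rules above can be sketched in plain TypeScript. This is a rough illustration under stated assumptions, not the API's implementation: the fixed-window algorithm, the FixedWindowLimiter and createCrawlRun names, the 201 status for a newly created run, and every run status other than pending and running are hypothetical.

```typescript
// In-memory fixed-window limiter: at most `limit` hits per `windowMs` per key (e.g. client IP).
class FixedWindowLimiter {
  private hits = new Map<string, { windowStart: number; count: number }>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number = 120, windowMs: number = 60_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  allow(key: string, now: number = Date.now()): boolean {
    const entry = this.hits.get(key);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      this.hits.set(key, { windowStart: now, count: 1 }); // start a new window
      return true;
    }
    if (entry.count >= this.limit) return false; // caller would answer 429
    entry.count += 1;
    return true;
  }
}

type RunStatus = "pending" | "running" | "succeeded" | "failed" | "cancelled";
interface CrawlRun {
  id: string;
  targetId: string;
  status: RunStatus;
}

// At most one pending/running run per target; a duplicate request gets 409
// together with the id of the run that is already active.
function createCrawlRun(
  runs: CrawlRun[],
  targetId: string,
  newId: string,
): { status: 201 | 409; runId: string } {
  const active = runs.find(
    (r) => r.targetId === targetId && (r.status === "pending" || r.status === "running"),
  );
  if (active) return { status: 409, runId: active.id };
  runs.push({ id: newId, targetId, status: "pending" });
  return { status: 201, runId: newId };
}
```

Because the limiter's state lives in one process, each API replica would count separately, which is why the note above suggests redis when scaling out. Likewise, the array scan here only stands in for a database check; with concurrent requests, a uniqueness constraint in Postgres would be the more robust way to enforce the rule.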
GitHub Actions on push/PR:
- verify: format check (Prettier), lint (ESLint), typecheck (tsc), tests (bun test), web build
- docker (main only): builds & pushes ghcr.io/<owner>/facebook-eye-{api,migrate}:{latest,<sha>}
bun run typecheck # tsc across all workspaces
bun run lint # eslint
bun run format # prettier --write
bun run format:check # prettier --check (CI)
bun test # bun:test, all *.test.ts files
bun run build # build every workspace
bun run dev:db # docker compose up -d (postgres + neo4j)
bun run dev:db:down
bun run dev:api / dev:web / dev:crawler