Atlantis KB is a private, authenticated internal web platform built with Next.js. It currently ships two working product areas inside one codebase:
- Leads: a permit- and enrichment-driven contractor intelligence system focused on Georgia markets.
- COMEX: a metals pricing workspace for copper/aluminum historical prices, trend indicators, and a retrieval-augmented chat assistant.
The actual runnable app is in atlantiskb-home/.
- Auth-protected dashboards and working pages for dashboard, companies, jobs, permits, prospecting, import, and settings.
- CRUD and filtered listing APIs for companies.
- Bulk CSV preview/commit import pipeline.
- Company enrichment pipeline (website discovery, content extraction, AI + keyword fallback classification).
- Job execution framework with persisted crawl job history.
- Permit ingestion workflows with multi-source fetchers, matching, bulk-linking, rematch tools, and permit signal scoring.
- Google Places prospecting flow (search, duplicate check, add-to-company pipeline).
- Price sync endpoint pulling market data and writing CommodityPrice rows.
- Price API returning 1y history + MA30 + simple 30/60/90 day linear-regression projections.
- News sync endpoint that ingests RSS, embeds snippets, stores vectors in Postgres/pgvector, and derives price events.
- RAG chat endpoint that retrieves semantically similar news + related price events and streams LLM answers.
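
The MA30 and linear-regression projections listed for the Price API can be illustrated with a minimal sketch. This is not the code in lib/comex; the function names are hypothetical, and it assumes each array element is one daily close, so "30 days ahead" means 30 steps past the last row.

```typescript
// Trailing simple moving average; entries are null until `window` points exist.
function movingAverage(values: number[], window: number): (number | null)[] {
  const out: (number | null)[] = [];
  let sum = 0;
  for (let i = 0; i < values.length; i++) {
    sum += values[i];
    if (i >= window) sum -= values[i - window];
    out.push(i >= window - 1 ? sum / window : null);
  }
  return out;
}

// Ordinary least-squares fit of the series against its index (x = 0..n-1).
function fitLine(values: number[]): { slope: number; intercept: number } {
  const n = values.length;
  if (n < 2) return { slope: 0, intercept: n === 1 ? values[0] : 0 };
  const meanX = (n - 1) / 2;
  const meanY = values.reduce((a, b) => a + b, 0) / n;
  let num = 0;
  let den = 0;
  for (let i = 0; i < n; i++) {
    num += (i - meanX) * (values[i] - meanY);
    den += (i - meanX) ** 2;
  }
  const slope = num / den;
  return { slope, intercept: meanY - slope * meanX };
}

// Extrapolate the fitted line `stepsAhead` steps past the last observation.
function project(values: number[], stepsAhead: number): number {
  const { slope, intercept } = fitLine(values);
  return intercept + slope * (values.length - 1 + stepsAhead);
}

// Convenience wrapper for the 30/60/90 horizons mentioned above.
function projections(values: number[]): Record<"d30" | "d60" | "d90", number> {
  return { d30: project(values, 30), d60: project(values, 60), d90: project(values, 90) };
}
```

A straight-line extrapolation like this is only a trend indicator; it says nothing about volatility or confidence, which is worth keeping in mind when reading the projected values.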
- Root / page renders tool cards from lib/tools.config.ts.
- Clerk-protected sign-in and account views.
- Several source adapters intentionally remain in demo/limited mode (for example, the permit adapter in lib/sources/permits.ts).
- Some integrations rely on external credentials and will no-op or return configuration errors when not present.
- Duplicate backup-like files exist (e.g. * 2.ts) and are not clearly part of active production paths.
- COMEX RAG requires pgvector column/index setup beyond the base Prisma schema migration lifecycle.
- Framework: Next.js App Router (app/) with server routes and server components.
- Auth: Clerk middleware + server auth checks per route.
- Data layer: Prisma ORM over PostgreSQL.
- Background/sync model: triggered HTTP endpoints + persisted CrawlJob records + Vercel cron for scheduled COMEX sync.
- AI usage:
  - Company enrichment classification (lib/ai).
  - COMEX embedding generation and RAG answer generation (lib/comex).
- Next.js 16 + React 19 + TypeScript 5
- Prisma 5 + PostgreSQL
- Clerk (@clerk/nextjs)
- Tailwind CSS 4 (plus inline style usage in many components)
- Recharts (COMEX charts)
- Playwright core (permit browser scraping paths)
- RSS Parser, node-html-parser, csv-parse
- Voyage AI embeddings API (COMEX semantic retrieval)
atlantiskb-home/
  app/                  (UI routes + API routes)
    leads/              (protected pages, lead APIs)
    comex/              (COMEX UI + APIs)
  components/
    dashboard/, companies/, jobs/, permits/, prospecting/, import/, ui/, layout/
  lib/
    ai/, comex/, enrichment/, jobs/, normalization/, scoring/, signals/, permits/, sources/, validation/
  prisma/
    schema.prisma
    migrations/
    seed.ts
  proxy.ts
  next.config.ts
  vercel.json
Core entities:
- Company: lead/account entity with enrichment fields, scores, status, Google Place info, origin metadata.
- Signal: activity evidence linked to a company (permit/news/job posting/manual/etc.).
- Contact: people/contact methods associated with a company.
- Permit: normalized permit record, optional company match, value/status/date lifecycle.
- CrawlJob: execution log for sync/import/enrichment jobs.
- NewsArticle, CommodityPrice, PriceEvent: COMEX news + market time-series + derived events.
- Tag, CompanyTag, UserNote: CRM-style annotation layer.
Enums encode lead lifecycle and signal/source taxonomy (e.g. CompanyStatus, SignalType, SourceType, RecordOrigin).
- / (tool launchpad)
- /sign-in/[[...sign-in]]
- /account
- /leads/* (protected workspace)
- /comex (charting + agent panel)
- Companies: list/create, detail/update/delete, merge, batch delete, website discovery
- Import: CSV preview + commit
- Enrichment: batch and per-company enrichment
- Jobs: run source adapters + read job history
- Permits: sync, list, stats, signals, single-permit patch, rematch, bulk sync
- Prospecting: Google Places search/check/add
- Dashboard data: top leads, map data, county panels, curated news, company contact updates
- System: health, rescore, job-posting signal sync
- /comex/api/prices/sync (cron-targeted)
- /comex/api/prices
- /comex/api/news/sync
- /comex/api/agent
- /api/clerk-proxy/[[...path]] for Clerk frontend API proxying.
- Clerk provider at app root.
- Middleware enforces auth globally except explicit public routes (/sign-in, Clerk proxy paths, cron sync endpoint).
- Most API routes also perform explicit server-side auth() checks.
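
The "auth everywhere except explicit public routes" rule can be sketched as a pure decision function. This is an illustration only, not the contents of proxy.ts: the route patterns below are inferred from the route list in this document, and the names PUBLIC_ROUTE_PATTERNS, isPublicRoute, and decide are hypothetical. In the real app, Clerk's middleware helpers would perform the session check.

```typescript
// Hypothetical public-route patterns (assumption: the real list lives in the middleware).
const PUBLIC_ROUTE_PATTERNS: RegExp[] = [
  /^\/sign-in(\/.*)?$/,           // sign-in page and its catch-all segments
  /^\/api\/clerk-proxy(\/.*)?$/,  // Clerk frontend API proxy
  /^\/comex\/api\/prices\/sync$/, // cron-targeted sync endpoint
];

function isPublicRoute(pathname: string): boolean {
  return PUBLIC_ROUTE_PATTERNS.some((re) => re.test(pathname));
}

type AuthDecision = "allow" | "require-sign-in";

// Public routes always pass; everything else needs a signed-in session.
function decide(pathname: string, isSignedIn: boolean): AuthDecision {
  return isPublicRoute(pathname) || isSignedIn ? "allow" : "require-sign-in";
}
```

Keeping the public list short and explicit means any new route is protected by default, which is why per-route auth() checks act as a second layer rather than the only guard.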
- Clerk (auth/session)
- Google Places API + Google Maps API (prospecting/maps)
- Google Custom Search API (website finding + job-posting signal workflows)
- Multiple permit portal adapters (Accela/ACA/EnerGov + county-specific scrapers)
- OpenCorporates (business registry adapter)
- Voyage AI embeddings (COMEX semantic retrieval)
- LLM providers via configurable AI layer (OpenAI/Anthropic)
- Yahoo Finance fetch path used by COMEX price sync
Set in .env.local (or deployment env):
- DATABASE_URL
- DIRECT_URL
- NEXT_PUBLIC_APP_URL
- NODE_ENV

- NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY
- CLERK_SECRET_KEY
- NEXT_PUBLIC_CLERK_SIGN_IN_URL
- NEXT_PUBLIC_CLERK_AFTER_SIGN_IN_URL
- NEXT_PUBLIC_CLERK_PROXY_URL

- AI_PROVIDER
- AI_MODEL
- OPENAI_API_KEY
- ANTHROPIC_API_KEY
- VOYAGE_API_KEY

- GOOGLE_PLACES_API_KEY
- GOOGLE_MAPS_API_KEY
- NEXT_PUBLIC_GOOGLE_MAPS_API_KEY
- GOOGLE_CSE_API_KEY
- GOOGLE_CSE_ENGINE_ID

- ACCELA_APP_ID
- ACCELA_APP_SECRET
- COBB_ACA_USERNAME
- COBB_ACA_PASSWORD
- OPENCORPORATES_API_KEY
- CHROME_PATH
- PERMIT_LOOKBACK_DAYS
- ENRICHMENT_MAX_PAGES
- ENRICHMENT_TIMEOUT_MS
- JOB_STALE_MINUTES
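
With this many variables, a small startup check can surface missing configuration early. The sketch below is a suggestion rather than existing project code: which variables count as required is an assumption, and the helper names and fallback values are hypothetical.

```typescript
type Env = Record<string, string | undefined>;

// Assumption: this "required" set is illustrative; adjust to the integrations you enable.
const REQUIRED_VARS: readonly string[] = [
  "DATABASE_URL",
  "NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY",
  "CLERK_SECRET_KEY",
];

// Returns the names of required variables that are unset or blank.
function missingEnvVars(env: Env, required: readonly string[] = REQUIRED_VARS): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Parses numeric settings such as PERMIT_LOOKBACK_DAYS, falling back when unset or invalid.
function parsePositiveInt(raw: string | undefined, fallback: number): number {
  const n = Number.parseInt(raw ?? "", 10);
  return Number.isFinite(n) && n > 0 ? n : fallback;
}

// Example usage (the fallback of 30 is a placeholder, not a documented default):
//   const missing = missingEnvVars(process.env);
//   const lookbackDays = parsePositiveInt(process.env.PERMIT_LOOKBACK_DAYS, 30);
```

Optional credentials (Accela, OpenCorporates, and so on) are deliberately left out of the required set, since the integrations are described as no-op or error-returning when unconfigured.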
cd atlantiskb-home
npm install
cp .env.local.example .env.local
# fill required env values
npm run dev

Open http://localhost:3000.
cd atlantiskb-home
npx prisma generate
npx prisma migrate deploy
# optional local seed data
npx prisma db seed

For development migrations:

npx prisma migrate dev

The COMEX semantic retrieval path expects a vector extension and a NewsArticle.embedding vector column. Confirm your database has the extension/column/index expected by migrations and raw SQL inserts before enabling news embedding sync in non-dev environments.
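
As a rough guide to what that setup involves, the sketch below builds the kind of SQL statements a pgvector-backed column typically needs. It is not taken from this repository's migrations: the quoted table name "NewsArticle" assumes Prisma's default model-to-table mapping, the index name is hypothetical, the embedding dimension must match whichever Voyage model is configured, and the HNSW index type requires pgvector 0.5.0 or later.

```typescript
// Builds illustrative pgvector setup statements for a given embedding dimension.
// Assumptions: table "NewsArticle", column "embedding", hypothetical index name.
function pgvectorSetupSql(dimensions: number): string[] {
  if (!Number.isInteger(dimensions) || dimensions <= 0) {
    throw new Error(`Invalid embedding dimension: ${dimensions}`);
  }
  return [
    // Enable the pgvector extension (needs sufficient database privileges).
    `CREATE EXTENSION IF NOT EXISTS vector;`,
    // Add the vector column if the migration history has not created it.
    `ALTER TABLE "NewsArticle" ADD COLUMN IF NOT EXISTS embedding vector(${dimensions});`,
    // Approximate nearest-neighbour index using cosine distance.
    `CREATE INDEX IF NOT EXISTS news_article_embedding_idx ON "NewsArticle" USING hnsw (embedding vector_cosine_ops);`,
  ];
}
```

Compare the generated statements against the actual migrations and raw SQL in lib/comex before running anything; a dimension mismatch between the column and the embedding model will cause inserts to fail.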
- Configured for Vercel deployment.
- Build script runs prisma generate && next build.
- vercel.json configures cron schedules for COMEX sync endpoints.
- Ensure all runtime env vars are set in the deployment target, especially database, Clerk, and API credentials.
- Vercel cron schedules call:
  - /comex/api/prices/sync multiple times on weekdays.
  - /comex/api/news/sync on weekdays.
- Leads syncs (permits, enrichment, import, source jobs) are on-demand HTTP triggers from UI/API.
- Job history and status tracking are persisted in CrawlJob.
- Some lead-source adapters remain demo mode or constrained by external portal/API access.
- Auth + middleware + per-route checks are duplicated in places; behavior is robust but somewhat repetitive.
- Multiple long-running endpoints depend on network calls and can be slow or brittle when upstreams change.
- Repository contains duplicate * 2.ts files, implying cleanup debt.
- README/legacy docs in previous state were materially outdated relative to the codebase.
- Consolidate/clean duplicate files and define explicit active adapter list.
- Move long-running sync tasks to durable background jobs/queues.
- Add end-to-end and contract tests around high-value APIs (permits sync, enrich pipeline, CSV import).
- Add stricter observability: job-level metrics, structured logs, failure alerts.
- Normalize environment variable documentation and provide a single authoritative .env.example covering all integrations.
- Harden AI/RAG safety and response grounding with explicit citation validation tests.