Minimal, production-shaped monorepo: API (Spring Boot) + Worker (Python) communicating via MongoDB. No queues, no auth, no UI.

                       +------------------+
                       |     MongoDB      |
                       |  (cases,         |
                       |   case_analysis) |
                       +--------+---------+
                                |
            +-------------------+-------------------+
            |                                       |
            v                                       v
    +----------------+                      +----------------+
    |  API Service   |                      | Worker Service |
    |  (Spring Boot) |                      |  (Python)      |
    |  :8080         |                      |                |
    |                |                      | - Poll PENDING |
    |  POST /cases   |                      | - LLM call     |
    |  PUT /cases/id |                      | - Write result |
    |  GET /cases/id |                      | - Retry/error  |
    +----------------+                      +----------------+
            ^                                       ^
            |                                       |
      REST clients                           OPENAI_API_KEY
     (no direct LLM)                              (env)

- API: Owns case lifecycle, persists data, exposes REST. Never calls the LLM.
- Worker: Polls for `case_analysis` with `status = PENDING`, atomically sets `PROCESSING`, calls the LLM, writes the result or fails with retry.
- Communication: Database only (no message queue in v0).
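
The atomic claim described above could be sketched in the worker roughly as follows. This is a minimal sketch, not the project's actual code: it assumes the worker uses `pymongo`, and the names `build_claim` and `claim_next` as well as the oldest-first sort order are hypothetical.

```python
from datetime import datetime, timezone

PENDING, PROCESSING = "PENDING", "PROCESSING"


def build_claim(now):
    """Return the (filter, update) pair that moves one PENDING document to PROCESSING."""
    claim_filter = {"status": PENDING}
    claim_update = {"$set": {"status": PROCESSING, "updatedAt": now}}
    return claim_filter, claim_update


def claim_next(collection, now=None):
    """Atomically claim one PENDING analysis; returns the claimed document or None.

    `collection` is assumed to be a pymongo Collection for `case_analysis`.
    """
    # Imported lazily so build_claim stays usable without pymongo installed.
    from pymongo import ReturnDocument

    now = now or datetime.now(timezone.utc)
    claim_filter, claim_update = build_claim(now)
    return collection.find_one_and_update(
        claim_filter,
        claim_update,
        sort=[("createdAt", 1)],               # assumption: oldest first
        return_document=ReturnDocument.AFTER,  # return the PROCESSING version
    )
```

Because the match on `PENDING` and the write of `PROCESSING` happen in one server-side operation, two workers calling `claim_next` at the same time cannot both receive the same document.
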

`case_analysis` status transitions:

                        +-------------+
                        |   PENDING   |
                        +------+------+
                               |
                        (worker claims)
                               v
                        +-------------+
                        | PROCESSING  |
                        +------+------+
                               |
               +---------------+---------------+
               |                               |
           (success)                        (error)
               |                               |
               v                               v
        +-------------+               +------------------+
        |  COMPLETED  |               | attempt < 3 ?    |
        +-------------+               |   -> PENDING     |
                                      | attempt >= 3 ?   |
                                      |   -> ERRORED     |
                                      +------------------+

- PENDING → PROCESSING: Worker does a single atomic find-and-update; no duplicate processing.
- PROCESSING → COMPLETED: LLM returns valid JSON; worker persists summary, entities, confidence.
- PROCESSING → PENDING (retry): On error, attempt incremented; if attempt < 3, status set back to PENDING.
- PROCESSING → ERRORED: On error with attempt ≥ 3.
- Worker crash safety: On startup, the worker resets any `case_analysis` stuck in PROCESSING (e.g. `updatedAt` older than 15 minutes) back to PENDING.
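
The failure and recovery transitions above can be illustrated with a short sketch. It assumes that `attempt` starts at 0 and is incremented before the comparison with 3, and that the worker talks to MongoDB through a pymongo-style collection. The helper names `after_failure` and `stale_filter` are hypothetical, and the body given for `recover_stale_processing` is a guess at its behavior, not the actual implementation.

```python
from datetime import datetime, timedelta, timezone

MAX_ATTEMPTS = 3                     # limit used in the transitions above
STALE_AFTER = timedelta(minutes=15)  # example threshold from the text


def after_failure(attempt):
    """Return (new_attempt, new_status) for a PROCESSING document that failed.

    Assumption: the counter is incremented first, then compared to the limit.
    """
    new_attempt = attempt + 1
    status = "PENDING" if new_attempt < MAX_ATTEMPTS else "ERRORED"
    return new_attempt, status


def stale_filter(now):
    """Match PROCESSING documents whose updatedAt is older than STALE_AFTER."""
    return {"status": "PROCESSING", "updatedAt": {"$lt": now - STALE_AFTER}}


def recover_stale_processing(collection, now=None):
    """Reset stale PROCESSING documents to PENDING; returns how many were reset."""
    now = now or datetime.now(timezone.utc)
    result = collection.update_many(
        stale_filter(now),
        {"$set": {"status": "PENDING", "updatedAt": now}},
    )
    return result.modified_count
```

Under these assumptions a document gets three attempts in total: the first two failures return it to PENDING and the third marks it ERRORED.
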

`cases` collection:

| Field | Type |
|---|---|
| _id | ObjectId / string |
| title | string |
| description | string |
| version | number |
| createdAt | ISODate |
| updatedAt | ISODate |

`case_analysis` collection:

| Field | Type |
|---|---|
| _id | ObjectId / string |
| caseId | ref to case _id |
| version | number (case version; one analysis per case version) |
| status | PENDING \| PROCESSING \| COMPLETED \| ERRORED |
| attempt | number |
| llmModel | string |
| summary | string |
| entities | [string] |
| confidence | number |
| error | string \| null |
| createdAt | ISODate |
| updatedAt | ISODate |

Constraints: One `case_analysis` per case version; updates are idempotent by `_id`; status transitions are explicit.

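For illustration, the initial shape of a `case_analysis` document might look like the sketch below. In this project the API (Java) creates these documents, so the Python form is only a convenient way to show the fields. The function name `new_case_analysis`, the zero starting value for `attempt`, the use of `None` for not-yet-computed fields, and the index spec are all assumptions.

```python
from datetime import datetime, timezone


def new_case_analysis(case_id, version, llm_model, now=None):
    """Build the initial PENDING document for one case version."""
    now = now or datetime.now(timezone.utc)
    return {
        "caseId": case_id,
        "version": version,
        "status": "PENDING",
        "attempt": 0,         # assumption: attempts start at zero
        "llmModel": llm_model,
        "summary": None,      # assumption: empty until the worker completes
        "entities": [],
        "confidence": None,
        "error": None,
        "createdAt": now,
        "updatedAt": now,
    }


# Index spec that could enforce "one case_analysis per case version",
# e.g. collection.create_index(UNIQUE_PER_VERSION, unique=True) with pymongo.
UNIQUE_PER_VERSION = [("caseId", 1), ("version", 1)]
```

A unique compound index on `caseId` and `version` is one way to make the first constraint hold even if the API retries a write; whether the project defines such an index is not stated here.
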
| Scenario | Behavior |
|---|---|
| LLM timeout | httpx timeout; worker catches, increments attempt, sets PENDING or ERRORED. |
| LLM non-JSON | Parsing fails; same retry/error path. |
| LLM schema mismatch | Validation error; same retry/error path. |
| Worker crash during PROCESSING | On next startup, recover_stale_processing() resets old PROCESSING → PENDING. |
| MongoDB down | API and worker fail until DB is back; no in-memory queue. |
| Duplicate processing | Prevented by atomic find-one-and-update (PENDING → PROCESSING). |
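
The "non-JSON" and "schema mismatch" rows above both come down to validating the LLM's reply before persisting it. A minimal standard-library sketch is shown below; the function name `parse_analysis` is hypothetical, and the real worker may use a validation library instead. Raising a single exception type lets the caller route every failure through the same retry/error path.

```python
import json


def parse_analysis(raw):
    """Parse and validate the LLM's reply; raise ValueError on any mismatch."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"LLM returned non-JSON output: {exc}") from exc

    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    summary = data.get("summary")
    entities = data.get("entities")
    confidence = data.get("confidence")

    if not isinstance(summary, str):
        raise ValueError("summary must be a string")
    if not isinstance(entities, list) or not all(isinstance(e, str) for e in entities):
        raise ValueError("entities must be a list of strings")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("confidence must be a number")

    return {"summary": summary, "entities": entities, "confidence": float(confidence)}
```
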
- Java 17+, Maven
- Python 3.10+
- MongoDB (local or Docker)
- OpenAI-compatible API key
# Option A: Docker
docker run -d -p 27017:27017 --name mongo mongo:7
# Option B: Use existing MongoDB; set MONGODB_URI if needed.

cd case-intel/api
export MONGODB_URI=mongodb://localhost:27017/case_intel # optional if default
mvn spring-boot:run

API: http://localhost:8080

cd case-intel/worker
cp .env.example .env
# Edit .env: set OPENAI_API_KEY (or LLM_API_KEY)
pip install -r requirements.txt
python main.py

cd case-intel
# Create worker/.env with OPENAI_API_KEY
docker compose up --build

- API: http://localhost:8080
- MongoDB: localhost:27017
- Worker runs in background, polling DB.

# Create case (creates case + case_analysis PENDING)
curl -s -X POST http://localhost:8080/cases \
-H "Content-Type: application/json" \
-d '{"title":"Contract dispute","description":"Party A claims breach."}'
# Get case + latest analysis (after worker runs)
curl -s http://localhost:8080/cases/<case-id>
# Update case (new version + new case_analysis PENDING)
curl -s -X PUT http://localhost:8080/cases/<case-id> \
-H "Content-Type: application/json" \
  -d '{"title":"Contract dispute (updated)","description":"Party A claims breach; Party B denies."}'

    case-intel/
    ├── api/
    │   ├── src/main/java/com/caseintel/
    │   │   ├── CaseIntelApplication.java
    │   │   ├── controller/
    │   │   ├── service/
    │   │   ├── repository/
    │   │   ├── document/
    │   │   └── dto/
    │   ├── pom.xml
    │   └── Dockerfile
    ├── worker/
    │   ├── main.py
    │   ├── llm_client.py
    │   ├── models.py
    │   ├── retry.py
    │   ├── db.py
    │   ├── requirements.txt
    │   ├── .env.example
    │   └── Dockerfile
    ├── docker-compose.yml
    ├── .env.example
    └── README.md

- No authentication on API or worker.
- No UI; REST only.
- No message queue; worker polls DB. Not suitable for very high throughput.
- No streaming; LLM response is single completion.
- No embeddings or vector search.
- Single worker; multiple workers are safe (atomic claim) but not tuned (e.g. no backoff).
- Stale recovery is time-based only (e.g. 15 min); no heartbeat.
Internal / portfolio use.