Kalitch/Augmented-Case-Intelligence-Platform

LLM-Augmented Case Intelligence Platform (v0)

Minimal, production-shaped monorepo: API (Spring Boot) + Worker (Python) communicating via MongoDB. No queues, no auth, no UI.


Architecture (ASCII)

                    +------------------+
                    |   MongoDB        |
                    |  (cases,         |
                    |   case_analysis) |
                    +--------+---------+
                             |
         +-------------------+-------------------+
         |                                       |
         v                                       v
+----------------+                      +----------------+
|  API Service   |                      | Worker Service |
|  (Spring Boot) |                      |  (Python)      |
|  :8080         |                      |                |
|                |                      |  - Poll PENDING|
|  POST /cases   |                      |  - LLM call    |
|  PUT /cases/id |                      |  - Write result|
|  GET /cases/id |                      |  - Retry/error |
+----------------+                      +----------------+
         ^                                       ^
         |                                       |
    REST clients                          OPENAI_API_KEY
    (no direct LLM)                       (env)
  • API: Owns case lifecycle, persists data, exposes REST. Never calls the LLM.
  • Worker: Polls for case_analysis documents with status = PENDING, atomically sets one to PROCESSING, calls the LLM, then writes the result or records the failure and retries (up to 3 attempts).
  • Communication: Database only (no message queue in v0).
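
The worker's claim step described above can be sketched as a single find-and-update. This is a minimal sketch, not the repository's actual db.py: the function name claim_next_pending, the oldest-first sort on createdAt, and the InMemoryCollection stand-in are assumptions made for illustration. With a real pymongo Collection, find_one_and_update accepts these arguments and by default returns the document as it was before the update.

```python
from datetime import datetime, timezone


def claim_next_pending(collection, now=None):
    """Atomically move one PENDING analysis to PROCESSING.

    `collection` is assumed to behave like a pymongo Collection.
    Returns the claimed document (pre-update state) or None if nothing is pending.
    """
    now = now or datetime.now(timezone.utc)
    return collection.find_one_and_update(
        {"status": "PENDING"},
        {"$set": {"status": "PROCESSING", "updatedAt": now}},
        sort=[("createdAt", 1)],  # assumption: oldest first
    )


class InMemoryCollection:
    """Hypothetical stand-in for trying the sketch without MongoDB."""

    def __init__(self, docs):
        self.docs = docs

    def find_one_and_update(self, query, update, sort=None):
        matches = [d for d in self.docs
                   if all(d.get(k) == v for k, v in query.items())]
        if sort:
            key, direction = sort[0]
            matches.sort(key=lambda d: d[key], reverse=direction < 0)
        if not matches:
            return None
        before = dict(matches[0])          # copy of the pre-update state
        matches[0].update(update["$set"])  # apply the $set fields in place
        return before


if __name__ == "__main__":
    coll = InMemoryCollection([
        {"_id": "a", "status": "PENDING", "createdAt": 2},
        {"_id": "b", "status": "PENDING", "createdAt": 1},
    ])
    print(claim_next_pending(coll)["_id"])
```

Because the filter and the update happen in one server-side operation, two workers that poll at the same moment cannot both claim the same document; the second one simply matches a different PENDING document or gets None.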

State Machine: case_analysis.status

                    +-------------+
                    |   PENDING   |
                    +------+------+
                           |
              (worker claims)
                           v
                    +-------------+
                    | PROCESSING  |
                    +------+------+
                           |
           +---------------+---------------+
           |                               |
     (success)                        (error)
           |                               |
           v                               v
    +-------------+              +------------------+
    |  COMPLETED  |              | attempt < 3 ?    |
    +-------------+              |   -> PENDING     |
                                | attempt >= 3 ?   |
                                |   -> ERRORED     |
                                +------------------+
  • PENDING → PROCESSING: Worker does a single atomic find-and-update; no duplicate processing.
  • PROCESSING → COMPLETED: LLM returns valid JSON; worker persists summary, entities, confidence.
  • PROCESSING → PENDING (retry): On error, attempt is incremented; if attempt < 3, status is set back to PENDING.
  • PROCESSING → ERRORED: On error with attempt ≥ 3.
  • Worker crash safety: On startup, worker resets any case_analysis stuck in PROCESSING (e.g. updatedAt older than 15 minutes) back to PENDING.
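
The failure and recovery transitions above can be expressed as plain update and filter documents. This is a sketch under stated assumptions rather than the worker's actual retry.py: the names failure_update and stale_processing_filter are hypothetical, attempt is assumed to start at 0, and the comparison against 3 is assumed to use the already-incremented value.

```python
from datetime import datetime, timedelta, timezone

MAX_ATTEMPTS = 3                       # from the state machine: attempt >= 3 -> ERRORED
STALE_AFTER = timedelta(minutes=15)    # example threshold mentioned above


def failure_update(attempt, error, now):
    """Build the update for a failed PROCESSING document.

    Assumption: the attempt counter is incremented first and the
    incremented value decides between retry (PENDING) and ERRORED.
    """
    new_attempt = attempt + 1
    status = "PENDING" if new_attempt < MAX_ATTEMPTS else "ERRORED"
    return {"$set": {
        "status": status,
        "attempt": new_attempt,
        "error": error,
        "updatedAt": now,
    }}


def stale_processing_filter(now):
    """Filter matching PROCESSING documents not touched for STALE_AFTER."""
    return {"status": "PROCESSING", "updatedAt": {"$lt": now - STALE_AFTER}}


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    for attempt in range(3):
        print(attempt, failure_update(attempt, "timeout", now)["$set"]["status"])
```

On startup, a filter like this could be passed to an update-many call that sets status back to PENDING, which is the time-based recovery the last bullet describes.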

Data Model (MongoDB)

cases

Field        Type
_id          ObjectId / string
title        string
description  string
version      number
createdAt    ISODate
updatedAt    ISODate

case_analysis

Field       Type
_id         ObjectId / string
caseId      ref to case _id
version     number (case version; one analysis per case version)
status      PENDING | PROCESSING | COMPLETED | ERRORED
attempt     number
llmModel    string
summary     string
entities    [string]
confidence  number
error       string | null
createdAt   ISODate
updatedAt   ISODate

Constraints: At most one case_analysis per case version; updates are idempotent and keyed by _id; status transitions are explicit (see the state machine above).
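
To make the document shape and the one-analysis-per-version constraint concrete, here is a small illustrative sketch. In this project the API service (Java) creates these documents, so the Python below is only a model of the shape: new_case_analysis and UNIQUE_ANALYSIS_INDEX are hypothetical names, and the initial values (attempt 0, empty summary and entities) are assumptions.

```python
from datetime import datetime, timezone

# Key spec for a compound unique index that would enforce
# "one case_analysis per case version". With pymongo this could be
# applied as: collection.create_index(UNIQUE_ANALYSIS_INDEX, unique=True)
UNIQUE_ANALYSIS_INDEX = [("caseId", 1), ("version", 1)]


def new_case_analysis(case_id, version, llm_model, now=None):
    """Build a fresh case_analysis document in the PENDING state."""
    now = now or datetime.now(timezone.utc)
    return {
        "caseId": case_id,
        "version": version,
        "status": "PENDING",
        "attempt": 0,          # assumption: counter starts at 0
        "llmModel": llm_model,
        "summary": None,       # filled in when the worker completes
        "entities": [],
        "confidence": None,
        "error": None,
        "createdAt": now,
        "updatedAt": now,
    }


if __name__ == "__main__":
    doc = new_case_analysis("case-1", 1, "example-model")
    print(doc["status"], doc["attempt"])
```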


Failure Scenarios

Scenario                        Behavior
LLM timeout                     httpx timeout; worker catches, increments attempt, sets PENDING or ERRORED.
LLM non-JSON                    Parsing fails; same retry/error path.
LLM schema mismatch             Validation error; same retry/error path.
Worker crash during PROCESSING  On next startup, recover_stale_processing() resets old PROCESSING → PENDING.
MongoDB down                    API and worker fail until DB is back; no in-memory queue.
Duplicate processing            Prevented by atomic find-one-and-update (PENDING → PROCESSING).
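
The non-JSON and schema-mismatch rows can be illustrated with a small validator that funnels both problems into one exception type, so the caller has a single retry/error path. This is a hedged sketch, not the worker's actual models.py: parse_analysis and InvalidLLMOutput are hypothetical names, and the checks cover only the three fields listed in the data model.

```python
import json


class InvalidLLMOutput(ValueError):
    """Raised for both unparseable and structurally wrong LLM output."""


def parse_analysis(raw):
    """Parse raw LLM text into summary / entities / confidence or raise."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise InvalidLLMOutput(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise InvalidLLMOutput("expected a JSON object")

    summary = data.get("summary")
    entities = data.get("entities")
    confidence = data.get("confidence")

    if not isinstance(summary, str):
        raise InvalidLLMOutput("summary must be a string")
    if not isinstance(entities, list) or not all(isinstance(e, str) for e in entities):
        raise InvalidLLMOutput("entities must be a list of strings")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise InvalidLLMOutput("confidence must be a number")

    return {"summary": summary, "entities": entities, "confidence": float(confidence)}


if __name__ == "__main__":
    print(parse_analysis('{"summary": "Breach claim", "entities": ["Party A"], "confidence": 0.8}'))
    try:
        parse_analysis("Sure! Here is the analysis...")
    except InvalidLLMOutput as err:
        print("rejected:", err)
```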

How to Run Locally

Prerequisites

  • Java 17+, Maven
  • Python 3.10+
  • MongoDB (local or Docker)
  • OpenAI-compatible API key

1. MongoDB

# Option A: Docker
docker run -d -p 27017:27017 --name mongo mongo:7

# Option B: Use existing MongoDB; set MONGODB_URI if needed.

2. API

cd case-intel/api
export MONGODB_URI=mongodb://localhost:27017/case_intel   # optional if default
mvn spring-boot:run

API: http://localhost:8080

3. Worker

cd case-intel/worker
cp .env.example .env
# Edit .env: set OPENAI_API_KEY (or LLM_API_KEY)
pip install -r requirements.txt
python main.py

4. Docker Compose (all services)

cd case-intel
# Create worker/.env with OPENAI_API_KEY
docker compose up --build
  • API: http://localhost:8080
  • MongoDB: localhost:27017
  • Worker runs in background, polling DB.

Example API Calls

# Create case (creates case + case_analysis PENDING)
curl -s -X POST http://localhost:8080/cases \
  -H "Content-Type: application/json" \
  -d '{"title":"Contract dispute","description":"Party A claims breach."}'

# Get case + latest analysis (after worker runs)
curl -s http://localhost:8080/cases/<case-id>

# Update case (new version + new case_analysis PENDING)
curl -s -X PUT http://localhost:8080/cases/<case-id> \
  -H "Content-Type: application/json" \
  -d '{"title":"Contract dispute (updated)","description":"Party A claims breach; Party B denies."}'
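
The same create and update calls can be made from Python using only the standard library. This is a rough sketch: build_case_request and send are hypothetical helper names, and it assumes the API answers with a JSON body, which the curl examples above do not confirm.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"


def build_case_request(title, description, case_id=None):
    """Build POST /cases (create) or PUT /cases/<case_id> (update)."""
    url = f"{BASE_URL}/cases" if case_id is None else f"{BASE_URL}/cases/{case_id}"
    body = json.dumps({"title": title, "description": description}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST" if case_id is None else "PUT",
    )


def send(request):
    """Send the request; assumes the response body is JSON."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    req = build_case_request("Contract dispute", "Party A claims breach.")
    print(req.get_method(), req.full_url)
    # send(req) would perform the call; it requires the API to be running.
```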

Project Structure

case-intel/
├── api/
│   ├── src/main/java/com/caseintel/
│   │   ├── CaseIntelApplication.java
│   │   ├── controller/
│   │   ├── service/
│   │   ├── repository/
│   │   ├── document/
│   │   └── dto/
│   ├── pom.xml
│   └── Dockerfile
├── worker/
│   ├── main.py
│   ├── llm_client.py
│   ├── models.py
│   ├── retry.py
│   ├── db.py
│   ├── requirements.txt
│   ├── .env.example
│   └── Dockerfile
├── docker-compose.yml
├── .env.example
└── README.md

Known Limitations (v0)

  • No authentication on API or worker.
  • No UI; REST only.
  • No message queue; worker polls DB. Not suitable for very high throughput.
  • No streaming; LLM response is single completion.
  • No embeddings or vector search.
  • Designed around a single worker. Running several workers is safe because the claim is atomic, but polling is not tuned (for example, there is no backoff).
  • Stale recovery is time-based only (e.g. 15 min); no heartbeat.

License

Internal / portfolio use.

About

Production-shaped backend for asynchronous, failure-safe LLM enrichment of case data.
