Agentic Legal AI for Vietnam
🇻🇳 Tiếng Việt · 🇬🇧 English
Lex Companion is an agentic AI legal companion for Vietnam that helps individuals and businesses understand regulations, research legal issues, evaluate options, and generate legal documents through specialized legal agents grounded in authoritative legal sources.
Navigating Vietnamese law often requires searching across thousands of legal provisions, understanding relationships between regulations, and translating legal language into practical actions.
Lex Companion assists users throughout this process. Rather than functioning as a traditional chatbot, it coordinates specialized agents for legal research, information retrieval, document drafting, decision support, and problem-solving.
Core capabilities include:
- Agentic legal workflows powered by intent-specific LangGraph agents
- Grounded legal reasoning over the Vietnamese Pháp điển legal codex (~64k articles)
- Hybrid retrieval architecture combining keyword search, semantic search, and reranking
- Citation-backed responses with transparent references to legal sources
- Human-in-the-loop document generation for contracts and legal forms
- Session-based knowledge augmentation through user-provided documents
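As a rough illustration of the hybrid retrieval item above, the sketch below builds an Elasticsearch 8 search body that pairs a keyword `match` clause with a top-level `knn` clause. The helper name, the field names `content` and `embedding`, and the candidate sizes are assumptions made for this example, not the project's actual index mapping. Reranking would run as a separate step over the returned hits.

```python
from typing import Any


def build_hybrid_search_body(
    query_text: str,
    query_vector: list[float],
    size: int = 10,
    text_field: str = "content",      # hypothetical field name
    vector_field: str = "embedding",  # hypothetical field name
) -> dict[str, Any]:
    """Build a request body combining keyword and kNN search (Elasticsearch 8.x)."""
    return {
        "size": size,
        # Lexical side: standard full-text match on the chunk text.
        "query": {"match": {text_field: {"query": query_text}}},
        # Vector side: approximate kNN over the dense vector field.
        "knn": {
            "field": vector_field,
            "query_vector": query_vector,
            "k": size,
            "num_candidates": max(size * 10, 100),
        },
    }


# The vector length matches LEGAL_VECTOR_DIMS=1024 from the configuration.
body = build_hybrid_search_body("thời hiệu khởi kiện", [0.0] * 1024, size=5)
```

The resulting dictionary could be passed to a search call against the chunk index; how the two score sets are weighted in this project is not specified here.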
For detailed architecture documentation, see docs/ARCHITECTURE.md · Tiếng Việt
| Feature | Description |
|---|---|
| Legal Q&A | Ask questions about Vietnamese law; get answers grounded in Pháp điển articles |
| Legal Research | Multi-query RAG with ontology-aware query expansion and retry |
| Hybrid Search | Elasticsearch keyword + KNN vector fusion with BGE reranking |
| Citation Tracking | Every factual claim links to [n] inline citations and a reference panel |
| Intent Routing | 6 specialized agent workflows: information, decision, problem-solving, exploration, task execution, communication |
| Document Generation | Contract template selection, form fill, and DOCX output with HITL checkpoints |
| User Knowledge Base | Upload personal documents for session-scoped retrieval |
| Legal Corpus Visualization | Interactive graph of topics, subjects, and articles (admin) |
| Web Fallback | Tavily web search when legal corpus context is insufficient |
| i18n | Vietnamese and English UI |
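To make the citation-tracking row more concrete, here is a minimal sketch of how `[n]` markers in an answer might be mapped to entries in a reference panel. The function name, the 1-based numbering, and the placeholder source labels are assumptions for illustration; the project's own implementation may differ.

```python
import re

CITATION_PATTERN = re.compile(r"\[(\d+)\]")


def resolve_citations(answer: str, sources: list[str]) -> list[tuple[int, str]]:
    """Return (n, source) pairs for each distinct [n] marker, in order of first appearance."""
    seen: set[int] = set()
    resolved: list[tuple[int, str]] = []
    for match in CITATION_PATTERN.finditer(answer):
        n = int(match.group(1))
        if n in seen:
            continue
        seen.add(n)
        # Markers are assumed to be 1-based; markers without a source are skipped.
        if 1 <= n <= len(sources):
            resolved.append((n, sources[n - 1]))
    return resolved


answer = "Notice is required [2]. The period depends on the contract type [1][2]. See also [7]."
references = resolve_citations(answer, ["Source A", "Source B"])
```

Here `references` lists source 2 first and source 1 second, and the dangling `[7]` marker is dropped because no seventh source exists.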
flowchart TB
subgraph Client
WEB["Next.js :3004"]
end
subgraph Backend
API["FastAPI :5999"]
WORKER["Redis Worker"]
end
subgraph AI
AGENTS["LangGraph Agents<br/>(6 intents)"]
RAG["RAG Pipeline<br/>ES → Rerank → LLM"]
end
subgraph Infrastructure
PG[(PostgreSQL)]
ES[(Elasticsearch)]
MINIO[(MinIO)]
REDIS[(Redis)]
end
WEB --> API
API --> AGENTS
AGENTS --> RAG
RAG --> ES
API --> PG
API --> MINIO
WORKER --> REDIS
WORKER --> ES
Request flow: User message → JWT auth → intent routing → LangGraph agent → hybrid retrieval → rerank → LLM with citations → persist & respond.
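The intent-routing step of this flow can be pictured as a dispatch table over the six intents. The sketch below uses only the standard library and stub handlers in place of compiled LangGraph graphs; the enum values, the `route` function, and the fallback to the information intent are assumptions, not confirmed behavior.

```python
from enum import Enum
from typing import Callable


class Intent(str, Enum):
    INFORMATION = "information"
    DECISION = "decision"
    PROBLEM_SOLVING = "problem_solving"
    EXPLORATION = "exploration"
    TASK_EXECUTION = "task_execution"
    COMMUNICATION = "communication"


Handler = Callable[[str], str]


def make_stub(intent: Intent) -> Handler:
    # Placeholder for an agent workflow; it only labels the message.
    def handler(message: str) -> str:
        return f"[{intent.value}] {message}"

    return handler


HANDLERS: dict[Intent, Handler] = {intent: make_stub(intent) for intent in Intent}


def route(intent_label: str, message: str) -> str:
    """Dispatch to the handler for intent_label, falling back to INFORMATION."""
    try:
        intent = Intent(intent_label)
    except ValueError:
        intent = Intent.INFORMATION  # assumed fallback for unknown labels
    return HANDLERS[intent](message)
```

In a real deployment each handler would invoke the corresponding graph, and the label would come from a classifier rather than from the caller.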
See docs/ARCHITECTURE.md for complete diagrams covering request lifecycle, agent workflows, RAG pipeline, database design, and deployment.
| Layer | Technologies |
|---|---|
| Frontend | Next.js 16, React 19, TanStack Query, Tailwind CSS 4 |
| Backend | FastAPI, Uvicorn, Peewee ORM, Pydantic v2 |
| AI/Agents | LangGraph, LangChain, FlagEmbedding |
| Search | Elasticsearch 8.13 (hybrid keyword + KNN) |
| Embedding | AITeamVN/Vietnamese_Embedding_v2 (1024 dims) |
| Reranking | BAAI/bge-reranker-v2-m3 |
| LLM | OpenAI-compatible API |
| Document Processing | Docling, PyMuPDF, python-docx |
| Storage | PostgreSQL, MinIO, Redis |
| Package Management | uv (Python), npm (Frontend) |
| Component | Version |
|---|---|
| Python | 3.12+ |
| uv | Latest |
| Docker | For infrastructure services |
| Node.js | 20+ (for frontend) |
git clone <repository-url>
cd langgraph-base
cp .env.example .env
# Edit .env with your credentials (see Configuration section)
# Linux: ensure ES can start
sudo sysctl -w vm.max_map_count=262144
docker compose -f docker/docker-compose.yml up -d --build
| Service | URL |
|---|---|
| Web UI | http://localhost:3005 |
| API | http://localhost:6000 |
| API Docs | http://localhost:6000/docs |
| Kibana | http://localhost:5602 |
Infrastructure (Postgres, MinIO, Redis, Elasticsearch):
# See api/deployment.readme.md and
# model_serving/retrievers/elastic_search/deployment.readme.md
Backend:
uv venv --python 3.12
uv sync
uv run --env-file .env python -m api.lex_companion_server
# API at http://localhost:5999
Frontend:
cd web
npm install
npm run dev
# UI at http://localhost:3004
All Python dependencies are managed via uv from the repository root:
uv sync # Install production dependencies
uv sync --group dev # Include LangGraph CLI for development
Important: Do not run `uv pip install -e .`; this project uses `package = false` and runs via `PYTHONPATH`.
cd web
npm install
cd model_serving/embeddings/vie_embedding_v2
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
python app.py # Runs on port 6501
Copy .env.example to .env and configure:
# Database
POSTGRES_USER=your_user
POSTGRES_PASSWORD=your_password
POSTGRES_DB=lex_companion
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
# Object Storage
MINIO_HOST=localhost:6503
MINIO_USER=your_user
MINIO_PASSWORD=your_password
MINIO_BUCKET=lex-companion
# Search
ELASTIC_HOST=localhost:6505
ELASTIC_PASSWORD=your_password
LEX_CHUNKS_INDEX=lex_chunks_v1
LEGAL_VECTOR_DIMS=1024
# LLM
OPENAI_API_KEY=your_key
OPENAI_BASE_URL=https://api.openai.com/v1
LLM_MODEL=gpt-4o
# Embedding
EMBEDDING_PROVIDER=openai
EMBEDDING_BASE_URL=http://localhost:6501/v1
EMBEDDING_MODEL=AITeamVN/Vietnamese_Embedding_v2
# Auth
JWT_SECRET_KEY=your_secret
GOOGLE_CLIENT_ID=your_client_id
GOOGLE_CLIENT_SECRET=your_client_secret
GOOGLE_REDIRECT_URI=http://localhost:3004/auth/google/callback
# Reranking (significantly improves retrieval quality)
RERANK_ENABLED=true
RERANK_MODEL_NAME=BAAI/bge-reranker-v2-m3
# Redis (enables background document processing)
REDIS_HOST=localhost
REDIS_PORT=6376
REDIS_PASSWORD=your_password
# Web search fallback
TAVILY_API_KEY=your_key
# web/.env
NEXT_PUBLIC_API_SERVER=http://localhost:5999
NEXT_PUBLIC_GOOGLE_CLIENT_ID=your_client_id
NEXT_PUBLIC_GOOGLE_OAUTH2_CALLBACK=http://localhost:3004/auth/google/callback
See docs/ARCHITECTURE.md for the complete environment variable reference.
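Before starting the backend, it can help to check that the environment is complete. The following is a minimal sketch using only the standard library; which variables count as required is an assumption made here (the names themselves come from the listing above), and the `check_env` helper is hypothetical rather than part of the project.

```python
import os
from collections.abc import Mapping

# Assumed-required subset of the variables listed above.
REQUIRED_VARS = (
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "POSTGRES_DB",
    "ELASTIC_HOST",
    "OPENAI_API_KEY",
    "JWT_SECRET_KEY",
)


def check_env(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return a list of human-readable configuration problems (empty if none)."""
    problems = [f"missing {name}" for name in REQUIRED_VARS if not env.get(name)]
    dims = env.get("LEGAL_VECTOR_DIMS", "")
    # The vector size is optional here, but must be a positive integer when set.
    if dims and (not dims.isdecimal() or int(dims) <= 0):
        problems.append("LEGAL_VECTOR_DIMS must be a positive integer")
    return problems
```

Calling `check_env()` with no arguments inspects the current process environment; passing a dictionary makes the check easy to exercise in tests.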
langgraph-base/
├── api/ # FastAPI backend
│ ├── lex_companion_server.py
│ ├── apps/
│ │ ├── routers/ # Auto-loaded route definitions
│ │ ├── controllers/ # Request handlers
│ │ └── services/ # Business logic + orchestration
│ ├── db/models.py # Peewee ORM models
│ └── worker/ # Redis stream background worker
├── deepagent/ # LangGraph agents + document processing
│ ├── multiagent/legal_assistant/ # Intent-specific graph workflows
│ └── core/ # Rerank, splitters, embeddings, HITL
├── model_serving/ # Standalone embedding + LLM services
├── web/ # Next.js frontend
├── docker/ # Docker Compose + Dockerfiles
├── docs/ # Technical documentation
├── scripts/ # Startup scripts
├── pyproject.toml # Python dependencies (uv)
└── .env.example # Environment template
# Recommended
uv run --env-file .env python -m api.lex_companion_server
# Or via script
./scripts/start_lex_api.sh
- API: http://localhost:5999
- OpenAPI docs: http://localhost:5999/docs
Do not run `python api/lex_companion_server.py` directly; use `python -m api.lex_companion_server` with `PYTHONPATH=.`.
uv add requests # Production dependency
uv add --dev pytest # Dev dependency
uv sync # Reinstall from lockfile
Follow the layered pattern documented in api_creating_instruction.md:
Router → Controller → Service → DB/ES/Agent
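The layering above can be sketched without any web framework. In this illustration the function names, the stub data, and the numeric `code` values are hypothetical; only the `{ code, msg, data }` envelope shape and the `/v1/user/sessions` path are taken from this README.

```python
from typing import Any, Callable


# Service layer: business logic (a stub standing in for DB/ES/agent calls).
def list_sessions_service(user_id: str) -> list[dict[str, Any]]:
    return [{"session_id": "demo", "user_id": user_id}]


# Controller layer: validates input and wraps the result in the response envelope.
def list_sessions_controller(user_id: str) -> dict[str, Any]:
    if not user_id:
        return {"code": 400, "msg": "user_id is required", "data": None}
    return {"code": 0, "msg": "success", "data": list_sessions_service(user_id)}


# Router layer: maps (method, path) to a controller.
ROUTES: dict[tuple[str, str], Callable[[str], dict[str, Any]]] = {
    ("GET", "/v1/user/sessions"): list_sessions_controller,
}


def dispatch(method: str, path: str, user_id: str) -> dict[str, Any]:
    """Look up the controller for a route and return its enveloped response."""
    controller = ROUTES.get((method, path))
    if controller is None:
        return {"code": 404, "msg": "not found", "data": None}
    return controller(user_id)
```

Keeping validation and envelope construction in the controller leaves the service free of HTTP concerns, which is the usual motivation for this split.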
uv sync --group dev
uv run langgraph dev
Note: `langgraph.json` references a legacy graph path. Active graphs are in `deepagent/multiagent/legal_assistant/`.
uv run python -m pytest tests/
docker compose -f docker/docker-compose.yml up -d --build
Services and ports:
| Service | Host Port | Purpose |
|---|---|---|
| PostgreSQL | 5445 | Relational database |
| MinIO | 6503/6504 | Object storage |
| Redis | 6376 | Task queue |
| Elasticsearch | 6505 | Search + vectors |
| Kibana | 5602 | ES management UI |
| Embedding | 6502 | Vietnamese embedding model |
| API | 6000 | FastAPI backend |
| Web | 3005 | Next.js frontend |
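After bringing the stack up, a quick TCP probe can confirm that each host port is accepting connections. This is a minimal sketch using the standard library; the port numbers are copied from the table above, while the helper names and the `localhost` default are assumptions. A successful connection only shows that something is listening, not that the service is healthy.

```python
import socket

# Host ports from the table above; adjust if your compose file differs.
DOCKER_PORTS = {
    "PostgreSQL": 5445,
    "MinIO (6503)": 6503,
    "MinIO (6504)": 6504,
    "Redis": 6376,
    "Elasticsearch": 6505,
    "Kibana": 5602,
    "Embedding": 6502,
    "API": 6000,
    "Web": 3005,
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host: str = "localhost") -> dict[str, bool]:
    """Probe every known service port and report which ones accept connections."""
    return {name: port_open(host, port) for name, port in DOCKER_PORTS.items()}
```

Printing the result of `check_services()` gives a per-service view of which ports are reachable from the host.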
- Set strong secrets for `JWT_SECRET_KEY`, database passwords, and MinIO credentials
- Configure `RERANK_DEVICE=cuda:0` if a GPU is available
- Implement a persistent checkpointer (a Redis/Postgres scaffold exists) for HITL reliability
- Import the Pháp điển corpus via `POST /v1/admin/doc/upload` after deployment
- No CI/CD pipeline is included; set up your own
See docs/ARCHITECTURE.md for networking diagrams and detailed deployment notes.
| Domain | Prefix | Key Endpoints |
|---|---|---|
| Auth | /v1/user | POST /oAuth-login |
| Chat | /v1/user | POST /user_chat, GET /sessions, GET /session |
| Contract | /v1/user | POST /contract/fill, GET /contract/draft/* |
| Documents | /v1 | POST /doc/upload, GET /docs, POST /doc/run |
| Admin | /v1/admin | POST /doc/retrieval, POST /doc/upload, GET /doc/topic |
Full API documentation with inputs/outputs: docs/ARCHITECTURE.md
Interactive docs available at /docs when the API is running.
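As an illustration of calling the chat endpoint, the sketch below prepares a request with the standard library without sending it. The payload field `message` and the `Authorization: Bearer` header are assumptions, since the request schema is not given here; check the interactive docs for the actual fields.

```python
import json
import urllib.request


def build_chat_request(base_url: str, token: str, message: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST to the chat endpoint."""
    payload = {"message": message}  # hypothetical field name
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/user/user_chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
        method="POST",
    )


req = build_chat_request("http://localhost:5999", "<jwt>", "Xin chào")
```

Passing `req` to `urllib.request.urlopen` would send it; the response is expected to follow the `{ code, msg, data }` envelope described in this README.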
- Fork the repository
- Create a feature branch (`git checkout -b feature/your-feature`)
- Follow the existing code patterns:
  - Backend: Router → Controller → Service layering
  - Agents: Add nodes to intent-specific graphs in `deepagent/multiagent/legal_assistant/`
  - Frontend: React Query hooks in `web/hooks/`, services in `web/service/`
- Run tests: `uv run python -m pytest tests/`
- Submit a pull request
- Python 3.12+, type hints, Pydantic v2 models
- Peewee ORM for database (not SQLAlchemy)
- LangGraph `StateGraph` with typed state (`LegalAssistantState`)
- API response envelope: `{ code, msg, data }`
- Environment variables via `.env` (never commit secrets)
Maintained by project contributors.
| Document | Description |
|---|---|
| docs/ARCHITECTURE.md | Full technical architecture (English) |
| docs/ARCHITECTURE.vi.md | Full technical architecture (Vietnamese) |
| api/deployment.readme.md | Manual Docker run for Postgres/MinIO/Redis |
| api_creating_instruction.md | API development conventions |
| model_serving/retrievers/elastic_search/deployment.readme.md | Elasticsearch setup |
Lex Companion would not be possible without the open legal data shared by the community.
We are grateful to tmquan/phapdien-moj-gov-vn on Hugging Face for publishing the Vietnamese legal codex (Pháp điển) dataset sourced from the Ministry of Justice. This project uses multiple configs from that dataset — including tree_nodes, articles, subjects, and ontology metadata — as the foundation of our legal knowledge base, Elasticsearch indexing pipeline, and citation-backed retrieval.
Thank you to the maintainers and contributors of that dataset for making structured Vietnamese legal knowledge openly available.
Lex Companion is actively evolving toward a full agentic legal assistant for Vietnam. The core RAG and information-intent workflows are in place; several specialized agents still need to be built out.
| Agent | Path | Current state | Target |
|---|---|---|---|
| Decision | `deepagent/multiagent/legal_assistant/decision/` | Single-node flow: retrieval + placeholder options and estimates | Multi-step decision reasoning: risk analysis, option comparison, consequence mapping, and structured recommendations grounded in retrieved law |
| Problem solving | `deepagent/multiagent/legal_assistant/problem_solving/` | Single-node flow: retrieval + static strategy template | Dynamic legal problem decomposition: step-by-step action plans, milestone tracking, and iterative clarification when facts are incomplete |
Other areas on the roadmap:
- Exploration agent — richer open-ended legal research with web + corpus fusion
- Persistent HITL checkpointing — Redis/Postgres checkpointer for reliable contract-fill resume across restarts
- User document ingestion — complete Docling parse pipeline for uploaded KB documents
- Calculator tools — real fine/penalty estimation logic (currently placeholder)
- CI/CD & production hardening — automated testing, deployment pipelines, and observability
Contributions toward any of these areas are welcome — see Contributing above.