Lex Companion

Agentic Legal AI for Vietnam

Lex Companion is an agentic AI legal companion for Vietnam that helps individuals and businesses understand regulations, research legal issues, evaluate options, and generate legal documents through specialized legal agents grounded in authoritative legal sources.

Project Overview

Navigating Vietnamese law often requires searching across thousands of legal provisions, understanding relationships between regulations, and translating legal language into practical actions.

Lex Companion acts as an AI legal companion that assists users throughout this process. Instead of functioning as a traditional chatbot, it coordinates specialized legal agents capable of legal research, information retrieval, document drafting, decision support, and problem-solving.

Core capabilities include:

Agentic legal workflows powered by intent-specific LangGraph agents
Grounded legal reasoning over the Vietnamese Pháp điển legal codex (~64k articles)
Hybrid retrieval architecture combining keyword search, semantic search, and reranking
Citation-backed responses with transparent references to legal sources
Human-in-the-loop document generation for contracts and legal forms
Session-based knowledge augmentation through user-provided documents

For detailed architecture documentation, see docs/ARCHITECTURE.md (English) or docs/ARCHITECTURE.vi.md (Vietnamese).
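The citation-backed responses listed above rely on `[n]` markers that point into a list of sources. The sketch below shows one way such markers could be collected and matched against sources. The function names, the sample answer, and the source labels are all hypothetical; this is not the project's actual implementation.

```python
import re

# Matches inline citation markers such as [1] or [12].
CITATION_PATTERN = re.compile(r"\[(\d+)\]")


def cited_indices(answer: str) -> list[int]:
    """Return citation numbers in order of first appearance, without duplicates."""
    seen: list[int] = []
    for match in CITATION_PATTERN.finditer(answer):
        number = int(match.group(1))
        if number not in seen:
            seen.append(number)
    return seen


def build_reference_panel(answer: str, sources: dict[int, str]) -> list[str]:
    """List only the sources that the answer actually cites."""
    return [f"[{n}] {sources[n]}" for n in cited_indices(answer) if n in sources]


def dangling_citations(answer: str, sources: dict[int, str]) -> list[int]:
    """Citation numbers that appear in the answer but have no matching source."""
    return [n for n in cited_indices(answer) if n not in sources]


# Placeholder text and sources, for illustration only.
answer = "First claim [1]. Second claim [2][1]. Third claim [4]."
sources = {1: "Source A", 2: "Source B"}

panel = build_reference_panel(answer, sources)
unresolved = dangling_citations(answer, sources)
```

Checking for dangling markers before a response is shown is one way to catch an answer that cites a source the retriever never returned.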

Features

Feature	Description
Legal Q&A	Ask questions about Vietnamese law; get answers grounded in Pháp điển articles
Legal Research	Multi-query RAG with ontology-aware query expansion and retry
Hybrid Search	Elasticsearch keyword + KNN vector fusion with BGE reranking
Citation Tracking	Every factual claim links to `[n]` inline citations and a reference panel
Intent Routing	6 specialized agent workflows: information, decision, problem-solving, exploration, task execution, communication
Document Generation	Contract template selection, form fill, and DOCX output with HITL checkpoints
User Knowledge Base	Upload personal documents for session-scoped retrieval
Legal Corpus Visualization	Interactive graph of topics, subjects, and articles (admin)
Web Fallback	Tavily web search when legal corpus context is insufficient
i18n	Vietnamese and English UI
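The Hybrid Search row describes fusing keyword and KNN vector results before reranking. The README does not say which fusion method is used, so the sketch below assumes reciprocal rank fusion (RRF), one common choice. The document IDs and the constant `k = 60` are illustrative, and the BGE reranking step is not shown.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from the two retrievers.
keyword_hits = ["art-12", "art-7", "art-3"]
vector_hits = ["art-7", "art-9", "art-12"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
fused_ids = [doc_id for doc_id, _ in fused]
```

Documents that appear in both lists accumulate score from each, so they tend to rise above documents found by only one retriever.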

Architecture Overview

flowchart TB
    subgraph Client
        WEB["Next.js :3004"]
    end

    subgraph Backend
        API["FastAPI :5999"]
        WORKER["Redis Worker"]
    end

    subgraph AI
        AGENTS["LangGraph Agents<br/>(6 intents)"]
        RAG["RAG Pipeline<br/>ES → Rerank → LLM"]
    end

    subgraph Infrastructure
        PG[(PostgreSQL)]
        ES[(Elasticsearch)]
        MINIO[(MinIO)]
        REDIS[(Redis)]
    end

    WEB --> API
    API --> AGENTS
    AGENTS --> RAG
    RAG --> ES
    API --> PG
    API --> MINIO
    WORKER --> REDIS
    WORKER --> ES

Request flow: User message → JWT auth → intent routing → LangGraph agent → hybrid retrieval → rerank → LLM with citations → persist & respond.
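The intent routing step in the request flow can be pictured as a dispatch from a classified intent to one of six handlers. The sketch below is a plain-Python stand-in, not the project's LangGraph graphs: the `Intent` enum values, the keyword classifier, and the handler bodies are all assumptions made for illustration.

```python
from enum import Enum
from typing import Callable


class Intent(str, Enum):
    # The six intents named in the Features table; string values are assumed.
    INFORMATION = "information"
    DECISION = "decision"
    PROBLEM_SOLVING = "problem_solving"
    EXPLORATION = "exploration"
    TASK_EXECUTION = "task_execution"
    COMMUNICATION = "communication"


def make_handler(intent: Intent) -> Callable[[str], dict]:
    """Build a placeholder handler; a real one would invoke an agent workflow."""
    def handler(message: str) -> dict:
        return {"intent": intent.value, "message": message}
    return handler


HANDLERS: dict[Intent, Callable[[str], dict]] = {
    intent: make_handler(intent) for intent in Intent
}


def keyword_classifier(message: str) -> Intent:
    """Toy classifier based on keywords; a real router would likely use an LLM."""
    text = message.lower()
    if "draft" in text or "contract" in text:
        return Intent.TASK_EXECUTION
    if "should i" in text:
        return Intent.DECISION
    return Intent.INFORMATION


def route(message: str, classify: Callable[[str], Intent]) -> dict:
    """Classify the message, then hand it to the matching handler."""
    return HANDLERS[classify(message)](message)


result = route("Please draft a lease contract", keyword_classifier)
```

Passing the classifier in as an argument keeps the dispatch logic testable without a model call.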

See docs/ARCHITECTURE.md for complete diagrams covering request lifecycle, agent workflows, RAG pipeline, database design, and deployment.

Technology Stack

Layer	Technologies
Frontend	Next.js 16, React 19, TanStack Query, Tailwind CSS 4
Backend	FastAPI, Uvicorn, Peewee ORM, Pydantic v2
AI/Agents	LangGraph, LangChain, FlagEmbedding
Search	Elasticsearch 8.13 (hybrid keyword + KNN)
Embedding	AITeamVN/Vietnamese_Embedding_v2 (1024 dims)
Reranking	BAAI/bge-reranker-v2-m3
LLM	OpenAI-compatible API
Document Processing	Docling, PyMuPDF, python-docx
Storage	PostgreSQL, MinIO, Redis
Package Management	uv (Python), npm (Frontend)

Quick Start

Prerequisites

Component	Version
Python	3.12+
uv	Latest
Docker	For infrastructure services
Node.js	20+ (for frontend)

1. Clone and configure

git clone <repository-url>
cd langgraph-base
cp .env.example .env
# Edit .env with your credentials (see Configuration section)

2. Start with Docker Compose (recommended)

# Linux: ensure ES can start
sudo sysctl -w vm.max_map_count=262144

docker compose -f docker/docker-compose.yml up -d --build

Service	URL
Web UI	http://localhost:3005
API	http://localhost:6000
API Docs	http://localhost:6000/docs
Kibana	http://localhost:5602

3. Or run locally (development)

Infrastructure (Postgres, MinIO, Redis, Elasticsearch):

# See api/deployment.readme.md and
# model_serving/retrievers/elastic_search/deployment.readme.md

Backend:

uv venv --python 3.12
uv sync
uv run --env-file .env python -m api.lex_companion_server
# API at http://localhost:5999

Frontend:

cd web
npm install
npm run dev
# UI at http://localhost:3004

Installation

Backend dependencies

All Python dependencies are managed via uv from the repository root:

uv sync                  # Install production dependencies
uv sync --group dev      # Include LangGraph CLI for development

Important: Do not run uv pip install -e . — this project uses package = false and runs via PYTHONPATH.

Frontend dependencies

cd web
npm install

Embedding service (optional, self-hosted)

cd model_serving/embeddings/vie_embedding_v2
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
python app.py  # Runs on port 6501

Configuration

Copy .env.example to .env and configure:

Required

# Database
POSTGRES_USER=your_user
POSTGRES_PASSWORD=your_password
POSTGRES_DB=lex_companion
POSTGRES_HOST=localhost
POSTGRES_PORT=5432

# Object Storage
MINIO_HOST=localhost:6503
MINIO_USER=your_user
MINIO_PASSWORD=your_password
MINIO_BUCKET=lex-companion

# Search
ELASTIC_HOST=localhost:6505
ELASTIC_PASSWORD=your_password
LEX_CHUNKS_INDEX=lex_chunks_v1
LEGAL_VECTOR_DIMS=1024

# LLM
OPENAI_API_KEY=your_key
OPENAI_BASE_URL=https://api.openai.com/v1
LLM_MODEL=gpt-4o

# Embedding
EMBEDDING_PROVIDER=openai
EMBEDDING_BASE_URL=http://localhost:6501/v1
EMBEDDING_MODEL=AITeamVN/Vietnamese_Embedding_v2

# Auth
JWT_SECRET_KEY=your_secret
GOOGLE_CLIENT_ID=your_client_id
GOOGLE_CLIENT_SECRET=your_client_secret
GOOGLE_REDIRECT_URI=http://localhost:3004/auth/google/callback

Frontend (build-time)

# web/.env
NEXT_PUBLIC_API_SERVER=http://localhost:5999
NEXT_PUBLIC_GOOGLE_CLIENT_ID=your_client_id
NEXT_PUBLIC_GOOGLE_OAUTH2_CALLBACK=http://localhost:3004/auth/google/callback

See docs/ARCHITECTURE.md for the complete environment variable reference.
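A missing variable is easier to diagnose at startup than in the middle of a request. The sketch below checks the variables listed under Required above. The helper `missing_vars` is hypothetical and not part of the project; it only reports names that are unset or blank.

```python
import os
from collections.abc import Mapping

# Names taken from the "Required" block above.
REQUIRED_VARS = (
    "POSTGRES_USER", "POSTGRES_PASSWORD", "POSTGRES_DB",
    "POSTGRES_HOST", "POSTGRES_PORT",
    "MINIO_HOST", "MINIO_USER", "MINIO_PASSWORD", "MINIO_BUCKET",
    "ELASTIC_HOST", "ELASTIC_PASSWORD", "LEX_CHUNKS_INDEX", "LEGAL_VECTOR_DIMS",
    "OPENAI_API_KEY", "OPENAI_BASE_URL", "LLM_MODEL",
    "EMBEDDING_PROVIDER", "EMBEDDING_BASE_URL", "EMBEDDING_MODEL",
    "JWT_SECRET_KEY", "GOOGLE_CLIENT_ID", "GOOGLE_CLIENT_SECRET",
    "GOOGLE_REDIRECT_URI",
)


def missing_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or blank in env."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_vars()
    if missing:
        print("Missing required variables: " + ", ".join(missing))
    else:
        print("All required variables are set.")
```

Run it with the same environment as the API, for example through `uv run --env-file .env`, so that the check sees the values the server would see.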

Development

Project structure

langgraph-base/
├── api/                    # FastAPI backend
│   ├── lex_companion_server.py
│   ├── apps/
│   │   ├── routers/        # Auto-loaded route definitions
│   │   ├── controllers/    # Request handlers
│   │   └── services/       # Business logic + orchestration
│   ├── db/models.py        # Peewee ORM models
│   └── worker/             # Redis stream background worker
├── deepagent/              # LangGraph agents + document processing
│   ├── multiagent/legal_assistant/  # Intent-specific graph workflows
│   └── core/               # Rerank, splitters, embeddings, HITL
├── model_serving/          # Standalone embedding + LLM services
├── web/                    # Next.js frontend
├── docker/                 # Docker Compose + Dockerfiles
├── docs/                   # Technical documentation
├── scripts/                # Startup scripts
├── pyproject.toml          # Python dependencies (uv)
└── .env.example            # Environment template

Running the API

# Recommended
uv run --env-file .env python -m api.lex_companion_server

# Or via script
./scripts/start_lex_api.sh

API: http://localhost:5999
OpenAPI docs: http://localhost:5999/docs

Do not run python api/lex_companion_server.py directly; use python -m api.lex_companion_server with PYTHONPATH=. (the repository root).

Adding dependencies

uv add requests              # Production dependency
uv add --dev pytest          # Dev dependency
uv sync                      # Reinstall from lockfile

Creating new API endpoints

Follow the layered pattern documented in api_creating_instruction.md:

Router → Controller → Service → DB/ES/Agent
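The layering above can be illustrated without any web framework. In the sketch below, the route path, the function names, and the in-memory `ARTICLES` data are hypothetical; the project itself uses FastAPI routers and the conventions in api_creating_instruction.md, which this example does not reproduce.

```python
from typing import Callable

# Hypothetical in-memory data standing in for the DB/ES/Agent layer.
ARTICLES = {
    "a1": "Placeholder article text one",
    "a2": "Placeholder article text two",
}


def find_articles(keyword: str) -> list[str]:
    """Service: business logic only, with no knowledge of HTTP or payloads."""
    needle = keyword.lower()
    return sorted(aid for aid, text in ARTICLES.items() if needle in text.lower())


def search_controller(payload: dict) -> dict:
    """Controller: validate the request, call the service, shape the result."""
    keyword = str(payload.get("keyword", "")).strip()
    if not keyword:
        raise ValueError("keyword is required")
    return {"article_ids": find_articles(keyword)}


# Router: maps (method, path) to a controller. The path is made up.
ROUTES: dict[tuple[str, str], Callable[[dict], dict]] = {
    ("POST", "/v1/example/search"): search_controller,
}


def dispatch(method: str, path: str, payload: dict) -> dict:
    """Look up the controller for a request and invoke it."""
    controller = ROUTES.get((method.upper(), path))
    if controller is None:
        raise LookupError(f"no route for {method} {path}")
    return controller(payload)
```

Keeping validation in the controller and logic in the service means the service can be called from an agent or a background worker without going through HTTP.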

LangGraph development

uv sync --group dev
uv run langgraph dev

Note: langgraph.json references a legacy graph path. Active graphs are in deepagent/multiagent/legal_assistant/.

Running tests

uv run python -m pytest tests/

Deployment

Docker Compose (full stack)

docker compose -f docker/docker-compose.yml up -d --build

Services and ports:

Service	Host Port	Purpose
PostgreSQL	5445	Relational database
MinIO	6503/6504	Object storage
Redis	6376	Task queue
Elasticsearch	6505	Search + vectors
Kibana	5602	ES management UI
Embedding	6502	Vietnamese embedding model
API	6000	FastAPI backend
Web	3005	Next.js frontend
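After the stack starts, a quick TCP check can confirm that each host port in the table accepts connections. The sketch below is a hypothetical helper rather than a script shipped with the project, and it only tests reachability, not service health. The README does not say which MinIO port serves which purpose, so both are labelled neutrally.

```python
import socket

# Host ports copied from the table above.
SERVICE_PORTS = {
    "PostgreSQL": 5445,
    "MinIO (first port)": 6503,
    "MinIO (second port)": 6504,
    "Redis": 6376,
    "Elasticsearch": 6505,
    "Kibana": 5602,
    "Embedding": 6502,
    "API": 6000,
    "Web": 3005,
}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host: str = "localhost") -> dict[str, bool]:
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in SERVICE_PORTS.items()}


if __name__ == "__main__":
    for name, reachable in check_services().items():
        print(f"{name:22s} {'reachable' if reachable else 'NOT reachable'}")
```

A port that accepts connections can still belong to a service that is not ready, so treat a passing check as a first signal only.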

Production considerations

Set strong secrets for JWT_SECRET_KEY, database passwords, and MinIO credentials
Configure RERANK_DEVICE=cuda:0 if GPU is available
Implement a persistent checkpointer (a Redis/Postgres scaffold exists) for HITL reliability
Import the Pháp điển corpus via POST /v1/admin/doc/upload after deployment
No CI/CD pipeline is included; set up your own

See docs/ARCHITECTURE.md for networking diagrams and detailed deployment notes.

API Overview

Domain	Prefix	Key Endpoints
Auth	`/v1/user`	`POST /oAuth-login`
Chat	`/v1/user`	`POST /user_chat`, `GET /sessions`, `GET /session`
Contract	`/v1/user`	`POST /contract/fill`, `GET /contract/draft/*`
Documents	`/v1`	`POST /doc/upload`, `GET /docs`, `POST /doc/run`
Admin	`/v1/admin`	`POST /doc/retrieval`, `POST /doc/upload`, `GET /doc/topic`

Full API documentation with inputs/outputs: docs/ARCHITECTURE.md

Interactive docs available at /docs when the API is running.

Contributing

Fork the repository
Create a feature branch (git checkout -b feature/your-feature)
Follow the existing code patterns:
- Backend: Router → Controller → Service layering
- Agents: Add nodes to intent-specific graphs in deepagent/multiagent/legal_assistant/
- Frontend: React Query hooks in web/hooks/, services in web/service/
Run tests: uv run python -m pytest tests/
Submit a pull request

Code conventions

Python 3.12+, type hints, Pydantic v2 models
Peewee ORM for database (not SQLAlchemy)
LangGraph StateGraph with typed state (LegalAssistantState)
API response envelope: { code, msg, data }
Environment variables via .env (never commit secrets)
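The `{ code, msg, data }` envelope can be pictured with two small helpers. The README only names the three fields, so the use of `0` for success, the default message, and the helper names `ok` and `fail` are assumptions. The sketch uses a stdlib dataclass to stay self-contained, whereas the project's convention is Pydantic v2 models.

```python
from dataclasses import asdict, dataclass
from typing import Any


@dataclass(frozen=True)
class Envelope:
    """Response wrapper with the three fields named in the conventions."""
    code: int
    msg: str
    data: Any = None


def ok(data: Any = None, msg: str = "success") -> dict:
    """Build a success envelope; code 0 is an assumed convention."""
    return asdict(Envelope(code=0, msg=msg, data=data))


def fail(code: int, msg: str) -> dict:
    """Build an error envelope with no data payload."""
    return asdict(Envelope(code=code, msg=msg))


success = ok({"session_id": "abc"})
error = fail(401, "invalid token")
```

A single envelope shape lets the frontend check one field to tell success from failure before reading `data`.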

License

See the LICENSE file in the repository root. Maintained by project contributors.

Documentation

Document	Description
docs/ARCHITECTURE.md	Full technical architecture (English)
docs/ARCHITECTURE.vi.md	Full technical architecture (Vietnamese)
api/deployment.readme.md	Manual Docker run for Postgres/MinIO/Redis
api_creating_instruction.md	API development conventions
model_serving/retrievers/elastic_search/deployment.readme.md	Elasticsearch setup

Acknowledgements

Lex Companion would not be possible without the open legal data shared by the community.

We are grateful to tmquan/phapdien-moj-gov-vn on Hugging Face for publishing the Vietnamese legal codex (Pháp điển) dataset sourced from the Ministry of Justice. This project uses multiple configs from that dataset — including tree_nodes, articles, subjects, and ontology metadata — as the foundation of our legal knowledge base, Elasticsearch indexing pipeline, and citation-backed retrieval.

Thank you to the maintainers and contributors of that dataset for making structured Vietnamese legal knowledge openly available.

Roadmap & Future Improvements

Lex Companion is actively evolving toward a full agentic legal assistant for Vietnam. The core RAG and information-intent workflows are in place; several specialized agents still need to be built out.

Agent completion

Agent	Path	Current state	Target
Decision	`deepagent/multiagent/legal_assistant/decision/`	Single-node flow: retrieval + placeholder options and estimates	Multi-step decision reasoning — risk analysis, option comparison, consequence mapping, and structured recommendations grounded in retrieved law
Problem solving	`deepagent/multiagent/legal_assistant/problem_solving/`	Single-node flow: retrieval + static strategy template	Dynamic legal problem decomposition — step-by-step action plans, milestone tracking, and iterative clarification when facts are incomplete

Other areas on the roadmap:

Exploration agent — richer open-ended legal research with web + corpus fusion
Persistent HITL checkpointing — Redis/Postgres checkpointer for reliable contract-fill resume across restarts
User document ingestion — complete Docling parse pipeline for uploaded KB documents
Calculator tools — real fine/penalty estimation logic (currently placeholder)
CI/CD & production hardening — automated testing, deployment pipelines, and observability

Contributions toward any of these areas are welcome — see Contributing above.
