Agentic Auditor is an AI-driven contract reviewer that runs a Drafter–Critic agent loop to identify risks and unfair clauses in legal documents quickly and with reduced hallucination.
- Drafter & Critic agent loop to reduce hallucination
- PDF ingestion and chunking
- Local policy retrieval via Qdrant vector DB
- JSON report + visual risk log output
- Self-healing: auto-downloads data and rebuilds DB if missing
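The Drafter and Critic loop in the list above can be pictured as a bounded draft, review, revise cycle. The sketch below is plain Python with stub functions standing in for LLM calls. The names `draft`, `critique`, `run_loop`, and the `max_rounds` limit are hypothetical and are not taken from the project's LangGraph code.

```python
from dataclasses import dataclass


@dataclass
class Review:
    approved: bool
    feedback: str = ""


def draft(clause: str, feedback: str) -> str:
    """Stub Drafter: a real one would call an LLM with the clause and feedback."""
    if feedback:
        return f"Risk assessment of '{clause}' (revised: {feedback})"
    return f"Risk assessment of '{clause}'"


def critique(assessment: str) -> Review:
    """Stub Critic: approves only assessments that have been revised once."""
    if "revised" in assessment:
        return Review(approved=True)
    return Review(approved=False, feedback="cite the matching policy")


def run_loop(clause: str, max_rounds: int = 3) -> tuple[str, int]:
    """Alternate Drafter and Critic until approval or the round limit."""
    feedback = ""
    assessment = ""
    for round_no in range(1, max_rounds + 1):
        assessment = draft(clause, feedback)
        review = critique(assessment)
        if review.approved:
            return assessment, round_no
        feedback = review.feedback  # fed back into the next draft
    return assessment, max_rounds


result, rounds = run_loop("Vendor may terminate without notice")
print(rounds, result)
```

The round limit keeps the loop from cycling indefinitely when the Critic never approves; the last draft is returned in that case.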
- Frontend: Streamlit
- Orchestration: LangGraph + Groq (GPT OSS)
- Vector DB / Memory: Qdrant
- Ingestion: Unstructured + Poppler
- Python 3.10+
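Ingestion splits document text into chunks before retrieval. As a rough illustration of chunking only, the sketch below cuts text into overlapping word windows. The function name `chunk_words` and the `size` and `overlap` parameters are assumptions; the project's actual chunking through Unstructured may work differently.

```python
def chunk_words(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into windows of `size` words that share `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


print(chunk_words("a b c d e f g h i j", size=4, overlap=1))
```

Overlap means a clause that straddles a chunk boundary still appears whole in at least one chunk, provided it is shorter than the overlap.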
- Clone the repo:
git clone https://github.com/Akshad135/agentic-auditor
cd agentic-auditor
- Create and activate a virtual environment:
macOS / Linux:
python -m venv .venv
source .venv/bin/activate
Windows (PowerShell):
python -m venv .venv
.venv\Scripts\Activate.ps1
- Install dependencies:
pip install -r requirements.txt
- (Optional, GPU users) Restore CUDA PyTorch after installing dependencies (the dependency install tends to replace torch with a CPU-only build):
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121 --force-reinstall
- Python 3.10+
- Poppler (required for PDF processing)
Install Poppler:
- macOS:
brew install poppler
- Ubuntu / Debian:
sudo apt-get install poppler-utils
- Windows:
Download a Poppler binary for Windows and add its bin/ folder to your PATH. Restart your terminal/IDE after updating PATH.
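After installing Poppler, a quick way to confirm it is visible on PATH is to look for its command-line tools. The sketch below checks for `pdftoppm` and `pdfinfo`, two utilities that ship with Poppler. The helper name `missing_poppler_tools` is hypothetical, and this is not the project's own check script, whose contents are not shown here.

```python
import shutil

# Two command-line utilities distributed with Poppler.
POPPLER_TOOLS = ("pdftoppm", "pdfinfo")


def missing_poppler_tools(tools=POPPLER_TOOLS) -> list[str]:
    """Return the names of tools that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


missing = missing_poppler_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("Poppler tools found on PATH.")
```

If the tools are reported missing on Windows right after editing PATH, restarting the terminal or IDE is the first thing to try.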
Copy the example environment file and add your Groq API key:
cp .env.example .env
# Edit .env and set GROQ_API_KEY and other variables as needed
Start the Streamlit dashboard:
streamlit run app.py
Upload a PDF or paste text in the UI. The Drafter and Critic will debate assessments in the sidebar and the app will generate a structured JSON report and a visual risk log.
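To give a feel for what a structured JSON report could look like, the sketch below serializes a list of findings with the highest risk first. The report schema is not documented here, so the `Finding` fields (`clause`, `risk_level`, `rationale`), the `build_report` helper, the three-level risk scale, and the sample values are all invented for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Finding:
    clause: str
    risk_level: str  # hypothetical scale: "low", "medium", "high"
    rationale: str


def build_report(document: str, findings: list[Finding]) -> str:
    """Serialize findings to a JSON string, highest risk first."""
    order = {"high": 0, "medium": 1, "low": 2}
    # Unknown risk levels sort after all known ones.
    ranked = sorted(findings, key=lambda f: order.get(f.risk_level, len(order)))
    payload = {
        "document": document,
        "finding_count": len(ranked),
        "findings": [asdict(f) for f in ranked],
    }
    return json.dumps(payload, indent=2)


report = build_report(
    "sample_contract.pdf",  # made-up file name
    [
        Finding("Auto-renewal for 5 years", "medium", "Long lock-in period"),
        Finding("Unlimited liability for customer", "high", "One-sided exposure"),
    ],
)
print(report)
```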
- System checks (GPU, Poppler, API)
python scripts/check.py
- Rebuild vector DB and playbook
python development/setup_db.py
- Run CLI audit
python development/run_audit.py
app.py # Main Streamlit dashboard
src/agents/ # LangGraph agent logic (Drafter / Critic)
data/ # Vector DB, raw PDFs, and artifacts
development/ # Scripts: build DB, CLI, data download
requirements.txt
.env.example
- Designed to auto-bootstrap: on first run it will attempt to download training data, derive policies, and build the vector index if missing.
- Reports are produced as structured JSON plus an interactive risk log for review.
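The auto-bootstrap behavior described above follows a common "build only what is missing" pattern. The sketch below shows that pattern in isolation, using a temporary directory and placeholder build steps. The `bootstrap` function, the `make_dir` step, and the `raw`, `policies`, and `vector_db` names are hypothetical and do not reflect the project's actual paths or logic.

```python
import tempfile
from pathlib import Path
from typing import Callable

Step = tuple[str, Callable[[Path], None]]


def bootstrap(data_dir: Path, steps: list[Step]) -> list[str]:
    """Run each step whose target path is missing; return the names that ran."""
    ran = []
    for relative, build in steps:
        target = data_dir / relative
        if not target.exists():
            build(target)
            ran.append(relative)
    return ran


def make_dir(path: Path) -> None:
    """Placeholder for a real step such as downloading data or indexing."""
    path.mkdir(parents=True)


with tempfile.TemporaryDirectory() as tmp:
    steps = [("raw", make_dir), ("policies", make_dir), ("vector_db", make_dir)]
    first = bootstrap(Path(tmp), steps)   # everything is missing, so all steps run
    second = bootstrap(Path(tmp), steps)  # nothing is missing, so nothing runs

print(first, second)
```

Because each step is skipped when its target already exists, rerunning the bootstrap is cheap and only repairs what was deleted.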