GovAI Secure Intelligence Assistant (G-SIA) is a compliance-aware AI chatbot that enables secure, policy-driven access to de-identified patient data. It uses multi-agent workflows orchestrated with LangGraph, Retrieval-Augmented Generation (RAG) for regulatory policy enforcement, and LangSmith for observability. This project demonstrates how AI can safely operate in regulated environments like healthcare and public services.
- Enforce data privacy and regulatory compliance (HIPAA, GDPR, CCPA).
- Enable secure structured database access through AI agents.
- Provide explainable, policy-backed responses to user queries.
- Demonstrate robust agent orchestration and monitoring using the LangChain ecosystem.
- Policy-Aware Reasoning: Validates every query against policies using RAG.
- SQL Query Agent: Generates and executes parameterized queries.
- Query Rewriting: Modifies unsafe queries to meet regulations.
- Audit Logging: Tracks queries, decisions, and database interactions.
- Agent Observability: LangSmith for debugging and tracing.
- Scalability: Modular agents that can extend to new compliance rules.
+---------------------------------------------------+
|               User Interface / API                |
+---------------------------------------------------+
                          │
                          ▼
+---------------------------------------------------+
|        LangGraph Orchestration Controller         |
|    (manages workflow between agents and tools)    |
+---------------------------------------------------+
       │                       │                   │
       ▼                       ▼                   ▼
+--------------+      +-----------------+  +----------------+
| Policy Agent |----->| Query Rewriter  |  | Audit Logger   |
| (RAG over    |      | (only if needed)|  | (logs decisions|
| HIPAA/GDPR)  |      +-----------------+  | and executions)|
+--------------+               │           +----------------+
       │                       ▼
       │              +------------------+
       │              | SQL Query Agent  |
       │              | (LangChain SQL   |
       │              | toolkit)         |
       │              +------------------+
       │                       │
       ▼                       ▼
+----------------------------------------------------+
| Secure PostgreSQL (De-identified Patient Database) |
+----------------------------------------------------+
                          │
                          ▼
+---------------------------------------------------+
|   Response Formatter → Sends Compliant Response   |
+---------------------------------------------------+
                          │
                          ▼
+---------------------------------------------------+
|  Monitoring & Observability (LangSmith & Azure)   |
+---------------------------------------------------+
- Purpose: The Policy Agent determines whether a user query complies with regulations.
- Functionality:
  - Uses RAG to retrieve and interpret policies from HIPAA, GDPR, and CCPA.
  - Classifies queries as Allowed, Partially Allowed, or Blocked.
  - Provides reasons for blocking or modifying queries.
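
As a rough illustration of the three-way classification described above, the sketch below replaces the RAG lookup with hard-coded field sets. The names `Decision`, `PolicyVerdict`, `classify`, and both identifier sets are hypothetical and not taken from the project; in the real agent the verdict would presumably come from retrieved policy text.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOWED = "allowed"
    PARTIAL = "partially_allowed"
    BLOCKED = "blocked"


@dataclass
class PolicyVerdict:
    decision: Decision
    reason: str


# Hypothetical field lists. A real deployment would derive these from the
# retrieved policy documents rather than hard-coding them.
DIRECT_IDENTIFIERS = {"ssn", "first", "last", "address"}
QUASI_IDENTIFIERS = {"birthdate", "zip"}


def classify(requested_fields: set[str]) -> PolicyVerdict:
    """Toy stand-in for the RAG-backed policy check."""
    direct = requested_fields & DIRECT_IDENTIFIERS
    if direct:
        return PolicyVerdict(Decision.BLOCKED, f"direct identifiers requested: {sorted(direct)}")
    quasi = requested_fields & QUASI_IDENTIFIERS
    if quasi:
        return PolicyVerdict(Decision.PARTIAL, f"fields need rewriting: {sorted(quasi)}")
    return PolicyVerdict(Decision.ALLOWED, "no restricted fields requested")
```

Returning the reason alongside the decision keeps the explanation available to both the Query Rewriter and the audit trail.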
- Purpose: The Query Rewriter adjusts non-compliant queries to make them compliant.
- Functionality:
  - Removes sensitive fields or replaces them with aggregated metrics.
  - Ensures the rewritten query still provides useful information without violating policies.
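
One way to picture the rewriting step is as a pass over the requested fields that drops some and swaps others for aggregates. This is a minimal sketch under assumptions: `REWRITE_RULES`, `rewrite_fields`, and the column names are hypothetical, and the project's rewriter may well operate on the natural-language query or on SQL instead.

```python
from __future__ import annotations

# Hypothetical rewrite rules: None means "drop the field"; a string is the
# aggregated expression that replaces it.
REWRITE_RULES: dict[str, str | None] = {
    "ssn": None,
    "address": None,
    "healthcare_expenses": "AVG(healthcare_expenses) AS avg_healthcare_expenses",
}


def rewrite_fields(fields: list[str]) -> tuple[list[str], list[str]]:
    """Return the compliant field list plus notes describing each change."""
    kept: list[str] = []
    notes: list[str] = []
    for field in fields:
        if field not in REWRITE_RULES:
            kept.append(field)  # not sensitive, pass through unchanged
            continue
        replacement = REWRITE_RULES[field]
        if replacement is None:
            notes.append(f"removed {field}")
        else:
            kept.append(replacement)
            notes.append(f"aggregated {field}")
    return kept, notes
```

The notes list gives the response formatter something concrete to report when a query was modified. A caller that mixes plain columns with aggregates would still need to add a matching `GROUP BY`.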
- Purpose: The SQL Query Agent executes secure data retrieval from the PostgreSQL database.
- Functionality:
  - Converts natural language into parameterized SQL using LangChain SQL Toolkit.
  - Prevents direct access to identifiers and enforces aggregation thresholds.
  - Only executes queries approved by the Policy Agent.
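
The combination of parameterized SQL and an aggregation threshold can be sketched as follows. To stay self-contained this uses an in-memory SQLite database as a stand-in for PostgreSQL (SQLite uses `?` placeholders, whereas PostgreSQL drivers such as psycopg use `%s`). The single-table schema, `MIN_GROUP_SIZE`, and `count_patients` are assumptions for illustration only.

```python
import sqlite3
from typing import Optional

MIN_GROUP_SIZE = 5  # hypothetical threshold; smaller groups are suppressed


def count_patients(conn: sqlite3.Connection, description: str, city: str) -> Optional[int]:
    """Count distinct patients, returning None when the group is too small."""
    row = conn.execute(
        # User input is bound as parameters, never concatenated into the SQL.
        "SELECT COUNT(DISTINCT patient) FROM conditions "
        "WHERE description = ? AND city = ?",
        (description, city),
    ).fetchone()
    count = row[0]
    return count if count >= MIN_GROUP_SIZE else None


# Tiny demo dataset with a simplified, hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE conditions (patient TEXT, description TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO conditions VALUES (?, ?, ?)",
    [(f"p{i}", "Hypertension", "Denver") for i in range(6)] + [("p100", "Asthma", "Denver")],
)

print(count_patients(conn, "Hypertension", "Denver"))
print(count_patients(conn, "Asthma", "Denver"))
```

Suppressing small groups rather than returning the exact count reduces the chance that an aggregate can be traced back to an individual.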
- Purpose: The Audit Logger provides complete traceability of system actions.
- Functionality:
  - Logs every query, decision, SQL command, and response metadata.
  - Integrates with Azure Monitor & SIEM for compliance-friendly storage.
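
A minimal sketch of what one audit entry could look like, written as a JSON line so that it can be shipped to a log collector. The function name `write_audit_record` and the field names are assumptions; the project's actual `audit_logs` schema is not shown in this README.

```python
import io
import json
from datetime import datetime, timezone
from typing import Optional, TextIO


def write_audit_record(
    sink: TextIO,
    *,
    user_id: str,
    query: str,
    decision: str,
    sql: Optional[str],
    row_count: Optional[int],
) -> dict:
    """Append one JSON line describing a single interaction to the sink."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "query": query,
        "decision": decision,
        "sql": sql,              # None when the query was blocked
        "row_count": row_count,  # response metadata only, never the rows
    }
    sink.write(json.dumps(record) + "\n")
    return record


# Demo: an in-memory buffer stands in for the real log destination.
buffer = io.StringIO()
entry = write_audit_record(
    buffer,
    user_id="analyst-1",
    query="How many patients with hypertension were treated in Denver hospitals?",
    decision="allowed",
    sql="SELECT COUNT(DISTINCT patient) FROM conditions WHERE description = ? AND city = ?",
    row_count=1,
)
```

Logging the SQL template and row count, rather than result rows, keeps the audit trail itself free of patient data.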
- User Query: User asks a question (e.g., patient statistics).
- Policy Agent: Checks query legality using policy embeddings.
- Decision:
  - ✅ Allowed: Forward to SQL Agent.
  - ⚠ Partial: Query Rewriter modifies it.
  - ❌ Blocked: Returns explanation of violated policy.
- SQL Agent: Generates secure SQL, queries PostgreSQL.
- Audit Logger: Records the complete interaction.
- Response Formatter: Returns answer with compliance notes.
- LangSmith & Azure Monitor: Capture reasoning and security logs.
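
The routing in the steps above can be sketched in plain Python. In the project this control flow is handled by LangGraph; the version below deliberately avoids that library so it runs on its own, and every name in it (`run_workflow` and the `demo_*` stubs) is hypothetical.

```python
from typing import Callable, Optional, Tuple


def run_workflow(
    query: str,
    check_policy: Callable[[str], Tuple[str, str]],
    rewrite: Callable[[str], str],
    run_sql: Callable[[str], str],
    audit: Callable[[dict], None],
) -> dict:
    """Route one query through policy check, optional rewrite, SQL, and audit."""
    decision, reason = check_policy(query)
    executed: Optional[str] = None
    if decision == "blocked":
        result = {"answer": None, "note": f"Blocked: {reason}"}
    else:
        executed = rewrite(query) if decision == "partial" else query
        result = {"answer": run_sql(executed), "note": f"Decision: {decision} ({reason})"}
    # Every path, including blocked queries, ends with an audit entry.
    audit({"query": query, "executed": executed, "decision": decision, **result})
    return result


# Stub components so the sketch is runnable; real agents would replace these.
def demo_policy(query: str) -> Tuple[str, str]:
    if "ssn" in query:
        return "blocked", "direct identifier requested"
    if "birthdate" in query:
        return "partial", "quasi-identifier requested"
    return "allowed", "aggregate query"


def demo_rewrite(query: str) -> str:
    return query.replace("birthdate", "age band")


def demo_sql(query: str) -> str:
    return f"result for: {query}"
```

Passing the components in as callables mirrors the modular agent design: each one can be swapped or tested independently of the routing logic.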
Dataset: Synthea Synthetic Patient Dataset (fully de-identified).
Core Tables:
- patients: demographics
- conditions: diagnoses
- encounters: hospital visits
- medications: prescriptions
- organizations: hospital details
- audit_logs: query history & policy decisions
- LLM: OpenAI models
- RAG: (Any vector database) + LangChain Retriever
- Agents: LangChain + LangGraph for orchestration
- Observability: LangSmith
- Database: PostgreSQL
- Backend: FastAPI
- Security: RBAC, TLS, PII masking
- Logging: SIEM
govai-secure-assistant/
│── data/ # Synthea CSV files
│── db/ # SQL schema & migration scripts
│── agents/ # PolicyAgent, SQLAgent, QueryRewriter
│── retrievers/ # Policy document loaders
│── logs/ # Dev logs
│── app/ # FastAPI backend
│── notebooks/ # Jupyter tests
│── README.md # Documentation
│── requirements.txt # Dependencies
git clone https://github.com/your-username/govai-secure-assistant.git
cd govai-secure-assistant
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
# Load Synthea CSVs into PostgreSQL
python data/load_data.py
# Start FastAPI backend
uvicorn app.main:app --reload
- LangSmith: Traces agent reasoning for debugging.
- Azure Monitor + SIEM: Stores immutable logs for compliance.
User Query: "How many patients with hypertension were treated in Denver hospitals?"
- Policy Agent: ✅ Allowed (aggregated query)
- SQL Agent: Generates secure SQL → Executes
- Audit Logger: Logs full trace
- Response: "There were 184 patients diagnosed with hypertension treated in Denver hospitals. (HIPAA compliant)."
- Uses only de-identified synthetic data.
- Policy Agent enforces regulatory checks before any data access.
- Audit logs provide full traceability for compliance audits.
- Add differential privacy for aggregate queries.
- Add FHIR API support.
- Extend to other regulated domains.
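
For the differential privacy item, one common approach to aggregate counts is the Laplace mechanism. The sketch below is not part of the project; `noisy_count` and its parameters are hypothetical, and it assumes a counting query with sensitivity 1 (adding or removing one patient changes the count by at most 1).

```python
import random


def noisy_count(true_count: int, epsilon: float, rng: random.Random, sensitivity: float = 1.0) -> float:
    """Return the count plus Laplace noise with scale sensitivity / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rate = epsilon / sensitivity
    # The difference of two independent exponential draws with the same rate
    # follows a Laplace distribution with scale 1 / rate.
    noise = rng.expovariate(rate) - rng.expovariate(rate)
    return true_count + noise


rng = random.Random()
print(noisy_count(184, epsilon=1.0, rng=rng))
```

Smaller values of `epsilon` add more noise and give stronger privacy; the right setting is a policy decision rather than a purely technical one.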
- ✅ Advanced multi-agent orchestration with LangGraph.
- ✅ Implements policy-driven AI reasoning.
- ✅ Provides a secure and auditable AI solution.
- ✅ Ideal for showcasing AI Solution Architect expertise.
Sai Pratheek Kerthi Venkata
AI/ML Engineer | Cloud & Data Security Enthusiast
LinkedIn | GitHub