A Retrieval-Augmented Generation (RAG) system that monitors LangChain GitHub issues, extracts business and technical insights, stores them as vector embeddings in Milvus, and serves intelligent answers via a FastAPI endpoint. Optionally fine-tunes a small LLM (Phi-3) on the generated insights for grounded response generation.
GitHub Issues ──> Ingestion ──> Preprocessing ──> Insight Extraction
                                                          │
                                                          ▼
              API <── Milvus Vector DB <── Embeddings (BGE-m3)
├── src/
│   ├── data/                        # GitHub data ingestion
│   │   └── github_ingest.py
│   ├── preprocess/                  # Data cleaning & insight generation
│   │   ├── prepare_github_data.py
│   │   ├── finalize_dataset_github.py
│   │   ├── send_github_issues_to_llm.py
│   │   ├── summarize_github_issues.py
│   │   └── generate_business_tech_insights.py
│   ├── agent_pipeline/              # LangGraph agent workflows
│   │   ├── agent_github_ingest/     # Issue fetching agent
│   │   └── agent_embed_data/        # Embedding & Milvus insertion agent
│   ├── rag_pipeline/                # RAG query & retrieval
│   │   ├── set_up_milvus_db.py
│   │   ├── form_data_for_collection_1.py
│   │   ├── form_data_for_collection_2.py
│   │   ├── generate_response.py
│   │   └── test_query.py
│   └── llm_finetuning/              # Phi-3 fine-tuning with LoRA
│       └── phi-instruct/
│           ├── model.py
│           ├── trainer.py
│           └── data.py
├── deployments/
│   └── deploy_rag_service/          # FastAPI deployment
│       ├── api_milvus.py
│       ├── start_service.sh
│       └── test_api.py
├── data/
│   └── processed/                   # Processed insights & embeddings
Fetches GitHub issues and comments via the GitHub API with rate-limit handling. Available as both a standalone script (src/data/github_ingest.py) and a LangGraph agent (src/agent_pipeline/agent_github_ingest/).
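The rate-limit handling can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `github_ingest.py`: the helper names `seconds_until_reset` and `fetch_issues_page` are hypothetical, and the sketch relies only on the `X-RateLimit-Remaining` and `X-RateLimit-Reset` response headers that the GitHub REST API returns.

```python
import json
import time
import urllib.request


def seconds_until_reset(headers, now=None):
    """Return how long to wait before the next request (0 if quota remains)."""
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    reset_at = int(headers.get("X-RateLimit-Reset", 0))  # epoch seconds
    if remaining > 0:
        return 0.0
    # Quota exhausted: wait until the reset timestamp, never a negative time.
    return max(0.0, reset_at - now)


def fetch_issues_page(repo, token, page=1, per_page=100):
    """Fetch one page of issues for `repo` ("owner/name"). Hypothetical helper."""
    url = (
        f"https://api.github.com/repos/{repo}/issues"
        f"?state=all&per_page={per_page}&page={page}"
    )
    request = urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(request) as response:
        issues = json.loads(response.read().decode("utf-8"))
        wait = seconds_until_reset(response.headers)
    if wait > 0:
        time.sleep(wait)  # pause so the next page request is not rejected
    return issues
```

Keeping the wait computation in a pure function makes it easy to unit-test without touching the network.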
Cleans markdown, removes emojis, classifies issues (bug/feature/etc.), batches them, and uses an LLM (Ollama/Gemma) to extract business insights and technical insights per batch.
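A minimal sketch of the cleaning, classification, and batching steps is shown here. It is not the repository's implementation: the regular expressions are deliberately rough, the emoji ranges are an approximation, and the keyword classifier is a hypothetical stand-in for whatever logic the preprocess scripts actually use.

```python
import re

# Approximate emoji ranges (assumption; not an exhaustive Unicode emoji list).
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")
CODE_FENCE_RE = re.compile(r"`{3}.*?`{3}", re.DOTALL)  # fenced code blocks
LINK_RE = re.compile(r"\[([^\]]+)\]\([^)]+\)")          # [text](url) -> text
MARKUP_RE = re.compile(r"[#*`]+")                       # headings, emphasis, inline code


def clean_text(text):
    """Strip common markdown syntax and emojis, then collapse whitespace."""
    text = CODE_FENCE_RE.sub(" ", text)
    text = LINK_RE.sub(r"\1", text)
    text = MARKUP_RE.sub("", text)
    text = EMOJI_RE.sub("", text)
    return re.sub(r"\s+", " ", text).strip()


def classify_issue(title, labels):
    """Keyword-based stand-in classifier returning 'bug', 'feature', or 'other'."""
    haystack = " ".join([title, *labels]).lower()
    if any(word in haystack for word in ("bug", "error", "crash", "fail")):
        return "bug"
    if any(word in haystack for word in ("feature", "enhancement", "request")):
        return "feature"
    return "other"


def batch(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]
```

Each batch would then be sent to the LLM in a single prompt; that call is omitted here because the prompt format is not described in this README.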
Two collections:
- issue_batches -- batch-level metadata and concatenated summaries
- issue_insights -- individual insight embeddings (1024-dim BGE-m3 vectors) for semantic search
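As a rough illustration, the `issue_insights` collection could be described with a schema like the one sketched here. Only the collection name and the 1024-dimensional vector come from this README; every field name and length limit is an assumption, and the real definitions live in `set_up_milvus_db.py`. The `pymilvus` import is deferred so the field list can be inspected without the library installed.

```python
EMBEDDING_DIM = 1024  # BGE-m3 dense vector size, as stated above

# Hypothetical field layout for issue_insights (names and lengths are assumptions).
INSIGHT_FIELDS = [
    {"field_name": "id", "datatype": "INT64", "is_primary": True},
    {"field_name": "repo", "datatype": "VARCHAR", "max_length": 256},
    {"field_name": "insight_type", "datatype": "VARCHAR", "max_length": 32},
    {"field_name": "text", "datatype": "VARCHAR", "max_length": 8192},
    {"field_name": "embedding", "datatype": "FLOAT_VECTOR", "dim": EMBEDDING_DIM},
]


def build_insight_schema():
    """Build a pymilvus CollectionSchema from INSIGHT_FIELDS (needs pymilvus)."""
    from pymilvus import DataType, MilvusClient

    schema = MilvusClient.create_schema(auto_id=True, enable_dynamic_field=False)
    for spec in INSIGHT_FIELDS:
        kwargs = dict(spec)
        # Translate the string name into the pymilvus DataType enum member.
        kwargs["datatype"] = getattr(DataType, kwargs["datatype"])
        schema.add_field(**kwargs)
    return schema
```

Storing `repo` and `insight_type` as scalar fields next to the vector is what would allow filtered semantic search at query time.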
FastAPI endpoint that embeds user queries with BGE-m3, retrieves top-K similar insights from Milvus, and returns ranked results filtered by repo and insight type.
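The retrieval step might look roughly like the sketch here. It assumes `client` is a `pymilvus.MilvusClient`, that `query_vector` is the 1024-dimensional BGE-m3 embedding of the user query, and that the collection has scalar fields named `repo`, `insight_type`, and `text`; those field names are hypothetical and may differ from `api_milvus.py`.

```python
import json


def build_filter(repo=None, insight_type=None):
    """Build a Milvus boolean filter expression; empty string means no filter."""
    clauses = []
    if repo:
        # json.dumps yields a double-quoted, escaped string literal.
        clauses.append(f"repo == {json.dumps(repo)}")
    if insight_type:
        clauses.append(f"insight_type == {json.dumps(insight_type)}")
    return " and ".join(clauses)


def search_insights(client, query_vector, top_k=5, repo=None, insight_type=None):
    """Return the top-K hits for a single query vector, ranked by similarity."""
    results = client.search(
        collection_name="issue_insights",
        data=[query_vector],  # one query vector, so one result list comes back
        limit=top_k,
        filter=build_filter(repo, insight_type),
        output_fields=["repo", "insight_type", "text"],
    )
    return results[0]
```

For example, `build_filter("langchain-ai/langchain", "business")` produces `repo == "langchain-ai/langchain" and insight_type == "business"`.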
Fine-tunes Microsoft Phi-3-mini-4k-instruct using 4-bit quantization + LoRA (r=8, alpha=16) with SFT to generate grounded answers from retrieved context.
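To give a feel for what r=8 and alpha=16 mean, the short calculation here uses the standard LoRA formulation, in which a frozen weight W (d_out x d_in) receives an update (alpha / r) * B @ A with A of shape r x d_in and B of shape d_out x r. The 3072 x 3072 projection size is an assumed example, and which modules the repository actually adapts is not stated in this README.

```python
LORA_R = 8       # rank from this README
LORA_ALPHA = 16  # alpha from this README


def lora_scaling(r=LORA_R, alpha=LORA_ALPHA):
    """Factor applied to the low-rank update B @ A in standard LoRA."""
    return alpha / r


def lora_param_count(d_in, d_out, r=LORA_R):
    """Trainable parameters added to one weight: A (r x d_in) plus B (d_out x r)."""
    return r * (d_in + d_out)


if __name__ == "__main__":
    d = 3072  # assumed square projection size, for illustration only
    added = lora_param_count(d, d)
    frozen = d * d
    print(f"scaling={lora_scaling()}, trainable={added}, frozen={frozen}")
```

Under these assumptions, each adapted matrix adds 49,152 trainable parameters next to 9,437,184 frozen ones, which is why LoRA on a 4-bit base model fits on modest hardware.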
- Python 3.10
- GitHub personal access token
- Milvus (Lite or Server)
pip install -r requirements.txt
Copy .env_example to .env and fill in:
GITHUB_TOKEN='' # GitHub API token
MILVUS_URI= # Milvus connection URI (default: localhost:19530)
DEVICE= # cpu or cuda
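A small sketch of how these settings might be read in Python follows. It assumes the variables have already been exported or loaded from `.env` by some other mechanism; the `load_settings` helper is hypothetical, and the `cpu` fallback for `DEVICE` is an assumption (only the Milvus default comes from the comment above).

```python
import os


def load_settings(env=None):
    """Read the three settings, falling back to defaults for empty values."""
    env = os.environ if env is None else env
    return {
        "github_token": env.get("GITHUB_TOKEN", ""),
        "milvus_uri": env.get("MILVUS_URI") or "localhost:19530",
        "device": env.get("DEVICE") or "cpu",  # assumed fallback
    }
```

Passing a plain dictionary instead of `os.environ` makes the function easy to test.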
python src/agent_pipeline/agent_github_ingest/agent.py
python src/preprocess/prepare_github_data.py
python src/preprocess/finalize_dataset_github.py
python src/preprocess/send_github_issues_to_llm.py
python src/preprocess/generate_business_tech_insights.py
python src/rag_pipeline/form_data_for_collection_1.py
python src/rag_pipeline/form_data_for_collection_2.py
python src/rag_pipeline/set_up_milvus_db.py
python src/agent_pipeline/agent_embed_data/agent.py
cd deployments/deploy_rag_service
pip install -r requirements.txt
uvicorn api_milvus:app --reload
curl -X POST http://localhost:8000/query \
-H "Content-Type: application/json" \
-d '{"query": "What are common dependency issues?", "top_k": 5}'
API Endpoints:
| Method | Path | Description |
|---|---|---|
| GET | /health | Health check |
| POST | /query | Search insights by semantic similarity |
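
For callers who prefer Python over curl, the `/query` request can be sketched with the standard library alone. The helper names are hypothetical, the base URL assumes the local Uvicorn default used in the curl example, and the response is returned as parsed JSON because its exact shape is not documented here.

```python
import json
import urllib.request


def build_query_request(query, top_k=5, base_url="http://localhost:8000"):
    """Build (but do not send) the POST /query request."""
    payload = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def query_insights(query, top_k=5):
    """Send the request to a running service and return the decoded JSON body."""
    with urllib.request.urlopen(build_query_request(query, top_k)) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the service running, `query_insights("What are common dependency issues?")` mirrors the curl call above.
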
| Layer | Technology |
|---|---|
| Orchestration | LangGraph |
| Vector DB | Milvus |
| Embeddings | BAAI/bge-m3 (1024-dim) |
| Fine-tuning | QLoRA (LoRA adapters on a 4-bit NF4 base model) via PEFT/TRL |
| API | FastAPI + Uvicorn |
| Data Source | GitHub REST API |