This repository documents my end-to-end journey to becoming a production-ready AI Engineer. The focus is not on demos or tutorials, but on building reliable, testable, and deployable LLM systems.
Each phase builds on the previous one and produces concrete artifacts.
Goal: Understand LLM behavior without abstractions.
- Raw Python scripts calling LLM APIs
- Structured JSON outputs (no free-form text)
- Basic logging for tokens, latency, and failures
- Token limits and context windows
- Determinism vs creativity
- Why structured outputs matter in production
📁 Folder: phase_0_llm_basics/
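The pieces above (raw calls, JSON-only outputs, and logging of tokens, latency, and failures) can be sketched roughly as follows. This is a minimal illustration, not the code in this folder: `fake_llm`, `call_structured`, and the response shape with a `usage` field are assumptions standing in for a real provider API.

```python
import json
import logging
import time
from typing import Callable, Optional

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("llm_basics")


def fake_llm(prompt: str) -> dict:
    """Hypothetical stand-in for a real API call; returns text plus token usage."""
    return {
        "text": '{"sentiment": "positive", "confidence": 0.9}',
        "usage": {"prompt_tokens": len(prompt.split()), "completion_tokens": 9},
    }


def call_structured(
    prompt: str, required_keys: set, llm: Callable[[str], dict] = fake_llm
) -> Optional[dict]:
    """Call the model, log latency and tokens, and return parsed JSON or None."""
    start = time.perf_counter()
    response = llm(prompt)
    latency_ms = (time.perf_counter() - start) * 1000
    usage = response["usage"]
    logger.info(
        "latency_ms=%.1f prompt_tokens=%d completion_tokens=%d",
        latency_ms, usage["prompt_tokens"], usage["completion_tokens"],
    )
    try:
        parsed = json.loads(response["text"])
    except json.JSONDecodeError:
        logger.warning("failure=invalid_json")
        return None
    # Reject replies that are valid JSON but lack the fields the caller needs.
    if not isinstance(parsed, dict) or not required_keys.issubset(parsed):
        logger.warning("failure=missing_keys")
        return None
    return parsed


result = call_structured("Classify: I love this product", {"sentiment", "confidence"})
```

Returning `None` on a malformed reply keeps the failure explicit, so the caller has to decide whether to retry or fall back.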
Goal: Use LangChain as an orchestration tool, not a crutch.
- PromptTemplate
- Chat models
- Pydantic output parsers
- Runnable pipelines (`|` operator)
- Deterministic LLM pipelines
- Parallel execution (summary + keywords)
- Output validation and retries
📁 Folder: phase_1_langchain_core/
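To make the `|` composition and the validate-then-retry idea concrete, here is a framework-free sketch. It deliberately does not use LangChain: `Step`, `with_retries`, and the scripted `replies` are hypothetical names that only mimic the pipeline idea, under the assumption that a step is a function from input to output.

```python
import json
from typing import Any, Callable


class Step:
    """Illustrative wrapper; not LangChain's Runnable, only the same composition idea."""

    def __init__(self, fn: Callable[[Any], Any]):
        self.fn = fn

    def __or__(self, other: "Step") -> "Step":
        # a | b builds a new step that runs a, then feeds its output to b.
        return Step(lambda value: other.fn(self.fn(value)))

    def invoke(self, value: Any) -> Any:
        return self.fn(value)


def with_retries(step: Step, attempts: int) -> Step:
    """Re-run a step when it raises ValueError, up to `attempts` times."""
    def run(value: Any) -> Any:
        last_error = None
        for _ in range(attempts):
            try:
                return step.invoke(value)
            except ValueError as error:
                last_error = error
        raise RuntimeError(f"gave up after {attempts} attempts") from last_error
    return Step(run)


# Hypothetical model stub: the first reply is malformed, the second is valid JSON.
replies = iter(["oops", '{"summary": "LLMs need validation", "keywords": ["llm"]}'])

prompt = Step(lambda text: f"Summarize as JSON: {text}")
model = Step(lambda _prompt: next(replies))


def parse(raw: str) -> dict:
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if "summary" not in data:
        raise ValueError("missing 'summary'")
    return data


# The retry wraps model + parser together, so a bad reply triggers a fresh call.
chain = prompt | with_retries(model | Step(parse), attempts=3)
result = chain.invoke("Structured outputs reduce parsing failures.")
```

Wrapping the model and the parser in one retried step is a design choice: retrying the parser alone would only re-parse the same bad text.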
Goal: Build explainable, debuggable RAG systems.
- Embeddings and similarity search
- Chunking strategies and their impact
- Hallucination failure modes
- Document loaders and text splitters
- FAISS-based vector store
- Retrieval + answer synthesis pipeline
- Source citation and fallback responses
📁 Folder: phase_2_rag/
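The retrieval side of this phase (chunking, similarity search, source citation, and a fallback when nothing relevant is found) can be illustrated with a toy, stdlib-only sketch. Everything here is an assumption for illustration: the word-count `embed` stands in for a real embedding model, a linear scan stands in for a vector store, and the documents and the `min_score` threshold are made up.

```python
import math
import re
from collections import Counter
from typing import Optional


def chunk(text: str, size: int = 8, overlap: int = 2) -> list:
    """Split text into windows of `size` words that share `overlap` words."""
    words = text.split()
    step = size - overlap  # assumes overlap < size
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real system would use a model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, docs: dict, min_score: float = 0.2) -> Optional[tuple]:
    """Return (source, chunk, score) for the best chunk, or None as a fallback."""
    best = None
    query_vec = embed(query)
    for source, text in docs.items():
        for piece in chunk(text):
            score = cosine(query_vec, embed(piece))
            if best is None or score > best[2]:
                best = (source, piece, score)
    return best if best and best[2] >= min_score else None


# Hypothetical corpus keyed by source name, so answers can cite where they came from.
docs = {
    "faiss.md": "FAISS stores dense vectors and supports fast similarity search",
    "api.md": "FastAPI serves async endpoints with streaming responses",
}

hit = retrieve("how does similarity search work", docs)
miss = retrieve("quarterly revenue forecast", docs)  # nothing relevant: fall back
```

When `retrieve` returns `None`, the pipeline can answer with an explicit "not found in the sources" message instead of letting the model guess.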
Goal: Measure quality instead of guessing.
- Golden evaluation dataset
- Prompt/version comparison scripts
- Hallucination and refusal tracking
- Structured output validation
Many GenAI systems fail silently, returning plausible but wrong output without raising an error. This phase ensures improvements are measurable and repeatable.
📁 Folder: phase_3_evaluation/
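As a rough illustration of comparing prompt versions against a golden dataset, the sketch below scores two stubbed versions on accuracy, hallucinations, and refusals. The dataset, the `REFUSE` marker, the refusal string, and both `prompt_v1` and `prompt_v2` are hypothetical; a real run would call the model instead of a lookup table.

```python
from typing import Callable

# Hypothetical golden set: each case pairs a question with the expected answer.
GOLDEN = [
    {"question": "capital of France?", "expected": "Paris"},
    {"question": "2 + 2?", "expected": "4"},
    {"question": "CEO of a company not in the context?", "expected": "REFUSE"},
]

REFUSAL = "I don't know"
KNOWN = {"capital of France?": "Paris", "2 + 2?": "4"}


def evaluate(answer_fn: Callable[[str], str]) -> dict:
    """Score one prompt version: accuracy, hallucinations, and refusals."""
    correct = hallucinated = refused = 0
    for case in GOLDEN:
        answer = answer_fn(case["question"])
        should_refuse = case["expected"] == "REFUSE"
        if answer == REFUSAL:
            refused += 1
            correct += int(should_refuse)
        elif should_refuse:
            hallucinated += 1  # answered when the golden set expects a refusal
        elif case["expected"].lower() in answer.lower():
            correct += 1
    return {
        "accuracy": correct / len(GOLDEN),
        "hallucinated": hallucinated,
        "refused": refused,
    }


def prompt_v1(question: str) -> str:
    """Stub for a version that always answers, even without evidence."""
    return KNOWN.get(question, "Jane Doe")


def prompt_v2(question: str) -> str:
    """Stub for a version instructed to refuse when the answer is unknown."""
    return KNOWN.get(question, REFUSAL)


report_v1 = evaluate(prompt_v1)
report_v2 = evaluate(prompt_v2)
```

Tracking refusals separately from accuracy matters because a version that refuses everything would never hallucinate, yet would still be useless.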
Goal: Ship an AI service, not a notebook.
- FastAPI backend
- Async and streaming LLM responses
- Caching for embeddings and LLM calls
- Rate limits, timeouts, and retries
- Deployed service with a public endpoint
📁 Folder: phase_4_production_api/
Goal: Use agents only where they make sense.
- Tool calling and ReAct pattern
- Guardrails and bounded reasoning
- Failure recovery
- Agent calling custom Python tools
- Explicit step limits
- Safe stopping conditions
📁 Folder: phase_5_agents/
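A bounded agent loop with custom Python tools, an explicit step limit, and a safe stopping condition might look like the sketch below. The tools, the `(action, argument)` policy interface, and both policies are hypothetical; in a real agent the policy would be an LLM choosing the next tool call.

```python
from typing import Callable

# Hypothetical tools; a real agent would expose these through tool calling.
TOOLS = {
    "add": lambda args: str(sum(int(x) for x in args.split())),
    "upper": lambda args: args.upper(),
}


def run_agent(policy: Callable[[list], tuple], max_steps: int = 5) -> dict:
    """Run an act-observe loop until the policy finishes or the budget runs out."""
    observations: list = []
    for step in range(1, max_steps + 1):
        action, argument = policy(observations)
        if action == "finish":
            return {"status": "done", "answer": argument, "steps": step}
        tool = TOOLS.get(action)
        if tool is None:
            observations.append(f"error: unknown tool {action!r}")
            continue
        try:
            observations.append(tool(argument))
        except Exception as error:  # feed tool failures back instead of crashing
            observations.append(f"error: {error}")
    # Safe stop: report that the limit was hit rather than looping forever.
    return {"status": "step_limit", "answer": None, "steps": max_steps}


def scripted_policy(observations: list) -> tuple:
    """Stand-in for the LLM: add two numbers, then finish with the result."""
    if not observations:
        return ("add", "2 3")
    return ("finish", observations[-1])


def looping_policy(observations: list) -> tuple:
    """A policy that never finishes, to show the step limit stopping it."""
    return ("upper", "again")


ok = run_agent(scripted_policy)
stuck = run_agent(looping_policy, max_steps=3)
```

Returning a `step_limit` status instead of raising lets the calling service degrade gracefully, for example by falling back to a non-agent answer.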
Goal: Prove system-level understanding.
- Reimplemented one RAG / agent flow without LangChain
- Compared complexity, control, and debuggability
This phase ensures I understand the architecture, not just the framework.
📁 Folder: phase_6_framework_free/
- Python
- FastAPI
- LangChain
- Chroma
- Pydantic
- AsyncIO
- Structured outputs over free text
- Explicit failure handling
- Measurable improvements
- Production-first mindset
🚧 Actively building and iterating
📌 Focus: reliability, cost control, and system clarity
- Improve evaluation coverage
- Add cost dashboards
- Explore multi-modal inputs