Generative AI Engineer | RAG Architect | MLOps Specialist

I architect high-performance GenAI systems that bridge the gap between LLM research and production-grade engineering. My work focuses on latency optimization, agentic orchestration, and system reliability.

| Metric | Result | Project / Context |
|---|---|---|
| p50 Retrieval Latency | 18.4 ms | vectorDBpipe + local ChromaDB |
| p99 Retrieval Latency | 115.8 ms | vectorDBpipe Stress Test |
| Ops Reduction | 80% | Agentic Workflow Automation |
| Ingestion Throughput | 42.5 chunks/s | vectorDBpipe Modular Pipeline |
| Retrieval Quality (MRR) | 0.92 | UPSC RAG Intelligence Engine |
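
For readers unfamiliar with the metrics in the table above, here is a minimal sketch of how latency percentiles and mean reciprocal rank (MRR) are commonly computed. It is not the benchmarking code behind these numbers: the function names, the nearest-rank percentile method, and all sample data are illustrative assumptions.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def mean_reciprocal_rank(ranked_results, relevant):
    """Average of 1/rank of the first relevant hit per query (0 if none is retrieved)."""
    total = 0.0
    for results, gold in zip(ranked_results, relevant):
        for rank, doc_id in enumerate(results, start=1):
            if doc_id in gold:
                total += 1.0 / rank
                break
    return total / len(ranked_results)


# Hypothetical latency samples in milliseconds (not the benchmark data above).
latencies_ms = [12.0, 15.0, 18.0, 20.0, 22.0, 25.0, 30.0, 45.0, 80.0, 120.0]
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)

# Hypothetical ranked retrieval results and gold document IDs for three queries.
ranked = [["d1", "d2"], ["d3", "d4"], ["d5", "d6"]]
gold = [{"d1"}, {"d4"}, {"d9"}]
mrr = mean_reciprocal_rank(ranked, gold)

print(f"p50={p50} ms, p99={p99} ms, MRR={mrr:.2f}")
```

Other percentile definitions (for example, linear interpolation) give slightly different values on small samples, so a reported p99 is best read alongside the method and sample count used.
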

| Project | Role | Tactical Edge |
|---|---|---|
| vectorDBpipe | Lead Developer | Modular Python package standardizing RAG across FAISS/Pinecone. |
| Agentic AI Workflow | Architect | Autonomous multi-agent system with long-term FAISS memory persistence. |
| UPSC RAG Intelligence | Engineer | Academic Q&A engine designed to minimize hallucinations, built on hybrid dense/sparse retrieval. |
| Multi-Threaded Server | Systems Engineer | High-concurrency Java HTTP server with custom ThreadPool logic. |

- AI Core: LangChain, CrewAI, Prompt Engineering, RAG Architectures, Agentic Workflows.
- Data Infrastructure: Pinecone, FAISS, ChromaDB, Sentence-Transformers, SQL.
- Backend Mastery: Python (FastAPI/Flask), Java (Spring Boot), REST APIs, Multithreading.
- MLOps & DevOps: Docker, GitHub Actions, MLflow, CI/CD, Nix.
- Engineering Hub: yash.jobos.online
- Technical Dossier: Download PDF Portfolio
- LinkedIn: Yash Desai

✨ “Architecting the future of autonomous intelligence, one benchmark at a time.”


