LLM infrastructure & systems engineering. I build fast, deployable AI — from fine-tuning 70B models on Intel Gaudi2 clusters to running edge inference on Raspberry Pi.
EE + CS undergrad, Rank 1/328 @ Dayalbagh Educational Institute, Agra.
@ Gatespeed (Intel AI Partner Alliance) — Deployed DeepSeek-R1-Distill-Llama-70B on 8×Gaudi2 HPUs. Cut end-to-end inference latency from 261s to 46s (about 5.7× faster). Hit 1,199 tokens/s output throughput at 256 req/s concurrency. Built multilingual RAG with Qdrant. Fine-tuned a 70B model in 52 min (73.77% accuracy, 2.72 perplexity, 16.4M trainable params). Stack: vLLM, DeepSpeed ZeRO Stage 3, LoRA/PEFT, bfloat16, HPU Graphs.
@ Cadence Design Systems — Applied ML (Random Forest, LightGBM, XGBoost) to ECO optimizer pipelines for Samsung/Qualcomm/Renesas 3nm chips. Filtered out 65% of wasteful evaluations on average with zero QoR degradation. Integrated directly into the C/C++ Tempus and Certus signoff tools.
@ IIT Delhi MindLab — Prototyped adaptive NPC dialogue with fine-tuned LLMs (OCEAN personality modeling, vector memory). Built FastAPI middleware with therapeutic RL optimization using GQ-6 metrics.
| Project | What it is | Recognition |
|---|---|---|
| deGuppe | Decentralized real-time comms over Tor with hybrid blockchain storage | Best Poster @ NSC-47, Soonami Cohort 3 funded, Best Project Web3/AI for Good @ IITD Tryst |
| Gam-i-yog | Live yoga pose classifier + multimodal GenAI feedback (MediaPipe, PyTorch, K-means) | Best Poster @ DSC Winter (Waterloo/Birmingham), Toycathon National Finalist (GoI-funded) |
| Pehchaan | Edge face recognition (EdgeFace + LoRaLin distillation) on Raspberry Pi for turnstile/attendance | — |
| Abhinandan | Combat-resilient battlebot with shock-damped drivetrain | 1st Prize RoboWars @ IITD Tryst (first junior HS team to beat funded college teams) |
ML & Inference — PyTorch, vLLM, Ollama, Unsloth, LoRA/PEFT, GGUF, TensorFlow, OpenCV, MediaPipe
Infra & Systems — Linux/BSD, Docker/Kubernetes, NGINX, QEMU/KVM, DPDK, WireGuard, AWS (EC2, Lambda, S3, DynamoDB, Route 53)
Languages — Python, C/C++, Rust, Shell/Bash, Tcl/Tk, SQL
Hardware — Intel Gaudi2 HPU, RTX 5070/5090, Raspberry Pi
- 3rd year @ DEI Agra (2023–2027)
- Building toward AI infra research — inference systems, world models, BCI
- Open to research collabs and infra roles