Fine-tune Llama 3 8B on your data · deploy on Vertex AI · wrap with FastAPI
- Domain-specific fine-tuned Llama 3 8B via PEFT/LoRA
- Google Vertex AI deployment — scalable, production-ready endpoint
- FastAPI wrapper with auth, rate-limit, and OpenAPI docs
- LangChain RAG pipeline (STANDARD+) — retrieval over your documents
- Full source code + Dockerfile + deployment guide
POST /v1/chat -> 200 OK model: argus-llama3-8b-finetuned-v1 latency: 312ms
Order on Fiverr - Argus Intelligence · From $180 · 48h delivery
MIT © Argus Intelligence · marcosantcs