⚡ GMI GPU Cost Optimizer Agent

An agentic AI system that helps users find the optimal GPU deployment strategy on GMI Cloud. Powered by Kimi K2.5 with autonomous tool use and multi-step reasoning.

Live Demo →

Demo

What Makes This an Agent?

Unlike a simple chatbot or dashboard, this system:

Autonomously decides which tools to call based on your natural language input
Multi-step reasoning — chains multiple tool calls to build a complete analysis
Adaptive — asks clarifying questions or makes reasonable assumptions
Conversational — maintains context across turns for follow-up questions

Agent Architecture

User (natural language) → Kimi K2.5 (reasoning + planning)
                              ↕ (function calling)
                         Tool Execution Engine
                              ↕ (structured results)
                         Kimi K2.5 (synthesis)
                              → Recommendation + Charts

Agent Tools

Tool	Description
`get_gpu_catalog`	Browse available GPUs with specs and pricing
`calculate_cost`	Compute monthly costs for specific configurations
`find_cheapest`	Find optimal option within a budget constraint
`compare_deployment_modes`	Find serverless vs dedicated crossover points
`generate_scaling_plan`	Plan infrastructure for traffic growth
`visualize_cost_comparison`	Generate interactive cost comparison charts

Quick Start

Online (Vercel): https://gmi-gpu-optimizer.vercel.app

Local (Streamlit):

pip install -r requirements.txt
export GMI_API_KEY="your-gmi-api-key"
streamlit run app.py

Then try:

"I need to deploy a Llama 70B model for a chatbot, expecting 50 QPS with a $15k/month budget"
"Compare serverless vs dedicated for a 7B model with 5 QPS"
"Help me plan scaling from 10 QPS to 200 QPS for a 34B code assistant"

Tech Stack

Agent LLM: Kimi K2.5 (Moonshot AI) via GMI Cloud Inference API
Tool Use: OpenAI-compatible function calling
Frontend: Streamlit (local) / Vanilla HTML+JS (Vercel)
Charts: Plotly
Deployment: Vercel (Python serverless + static frontend)
API: GMI Cloud (api.gmi-serving.com)

GPU Options (Mock Pricing)

GPU	VRAM	Price
H200 SXM	141GB HBM3e	$3.99/hr
H100 SXM	80GB HBM3	$2.49/hr
A100 SXM	80GB HBM2e	$1.89/hr
L40S	48GB GDDR6X	$1.29/hr
Serverless	Auto	$0.35/$1.10 per 1M tokens

Built At

GMI Cloud Hackathon 2026

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
api		api
public		public
.gitignore		.gitignore
README.md		README.md
app.py		app.py
requirements.txt		requirements.txt
vercel.json		vercel.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

⚡ GMI GPU Cost Optimizer Agent

Live Demo →

Demo

What Makes This an Agent?

Agent Architecture

Agent Tools

Quick Start

Tech Stack

GPU Options (Mock Pricing)

Built At

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

⚡ GMI GPU Cost Optimizer Agent

Live Demo →

Demo

What Makes This an Agent?

Agent Architecture

Agent Tools

Quick Start

Tech Stack

GPU Options (Mock Pricing)

Built At

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages