On-device voice assistant powered by FunctionGemma-270M + Gemini Flash cloud fallback
Try it live: Open demo/index.html in Edge or Chrome.
Features:
- 🎤 Voice-first — Tap the orb, speak naturally
- ⚡ On-device inference — 50-80ms via FunctionGemma-270M
- ☁️ Cloud escalation — Auto-routes hard queries to Gemini Flash
- 🇦🇺 Australian male voice — TTS responses via Microsoft Neural voices
- 📊 Real-time pipeline — See STT → Classify → Route → Infer → TTS live
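The per-stage telemetry in the last feature can be illustrated with a small timing helper. This is a minimal sketch, not the demo's actual code: the `PipelineTrace` class and the lowercase stage names are hypothetical, chosen only to mirror the STT → Classify → Route → Infer → TTS sequence.

```python
import time
from contextlib import contextmanager

# Stage names mirror the pipeline listed above (hypothetical identifiers).
STAGES = ("stt", "classify", "route", "infer", "tts")


class PipelineTrace:
    """Collects wall-clock timings, in milliseconds, for each pipeline stage."""

    def __init__(self):
        self.timings_ms = {}

    @contextmanager
    def stage(self, name):
        if name not in STAGES:
            raise ValueError(f"unknown stage: {name}")
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the elapsed time even if the stage raised an exception.
            self.timings_ms[name] = (time.perf_counter() - start) * 1000.0


trace = PipelineTrace()
with trace.stage("classify"):
    pass  # the real classification work would run here
print(trace.timings_ms)
```

A UI could poll `timings_ms` after each stage to update the live pipeline view.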
User Query (Voice / Text)
│
▼
┌─────────────────────────┐
│ COMPLEXITY ROUTER │ ◄ Deterministic, <1ms
│ EASY | MEDIUM | HARD │ No LLM call
└────────┬────────────────┘
│
▼
┌─────────────────────────┐
│ Cactus SDK + Gemma │ ◄ Unified hybrid path
│ threshold → routing │ SDK handles escalation
└────────┬────────────────┘
│
┌────┴────┐
▼ ▼
┌────────┐ ┌──────────┐
│ ⚡ Local│ │ ☁️ Cloud │
│ 50-80ms│ │ 200-500ms│
│ Gemma │ │ Gemini │
└───┬────┘ └────┬─────┘
│ │
▼ ▼
┌─────────────────────────┐
│ TRN VALIDATOR │ ◄ Schema + type check
│ + Postprocessor │ F1 normalization
└─────────────────────────┘
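The first two boxes of the diagram can be sketched in plain Python. This is an illustration under stated assumptions, not the project's router: the cue-word list, the word-count cutoff, and the `CONFIDENCE_THRESHOLD` values are invented for the example, and in the real pipeline the Cactus SDK handles escalation. The point is only that classification uses string checks and no model call.

```python
import re
from enum import Enum


class Complexity(Enum):
    EASY = "easy"
    MEDIUM = "medium"
    HARD = "hard"


# Hypothetical cue words suggesting a multi-step or conditional request.
_CUES = re.compile(r"\b(and|then|after|before|if|unless)\b", re.IGNORECASE)

# Illustrative thresholds: harder queries demand more local confidence.
CONFIDENCE_THRESHOLD = {
    Complexity.EASY: 0.3,
    Complexity.MEDIUM: 0.6,
    Complexity.HARD: 0.9,
}


def classify(query: str, num_tools: int) -> Complexity:
    """Rule-based classification: string checks only, no LLM call."""
    cues = len(_CUES.findall(query))
    if cues >= 2 or len(query.split()) > 25:
        return Complexity.HARD
    if cues == 1 or num_tools > 3:
        return Complexity.MEDIUM
    return Complexity.EASY


def should_escalate(level: Complexity, local_confidence: float) -> bool:
    """Route to the cloud when local confidence is below the level's threshold."""
    return local_confidence < CONFIDENCE_THRESHOLD[level]


level = classify("Set an alarm for 7am and text Mum", num_tools=2)
print(level, should_escalate(level, local_confidence=0.5))
```

Because the classifier is a handful of regex and length checks, the result is reproducible for a given input and cheap enough to run on every query.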
- Cactus SDK Unified Path — Single cactus_complete call with native hybrid routing
- Cloud Fallback — SDK-managed escalation to Gemini Flash for complex queries
- Deterministic Complexity Router — No LLM call needed for routing; adds under 1 ms of overhead
- TRN Validator — Validates every tool call against its JSON schema
- Voice-First UI — Animated orb, action cards, pipeline telemetry
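The schema and type check performed by the validator can be sketched with the standard library alone. This is a minimal sketch under assumptions: the `validate_call` helper, the tool-definition layout (`name`, `parameters`, `properties`, `required`), and the `set_alarm` tool are hypothetical stand-ins, not the project's actual validator or tool set.

```python
from typing import Any

# Maps JSON Schema type names to the Python types they accept.
_JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "array": list,
    "object": dict,
}


def validate_call(call: dict[str, Any], tools: list[dict[str, Any]]) -> list[str]:
    """Return a list of problems; an empty list means the call is valid."""
    spec = next((t for t in tools if t["name"] == call.get("name")), None)
    if spec is None:
        return [f"unknown tool: {call.get('name')!r}"]

    params = spec.get("parameters", {})
    props = params.get("properties", {})
    args = call.get("arguments", {})
    errors = []

    for key in params.get("required", []):
        if key not in args:
            errors.append(f"missing required argument: {key}")

    for key, value in args.items():
        if key not in props:
            errors.append(f"unexpected argument: {key}")
            continue
        type_name = props[key].get("type")
        expected = _JSON_TYPES.get(type_name)
        # bool is a subclass of int in Python, so reject it for numeric fields.
        bool_as_number = isinstance(value, bool) and type_name in ("integer", "number")
        if expected and (not isinstance(value, expected) or bool_as_number):
            errors.append(f"wrong type for {key}: expected {type_name}")
    return errors


# Hypothetical tool definition used only for this example.
TOOLS = [{
    "name": "set_alarm",
    "parameters": {
        "type": "object",
        "properties": {"hour": {"type": "integer"}, "minute": {"type": "integer"}},
        "required": ["hour", "minute"],
    },
}]

print(validate_call({"name": "set_alarm", "arguments": {"hour": 7, "minute": "30"}}, TOOLS))
```

Returning a list of problems rather than raising on the first one lets a postprocessor decide whether to repair the call, retry locally, or escalate.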
# Setup (one-time)
git clone https://github.com/cactus-compute/cactus
cd cactus && source ./setup && cd ..
cactus build --python
cactus download google/functiongemma-270m-it --reconvert
pip install google-genai requests
export GEMINI_API_KEY="your-key"
# Run voice demo
open demo/index.html
# Submit to leaderboard
python submit.py --team "SwissblAIz" --location "Online"
SwissblAIz 🌵
Built with ❤️ for the Google DeepMind × Cactus Compute Hackathon
Powered by FunctionGemma-270M (on-device) + Gemini Flash (cloud)