[Epic] Phase 2: VAZHI Full (Offline Mode) #2

@CryptoYogiLLC

Overview

Add offline AI inference capability to complement the hybrid retrieval architecture.

Status

Hybrid retrieval architecture is COMPLETE (ADR-009). The app provides full value via SQLite-backed deterministic lookups for 10 knowledge categories. This phase now focuses on adding optional on-device AI inference for conversational capabilities.
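A "deterministic lookup" here means the answer comes straight from a database row, with no model involved. The following is a minimal sketch of that idea in Python, not the app's actual code. The table name, columns, and sample rows are hypothetical, since the real schema is not shown in this issue.

```python
import sqlite3

# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emergency_contacts ("
    "service TEXT PRIMARY KEY, phone TEXT, name_ta TEXT)"
)
conn.executemany(
    "INSERT INTO emergency_contacts VALUES (?, ?, ?)",
    [
        ("ambulance", "108", "ஆம்புலன்ஸ்"),
        ("police", "100", "காவல்துறை"),
    ],
)


def lookup_emergency(service: str):
    """Return (phone, Tamil name) for a service, or None if no row matches."""
    return conn.execute(
        "SELECT phone, name_ta FROM emergency_contacts WHERE service = ?",
        (service.lower(),),
    ).fetchone()
```

The same query always returns the same row, which is why this path can serve the app fully while the model is still unavailable.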

Completed

  • Hybrid retrieval architecture (Epic #14, "Hybrid Retrieval Architecture for VAZHI App"; sub-issues #15, "Create SQLite database structure and migrations", through #21, "Integration testing for hybrid architecture")
  • Query router with 10 category matchers and hybrid/deterministic/AI classification
  • 7 retrieval services (Thirukkural, Schemes, Emergency, Healthcare, and GenericData covering 6 categories)
  • Model download service with pause/resume, network detection, cellular warning
  • Download dialog UI with progress, speed, ETA
  • Model status indicator widget
  • SHA256 model integrity verification
  • SSL certificate pinning for download URLs
  • Hybrid chat provider integrating knowledge retrieval + AI responses
  • Bilingual UI (English + Tamil) across result cards and chat bubbles
  • Expandable content with Show more/Show less
  • Smart routing: conversational queries route as hybrid (SQLite + AI), not forced deterministic
  • 232 tests passing
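The SHA256 integrity check in the list above can be sketched as follows. This is an illustrative Python version, not the app's implementation. The function names are hypothetical, and the temporary file stands in for a downloaded GGUF model.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in 1 MiB chunks so a large model never sits fully in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model(path: str, expected_hex: str) -> bool:
    """True only when the file's digest matches the published one."""
    return sha256_of_file(path) == expected_hex.lower()


# Usage: stand-in bytes play the role of the downloaded model file.
payload = b"stand-in model bytes"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    model_path = tmp.name

expected = hashlib.sha256(payload).hexdigest()  # assumed to ship with the app
ok = verify_model(model_path, expected)
tampered = verify_model(model_path, "0" * 64)
os.remove(model_path)
```

Hashing in chunks matters on the 4GB devices targeted here, because reading a model of several hundred megabytes in one call would compete with the app for memory.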

Remaining (blocked on model training)

  • Successful model training (v3.4 Qwen3-0.6B-Base pending — see models/TRAINING_LOG.md)
  • Convert trained model to GGUF format (Issue #13, "Convert VAZHI model to GGUF format for mobile (<1GB target)")
  • Integrate llama.cpp / llamadart for on-device inference
  • Test inference quality on 4GB+ RAM devices (Android 10+ / iOS 15+)
  • Wire AI response path in HybridChatProvider to actual model inference
  • Validate Tamil output quality of quantized model
  • End-to-end test: hybrid query returns SQLite data + AI explanation
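The hybrid path that the last items describe (SQLite data plus an AI explanation, with the app still useful when no model is installed) could look roughly like the sketch below. It is written in Python for illustration only. The keyword lists, `classify`, `answer`, and the `retrieve`/`generate` callables are all hypothetical stand-ins, not the real query router or `HybridChatProvider` API.

```python
import re
from typing import Callable, Optional, Tuple

DETERMINISTIC, HYBRID, AI = "deterministic", "hybrid", "ai"

# Hypothetical matchers; the real router has 10 category matchers.
CATEGORY_KEYWORDS = {
    "emergency": {"ambulance", "police", "emergency"},
    "thirukkural": {"kural", "thirukkural"},
}
CONVERSATIONAL_CUES = {"why", "how", "explain"}


def classify(query: str) -> Tuple[str, Optional[str]]:
    """Pick a routing mode and, if any keyword matches, a category."""
    tokens = set(re.findall(r"\w+", query.lower()))
    category = next(
        (name for name, words in CATEGORY_KEYWORDS.items() if tokens & words),
        None,
    )
    if category is None:
        return AI, None
    conversational = bool(tokens & CONVERSATIONAL_CUES)
    return (HYBRID if conversational else DETERMINISTIC), category


def answer(
    query: str,
    retrieve: Callable[[str], str],
    generate: Optional[Callable[[str, Optional[str]], str]] = None,
) -> dict:
    """Combine retrieved facts with an optional model-generated explanation."""
    mode, category = classify(query)
    facts = retrieve(category) if category else None
    explanation = None
    # Without a model (generate is None) the retrieved facts are still returned.
    if mode in (HYBRID, AI) and generate is not None:
        explanation = generate(query, facts)
    return {"mode": mode, "facts": facts, "explanation": explanation}


def fake_retrieve(category: str) -> str:
    return f"facts for {category}"


def fake_generate(query: str, facts: Optional[str]) -> str:
    return f"explanation based on: {facts}"
```

An end-to-end test in this shape would assert that a conversational query about a known category returns both `facts` and `explanation`, and that dropping `generate` leaves `facts` intact.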

Technical Scope

  • Inference: llamadart (GGUF format) for on-device processing
  • Model target: Qwen3-0.6B-Base, Q4_K_M quantization (<1GB)
  • Min device: 4GB RAM, Android 10+ / iOS 15+
  • Fallback: App already works fully without model via hybrid retrieval
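As a rough sanity check on the <1GB target, the weight size of a quantized model is approximately parameters × bits per weight ÷ 8. The sketch below is in Python. The figure of about 4.85 bits per weight for Q4_K_M is an assumption based on typical llama.cpp output, and the result ignores file metadata and any tensors kept at higher precision, so a real GGUF file will be somewhat larger. The device gate uses only the minimums listed above; the function names and the use of API level 29 for Android 10 are illustrative choices.

```python
def estimated_gguf_bytes(params: float, bits_per_weight: float) -> float:
    """Approximate weight size in bytes: params * bits / 8."""
    return params * bits_per_weight / 8


PARAMS = 0.6e9            # nominal parameter count of Qwen3-0.6B-Base
BITS_PER_WEIGHT = 4.85    # assumption: typical average for Q4_K_M

size_bytes = estimated_gguf_bytes(PARAMS, BITS_PER_WEIGHT)
fits_target = size_bytes < 1e9  # the <1GB target from Technical Scope


def meets_minimum(ram_gb: float, platform: str, os_version: int) -> bool:
    """Gate the model download on 4GB RAM and Android 10+ (API 29) / iOS 15+."""
    min_os = {"android": 29, "ios": 15}
    if platform not in min_os:
        return False
    return ram_gb >= 4 and os_version >= min_os[platform]
```

Under these assumptions the weights come to roughly 0.36GB, which leaves headroom under the 1GB limit. Devices that fail `meets_minimum` would simply stay on the hybrid retrieval path.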

Dependencies

Related ADRs

Labels: epic, phase-2
