Skip to content

prathamamritkar/genAI-qualityBot

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

118 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

title Qualora - Enterprise Quality Auditor
emoji πŸŽ™οΈ
colorFrom blue
colorTo indigo
sdk docker
pinned false
Qualora Logo

Qualora

Turn every conversation into structured intelligence.

An enterprise-grade AI auditing platform that transcribes, analyzes, and scores support interactions β€” across voice, text, and file β€” with zero single point of failure.

Python 3.10+ Flask Blueprint Routes MongoDB Atlas ChromaDB FAISS OpenRouter Groq Hugging Face Inference Voyage AI sentence-transformers HF Space ElevenLabs Deepgram Groq Speech LLM model families STT model families

πŸ“‹ Table of Contents

  1. Project Overview & Purpose
  2. Key Capabilities
  3. Architecture & System Workflow
  4. The LLM-as-a-Judge Audit Matrix
  5. Technology Stack
  6. Getting Started & Installation
  7. Environment Configuration
  8. Hugging Face Space Node Setup
  9. API Reference & Endpoints
  10. Enterprise Standards & Security
  11. Dynamic Content Maintenance
  12. Troubleshooting & Roadmap

🌟 1. Project Overview & Purpose

Organizations struggle to capture and evaluate the human element of customer support. Traditional metrics (CSAT, NPS) miss critical nuances in agent performance, while compliance auditing remains manual, expensive, and inconsistent.

Qualora turns unstructured human interaction into highly structured, actionable intelligence. It provides zero single point of failure through a distributed hybrid architecture, returning comprehensive, deterministic analysis for:

  • Customer Success Teams: SaaS performance tracking and churn signal detection.
  • Contact Centers: BPO quality assurance at massive scale.
  • Compliance & Legal Officers: Automated PII auditing and regulatory deviation flags.
  • QA Managers: Data-driven behavioral coaching and agent trend analysis.

✨ 2. Key Capabilities

  • πŸŽ™οΈ Multi-Speaker Recognition & Acoustic Profiling: Uses Pyannote speaker diarization to separate voices. Combined with SpeechBrain and Parselmouth, it detects acoustic stress and physiological emotion directly from the audio envelope.
  • πŸ€– Tiered Voice Capture Pipeline: Voice calls are processed through HF Space first, then fail over to an API chain (ElevenLabs Scribe β†’ Deepgram Nova-2 β†’ Groq Whisper) when needed.
  • βš–οΈ LLM-as-a-Judge Auditing Framework: Uses a hardened fallback cascade (OpenRouter β†’ Groq β†’ HuggingFace Inference) to output structured JSON with F1 scores, emotional timelines, and compliance flags.
  • πŸ“š Retrieval-Augmented Generation (RAG) KB: Audits interactions against your specific company policy documents uploaded to a hybrid MongoDB/ChromaDB vector store.
  • πŸ‘₯ Human-in-the-Loop (HITL) Review: Supervisors and admins can approve, flag, or reject machine-generated audits with persistent override trails.

πŸ›οΈ 3. Architecture & System Workflow

Qualora employs a Master-Worker pattern designed for maximum resilience in serverless and containerized environments.

The Transcription Race

When audio is submitted, a resilient two-stage capture sequence runs:

  1. Stage T1 (HF Space Node): Streams the file to a private node for deep acoustic analysis, pitch detection, and diarization.
  2. Stage T2 (Cloud API Cascade): If T1 is unavailable or returns empty speech, the pipeline falls back through ElevenLabs Scribe β†’ Deepgram Nova-2 β†’ Groq Whisper.

Workflow Sequence Diagram

flowchart TB
    A[Input: Audio, Chat, File] --> B{Input Type}
    B -- Text or File --> C[Transcript Ready]
    B -- Voice --> V0

    subgraph V[Voice Capture Failover]
      direction TB
      V0[HF Space T1] --> V1[Transcript Ready]
      V0 -. fail .-> V2[ElevenLabs T2a]
      V2 -. fail .-> V3[Deepgram T2b]
      V3 -. fail .-> V4[Groq Whisper T2c]
      V4 --> V1
    end

    V1 --> C

    subgraph R[RAG Retrieval Cascade]
      direction TB
      R1[Voyage plus MongoDB Atlas]
      R2[ChromaDB local fallback]
      R3[FAISS memory fallback]
      RX[Policy Context or Null Sentinel]
      R1 --> RX
      R1 -. fail or empty .-> R2
      R2 -. fail or empty .-> R3
      R3 --> RX
    end

    C --> R1

    subgraph L[LLM Judge Cascade]
      direction TB
      L1[OpenRouter T1]
      L2[Groq T2]
      L3[HF Inference T3]
      L1 -. fail or rate-limit .-> L2
      L2 -. fail or rate-limit .-> L3
    end

    RX --> L1

    subgraph O[Audit Output and Controls]
      direction TB
      O1[JSON validation and coercion]
      O2[Persist audit and metadata]
      O3[Alert evaluation]
      O4[HITL override]
      O5[Dashboard, trends, agent scoring]
      L1 --> O1
      L2 --> O1
      L3 --> O1
      O1 --> O2 --> O3 --> O5
      O2 --> O4 --> O5
    end
Loading

🧠 4. The LLM-as-a-Judge Audit Matrix

Every audit produces a deterministic JSON matrix that captures conversation quality across multiple psychological and compliance dimensions. Identical inputs produce identical outputs via SHA-256 result caching.

Output Schema Specification

{
  "summary": "Customer reported a billing discrepancy; agent resolved without escalation.",
  "agent_f1_score": 0.91,
  "satisfaction_prediction": "High",
  "compliance_risk": "Green",
  "quality_matrix": {
    "language_proficiency": 9,
    "cognitive_empathy": 8,
    "efficiency": 9,
    "bias_reduction": 10,
    "active_listening": 9
  },
  "emotional_timeline": [
    { "turn": 1, "speaker": "Customer", "emotion": "Frustrated", "intensity": 8 },
    { "turn": 2, "speaker": "Agent",    "emotion": "Empathetic",  "intensity": 6 }
  ],
  "compliance_flags": ["Missing identity-verification step"],
  "emotions": { "agent": "calm", "customer": "frustrated" },
  "behavioral_nudges": [
    "Mirroring: Repeat the specific billing date back to the customer earlier."
  ]
}

Audit Model Cascade (High Availability)

The engine executes a waterfall attempt across multiple providers to ensure 100% uptime:

  1. Tier 1: OpenRouter (configurable model set via OPENROUTER_MODELS).
  2. Tier 2: Groq (default includes llama-3.3-70b-versatile).
  3. Tier 3: HuggingFace Inference (fallback instruction-tuned models).

Provider Matrix (Current Runtime)

Service Layer Provider Role in Pipeline Failover Behavior
Voice T1 Hugging Face Space (Gradio) Primary transcription + diarization + acoustic profile If unavailable/empty transcript, system moves to Voice T2
Voice T2a ElevenLabs Premium transcription with diarization On error, move to Deepgram
Voice T2b Deepgram STT fallback with utterance diarization On error, move to Groq Whisper
Voice T2c Groq Speech Last fallback in voice capture chain On error, voice request fails with provider exhausted
Audit LLM T1 OpenRouter Primary JSON audit judge Rate-limit/error skips to Groq tier
Audit LLM T2 Groq Secondary audit judge Rate-limit/error skips to HF Inference tier
Audit LLM T3 Hugging Face Inference Final audit fallback If all fail, request returns provider exhaustion
RAG Retrieval L1 Voyage AI embeddings + MongoDB Atlas Vector Search Primary policy retrieval If unavailable, fallback to Chroma local embeddings
RAG Retrieval L2 ChromaDB (local) Local persistent vector fallback If unavailable, fallback to FAISS memory
RAG Retrieval L3 FAISS (in-memory) Emergency retrieval fallback If unavailable, audit proceeds with no policy context sentinel

Model Matrix (Current Defaults)

Service Layer Models / Engines Override Variable(s)
Voice T1 faster-whisper (large-v3), pyannote/speaker-diarization-3.1, SpeechBrain wav2vec2-IEMOCAP, Parselmouth WHISPER_MODEL (HF Space secret)
Voice T2a scribe_v1 None
Voice T2b nova-2 None
Voice T2c whisper-large-v3, whisper-large-v3-turbo None
Audit LLM T1 openrouter/free, qwen/qwen3.6-plus:free, x-ai/grok-4.20 OPENROUTER_MODELS
Audit LLM T2 llama-3.3-70b-versatile, llama-3.1-8b-instant, groq/compound, groq/compound-mini GROQ_MODELS
Audit LLM T3 Qwen/Qwen2.5-7B-Instruct, mistralai/Mistral-7B-Instruct-v0.3, meta-llama/Llama-3.1-8B-Instruct HF_MODELS
RAG Embeddings L1 voyage-4-lite (documents + queries) VOYAGE_MODEL_DOC, VOYAGE_MODEL_QUERY
RAG Embeddings L2 all-MiniLM-L6-v2 LOCAL_EMBED_MODEL
RAG Embeddings L3 In-process FAISS index None

RAG Storage Semantics (MongoDB + ChromaDB)

  1. MongoDB Atlas (kb_chunks) stores Voyage vectors for organization-scoped retrieval.
  2. ChromaDB (data/chroma) stores local sentence-transformer vectors for warm local search.
  3. Atomic indexing is enforced: if Chroma indexing fails, MongoDB chunk vectors are rolled back to prevent split-brain policy context.
  4. Null-context protocol is explicit: when no policy chunks are available across all tiers, the auditor receives a structured no-context sentinel instead of hallucinated policy data.

πŸ› οΈ 5. Technology Stack

Layer Technologies Purpose
Frontend Vanilla JS, CSS3, MD3 Zero-framework, high-performance UI
Backend Python 3.10+, Flask RESTful API and Job Orchestration
Databases MongoDB Atlas, ChromaDB Hybrid transactional & vector storage
ASR Engine Faster-Whisper, Pyannote Acoustic profiling and diarization
Security JWT, CSRF, HTTP-Only Enterprise-grade session protection
Analytics ECharts, Chart.js 3D Emotional Landscapes & Radar charts

Language & Framework Cards

Card Primary Supporting Notes
Language Runtime Python 3.10+ JavaScript (ES6+) Python powers API and orchestration; JS drives modular controllers
Backend Framework Flask 3 Flask-CORS, PyMongo Blueprint architecture across auth, audits, KB, admin, alerts, agents
Frontend Framework Style Vanilla JS + server-rendered Jinja templates Material Design 3 patterns No SPA framework dependency; optimized for controlled enterprise UX
RAG Vector Stores MongoDB Atlas Vector Search ChromaDB, FAISS Hierarchical retrieval with strict rollback semantics
LLM Providers OpenRouter Groq, Hugging Face Inference Multi-provider cascade for deterministic JSON auditing
LLM Model Families Qwen, Llama Mistral Fallback-safe instruction models for judge output
Embedding Providers Voyage AI sentence-transformers Query and document embedding generation
Speech Providers HF Space (Gradio app) ElevenLabs, Deepgram, Groq Speech T1/T2 failover for resilient voice processing
Speech Model Families faster-whisper, pyannote Whisper, Nova, Scribe Transcription, diarization, and acoustic profiling

πŸš€ 6. Getting Started & Installation

Prerequisites

  • Python 3.10 or higher
  • MongoDB 5.0+ (Atlas cloud recommended)
  • At least one LLM provider key (OPENROUTER_API_KEY or GROQ_API_KEY or HF_SPACE_TOKEN)

Installation Steps

  1. Clone the Repository

    git clone https://github.com/prathamamritkar/genAI-qualityBot.git
    cd genAI-qualityBot
  2. Environment Setup

    python -m venv venv
    source venv/bin/activate  # Windows: venv\Scripts\activate
    pip install -r requirements.txt
  3. Database Initialization The application will automatically verify and create required MongoDB indexes and seed the system admin accounts upon the first successful connection.

  4. Launch Application

    python app.py

    Access the dashboard at http://localhost:8000.


πŸ” 7. Environment Configuration

Create a .env file in the root directory. Values here override standard OS environment variables.

# --- DATABASE ---
MONGODB_URI=mongodb+srv://user:pass@cluster.mongodb.net/qualora

# --- SECURITY ---
JWT_SECRET=your_random_32_char_secret_key
JWT_EXPIRATION_SECONDS=3600
ALLOWED_ORIGIN=http://localhost:5173
ELEVENLABS_WEBHOOK_SECRET=your_webhook_secret

# --- AI PROVIDERS (CORE) ---
GROQ_API_KEY=your_groq_api_key
OPENROUTER_API_KEY=your_openrouter_key
VOYAGE_API_KEY=your_voyage_api_key

# --- AI PROVIDERS (OPTIONAL TRANSCRIPTION) ---
ELEVENLABS_API_KEY=your_elevenlabs_key
DEEPGRAM_API_KEY=your_deepgram_key

# --- DISTRIBUTED NODES ---
HF_SPACE_URL=your-username/qualora-asr-node
HF_SPACE_TOKEN=your_hf_read_token

# --- OPTIONAL MODEL OVERRIDES ---
OPENROUTER_MODELS=openrouter/free,qwen/qwen3.6-plus:free,x-ai/grok-4.20
GROQ_MODELS=llama-3.3-70b-versatile,llama-3.1-8b-instant
HF_MODELS=Qwen/Qwen2.5-7B-Instruct,mistralai/Mistral-7B-Instruct-v0.3
VOYAGE_MODEL_DOC=voyage-4-lite
VOYAGE_MODEL_QUERY=voyage-4-lite
LOCAL_EMBED_MODEL=all-MiniLM-L6-v2

Demo User & Admin Credentials

On first successful database connection, Qualora seeds accounts only when these environment variables are present.

Role Login Email Password
Admin admin@auditor.local change_this_immediately
Demo User (Agent) demo_agent@demo.local demo_pass_123

Notes:

  • The demo user email is generated as <DEMO_USER_LOGIN>@demo.local.
  • Change these credentials before any shared/staging/production deployment.

☁️ 8. Hugging Face Space Node Setup

The HF Space node provides the deep acoustic analysis and diarization engine. It acts as the primary voice path, with API-chain failover when unavailable.

  1. Create a Space: Create a new Private Space on Hugging Face using the Docker SDK.
  2. Push Source: Push the contents of the /hf_space directory (Dockerfile, app.py, requirements.txt) to the repository.
  3. Repository Secrets: Add the following secrets in your Space Settings:
    • HF_TOKEN: Your HuggingFace Read Token.
    • WHISPER_MODEL: Set to medium or large-v3.
  4. Model Access: You must visit pyannote/speaker-diarization-3.1 and pyannote/segmentation-3.0 to accept the user agreements, or the node will fail to initialize.

πŸ“‘ 9. API Reference & Endpoints

9.1 Audit Endpoints

Endpoint Method Description
/api/audits/process-chat POST Audit raw text or pasted transcripts.
/api/audits/process-file POST Extract text from PDF/TXT/CSV and audit.
/api/audits/process-call POST Voice pipeline for audio files (T1 HF Space, T2 API fallback).
/api/audits/<id>/override POST HITL: Approve, Flag, or Reject an AI audit.
/api/audits/history GET Retrieve paginated audit logs for the organization.
/api/audits/<id> GET Retrieve a full single-audit detail payload.

9.2 Knowledge Base Endpoints

Endpoint Method Description
/api/kb/upload POST Upload policy documents (Max 5 per batch).
/api/kb/documents GET List all indexed organizational policies.
/api/kb/<id>/file GET Stream the original uploaded document.
/api/kb/<id>/status GET Poll the indexing progress (Mongo vs Chroma).
/api/kb/<id>/reindex POST Re-trigger indexing for failed or partial documents.
/api/kb/<id> DELETE Atomic removal of document and vector chunks.

9.3 Authentication & Platform Endpoints

Endpoint Method Description
/api/auth/register POST Create user + organization and set secure auth cookies.
/api/auth/login POST Authenticate and issue HTTP-only JWT cookie + CSRF token.
/api/auth/logout POST Clear auth cookies and terminate the session.
/api/auth/me GET Fetch current authenticated user profile.
/api/health GET Fast health check for DB connectivity.
/api/health/deep GET Deep health check including vector-store readiness.

πŸ›‘οΈ 10. Enterprise Standards & Security

Qualora is built with a "Security-First" architecture to handle sensitive customer interactions:

  • XSS Immunity: JWTs are stored in HTTP-Only cookies. JavaScript cannot access the session token, neutralizing most cross-site scripting vectors.
  • CSRF Protection: All state-changing requests (POST/PUT/DELETE) require a valid X-CSRF-Token header that is validated using constant-time comparison.
  • Atomic RAG Indexing: Our "Zombie Guard" ensures that if the local ChromaDB index fails, the MongoDB Atlas vectors are rolled back instantly to prevent policy hallucinations.
  • Prompt Injection Defense: Transcripts are strictly bounded by XML sanitization tags within the LLM prompt, preventing "Ignore previous instructions" style attacks.

πŸ“‚ 11. Dynamic Content Maintenance

The sitemap, privacy policy, and terms are loaded dynamically to allow updates without redeploying code.

  • Sitemap: Managed in /data/sitemap.json. Defines the dashboard hierarchy and icons.
  • Policies: Managed in /data/privacy-policy.md and /data/terms-of-service.md.
  • UI Hint: Use standard Markdown. The frontend automatically converts these to MD3-compliant HTML on load.

🚧 12. Troubleshooting & Roadmap

Common Fixes

  • 401 Unauthorized: Session has expired. The fetch-wrapper.js will automatically redirect you to the landing page.
  • Indexing Stuck: Check if the VOYAGE_API_KEY is active. Use POST /api/kb/<doc_id>/reindex to retry.
  • All Speakers are "Speaker 0": This indicates the HF Space node is offline. Configure HF_SPACE_URL to enable diarization.

Roadmap

  • Real-time Streaming: WebSockets for mid-call live auditing.
  • CRM Sync: Automated export of audit scores to Salesforce and Zendesk.
  • Multilingual Support: Expanding RAG and ASR to Spanish, French, and German.

πŸ“„ License

MIT License - Copyright (c) 2026 Qualora. Built by Pratham Amritkar.


About

GenAI-Powered Customer-Support Quality Auditor

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors