LiveAvatar Chat

A real-time AI life coach with a speaking avatar. Chat with Claude AI while a LiveAvatar streams video and speaks the responses aloud.

How It Works

This application combines three services to create an interactive AI coaching experience:

┌─────────────────────────────────────────────────────────────────────────┐
│                              User Browser                               │
│  ┌─────────────┐    ┌──────────────┐    ┌─────────────────────────┐    │
│  │  Chat Input │───▶│  React App   │◀───│  LiveAvatar Video Stream │    │
│  └─────────────┘    └──────────────┘    └─────────────────────────┘    │
└─────────────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                         Node.js Backend                                  │
│                                                                         │
│   1. /api/chat ──────▶ Claude API ──────▶ AI Response (text)           │
│                                                                         │
│   2. /api/tts ───────▶ ElevenLabs ──────▶ Audio (base64 PCM)           │
│                                                                         │
│   3. /api/liveavatar/session ──▶ LiveAvatar ──▶ Session Token          │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

The Flow

  1. User sends a message in the chat interface
  2. Claude AI generates a thoughtful response based on the life coach prompt
  3. ElevenLabs TTS converts Claude's text response to audio
  4. LiveAvatar SDK receives the audio and the avatar speaks it in real-time

Why Three Services?

| Service    | Purpose        | Why It's Needed                                             |
|------------|----------------|-------------------------------------------------------------|
| Claude     | AI brain       | Generates intelligent, contextual responses as a life coach |
| LiveAvatar | Avatar video   | Provides real-time streaming video of a speaking avatar     |
| ElevenLabs | Text-to-speech | Converts text to audio for the avatar to speak              |

Why ElevenLabs?

LiveAvatar offers two modes:

  • FULL mode: LiveAvatar's built-in AI handles everything (chat + TTS + avatar). You don't control the conversation.
  • CUSTOM mode: You provide your own AI (Claude) and audio. LiveAvatar only provides the avatar video stream.

We use CUSTOM mode so Claude can be the AI brain. But CUSTOM mode requires you to provide pre-generated audio — the avatar won't convert text to speech for you. That's where ElevenLabs comes in: it converts Claude's text responses to PCM audio that the avatar can speak.
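
As a rough illustration of that step, the sketch below builds the HTTP request a server could send to ElevenLabs to get raw PCM back. The endpoint, the `xi-api-key` header, and the `output_format=pcm_24000` query parameter are part of the public ElevenLabs API; the helper names `buildTtsRequest` and `synthesize` are hypothetical, and whether this repository's server is written this way is an assumption.

```javascript
// Hypothetical helper: builds (but does not send) an ElevenLabs TTS request.
function buildTtsRequest(text, { apiKey, voiceId = '21m00Tcm4TlvDq8ikWAM' }) {
  if (!text || !text.trim()) throw new Error('text is required');
  const url = new URL(
    `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(voiceId)}`
  );
  // pcm_24000 requests raw PCM at 24 kHz instead of the default MP3.
  url.searchParams.set('output_format', 'pcm_24000');
  return {
    url: url.toString(),
    init: {
      method: 'POST',
      headers: { 'xi-api-key': apiKey, 'Content-Type': 'application/json' },
      body: JSON.stringify({ text }),
    },
  };
}

// Sends the request and base64-encodes the raw bytes so they fit in JSON.
// fetchImpl is injectable so the function can be exercised without a network.
async function synthesize(text, options, fetchImpl = fetch) {
  const { url, init } = buildTtsRequest(text, options);
  const res = await fetchImpl(url, init);
  if (!res.ok) throw new Error(`ElevenLabs returned ${res.status}`);
  return Buffer.from(await res.arrayBuffer()).toString('base64');
}
```

Keeping the request construction separate from the network call makes it easy to check the URL and headers in a unit test before spending any TTS characters.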

Project Structure

liveavatar-chat/
├── server/
│   └── index.js              # Express API server
├── client/
│   ├── src/
│   │   ├── App.js            # Main React app
│   │   ├── Chat.js           # Chat UI + avatar integration
│   │   └── liveavatar/       # LiveAvatar SDK wrapper
│   │       ├── LiveAvatarContext.js  # React context for session state
│   │       ├── useSession.js         # Hook for session methods
│   │       └── index.js              # Exports
│   └── public/
│       └── index.html
├── prompt.md                 # System prompt for Claude (editable)
├── .env                      # API keys (create from .env.example)
├── docker-compose.yml        # Docker deployment
└── Dockerfile

The System Prompt (prompt.md)

The prompt.md file defines Claude's personality and behavior. It's loaded by the server on startup and sent with every chat request.

Current configuration: A life coach that:

  • Listens actively and asks good questions
  • Offers perspective without prescribing solutions
  • Is warm but direct, honest but non-judgmental
  • Helps users discover their own answers

To customize: Edit prompt.md. The file supports full Markdown formatting.

# Example: Change to a coding mentor

You are a senior software engineer and coding mentor...

The server watches this file and reloads it periodically, so changes take effect without a full restart.
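
A minimal sketch of how such periodic reloading could work is shown below. The function name `createPromptLoader` and the 30-second interval are assumptions; the README only says the file is reloaded "periodically".

```javascript
const fs = require('node:fs');

// Hypothetical loader: keeps the latest prompt text in memory and
// re-reads the file on a timer.
function createPromptLoader(filePath, intervalMs = 30_000) {
  // Initial load happens at startup and fails loudly if the file is missing.
  let prompt = fs.readFileSync(filePath, 'utf8');

  function reload() {
    try {
      prompt = fs.readFileSync(filePath, 'utf8');
    } catch (err) {
      // Keep serving the last good prompt if the file is briefly unreadable.
      console.warn(`Could not reload ${filePath}: ${err.message}`);
    }
    return prompt;
  }

  const timer = setInterval(reload, intervalMs);
  timer.unref(); // do not keep the process alive just for the reload timer

  return { get: () => prompt, reload, stop: () => clearInterval(timer) };
}
```

A chat handler would call `get()` on each request, so an edit to prompt.md is picked up within one interval without touching in-flight conversations.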

Quick Start

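Prerequisites

  • Node.js and npm
  • API keys for Anthropic (Claude), LiveAvatar, and ElevenLabs
  • A LiveAvatar avatar ID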

Setup

  1. Clone and install:

    git clone <repo-url>
    cd liveavatar-chat
    npm run install-all
  2. Configure environment:

    cp .env.example .env

    Edit .env and add your API keys:

    # Required
    CLAUDE_API_KEY=sk-ant-...
    LIVEAVATAR_API_KEY=your-liveavatar-key
    LIVEAVATAR_AVATAR_ID=your-avatar-uuid
    ELEVENLABS_API_KEY=sk_...
    
    # Optional
    ELEVENLABS_VOICE_ID=21m00Tcm4TlvDq8ikWAM  # Default: Rachel
    CLAUDE_MODEL=claude-sonnet-4-20250514
  3. Start development server:

    npm run dev
  4. Open browser:

    http://localhost:5000
    

Docker Deployment

docker-compose up --build

API Endpoints

POST /api/chat

Send a message to Claude and get a response.

// Request
{ "message": "I'm feeling stuck in my career", "sessionId": "abc123" }

// Response
{ "id": "msg_...", "content": "I hear you. Let's explore that...", "timestamp": 1704067200000 }
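
One way a handler behind this endpoint could keep per-session history and shape its response is sketched below. The request body fields (`model`, `max_tokens`, `system`, `messages`) and the response fields (`id`, `content` blocks with `type` and `text`) follow the Anthropic Messages API; the in-memory `Map`, the function names, and the `max_tokens` value are assumptions rather than the repository's actual code.

```javascript
// Hypothetical in-memory store: sessionId -> array of { role, content }.
// History is lost when the server restarts.
const histories = new Map();

// Records the user's message and builds the JSON body for the Messages API.
function buildClaudeRequest({ sessionId, message, systemPrompt, model }) {
  const history = histories.get(sessionId) ?? [];
  history.push({ role: 'user', content: message });
  histories.set(sessionId, history);
  return {
    model,
    max_tokens: 1024, // assumed cap on reply length
    system: systemPrompt,
    messages: [...history], // copy, so later turns do not mutate this request
  };
}

// Stores the assistant's reply and returns the shape /api/chat responds with.
function toChatResponse(sessionId, apiResponse, now = Date.now()) {
  const text = apiResponse.content
    .filter((block) => block.type === 'text')
    .map((block) => block.text)
    .join('');
  histories.get(sessionId)?.push({ role: 'assistant', content: text });
  return { id: apiResponse.id, content: text, timestamp: now };
}
```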

POST /api/tts

Convert text to speech audio (ElevenLabs).

// Request
{ "text": "Hello, how can I help you today?" }

// Response
{ "audio": "base64-encoded-pcm-audio..." }
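
When debugging audio problems it can help to sanity-check the returned payload before handing it to the avatar. The sketch below decodes the base64 string and reports its length in samples and seconds. It assumes 16-bit signed mono PCM at 24 kHz (the README states 24 kHz; the bit depth and channel count are assumptions), and `describePcm` is a hypothetical helper. It uses Node's `Buffer`; browser code would need `atob` or similar instead.

```javascript
// Assumed format: 16-bit signed mono PCM at 24 kHz.
const SAMPLE_RATE = 24000;
const BYTES_PER_SAMPLE = 2;

// Decodes the base64 payload and summarizes how much audio it contains.
function describePcm(base64Audio) {
  const bytes = Buffer.from(base64Audio, 'base64');
  if (bytes.length === 0 || bytes.length % BYTES_PER_SAMPLE !== 0) {
    throw new Error('Audio is empty or not aligned to 16-bit samples');
  }
  const samples = bytes.length / BYTES_PER_SAMPLE;
  return { bytes: bytes.length, samples, seconds: samples / SAMPLE_RATE };
}
```

A duration that is roughly half or double what the text should take to speak usually points to a sample-rate or bit-depth mismatch rather than a network problem.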

POST /api/liveavatar/session

Get a session token for LiveAvatar streaming.

// Response
{ "session_token": "...", "session_id": "...", "api_url": "https://api.liveavatar.com" }

GET /api/health

Health check endpoint.

{ "status": "ok", "hasClaudeKey": true, "hasAvatarKey": true, "model": "claude-sonnet-4-20250514" }

Environment Variables

| Variable             | Required | Description                                      |
|----------------------|----------|--------------------------------------------------|
| CLAUDE_API_KEY       | Yes      | Anthropic API key                                |
| LIVEAVATAR_API_KEY   | Yes      | LiveAvatar API key                               |
| LIVEAVATAR_AVATAR_ID | Yes      | UUID of your avatar                              |
| ELEVENLABS_API_KEY   | Yes      | ElevenLabs API key for TTS                       |
| ELEVENLABS_VOICE_ID  | No       | Voice ID (default: Rachel)                       |
| CLAUDE_MODEL         | No       | Model to use (default: claude-sonnet-4-20250514) |
| PORT                 | No       | Server port (default: 5000)                      |
| NODE_ENV             | No       | Environment (development/production)             |
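
Failing fast on missing keys tends to be easier to debug than a 401 from an upstream service later. The sketch below checks the required variables and applies the documented defaults; `loadConfig` and the property names on the returned object are hypothetical, not taken from the repository.

```javascript
const REQUIRED = [
  'CLAUDE_API_KEY',
  'LIVEAVATAR_API_KEY',
  'LIVEAVATAR_AVATAR_ID',
  'ELEVENLABS_API_KEY',
];

// Validates required variables and fills in the documented defaults.
function loadConfig(env = process.env) {
  // Treat whitespace-only values as missing (a common copy-paste mistake).
  const missing = REQUIRED.filter((name) => !env[name] || !env[name].trim());
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
  return {
    claudeApiKey: env.CLAUDE_API_KEY.trim(),
    liveAvatarApiKey: env.LIVEAVATAR_API_KEY.trim(),
    liveAvatarAvatarId: env.LIVEAVATAR_AVATAR_ID.trim(),
    elevenLabsApiKey: env.ELEVENLABS_API_KEY.trim(),
    elevenLabsVoiceId: env.ELEVENLABS_VOICE_ID || '21m00Tcm4TlvDq8ikWAM',
    claudeModel: env.CLAUDE_MODEL || 'claude-sonnet-4-20250514',
    port: Number(env.PORT) || 5000,
  };
}
```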

How the LiveAvatar Integration Works

The client/src/liveavatar/ directory contains a React wrapper around the @heygen/liveavatar-web-sdk:

  1. LiveAvatarContext.js — Manages the SDK session lifecycle and state
  2. useSession.js — Hook that exposes session methods to components
  3. Key methods:
    • startSession() — Connects to LiveAvatar and starts video streaming
    • attachElement(videoRef) — Attaches the video stream to a DOM element
    • speakAudio(base64Audio) — Makes the avatar speak (requires PCM 24kHz audio)
    • stopSession() — Ends the session

The flow in Chat.js:

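// 1. User sends message (fetch needs an explicit method and a JSON string body)
const response = await fetch('/api/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ message }),
});

// 2. Get Claude's response
const { content } = await response.json();

// 3. Convert to audio via ElevenLabs
const ttsResponse = await fetch('/api/tts', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ text: content }),
});
const { audio } = await ttsResponse.json();

// 4. Make avatar speak
speakAudio(audio);  // Calls session.repeatAudio(audio)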
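
The snippet above leaves out error handling. A hedged sketch of the same flow with status checks follows; `postJson` and `chatAndSpeak` are hypothetical helpers, and treating speech as best-effort (still returning the text when TTS fails) is a design assumption, not necessarily what Chat.js does.

```javascript
// POSTs JSON and rejects on non-2xx responses, which fetch does not do itself.
async function postJson(url, payload, fetchImpl = fetch) {
  const res = await fetchImpl(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(payload),
  });
  if (!res.ok) throw new Error(`${url} failed with status ${res.status}`);
  return res.json();
}

// Runs chat -> TTS -> avatar. speakAudio is passed in by the caller
// (for example, from the useSession hook).
async function chatAndSpeak(message, speakAudio, fetchImpl = fetch) {
  const { content } = await postJson('/api/chat', { message }, fetchImpl);
  try {
    const { audio } = await postJson('/api/tts', { text: content }, fetchImpl);
    speakAudio(audio);
  } catch (err) {
    // Speech is best-effort: the text reply is still returned if TTS fails.
    console.warn(`TTS unavailable: ${err.message}`);
  }
  return content;
}
```

A chat failure still propagates to the caller, so the UI can show an error instead of an empty reply.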

Development Notes

  • Session-based chat history — Conversations persist during the browser session but reset on page reload
  • Hot-reload prompt — The server reloads prompt.md periodically
  • Rate limiting — API endpoints have basic rate limiting
  • CORS — Configured for localhost development; update ALLOWED_ORIGINS for production
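
For readers wondering what "basic rate limiting" might look like, here is a small fixed-window limiter. It is only an illustration: the README does not say how the server implements its limits (it may well use a library), and `createRateLimiter`, the window length, and the request cap are all assumptions.

```javascript
// Hypothetical fixed-window limiter: at most `max` requests per key
// (for example, a client IP) in each window of `windowMs` milliseconds.
function createRateLimiter({ windowMs = 60_000, max = 30 } = {}) {
  const windows = new Map(); // key -> { start, count }

  return function allow(key, now = Date.now()) {
    const current = windows.get(key);
    if (!current || now - current.start >= windowMs) {
      // First request from this key, or the previous window has expired.
      windows.set(key, { start: now, count: 1 });
      return true;
    }
    current.count += 1;
    return current.count <= max;
  };
}
```

A request handler would call `allow(clientIp)` and respond with HTTP 429 when it returns false. This version never evicts idle keys, so a long-running server would also want periodic cleanup.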

Manual Installation (Alternative)

If npm run install-all doesn't work:

npm install
cd server && npm install
cd ../client && npm install
cd ..

Running Frontend and Backend Separately

# Terminal 1: Backend
cd server && npm run dev

# Terminal 2: Frontend  
cd client && npm start

If running separately, add a proxy to client/package.json:

"proxy": "http://localhost:5000"

Troubleshooting

Server won't start (port in use):

lsof -i :5000
kill -9 <PID>

Frontend won't compile:

cd client && rm -rf node_modules && npm install

API calls failing:

  • Verify .env exists in root directory
  • Check API keys are correct (no extra spaces)
  • Ensure backend is running on port 5000

Avatar video not appearing:

  • Check browser console for WebRTC errors
  • Verify LIVEAVATAR_AVATAR_ID is a valid UUID from your dashboard
  • Ensure LIVEAVATAR_SANDBOX=false for non-sandbox avatars

Avatar not speaking:

  • Check that ELEVENLABS_API_KEY is set
  • Look for TTS errors in server logs
  • Verify ElevenLabs account has available characters

Docker issues:

docker-compose down -v
docker-compose up --build

Costs

  • Claude API: ~$3 per million input tokens, ~$15 per million output tokens (Claude Sonnet 4, the default model)
  • ElevenLabs: ~$0.30 per 1,000 characters (pay-as-you-go) or subscription plans
  • LiveAvatar: Check liveavatar.com for current pricing

License

MIT
