🎤 Vienna - AI Voice Agent For Medical Expertise aka Medpulse

You must supply your own API keys. See "Step 3: Configure Environment Variables" under Installation for instructions.

Real-time voice AI assistant built with Python, LiveKit, Deepgram, GPT-4.1 & Cartesia TTS

Features autonomous actions, medical guidance, strong privacy, and fully customizable voice cloning technology

Features • Installation • Voice Cloning • Usage • Architecture • Contributing

YouTube demo video: https://youtu.be/sKE2JrOVLhs

📋 Overview

Vienna is a cutting-edge voice AI assistant powered by the LiveKit Agents framework, designed specifically for medical interactions with Indian English support. Built by Avijit Shil, Vienna combines advanced speech recognition, natural language processing, and text-to-speech synthesis to deliver conversational, witty, and highly capable AI assistance.

🎯 What Makes Vienna Special?

Vienna is a next-generation multilingual voice assistant designed for real-world chaos. Its voice cloning technology lets it speak in any voice you want, including your own. Using Cartesia's Sonic 3 TTS engine, Vienna can be customized to sound like you, a loved one, or any voice profile you create. The current deployment uses the creator's own cloned voice for authentic, personalized interactions.

Vienna is suited to agentic healthcare, productivity, global communication, education, and customer service applications, and is deployable anywhere with enterprise-grade privacy and security. It handles medical conversations in a natural, human-like way and responds faster than a doctor typically can. Vienna breaks the language barrier with support for Indian regional languages (Assamese, Bengali, Tamil, Telugu, Malayalam, and 10+ more) alongside international languages (English, Chinese, Spanish, Arabic, and 90+ more). With Background Voice Cancellation (BVC), Vienna isolates your voice even in chaotic environments like crowded markets, busy streets, or noisy offices, making it a noise-robust, hyper-realistic AI companion.

🌟 Key Highlights

| Feature | Description |
| --- | --- |
| 🏥 Medical-Grade STT | Deepgram Nova 3 Medical model with 95%+ accuracy on medical terminology |
| 🇮🇳 Indian English Support | Optimized for Indian accents and dialects (en-IN) with regional variations |
| 🧠 GPT-4.1 Mini Intelligence | Fast, efficient responses with advanced reasoning and medical knowledge |
| 🎭 Revolutionary Voice Cloning | Clone ANY voice in 2-3 minutes - your own, family, friends, or custom personas |
| 🗣️ Natural Voice Synthesis | Cartesia Sonic 3 TTS with fully customizable voice options and emotions |
| 🛠️ Autonomous Actions | 10+ integrated tools: web search, email, weather, location services, and more |
| ⚡ High-Performance VAD | Silero VAD/TurnDetector with <50ms latency and 98%+ accuracy |
| 🎙️ Real-time Processing | End-to-end latency <500ms with preemptive generation technology |
| 🔇 Advanced Noise Cancellation | LiveKit BVC removes background noise, echo, and ambient sounds |
| 🔒 Enterprise Security | Zero data retention, encrypted WebRTC, HIPAA-ready architecture |

✨ Features

🎙️ Voice & Speech Processing

| Component | Technology | Description |
| --- | --- | --- |
| Speech-to-Text | Deepgram Nova 3 Medical | Medical-grade transcription with specialized vocabulary for healthcare |
| Text-to-Speech | Cartesia Sonic 3 | Custom voice cloning - speaks in ANY voice you want |
| Voice Activity Detection | Silero VAD/TurnDetector | High-performance detection with minimal latency (<50ms) |
| Noise Cancellation | LiveKit BVC | Real-time background noise, echo, and ambient sound removal |
| Voice Customization | ✅ Fully Customizable | Clone any voice in 2-3 minutes or use preset professional voices |
| Language Support | Multi-language | Indian English (en-IN), US English (en-US), UK English (en-GB), and more |

🌐 Supported Languages

Vienna (aka Medpulse) supports 100+ languages with native pronunciation and cultural context awareness.

🇮🇳 Indian Regional Languages (15)

| Language | Native Name | Romanization | Status |
| --- | --- | --- | --- |
| Assamese | অসমীয়া | Ôxômiya | ✅ Full Support |
| Bengali | বাংলা | Bangla | ✅ Full Support |
| Gujarati | ગુજરાતી | Gujarātī | ✅ Full Support |
| Kannada | ಕನ್ನಡ | Kannaḍa | ✅ Full Support |
| Malayalam | മലയാളം | Malayāḷam | ✅ Full Support |
| Marathi | मराठी | Marāṭhī | ✅ Full Support |
| Nepali | नेपाली | Nepālī | ✅ Full Support |
| Punjabi | ਪੰਜਾਬੀ | Pañjābī | ✅ Full Support |
| Pashto | پښتو | Pax̌tō | ✅ Full Support |
| Sanskrit | संस्कृतम् | Saṃskṛtam | ✅ Full Support |
| Sindhi | سنڌي | Sindhī | ✅ Full Support |
| Sinhala | සිංහල | Siṁhala | ✅ Full Support |
| Tamil | தமிழ் | Tamiḻ | ✅ Full Support |
| Telugu | తెలుగు | Telugu | ✅ Full Support |
| Urdu | اردو | Urdū | ✅ Full Support |

🎬 Current Configuration

Vienna currently uses the creator's own cloned voice for authentic interactions:

Voice ID: abc6eacf-2626-4d87-902b-7b49c2dcae16
Characteristics: Natural Indian English accent, warm tone, conversational style
Use Case: Medical consultations, health advice, general assistance

🤖 AI Capabilities

✅ Conversational Intelligence - GPT-4.1 Mini delivers fast, accurate, contextually-aware responses
✅ Medical Knowledge - Specialized training on medical terminology, symptoms, and healthcare guidance
✅ Context Awareness - Maintains full conversation history and understands references to previous topics
✅ Preemptive Generation - Begins formulating response while user is still speaking for instant replies
✅ Dry Humor & Personality - Engaging, witty responses with a sophisticated sense of humor (never sarcastic or rude)
✅ Multi-turn Reasoning - Handles complex, multi-step queries with logical follow-through
✅ Privacy-First Design - All processing in real-time, zero data retention

🛠️ Autonomous Tool Functions

Vienna can perform various autonomous actions through integrated function tools. Sensitive operations, such as sending email, require user confirmation.

🌐 Web & Information Retrieval

✅ open_website(url)          # Open any website in browser
✅ search_web(query)          # Perform real-time web searches
✅ get_news(topic)            # Fetch latest news headlines and articles
✅ get_stock_price(symbol)    # Check stock/crypto prices (e.g., AAPL, BTC)

⏰ Time & Weather Services

✅ get_datetime()             # Current date and time with timezone
✅ lookup_weather(location)   # Real-time weather information for any location

📧 Communication Tools

✅ send_email(to, subject, body)  # Send emails with confirmation
✅ read_emails(count)             # Read unread emails (privacy-protected)

📍 Location & Navigation

✅ find_nearby_places(type)   # Find restaurants, ATMs, hospitals, pharmacies, etc.
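
The confirmation requirement for sensitive operations can be pictured as a small gate in front of each tool call. The sketch below is a minimal illustration and not Vienna's actual implementation: `SENSITIVE_TOOLS`, `dispatch_tool`, the `confirm` callback, and the stand-in tools are all hypothetical, and which tools count as sensitive is an assumption.

```python
from typing import Any, Callable, Dict

# Assumption: which tools count as "sensitive" is illustrative only.
SENSITIVE_TOOLS = {"send_email", "read_emails"}


def dispatch_tool(
    name: str,
    tools: Dict[str, Callable[..., Any]],
    confirm: Callable[[str], bool],
    **kwargs: Any,
) -> Any:
    """Run a tool by name, asking for confirmation first if it is sensitive."""
    if name not in tools:
        raise KeyError(f"unknown tool: {name}")
    if name in SENSITIVE_TOOLS and not confirm(name):
        return f"{name} cancelled by user"
    return tools[name](**kwargs)


# Stand-in tools for demonstration only (hypothetical bodies).
demo_tools = {
    "get_datetime": lambda: "12:00 PM",
    "send_email": lambda to, subject, body: f"email sent to {to}",
}
```

In a voice agent the `confirm` callback would presumably ask the user aloud and wait for a yes or no. Keeping the gate outside the tools means a new sensitive tool only needs an entry in the set.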

Coming Soon:

Calendar integration
Reminders and alarms
File management
Smart home control
Appointment scheduling

🏗️ Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        VIENNA AI PIPELINE                       │
│                     Real-time Voice Processing                  │
└─────────────────────────────────────────────────────────────────┘

User Voice Input (Microphone)
      ↓
┌─────────────────────────────────┐
│ Noise Cancellation (LiveKit BVC)│  ← Removes background noise, echo
└────────┬────────────────────────┘
         ↓
┌─────────────────────────────────┐
│ Voice Activity Detection (VAD)  │  ← Silero VAD/TurnDetector
│ Latency: <50ms                  │     Detects speech start/end
└────────┬────────────────────────┘
         ↓
┌─────────────────────────────────┐
│ Speech-to-Text (Deepgram)       │  ← Nova 3 Medical Model
│ Model: nova-3-medical           │     Language: en-IN
│ Accuracy: 95%+ on medical terms │
└────────┬────────────────────────┘
         ↓
┌─────────────────────────────────┐
│ LLM Processing (GPT-4.1 Mini)   │  ← Natural language understanding
│ + Function Tool Calling         │     Context-aware reasoning
│ Latency: ~200ms                 │     Autonomous action selection
└────────┬────────────────────────┘
         ↓
┌─────────────────────────────────┐
│ Text-to-Speech (Cartesia)       │  ← Sonic 3 Engine
│ Voice: Custom Cloned Voice      │     Voice ID: abc6eacf-...
│ Latency: ~150ms                 │     Natural intonation
└────────┬────────────────────────┘
         ↓
Voice Output (Speakers/Headphones)
  Total End-to-End Latency: <500ms
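
The per-stage figures in the diagram can be checked against the end-to-end target with simple arithmetic. The sketch below uses only the numbers quoted above. The diagram gives no figures for noise cancellation, STT, or network transit, so those must fit into whatever remains. The names `STAGE_LATENCY_MS` and `remaining_budget_ms` are illustrative.

```python
# Per-stage latency figures quoted in the pipeline diagram, in milliseconds.
STAGE_LATENCY_MS = {
    "vad": 50,   # upper bound ("<50ms")
    "llm": 200,  # approximate ("~200ms")
    "tts": 150,  # approximate ("~150ms")
}
END_TO_END_TARGET_MS = 500


def remaining_budget_ms(stages: dict, target_ms: int) -> int:
    """Return how much of the target is left after the listed stages."""
    return target_ms - sum(stages.values())


print(remaining_budget_ms(STAGE_LATENCY_MS, END_TO_END_TARGET_MS))  # 100
```

On these figures roughly 100ms is left for the stages the diagram does not quantify.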

🔧 System Components

Agent Session Manager - Orchestrates the entire voice pipeline with LiveKit
STT Engine - Deepgram Nova 3 Medical converts speech to text with medical vocabulary
LLM Core - GPT-4.1 Mini processes queries and decides when to use autonomous tools
Function Tools - 10+ autonomous action capabilities with safety guardrails
TTS Engine - Cartesia Sonic 3 generates natural-sounding voice with cloned voice model
Metrics Collector - Real-time monitoring of performance, usage, and quality metrics

⚡ Performance Characteristics

End-to-End Latency: <500ms (from speech to response)
STT Accuracy: 95%+ on medical terminology
VAD Latency: <50ms (voice activity detection)
TTS Quality: Natural, human-like with emotional expressiveness
Concurrent Users: Scales horizontally with LiveKit infrastructure
Uptime: 99.9%+ with proper deployment

🚀 Installation

Prerequisites

Python 3.9 or higher (Python 3.10+ recommended)
LiveKit Cloud account or self-hosted LiveKit server
API keys for:
- Deepgram (Speech-to-Text)
- OpenAI (LLM)
- Cartesia (Text-to-Speech)

Step 1: Clone the Repository

git clone https://github.com/Avijitshil/Vienna.git
cd Vienna

Step 2: Install Dependencies

pip install -r requirements.txt

Required packages:

livekit-agents>=0.8.0
livekit-plugins-deepgram
livekit-plugins-openai
livekit-plugins-cartesia
livekit-plugins-silero
python-dotenv

Or install manually:

pip install livekit-agents livekit-plugins-deepgram livekit-plugins-openai livekit-plugins-cartesia livekit-plugins-silero python-dotenv

Step 3: Configure Environment Variables

Create a .env.local file in the project root:

# LiveKit Configuration
LIVEKIT_URL=wss://your-livekit-server.livekit.cloud
LIVEKIT_API_KEY=your_api_key_here
LIVEKIT_API_SECRET=your_api_secret_here

# AI Service API Keys
DEEPGRAM_API_KEY=your_deepgram_api_key
OPENAI_API_KEY=your_openai_api_key
CARTESIA_API_KEY=your_cartesia_api_key

Getting API Keys:

LiveKit: Sign up at livekit.io → Create project → Get credentials
Deepgram: Sign up at deepgram.com → Get API key (free tier available)
OpenAI: Sign up at platform.openai.com → Create API key
Cartesia: Sign up at cartesia.ai → Get API key
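
Before starting the agent, it can help to confirm that all six variables are actually set. The following is a minimal sketch using only the standard library. It checks the process environment, so it assumes `.env.local` has already been loaded (for example by python-dotenv). `REQUIRED_VARS` and `missing_env_vars` are hypothetical helper names, not part of Vienna.

```python
import os
from typing import List, Mapping

# Variable names taken from the .env.local template above.
REQUIRED_VARS = [
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "DEEPGRAM_API_KEY",
    "OPENAI_API_KEY",
    "CARTESIA_API_KEY",
]


def missing_env_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or blank in env."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling `missing_env_vars()` at startup and exiting with a clear message when the list is non-empty tends to be easier to debug than a failed API call later in the session.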

Step 4: Run the Agent

Start Vienna in console mode:

python src/agent.py console

You should see:

✅ Environment loaded successfully
✅ Connecting to LiveKit...
✅ Agent started and listening for voice input
🎤 Vienna is ready! Start speaking...

🎭 Voice Cloning Guide

Why Voice Cloning?

Voice cloning makes Vienna feel personal and authentic. Instead of a generic AI voice, Vienna can speak exactly like you, a trusted healthcare provider, or any voice that makes users comfortable.

🚀 Quick Start: Clone Your Voice in 3 Minutes

Step 1: Record Your Voice

Requirements:

30 seconds to 2 minutes of clear speech
Quiet environment (minimal background noise)
Natural speaking style (not robotic or overly formal)
Good quality microphone (built-in laptop mic works fine)

Tips for Best Results:

Speak naturally and conversationally
Include variation in tone and emotion
Read a paragraph from a book or article
Don't pause too long between sentences

Step 2: Clone on Cartesia

Visit Cartesia Voice Lab
Sign up / Log in to your account
Click "Create Voice" or "Clone Voice"
Upload your audio recording
Wait 30-60 seconds for processing
Preview the cloned voice
Copy your Voice ID (format: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
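
A quick format check can catch a mis-copied Voice ID before it reaches the TTS configuration. This sketch assumes voice IDs always follow the 8-4-4-4-12 hexadecimal layout shown above. `looks_like_voice_id` is a hypothetical helper; it only checks the shape of the string, not whether the voice exists on Cartesia.

```python
import re

# Assumption: voice IDs follow the 8-4-4-4-12 hexadecimal layout shown above.
VOICE_ID_PATTERN = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE,
)


def looks_like_voice_id(value: str) -> bool:
    """Return True if value matches the expected voice ID layout."""
    return VOICE_ID_PATTERN.fullmatch(value.strip()) is not None
```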

Step 3: Configure Vienna

Edit src/agent.py and update the TTS configuration:

# Find this section in the entrypoint() function:
tts=inference.TTS(
    model="cartesia/sonic-3", 
    voice="abc6eacf-2626-4d87-902b-7b49c2dcae16"  # Replace with YOUR voice ID
)

Replace the voice ID with your new cloned voice ID:

tts=inference.TTS(
    model="cartesia/sonic-3", 
    voice="your-new-voice-id-here"  # Your cloned voice ID from Cartesia
)

Step 4: Test Your Voice

Restart Vienna:

python src/agent.py console

Vienna now speaks in YOUR voice! 🎉

🎨 Advanced Voice Customization

Multiple Voice Profiles

Create different voices for different scenarios:

# Professional medical voice for consultations
MEDICAL_VOICE = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"

# Friendly casual voice for general chat
CASUAL_VOICE = "yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy"

# Urgent/serious voice for emergencies
URGENT_VOICE = "zzzzzzzz-zzzz-zzzz-zzzz-zzzzzzzzzzzz"

# Switch based on context
voice_id = MEDICAL_VOICE if is_medical_query else CASUAL_VOICE
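
The snippet above leaves open how the context is determined. One simple possibility, sketched here under stated assumptions, is a keyword check on the user's transcribed text. The keyword lists and `pick_voice` are hypothetical and deliberately naive; a real deployment would likely need something more robust, such as a classification step by the LLM.

```python
# Placeholder voice IDs, as in the snippet above.
MEDICAL_VOICE = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
CASUAL_VOICE = "yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy"
URGENT_VOICE = "zzzzzzzz-zzzz-zzzz-zzzz-zzzzzzzzzzzz"

# Hypothetical keyword lists, for illustration only.
URGENT_KEYWORDS = ("chest pain", "can't breathe", "emergency", "unconscious")
MEDICAL_KEYWORDS = ("symptom", "diabetes", "medicine", "doctor", "fever")


def pick_voice(user_text: str) -> str:
    """Choose a voice ID from keywords in the user's text (urgent wins)."""
    text = user_text.lower()
    if any(keyword in text for keyword in URGENT_KEYWORDS):
        return URGENT_VOICE
    if any(keyword in text for keyword in MEDICAL_KEYWORDS):
        return MEDICAL_VOICE
    return CASUAL_VOICE
```

Urgent keywords are checked first so that a message mentioning both an emergency and a general medical term still gets the urgent voice.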

Voice Parameters

Fine-tune voice characteristics:

tts=inference.TTS(
    model="cartesia/sonic-3",
    voice="your-voice-id",
    # Additional parameters (if supported by your SDK version):
    speed=1.0,        # 0.5 to 2.0 (1.0 = normal)
    pitch=1.0,        # 0.5 to 2.0 (1.0 = normal)
    emotion="neutral" # Options: neutral, happy, sad, urgent, calm
)

Preset Professional Voices

If you don't want to clone a voice, use Cartesia's preset voices:

Available Categories:

Medical Professional: Warm, authoritative, trustworthy
Young Adult: Energetic, friendly, casual
Elderly: Wise, calm, reassuring
Gender-Neutral: Professional, clear, unbiased

Example Voice IDs:

# Professional female medical voice
voice="79a125e8-cd45-4c13-8a67-188112f4dd22"

# Friendly male casual voice
voice="a0e99841-438c-4a64-b679-ae501e7d6091"

# Neutral professional voice
voice="2ee87190-8f84-4925-97da-e52547f9462c"

Browse all voices at: cartesia.ai/voices

🔒 Voice Cloning Ethics & Privacy

Important Guidelines:

✅ DO:

Clone your own voice
Clone voices with explicit written consent
Use for personal, educational, or authorized medical purposes
Respect voice ownership and intellectual property

❌ DON'T:

Clone someone's voice without permission
Impersonate others maliciously
Use cloned voices for fraud or deception
Share voice models without authorization

Vienna's Position: We strongly advocate for ethical AI voice cloning. Always obtain consent before cloning someone's voice, and use the technology responsibly for beneficial purposes.

💻 Usage

Starting a Voice Session

Run the agent in console mode:
```
python src/agent.py console
```
Connect to the LiveKit room using:
- Web browser (LiveKit web client)
- Mobile app (iOS/Android LiveKit app)
- Desktop application
- Custom integration
Start speaking - Vienna will respond in real-time with the cloned voice!

Example Interactions

Medical Query:

User: "What are the symptoms of diabetes?"

Vienna: "Common symptoms of diabetes include increased thirst, 
         frequent urination, unexplained weight loss, extreme 
         fatigue, blurred vision, and slow-healing wounds. 
         Type 1 symptoms often appear suddenly, while Type 2 
         develops gradually. Would you like more details about 
         any specific symptom?"

Autonomous Action - Web Search:

User: "Search for nearby hospitals"

Vienna: "Sure, let me find nearby hospitals for you... 
         I've found several options: City General Hospital 
         is 2.3 km away with 4.5 star rating, Care Medical 
         Center is 3.1 km with emergency services, and 
         Apollo Hospital is 4.7 km with specialized cardiology. 
         Would you like directions to any of these?"

Weather Information:

User: "What's the weather like in Mumbai?"

Vienna: "The weather in Mumbai right now is sunny with 
         partly cloudy skies. Temperature is 27 degrees 
         Celsius with 65% humidity. There's a light breeze 
         from the west. It's a pleasant day overall!"

Complex Medical Reasoning:

User: "I've been having chest pain and shortness of breath"

Vienna: "⚠️ These symptoms can be serious. Chest pain combined 
         with shortness of breath could indicate several 
         conditions ranging from anxiety to cardiac issues. 
         I strongly recommend you seek immediate medical 
         attention or call emergency services right away. 
         Do you want me to help you find the nearest emergency 
         room?"

Email Assistance:

User: "Send an email to my doctor about tomorrow's appointment"

Vienna: "I can help you draft that email. What would you like 
         to say in the message? Should I mention a specific 
         time or any concerns you'd like to discuss?"

Advanced Usage

Console Mode (Development)

python src/agent.py console

Direct terminal interaction
Real-time logging and debugging
Best for development and testing

Production Deployment

python src/agent.py start

Runs as a persistent service
Connects to LiveKit cloud
Handles multiple concurrent users
Enterprise-ready with metrics

Docker Deployment

docker build -t vienna-ai .
docker run -d --env-file .env.local vienna-ai

🔧 Configuration

Customizing the Voice

Vienna currently uses the creator's cloned voice for authentic, personalized interactions.

To use your own voice:

Visit Cartesia Voice Lab
Clone your voice (takes 2-3 minutes with 30 seconds of audio)
Copy your custom voice ID
Update in src/agent.py:

tts=inference.TTS(
    model="cartesia/sonic-3", 
    voice="your-custom-voice-id"  # Replace with your Voice ID
)

Available Voice Options:

✅ Custom cloned voices (your own voice or anyone with consent)
✅ Preset professional voices (50+ options)
✅ Multiple accent options (Indian, American, British, Australian, etc.)
✅ Adjustable parameters (speed, pitch, emotion)
✅ Gender-neutral voices (inclusive and professional)

Adjusting STT Language

Change the speech recognition language:

stt=inference.STT(
    model="deepgram/nova-3-medical", 
    language="en-US"  # Options: en-IN, en-US, en-GB, en-AU, etc.
)

Supported Languages:

en-IN - Indian English (default for Vienna)
en-US - American English
en-GB - British English
en-AU - Australian English
And many more...

Switching LLM Models

Change the AI model for different capabilities:

# GPT-4.1 Mini (default) - Fast and cost-effective
llm=inference.LLM(model="openai/gpt-4.1-mini")

# GPT-4 - More complex reasoning, higher cost
llm=inference.LLM(model="openai/gpt-4")

# GPT-4 Turbo - Balance of speed and capability
llm=inference.LLM(model="openai/gpt-4-turbo")
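
Rather than editing the source each time, the model name could be read from the environment. This is a sketch under assumptions: the `VIENNA_LLM_MODEL` variable and the `resolve_llm_model` helper are hypothetical and not part of Vienna, and the allowed set simply lists the models named in this section.

```python
import os
from typing import Mapping

DEFAULT_LLM_MODEL = "openai/gpt-4.1-mini"
# Models named in this section; extend as needed.
KNOWN_LLM_MODELS = {"openai/gpt-4.1-mini", "openai/gpt-4", "openai/gpt-4-turbo"}


def resolve_llm_model(env: Mapping[str, str] = os.environ) -> str:
    """Read the model from VIENNA_LLM_MODEL (hypothetical), else the default."""
    model = env.get("VIENNA_LLM_MODEL", DEFAULT_LLM_MODEL).strip()
    if model not in KNOWN_LLM_MODELS:
        raise ValueError(f"unsupported model: {model!r}")
    return model
```

The result would then be passed along as `llm=inference.LLM(model=resolve_llm_model())`. Rejecting unknown names early keeps a typo from surfacing only when the first request fails.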

Adjusting Voice Activity Detection

Configure VAD sensitivity:

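# In the prewarm function, adjust VAD parameters
# (requires: from livekit.plugins import silero)
vad = silero.VAD.load(
    min_speech_duration=0.1,        # Minimum speech duration, in seconds
    prefix_padding_duration=0.3,    # Padding kept before detected speech, in seconds
    activation_threshold=0.5,       # Detection threshold (0.0-1.0)
)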
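
To build intuition for what the detection threshold and the minimum speech duration do, here is a toy gate over per-frame speech probabilities. It is a simplified illustration, not Silero's actual algorithm: `speech_segments`, its parameters, and the 10ms frame size are assumptions made for the example.

```python
from typing import List, Tuple


def speech_segments(
    probs: List[float],
    frame_ms: int = 10,
    threshold: float = 0.5,
    min_speech_ms: int = 100,
) -> List[Tuple[int, int]]:
    """Return (start_ms, end_ms) runs at or above threshold lasting min_speech_ms+."""
    segments = []
    start = None  # index of the first frame in the current run, if any
    # A trailing sentinel below any valid threshold closes a run at the end.
    for i, p in enumerate(probs + [-1.0]):
        if p >= threshold and start is None:
            start = i
        elif p < threshold and start is not None:
            if (i - start) * frame_ms >= min_speech_ms:
                segments.append((start * frame_ms, i * frame_ms))
            start = None
    return segments
```

Raising the threshold makes the gate ignore quieter or more ambiguous audio, while raising the minimum duration filters out short bursts such as clicks or coughs.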

Enable/Disable Tools

Customize which autonomous tools are available:

# Comment out tools you don't want to use
class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(instructions="...")

    # The send_email tool below is commented out, so it is disabled.
    # Uncomment these lines to enable it again.
    # @function_tool
    # async def send_email(self, ...):
    #     pass

📊 Monitoring & Metrics

Vienna automatically collects and logs comprehensive performance metrics:

Available Metrics

Latency Metrics
- End-to-end response time (typically <500ms)
- STT processing time
- LLM inference time
- TTS generation time
- Network latency
Quality Metrics
- STT word error rate
- VAD accuracy
- TTS naturalness score
- User satisfaction indicators
Usage Metrics
- Token consumption (LLM)
- API call counts
- Concurrent user sessions
- Tool invocation frequency
Cost Metrics
- Per-session cost breakdown
- Monthly spending projections
- Cost per user interaction

Accessing Metrics

Metrics are logged in real-time:

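# Requires: from livekit.agents import MetricsCollectedEvent, metrics
# `session` is the running agent session; `logger` is a standard logging.Logger.
# Metrics are automatically collected
usage_collector = metrics.UsageCollector()

@session.on("metrics_collected")
def _on_metrics_collected(ev: MetricsCollectedEvent):
    metrics.log_metrics(ev.metrics)      # Log each metrics event as it arrives
    usage_collector.collect(ev.metrics)  # Accumulate usage totals

# Get summary
summary = usage_collector.get_summary()
logger.info(f"Session Usage: {summary}")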

Example Metrics Output

Session Metrics:
- Duration: 5m 32s
- Total Interactions: 12
- Avg Response Time: 387ms
- STT Accuracy: 96.3%
- Token Usage: 2,847 tokens
- Estimated Cost: $0.08
- Tools Invoked: 3 (web_search, weather, datetime)

🛡️ Privacy & Security

Vienna is built with privacy-first design principles:

Data Privacy

✅ Zero Data Retention - All conversations processed in real-time, nothing stored
✅ No Conversation Logs - No audio recordings or transcripts saved
✅ Ephemeral Processing - Data deleted immediately after response
✅ No Training Data - Your conversations never used to train AI models

Security Features

✅ Encrypted Communication - Media encrypted in transit via WebRTC
✅ Secure API Keys - Environment variables, never hardcoded
✅ User Confirmation - Required for sensitive actions (email, file access)
✅ HIPAA-Ready Architecture - Compliant with healthcare data regulations
✅ Rate Limiting - Prevents abuse and unauthorized access
✅ Audit Trails - Optional logging for compliance (disabled by default)

Medical Privacy

HIPAA Compliance Ready - Architecture supports HIPAA requirements
PHI Protection - No Protected Health Information stored or logged
Consent-Based - Users control what information is shared
Secure Integrations - Third-party tools require explicit authorization

Best Practices

Never share API keys publicly or in version control
Use environment variables for all sensitive configuration
Enable rate limiting in production deployments
Regular security audits of dependencies and configurations
User education about what Vienna can and cannot access

🤝 Contributing

Contributions are welcome! We'd love your help making Vienna even better.

How to Contribute

Fork the repository

git clone https://github.com/yourusername/Vienna.git
cd Vienna

Create a feature branch
```
git checkout -b feature/AmazingFeature
```
Make your changes
- Write clean, documented code
- Follow existing code style
- Add tests if applicable
- Update documentation
Commit your changes
```
git commit -m 'Add some AmazingFeature'
```
Push to your branch
```
git push origin feature/AmazingFeature
```
Open a Pull Request
- Describe your changes clearly
- Reference any related issues
- Wait for review and feedback

Contribution Guidelines

Code Style: Follow PEP 8 for Python code
Documentation: Update README for new features
Testing: Add tests for new functionality
Commits: Use clear, descriptive commit messages
Issues: Open issues for bugs or feature requests

Areas for Contribution

🌐 Language Support - Add support for more languages
🎤 Voice Models - Contribute new voice profiles
🛠️ New Tools - Add autonomous action capabilities
📚 Documentation - Improve guides and examples
🐛 Bug Fixes - Report and fix issues
⚡ Performance - Optimize latency and resource usage

📝 API Reference

Function Tools

All autonomous tools are documented below. Vienna automatically invokes these based on user requests.

`open_website(url: str)`

Opens a specified website URL in the user's default browser.

Parameters:

url (str): Website URL to open (must include protocol: http:// or https://)

Returns:

Confirmation message indicating the website is being opened

Example:

User: "Open google.com"
Vienna calls: open_website("https://google.com")
Vienna says: "Opening google.com for you now."
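
The example shows a bare domain ("google.com") being turned into a full URL before the call. How Vienna does this is not shown here; the sketch below is one plausible way, using a hypothetical `normalize_url` helper that defaults to https.

```python
from urllib.parse import urlparse


def normalize_url(raw: str) -> str:
    """Prefix https:// when the input has no http(s) scheme."""
    candidate = raw.strip()
    if not candidate.lower().startswith(("http://", "https://")):
        candidate = "https://" + candidate
    # Reject inputs that leave no host part, such as an empty string.
    if not urlparse(candidate).netloc:
        raise ValueError(f"not a usable web address: {raw!r}")
    return candidate


print(normalize_url("google.com"))  # https://google.com
```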

`search_web(query: str)`

Performs a web search for the given query and returns relevant results.

Parameters:

query (str): Search query string

Returns:

Search results summary and top links

Example:

User: "Search for diabetes symptoms"
Vienna calls: search_web("diabetes symptoms")
Vienna says: "Here are the search results for diabetes symptoms..."

`get_datetime()`

Retrieves the current date and time.

Parameters: None

Returns:

Formatted datetime string (e.g., "02:30 PM on Thursday, November 13, 2025")

Example:

User: "What time is it?"
Vienna calls: get_datetime()
Vienna says: "It's 02:30 PM on Thursday, November 13, 2025."
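
The exact format string the tool uses is not shown here. As a sketch, the following `strftime` pattern produces output of the same shape as the example; `format_datetime` is a hypothetical helper, and the weekday and month names assume an English locale.

```python
from datetime import datetime


def format_datetime(moment: datetime) -> str:
    """Format as 'HH:MM AM/PM on Weekday, Month DD, YYYY'."""
    return moment.strftime("%I:%M %p on %A, %B %d, %Y")


print(format_datetime(datetime.now()))
```

Note that `%I` and `%d` are zero-padded, so single-digit hours and days appear as "02" or "03".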

`lookup_weather(location: str)`

Gets current weather information for a specified location.

Parameters:

location (str): City or location name (e.g., "Mumbai", "New York")

Returns:

Weather information including temperature, conditions, humidity

Example:

User: "What's the weather in Mumbai?"
Vienna calls: lookup_weather("Mumbai")
Vienna says: "The weather in Mumbai is sunny, 27°C with 65% humidity."

`get_news(topic: str)`

Fetches latest news headlines and articles about a specific topic.

Parameters:

topic (str): News topic or category (e.g., "technology", "health", "sports")

Returns:

List of recent news headlines and brief summaries

Example:

User: "Get me the latest health news"
Vienna calls: get_news("health")
Vienna says: "Here are the latest health headlines..."

`get_stock_price(symbol: str)`

Retrieves current stock or cryptocurrency prices.

Parameters:

symbol (str): Stock ticker symbol (e.g., "AAPL", "GOOGL") or crypto symbol (e.g., "BTC", "ETH")

Returns:

Current price, change, and percentage change

Example:

User: "What's Apple's stock price?"
Vienna calls: get_stock_price("AAPL")
Vienna says: "AAPL is currently trading at $188.40, up 2.3% today."
