49 changes: 49 additions & 0 deletions voice_driven_banking/.gitignore
@@ -0,0 +1,49 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
*.egg-info/
.installed.cfg
*.egg

# Virtual Environment
venv/
ENV/

# Flask
instance/
.webassets-cache

# Project specific
uploads/
data/
voice_prints/

# IDEs and editors
.idea/
.vscode/
*.swp
*.sublime-workspace

# OS specific files
.DS_Store
.DS_Store?
._*
.Spotlight-V100
.Trashes
ehthumbs.db
Thumbs.db
1 change: 1 addition & 0 deletions voice_driven_banking/Procfile
@@ -0,0 +1 @@
web: gunicorn app:app
193 changes: 175 additions & 18 deletions voice_driven_banking/README.md
@@ -1,28 +1,185 @@
# Voice Banking Automated Testing Framework
# Voice-Driven Banking via LAMs

This framework provides automated testing of voice commands for banking interfaces using Selenium and a voice command simulator.
A proof-of-concept for a voice-based banking platform supporting multiple languages including low-resource languages and dialects.

## Features
## Project Overview

- Simulates voice commands with realistic variations and confidence scores
- Automates testing of banking interfaces through Selenium
- Configurable command sets and locators
- Detailed test reports with success rates and recognized text
This project demonstrates how Large Acoustic Models (LAMs) can be used to build an inclusive, voice-controlled banking system for languages that mainstream voice recognition technologies typically underserve.

## Prerequisites
### Key Features

- Python 3.7+
- Chrome browser
- ChromeDriver matching your Chrome version
- Selenium and other required packages (see requirements.txt)
- **Multilingual Speech Recognition**: Supports English, Hindi, and Tamil (extendable to other languages)
- **Intent Recognition**: Identifies banking operations such as balance checks, money transfers, and transaction history requests
- **Voice Biometrics**: Simple voice authentication system for security
- **Banking Operations**: Basic simulation of banking functionality
- **Responsive UI**: Web interface for interacting with the voice banking system

## Technology Stack

- **Backend**: Python, Flask
- **Speech Recognition**: SpeechRecognition, Wav2Vec2 (for low-resource languages)
- **NLP**: spaCy, regex-based intent detection
- **Voice Biometrics**: Gaussian Mixture Models with MFCC features
- **Frontend**: HTML, CSS, JavaScript
- **Data Storage**: Simple JSON-based storage (for demonstration purposes)

## Installation

1. Clone this repository
2. Install dependencies: `pip install -r requirements.txt`
3. Update `config.json` with your test environment details
### Prerequisites

- Miniconda or Anaconda
- A modern web browser

### Setup with Miniconda

1. Install Miniconda:
- [Download Miniconda](https://docs.conda.io/en/latest/miniconda.html)
- Follow the installation instructions for your operating system

2. Clone the repository:
```
git clone <repository-url>
cd "GSoC'25 Mifos POC"
```

3. Create and activate the conda environment:
```
conda env create -f environment.yml
conda activate voice-banking
```

4. Download required language models for spaCy:
```
python -m spacy download en_core_web_sm
python -m spacy download xx_ent_wiki_sm
```

5. (Alternative) If you prefer pip to a conda environment, install the dependencies directly:
```
pip install -r requirements.txt
```

## Running the Application

1. Start the Flask server:
```
python app.py
```

2. Open your web browser and navigate to:
```
http://127.0.0.1:5000
```

3. Register a new account or log in with one of the demo accounts:
- Username: `johndoe`, Password: `password123` (English)
- Username: `janesmith`, Password: `password456` (Hindi)

## Usage Guide

### Voice Commands

The system supports the following banking operations:

1. **Check Balance**
- English: "What is my balance?", "Check my account balance"
- Hindi: "मेरा बैलेंस क्या है", "मेरा बैलेंस दिखाएं"
- Tamil: "என் இருப்பு என்ன", "என் கணக்கு இருப்பு காட்டு"

2. **Transfer Money**
- English: "Transfer 100 dollars to Jane", "Send 50 to John"
- Hindi: "जेन को 100 रुपये ट्रांसफर करें"
- Tamil: "ஜேனுக்கு 100 ரூபாய் அனுப்பு"

3. **Transaction History**
- English: "Show my recent transactions", "Show my transaction history"
- Hindi: "मेरे हाल के लेनदेन दिखाएं"
- Tamil: "என் சமீபத்திய பரிவர்த்தனைகளைக் காட்டு"

### Voice Authentication

Upon first use, the system will automatically enroll your voice. For subsequent uses, it will authenticate your voice against the stored voiceprint. In this proof-of-concept, authentication thresholds are set low for ease of demonstration.

## Project Structure

```
/
├── app.py # Main Flask application
├── requirements.txt # Python dependencies
├── README.md # Project documentation
├── /config/
│ └── intent_patterns.json # Language-specific patterns for intent recognition
├── /data/
│ ├── mock_db.json # Mock banking data (auto-generated)
│ ├── users.json # User data (auto-generated)
│ └── /voice_prints/ # Voice authentication models (auto-generated)
├── /models/
│ ├── speech_recognition.py # Speech-to-text conversion
│ ├── intent_recognition.py # Banking intent detection
│ └── voice_biometrics.py # Voice authentication logic
├── /services/
│ ├── banking_service.py # Banking operations
│ └── user_service.py # User management
├── /static/
│ ├── /css/
│ │ └── style.css # Frontend styling
│ └── /js/
│ └── app.js # Frontend logic
├── /templates/
│ └── index.html # Main application page
└── /uploads/
└── /audio/ # Temporary storage for audio files
```

## Technical Implementation

### Speech Recognition

- For English and well-supported languages, we use Google's Speech Recognition API
- For low-resource languages like Hindi and Tamil, we employ fine-tuned versions of Wav2Vec2 models

### Intent Recognition

Intent recognition uses a combination of:
- Regular expression pattern matching based on language-specific patterns
- Simple NLP processing using spaCy to handle variations
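
The regex half of this approach can be sketched as follows. This is a minimal illustration, not the project's actual code: the `INTENT_PATTERNS` table, the intent names, and the helpers `detect_intent` and `extract_amount` are all assumptions, and the real patterns in `config/intent_patterns.json` may look different. The spaCy step is omitted.

```python
import re

# Hypothetical pattern table; the project's intent_patterns.json may differ.
INTENT_PATTERNS = {
    "en": {
        "check_balance": [r"\bbalance\b"],
        "transfer_money": [r"\b(?:transfer|send)\b.*\bto\b"],
        "transaction_history": [r"\btransactions?\b", r"\bhistory\b"],
    },
    "hi": {
        # No \b here: word boundaries are unreliable around combining marks.
        "check_balance": [r"बैलेंस"],
        "transfer_money": [r"ट्रांसफर"],
        "transaction_history": [r"लेनदेन"],
    },
}


def detect_intent(text, language="en"):
    """Return the first intent whose pattern matches, or "unknown"."""
    for intent, patterns in INTENT_PATTERNS.get(language, {}).items():
        if any(re.search(p, text, flags=re.IGNORECASE) for p in patterns):
            return intent
    return "unknown"


def extract_amount(text):
    """Return the first number in the text as a float, or None."""
    match = re.search(r"\d+(?:\.\d+)?", text)
    return float(match.group()) if match else None
```

Intents are checked in insertion order, so an utterance that matches several patterns resolves to the first one listed. A real system would probably need a scoring or priority scheme instead.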

### Voice Biometrics

The voice authentication system:
- Extracts MFCC features from audio samples
- Uses Gaussian Mixture Models (GMMs) to create voice prints
- Computes likelihood scores for authentication decisions
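
The scoring step can be sketched in plain Python as below. It assumes a diagonal-covariance GMM that has already been fitted, with MFCC frames supplied as lists of floats. The model layout (`weights`, `means`, `variances`), the function names, and the threshold are hypothetical; the project's `voice_biometrics.py` may rely on a library implementation instead.

```python
import math


def gmm_avg_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM."""
    total = 0.0
    for x in frames:
        component_logs = []
        for w, mu, var in zip(weights, means, variances):
            # log(weight) + log of a diagonal Gaussian density at x
            log_p = math.log(w)
            for xi, mi, vi in zip(x, mu, var):
                log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
            component_logs.append(log_p)
        # log-sum-exp over components for numerical stability
        peak = max(component_logs)
        total += peak + math.log(sum(math.exp(c - peak) for c in component_logs))
    return total / len(frames)


def authenticate(frames, model, threshold):
    """Accept the speaker if the average log-likelihood reaches the threshold."""
    score = gmm_avg_log_likelihood(
        frames, model["weights"], model["means"], model["variances"]
    )
    return score >= threshold, score
```

Averaging over frames keeps the score comparable across recordings of different lengths, which is what makes a single fixed threshold workable.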

### Banking Simulation

The banking functionality:
- Uses a simple JSON file as a mock database
- Supports account balance queries
- Processes simulated money transfers
- Returns transaction history
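
A JSON-backed mock of these three operations might look like the sketch below. The schema (an `accounts` map plus a `transactions` list) and every function name here are assumptions for illustration; the auto-generated `mock_db.json` may be organised differently.

```python
import json
from datetime import datetime, timezone


def load_db(path):
    """Read the mock database from a JSON file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def save_db(db, path):
    """Write the mock database back to disk."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(db, f, ensure_ascii=False, indent=2)


def get_balance(db, username):
    return db["accounts"][username]["balance"]


def transfer(db, sender, recipient, amount):
    """Move funds between two accounts and record the transaction."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    accounts = db["accounts"]
    if sender not in accounts or recipient not in accounts:
        raise KeyError("unknown account")
    if accounts[sender]["balance"] < amount:
        raise ValueError("insufficient funds")
    accounts[sender]["balance"] -= amount
    accounts[recipient]["balance"] += amount
    db["transactions"].append({
        "from": sender,
        "to": recipient,
        "amount": amount,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })


def recent_transactions(db, username, limit=5):
    """Return the user's latest transactions, newest first."""
    mine = [t for t in db["transactions"] if username in (t["from"], t["to"])]
    return mine[-limit:][::-1]
```

All validation runs before any balance changes, so a rejected transfer leaves the in-memory data untouched.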

## Limitations and Future Work

This project is a proof-of-concept with the following limitations:

1. **Speech Recognition**: Uses pre-trained models rather than custom-trained LAMs for low-resource languages
2. **Voice Authentication**: Uses basic GMM modeling rather than more sophisticated deep learning approaches
3. **Banking Integration**: Simulates banking operations rather than connecting to actual banking systems
4. **Security**: Implements basic security measures; a production system would need more robust security
5. **Offline Support**: Some speech recognition currently requires an internet connection; a full implementation would support offline operation

Future work would focus on:
- Training custom LAMs for targeted low-resource languages
- Improving voice biometrics with anti-spoofing measures
- Adding more banking operations
- Creating native mobile applications
- Supporting offline operation for areas with limited connectivity

## License

[Specify license information]

## Usage
## Contact

Run the main test suite:
Or run individual components:
[Your contact information]
48 changes: 48 additions & 0 deletions voice_driven_banking/RENDER_DEPLOYMENT.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
# Deploying Voice Banking System on Render

This guide explains how to deploy the Voice-Driven Banking System to Render.com.

## Deployment Options

### Option 1: Manual Deployment

1. Create a new Web Service on Render
2. Connect your GitHub repository
3. Use the following settings:
- **Environment**: Python
- **Build Command**: `pip install -r requirements.txt`
- **Start Command**: `gunicorn app:app`
4. Click "Create Web Service"

### Option 2: Blueprint Deployment

1. Fork this repository to your GitHub account
2. Go to Render Dashboard: https://dashboard.render.com/
3. Click "New" and select "Blueprint"
4. Connect your GitHub repository
5. Render will automatically detect the `render.yaml` file and set up the service

## Environment Variables

If your application uses API keys or other sensitive values, add them as environment variables in the Render dashboard:

1. Go to your web service in the Render dashboard
2. Click on "Environment" tab
3. Add your environment variables (e.g., API keys)

## Persistent Storage (Optional)

If you need persistent storage for user data or voice prints:

1. Create a Render Disk
2. Attach it to your service
3. Update your code to use the disk path
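
One way to handle step 3 is to read the storage root from an environment variable, so the same code runs locally and on Render. The sketch below is an assumption-laden example: the variable name `DATA_DIR`, the mount path in the comment, and both helper functions are hypothetical, and the file names simply mirror the project structure described in the README.

```python
import os
from pathlib import Path


def storage_paths(env=os.environ):
    """Resolve data locations from DATA_DIR, defaulting to ./data."""
    root = Path(env.get("DATA_DIR", "data"))
    return {
        "root": root,
        "users": root / "users.json",
        "mock_db": root / "mock_db.json",
        "voice_prints": root / "voice_prints",
    }


def ensure_storage(paths):
    """Create the directories the app writes to, if they are missing."""
    paths["voice_prints"].mkdir(parents=True, exist_ok=True)


# On Render, DATA_DIR would be set to the disk's mount path,
# for example /var/data (a hypothetical value).
```

Falling back to `data` when the variable is unset means local development needs no extra configuration.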

## Custom Domain (Optional)

To use a custom domain:

1. Go to your web service in the Render dashboard
2. Click on "Settings" tab
3. Scroll to "Custom Domains" section
4. Add your domain and follow the DNS configuration instructions