r-camara/AiNotes

AiNotes

Desktop app for recording, transcribing, and searching meetings — 100% local, GPU-accelerated, with speaker diarization. Built with Tauri + React + faster-whisper + SpeechBrain.

No cloud. No API keys. Your audio and transcripts never leave your machine.

Features

  • Record meetings with mic + system audio (loopback) captured together
  • Call detection — suggests "Record now" when Teams / Zoom / Meet are active
  • GPU-accelerated transcription (faster-whisper + CTranslate2, int8 on NVIDIA) — typical real-time factor (RTF, processing time divided by audio length) of ~0.03x: 30s of audio is transcribed in about 1s on a GTX 1080 Ti
  • Speaker diarization (SpeechBrain ECAPA-TDNN + agglomerative clustering) — fully local, no HuggingFace token required
  • Full-text search across all transcriptions (SQLite FTS5)
  • Multiple Whisper models — tiny / base / small / medium / large-v2
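
The diarization feature pairs speaker embeddings with agglomerative clustering. The sketch below illustrates only the clustering half, in pure Python as a stand-in for scikit-learn so that it runs without dependencies. The function names, the average-linkage choice, the cosine distance, and the 0.5 threshold are all assumptions made for illustration, not values taken from this project, and the toy 2-D vectors stand in for real ECAPA-TDNN embeddings.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat an all-zero embedding as unrelated to everything
    return 1.0 - dot / (norm_a * norm_b)


def cluster_speakers(embeddings, threshold=0.5):
    """Average-linkage agglomerative clustering (illustrative, O(n^3)).

    Starts with one cluster per segment and repeatedly merges the two
    closest clusters until the closest pair is farther apart than
    `threshold` (a hypothetical value). Returns one label per segment.
    """
    clusters = [[i] for i in range(len(embeddings))]

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        total = sum(
            cosine_distance(embeddings[i], embeddings[j]) for i in c1 for j in c2
        )
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best[0] > threshold:
            break  # remaining clusters are treated as distinct speakers
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]

    labels = [0] * len(embeddings)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


if __name__ == "__main__":
    # Four toy segment embeddings: two near the x axis, two near the y axis.
    toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print(cluster_speakers(toy))
```

Stopping on a distance threshold rather than a fixed cluster count means the number of speakers does not have to be known in advance, which is one common reason to pick agglomerative clustering for diarization.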

Stack

Layer        Tech
UI           React 19 + TypeScript + Vite
Shell        Tauri 2 (Rust)
Backend      Python (Click CLI, JSON IPC)
Storage      SQLite (FTS5 for search)
ASR          faster-whisper / CTranslate2
Diarization  SpeechBrain ECAPA-TDNN + scikit-learn
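
To make the storage row concrete, here is a minimal sketch of full-text search over transcript segments using SQLite FTS5 from Python's standard library. The table name `transcript_fts`, its columns, and the helper names are hypothetical; the actual schema in db.py may differ. It also assumes the local SQLite build includes FTS5, which is true of most CPython distributions but not guaranteed.

```python
import sqlite3


def build_index(conn):
    # recording_id is stored but not tokenized; only `text` is searchable.
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS transcript_fts "
        "USING fts5(recording_id UNINDEXED, text)"
    )


def add_segment(conn, recording_id, text):
    conn.execute(
        "INSERT INTO transcript_fts (recording_id, text) VALUES (?, ?)",
        (recording_id, text),
    )


def search(conn, query, limit=20):
    """Return (recording_id, highlighted snippet) pairs, best match first."""
    rows = conn.execute(
        "SELECT recording_id, snippet(transcript_fts, 1, '[', ']', '...', 8) "
        "FROM transcript_fts WHERE transcript_fts MATCH ? "
        "ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_index(conn)
    add_segment(conn, 1, "budget review for the third quarter")
    add_segment(conn, 2, "hiring plan discussion")
    for recording_id, excerpt in search(conn, "budget"):
        print(recording_id, excerpt)
```

`snippet()` takes the column index (1 is `text` here), the highlight delimiters, an ellipsis string, and a maximum token count, so the UI can show a short excerpt around each hit without loading the whole transcript.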

Architecture

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   React UI  │────▶│ Tauri (Rust)│────▶│ Python CLI  │
│  (Vite dev) │     │  commands   │     │   (Click)   │
└─────────────┘     └─────────────┘     └──────┬──────┘
                                               ▼
                                         ┌───────────┐
                                         │  SQLite   │
                                         │ recordings│
                                         │  models   │
                                         └───────────┘

Every Tauri command runs the Python CLI as a subprocess and parses the JSON it prints to stdout. The recording process is spawned detached and is controlled through a lock file.
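
The subprocess-plus-JSON pattern can be sketched from the caller's side. In the app the caller is Rust; the version below uses Python only to keep one language across examples. `run_cli`, `CliError`, and the stand-in child command are hypothetical names for illustration, and the real CLI's arguments and JSON shape are not shown in this README.

```python
import json
import subprocess
import sys


class CliError(RuntimeError):
    """Raised when the child exits non-zero or prints something that is not JSON."""


def run_cli(argv, timeout=60.0):
    """Run a command, capture stdout, and decode it as one JSON document."""
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    if proc.returncode != 0:
        raise CliError(f"exit {proc.returncode}: {proc.stderr.strip()}")
    try:
        return json.loads(proc.stdout)
    except json.JSONDecodeError as exc:
        raise CliError(f"stdout was not valid JSON: {exc}") from exc


if __name__ == "__main__":
    # Stand-in for the real CLI: a child process that prints one JSON object.
    fake_cli = [
        sys.executable,
        "-c",
        "import json; print(json.dumps({'status': 'ok', 'recordings': []}))",
    ]
    print(run_cli(fake_cli))
```

Keeping stdout reserved for a single JSON document, with logs and warnings sent to stderr, is what makes this kind of IPC dependable: any stray print on stdout would break parsing on the caller's side.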

Requirements

  • Windows 10 / 11 (primary target; Linux/macOS untested)
  • Python 3.10+
  • Node 18+
  • Rust (for Tauri) — install via rustup
  • NVIDIA GPU with CUDA 12 support (optional but strongly recommended)

Setup

:: 1. Install Python dependencies
cd python
pip install -r requirements.txt
:: Install PyTorch with CUDA support separately (CUDA builds come from the PyTorch wheel index, not PyPI)
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu124
cd ..

:: 2. Install Node dependencies
npm install

:: 3. Run in development
npm run tauri dev

On first launch, Whisper models (~500 MB – 3 GB depending on size) and the SpeechBrain ECAPA-TDNN model (~80 MB) download into data/models/.

Producing a standalone installer

build_app.bat

This runs PyInstaller to bundle the Python CLI into a single .exe sidecar, then builds the Tauri installer (.msi + .exe) in src-tauri/target/release/bundle/.

Project layout

AiNotes/
├── src/                    React + TypeScript UI
├── src-tauri/              Rust / Tauri shell
│   ├── src/lib.rs          Command handlers
│   └── binaries/           PyInstaller sidecar (generated)
├── python/
│   └── ainotes/            Python backend
│       ├── cli.py          Click CLI (all JSON output)
│       ├── db.py           SQLite + FTS5
│       ├── recorder.py     WASAPI loopback + mic capture
│       ├── transcriber.py  faster-whisper pipeline
│       ├── diarizer.py     SpeechBrain ECAPA-TDNN
│       └── call_detector.py pycaw-based call detection
└── data/                   Runtime data (gitignored)
    ├── recordings/
    ├── models/
    └── ainotes.db
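
The recorder runs as a detached process controlled through a lock file (see Architecture). The README does not describe the protocol, so the sketch below is one plausible reading: the lock file exists while recording is active, and deleting it asks the recorder to stop. Every name here (`acquire_lock`, `request_stop`, `record_until_stopped`, the JSON payload) is an assumption, not the actual recorder.py interface.

```python
import json
import os
import time
from pathlib import Path


def acquire_lock(lock_path, recording_id):
    """Create the lock file; fails with FileExistsError if one already exists."""
    # O_EXCL makes creation atomic, so two recorders cannot both start.
    fd = os.open(str(lock_path), os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    with os.fdopen(fd, "w") as fh:
        json.dump({"pid": os.getpid(), "recording_id": recording_id}, fh)


def request_stop(lock_path):
    """Called by the controlling process: removing the lock signals 'stop'."""
    Path(lock_path).unlink(missing_ok=True)


def record_until_stopped(lock_path, capture_chunk, poll_seconds=0.1):
    """Capture audio chunks for as long as the lock file exists."""
    chunks = 0
    while Path(lock_path).exists():
        capture_chunk()  # hypothetical callback that records one audio chunk
        chunks += 1
        time.sleep(poll_seconds)
    return chunks
```

A file-based signal suits a detached process because the controller needs no open pipe or handle to the recorder; any later CLI invocation can stop it by path alone.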

Performance notes

On an i7-7700K + GTX 1080 Ti (Pascal, CC 6.1):

Audio length  Model  Device     Time  RTF
71s           small  cuda int8  ~2s   0.025x
71s           small  cpu int8   ~30s  0.4x
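
The RTF column is processing time divided by audio length, so values below 1 mean faster than real time. A small sketch of the arithmetic (the function name is illustrative); because the times in the table are rounded, recomputing from them gives slightly different figures than the RTF column.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """RTF = processing time / audio duration; lower is faster."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


if __name__ == "__main__":
    # Rounded figures from the table above: 71s of audio, ~2s on GPU, ~30s on CPU.
    print(round(real_time_factor(2, 71), 3))   # 0.028
    print(round(real_time_factor(30, 71), 2))  # 0.42
```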

Diarization runs on CPU (to avoid a cuDNN conflict with CTranslate2) and adds ~1-3 minutes for a one-hour meeting.

Why CPU for diarization?

faster-whisper (via CTranslate2) loads cuDNN 9 DLLs from the nvidia-* pip packages. PyTorch 2.6 ships its own cuDNN, and when both sit in the same process they compete for symbols, crashing with exit code 127. Since ECAPA-TDNN is small and fast on CPU, running diarization there is the simplest robust fix.
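
One way to express that split in code is to pin each model to its device when it is loaded. This is a sketch under assumptions: `DEVICE_PLAN`, the loader names, the `savedir` path, and the model source `speechbrain/spkrec-ecapa-voxceleb` are illustrative and may not match transcriber.py or diarizer.py. The imports are deferred so the module loads even where the libraries are not installed.

```python
# Hypothetical device assignment mirroring the text above:
# ASR on the GPU with int8, diarization on the CPU.
DEVICE_PLAN = {
    "asr": {"device": "cuda", "compute_type": "int8"},
    "diarization": {"device": "cpu"},
}


def load_asr(model_size="small"):
    """Load faster-whisper on the GPU (requires the faster-whisper package)."""
    from faster_whisper import WhisperModel

    return WhisperModel(model_size, **DEVICE_PLAN["asr"])


def load_speaker_encoder(savedir="data/models/ecapa"):
    """Load the ECAPA-TDNN encoder on the CPU (requires SpeechBrain 1.0+)."""
    # Older SpeechBrain releases expose this class under speechbrain.pretrained.
    from speechbrain.inference.speaker import EncoderClassifier

    return EncoderClassifier.from_hparams(
        source="speechbrain/spkrec-ecapa-voxceleb",  # assumed model source
        savedir=savedir,                             # assumed cache location
        run_opts=DEVICE_PLAN["diarization"],
    )
```

Keeping the device choices in one table makes the constraint visible in a single place, which helps if the cuDNN conflict is resolved in a later release and diarization can move back to the GPU.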

License

MIT — see LICENSE file.
