A high-performance, real-time streaming audio transcription backend built with Spring Boot, WebFlux, and the Google Gemini API.
Designed for low latency, network resilience, and efficient resource usage, this service accepts audio as a stream of chunks and returns partial transcriptions as soon as they are available.
- ✅ Real-time audio streaming (no buffering)
- ✅ Bi-directional communication using WebSockets
- ✅ Low-latency transcription with async, non-blocking processing
- ✅ Network resilience with circuit breakers & retries
- ✅ Scalable architecture using Spring WebFlux
- ✅ Concurrent session management
- ✅ Health checks & metrics monitoring
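
To make the circuit-breaker item above more concrete, here is a minimal plain-Java sketch of the idea. It is not this project's code and not the Resilience4j API; the class `SimpleCircuitBreaker`, its threshold, and its timing parameters are hypothetical names chosen for illustration. After a configured number of consecutive failures the breaker "opens" and returns a fallback without calling the upstream until a cool-down period has passed.

```java
import java.util.function.Supplier;

/** Illustrative count-based circuit breaker (hypothetical; not Resilience4j). */
final class SimpleCircuitBreaker {
    private final int failureThreshold;
    private final long openDurationMillis;
    private int consecutiveFailures;
    private long openedAtMillis = -1L; // -1 means the circuit is closed

    SimpleCircuitBreaker(int failureThreshold, long openDurationMillis) {
        this.failureThreshold = failureThreshold;
        this.openDurationMillis = openDurationMillis;
    }

    /** True while calls should be short-circuited to the fallback. */
    synchronized boolean isOpen() {
        return openedAtMillis >= 0
                && System.currentTimeMillis() - openedAtMillis < openDurationMillis;
    }

    private synchronized void onSuccess() {
        consecutiveFailures = 0;
        openedAtMillis = -1L;
    }

    private synchronized void onFailure() {
        consecutiveFailures++;
        if (consecutiveFailures >= failureThreshold) {
            openedAtMillis = System.currentTimeMillis();
        }
    }

    /** Runs the action unless the circuit is open; falls back on failure. */
    <T> T call(Supplier<T> action, Supplier<T> fallback) {
        if (isOpen()) {
            return fallback.get(); // upstream is skipped entirely
        }
        try {
            T result = action.get();
            onSuccess();
            return result;
        } catch (RuntimeException e) {
            onFailure();
            return fallback.get();
        }
    }

    public static void main(String[] args) {
        SimpleCircuitBreaker breaker = new SimpleCircuitBreaker(2, 60_000);
        Supplier<String> failing = () -> {
            throw new IllegalStateException("upstream down");
        };
        // The third call never reaches the failing upstream.
        for (int i = 0; i < 3; i++) {
            System.out.println(breaker.call(failing, () -> "fallback"));
        }
        System.out.println("open=" + breaker.isOpen());
    }
}
```

A production setup would more likely rely on Resilience4j's own circuit breaker and retry modules, which also add half-open probing, sliding windows, and metrics; the sketch only shows the core state transition.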
| Layer | Technology |
|---|---|
| Framework | Spring Boot 3.2.1 |
| Reactive | Spring WebFlux, Project Reactor |
| Streaming | WebSockets (SockJS fallback) |
| AI Engine | Google Gemini 2.0 Flash API |
| Resilience | Resilience4j |
| HTTP Client | OkHttp (connection pooling) |
| Build Tool | Maven |
| Java | Java 17+ |
Below are sample screenshots demonstrating the service in action:
- Java 17+
- Maven 3.8+
- Internet access to Google Gemini API
- Visit 👉 https://aistudio.google.com
- Create an API key
- Set it as an environment variable:
# Windows (Command Prompt)
set GEMINI_API_KEY=your-api-key-here
# Linux / macOS
export GEMINI_API_KEY=your-api-key-here
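
As a quick check that the variable is visible to the JVM, a small standalone program along these lines can help. This is a sketch under assumptions: the class `GeminiApiKey` and its `resolve` method are hypothetical and not part of this project; only the variable name `GEMINI_API_KEY` comes from the steps above.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper (not from the project): resolves the Gemini API key. */
final class GeminiApiKey {
    static final String ENV_VAR = "GEMINI_API_KEY";

    private GeminiApiKey() {}

    /** Returns the trimmed key, or empty if the variable is unset or blank. */
    static Optional<String> resolve(Map<String, String> env) {
        return Optional.ofNullable(env.get(ENV_VAR))
                .map(String::trim)
                .filter(key -> !key.isEmpty());
    }

    public static void main(String[] args) {
        Optional<String> key = resolve(System.getenv());
        if (key.isEmpty()) {
            System.err.println(ENV_VAR + " is not set; set it before starting the service.");
            System.exit(1);
        }
        // Never print the key itself; report only that it was found.
        System.out.println(ENV_VAR + " found (" + key.get().length() + " characters).");
    }
}
```

Taking the environment as a `Map` parameter keeps the lookup easy to test without touching the real process environment. Note that a variable set with `set` or `export` applies only to the current shell session, so the program has to be started from that same terminal.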
