Test Gemini with Voice Input + Analysis + Structured Output #11

**Goal:** Evaluate Gemini’s voice/audio capabilities by sending an audio sample, analyzing the model output, and returning results in a structured format that matches our requirements.

---

### Requirements
- Provide an **audio/voice input** to Gemini (recorded file or microphone stream).
- Either convert the audio to text first, or pass it directly through Gemini’s native audio input, depending on the API mode.
- Ask Gemini to analyze the voice input and return:
  - transcription
  - key information extraction (what we need)
  - summary
  - optional metrics (confidence, timestamps, language, etc.)
- Output must be **structured** (JSON) so it can be consumed by the backend.
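
To make the structured-output requirement concrete, here is a minimal sketch of how the backend might validate the JSON that comes back. The field names (`transcription`, `key_information`, `summary`, `metrics`) and the `parse_analysis` helper are assumptions for illustration; the real schema is still to be defined under Inputs.

```python
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class VoiceAnalysis:
    """Hypothetical shape of one analysis result (field names are assumed)."""
    transcription: str
    key_information: dict[str, Any]
    summary: str
    metrics: dict[str, Any] = field(default_factory=dict)  # optional extras


# Required fields and the JSON type each one must have.
REQUIRED_FIELDS = {
    "transcription": str,
    "key_information": dict,
    "summary": str,
}


def parse_analysis(raw: str) -> VoiceAnalysis:
    """Parse the model's JSON text and check that required fields are present."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("response must be a JSON object")

    for name, expected in REQUIRED_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing required field: {name}")
        if not isinstance(data[name], expected):
            raise ValueError(f"field {name!r} must be of type {expected.__name__}")

    # Metrics (confidence, timestamps, language, ...) are optional.
    metrics = data.get("metrics") or {}
    if not isinstance(metrics, dict):
        raise ValueError("field 'metrics' must be an object")

    return VoiceAnalysis(
        transcription=data["transcription"],
        key_information=data["key_information"],
        summary=data["summary"],
        metrics=metrics,
    )
```

A test run could feed each raw model response through `parse_analysis` and count a `ValueError` as a failed case, which gives a simple pass/fail signal for whether the output matches our requirements.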

---

### Inputs
- [ ] Audio file (wav/mp3/m4a) OR live microphone recording
- [ ] Test prompt template (what we ask Gemini to do)
- [ ] Expected output schema (JSON fields)
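
The first two inputs could be prepared along these lines. This is only a sketch: the extension-to-MIME mapping, the prompt wording, and the helper names `audio_mime_type` and `build_prompt` are assumptions, and the MIME types in particular should be checked against the formats the chosen API mode actually accepts. The call to Gemini itself is left out because it depends on the SDK and API mode we pick.

```python
from pathlib import Path

# Assumed MIME type per file extension; confirm against the API's supported formats.
AUDIO_MIME_TYPES = {
    ".wav": "audio/wav",
    ".mp3": "audio/mpeg",
    ".m4a": "audio/mp4",
}

# Hypothetical prompt template; {fields} is filled with the schema's field names.
PROMPT_TEMPLATE = (
    "Analyze the attached audio recording. "
    "Return only a JSON object with these fields: {fields}. "
    "Do not add any text outside the JSON."
)


def audio_mime_type(path: str) -> str:
    """Return the MIME type for a test audio file, based on its extension."""
    suffix = Path(path).suffix.lower()
    try:
        return AUDIO_MIME_TYPES[suffix]
    except KeyError:
        raise ValueError(f"unsupported audio format: {suffix or path!r}") from None


def build_prompt(fields: list[str]) -> str:
    """Fill the prompt template with the expected output field names."""
    return PROMPT_TEMPLATE.format(fields=", ".join(fields))
```

For example, `build_prompt(["transcription", "key_information", "summary", "metrics"])` produces the text that would accompany the audio, so the prompt and the expected schema stay in sync from a single list of field names.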
