[Improvement] TTS and STT endpoints

It would be nice to add the /v1/audio/transcriptions, /v1/audio/translations, and /v1/audio/speech endpoints. I've started working on /v1/audio/transcriptions, but my Wi-Fi is down, so I can't continue for now. For STT I'm planning to use whisper.cpp, and possibly F5-TTS for speech synthesis.
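
To make the request shapes concrete, here is a rough sketch of what a client might send to two of these endpoints. It assumes the endpoints would follow the OpenAI-style conventions that the paths suggest: a multipart form with `file` and `model` fields for transcriptions, and a JSON body with `model`, `input`, and `voice` for speech. The helper names, model names, and voice name are placeholders for illustration, not part of any existing implementation.

```python
import json
import uuid


def build_transcription_request(audio_bytes, filename, model, boundary=None):
    """Build headers and a multipart/form-data body for POST /v1/audio/transcriptions.

    Assumption: the endpoint accepts `model` and `file` form fields,
    as in the OpenAI-style API the path names suggest.
    """
    boundary = boundary or uuid.uuid4().hex
    parts = [
        # Plain text field carrying the model name.
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="model"\r\n\r\n'
            f"{model}\r\n"
        ).encode(),
        # File field carrying the raw audio bytes.
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
            f"Content-Type: application/octet-stream\r\n\r\n"
        ).encode()
        + audio_bytes
        + b"\r\n",
        # Closing boundary.
        f"--{boundary}--\r\n".encode(),
    ]
    body = b"".join(parts)
    headers = {
        "Content-Type": f"multipart/form-data; boundary={boundary}",
        "Content-Length": str(len(body)),
    }
    return headers, body


def build_speech_request(text, model, voice):
    """Build headers and a JSON body for POST /v1/audio/speech.

    Assumption: the endpoint accepts `model`, `input`, and `voice` keys.
    """
    body = json.dumps({"model": model, "input": text, "voice": voice}).encode()
    headers = {
        "Content-Type": "application/json",
        "Content-Length": str(len(body)),
    }
    return headers, body
```

Either pair of headers and body could then be passed to `urllib.request.Request(url, data=body, headers=headers, method="POST")`, with `url` pointing at wherever the server ends up listening. Building the bodies separately from sending them keeps the request format easy to check without a running server.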