Streaming

Wouldn't it be better to use streaming interfaces in both the LLM and speech systems, so that output can start before the full response has been generated?

For example:

https://github.com/elevenlabs/elevenlabs-js/issues/4#issuecomment-2004696164

Vercel should support this:

https://vercel.com/docs/functions/streaming
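
To make the idea concrete, here is a rough sketch of what I have in mind, not a working integration. It uses only the standard Web Streams and `Response` APIs. `fakeLlmTokens` and `fakeSynthesize` are hypothetical stand-ins for the real streaming LLM and speech clients, and `takeSentences`, `speechChunks` and `toResponse` are names I made up for illustration. The sentence splitting is deliberately naive.

```typescript
// Hypothetical stand-in for a streaming LLM client: yields text deltas as they arrive.
async function* fakeLlmTokens(): AsyncGenerator<string> {
  for (const token of ["Hello", " there.", " How", " are", " you?"]) {
    yield token;
  }
}

// Hypothetical stand-in for a speech call: returns placeholder "audio" bytes for one sentence.
async function fakeSynthesize(sentence: string): Promise<Uint8Array> {
  return new TextEncoder().encode(`[audio:${sentence}]`);
}

// Split every complete sentence off the buffer and keep the unfinished tail.
// Naive: any run of '.', '!' or '?' ends a sentence (so "3.14" would be split too).
function takeSentences(buffer: string): { sentences: string[]; rest: string } {
  const sentences: string[] = [];
  const pattern = /[^.!?]*[.!?]+\s*/g;
  let consumed = 0;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(buffer)) !== null) {
    sentences.push(match[0].trim());
    consumed = pattern.lastIndex;
  }
  return { sentences, rest: buffer.slice(consumed) };
}

// Turn a token stream into a stream of audio chunks, one chunk per finished sentence,
// so synthesis can begin while the LLM is still generating.
async function* speechChunks(tokens: AsyncIterable<string>): AsyncGenerator<Uint8Array> {
  let buffer = "";
  for await (const token of tokens) {
    const { sentences, rest } = takeSentences(buffer + token);
    buffer = rest;
    for (const sentence of sentences) {
      yield await fakeSynthesize(sentence);
    }
  }
  // Flush whatever is left once the LLM stream ends.
  const tail = buffer.trim();
  if (tail) {
    yield await fakeSynthesize(tail);
  }
}

// Wrap the chunk generator in a ReadableStream and return it as a streamed Response.
// Using pull() means the next chunk is only requested when the consumer wants more.
function toResponse(chunks: AsyncGenerator<Uint8Array>): Response {
  const body = new ReadableStream<Uint8Array>({
    async pull(controller) {
      const { done, value } = await chunks.next();
      if (done) {
        controller.close();
      } else {
        controller.enqueue(value);
      }
    },
    async cancel() {
      // Stop the upstream generator if the client disconnects.
      await chunks.return(undefined);
    },
  });
  // Content type is an assumption; it depends on what the speech service actually returns.
  return new Response(body, { headers: { "Content-Type": "audio/mpeg" } });
}
```

A handler would then do something like `return toResponse(speechChunks(fakeLlmTokens()))`, assuming the function runtime accepts a standard `Response` whose body is a `ReadableStream`. Buffering up to sentence boundaries is just one possible choice; smaller chunks would lower latency further if the speech service handles partial text well.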