For some reason (most likely because I am stupid), yielding audio chunks directly from piper does not work. As a workaround, we wait for piper to return the whole sentence, convert it to MP3, and only then pass it to the rest of the pipeline. Fixing this could probably shave 0.5 to 1 second off the response time. It is low priority since the impact is small, but please vote or comment if it's a big deal for you personally.
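
To make the workaround concrete, here is a minimal sketch of the buffer-then-convert pattern. It is not the project's actual code. `fake_piper_chunks` is a hypothetical stand-in for piper's streamed output, `buffer_sentence` is an illustrative name, and the 22050 Hz, 16-bit mono format is an assumption. The standard library has no MP3 encoder, so the sketch wraps the audio in a WAV container instead; the real pipeline would run an MP3 encoding step at that point.

```python
import io
import wave
from typing import Iterable, Iterator


def fake_piper_chunks(n_chunks: int = 4, samples_per_chunk: int = 1024) -> Iterator[bytes]:
    """Hypothetical stand-in for piper: yields raw 16-bit mono PCM chunks (silence)."""
    for _ in range(n_chunks):
        yield b"\x00\x00" * samples_per_chunk


def buffer_sentence(chunks: Iterable[bytes], sample_rate: int = 22050) -> bytes:
    """Collect every chunk of one sentence, then wrap it in a single container.

    WAV is used here only because it is in the standard library; the sample
    rate and sample width are assumptions, not values taken from the project.
    """
    # This join is where the latency comes from: nothing is returned
    # until the last chunk of the sentence has been produced.
    pcm = b"".join(chunks)

    out = io.BytesIO()
    with wave.open(out, "wb") as wav:
        wav.setnchannels(1)        # mono
        wav.setsampwidth(2)        # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return out.getvalue()


sentence_audio = buffer_sentence(fake_piper_chunks())
```

A streaming fix would hand each chunk to the pipeline as soon as it arrives instead of joining them first, which is where the estimated time saving would come from.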