Use left-padding rather than right-padding for prefix audio by coezbek · Pull Request #198 · Zyphra/Zonos

coezbek · 2025-03-20T12:42:00Z

The existing code right-pads the audio prefix when its length isn't a multiple of 512 samples.

This might cause an unintended gap between prefix audio and generated audio.

This patch switches to left padding of the audio prefix.
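To make the difference concrete, here is a minimal sketch of both padding strategies. It is not the code from this patch: it uses plain Python lists instead of tensors, and the helper names `pad_amount`, `pad_right`, and `pad_left` are hypothetical. Only the 512-sample token size comes from the description above.

```python
HOP = 512  # samples per audio token, as stated in the PR description


def pad_amount(num_samples: int, hop: int = HOP) -> int:
    """Number of zero samples needed to reach the next multiple of `hop`."""
    return (-num_samples) % hop


def pad_right(samples: list[float], hop: int = HOP) -> list[float]:
    """Right-padding: zeros land after the prefix, next to the generated audio."""
    return samples + [0.0] * pad_amount(len(samples), hop)


def pad_left(samples: list[float], hop: int = HOP) -> list[float]:
    """Left-padding: zeros land before the prefix, so its last sample stays last."""
    return [0.0] * pad_amount(len(samples), hop) + samples


# Hypothetical 1000-sample prefix: 24 zeros are needed to reach 1024 samples.
prefix = [1.0] * 1000
right = pad_right(prefix)
left = pad_left(prefix)
```

With right-padding the prefix ends in zeros, which sit directly before the first generated token. With left-padding the zeros move to the start of the prefix and the real audio runs right up to the boundary.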

In this patch I also added a silence prefix of about 350 ms, which is exactly 30 audio tokens long (30 * 512 = 15360 samples). This ensures that clipping does not happen, and it also helps because the model seems to use roughly 20-25 tokens of look-ahead to warm up.
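The arithmetic behind the silence prefix can be checked with a short sketch. The 44100 Hz sample rate is an assumption taken from the `512/44100` figure in the review comment below, and the constant names are hypothetical.

```python
SAMPLE_RATE = 44_100  # Hz; assumed from the 512/44100 figure in the discussion
HOP = 512             # samples per audio token
SILENCE_TOKENS = 30   # length of the silence prefix in tokens

# Total number of zero samples in the silence prefix.
silence_samples = SILENCE_TOKENS * HOP

# Duration of the prefix in milliseconds (a little under 350 ms).
silence_ms = 1000 * silence_samples / SAMPLE_RATE

# The prefix itself: an exact multiple of HOP, so it needs no padding at all.
silence = [0.0] * silence_samples
```

Under these assumptions the prefix is 15360 samples, or about 348 ms, which matches the "350ms" figure to within rounding.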

mrdrprofuroboros · 2025-04-03T01:04:42Z

@coezbek

> This might cause an unintended gap between prefix audio and generated audio.

The gap is ~11.6 ms (512/44100 s) at most, right? Have you experienced a click there?

coezbek · 2025-04-03T07:58:38Z

Correct, although the maximum is 511 samples. And no, I haven't noticed any clicks, because I only use silence as the prefix anyway (by the way, 350 ms of silence improved generation quality a bit for me).

But conceptually it just seems wrong to put zeros in the gap.
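The worst-case figures from this exchange can be reproduced with a small sketch, assuming a 44100 Hz sample rate (inferred from the `512/44100` figure above) and 512 samples per token. The variable names are hypothetical.

```python
SAMPLE_RATE = 44_100  # Hz; assumed from the 512/44100 figure in the discussion
HOP = 512             # samples per audio token

# Padding needed for every prefix length across two full token spans.
pads = [(-n) % HOP for n in range(1, 2 * HOP + 1)]

# The largest padding occurs when the prefix is one sample past a multiple of HOP.
max_pad = max(pads)

# Worst-case gap length in milliseconds when that padding is placed on the right.
max_gap_ms = 1000 * max_pad / SAMPLE_RATE
```

Here `max_pad` comes out as 511 samples and `max_gap_ms` as roughly 11.59 ms, so the two estimates in the thread differ by a single sample.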

coezbek added 2 commits March 20, 2025 13:34

Use left-padding rather than right-padding for prefix audio

c0f4b05

Add 350ms of silence as an alternative prefix.

b83c23d

mrdrprofuroboros mentioned this pull request Apr 3, 2025

Infinite streaming #208

Open

coezbek wants to merge 2 commits into Zyphra:main from coezbek:audio_prefix_fix
