
Add HTTP streaming for local models #37

Merged: 3 commits into daswer123:main on Jan 2, 2024

Conversation

@Cohee1207 (Contributor) commented on Jan 2, 2024

Description

Adds proper TTS streaming over HTTP by using coqui's inference_stream method and FastAPI's StreamingResponse. The client can consume new audio data as soon as it's ready. I found a chunk size of 100 (coqui tokens, I assume?) to give a favorable latency/interruption trade-off on my MacBook running CPU inference.
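
The thread doesn't include the code itself, but the approach described above could look roughly like the sketch below. The WAV-header helper is illustrative, and `model`, `gpt_cond_latent`, and `speaker_embedding` are assumed to be whatever the server already loads for XTTS; this is not the PR's actual implementation.

```python
import struct

import numpy as np

# model, gpt_cond_latent, speaker_embedding: assumed to be loaded elsewhere
# by the server's existing XTTS setup code.


def wav_header(sample_rate: int = 24000, bits: int = 16, channels: int = 1) -> bytes:
    """44-byte WAV header with a placeholder length so playback can start
    before the total size is known (a common trick for streamed WAV)."""
    byte_rate = sample_rate * channels * bits // 8
    block_align = channels * bits // 8
    data_size = 0xFFFFFFFF - 44  # unknown-length placeholder
    return (
        b"RIFF" + struct.pack("<I", 36 + data_size) + b"WAVE"
        + b"fmt " + struct.pack("<IHHIIHH", 16, 1, channels, sample_rate,
                                byte_rate, block_align, bits)
        + b"data" + struct.pack("<I", data_size)
    )


def generate_audio(text: str, language: str = "en"):
    """Yield the WAV header, then 16-bit PCM chunks as XTTS produces them."""
    yield wav_header()
    chunks = model.inference_stream(
        text,
        language,
        gpt_cond_latent,
        speaker_embedding,
        stream_chunk_size=100,  # the value found to balance latency vs. interruptions
    )
    for chunk in chunks:
        pcm = (chunk.squeeze().float().cpu().numpy() * 32767).astype(np.int16)
        yield pcm.tobytes()
```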

Implications

  1. Works only with local models.
  2. Uses HTTP GET instead of HTTP POST. Explanation below.

Initially, I wanted to stick to HTTP POST requests only and do audio playback using client-side JavaScript, but unfortunately MediaSource does not support working with WAV data. Adding intermediate compression would only increase latency and add complexity. Using HTTP GET allows playback directly from HTML by setting the audio source to the API endpoint; the browser does all the buffering and decoding at no extra cost.
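
Concretely, the GET route only has to wrap the streaming generator (the `generate_audio` sketch above) in a StreamingResponse. The route name and query parameters here are illustrative rather than the PR's actual endpoint:

```python
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()


@app.get("/tts_stream")
def tts_stream(text: str, language: str = "en"):
    # Because this is a GET endpoint, the browser can play it directly, e.g.
    #   <audio autoplay src="/tts_stream?text=Hello%20there&language=en"></audio>
    # and handles all buffering and decoding of the streamed WAV itself.
    return StreamingResponse(generate_audio(text, language), media_type="audio/wav")
```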

Related SillyTavern pull request: SillyTavern/SillyTavern#1623


@daswer123 (Owner)

Wow, great job! I'll check it out now.

@daswer123 added the enhancement (New feature or request) label on Jan 2, 2024
@daswer123 (Owner)

Checked, everything works great. Thanks for your work :)

@daswer123 merged commit fb16a09 into daswer123:main on Jan 2, 2024