generate duplicated phrases #94

Open
x180380 opened this issue May 19, 2023 · 8 comments
Labels
bug Something isn't working

Comments

@x180380

x180380 commented May 19, 2023

whisper-timestamped generates duplicated phrases for some audio files, such as https://flex2.acast.com/s/pbs-newshour-segments/u/d3i6fh83elv35t.cloudfront.net/static/2023/05/newswrap-15.mp3
I am using the small and medium models.

@passerbya

[image]
I have also encountered the same issue.

@blundercode

I have seen this happen outside of whisper-timestamped as well, with other Whisper implementations. I am curious whether it is caused by hallucination or by not using VAD.

@pinballelectronica

Also seeing this, mostly during quiet parts, if that helps at all. Otherwise the transcription is spot on, even with the hardest content.

@misutoneko

For this particular sample, --accurate will get rid of the duplicates.
The problem is, there is no single set of parameters that works best for everything.
Sometimes I've even had to switch to a smaller model to get the timings right.

@Jeronymous
Member

Yes, exactly @misutoneko
No free lunch...

@x180380
Author

x180380 commented May 26, 2023

When using the small or tiny model, there are fewer duplicated phrases. WhisperX also has this issue.

@Jeronymous
Member

Some people have reported that setting compression_ratio_threshold lower than the default (2.4) mitigates this issue,
typically --compression_ratio_threshold 1
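
For context on why this option matters: in openai-whisper, a decoded segment whose text compresses too well (ratio above compression_ratio_threshold) is treated as suspect, and decoding is retried at a higher temperature when temperature fallback is enabled. Repeated phrases compress very well, so their ratio is high. Below is a minimal sketch of that measure, assuming the same zlib-based definition as openai-whisper; the two sample strings are made up for illustration and are not taken from the audio in this issue.

```python
import zlib


def compression_ratio(text: str) -> float:
    """Raw UTF-8 size divided by zlib-compressed size; repetitive text scores high."""
    raw = text.encode("utf-8")
    return len(raw) / len(zlib.compress(raw))


# Hypothetical examples: one ordinary sentence, one looping phrase.
normal = "The president met with congressional leaders today to discuss the debt ceiling."
looped = "Thank you for watching. " * 12

ratio_normal = compression_ratio(normal)
ratio_looped = compression_ratio(looped)

print(f"normal: {ratio_normal:.2f}")
print(f"looped: {ratio_looped:.2f}")
```

The looping text scores well above the 2.4 default, while the ordinary sentence stays far below it. Lowering the threshold toward 1 makes the check trigger much more often, which may help with duplicates but also means more segments get re-decoded.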

Jeronymous added the bug label on Nov 15, 2023

@mattdl-radix

mattdl-radix commented Nov 21, 2023

Had the same problem, with more than 10 repetitions for several .mp3 files.
The solution that worked for me was adding --compression_ratio_threshold 1 --accurate
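
For anyone scripting this, here is a small sketch that assembles the command line with those two options. It assumes the whisper_timestamped CLI is installed and on PATH; build_command is a hypothetical helper, and newswrap-15.mp3 stands for a local copy of the audio from the first comment.

```python
import shlex


def build_command(audio_path: str, model: str = "medium",
                  compression_ratio_threshold: float = 1.0,
                  accurate: bool = True) -> list[str]:
    """Build an argv list for the whisper_timestamped CLI (hypothetical helper)."""
    cmd = [
        "whisper_timestamped", audio_path,
        "--model", model,
        # ":g" drops the trailing ".0" so 1.0 is passed as "1"
        "--compression_ratio_threshold", f"{compression_ratio_threshold:g}",
    ]
    if accurate:
        cmd.append("--accurate")
    return cmd


# Assumed local file name; replace with your own audio path.
cmd = build_command("newswrap-15.mp3")
command_line = shlex.join(cmd)
print(command_line)
```

The list can be passed to subprocess.run(cmd, check=True) to actually run the transcription. As noted earlier in the thread, no single set of parameters works for everything, so treat these values as a starting point.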


7 participants