WhisperKit Benchmarks #243

atiorh · 2024-11-02T14:58:18Z

atiorh
Nov 2, 2024
Maintainer

We are thrilled to announce our comprehensive benchmark suite for WhisperKit!

Benchmarks (Hugging Face Space)
Detailed Announcement (Twitter)

The benchmarks will be updated with every release, starting with WhisperKit-0.9!

Performance (speed) is reported on long-form ("from file" proxy) and short-form ("streaming" proxy) audio. The test data used in the benchmarks is published on Hugging Face, and the benchmarks are reproducible by following the instructions in BENCHMARKS.md.

Quality is reported across 3 datasets and 77 languages using WER and other metrics. Speech-to-text as well as Language Detection tasks are evaluated.
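For readers unfamiliar with WER (word error rate), here is a minimal sketch of the standard definition: the word-level edit distance between reference and hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions for illustration only. The benchmark suite may normalize text differently before scoring, so this is not its actual implementation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    Illustrative only: tokenizes on whitespace and applies no text
    normalization (an assumption, not the benchmark's pipeline).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.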

Device Support data is also published so developers can build WhisperKit presets that best fit each end-user device while maximizing speed and/or accuracy. Raw data here.

Looking forward to the community feedback!

atiorh · 2024-11-02T14:59:37Z

atiorh
Nov 2, 2024
Maintainer Author

Note: WhisperKit can run faster than the published numbers. The benchmark data reflects the recommended (default) configuration, which balances battery life, thermal sustainability, memory consumption, and latency for a smooth user experience. For example, on M2 Ultra, WhisperKit runs the latest OpenAI Large V3 Turbo model (v20240930/turbo in WhisperKit) as fast as 72x real-time with a GPU+ANE config, whereas the default config (ANE only) is published as 42x real-time on the benchmarks.

M2_Ultra_large_v3_turbo.mov
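To put the "x real-time" figures in concrete terms, here is a small sketch. It assumes the speed factor is defined as audio duration divided by processing time, which is the usual convention. The helper names and the one-hour example file are hypothetical and are not part of the benchmark suite.

```python
def speed_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Speed factor ("Nx real-time") = audio duration / processing time."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


def processing_time(audio_seconds: float, factor: float) -> float:
    """Seconds needed to transcribe the audio at a given speed factor."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return audio_seconds / factor


one_hour = 3600.0  # hypothetical one-hour audio file

# Speed factors quoted in the note above for M2 Ultra, Large V3 Turbo.
default_time = processing_time(one_hour, 42.0)  # default config (ANE only)
fast_time = processing_time(one_hour, 72.0)     # GPU+ANE config

print(f"ANE only: {default_time:.1f} s, GPU+ANE: {fast_time:.1f} s")
```

Under that assumption, the one-hour file takes roughly 86 seconds with the default config and 50 seconds with GPU+ANE.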
