Problems running the `stream` example - [Start speaking] frozen #747

catdumitru · 2023-04-12T02:31:12Z

I'm having problems running the `stream` example on a Mac. No transcript is displayed in the console; instead, the output stays frozen at "[Start speaking]".

Below is the output of `make stream`:
sysctl: unknown oid 'hw.optional.arm64'
I whisper.cpp build info:
I UNAME_S: Darwin
I UNAME_P: i386
I UNAME_M: x86_64
I CFLAGS: -I. -O3 -DNDEBUG -std=c11 -fPIC -pthread -mf16c -mfma -mavx -mavx2 -DGGML_USE_ACCELERATE
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -pthread
I LDFLAGS: -framework Accelerate
I CC: Apple clang version 14.0.0 (clang-1400.0.29.202)
I CXX: Apple clang version 14.0.0 (clang-1400.0.29.202)

make: `stream' is up to date.

./stream -m ./models/ggml-base.en.bin -t 8 --step 500 --length 5000 -c 0
init: found 2 capture devices:
init: - Capture device #0: 'Built-in Microphone'
init: - Capture device #1: 'Microsoft Teams Audio'
init: attempt to open capture device 0 : 'Built-in Microphone' ...
init: obtained spec for input device (SDL Id = 2):
init: - sample rate: 16000
init: - format: 33056 (required: 33056)
init: - channels: 1 (required: 1)
init: - samples per frame: 1024
whisper_init_from_file_no_state: loading model from './models/ggml-base.en.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab = 51864
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 512
whisper_model_load: n_text_head = 8
whisper_model_load: n_text_layer = 6
whisper_model_load: n_mels = 80
whisper_model_load: f16 = 1
whisper_model_load: type = 2
whisper_model_load: mem required = 218.00 MB (+ 6.00 MB per decoder)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: model ctx = 140.60 MB
whisper_model_load: model size = 140.54 MB
whisper_init_state: kv self size = 5.25 MB
whisper_init_state: kv cross size = 17.58 MB

main: processing 8000 samples (step = 0.5 sec / len = 5.0 sec / keep = 0.2 sec), 8 threads, lang = en, task = transcribe, timestamps = 0 ...
main: n_new_line = 9, no_context = 1

[Start speaking]

ggerganov · 2023-04-14T17:20:12Z

Try increasing `--step`, for example to 2000 ms:

make clean && make stream
./stream -m ./models/ggml-base.en.bin -t 8 --step 2000 --length 10000 -c 0
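
For context on what `--step` changes: the log in the original post reports a 16000 Hz sample rate and "processing 8000 samples (step = 0.5 sec ...)", which matches 500 ms of audio. A minimal sketch of that arithmetic follows; the helper name `ms_to_samples` is hypothetical and is not taken from whisper.cpp.

```python
SAMPLE_RATE_HZ = 16000  # taken from the "sample rate: 16000" line in the log


def ms_to_samples(duration_ms: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Number of audio samples covered by duration_ms at sample_rate_hz."""
    return duration_ms * sample_rate_hz // 1000


if __name__ == "__main__":
    # Compare the step from the original post with the suggested one.
    for step_ms in (500, 2000):
        print(step_ms, "ms ->", ms_to_samples(step_ms), "samples")
```

With `--step 2000`, each processing step would cover 32000 samples instead of 8000.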

catdumitru · 2023-04-14T17:39:08Z

Unfortunately, I get the same result even when I increase the step size to 2000, 4000, or 8000.

kinory24 · 2023-12-15T15:38:23Z

@catdumitru did you manage to fix this? I'm having the same issue right now.

khromalabs · 2024-03-10T12:18:49Z

Hi, same issue here, in a Linux environment. I already verified that speech recognition via main works fine. stream freezes on my computer, and even pressing Ctrl+C won't shut it down. I tried the parameter to dump the captured audio and just get a blank wav file of around 900K, so I suspect the problem is in the audio initialization, maybe something related to the sdl2 library. BTW, the sdl2 version installed on my system is 2.30, and other sdl2-dependent tools like ffmpeg work fine. I'll keep digging into this.
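
One way to confirm that a dumped capture really is blank, rather than just very quiet, is to look at its peak sample value. The sketch below uses only the Python standard library and assumes the dump is a 16-bit PCM WAV file; that format is an assumption, not something stated in this thread, and `peak_amplitude` is a hypothetical helper name.

```python
import array
import sys
import wave


def peak_amplitude(path: str) -> int:
    """Return the largest absolute sample value in a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM samples")  # assumption about the dump
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    return max((abs(s) for s in samples), default=0)


if __name__ == "__main__" and len(sys.argv) > 1:
    peak = peak_amplitude(sys.argv[1])
    print("peak sample value:", peak)
    if peak == 0:
        print("file contains only silence")
```

A peak of 0 means every sample is zero, which would point at the capture path (device selection or audio initialization) rather than at the model.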

arosov · 2024-05-06T17:27:28Z

I had the same issue and opened libsdl-org/SDL#9706.
It seems to come from sdl2 >= 2.30.0.
In the meantime, consider downgrading sdl2 to 2.28.5.
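
To check whether an installed sdl2 falls in the range reported here, a version string can be compared against 2.30.0. This is a rough sketch: `is_affected` and `parse_version` are hypothetical helpers, the ">= 2.30.0" bound is only what this thread reports (no upper bound is known from it), and the version string is assumed to come from something like `sdl2-config --version` or your package manager.

```python
def parse_version(text: str) -> tuple:
    """Turn 'major.minor[.patch]' into a 3-tuple of ints, padding with zeros."""
    parts = [int(p) for p in text.strip().split(".")]
    parts += [0] * (3 - len(parts))  # "2.30" is treated as 2.30.0
    return tuple(parts[:3])


def is_affected(sdl_version: str) -> bool:
    """True if the version is in the range reported in this thread (>= 2.30.0)."""
    return parse_version(sdl_version) >= (2, 30, 0)


if __name__ == "__main__":
    for version in ("2.28.5", "2.30", "2.30.0"):
        print(version, "affected" if is_affected(version) else "not affected")
```

Comparing integer tuples avoids the string-comparison pitfall where "2.4.0" would sort after "2.30.0".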

ggerganov closed this as completed Apr 14, 2023
