Word-level timestamps not working with python implementation #910

Open
rkulyassa opened this issue Oct 29, 2024 · 0 comments

Comments

@rkulyassa

I am attempting to run whisperx with word-level timestamps, but despite passing the relevant option, the output is of the form {'segments': [ ... ], 'language': 'en'} with no word_segments.
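A quick way to confirm the symptom is to check the returned dict for word-level keys. This is a minimal sketch: `word_timestamp_keys` is a hypothetical helper, and the per-segment `words` key is an assumption about how aligned output is shaped, not something shown in the output above.

```python
def word_timestamp_keys(result):
    """Return the word-level keys that are present and non-empty in a result dict."""
    found = []
    # Top-level flat list of words, as written by the CLI's JSON output.
    if result.get("word_segments"):
        found.append("word_segments")
    # Assumption: aligned segments may also carry their own "words" list.
    if any(seg.get("words") for seg in result.get("segments", [])):
        found.append("segments[].words")
    return found
```

An empty list for the transcript returned by `model.transcribe` matches the behavior described here.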

I dug around a bit but could not find out why this is happening. I have confirmed that model.options.word_timestamps is True, so I believe it is an internal issue with model.transcribe; perhaps the options are not being passed through to faster-whisper properly.

My code:

import whisperx

# model_name, device, compute_type and video_path are defined elsewhere
model = whisperx.load_model(
    model_name,
    device=device,
    compute_type=compute_type,
    language="en",
    task="transcribe",
    asr_options={"word_timestamps": True},
)
print(model.options.word_timestamps)  # True
transcript = model.transcribe(video_path, language="en")  # doesn't include word-level timestamps

It should be noted that running via command line works fine:

whisperx \
    --model large-v2 \
    --compute_type int8 \
    --output_format json \
    --suppress_numerals \
    --task transcribe \
    --language en \
    "$input_file"

This properly includes word_segments in the JSON output.
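For comparison, the Python example in the whisperX README runs a separate alignment step after transcription (`whisperx.load_align_model` followed by `whisperx.align`), and the result of that step is what contains `word_segments`. The CLI appears to run alignment by default, which might explain the difference, but that is an assumption and not a confirmed diagnosis. The sketch below follows the README flow; `transcribe_with_word_timestamps` and its `wx` parameter are hypothetical names added here so the module can be swapped out.

```python
def transcribe_with_word_timestamps(
    audio_path,
    model_name="large-v2",
    device="cpu",
    compute_type="int8",
    language="en",
    wx=None,
):
    """Transcribe, then align, returning a dict that includes word_segments."""
    if wx is None:
        # Imported lazily so the function can be defined without whisperx installed.
        import whisperx as wx

    model = wx.load_model(
        model_name, device, compute_type=compute_type, language=language
    )
    audio = wx.load_audio(audio_path)

    # Step 1: segment-level transcription.
    result = model.transcribe(audio, language=language)

    # Step 2: forced alignment, which adds the word-level timing.
    align_model, metadata = wx.load_align_model(
        language_code=result["language"], device=device
    )
    aligned = wx.align(
        result["segments"],
        align_model,
        metadata,
        audio,
        device,
        return_char_alignments=False,
    )

    # Carry the detected language over so the output keeps the same top-level key.
    aligned["language"] = result["language"]
    return aligned
```

Calling `transcribe_with_word_timestamps(video_path)` in place of the bare `model.transcribe` call would be one way to test whether the missing alignment step accounts for the gap between the Python and CLI output.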
