transcribe_timestamped cutoff script and starts again #203

Open
CorrM opened this issue Aug 9, 2024 · 0 comments

The original audio: audio_voice.zip

["Reddit, You've had to apologize for something ridiculous.", 'What was it?', "I once apologized for accidentally convincing my best friend's grandma that I was a professional cage fighter and getting her to attend one of my (fake) matches.", 'She brought homemade chicken soup as a "good luck charm" and sat in the front row, cheering me on with her cane while yelling "You show \'em, young man!"', 'The whole crowd thought it was hilarious, but I had to apologize when she found out it was all an elaborate prank just for kicks.', '(ends abruptly)']

Here is my code:

    from typing import Any

    # Cached model so it is only loaded once per process.
    WHISPER_MODEL = None


    def audio_to_text(filename: str, model_size: str = "base") -> dict[str, Any]:
        """
        Converts an audio file to text using a pre-trained model.

        :param filename: The path to the audio file.
        :param model_size: The size of the model to use (default is "base").
        :return: A dict holding the full transcription under "text" and the timestamped segments under "segments".
        """
        from whisper_timestamped import load_model, transcribe_timestamped

        global WHISPER_MODEL
        if WHISPER_MODEL is None:
            WHISPER_MODEL = load_model(model_size)

        return transcribe_timestamped(WHISPER_MODEL, filename, verbose=False, fp16=False)

Here is what transcribe_timestamped(...)["text"] returns:

Reddit? You've had to apologize for something ridiculous. What was it? I once apologized for accidentally convincing my best friend's grandma that I was a professional cage fighter and getting her to attend one of my fake matches. She brought home a chicken soup as a good luck charm and sat in the front row, cheering me on with her came while yelling you show him young man. The whole crowd thought it was hilarious, but I had to apologize when she found out it was all 
Reddit? You've had to apologize for something ridiculous. What was it? I once apologized for accidentally convincing my best friend's grandma that I was a professional cage fighter and getting her to attend one of my fake matches. She brought home a chicken soup as a good luck charm and sat in the front row, cheering me on with her came while yelling you show him young man. The whole crowd thought it was hilarious, but I had to apologize when she found out it was all an elaborate prank just for kicks. Ends abruptly.

I added a line break to make it easier to see where the text cuts off and starts again.


When I do something like this:

text: str = "".join(x["text"] for x in whisper_analysis["segments"])

I get:

 Reddit? You've had to apologize for something ridiculous. What was it? I once apologized for accidentally convincing my best friend's grandma that I was a professional cage fighter and getting her to attend one of my fake matches. She brought home a chicken soup as a good luck charm and sat in the front row, cheering me on with her came while yelling you show him young man. The whole crowd thought it was hilarious, but I had to apologize when she found out it was all an elaborate prank just for kicks. Ends abruptly.

This is what transcribe_timestamped(...)["text"] should return.
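
As a stopgap, joining the segments can be wrapped in a small helper. This is only a sketch: text_from_segments and has_restart are hypothetical names, not part of whisper_timestamped, and it assumes the result is a dict with the "text" and "segments" keys used above. The strip() call is my own choice to drop the leading space seen in the joined output.

```python
from typing import Any


def text_from_segments(result: dict[str, Any]) -> str:
    """Rebuild the transcript from the per-segment texts instead of result["text"]."""
    # Whisper-style segment texts usually start with a space, so a plain join
    # followed by strip() gives a clean single-line transcript.
    return "".join(segment["text"] for segment in result["segments"]).strip()


def has_restart(result: dict[str, Any]) -> bool:
    """Return True when result["text"] does not match the joined segments."""
    return result["text"].strip() != text_from_segments(result)


if __name__ == "__main__":
    # Hand-written stand-in for a transcription result (not real model output).
    fake_result = {
        "text": " It was all It was all a prank.",
        "segments": [{"text": " It was all"}, {"text": " a prank."}],
    }
    print(text_from_segments(fake_result))
    print(has_restart(fake_result))
```

With a real result, text_from_segments(audio_to_text(filename)) would replace the direct use of ["text"], and has_restart can flag files that hit this problem.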

CorrM changed the title from "transcribe_timestamped duplicate text" to "transcribe_timestamped cutoff script and starts again" on Aug 9, 2024