verbose print catch UnicodeEncodeError #670

Closed
simon300000 wants to merge 1 commit

simon300000

I'm new to Python.
I sometimes get this error on Windows.
This PR is a workaround. Setting the output encoding explicitly may be a better solution, but I don't know whether a different encoding would make the terminal output unreadable.

Traceback (most recent call last):
  File "C:\Python39\Scripts\whisper-script.py", line 33, in <module>
    sys.exit(load_entry_point('whisper==1.0', 'console_scripts', 'whisper')())
  File "C:\Python39\lib\site-packages\whisper\transcribe.py", line 307, in cli
    result = transcribe(model, audio_path, temperature=temperature, **args)
  File "C:\Python39\lib\site-packages\whisper\transcribe.py", line 207, in transcribe
    add_segment(
  File "C:\Python39\lib\site-packages\whisper\transcribe.py", line 168, in add_segment
    print(f"[{format_timestamp(start)} --> {format_timestamp(end)}] {text}")
UnicodeEncodeError: 'gbk' codec can't encode character '\u266a' in position 33: illegal multibyte sequence
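
The failure and the kind of workaround the PR title describes can be reproduced without a Windows console. The following is only a minimal sketch, not the PR's actual diff: `safe_print` is a hypothetical helper, the in-memory `console` stands in for a terminal using the `gbk` code page, and skipping the line on failure is an assumption about what the handler does.

```python
import io


def safe_print(line, stream):
    """Write one line to stream; return False instead of raising if it cannot be encoded.

    Hypothetical helper: skipping the line on failure is an assumption,
    not necessarily what the PR does.
    """
    try:
        stream.write(line + "\n")
        return True
    except UnicodeEncodeError:
        return False


# Stand-in for a Windows terminal whose code page is gbk.
buffer = io.BytesIO()
console = io.TextIOWrapper(buffer, encoding="gbk", errors="strict", newline="\n")

ok_plain = safe_print("[00:00.000 --> 00:02.000] hello", console)
ok_note = safe_print("[00:02.000 --> 00:04.000] \u266a", console)  # the character from the traceback
console.flush()
```

With this approach the transcription keeps running, but any segment containing an unencodable character is lost from the verbose output.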

@simon300000 changed the title from "verbose catch text encode error" to "verbose print catch UnicodeEncodeError" on Dec 11, 2022
@simon300000
Author

Any update?

@jongwook
Collaborator

Thanks for reporting this! I've merged #859, which will replace any character that can't be encoded using the system encoding with an ?.
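
For readers who want to see what replacing unencodable characters with `?` looks like in Python, here is a rough sketch. It is not taken from #859: `make_printable` is a hypothetical name, and the encoding is passed in explicitly (here `gbk`, matching the traceback) rather than detected from the system.

```python
def make_printable(text, encoding):
    """Return text with every character the encoding cannot represent replaced by '?'.

    Hypothetical helper for illustration; the encoding would normally be
    whatever the terminal uses.
    """
    # errors="replace" makes the encoder emit '?' for unencodable characters;
    # decoding with the same encoding turns the bytes back into a str.
    return text.encode(encoding, errors="replace").decode(encoding)


line = "[00:02.000 --> 00:04.000] \u266a la la"
safe_line = make_printable(line, "gbk")     # the music note becomes '?'
same_line = make_printable(line, "utf-8")   # utf-8 can encode it, so nothing changes
```

Unlike catching the exception, this keeps the rest of the segment visible and only loses the characters the terminal could not have shown anyway.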

@jongwook jongwook closed this Jan 18, 2023