
[Bug fix] Tokenization multiprocessing fix #845

Merged · 1 commit · May 31, 2024

Conversation

wheresmyhair (Collaborator)

  1. Fix the tokenization multiprocessing bug.

  2. Disable passing a ConversationTemplate object directly to the model.tokenize() method. For customizing conversation templates, please refer to the documentation.

research4pan (Contributor) left a comment


LGTM 👍 The fix moves the tokenization function outside, avoiding shared arguments being passed through a closure, which is most likely the source of the multi-process bug.
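For context on the pattern the review describes, here is a minimal, generic sketch (not LMFlow's actual code; `build_tokenize_fn`, `tokenize_one`, and `tokenize_parallel` are illustrative names): a tokenization worker defined as a nested closure cannot be pickled by `multiprocessing`, while a module-level function that receives its arguments explicitly can.

```python
from multiprocessing import Pool

# Problematic pattern: the worker is a closure over shared arguments.
def build_tokenize_fn(vocab):
    def tokenize(text):  # nested function -> cannot be pickled for worker processes
        return [vocab.get(tok, 0) for tok in text.split()]
    return tokenize

# Fixed pattern: a module-level function that takes all state explicitly.
def tokenize_one(args):
    vocab, text = args
    return [vocab.get(tok, 0) for tok in text.split()]

def tokenize_parallel(vocab, texts, num_proc=2):
    # Module-level functions are picklable, so each worker receives its own
    # copy of the arguments instead of sharing them through a closure.
    with Pool(num_proc) as pool:
        return pool.map(tokenize_one, [(vocab, t) for t in texts])

if __name__ == "__main__":
    vocab = {"hello": 1, "world": 2}
    # Pool(...).map(build_tokenize_fn(vocab), texts) would raise a pickling
    # error; the module-level variant works as expected.
    print(tokenize_parallel(vocab, ["hello world", "hello there"]))
```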
