Issues: huggingface/tokenizers

Training a model from in-memory data
#198 by loicbarrault was closed Nov 28, 2020

Issues list

Serializing k-mer style pre-tokenizer
#1654 opened Oct 15, 2024 by millanp95
Inconsistent behaviour of PreTrainedTokenizerFasts on diacritics marked texts (label: bug)
#1663 opened Oct 11, 2024 by sven-nm
2 of 4 tasks
NormalizedString.clear() broken? (label: bug)
#1636 opened Sep 25, 2024 by lkurlandski
.NET bindings
#1615 opened Aug 16, 2024 by sappho192
RefMutContainer is unsound
#1612 opened Aug 13, 2024 by CheaterCodes
[test-infra] Enable Codecov for tokenizers
#1611 opened Aug 12, 2024 by hvaara