Pull requests: huggingface/tokenizers
Pull requests list
Add benchmark for deserializing large added vocab + optimizations
#1782
opened May 27, 2025 by
ArthurZucker
Draft
Update __init__.pyi: fix 525: SyntaxWarning: invalid escape sequence '\w'
#1764
opened Apr 18, 2025 by
wyattscarpenter
Implement from_bytes and read_bytes Methods in WordPiece Tokenizer for WebAssembly Compatibility
#1758
opened Mar 31, 2025 by
sondalex
Pre-tokenizers that support multi-word/non-whitespace BPE in single pass
#1753
opened Mar 22, 2025 by
mjbommar
Add FxHash and ShortStringOptimization.
#1733
opened Feb 10, 2025 by
MeetThePatel
3 of 4 tasks
[WIP] free speed/mem optimizations with ahash, dary_heap, and compact_str
#1618
opened Aug 21, 2024 by
mjbommar
[Feature] support Assign token to update the content of a token
Stale
#1570
opened Jul 12, 2024 by
ArthurZucker
Draft
1 task