Hello author, thank you for sharing the model. I asked you earlier about how to do pre-training. I found that after loading the model, the embedding layer size is 32128, but the loaded tokenizer's vocabulary size is 32228. The difference comes from the extra_0 to extra_100 sentinel tokens, which are required for pre-training. So how can I pre-train based on the model you shared, whose embedding size is 32128?
tokenizer's:
model's:
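For reference, a minimal sketch (not from this thread) of how to reproduce the size check described above with Hugging Face transformers. The checkpoint id `ClueAI/ChatYuan-large-v2` is an assumption; substitute whichever checkpoint you actually load.

```python
from transformers import T5Tokenizer, T5ForConditionalGeneration

checkpoint = "ClueAI/ChatYuan-large-v2"  # assumed checkpoint id; replace with the one you use

tokenizer = T5Tokenizer.from_pretrained(checkpoint)
model = T5ForConditionalGeneration.from_pretrained(checkpoint)

# len(tokenizer) counts the base vocabulary plus any <extra_id_*> sentinel tokens,
# while the embedding matrix has one row per id the checkpoint was saved with.
print("tokenizer size:", len(tokenizer))
print("embedding rows:", model.get_input_embeddings().weight.shape[0])
```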
This has been fixed; please reload the model.
Hello, thank you for your reply. I just tried loading ChatYuan V2. It looks like you set the number of extra_id tokens to 0 when loading the vocabulary, so the tokenizer's vocab_size decreased by 100. But T5 pre-training needs extra_0 to extra_100, right? Shouldn't the model's embedding layer instead be enlarged to 32228 to accommodate these 100 mask tokens, extra_0 to extra_100?
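A hedged sketch of the alternative this comment proposes: keep the sentinel tokens in the tokenizer and grow the model's embedding matrix to match. The checkpoint id is again an assumption, and the new rows are randomly initialized, so they only become useful after further span-corruption pre-training.

```python
from transformers import T5Tokenizer, T5ForConditionalGeneration

checkpoint = "ClueAI/ChatYuan-large-v2"  # assumed checkpoint id; replace with the one you use

# extra_ids=100 exposes the <extra_id_0>..<extra_id_99> sentinels that
# T5-style span-corruption pre-training uses to mark masked spans.
tokenizer = T5Tokenizer.from_pretrained(checkpoint, extra_ids=100)
model = T5ForConditionalGeneration.from_pretrained(checkpoint)

# Grow the (tied) input/output embeddings so every tokenizer id has a row.
# Newly added rows are randomly initialized and are learned during pre-training.
model.resize_token_embeddings(len(tokenizer))

assert model.get_input_embeddings().weight.shape[0] == len(tokenizer)
```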