Is the loss computation in UnigramTrainer correct? #628
When computing `logsum_alt`, the frequency of a removed piece is re-assigned to its alternatives (sentencepiece/src/unigram_model_trainer.cc, lines 389 to 394 in ba7e11a). But the code uses `alternatives.size()`, which, if I'm not mistaken, is always equal to `sentencepieces.size()`. Don't we want to multiply by the number of alternatives for this particular sentencepiece, i.e., `alternatives[i].size()`? @taku910?
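For reference, here is a minimal sketch of the computation the issue describes, assuming the names from the issue text (`sum` as the total frequency of all current pieces, `freq[i]` as the frequency of the piece considered for removal, and `alternatives[i]` as the pieces that would replace it); it is not a verbatim copy of the referenced lines:

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Log of the new total frequency after piece i is removed and its
// occurrences are re-segmented into its alternatives:
//   new_sum = sum - freq[i] + freq[i] * n_alt
//           = sum + freq[i] * (n_alt - 1)
double LogsumAlt(const std::vector<int64_t> &freq,
                 const std::vector<std::vector<int>> &alternatives,
                 int64_t sum, size_t i) {
  // Reported behavior: n_alt is taken as alternatives.size(), the number
  // of entries in the outer table, which equals the vocabulary size.
  // const int64_t n_alt = static_cast<int64_t>(alternatives.size());

  // Suggested correction from the issue: count only the alternatives
  // of piece i itself.
  const int64_t n_alt = static_cast<int64_t>(alternatives[i].size());

  return std::log(static_cast<double>(sum + freq[i] * (n_alt - 1)));
}
```

The distinction matters because `alternatives.size()` is on the order of the vocabulary size, so the reported version inflates the re-assigned frequency by thousands of copies of `freq[i]` rather than by the handful of pieces that actually replace piece i.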
I ran into the issue when training a Transformer-XL model on various configurations of WikiText-103 using sentencepiece.

I looked at the code, and @mbollmann seems to be correct that `alternatives.size()` is always equal to `sentencepieces.size()` here.

Sorry for the late response. Yes, the computation was incorrect. It is going to be fixed in the next release.