[MRG] Make WMD normalization optional #3073

Merged
merged 1 commit into from
Mar 14, 2021

piskvorky
Owner

@piskvorky piskvorky commented Mar 14, 2021

Follow-up from #3067:

This PR adds an optional norm parameter to KeyedVectors.wmdistance(). Users can now choose, per wmdistance call, whether the embedding vectors are normalized (norm=True, the default) or used as-is (norm=False).

Prior behaviour:

  • gensim < 4.0: no normalization; users had to normalize all vectors explicitly themselves (a destructive operation).
  • gensim == 4.0.0beta: vectors normalized dynamically on each call, i.e. norm=True hardwired.
  • gensim > 4.0.0beta (this PR): default norm=True, optional norm=False.
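
To illustrate the difference between the two settings, here is a minimal standard-library sketch. It is not gensim's implementation: `unit` and `ground_distance` are hypothetical helpers, and the code only shows the word-to-word Euclidean distance that a WMD computation would build on, with normalization applied to per-call copies rather than in place.

```python
import math


def unit(vec):
    """Return an L2-normalized copy of vec; the input list is not modified."""
    length = math.sqrt(sum(x * x for x in vec))
    if length == 0.0:
        # Assumption: leave an all-zero vector unchanged instead of dividing by zero.
        return list(vec)
    return [x / length for x in vec]


def ground_distance(vectors, word1, word2, norm=True):
    """Euclidean distance between two word vectors.

    With norm=True the vectors are unit-normalized for this call only,
    so the stored vectors keep their original magnitudes.
    """
    v1, v2 = vectors[word1], vectors[word2]
    if norm:
        v1, v2 = unit(v1), unit(v2)
    return math.dist(v1, v2)


# Toy vectors (made up for illustration): same direction, different length.
vectors = {"king": [3.0, 4.0], "queen": [6.0, 8.0]}

d_norm = ground_distance(vectors, "king", "queen", norm=True)
d_raw = ground_distance(vectors, "king", "queen", norm=False)
```

With norm=True the two toy vectors coincide after normalization, while with norm=False their difference in magnitude contributes to the distance. In both cases the stored vectors are left untouched, unlike the explicit in-place normalization needed before gensim 4.0.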

@piskvorky piskvorky requested a review from gojomo March 14, 2021 13:01
@piskvorky piskvorky added this to the 4.0.0 milestone Mar 14, 2021
@piskvorky piskvorky merged commit 1300929 into develop Mar 14, 2021
@piskvorky piskvorky deleted the wmd_norm branch March 14, 2021 13:22
@gojomo
Collaborator

gojomo commented Mar 15, 2021

Other than my line-comment highlighting some possibilities around the zero-case threshold & default-return, the core default-but-optional norming looks good!
