Did not multiply embedding weights by sqrt(d_model) #10

Open
orena1 opened this issue Jul 23, 2019 · 4 comments

Comments

@orena1

orena1 commented Jul 23, 2019

Hi,
In this line:

return self.embed(x)

I think you need to multiply the embedding by sqrt(d_model).

@fabrahman

@orena1 Hi, the implementation also didn't share the embedding weights, right?

@fabrahman

@orena1 The code actually has * math.sqrt(self.d_model) in the forward method of the positional embedding class.
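
For readers following the thread, the arrangement described above can be sketched as follows. This is a framework-free illustration, not the repository's code: the function names `embed` and `positional_encoder_forward`, the toy table, and the sizes are all assumptions. The lookup returns raw embedding rows, and the scaling by sqrt(d_model) happens in the positional-encoding step, before the sinusoidal terms are added.

```python
import math

def sinusoidal_encoding(max_len, d_model):
    """Sinusoidal position table (d_model is assumed to be even)."""
    pe = [[0.0] * d_model for _ in range(max_len)]
    for pos in range(max_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            pe[pos][i + 1] = math.cos(angle)
    return pe

def embed(table, token_ids):
    # Plain lookup with no scaling (hypothetical stand-in for `return self.embed(x)`).
    return [list(table[t]) for t in token_ids]

def positional_encoder_forward(x, pe, d_model):
    # Scale the embeddings by sqrt(d_model), then add the positional terms.
    scale = math.sqrt(d_model)
    return [[v * scale + pe[pos][j] for j, v in enumerate(row)]
            for pos, row in enumerate(x)]

d_model = 4
table = [[0.1] * d_model, [0.2] * d_model]  # toy 2-token vocabulary
pe = sinusoidal_encoding(max_len=8, d_model=d_model)
out = positional_encoder_forward(embed(table, [1, 0]), pe, d_model)
```

With this layout the scaling is applied exactly once, so adding a second multiplication in the embedding lookup would scale the embeddings by d_model overall.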

@zhangxixi0904

Does anybody know the reason for multiplying the embedding weights by sqrt(d_model)?
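
One commonly offered explanation, which nobody in this thread confirms, concerns relative magnitudes. A sinusoidal positional vector always has norm sqrt(d_model / 2), whereas an embedding row initialized with variance 1 / d_model has norm close to 1, so without scaling the positional signal would dominate the sum. The sketch below checks that arithmetic. The initialization scale is an assumption (other schemes, such as a unit-variance default, change the picture), and all names are hypothetical.

```python
import math
import random

d_model = 512
rng = random.Random(0)

# Assumed initialization: entries drawn from N(0, 1 / d_model).
emb = [rng.gauss(0.0, d_model ** -0.5) for _ in range(d_model)]

# Sinusoidal positional vector at an arbitrary position.
pos = 7
pe = []
for i in range(0, d_model, 2):
    angle = pos / (10000 ** (i / d_model))
    pe += [math.sin(angle), math.cos(angle)]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

emb_norm = norm(emb)                          # close to 1 under the assumed init
scaled_norm = emb_norm * math.sqrt(d_model)   # close to sqrt(d_model)
pe_norm = norm(pe)                            # sqrt(d_model / 2), up to rounding
```

Under these assumptions the unscaled embedding is much smaller than the positional vector, while the scaled one is somewhat larger.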

@wangzelin-em

> @orena1 Hi, the implementation also didn't share the embedding weights, right?

Yes, the implementation didn't share the embedding weights.
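
For context, weight sharing in the sense discussed here means reusing one matrix for the input embedding and the pre-softmax projection, as the original Transformer paper describes. A minimal framework-free sketch, with a hypothetical toy matrix and function names that are not taken from this repository:

```python
import math

d_model = 4
# One shared matrix of shape (vocab_size, d_model); toy values.
shared = [
    [0.1, 0.0, 0.2, 0.0],
    [0.0, 0.3, 0.0, 0.1],
    [0.2, 0.1, 0.0, 0.0],
]

def embed(token_id):
    # Input side: look up the row and scale it by sqrt(d_model).
    return [w * math.sqrt(d_model) for w in shared[token_id]]

def output_logits(hidden):
    # Output side: project onto the same rows (no separate output matrix).
    return [sum(w * h for w, h in zip(row, hidden)) for row in shared]

logits = output_logits(embed(1))
```

Because both sides read from `shared`, a training update to the matrix would affect the embedding and the output projection together. An implementation with two separate matrices, as reported in this thread, does not have that coupling.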
