
Code implementing the top-k functionality #2

Open · z972778371 opened this issue Mar 3, 2022 · 8 comments

@z972778371

Hello, I'd like to ask: is all the code implementing the top-k functionality concentrated in the SparseActivatedMultiheadAttention class in sparse_activated_multihead_attention.py?

@zhaoguangxiang (Collaborator)

Yes
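
For readers arriving here: the class name suggests the standard top-k attention trick of keeping only the k largest scores per query and masking the rest before softmax. A minimal, self-contained sketch of that idea (not the repository's actual code) looks like:

```python
import torch
import torch.nn.functional as F

def topk_attention(scores: torch.Tensor, k: int) -> torch.Tensor:
    """Keep only the k largest scores per query row; mask the rest to
    -inf so that, after softmax, only those k keys share the weight."""
    k = min(k, scores.size(-1))
    kth_value = scores.topk(k, dim=-1).values[..., -1:]  # k-th largest score
    masked = scores.masked_fill(scores < kth_value, float("-inf"))
    return F.softmax(masked, dim=-1)

# toy check: with k=2, only two of the five keys get non-zero weight
scores = torch.tensor([[2.0, 1.0, 0.5, -0.3, -1.2]])
print(topk_attention(scores, k=2))
```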

@z972778371 (Author)

> Yes

Hello, I have some questions about the top-k part of the code:

1. Many of the parameters in the code are unclear to me, for example self.onnx_trace, entmax, bmm_fp16_support, and cur_san_active. What are they for?
2. Line 260 of the code is attn_weights = self.apply_sparse_mask(attn_weights, tgt_len, src_len, bsz), but looking at the definition of apply_sparse_mask, it just returns attn_weights without doing anything. What is this step for? (See the hook sketch after this list.)
3. The entmax in the code uses tf. Can the PyTorch version from the original paper be used as a drop-in replacement for it?
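
On question 2: in fairseq's base MultiheadAttention, apply_sparse_mask is a deliberate no-op, an extension hook that subclasses override to inject sparsity; if the method here also just returns attn_weights, it is most likely the same inherited pattern. A sketch of that pattern (illustrative, not the repository's code):

```python
class BaseAttention:
    def apply_sparse_mask(self, attn_weights, tgt_len, src_len, bsz):
        # Extension hook: the base class deliberately does nothing, but the
        # call stays in forward() so subclasses can inject sparsity here.
        return attn_weights

class TopKAttention(BaseAttention):
    def __init__(self, k: int):
        self.k = k

    def apply_sparse_mask(self, attn_weights, tgt_len, src_len, bsz):
        # Override the hook: mask everything below the k-th largest score.
        k = min(self.k, src_len)
        kth = attn_weights.topk(k, dim=-1).values[..., -1:]
        return attn_weights.masked_fill(attn_weights < kth, float("-inf"))
```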

@zhaoguangxiang (Collaborator)

zhaoguangxiang commented Mar 6, 2022 via email

@z972778371 (Author)

Thank you very much for your reply.
I have a few more questions about your code; I would be very grateful if you could answer them ^_^
At the moment my model's attention mask only sets the padded positions of the text to -∞; I would like to introduce sparse attention on top of that and check whether it brings a further improvement.
PS: My code splits the attention computation, encoder, decoder, and transformer into four Python files, so I may need to call your code piece by piece.

1. What is the args parameter in your code? There are self.args = args, self.div = args.div, and self.lb = args.lb, and the boolean cur_san_active and self.entmax are also decided from args.use_att, so I would like to know how args is set.
2. The parameters self.div and self.lb determine the value of top_k. Are these two set by hand in args, or somewhere else? (A hypothetical reading is sketched after this list.)
3. Lines 297-312 of the code choose which normalization to apply based on self.entmax. The original paper proposes 1.5-entmax, so in actual runs, is args.entmax set to 2?
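
Purely as a guess at the semantics of div and lb (nothing in this thread confirms it): such names often mean "divisor" and "lower bound", i.e. attend to roughly src_len/div keys but never fewer than lb. A hypothetical sketch:

```python
def compute_top_k(src_len: int, div: int, lb: int) -> int:
    # HYPOTHETICAL reading: attend to about src_len/div keys, but never
    # fewer than lb and never more than src_len. Verify against the repo.
    return min(src_len, max(lb, src_len // div))
```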

@zhaoguangxiang (Collaborator)

1. args is set in fairseq/model/transformer
2. they are set by you
3. Yes
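
To unpack answer 1: fairseq models declare their command-line options in a static add_args(parser) method, and the parsed namespace is what arrives as args in the module constructors. The flag names below are illustrative assumptions based on the attribute names mentioned in this thread, not the repository's actual options:

```python
import argparse

def add_args(parser: argparse.ArgumentParser):
    # Illustrative flags only; check the repository's own add_args for the
    # real names. fairseq forwards the parsed namespace as `args`.
    parser.add_argument("--use-att", type=str, default="softmax",
                        help="attention normalization: softmax / entmax / top-k")
    parser.add_argument("--div", type=int, default=4,
                        help="divisor controlling the top-k budget")
    parser.add_argument("--lb", type=int, default=1,
                        help="lower bound on top-k")

parser = argparse.ArgumentParser()
add_args(parser)
args = parser.parse_args(["--use-att", "top-k", "--div", "8"])
print(args.use_att, args.div, args.lb)  # top-k 8 1
```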

@z972778371 (Author)

Hello, how should one choose between entmax15 and top-k?
In your sparse_activated_multihead_attention.py, entmax and top-k are mutually exclusive. From your testing experience, which situations is each suited to?
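
For intuition on the trade-off (assuming the PyPI entmax package, pip install entmax): entmax15 decides per query how many keys survive, so the sparsity pattern adapts to the score gaps, while top-k fixes the budget at exactly k keys. A toy comparison:

```python
import torch
from entmax import entmax15  # pip install entmax

scores = torch.tensor([[4.0, 3.5, 1.0, 0.2, -1.0]])

# entmax15: the number of surviving keys adapts to the score gaps,
# and low-scoring keys receive exactly zero probability
print(entmax15(scores, dim=-1))

# top-k (k=2): the budget is fixed at two keys regardless of the gaps
kth = scores.topk(2, dim=-1).values[..., -1:]
print(torch.softmax(scores.masked_fill(scores < kth, float("-inf")), dim=-1))
```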

@zhaoguangxiang (Collaborator)

zhaoguangxiang commented Mar 8, 2022 via email

@z972778371 (Author)

Thank you very much for your patient answer, which helps me a lot.
