
Add doc about attention_mask on gpt2 #16829

Merged (6 commits) Apr 19, 2022
Conversation

@wiio12 (Contributor) commented Apr 19, 2022

What does this PR do?

Fixes #16811.

If `past_key_values` is used, `attention_mask` needs to contain the masking strategy that was used for `past_key_values`. In other words, the `attention_mask` always has to have the length: `len(past_key_values) + len(input_ids)`.

I added the sentence above describing how `attention_mask` needs to be constructed when `past_key_values` is used. The sentence is added to both the PyTorch and TensorFlow versions of the code.
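
For illustration, here is a minimal sketch of incremental decoding with GPT-2 under this rule, assuming the standard `transformers` PyTorch API; the prompt and greedy decoding step are illustrative only and not part of the PR:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("Hello, my dog is", return_tensors="pt")

# First pass: no cache yet, so attention_mask covers only input_ids.
out = model(**inputs, use_cache=True)
past_key_values = out.past_key_values
next_token = out.logits[:, -1:].argmax(dim=-1)  # greedy pick, shape (1, 1)

# Second pass: feed only the new token, but extend attention_mask so its
# length equals the cached sequence length plus the new input length,
# i.e. len(past_key_values) + len(input_ids) in the docstring's wording.
attention_mask = torch.cat(
    [inputs["attention_mask"], torch.ones_like(next_token)], dim=-1
)
out = model(
    input_ids=next_token,
    attention_mask=attention_mask,
    past_key_values=past_key_values,
    use_cache=True,
)
```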

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

Add a simple sentence describing how `attention_mask` needs to be constructed when `past_key_values` is used.
@HuggingFaceDocBuilderDev commented Apr 19, 2022

The documentation is not available anymore as the PR was closed or merged.

@sgugger (Collaborator) left a comment


Thanks for your PR!

@patrickvonplaten merged commit 7481457 into huggingface:main Apr 19, 2022
elusenji pushed a commit to elusenji/transformers that referenced this pull request Jun 12, 2022
* Add doc about `attention_mask` on gpt2

Add a simple sentence describing how `attention_mask` needs to be constructed when `past_key_values` is used.

* Add doc about attention_mask on gpt2_tf

* clean up style

* remove empty line white spaces

* remove whitespace in empty line
Development

Successfully merging this pull request may close these issues.

Confusion about past_key_values and attention_mask in GPT2Attention