
Add doc about attention_mask on gpt2 #16829

Merged (6 commits) Apr 19, 2022
Conversation

@wiio12 (Contributor) commented Apr 19, 2022

What does this PR do?

Fixes #16811.

If `past_key_values` is used, `attention_mask` needs to contain the masking strategy that was used for `past_key_values`. In other words, the `attention_mask` always has to have the length: `len(past_key_values) + len(input_ids)`.

I added the sentence above describing how `attention_mask` needs to be constructed when `past_key_values` is used. The sentence is added to both the PyTorch and TensorFlow versions of the code.
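
For illustration, here is a minimal sketch of incremental decoding with GPT-2 under this rule, assuming the standard `transformers` PyTorch API; the prompt and greedy decoding step are illustrative only and not part of the PR:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("Hello, my dog is", return_tensors="pt")

# First pass: no cache yet, so attention_mask covers only input_ids.
out = model(**inputs, use_cache=True)
past_key_values = out.past_key_values
next_token = out.logits[:, -1:].argmax(dim=-1)  # greedy pick, shape (1, 1)

# Second pass: feed only the new token, but extend attention_mask so its
# length equals the cached sequence length plus the new input length,
# i.e. len(past_key_values) + len(input_ids) in the docstring's wording.
attention_mask = torch.cat(
    [inputs["attention_mask"], torch.ones_like(next_token)], dim=-1
)
out = model(
    input_ids=next_token,
    attention_mask=attention_mask,
    past_key_values=past_key_values,
    use_cache=True,
)
```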

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

Add a simple sentence describing how `attention_mask` needs to be constructed when `past_key_values` is used.
@HuggingFaceDocBuilderDev commented Apr 19, 2022

The documentation is not available anymore as the PR was closed or merged.

@sgugger (Collaborator) left a comment


Thanks for your PR!

@patrickvonplaten merged commit 7481457 into huggingface:main Apr 19, 2022
elusenji pushed a commit to elusenji/transformers that referenced this pull request Jun 12, 2022
* Add doc about `attention_mask` on gpt2

Add a simple sentence describing how `attention_mask` needs to be constructed when `past_key_values` is used.

* Add doc about attention_mask on gpt2_tf

* clean up style

* remove empty line white spaces

* remove whitespace in empty line
Development

Successfully merging this pull request may close these issues.

Confusion about past_key_values and attention_mask in GPT2Attention