The usage of past_key_values in AutoCompressorMixin #29

Open
RewindL opened this issue Oct 10, 2024 · 1 comment


RewindL commented Oct 10, 2024

Hello authors. There is no doubt that AutoCompressors is excellent work.

I have carefully read the core code in auto_compressor.py and noticed that the class AutoCompressorMixin contains quite a bit of code for processing past_key_values (including the softprompt in it). However, when I inspected the intermediate values of past_key_values in forward() during training, they always seemed to be None.

This confuses me. Is the processing of past_key_values redundant, kept only to match the standard CausalLM interface, or does it serve some other purpose? If it is not redundant, in what situation is past_key_values != None during the model's forward pass?

RewindL commented Oct 10, 2024

I re-read and tested the code again, and found that when model.generate() is used for inference, use_cache=True is chosen by default. In that case, model.generate() passes past_key_values back to model.forward() at each step while generating the output sequence token by token.

So, as far as I understand, past_key_values only serves inference with use_cache; it is not used during training or when just extracting the softprompt with model(input_ids, output_softprompt=True).softprompt.
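
To make this concrete, here is a toy, library-free sketch of the pattern I think I am seeing. It is not the AutoCompressors or transformers code: `ToyCausalLM`, `saw_cache`, and the next-token rule are all hypothetical and only mimic the shape of forward() and generate(). It records whether each forward() call received a cache.

```python
from typing import List, Optional, Tuple


class ToyCausalLM:
    """Hypothetical stand-in for a causal LM (assumption, not real library code)."""

    def __init__(self) -> None:
        # One entry per forward() call: True if a cache was passed in.
        self.saw_cache: List[bool] = []

    def forward(
        self,
        input_ids: List[int],
        past_key_values: Optional[List[int]] = None,
        use_cache: bool = False,
    ) -> Tuple[int, Optional[List[int]]]:
        self.saw_cache.append(past_key_values is not None)
        # The toy "cache" is simply the list of tokens already processed.
        history = (past_key_values or []) + input_ids
        next_token = (sum(history) + 1) % 50  # arbitrary deterministic rule
        return next_token, (history if use_cache else None)

    def generate(
        self,
        input_ids: List[int],
        max_new_tokens: int = 3,
        use_cache: bool = True,
    ) -> List[int]:
        tokens = list(input_ids)
        past: Optional[List[int]] = None
        for _ in range(max_new_tokens):
            # With a cache, only the newest token has to be fed in again.
            step_input = tokens[-1:] if past is not None else tokens
            next_token, past = self.forward(step_input, past, use_cache)
            tokens.append(next_token)
        return tokens


# A plain forward call (as in training) never receives a cache.
train_model = ToyCausalLM()
train_model.forward([1, 2, 3])

# generate() with use_cache=True passes the cache from the second step on.
gen_model = ToyCausalLM()
generated = gen_model.generate([1, 2, 3], max_new_tokens=3)
```

In this sketch `train_model.saw_cache` stays all False, while `gen_model.saw_cache` is False only for the first decoding step, which matches what I observed with the real model.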

However, I still hope the authors can review my understanding so that I do not misread the design of the code.

Thanks.
