Hello authors. There is no doubt that AutoCompressors is excellent work.
I have carefully read the main code in auto_compressor.py and noticed that the class AutoCompressorMixin contains quite a lot of code for processing past_key_values (including the softprompt stored in it). However, when I inspected past_key_values inside forward() during training, it always appeared to be None.
This confuses me. Is the past_key_values handling there only to match the standard CausalLM interface, or does it serve another purpose? If it is actually used, in what situation is past_key_values != None during a forward pass?
I re-read the code and ran further tests, and found that model.generate() uses use_cache=True by default at inference time. In that case, model.generate() passes the past_key_values returned by each step back into model.forward() at the next step while it generates the output sequence token by token.
So, as far as I understand, past_key_values only serves inference with use_cache enabled. It is not used during training, nor when only extracting the softprompt with model(input_ids, output_softprompt=True).softprompt.
However, I would still appreciate it if the authors could review this, so that I do not misunderstand the design of the code.
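To make my understanding concrete, here is a minimal toy sketch of the calling pattern. It is not the AutoCompressors code and does not use transformers: ToyCausalLM, its forward() and generate() methods, and the seen_cache_flags attribute are hypothetical stand-ins, and the "cache" is just a list of token ids. It only illustrates why a single forward call sees past_key_values=None while a cached generation loop sees a non-None value from the second step onward.

```python
from typing import List, Optional, Tuple


class ToyCausalLM:
    """Hypothetical stand-in for a causal LM (assumption, not the real model)."""

    def __init__(self) -> None:
        # Records, for each forward() call, whether a cache was passed in.
        self.seen_cache_flags: List[bool] = []

    def forward(
        self,
        input_ids: List[int],
        past_key_values: Optional[List[int]] = None,
        use_cache: bool = False,
    ) -> Tuple[int, Optional[List[int]]]:
        self.seen_cache_flags.append(past_key_values is not None)
        # The toy "cache" holds every token id processed so far.
        cache = list(past_key_values) if past_key_values is not None else []
        cache.extend(input_ids)
        # Toy "prediction": depends on the whole context, like attention would.
        next_token = (sum(cache) + 1) % 10
        return next_token, (cache if use_cache else None)

    def generate(
        self, input_ids: List[int], max_new_tokens: int = 3, use_cache: bool = True
    ) -> List[int]:
        output = list(input_ids)
        past: Optional[List[int]] = None
        step_input = list(input_ids)
        for _ in range(max_new_tokens):
            next_token, past = self.forward(
                step_input, past_key_values=past, use_cache=use_cache
            )
            output.append(next_token)
            # With a cache, only the new token is fed in; otherwise the full sequence.
            step_input = [next_token] if use_cache else list(output)
        return output


if __name__ == "__main__":
    model = ToyCausalLM()
    model.forward([1, 2, 3])  # training-style call: no cache is passed
    print(model.seen_cache_flags)

    model.seen_cache_flags.clear()
    model.generate([1, 2, 3], max_new_tokens=3, use_cache=True)
    print(model.seen_cache_flags)
```

In this sketch the training-style call never receives a cache, while the generation loop receives one on every step after the first, which matches what I observed.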