About limitations on the task #2

Open
WuTianming opened this issue Aug 28, 2023 · 0 comments
Comments

@WuTianming

Hi, and thank you for your great work!

I was wondering whether the early-exit techniques introduced in the paper can be extended to language modeling, or whether they only apply to classification tasks. I think the main differences are that (1) language modeling has a much larger answer space, with a vocabulary of tens of thousands of tokens, and (2) language models usually output a probability distribution that is then sampled from. Could it be that the conservative predictions are no longer strong enough when there are so many possible sampling outcomes?
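
To make the question concrete, here is a rough sketch (my own, not from the CATs code) of the kind of naive confidence-thresholded exit I have in mind for next-token prediction; `hidden_states`, `lm_head`, and the threshold are just placeholder names:

```python
import torch

# Hypothetical sketch of confidence-based early exit for next-token prediction.
# `hidden_states` is assumed to be a list of per-layer hidden states from a
# decoder LM (shape [batch, seq, dim]) and `lm_head` the output projection to
# the vocabulary; batch size 1 is assumed for the .item() call.
def early_exit_next_token(hidden_states, lm_head, threshold=0.9):
    top_id = None
    for layer_idx, h in enumerate(hidden_states):
        logits = lm_head(h[:, -1, :])          # project the last position onto the vocab
        probs = torch.softmax(logits, dim=-1)  # distribution over tens of thousands of tokens
        top_p, top_id = probs.max(dim=-1)
        if top_p.item() >= threshold:          # confident enough -> exit at this layer
            return top_id, layer_idx
    return top_id, len(hidden_states) - 1      # otherwise fall back to the final layer
```

My intuition is that with a vocabulary this large, the max probability at intermediate layers rarely clears a fixed threshold, so a naive rule like this would almost never exit early.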

I see that you have a later work (CALM) that addresses language models by enforcing the early-exit objective during training, but I think the approach used in CATs is more desirable because it is distribution-free and model-agnostic.

Thank you for your time!
