
Add a GPT-2 training example #19

Open
bwdGitHub opened this issue Dec 10, 2021 · 2 comments

@bwdGitHub (Collaborator)

We would like to use these issues to gauge user interest.

The GPT-2 implementation can be used for further language-model training, but there is currently no example demonstrating this, either in this repo or elsewhere.

Making this feasible on a typical consumer GPU will likely require some technique to reduce the amount of GPU memory needed for training. There are several options (see the sketch after this list):

  1. Add support for a smaller GPT-2 model.
  2. Train only a subset of the GPT-2 parameters.
  3. Use gradient accumulation.
  4. Use gradient checkpointing.
  5. Use reduced-precision gradients.
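As a rough illustration of how options 2, 3, and 5 could combine, here is a minimal PyTorch-style sketch (not this repo's API). It assumes a `model` that maps token ids to next-token logits, a hypothetical `model.head` attribute naming the trainable subset, and a standard `dataloader` yielding `(inputs, targets)` batches; all of these names are placeholders:

```python
import torch
import torch.nn.functional as F
from torch.cuda.amp import GradScaler, autocast

def finetune(model, dataloader, accumulation_steps=8, lr=5e-5):
    # Option 2: freeze everything, then unfreeze a small subset.
    # `model.head` is a hypothetical name for the output projection.
    for p in model.parameters():
        p.requires_grad = False
    for p in model.head.parameters():
        p.requires_grad = True

    trainable = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.AdamW(trainable, lr=lr)

    # Option 5: mixed precision; the scaler keeps fp16 gradients
    # from underflowing.
    scaler = GradScaler()

    model.train()
    optimizer.zero_grad()
    for step, (inputs, targets) in enumerate(dataloader):
        with autocast():
            # Assumed: model returns (batch, seq, vocab) logits.
            logits = model(inputs)
            loss = F.cross_entropy(
                logits.view(-1, logits.size(-1)), targets.view(-1))

        # Option 3: scale the loss so gradients accumulated over
        # `accumulation_steps` micro-batches match one large batch.
        scaler.scale(loss / accumulation_steps).backward()

        if (step + 1) % accumulation_steps == 0:
            scaler.step(optimizer)
            scaler.update()
            optimizer.zero_grad()
```

Gradient checkpointing (option 4) could be layered on top by wrapping each transformer block in `torch.utils.checkpoint.checkpoint`, trading recomputation in the backward pass for a smaller set of stored activations.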
bwdGitHub added the enhancement (New feature or request) label on Dec 10, 2021
@misataguchi (Contributor)

I have received one inquiry about fine-tuning GPT-2.

@qwer1304

I second the request to be able to fine-tune GPT-2.
