[Unified Checkpoint] Fix generation config save #9223

Merged
7 changes: 7 additions & 0 deletions paddlenlp/trainer/plugins/unified_checkpoint.py
@@ -360,6 +360,9 @@
         config_to_save.architectures = [model_to_save.__class__.__name__]
         if self.args.should_save:
             config_to_save.save_pretrained(save_directory)
+            # save generation config
+            if model_to_save.can_generate():
+                model_to_save.generation_config.save_pretrained(save_directory)

Codecov / codecov/patch: check warning on line 365 in paddlenlp/trainer/plugins/unified_checkpoint.py. Added lines #L364 - L365 were not covered by tests.

         paddle.device.cuda.empty_cache()

         if strtobool(os.getenv("FLAG_LLM_PDC", "False")) and self.args.should_save:
@@ -667,6 +670,10 @@
         config_to_save.architectures = [model_to_save.__class__.__name__]
         config_to_save.save_pretrained(output_dir)

+        # save generation config
+        if model_to_save.can_generate():
+            model_to_save.generation_config.save_pretrained(output_dir)
+

Codecov / codecov/patch: check warning on line 675 in paddlenlp/trainer/plugins/unified_checkpoint.py. Added lines #L674 - L675 were not covered by tests.

     def save_single_card_optimizer(self, model, optimizer, output_dir):
         """ "Save optimizer for non-distributed environment."""
         # Split into optimizer params and master weights.
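Both Codecov warnings point at the new `can_generate()` guard, so a test would need to exercise the branch with a generative and a non-generative model. The following is only a minimal sketch of that idea, not a test from this PR: `FakeModel`, `FakeGenerationConfig`, `save_generation_config`, and `generation_config_written` are hypothetical stand-ins, and the `generation_config.json` file name is an assumption. A real test would presumably go through the PaddleNLP trainer and model classes instead.

```python
import json
import os
import tempfile


class FakeGenerationConfig:
    """Hypothetical stand-in for a generation config (not the PaddleNLP class)."""

    def __init__(self, **kwargs):
        self.values = dict(kwargs)

    def save_pretrained(self, save_directory):
        # Assumed file name, used here only so the test has something to look for.
        os.makedirs(save_directory, exist_ok=True)
        path = os.path.join(save_directory, "generation_config.json")
        with open(path, "w", encoding="utf-8") as f:
            json.dump(self.values, f)


class FakeModel:
    """Hypothetical model exposing only the two attributes the diff relies on."""

    def __init__(self, generative):
        self._generative = generative
        self.generation_config = FakeGenerationConfig(max_length=32)

    def can_generate(self):
        return self._generative


def save_generation_config(model_to_save, save_directory):
    """Same guarded save as the lines added in the diff, isolated for testing."""
    if model_to_save.can_generate():
        model_to_save.generation_config.save_pretrained(save_directory)


def generation_config_written(generative):
    """Run the guarded save in a temp dir and report whether a file appeared."""
    with tempfile.TemporaryDirectory() as tmp:
        save_generation_config(FakeModel(generative), tmp)
        return os.path.exists(os.path.join(tmp, "generation_config.json"))
```

Calling `generation_config_written(True)` and `generation_config_written(False)` covers both sides of the guard. The same pair of cases would apply to the `save_directory` path in the first hunk and the `output_dir` path in the second.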