Refactor tool of creating pretrain dataset #9454

Merged (1 commit) on Nov 19, 2024

Conversation

gongel
Member

@gongel gongel commented Nov 19, 2024

PR types

Function optimization

PR changes

Others

Description

  • Refactor the tool for creating the pretraining dataset to use AutoTokenizer


paddle-bot bot commented Nov 19, 2024

Thanks for your contribution!


codecov bot commented Nov 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 52.92%. Comparing base (7072406) to head (17e2655).
Report is 1 commit behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #9454      +/-   ##
===========================================
+ Coverage    52.82%   52.92%   +0.10%     
===========================================
  Files          677      677              
  Lines       107941   107941              
===========================================
+ Hits         57018    57127     +109     
+ Misses       50923    50814     -109     


@gongel gongel merged commit 619c1b9 into PaddlePaddle:develop Nov 19, 2024
10 of 12 checks passed
2 participants