adding packed bert from optimum-main #71
Draft
Adds the PackedBERT notebooks/, models/ and utils/ folders to Paperspace from Hugging Face Optimum.
------ copied from the original PR description for Hugging Face Optimum Graphcore (now merged):
Contents:
Simplified notebooks for all three supported Packed BERT tasks for easy implementation
All of the utils and model heads that the notebooks import: preprocessing, postprocessing and model changes
Notes: For the time being, models/ and utils/ live in this folder, which goes into notebooks/. Ideally the utils would move into optimum/graphcore/ so they can be imported with the package, the classes in models/modeling_bert_packed.py would become options within the default modeling_bert, and packing would be enabled through the AutoConfig (this needs some tweaks, but nothing extensive). This also gives us a structure for adding future packing tasks and notebooks.
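For reviewers unfamiliar with packing, here is a minimal sketch of the idea behind the preprocessing step: several short sequences are grouped so that they share one input of length `max_seq_len`. It uses a plain first-fit-decreasing heuristic and is not the algorithm shipped in utils/. The function name `pack_sequences` and the `max_per_pack` parameter are hypothetical and chosen for illustration only.

```python
from typing import List


def pack_sequences(
    lengths: List[int], max_seq_len: int, max_per_pack: int = 3
) -> List[List[int]]:
    """Group sequence indices into packs using first-fit decreasing.

    Illustrative only: `pack_sequences` and `max_per_pack` are hypothetical
    names, not part of the utils added in this PR.
    """
    # Visit the longest sequences first so that short ones fill the gaps.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
    packs: List[List[int]] = []  # indices of the sequences in each pack
    remaining: List[int] = []    # free token slots left in each pack

    for idx in order:
        n = lengths[idx]
        if n > max_seq_len:
            raise ValueError(f"sequence {idx} is longer than max_seq_len")
        # Place the sequence in the first pack that still has room for it.
        for p, space in enumerate(remaining):
            if n <= space and len(packs[p]) < max_per_pack:
                packs[p].append(idx)
                remaining[p] -= n
                break
        else:
            # No existing pack fits, so open a new one.
            packs.append([idx])
            remaining.append(max_seq_len - n)
    return packs


example_packs = pack_sequences([100, 300, 200, 384, 84], max_seq_len=384)
print(example_packs)
```

In this example five sequences fit into three packs, so the model processes three inputs instead of five padded ones. A real implementation also has to adjust attention masks and position IDs so that sequences in the same pack do not attend to each other, and has to unpack the outputs in postprocessing.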
Fixes
I've removed the model classes and the packing algorithm/dataset creation utils from the notebooks. As noted, they were too complex and too large to work as notebook code blocks, needed too much explanation, and would be hard to maintain here. These notebooks are intended to briefly explain the differences between unpacked and packed at each stage and to let users implement packing easily with the importable methods.
A more in-depth explanation of the packing, preprocessing, postprocessing and model change process will get its own notebooks/blog in future, so these notebooks don't need to cover it.
I've used the environment variables for the pod type and the executable directory.
I've rewritten most of these notebooks to be less detailed and complex and to use more active language. Some of the text is copied from existing notebooks for the same (unpacked) tasks. Happy to change anything.
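As an illustration of the environment-variable item in the list above, a notebook cell could resolve the pod type and executable directory roughly as follows. The variable names `GRAPHCORE_POD_TYPE` and `POPLAR_EXECUTABLE_CACHE_DIR`, the fallback values, and the `packed_bert` subfolder are assumptions made for this sketch; the notebooks in this PR may use different names or defaults.

```python
import os
from typing import Mapping, Tuple


def resolve_runtime_settings(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (pod_type, executable_cache_dir) from environment variables.

    The variable names, the fallback values and the "packed_bert" subfolder
    are assumptions for illustration, not values confirmed by this PR.
    """
    pod_type = env.get("GRAPHCORE_POD_TYPE", "pod4")
    cache_root = env.get("POPLAR_EXECUTABLE_CACHE_DIR", "/tmp/exe_cache")
    # Give these notebooks their own subfolder inside the shared cache.
    executable_cache_dir = os.path.join(cache_root, "packed_bert")
    return pod_type, executable_cache_dir


pod_type, executable_cache_dir = resolve_runtime_settings()
print(pod_type, executable_cache_dir)
```

Reading the settings through one helper with explicit fallbacks keeps the notebooks runnable both on Paperspace, where the variables are expected to be set, and on a machine where they are not.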