[Feature][Config] Add Tulu3/Olmo2 model configs #1407

Open
wizeng23 opened this issue Feb 7, 2025 · 3 comments
Labels
enhancement (New feature or request), Feature, good first issue (Good for newcomers)

Comments

@wizeng23
Contributor

wizeng23 commented Feb 7, 2025

Feature request

Tulu 3 and OLMo 2 are recently released open-source models from AI2. We should have configs for training, evaluation, and inference with them. See the existing configs under configs/recipes, for example those for Llama 3.1. Also see #1361 for a related feature that adds Tulu3 dataset support.

Motivation / references

List of models we'd like to add:
https://huggingface.co/allenai/Llama-3.1-Tulu-3-8B
https://huggingface.co/allenai/Llama-3.1-Tulu-3.1-8B
https://huggingface.co/allenai/Llama-3.1-Tulu-3-70B
https://huggingface.co/allenai/Llama-3.1-Tulu-3-405B

https://huggingface.co/allenai/OLMo-2-1124-7B-Instruct
https://huggingface.co/allenai/OLMo-2-1124-13B-Instruct
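
As a rough way to scope the work, the sketch below enumerates the models listed above and prints one candidate config path per model and task. The model IDs are taken from the links above; the directory slugs, the `train.yaml`/`eval.yaml`/`infer.yaml` file names, and the `planned_config_paths` helper are hypothetical and only illustrate one possible layout under configs/recipes, not the project's actual convention.

```python
from pathlib import PurePosixPath

# Hugging Face model IDs copied from the list in this issue.
MODEL_IDS = [
    "allenai/Llama-3.1-Tulu-3-8B",
    "allenai/Llama-3.1-Tulu-3.1-8B",
    "allenai/Llama-3.1-Tulu-3-70B",
    "allenai/Llama-3.1-Tulu-3-405B",
    "allenai/OLMo-2-1124-7B-Instruct",
    "allenai/OLMo-2-1124-13B-Instruct",
]

# Hypothetical task names; the issue only says training/evaluation/inference.
TASKS = ("train", "eval", "infer")


def planned_config_paths(model_id: str, root: str = "configs/recipes") -> list[str]:
    """Return candidate config paths for one model (hypothetical layout)."""
    # Drop the "allenai/" organization prefix and build a filesystem-friendly slug.
    name = model_id.split("/", 1)[1]
    slug = name.lower().replace(".", "_").replace("-", "_")
    return [str(PurePosixPath(root) / slug / f"{task}.yaml") for task in TASKS]


if __name__ == "__main__":
    for model_id in MODEL_IDS:
        for path in planned_config_paths(model_id):
            print(f"{model_id} -> {path}")
```

This only produces a checklist of files to create; the contents of each config would still need to follow the existing Llama 3.1 recipes.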

Your contribution

If somebody can volunteer to start this work, I can answer questions and help with testing.

@wizeng23 added the enhancement (New feature or request), Feature, good first issue (Good for newcomers), and triage (This issue needs review by the core team.) labels Feb 7, 2025
@wizeng23 wizeng23 changed the title [Feature] Add Tulu3/Olmo2 models [Feature][Config] Add Tulu3/Olmo2 models Feb 7, 2025
@wizeng23 wizeng23 changed the title [Feature][Config] Add Tulu3/Olmo2 models [Feature][Config] Add Tulu3/Olmo2 model configs Feb 7, 2025
@taenin taenin removed the triage This issue needs review by the core team. label Feb 12, 2025
@bwalshe
Contributor

bwalshe commented Feb 12, 2025

I opened the pull request that added the dataset support. I didn't include proper configs for training the models because training the full model would take about 50 hours on 64 GPUs, and I don't think I have the resources to do that myself.

There are details on how to reproduce the Tulu 3 models here, and I could make configs based on them, but I won't be able to actually build and evaluate the models properly.

@wizeng23
Contributor Author

Yep, I saw that change, thanks for making it! This is a separate feature request to add configs for the models in Oumi, similar to the configs we have for Llama. They would be useful if users want to fine-tune these models further, evaluate them, run inference on them, etc.

@bwalshe
Contributor

bwalshe commented Feb 12, 2025

Sure. I am just saying that the configs from the original project are there.
