
OLMo 2 (WIP) #1897

Open

wants to merge 10 commits into base: main
Conversation

ysjprojects (Contributor)

https://huggingface.co/collections/allenai/olmo-2-674117b93ab84e98afc72edc
https://arxiv.org/abs/2501.00656

Version 2 of OLMo, released by Ai2.

Comes in 7B and 13B sizes, with Base and Instruct variants as well as additional SFT and DPO checkpoints.

First, we find that OLMo 2 7B and 13B are the best fully-open models to-date, often outperforming open weight models of equivalent size. Not only do we observe a dramatic improvement in performance across all tasks compared to our earlier OLMo 0424 model but, notably, OLMo 2 7B outperforms LLama-3.1 8B and OLMo 2 13B outperforms Qwen 2.5 7B despite its lower total training FLOPs. The OLMo 2 models sit at the Pareto frontier of training FLOPs vs model average performance (see figure above).
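For reference while the LitGPT integration is still in progress, here is a minimal sketch of loading one of the OLMo 2 checkpoints from the linked collection with Hugging Face transformers. The checkpoint id `allenai/OLMo-2-1124-7B` is taken from the collection above; the prompt and generation settings are purely illustrative assumptions, not part of this PR.

```python
# Minimal sketch (not part of this PR): load an OLMo 2 checkpoint via
# Hugging Face transformers. Checkpoint id comes from the linked collection;
# generation settings below are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-2-1124-7B"  # 13B variant: "allenai/OLMo-2-1124-13B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

inputs = tokenizer("Language modeling is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```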

ysjprojects changed the title from "OLMo 2" to "OLMo 2 (WIP)" on Jan 4, 2025
rasbt (Collaborator) commented Jan 8, 2025

Hi there,
just wanted to say thanks for taking on this PR (I know this is a lot of work)! The OLMo models are awesome, and it'd be great to have OLMo 2 in LitGPT.

ysjprojects (Contributor, Author)

Hi there, just wanted to say thanks for taking on this PR (I know this is a lot of work)! The OLMo models are awesome, and it'd be great to have OLMo 2 in LitGPT.

Thanks mate!

Currently on vacation, will resume working on this PR once I'm back.
