minOLMo is a fork of the original OLMo model, with the primary goal of removing extra complexity and distributed training capabilities. The result is a large language model with a simplified codebase that is easy to understand and follow. It is designed to be run and explored by researchers on a single GPU, making it accessible to those who want to delve into the workings of large language models without extensive computational resources.
- Apr 5, 2024: David Brandfonbrener forked the original OLMo repository to create the min-olmo repository. David removed the distributed training capabilities.
- Apr 22, 2024: The Kempner Institute forked David's min-olmo repository to create the KempnerInstitute/min-olmo repository. Any code after this date is from Kempner Institute-affiliated contributors.
The project includes two main categories of files and directories:

- `minolmo` python package: the source code for the model.
- Scripts, configs, and other helper files to run the model.
In the following, we provide a brief description of the main directories and files in the project:
- `configs/`: Configuration files for the model, including the model parameters, input data, and other parameters needed to train the model.
- `docs/`: Documentation for the project.
  - Simple documentation can be added as a single markdown file.
  - The PDF technical report is located in the `docs/technical_report` directory.
  - Documentation that may need extra files (e.g., images) can be added in a separate directory.
- `minolmo/`: The source code for the model.
- `scripts/`: Scripts to run the model.
- `notebooks/`: Notebooks to explore the model.
- `tests/`: Tests for the model.
- `CHANGELOG.md`: Changes made to the project.
- `README.md`: The current file.
- `LICENSE`: The license file for the project.
- `pyproject.toml`: The configuration file for building the python package.
To install the package:

- Step 1: Clone the repository
  - For developers:

    ```shell
    git clone git@github.com:KempnerInstitute/minOLMo.git
    ```

  - For users:

    ```shell
    git clone https://github.com/KempnerInstitute/minOLMo.git
    ```

- Step 2: Create a conda environment
  - Please visit: Setting up a conda environment
- Step 3: Load modules

  ```shell
  module load python/3.12.5-fasrc01
  module load cuda/12.4.1-fasrc01
  module load cudnn/8.9.2.26_cuda12-fasrc01
  ```

- Step 4: Install the package

  ```shell
  pip install -r requirements.txt
  pip install -e .
  ```
To run the model, you can use the provided scripts in the `scripts` directory. Before running the model, you need the following:

- Binary numpy files for the training data.
- Binary numpy files for the validation data.
- Your Weights and Biases (W&B) entity for logging the training process (in case you want to use it).

The training command takes:

- A run name.
- A save folder.
- A configuration file.
- The path to the input training data folder.
- The path to the input validation data folder.
- The W&B entity.
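The binary numpy data files can be prepared and inspected with plain NumPy. The snippet below is a hypothetical illustration: the exact dtype and file layout are assumptions (OLMo-style loaders commonly store token IDs as a flat binary array, here `uint16`, that can be memory-mapped during training), so check your configuration file for the dtype it expects.

```python
import numpy as np

# Hypothetical example: write a small array of fake token IDs as a raw
# binary file, the flat layout OLMo-style data loaders commonly expect.
token_ids = np.array([101, 7592, 2088, 102], dtype=np.uint16)
token_ids.tofile("train_tokens_part_0.npy")

# Read it back with a memmap so training never loads the whole file into RAM.
data = np.memmap("train_tokens_part_0.npy", dtype=np.uint16, mode="r")
print(data.tolist())  # [101, 7592, 2088, 102]
```

Memory-mapping is what makes single-GPU exploration practical here: the training loop can index into arbitrarily large token files without paging them fully into memory.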
After you have all the necessary files and configurations, you can run the model. First, allocate a compute node:

```shell
salloc -p kempner_h100 --account=[your account] --nodes=1 --ntasks=1 --cpus-per-task=24 --mem=375G --gres=gpu:1 -t 00-12:00:00
```
Then you can run the model:

```shell
python scripts/train.py configs/base-c4-t5.yaml --run_name=olmo --save_folder=save_folder
```
For submitting a batch job, you can use the `run_single_gpu.sh` script in the `scripts` directory. The script submits a batch job to the SLURM scheduler; you can modify it based on your requirements.

```shell
sbatch scripts/run_single_gpu.sh
```
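A batch script for this setup would resemble the sketch below. This is a template mirroring the `salloc` flags above, not the actual contents of `run_single_gpu.sh`; the partition, account name, and module versions are assumptions you should adapt to your own allocation.

```shell
#!/bin/bash
# Hypothetical single-GPU SLURM batch script; partition, account, and
# resource limits are assumptions -- adjust them for your cluster.
#SBATCH --job-name=minolmo
#SBATCH --partition=kempner_h100
#SBATCH --account=your_account
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=24
#SBATCH --mem=375G
#SBATCH --gres=gpu:1
#SBATCH --time=00-12:00:00

# Load the same toolchain used for the interactive run.
module load python/3.12.5-fasrc01
module load cuda/12.4.1-fasrc01
module load cudnn/8.9.2.26_cuda12-fasrc01

python scripts/train.py configs/base-c4-t5.yaml --run_name=olmo --save_folder=save_folder
```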