
Missing file and a question regarding DPO training #2

Open
SachinVashisth opened this issue Jan 17, 2025 · 2 comments

Comments

@SachinVashisth

Hi

Missing File
I am trying to run Step 2.3, i.e., sampling pseudo-labels from the DPO model, using the commands:

ARGS='+data.split="train" eval.mode="sampling" eval.sampling.max_seed=3'
torchrun --nproc_per_node 2 greedy_decode.py --config-name=dpo-1 $ARGS
python3 eval_sampling.py --config-name=dpo-1 $ARGS
python3 utils/make_rft_data.py --config-name=dpo-1

But the file greedy_decode.py seems to be missing. Could you please provide it?

Regarding DPO training
In the paper, it is mentioned that training was done on a single NVIDIA A40 GPU.
I am currently working on a remote server with two NVIDIA A40 GPUs (48 GB of CUDA memory each).

But when I ran the commands given in Step 2.2 (Train SFT model with DPO objective), I received an out-of-memory error. I was only able to train after changing the following variables in dpo-1.yaml:

per_device_train_batch_size: 3
per_device_eval_batch_size: 2

eval:
  per_device_eval_batch_size: 48

However, I want to note that the versions of the trl and transformers libraries specified in the requirements file did not work for me. These versions worked instead:

trl==0.13.0
transformers==4.46.0
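For reproducibility, the working pins above could be recorded in a requirements fragment like the following (a sketch only; the repo's actual requirements file pins other versions and additional packages):

```
# versions reported to work on 2x A40 in this thread;
# these differ from the pins in the repo's requirements file
trl==0.13.0
transformers==4.46.0
```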
@TianduoWang
Owner

Hi,

Thanks for your information!

Regarding the missing file: greedy_decode.py is from a previous version and is now deprecated. You can use generate.py to generate the pseudo-labels instead.

Regarding DPO training: I don't think we state in the paper that training is performed on a single A40 GPU. We actually use a server with 8 A40 GPUs for training. A single A40 GPU is only mentioned in Figure 4, which illustrates the inference speedup.

@Gank0078

Hi, I have a similar question. I am running the code on 4 A6000 GPUs (48 GB of memory each) and would like to know how to change the config files so that I can train the model.
So far, I have tried reducing per_device_train_batch_size in sft-0.yaml and setting num_processes in fsdp.yaml to 4, but I am still encountering out-of-memory errors. Are there any other parameters that should be adjusted to reduce memory usage during training?
Additionally, as far as I know, the model seems to have been fully fine-tuned. I am curious why methods like LoRA were not used for fine-tuning instead. Thank you!
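On the memory question above: besides the per-device batch size, a common knob is gradient accumulation, which keeps the effective (global) batch size fixed while shrinking the per-device batch, trading memory for extra accumulation steps. A minimal arithmetic sketch (the parameter names mirror common Hugging Face Trainer options; the actual keys in this repo's configs may differ, and the concrete numbers below are hypothetical):

```python
# Effective batch size = per-device batch x number of GPUs x accumulation steps.
# To cut per-GPU memory without changing the optimization dynamics, lower
# per_device_train_batch_size and raise gradient_accumulation_steps so that
# their product stays constant.

def effective_batch_size(per_device_train_batch_size: int,
                         num_gpus: int,
                         gradient_accumulation_steps: int) -> int:
    return per_device_train_batch_size * num_gpus * gradient_accumulation_steps

# Hypothetical original setting: batch 8 per GPU, 4 GPUs, no accumulation.
original = effective_batch_size(8, 4, 1)   # 32

# Memory-saving setting: batch 2 per GPU, 4 GPUs, accumulate over 4 steps.
reduced = effective_batch_size(2, 4, 4)    # 32

print(original, reduced)  # same effective batch size, lower per-GPU memory
```

Enabling gradient checkpointing (if the training config exposes it) is another standard way to reduce activation memory at the cost of extra compute.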
