
During the training dataloader problem #2064

Open
asdemirel33 opened this issue Jul 31, 2024 · 1 comment

Comments

@asdemirel33

We are using the YOLOv7 object detection model and want to train it on a custom dataset. Our GPU is an NVIDIA V100. The dataset contains 60k training images, 2k validation images, and 4k test images at a resolution of 640x640. We are training with a batch size of 16 and 8 dataloader workers.

We are encountering a problem during training that I believe is related to the dataloader. GPU utilization sometimes drops to 0% and at other times only reaches 50%-70%, resulting in very long iteration times; one epoch takes almost 50 minutes. I tried increasing the number of workers (16, 24), but that did not fix the issue.

In summary, the dataloader cannot feed data to the GPU fast enough, so GPU utilization drops.
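For reference, here is a minimal sketch of the standard PyTorch `DataLoader` settings that typically affect how quickly batches reach the GPU (`pin_memory`, `persistent_workers`, `prefetch_factor`). This is not the YOLOv7 dataloader itself; the dataset below is a synthetic stand-in and the values are only illustrative:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for the custom 640x640 dataset, just to exercise the loader settings.
images = torch.randn(64, 3, 640, 640)
labels = torch.zeros(64, dtype=torch.long)
train_dataset = TensorDataset(images, labels)

loader = DataLoader(
    train_dataset,
    batch_size=16,
    shuffle=True,
    num_workers=8,            # more workers only helps if CPU decoding/augmentation is the bottleneck
    pin_memory=True,          # page-locked host memory speeds up host-to-GPU copies
    persistent_workers=True,  # keep workers alive between epochs (requires num_workers > 0)
    prefetch_factor=4,        # batches each worker preloads ahead of time (default is 2)
)

device = "cuda" if torch.cuda.is_available() else "cpu"
for imgs, targets in loader:
    # non_blocking=True lets the host-to-GPU copy overlap with compute when pin_memory is enabled
    imgs = imgs.to(device, non_blocking=True)
```

If utilization still stalls with settings like these, the bottleneck is often disk I/O or CPU-side augmentation rather than the worker count itself.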

@Ian-Work-AI

@asdemirel33 I have the same problem. Did you manage to solve it?
