How to run on A100 40G? #31

Open
TopIdiot opened this issue Dec 20, 2024 · 2 comments

Comments

@TopIdiot

TopIdiot commented Dec 20, 2024

When I run ./test_compute ../config_all/llama3-8B/1024.json directly, I get "Got bad cuda status: out of memory at line: 27/root/Nanoflow/pipeline/src/vortexData.cu".
After changing model_configs.allocate_kv_data_batch to 100, I get Segmentation fault (core dumped). Lowering the values in pipeline_configs also ends in Segmentation fault (core dumped).

Are there any rules for how to configure this when running on different kinds of GPUs?
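
For what it's worth, a first sanity check before tuning the config is whether the weights plus KV cache can fit on the card at all. The sketch below queries free device memory with cudaMemGetInfo and divides the remainder by the per-token KV footprint. The Llama-3-8B shape numbers (32 layers, 8 KV heads, head dim 128) are from the published model card; the FP16 storage and the idea that allocate_kv_data_batch scales the KV pool linearly are assumptions on my part, not confirmed NanoFlow behavior.

```cpp
// Rough feasibility check: how many KV-cache tokens fit in free GPU memory?
// Assumptions (not confirmed against NanoFlow's code): Llama-3-8B shape
// (32 layers, 8 KV heads, head dim 128), FP16 KV storage, FP16 weights
// taking ~16 GiB, and allocate_kv_data_batch scaling the KV pool linearly.
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    size_t free_bytes = 0, total_bytes = 0;
    cudaError_t st = cudaMemGetInfo(&free_bytes, &total_bytes);
    if (st != cudaSuccess) {
        std::fprintf(stderr, "cudaMemGetInfo failed: %s\n",
                     cudaGetErrorString(st));
        return 1;
    }

    // Per-token KV bytes = 2 (K and V) * layers * kv_heads * head_dim * 2 bytes (FP16).
    const size_t layers = 32, kv_heads = 8, head_dim = 128;
    const size_t bytes_per_token = 2 * layers * kv_heads * head_dim * 2;  // 128 KiB

    // Reserve room for the weights (8B params * 2 bytes ~= 16 GiB).
    const size_t weight_bytes = 16ull << 30;
    const size_t kv_budget = free_bytes > weight_bytes ? free_bytes - weight_bytes : 0;

    std::printf("free: %.1f GiB, total: %.1f GiB\n",
                free_bytes / 1073741824.0, total_bytes / 1073741824.0);
    std::printf("approx. KV tokens that fit after weights: %zu\n",
                kv_budget / bytes_per_token);
    return 0;
}
```

On a 40 GB A100 this leaves roughly 24 GiB for KV data after the weights, so if the shipped 1024.json was sized for 80 GB cards, allocate_kv_data_batch and the batch sizes in pipeline_configs would probably need to come down together; shrinking only one of them might explain the segfault if the other still indexes past the smaller KV pool.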

@durant1999

The same question...

@fangbaolei

Got bad cuda status: out of memory at line: 27/ai/zhiyi/w/multimodal/openbmb/Nanoflow/pipeline/src/vortexData.cu. A 4090 24G reports the same error.
