4060 ti vs 1650 #142

Open
David67821 opened this issue Sep 6, 2024 · 1 comment

Comments

@David67821

I'm seeing something strange.
On the 1650 video card it finds 10 matches per template per day.
On the 4060 Ti it finds 18-20 matches per template per day.

But the 4060 Ti is 5 times faster and shows 5 times more Mkey/s!

I don't understand why it doesn't find 5 times more matches.
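
To put numbers on the mismatch, here is a rough sketch of the arithmetic. It assumes the match count scales linearly with the number of keys checked, which is an assumption and not something the thread confirms. The function names are hypothetical and are not part of the project.

```cpp
// Hypothetical helpers, not part of the project.
// Assumption: matches per day scale linearly with keys checked per second.

// Matches per day one would expect if throughput really were `rateRatio` times higher.
double expectedMatches(double baselineMatchesPerDay, double rateRatio) {
    return baselineMatchesPerDay * rateRatio;
}

// Effective speedup implied by the observed match counts.
double impliedSpeedup(double baselineMatchesPerDay, double observedMatchesPerDay) {
    return observedMatchesPerDay / baselineMatchesPerDay;
}
```

With the figures reported above, `expectedMatches(10.0, 5.0)` gives 50 matches per day, while the observed 18-20 corresponds to an implied speedup of only about 1.8x to 2x under that linearity assumption.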

@joaoescribano

As far as I can tell, the code is poorly optimized for newer GPUs (I'm using an RTX 3080 Ti). With the standard build I was getting 1.4 GKey/s; after a few changes to the CUDA engine, I'm now getting 3 GKey/s.

In GPUEngine.cu#456, the function bool GPUEngine::callKernel() uses nbThread / nbThreadPerGroup and nbThreadPerGroup as the CUDA launch parameters (grid size and block size). I've been playing with them to find the best tuning.
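
As a rough host-side sketch of what tuning those two values involves: the grid size is derived by integer division, so a candidate nbThreadPerGroup should divide nbThread evenly. `LaunchShape`, `launchShape`, and `candidateGroupSizes` are hypothetical names for illustration, and the warp size of 32 and the limit of 1024 threads per block are assumptions that hold on current NVIDIA GPUs but should be checked against the device properties.

```cpp
#include <vector>

// Hypothetical illustration, not the project's code.
struct LaunchShape {
    int gridSize;   // number of thread blocks
    int blockSize;  // threads per block
};

// Mirrors the launch parameters named in the comment:
// grid = nbThread / nbThreadPerGroup, block = nbThreadPerGroup.
LaunchShape launchShape(int nbThread, int nbThreadPerGroup) {
    return { nbThread / nbThreadPerGroup, nbThreadPerGroup };
}

// Lists block sizes worth benchmarking: multiples of the warp size,
// no larger than the per-block limit, that divide nbThread exactly
// (otherwise the integer division above would drop threads).
std::vector<int> candidateGroupSizes(int nbThread) {
    const int warpSize = 32;             // assumption: NVIDIA warp size
    const int maxThreadsPerBlock = 1024; // assumption: current per-block limit
    std::vector<int> sizes;
    for (int s = warpSize; s <= maxThreadsPerBlock; s += warpSize) {
        if (nbThread % s == 0) {
            sizes.push_back(s);
        }
    }
    return sizes;
}
```

Each candidate would then be benchmarked on the actual card, since the best value depends on the kernel's register and shared-memory usage.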

I've also changed the cores-per-SM value for compute capability 8.6 at GPUEngine.cu#131 (_ConvertSMVer2Cores) to 1024, as my GPU can handle it.
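
For context, a lookup of this kind usually maps a compute capability to a cores-per-SM count. The sketch below is modeled on the helper of the same name in NVIDIA's CUDA samples; whether the project's copy has the same entries or fallback is an assumption. `coresPerSM` is a hypothetical name, and returning -1 for an unknown architecture is a choice made here for illustration only.

```cpp
// Hypothetical sketch, not the project's code.
// Values follow NVIDIA's CUDA samples table: 7.5 (e.g. GTX 1650) -> 64,
// 8.6 (e.g. RTX 3080 Ti) -> 128, 8.9 (e.g. RTX 4060 Ti) -> 128.
int coresPerSM(int major, int minor) {
    struct Entry {
        int sm;     // compute capability encoded as 0xMm
        int cores;  // CUDA cores per streaming multiprocessor
    };
    static const Entry table[] = {
        {0x75, 64},
        {0x80, 64},
        {0x86, 128},
        {0x89, 128},
    };
    const int sm = (major << 4) + minor;
    for (const Entry& e : table) {
        if (e.sm == sm) {
            return e.cores;
        }
    }
    return -1;  // unknown architecture: the caller has to pick a default
}
```

In NVIDIA's table the entry for 8.6 is 128, so the 1024 mentioned above is an experimental override. If a table like this has no entry for a newer architecture, whatever default the code falls back to may not suit that card.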

Dunno if it's the case or not, but it helped me.
