Use compute queue for AMD devices #102

Merged 3 commits on Nov 19, 2020

PatriceVignola
Contributor

All the evidence we've seen so far points to compute queues being both better at preventing TDRs and more performant on AMD. We can always revert the change later if it turns out not to be stable enough, but we should at least get this change tested by Autopilot and the ai-benchmark tests.

D3D12_COMMAND_LIST_TYPE queue_type = D3D12_COMMAND_LIST_TYPE_DIRECT;

if (adapter.VendorID() == VendorID::kAmd) {
  queue_type = D3D12_COMMAND_LIST_TYPE_COMPUTE;
}

Let's also keep an environment variable to force the compute queue on or off, defaulting to on for AMD. It might be useful for experimentation.
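
One way the suggested override could look is sketched below. This is not the code that was merged: the variable name `TF_DIRECTML_USE_COMPUTE_QUEUE`, the `QueueType` enum (a stand-in for `D3D12_COMMAND_LIST_TYPE` so the sketch compiles without `d3d12.h`), the reduced `VendorID` enum, and both helper functions are hypothetical names chosen for illustration.

```cpp
#include <cstdlib>
#include <cstring>

// Stand-in for D3D12_COMMAND_LIST_TYPE_DIRECT / _COMPUTE (assumption, so that
// the sketch builds without the Windows SDK).
enum class QueueType { kDirect, kCompute };

// Reduced, hypothetical version of the VendorID enum used in the diff.
enum class VendorID { kAmd, kOther };

// Picks the queue type. `override_value` is the raw value of the override
// variable, or nullptr when it is unset.
QueueType ChooseQueueType(VendorID vendor, const char* override_value) {
  // Default: compute queue on AMD, direct queue everywhere else.
  bool use_compute = (vendor == VendorID::kAmd);

  if (override_value != nullptr) {
    if (std::strcmp(override_value, "1") == 0) {
      use_compute = true;   // force the compute queue on
    } else if (std::strcmp(override_value, "0") == 0) {
      use_compute = false;  // force the compute queue off
    }
    // Any other value is ignored and the vendor default applies.
  }

  return use_compute ? QueueType::kCompute : QueueType::kDirect;
}

// Convenience wrapper that reads the (hypothetical) environment variable.
QueueType ChooseQueueTypeFromEnv(VendorID vendor) {
  return ChooseQueueType(vendor, std::getenv("TF_DIRECTML_USE_COMPUTE_QUEUE"));
}
```

Keeping the decision in a function that takes the raw string, rather than calling `std::getenv` inline, makes the default and both forced states easy to exercise in a unit test without touching the process environment.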

@PatriceVignola PatriceVignola merged commit 1838dc3 into directml Nov 19, 2020
@PatriceVignola PatriceVignola deleted the user/pavignol/use-compute-queue-amd branch November 19, 2020 02:20
jstoecker pushed a commit that referenced this pull request Dec 15, 2020
jstoecker added a commit that referenced this pull request Dec 15, 2020
Merges some of the recent changes from the directml branch:
* Use compute queue for AMD devices (#102)
* Register List Kernels for DML (#95)
* Update DirectMLX to latest (#104)
* Remove extra rows from test email (#106)
* Fix DML's Select kernel for int64 (#113)
* Fix list kernels and tensor array ops registration (#114)
* Simplify CI scripts (#112)
* Fix StridedSlice's input size coalescing (#115)
* Disable int64 image test (#116)
* Fix network share copy path (#117)
* Pipeline should continue if a test job fails (#118)
* Switch network share path to use build number instead of build ID
* Add missing HostMemory int32 registrations for _Arg and _RetVal (#122)
* Implement all the arithmetic Scatter and ResourceScatter operators (#121)
* Register emulated kernel implementations for RandomStandardNormal and TruncatedNormal (#120)