Register emulated kernel implementations for RandomStandardNormal and TruncatedNormal #120

adtsai · 2020-12-12T03:49:25Z

For some reason, on some models, TensorFlow forcibly colocates kernels like ApplyAdam with RandomUniform, RandomStandardNormal, and TruncatedNormal. This may be because the initial weights, which are what Adam optimizes, are initialized at the start of training by one of the Random operators.

This change registers a set of kernels to "emulate" support for RandomStandardNormal and TruncatedNormal. These kernels re-use the CPU implementations and merely upload the values to a GPU tensor. This means that computation is still done on the CPU (which is okay, since it's usually only done once during initialization), but the DML registration means they can now be colocated with other operators, like ApplyAdam.
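To illustrate the idea, here is a minimal Python sketch of the emulation pattern described above. It is a toy model, not TensorFlow or DirectML code: the `Tensor` class, the `registry` dictionary, and the helpers `upload`, `make_emulated`, and `can_colocate` are all hypothetical names invented for this example. The sketch assumes that a group of ops can only be colocated on a device when every op in the group has a kernel registered for that device.

```python
import random
from dataclasses import dataclass
from math import prod


@dataclass
class Tensor:
    """Toy tensor: a device label plus a flat list of values (hypothetical)."""
    device: str
    values: list


def cpu_random_standard_normal(shape, seed=0):
    """Stand-in for the existing CPU kernel: draws N(0, 1) samples on the host."""
    rng = random.Random(seed)
    return Tensor("CPU", [rng.gauss(0.0, 1.0) for _ in range(prod(shape))])


def upload(tensor, device):
    """Stand-in for the host-to-GPU copy; here it only relabels a copy."""
    return Tensor(device, list(tensor.values))


def make_emulated(cpu_kernel, device):
    """Wrap a CPU kernel so that its result ends up in a tensor on `device`."""
    def kernel(*args, **kwargs):
        return upload(cpu_kernel(*args, **kwargs), device)
    return kernel


def can_colocate(registry, ops, device):
    """Assumed rule: every op in the group needs a kernel for `device`."""
    return all((op, device) in registry for op in ops)


# Hypothetical registry keyed by (op name, device name).
registry = {
    ("RandomStandardNormal", "CPU"): cpu_random_standard_normal,
    ("ApplyAdam", "DML"): lambda *args: None,  # placeholder, not a real optimizer
}
group = ["ApplyAdam", "RandomStandardNormal"]

print(can_colocate(registry, group, "DML"))  # False: no DML kernel for the random op

registry[("RandomStandardNormal", "DML")] = make_emulated(
    cpu_random_standard_normal, "DML"
)
print(can_colocate(registry, group, "DML"))  # True: the emulated kernel fills the gap
```

The sampling still happens in the CPU function; the wrapper only changes where the result lives, which is the trade-off the description accepts for a one-time initialization.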

This change should improve our AI-Benchmark scores by about 5-10%.


jstoecker · 2020-12-14T18:24:39Z

We need to track that these aren't truly implemented. Can you add some metadata to the ops table to track emulated kernel implementations?

adtsai · 2020-12-15T00:48:45Z

@jstoecker I'll add something to the op report for these kernels.

… TruncatedNormal (#120)

Merges some of the recent changes from the directml branch:

* Use compute queue for AMD devices (#102)
* Register List Kernels for DML (#95)
* Update DirectMLX to latest (#104)
* Remove extra rows from test email (#106)
* Fix DML's Select kernel for int64 (#113)
* Fix list kernels and tensor array ops registration (#114)
* Simplify CI scripts (#112)
* Fix StridedSlice's input size coalescing (#115)
* Disable int64 image test (#116)
* Fix network share copy path (#117)
* Pipeline should continue if a test job fails (#118)
* Switch network share path to use build number instead of build ID
* Add missing HostMemory int32 registrations for _Arg and _RetVal (#122)
* Implement all the arithmetic Scatter and ResourceScatter operators (#121)
* Register emulated kernel implementations for RandomStandardNormal and TruncatedNormal (#120)

Register emulated RandomNormal kernels

ecfe151

adtsai requested review from jstoecker and PatriceVignola December 12, 2020 03:49

PatriceVignola approved these changes Dec 12, 2020


adtsai merged commit a2134ad into directml Dec 15, 2020

jstoecker pushed a commit that referenced this pull request Dec 15, 2020

Register emulated kernel implementations for RandomStandardNormal and…

6fb9a66

… TruncatedNormal (#120)

adtsai deleted the p/adtsai/normal branch December 15, 2020 22:07
