Register List Kernels for DML #95
Conversation
Can you rename the PR to something that doesn't involve the branch name? Also don't forget to change the commit message when merging. #Resolved
@@ -17,6 +17,7 @@ limitations under the License.
#define TENSORFLOW_CORE_KERNELS_TENSOR_ARRAY_H_


#include <limits.h>
nit: extra line #Resolved
@@ -33,6 +31,7 @@ limitations under the License.
#include "tensorflow/core/lib/core/coding.h"
#include "tensorflow/core/lib/core/errors.h"
#include "tensorflow/core/util/util.h"
#include "third_party/eigen3/unsupported/Eigen/CXX11/Tensor" #include "tensorflow/core/framework/op_kernel.h"
Why has it been modified? #Resolved
Mistake? Looks like you joined the two lines together. :) #Resolved
It was modified by the clang-format. But no change here, so I reverted.
In reply to: 524980697
  }
}
auto output_flat = output->shaped<T, 2>({1, output->NumElements()});
// auto output_flat = output->shaped<T, 2>({1, output->NumElements()});
Dead code? #Resolved
// OP_REQUIRES_OK(c, c->allocate_temp(tmp.dtype(), tmp.shape(),
//                                    &aligned));
Dead code? #Resolved
@adtsai needs to look at this. The coupling of tensor_array with list_kernels is a little concerning.
Force-pushed from d48500e to 0bd88ad
@@ -798,7 +764,7 @@ class TensorListFromTensor : public OpKernel {
TensorShape output_shape(t.shape());
output_shape.RemoveDim(0);
OP_REQUIRES(c, element_shape.IsCompatibleWith(output_shape),
            errors::InvalidArgument(
            errors::InvalidArgument(s
s [](start = 40, length = 1)
added by mistake, will be reverted #Resolved
tmp.unaligned_flat<T>();
output_list.tensors().emplace_back(aligned);
tensor_array::TensorCopyUnaligned<Device, T>(c, &tmp, &aligned);
output_list.tensors().emplace_back(tmp);
tmp [](start = 41, length = 3)
Why was this changed? #Resolved
@@ -664,9 +642,8 @@ class TensorListSplit : public OpKernel {
// prevent this.
Tensor aligned;
OP_REQUIRES_OK(c, c->allocate_temp(tmp.dtype(), tmp.shape(), &aligned));
allocate_temp [](start = 27, length = 13)
It looks like TensorCopyUnaligned already calls allocate_temp. Do callers need to allocate their own tensors, or not? #Resolved
The callers do need to. I removed the extra line that called allocate_temp.
In reply to: 525578893
Thanks for the reminder!
In reply to: 728780643
ConstMatrixVector inputs_flat;
inputs_flat.reserve(indices.NumElements());
// ConstMatrixVector inputs_flat;
std::vector<PersistentTensor> values;
Dead code?
Merges some of the recent changes from the directml branch:
* Use compute queue for AMD devices (#102)
* Register List Kernels for DML (#95)
* Update DirectMLX to latest (#104)
* Remove extra rows from test email (#106)
* Fix DML's Select kernel for int64 (#113)
* Fix list kernels and tensor array ops registration (#114)
* Simplify CI scripts (#112)
* Fix StridedSlice's input size coalescing (#115)
* Disable int64 image test (#116)
* Fix network share copy path (#117)
* Pipeline should continue if a test job fails (#118)
* Switch network share path to use build number instead of build ID
* Add missing HostMemory int32 registrations for _Arg and _RetVal (#122)
* Implement all the arithmetic Scatter and ResourceScatter operators (#121)
* Register emulated kernel implementations for RandomStandardNormal and TruncatedNormal (#120)
No description provided.