
Can't run Keras example using PlaidML backend on macOS on GPU #290

Open
robinhad opened this issue Oct 18, 2019 · 5 comments


@robinhad

OS: macOS 10.15
Python: 3.6.9
TensorFlow: 1.14
nGraph installed from PyPI.
The following code:

import tensorflow as tf
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input, decode_predictions
import numpy as np
import ngraph_bridge
ngraph_bridge.set_backend('PLAIDML')

model = ResNet50(weights='imagenet')

# Inference on a random batch
batch_size = 128
img = np.random.rand(batch_size, 224, 224, 3)
preds = model.predict(preprocess_input(img))
print('Predicted:', decode_predictions(preds, top=3)[0])

# One training round against dummy labels
model.compile(tf.keras.optimizers.SGD(), loss='categorical_crossentropy')
history = model.fit(
    preprocess_input(img), np.zeros((batch_size, 1000), dtype='float32'))
print('Ran a train round')

Throws the following error:

Compilation failed: 
program_source:44:8: error: cannot initialize a variable of type 'int4' (vector of 4 'int' values) with an rvalue of type 'metal::float4' (aka 'float4')
  int4 LX_T3 = floor((float4)LX_T2);
       ^       ~~~~~~~~~~~~~~~~~~~~

CPU backend works fine.
Seems like the generated kernel can't implicitly convert the metal::float4 returned by floor() to int4.
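
Judging from the failing line, the kernel applies floor() to a float4 and stores the result straight into an int4; both Metal and OpenCL require an explicit conversion there. The same type rule in a minimal NumPy sketch (illustrative values only, not the fix):

import numpy as np

x = np.array([7, 8, 9, 10], dtype=np.int32)  # hypothetical int32 tensor
q = np.floor(x / 3)              # floor() promotes to float, as in the kernel
r = q.astype(np.int32)           # the explicit int conversion the generated line omits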

@denise-k

denise-k commented Oct 21, 2019

Thanks for reporting @robinhad. I'm from the PlaidML team.

Does this code work if you use PlaidML with OpenCL instead of Metal? You can configure PlaidML for the nGraph-TF bridge using the following instructions.
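
A minimal sketch of that configuration, assuming the standard PlaidML environment variables (PLAIDML_EXPERIMENTAL, PLAIDML_DEVICE_IDS) are honored by the bridge's PlaidML backend; the plaidml-setup tool lists the device IDs available on your machine, and the ID below is just an example:

import os

# Assumption: these PlaidML settings are read by the nGraph bridge's backend.
os.environ['PLAIDML_EXPERIMENTAL'] = '1'
os.environ['PLAIDML_DEVICE_IDS'] = 'opencl_intel_iris(tm)_plus_graphics_655.0'

import ngraph_bridge  # import only after the environment is set
ngraph_bridge.set_backend('PLAIDML')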

As a note: we've had several open issues with Metal on macOS Mojave (10.14) that we expect to be fixed in Catalina (10.15), but we have not yet been able to upgrade our build/test machines to Catalina to test things out. Either way, this information is helpful for our team in resolving any issues with Catalina.

@robinhad
Author

Thanks for the response, @dgkutnic.

I ran it with OpenCL and it crashed. The error looks pretty similar to the Metal one.
Device: opencl_intel_iris(tm)_plus_graphics_655.0
Error:
[WARN] 2019-10-21T18:22:49z src/ngraph/runtime/plaidml/plaidml_logger.cpp 44 Failed to build program kernel_c45_sdk_0
[WARN] 2019-10-21T18:22:49z src/ngraph/runtime/plaidml/plaidml_logger.cpp 44 Failed build log: :34:8: error: initializing '__int4' with an expression of incompatible type '__float4'
int4 LX_T3 = floor(convert_float4(LX_T2));
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[WARN] 2019-10-21T18:22:49z src/ngraph/runtime/plaidml/plaidml_logger.cpp 44 Code was:
1: // gid: 1 1 1
2: // lid: 1 1 1
3: // Names: { i1 }
4: // Ranges: { 1 }
5: // Out stride: { 1 }
6: // Elementwise input X_I_1 shape: i32(4):(1):16 bytes
7: // Elementwise input X_I_0 shape: i32(4):(1):16 bytes
8: // Elementwise op: X_T4 = cmp_lt(X_I_1, X_T0)
9: // Elementwise op: X_T1 = cond(X_T4, X_T0, X_I_1)
10: // Elementwise op: X_T2 = div(X_I_0, X_T1)
11: // Elementwise op: X_T3 = floor(X_T2)
12: // Tile size: { 1 }
13: // Contraction output var shape: bool(4):(1):4 bytes
14: // Computed true ops: 16
15: // Computed work groups: 1
16: // Computed inner loops: 1
17: // Computed shared mem: 0
18: // Computed out regs: 512
19: // Computed mem read: 8
20: // Computed mem write: 256
21: // Computed operations: 1
22: // Computed rollups: 0
23: // Computed threads used: 1
24: // lwork = 1, 1, 1
25: // gwork = 1, 1, 1
26: __kernel void kernel_c45_sdk_0(__global int4* restrict X_T3, __global const int4* restrict X_I_1, __global const int4* restrict X_I_0)
27: {
28: int tid = get_local_id(0);
29: int4 LX_I_1 = X_I_1[0];
30: int4 LX_I_0 = X_I_0[0];
31: int4 LX_T4 = (LX_I_1 < (int4)1);
32: int4 LX_T1 = select(convert_int4(LX_I_1), (int4)1, convert_int4(LX_T4));
33: int4 LX_T2 = (LX_I_0 / LX_T1);
34: int4 LX_T3 = floor(convert_float4(LX_T2));
35: X_T3[0] = LX_T3;
36: }

[WARN] 2019-10-21T18:22:49z src/ngraph/runtime/plaidml/plaidml_logger.cpp 44 Failed to build program kernel_c45_sdk_0: CL_BUILD_PROGRAM_FAILURE

libc++abi.dylib: terminating with uncaught exception of type boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::promise_already_satisfied>>: The state of the promise has already been set.

@denise-k

Thanks for this info. We'll look into this and let you know once we've triaged the errors you are seeing.

@robinhad
Author

Is there any news regarding this?

@robinhad
Author

robinhad commented Jul 7, 2020

@dgkutnic, I tried it again on a 5700 XT and it works fine, without a crash.
Maybe the iGPU doesn't have enough memory?
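
One way to probe that hypothesis, as a sketch: a kernel-compile failure like the one above should reproduce at any batch size, while a genuine out-of-memory problem should go away with a small batch. The crash aborts the interpreter from C++, so each size has to be tried in a fresh run:

import numpy as np
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input
import ngraph_bridge

ngraph_bridge.set_backend('PLAIDML')
model = ResNet50(weights='imagenet')

batch_size = 1  # re-run with 8, 32, 128; an OOM passes here, a compile bug won't
img = np.random.rand(batch_size, 224, 224, 3)
model.predict(preprocess_input(img))
print('batch_size=%d: OK' % batch_size)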
