RuntimeError: could not create an engine #2379

Open
aruhela opened this issue Jul 6, 2024 · 1 comment
Labels
bug Something isn't working

aruhela commented Jul 6, 2024

Hi Intel Team

I am seeing a "could not create an engine" error when running the demo.py example from the "oneCCL Bindings for PyTorch Getting Started Sample*". The code runs on a Sapphire Rapids node with 4 PVC GPUs on a TACC system. Any suggestions for identifying the cause and fixing it?

(base) c551-003pvc$ mpirun -n 2 -l python demo.py -dev xpu
[0] Runing Iteration: 0 on device xpu:0
[0] Runing forward: 0 on device xpu:0
[0] Traceback (most recent call last):
[0] File "/scratch/05231/aruhela/demo.py", line 67, in <module>
[1] Runing Iteration: 0 on device xpu:1
[1] Runing forward: 0 on device xpu:1
[1] Traceback (most recent call last):
[1] File "/scratch/05231/aruhela/demo.py", line 67, in <module>
[0] res = model(input)
[0] File "/scratch/projects/compilers/intel24.2/oneapi/intelpython/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
[1] res = model(input)
[1] File "/scratch/projects/compilers/intel24.2/oneapi/intelpython/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
[0] return self._call_impl(*args, **kwargs)
[0] File "/scratch/projects/compilers/intel24.2/oneapi/intelpython/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
[1] return self._call_impl(*args, **kwargs)
[1] File "/scratch/projects/compilers/intel24.2/oneapi/intelpython/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
[0] return forward_call(*args, **kwargs)
[0] File "/scratch/projects/compilers/intel24.2/oneapi/intelpython/lib/python3.9/site-packages/torch/nn/parallel/distributed.py", line 1519, in forward
[1] return forward_call(*args, **kwargs)
[1] File "/scratch/projects/compilers/intel24.2/oneapi/intelpython/lib/python3.9/site-packages/torch/nn/parallel/distributed.py", line 1519, in forward
[0] else self._run_ddp_forward(*inputs, **kwargs)
[0] File "/scratch/projects/compilers/intel24.2/oneapi/intelpython/lib/python3.9/site-packages/torch/nn/parallel/distributed.py", line 1355, in _run_ddp_forward
[1] else self._run_ddp_forward(*inputs, **kwargs)
[1] File "/scratch/projects/compilers/intel24.2/oneapi/intelpython/lib/python3.9/site-packages/torch/nn/parallel/distributed.py", line 1355, in _run_ddp_forward
[0] return self.module(*inputs, **kwargs) # type: ignore[index]
[0] File "/scratch/projects/compilers/intel24.2/oneapi/intelpython/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
[1] return self.module(*inputs, **kwargs) # type: ignore[index]
[1] File "/scratch/projects/compilers/intel24.2/oneapi/intelpython/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
[0] return self._call_impl(*args, **kwargs)
[0] File "/scratch/projects/compilers/intel24.2/oneapi/intelpython/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
[1] return self._call_impl(*args, **kwargs)
[1] File "/scratch/projects/compilers/intel24.2/oneapi/intelpython/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
[0] return forward_call(*args, **kwargs)
[0] File "/scratch/05231/aruhela/demo.py", line 26, in forward
[1] return forward_call(*args, **kwargs)
[1] File "/scratch/05231/aruhela/demo.py", line 26, in forward
[0] return self.linear(input)
[0] File "/scratch/projects/compilers/intel24.2/oneapi/intelpython/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
[1] return self.linear(input)
[1] File "/scratch/projects/compilers/intel24.2/oneapi/intelpython/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
[0] return self._call_impl(*args, **kwargs)
[0] File "/scratch/projects/compilers/intel24.2/oneapi/intelpython/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
[1] return self._call_impl(*args, **kwargs)
[1] File "/scratch/projects/compilers/intel24.2/oneapi/intelpython/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
[0] return forward_call(*args, **kwargs)
[0] File "/scratch/projects/compilers/intel24.2/oneapi/intelpython/lib/python3.9/site-packages/torch/nn/modules/linear.py", line 114, in forward
[1] return forward_call(*args, **kwargs)
[1] File "/scratch/projects/compilers/intel24.2/oneapi/intelpython/lib/python3.9/site-packages/torch/nn/modules/linear.py", line 114, in forward
[0] return F.linear(input, self.weight, self.bias)
[0] RuntimeError: could not create an engine
[1] return F.linear(input, self.weight, self.bias)
[1] RuntimeError: could not create an engine
(base) c551-003pvc$

Notes: the oneAPI release is 2024.2.
Install command (AI Selector Tool):
conda install -c intel -c conda-forge --override-channels intel/label/oneapi::intel-extension-for-pytorch=2.1.20 intel/label/oneapi::pytorch=2.1.0 intel/label/oneapi::oneccl_bind_pt=2.1.200 intel/label/oneapi::torchvision=0.16.0 intel/label/oneapi::torchaudio=2.1.0 conda-forge::deepspeed=0.14.0 python=3.9

Thanks
Amit Ruhela
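
Because `mpirun -l` prefixes every line with its rank, the two tracebacks above arrive interleaved and are hard to read. A minimal sketch for regrouping such output by rank before comparing the ranks follows; the function name `split_by_rank` and the sample lines are illustrative, not part of the original demo, and the sketch assumes every relevant line carries a `[N]` prefix.

```python
import re
from collections import defaultdict

# Matches the "[rank] " prefix that `mpirun -l` puts in front of each output line.
RANK_PREFIX = re.compile(r"^\[(\d+)\]\s?(.*)$")


def split_by_rank(lines):
    """Group prefixed lines by rank, preserving each rank's original order."""
    per_rank = defaultdict(list)
    for line in lines:
        match = RANK_PREFIX.match(line.rstrip("\n"))
        if match:  # lines without a rank prefix (e.g. the shell prompt) are skipped
            per_rank[int(match.group(1))].append(match.group(2))
    return dict(per_rank)


if __name__ == "__main__":
    # A few lines copied from the report; a real run would read the saved log file.
    sample = [
        "[0] Runing forward: 0 on device xpu:0",
        "[1] Runing forward: 0 on device xpu:1",
        "[0] RuntimeError: could not create an engine",
        "[1] RuntimeError: could not create an engine",
    ]
    for rank, text in sorted(split_by_rank(sample).items()):
        print(f"--- rank {rank} ---")
        print("\n".join(text))
```

Here both ranks fail at the same call (`F.linear`) on different devices (`xpu:0` and `xpu:1`), so the regrouped output mainly confirms that the failure is not specific to one rank.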

aruhela added the bug label on Jul 6, 2024

xyang2013 commented

Hi, I also experienced this error. The message before the exception:

File c:\Users\xiaoy\anaconda3\envs\llm2\Lib\site-packages\torch\nn\modules\linear.py:125, in Linear.forward(self, input)
124 def forward(self, input: Tensor) -> Tensor:
--> 125 return F.linear(input, self.weight, self.bias)

RuntimeError: could not create an engine

GPU: Intel ARC B580 (with the latest driver)
OS: Windows 11
Conda/Python: 3.12
PyTorch install command:
pip3 install --pre torch --index-url https://download.pytorch.org/whl/nightly/xpu
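
Both reports come from different install routes (pinned conda packages in one, a nightly XPU wheel in the other), so a first sanity check is to record what is actually installed and whether PyTorch sees any XPU device at all. The sketch below is only an information-gathering aid under stated assumptions, not a diagnosis of this error: the helper names are hypothetical, the distribution names are taken from the install commands in this thread and may be registered under a slightly different spelling, and on the IPEX-based stack `torch.xpu` is assumed to appear only after `intel_extension_for_pytorch` has been imported.

```python
from importlib import metadata

# Distribution names taken from the install commands in this thread; the exact
# spelling registered in a given environment may differ (assumption).
PACKAGES = [
    "torch",
    "intel-extension-for-pytorch",
    "oneccl_bind_pt",
    "torchvision",
    "torchaudio",
]


def installed_versions(names):
    """Return {name: version or None} without importing the packages."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def xpu_summary():
    """Report what torch says about XPU devices, or why it could not be asked."""
    try:
        import torch
    except Exception as exc:  # missing package or a broken shared library
        return f"torch not importable: {exc}"
    try:
        # Assumption: needed only on the IPEX-based stack; absent on nightly wheels.
        import intel_extension_for_pytorch  # noqa: F401
    except Exception:
        pass
    xpu = getattr(torch, "xpu", None)
    if xpu is None or not hasattr(xpu, "is_available"):
        return "this torch build exposes no usable torch.xpu module"
    if not xpu.is_available():
        return "torch.xpu.is_available() returned False"
    return f"{xpu.device_count()} XPU device(s) visible"


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'not found'}")
    print(xpu_summary())
```

Posting this output alongside the traceback gives maintainers the package versions and device visibility in one place.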
