
XLA 2.7 releases don't work properly with the upstream torch 2.7 #8626

Open
hosseinsarshar opened this issue Jan 26, 2025 · 7 comments
Labels: needs reproduction, xla:tpu (TPU specific issues and PRs)


hosseinsarshar commented Jan 26, 2025

🐛 Bug

I tested the torch_xla 2.7 nightly builds against the upstream torch 2.7 nightlies across many date combinations, and all of them resulted in faulty execution. For example, with this install:

pip install -U --pre torch==2.7.0.dev20250124+cpu --index-url https://download.pytorch.org/whl/nightly/cpu

pip install 'torch_xla[tpu] @ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev20250124-cp310-cp310-linux_x86_64.whl' \
  -f https://storage.googleapis.com/libtpu-releases/index.html \
  -f https://storage.googleapis.com/libtpu-wheels/index.html

pip install -U torch_xla[pallas] -f https://storage.googleapis.com/jax-releases/jax_nightly_releases.html -f https://storage.googleapis.com/jax-releases/jaxlib_nightly_releases.html

I get this behaviour:

$ python
Python 3.10.16 (main, Dec 11 2024, 16:24:50) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch_xla
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/hosseins/miniconda3/envs/test-new-nightly/lib/python3.10/site-packages/torch_xla/__init__.py", line 20, in <module>
    import _XLAC
ImportError: /home/hosseins/miniconda3/envs/test-new-nightly/lib/python3.10/site-packages/_XLAC.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN5torch4lazy13MetricFnValueEd
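As a side note, the undefined symbol is just a mangled C++ name; demangling it (with c++filt, which ships with binutils) shows which torch API the torch_xla wheel was linked against:

```shell
# Demangle the missing symbol from the ImportError above
c++filt _ZN5torch4lazy13MetricFnValueEd
# -> torch::lazy::MetricFnValue(double)
```

So the wheel expects a libtorch that still exports torch::lazy::MetricFnValue(double); the newer torch nightly apparently no longer does, which is the usual symptom of torch and torch_xla builds that don't match.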

But once I downgrade the upstream torch to 2.6.0.dev20241216 (the latest version that works), torch_xla works as expected:

pip install -U --pre torch==2.6.0.dev20241216+cpu --index-url https://download.pytorch.org/whl/nightly/cpu

pip install 'torch_xla[tpu] @ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev20250119-cp310-cp310-linux_x86_64.whl' \
  -f https://storage.googleapis.com/libtpu-releases/index.html \
  -f https://storage.googleapis.com/libtpu-wheels/index.html


pip install -U torch_xla[pallas] -f https://storage.googleapis.com/jax-releases/jax_nightly_releases.html -f https://storage.googleapis.com/jax-releases/jaxlib_nightly_releases.html

And the import succeeds:

$ python
Python 3.10.16 (main, Dec 11 2024, 16:24:50) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import xla
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'xla'
>>> import torch_xla
WARNING:root:libtpu.so and TPU device found. Setting PJRT_DEVICE=TPU.
>>> torch_xla.devices()
[device(type='xla', index=0), device(type='xla', index=1), device(type='xla', index=2), device(type='xla', index=3), device(type='xla', index=4), device(type='xla', index=5), device(type='xla', index=6), device(type='xla', index=7)]
@hosseinsarshar changed the title from "XLA 2.7 releases don't work properly with upstream torch 2.7" to "XLA 2.7 releases don't work properly with the upstream torch 2.7" on Jan 26, 2025
@bhavya01 (Collaborator) commented:

We recently updated the README with newer instructions for the nightly builds, which should fix this issue:

pip install --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/cpu
pip install 'torch_xla[tpu] @ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl' \
  -f https://storage.googleapis.com/libtpu-releases/index.html \
  -f https://storage.googleapis.com/libtpu-wheels/index.html

# Optional: if you're using custom kernels, install pallas dependencies
pip install 'torch_xla[pallas]' \
  -f https://storage.googleapis.com/jax-releases/jax_nightly_releases.html \
  -f https://storage.googleapis.com/jax-releases/jaxlib_nightly_releases.html

Upstream PyTorch changed some environment variables when building their wheels, which caused this issue.
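One way to sanity-check this locally (assuming the environment variable in question was the C++ dual-ABI flag, which the +cxx11 tag in the wheel name suggests) is to print the ABI the installed torch wheel reports:

```shell
# Print the C++ ABI flag the installed torch wheel was built with;
# a "+cxx11" torch_xla wheel expects this to be True
python -c "import torch; print(torch._C._GLIBCXX_USE_CXX11_ABI)" || echo "torch not installed"
```

If this prints False while the torch_xla wheel is a +cxx11 build, the two wheels were compiled against incompatible ABIs.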

@hosseinsarshar (Contributor, Author) commented:

Thanks @bhavya01. I tried the new instructions as well, but the issue remains. Here is a simple repro:

$ conda create -n new-xla python=3.10
Retrieving notices: ...working... done
Channels:
 - defaults
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/hosseins/miniconda3/envs/new-xla

  added / updated specs:
    - python=3.10


The following NEW packages will be INSTALLED:

  _libgcc_mutex      pkgs/main/linux-64::_libgcc_mutex-0.1-main 
  _openmp_mutex      pkgs/main/linux-64::_openmp_mutex-5.1-1_gnu 
  bzip2              pkgs/main/linux-64::bzip2-1.0.8-h5eee18b_6 
  ca-certificates    pkgs/main/linux-64::ca-certificates-2024.12.31-h06a4308_0 
  ld_impl_linux-64   pkgs/main/linux-64::ld_impl_linux-64-2.40-h12ee557_0 
  libffi             pkgs/main/linux-64::libffi-3.4.4-h6a678d5_1 
  libgcc-ng          pkgs/main/linux-64::libgcc-ng-11.2.0-h1234567_1 
  libgomp            pkgs/main/linux-64::libgomp-11.2.0-h1234567_1 
  libstdcxx-ng       pkgs/main/linux-64::libstdcxx-ng-11.2.0-h1234567_1 
  libuuid            pkgs/main/linux-64::libuuid-1.41.5-h5eee18b_0 
  ncurses            pkgs/main/linux-64::ncurses-6.4-h6a678d5_0 
  openssl            pkgs/main/linux-64::openssl-3.0.15-h5eee18b_0 
  pip                pkgs/main/linux-64::pip-24.2-py310h06a4308_0 
  python             pkgs/main/linux-64::python-3.10.16-he870216_1 
  readline           pkgs/main/linux-64::readline-8.2-h5eee18b_0 
  setuptools         pkgs/main/linux-64::setuptools-75.1.0-py310h06a4308_0 
  sqlite             pkgs/main/linux-64::sqlite-3.45.3-h5eee18b_0 
  tk                 pkgs/main/linux-64::tk-8.6.14-h39e8969_0 
  tzdata             pkgs/main/noarch::tzdata-2025a-h04d1e81_0 
  wheel              pkgs/main/linux-64::wheel-0.44.0-py310h06a4308_0 
  xz                 pkgs/main/linux-64::xz-5.4.6-h5eee18b_1 
  zlib               pkgs/main/linux-64::zlib-1.2.13-h5eee18b_1 


Proceed ([y]/n)? y


Downloading and Extracting Packages:

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate new-xla
#
# To deactivate an active environment, use
#
#     $ conda deactivate

$ conda activate new-xla
(new-xla) $ pip install --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/cpu
Looking in indexes: https://download.pytorch.org/whl/nightly/cpu
Collecting torch
  Downloading https://download.pytorch.org/whl/nightly/cpu/torch-2.7.0.dev20250127%2Bcpu-cp310-cp310-manylinux_2_28_x86_64.whl.metadata (26 kB)
Collecting torchvision
  Downloading https://download.pytorch.org/whl/nightly/cpu/torchvision-0.22.0.dev20250127%2Bcpu-cp310-cp310-linux_x86_64.whl.metadata (6.2 kB)
Collecting filelock (from torch)
  Using cached https://download.pytorch.org/whl/nightly/filelock-3.16.1-py3-none-any.whl (16 kB)
Collecting typing-extensions>=4.10.0 (from torch)
  Using cached https://download.pytorch.org/whl/nightly/typing_extensions-4.12.2-py3-none-any.whl (37 kB)
Collecting sympy==1.13.1 (from torch)
  Using cached https://download.pytorch.org/whl/nightly/sympy-1.13.1-py3-none-any.whl (6.2 MB)
Collecting networkx (from torch)
  Using cached https://download.pytorch.org/whl/nightly/networkx-3.4.2-py3-none-any.whl (1.7 MB)
Collecting jinja2 (from torch)
  Using cached https://download.pytorch.org/whl/nightly/jinja2-3.1.4-py3-none-any.whl (133 kB)
Collecting fsspec (from torch)
  Using cached https://download.pytorch.org/whl/nightly/fsspec-2024.10.0-py3-none-any.whl (179 kB)
Collecting mpmath<1.4,>=1.1.0 (from sympy==1.13.1->torch)
  Using cached https://download.pytorch.org/whl/nightly/mpmath-1.3.0-py3-none-any.whl (536 kB)
Collecting numpy (from torchvision)
  Using cached https://download.pytorch.org/whl/nightly/numpy-2.1.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (16.3 MB)
Collecting pillow!=8.3.*,>=5.3.0 (from torchvision)
  Using cached https://download.pytorch.org/whl/nightly/pillow-11.0.0-cp310-cp310-manylinux_2_28_x86_64.whl (4.4 MB)
Collecting MarkupSafe>=2.0 (from jinja2->torch)
  Using cached https://download.pytorch.org/whl/nightly/MarkupSafe-2.1.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (25 kB)
Downloading https://download.pytorch.org/whl/nightly/cpu/torch-2.7.0.dev20250127%2Bcpu-cp310-cp310-manylinux_2_28_x86_64.whl (175.4 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 175.4/175.4 MB 142.6 MB/s eta 0:00:00
Downloading https://download.pytorch.org/whl/nightly/cpu/torchvision-0.22.0.dev20250127%2Bcpu-cp310-cp310-linux_x86_64.whl (1.9 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.9/1.9 MB 133.2 MB/s eta 0:00:00
Installing collected packages: mpmath, typing-extensions, sympy, pillow, numpy, networkx, MarkupSafe, fsspec, filelock, jinja2, torch, torchvision
Successfully installed MarkupSafe-2.1.5 filelock-3.16.1 fsspec-2024.10.0 jinja2-3.1.4 mpmath-1.3.0 networkx-3.4.2 numpy-2.1.2 pillow-11.0.0 sympy-1.13.1 torch-2.7.0.dev20250127+cpu torchvision-0.22.0.dev20250127+cpu typing-extensions-4.12.2
(new-xla) $ pip install 'torch_xla[tpu] @ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl' \
  -f https://storage.googleapis.com/libtpu-releases/index.html \
  -f https://storage.googleapis.com/libtpu-wheels/index.html
Looking in links: https://storage.googleapis.com/libtpu-releases/index.html, https://storage.googleapis.com/libtpu-wheels/index.html
Collecting torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl (from torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Downloading https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl (94.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 94.6/94.6 MB 146.6 MB/s eta 0:00:00
Collecting absl-py>=1.0.0 (from torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached absl_py-2.1.0-py3-none-any.whl.metadata (2.3 kB)
Requirement already satisfied: numpy in ./miniconda3/envs/new-xla/lib/python3.10/site-packages (from torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl) (2.1.2)
Collecting pyyaml (from torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached PyYAML-6.0.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (2.1 kB)
Collecting requests (from torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached requests-2.32.3-py3-none-any.whl.metadata (4.6 kB)
Collecting libtpu==0.0.8.dev20250113 (from torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached https://storage.googleapis.com/libtpu-nightly-releases/wheels/libtpu/libtpu-0.0.8.dev20250113%2Bnightly-py3-none-linux_x86_64.whl (132.5 MB)
Collecting tpu-info (from torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached tpu_info-0.2.0-py3-none-any.whl.metadata (3.7 kB)
Collecting libtpu-nightly==0.1.dev20241010+nightly.cleanup (from torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached https://storage.googleapis.com/libtpu-nightly-releases/wheels/libtpu-nightly/libtpu_nightly-0.1.dev20241010%2Bnightly.cleanup-py3-none-any.whl (1.3 kB)
Collecting charset-normalizer<4,>=2 (from requests->torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached charset_normalizer-3.4.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (35 kB)
Collecting idna<4,>=2.5 (from requests->torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached idna-3.10-py3-none-any.whl.metadata (10 kB)
Collecting urllib3<3,>=1.21.1 (from requests->torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached urllib3-2.3.0-py3-none-any.whl.metadata (6.5 kB)
Collecting certifi>=2017.4.17 (from requests->torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached certifi-2024.12.14-py3-none-any.whl.metadata (2.3 kB)
Collecting grpcio>=1.65.5 (from tpu-info->torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached grpcio-1.70.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.9 kB)
Collecting protobuf (from tpu-info->torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached protobuf-5.29.3-cp38-abi3-manylinux2014_x86_64.whl.metadata (592 bytes)
Collecting rich (from tpu-info->torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached rich-13.9.4-py3-none-any.whl.metadata (18 kB)
Collecting markdown-it-py>=2.2.0 (from rich->tpu-info->torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached markdown_it_py-3.0.0-py3-none-any.whl.metadata (6.9 kB)
Collecting pygments<3.0.0,>=2.13.0 (from rich->tpu-info->torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached pygments-2.19.1-py3-none-any.whl.metadata (2.5 kB)
Requirement already satisfied: typing-extensions<5.0,>=4.0.0 in ./miniconda3/envs/new-xla/lib/python3.10/site-packages (from rich->tpu-info->torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl) (4.12.2)
Collecting mdurl~=0.1 (from markdown-it-py>=2.2.0->rich->tpu-info->torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Using cached mdurl-0.1.2-py3-none-any.whl.metadata (1.6 kB)
Using cached absl_py-2.1.0-py3-none-any.whl (133 kB)
Using cached PyYAML-6.0.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (751 kB)
Using cached requests-2.32.3-py3-none-any.whl (64 kB)
Using cached tpu_info-0.2.0-py3-none-any.whl (14 kB)
Using cached certifi-2024.12.14-py3-none-any.whl (164 kB)
Using cached charset_normalizer-3.4.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (146 kB)
Using cached grpcio-1.70.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.9 MB)
Using cached idna-3.10-py3-none-any.whl (70 kB)
Using cached urllib3-2.3.0-py3-none-any.whl (128 kB)
Using cached protobuf-5.29.3-cp38-abi3-manylinux2014_x86_64.whl (319 kB)
Using cached rich-13.9.4-py3-none-any.whl (242 kB)
Using cached markdown_it_py-3.0.0-py3-none-any.whl (87 kB)
Using cached pygments-2.19.1-py3-none-any.whl (1.2 MB)
Using cached mdurl-0.1.2-py3-none-any.whl (10.0 kB)
Installing collected packages: libtpu-nightly, libtpu, urllib3, pyyaml, pygments, protobuf, mdurl, idna, grpcio, charset-normalizer, certifi, absl-py, requests, markdown-it-py, torch_xla, rich, tpu-info
Successfully installed absl-py-2.1.0 certifi-2024.12.14 charset-normalizer-3.4.1 grpcio-1.70.0 idna-3.10 libtpu-0.0.8.dev20250113+nightly libtpu-nightly-0.1.dev20241010+nightly.cleanup markdown-it-py-3.0.0 mdurl-0.1.2 protobuf-5.29.3 pygments-2.19.1 pyyaml-6.0.2 requests-2.32.3 rich-13.9.4 torch_xla-2.7.0+git8b24140 tpu-info-0.2.0 urllib3-2.3.0
(new-xla) $ pip install 'torch_xla[pallas]' \
  -f https://storage.googleapis.com/jax-releases/jax_nightly_releases.html \
  -f https://storage.googleapis.com/jax-releases/jaxlib_nightly_releases.html
Looking in links: https://storage.googleapis.com/jax-releases/jax_nightly_releases.html, https://storage.googleapis.com/jax-releases/jaxlib_nightly_releases.html
Requirement already satisfied: torch_xla[pallas] in ./miniconda3/envs/new-xla/lib/python3.10/site-packages (2.7.0+git8b24140)
Requirement already satisfied: absl-py>=1.0.0 in ./miniconda3/envs/new-xla/lib/python3.10/site-packages (from torch_xla[pallas]) (2.1.0)
Requirement already satisfied: numpy in ./miniconda3/envs/new-xla/lib/python3.10/site-packages (from torch_xla[pallas]) (2.1.2)
Requirement already satisfied: pyyaml in ./miniconda3/envs/new-xla/lib/python3.10/site-packages (from torch_xla[pallas]) (6.0.2)
Requirement already satisfied: requests in ./miniconda3/envs/new-xla/lib/python3.10/site-packages (from torch_xla[pallas]) (2.32.3)
Collecting jaxlib==0.4.39.dev20250113 (from torch_xla[pallas])
  Using cached https://storage.googleapis.com/jax-releases/nightly/nocuda/jaxlib-0.4.39.dev20250113-cp310-cp310-manylinux2014_x86_64.whl (101.5 MB)
Collecting jax==0.4.39.dev20250113 (from torch_xla[pallas])
  Using cached https://storage.googleapis.com/jax-releases/nightly/jax/jax-0.4.39.dev20250113-py3-none-any.whl (2.3 MB)
Collecting ml_dtypes>=0.4.0 (from jax==0.4.39.dev20250113->torch_xla[pallas])
  Using cached ml_dtypes-0.5.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (21 kB)
Collecting opt_einsum (from jax==0.4.39.dev20250113->torch_xla[pallas])
  Using cached opt_einsum-3.4.0-py3-none-any.whl.metadata (6.3 kB)
Collecting scipy>=1.11.1 (from jax==0.4.39.dev20250113->torch_xla[pallas])
  Using cached scipy-1.15.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (61 kB)
Requirement already satisfied: charset-normalizer<4,>=2 in ./miniconda3/envs/new-xla/lib/python3.10/site-packages (from requests->torch_xla[pallas]) (3.4.1)
Requirement already satisfied: idna<4,>=2.5 in ./miniconda3/envs/new-xla/lib/python3.10/site-packages (from requests->torch_xla[pallas]) (3.10)
Requirement already satisfied: urllib3<3,>=1.21.1 in ./miniconda3/envs/new-xla/lib/python3.10/site-packages (from requests->torch_xla[pallas]) (2.3.0)
Requirement already satisfied: certifi>=2017.4.17 in ./miniconda3/envs/new-xla/lib/python3.10/site-packages (from requests->torch_xla[pallas]) (2024.12.14)
Using cached ml_dtypes-0.5.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.7 MB)
Using cached scipy-1.15.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (40.6 MB)
Using cached opt_einsum-3.4.0-py3-none-any.whl (71 kB)
Installing collected packages: scipy, opt_einsum, ml_dtypes, jaxlib, jax
Successfully installed jax-0.4.39.dev20250113 jaxlib-0.4.39.dev20250113 ml_dtypes-0.5.1 opt_einsum-3.4.0 scipy-1.15.1
(new-xla) $ python
Python 3.10.16 (main, Dec 11 2024, 16:24:50) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch_xla
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/hosseins/miniconda3/envs/new-xla/lib/python3.10/site-packages/torch_xla/__init__.py", line 20, in <module>
    import _XLAC
ImportError: /home/hosseins/miniconda3/envs/new-xla/lib/python3.10/site-packages/_XLAC.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN5torch6dynamo8autograd18get_input_metadataERKSt6vectorINS_8autograd4EdgeESaIS4_EE
>>> 


miladm commented Jan 27, 2025

Ideally, this build combination should work as expected, @hosseinsarshar.

I see you were able to unblock yourself by matching an older torch wheel with the latest torch_xla. I am reopening this so we can investigate further. cc @lsy323 @ysiraichi

@miladm reopened this on Jan 27, 2025

hosseinsarshar commented Feb 3, 2025

@miladm / @lsy323, I wonder if you have had a chance to review this issue. I'm asking because the currently pinned torch version (2.6.0.dev20241216+cpu) is now out of date. Thanks

@ysiraichi (Collaborator) commented:

I couldn't reproduce this issue. I ran the same commands (using micromamba instead), and I was able to import torch_xla without problems.

@ysiraichi (Collaborator) commented:

~
➜ micromamba create -n test python=3.10
conda-forge/noarch                                  18.9MB @  19.5MB/s  1.0s
conda-forge/linux-64                                41.7MB @  15.0MB/s  2.8s


Transaction
  ...
  Updating specs:

   - python=3.10


  Package               Version  Build               Channel           Size
─────────────────────────────────────────────────────────────────────────────
  Install:
─────────────────────────────────────────────────────────────────────────────

  + _libgcc_mutex           0.1  conda_forge         conda-forge     Cached
  + _openmp_mutex           4.5  2_gnu               conda-forge     Cached
  + bzip2                 1.0.8  h4bc722e_7          conda-forge     Cached
  + ca-certificates   2025.1.31  hbcca054_0          conda-forge     Cached
  + ld_impl_linux-64       2.43  h712a8e2_2          conda-forge     Cached
  + libffi                3.4.2  h7f98852_5          conda-forge     Cached
  + libgcc               14.2.0  h77fa898_1          conda-forge     Cached
  + libgcc-ng            14.2.0  h69a702a_1          conda-forge     Cached
  + libgomp              14.2.0  h77fa898_1          conda-forge     Cached
  + liblzma               5.6.4  hb9d3cd8_0          conda-forge     Cached
  + libnsl                2.0.1  hd590300_0          conda-forge     Cached
  + libsqlite            3.48.0  hee588c1_1          conda-forge     Cached
  + libuuid              2.38.1  h0b41bf4_0          conda-forge     Cached
  + libxcrypt            4.4.36  hd590300_1          conda-forge     Cached
  + libzlib               1.3.1  hb9d3cd8_2          conda-forge     Cached
  + ncurses                 6.5  h2d0b736_3          conda-forge     Cached
  + openssl               3.4.0  h7b32b05_1          conda-forge     Cached
  + pip                    25.0  pyh8b19718_0        conda-forge     Cached
  + python              3.10.16  he725a3c_1_cpython  conda-forge     Cached
  + readline                8.2  h8228510_1          conda-forge     Cached
  + setuptools           75.8.0  pyhff2d567_0        conda-forge     Cached
  + tk                   8.6.13  noxft_h4845f30_101  conda-forge     Cached
  + tzdata                2025a  h78e105d_0          conda-forge     Cached
  + wheel                0.45.1  pyhd8ed1ab_1        conda-forge     Cached

  Summary:

  Install: 24 packages

  Total download: 0 B

─────────────────────────────────────────────────────────────────────────────


Confirm changes: [Y/n]
...

~
➜ micromamba activate test

~ via 🅒 test
➜ pip install --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/cpu
Looking in indexes: https://download.pytorch.org/whl/nightly/cpu
Collecting torch
  Downloading https://download.pytorch.org/whl/nightly/cpu/torch-2.7.0.dev20250206%2Bcpu-cp310-cp310-manylinux_2_28_x86_64.whl.metadata (26 kB)
Collecting torchvision
  Downloading https://download.pytorch.org/whl/nightly/cpu/torchvision-0.22.0.dev20250206%2Bcpu-cp310-cp310-linux_x86_64.whl.metadata (6.2 kB)
...

~ via 🅒 test
➜ pip install 'torch_xla[tpu] @ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl' \
  -f https://storage.googleapis.com/libtpu-releases/index.html \
  -f https://storage.googleapis.com/libtpu-wheels/index.html
Looking in links: https://storage.googleapis.com/libtpu-releases/index.html, https://storage.googleapis.com/libtpu-wheels/index.html
Collecting torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl (from torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Downloading https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl (95.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 95.0/95.0 MB 19.7 MB/s eta 0:00:00
Collecting libtpu==0.0.9.dev20250131 (from torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Downloading https://storage.googleapis.com/libtpu-nightly-releases/wheels/libtpu/libtpu-0.0.9.dev20250131%2Bnightly-py3-none-linux_x86_64.whl (133.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 133.1/133.1 MB 28.7 MB/s eta 0:00:00
Collecting tpu-info (from torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Downloading tpu_info-0.2.0-py3-none-any.whl.metadata (3.7 kB)
Collecting libtpu-nightly==0.1.dev20241010+nightly.cleanup (from torch_xla@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl->torch_xla[tpu]@ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.7.0.dev+cxx11-cp310-cp310-linux_x86_64.whl)
  Downloading https://storage.googleapis.com/libtpu-nightly-releases/wheels/libtpu-nightly/libtpu_nightly-0.1.dev20241010%2Bnightly.cleanup-py3-none-any.whl (1.3 kB)
...

~ via 🅒 test
➜ pip install 'torch_xla[pallas]' \
  -f https://storage.googleapis.com/jax-releases/jax_nightly_releases.html \
  -f https://storage.googleapis.com/jax-releases/jaxlib_nightly_releases.html
Looking in links: https://storage.googleapis.com/jax-releases/jax_nightly_releases.html, https://storage.googleapis.com/jax-releases/jaxlib_nightly_releases.html
Collecting jaxlib==0.5.1.dev20250131 (from torch_xla[pallas])
  Downloading https://storage.googleapis.com/jax-releases/nightly/nocuda/jaxlib-0.5.1.dev20250131-cp310-cp310-manylinux2014_x86_64.whl (103.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 103.0/103.0 MB 22.5 MB/s eta 0:00:00
Collecting jax==0.5.1.dev20250131 (from torch_xla[pallas])
  Downloading https://storage.googleapis.com/jax-releases/nightly/jax/jax-0.5.1.dev20250131-py3-none-any.whl (2.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.3/2.3 MB 5.0 MB/s eta 0:00:00
Collecting ml_dtypes>=0.4.0 (from jax==0.5.1.dev20250131->torch_xla[pallas])
  Downloading ml_dtypes-0.5.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (21 kB)
...

~ via 🅒 test took 15s509ms
➜ python
Python 3.10.16 | packaged by conda-forge | (main, Dec  5 2024, 14:16:10) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch_xla
WARNING:root:Defaulting to PJRT_DEVICE=CPU

@hosseinsarshar (Contributor, Author) commented:

Thanks @ysiraichi for testing this.

Looking at your logs, I see WARNING:root:Defaulting to PJRT_DEVICE=CPU, which means it can't find the TPU device by default; that shouldn't be the case (I mentioned this in my first comment as well). If you set PJRT_DEVICE=TPU explicitly, it will most probably break; and if it doesn't, I expect you still won't get the full list of devices. Please try listing them with torch_xla.devices().
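For completeness, the check I have in mind as a one-liner (this assumes a TPU VM with libtpu installed, as in my logs above):

```shell
# Force the TPU runtime and list the visible devices; on a healthy TPU VM this
# should print one xla device per chip instead of falling back to CPU
PJRT_DEVICE=TPU python -c "import torch_xla; print(torch_xla.devices())"
```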
