torch.randperm: RuntimeError: Expected a 'cuda' device type for generator but found 'cpu' #2

Closed
chschroeder opened this issue Oct 12, 2021 · 1 comment
Labels
bug Something isn't working

Comments

@chschroeder
Contributor

I just noticed that torch>=1.9.0 seems to have brought a change that leads to an exception where none was thrown before.

Setup
Small-text: 1.0.0a4
Torch: 1.9.1

Description
The following error occurs when executing examples/pytorch_multiclass_classification.py:

  File "/my/path/.pyenv/versions/3.8.2/lib/python3.8/runpy.py", line 193, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/my/path/.pyenv/versions/3.8.2/lib/python3.8/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/my/path/small-text/examples/pytorch_multiclass_classification.py", line 102, in <module>
    main()
  File "/my/path/small-text/examples/pytorch_multiclass_classification.py", line 52, in main
    active_learner.initialize_data(labeled_indices, y_initial)
  File "/my/path/small-text/small_text/active_learner.py", line 141, in initialize_data
    self._retrain(x_indices_validation=x_indices_validation)
  File "/my/path/small-text/small_text/active_learner.py", line 384, in _retrain
    self._clf.fit(x)
  File "/my/path/small-text/small_text/integrations/pytorch/classifiers/kimcnn.py", line 176, in fit
    return self._fit_main(sub_train, sub_valid)
  File "/my/path/small-text/small_text/integrations/pytorch/classifiers/kimcnn.py", line 198, in _fit_main
    res = self._train(sub_train, sub_valid, tmp_dir)
  File "/my/path/small-text/small_text/integrations/pytorch/classifiers/kimcnn.py", line 218, in _train
    train_loss, train_acc = self._train_func(sub_train)
  File "/my/path/small-text/small_text/integrations/pytorch/classifiers/kimcnn.py", line 263, in _train_func
    for i, (text, cls) in enumerate(train_iter):
  File "/my/path/.local/share/virtualenvs/myvenv-123/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/my/path/.local/share/virtualenvs/myvenv-123/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 560, in _next_data
    index = self._next_index()  # may raise StopIteration
  File "/my/path/.local/share/virtualenvs/myvenv-123/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 512, in _next_index
    return next(self._sampler_iter)  # may raise StopIteration
  File "/my/path/.local/share/virtualenvs/myvenv-123/lib/python3.8/site-packages/torch/utils/data/sampler.py", line 226, in __iter__
    for idx in self.sampler:
  File "/my/path/.local/share/virtualenvs/myvenv-123/lib/python3.8/site-packages/torch/utils/data/sampler.py", line 124, in __iter__
    yield from torch.randperm(n, generator=generator).tolist()
RuntimeError: Expected a 'cuda' device type for generator but found 'cpu'

Solution

I have yet to investigate what would be the optimal solution for this.

A workaround is to downgrade PyTorch (and torchtext):
pip install torch==1.8.1 torchtext==0.9.1
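
To check whether an environment falls into the version range reported here, a small guard like the following could be used. This is only a sketch: `parse_release` and `may_be_affected` are hypothetical helper names, and the `>= 1.9.0` threshold is taken from the observation in this report, not from an official changelog.

```python
import re
from importlib import metadata


def parse_release(version):
    """Return the leading numeric release of a version string as a tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component (e.g. "2.0") is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def may_be_affected(torch_version):
    # Assumption from this report: the error appears with torch>=1.9.0.
    return parse_release(torch_version) >= (1, 9, 0)


try:
    installed = metadata.version("torch")
except metadata.PackageNotFoundError:
    installed = None  # torch is not installed in this environment

if installed is not None and may_be_affected(installed):
    print(f"torch {installed} may hit the randperm generator error")
```

A matching version is necessary but not sufficient for the error; it only tells you that the downgrade above might be relevant.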

More information
pytorch/pytorch/issues/44714

@chschroeder chschroeder added the bug Something isn't working label Feb 6, 2022
@chschroeder chschroeder added this to the small-text-1.1.0 milestone Jun 15, 2022
@chschroeder
Contributor Author

Okay, I finally investigated this.

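The main problem is that we are using torch.set_default_tensor_type, which changes the default tensor type globally. This is bad practice and might even get deprecated in the future. With a CUDA default, torch.randperm in the DataLoader's sampler appears to be handed a CPU generator, which matches the error above.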

This mainly affects the ExpectedGradientLength query strategy and thereby the PyTorch multiclass example in the examples folder.

Solution:

  • Remove all calls to small_text.integrations.pytorch.utils.misc.default_tensor_type
  • Deprecate small_text.integrations.pytorch.utils.misc.default_tensor_type (and remove this in 2.0.0)
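
For illustration, the general pattern that replaces a global default tensor type is to keep data and the sampler's generator on the CPU and move the model and each batch to the target device explicitly. The following is a minimal sketch with toy data and a toy model, not the actual small-text change; all names in it are hypothetical.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Pick the device explicitly instead of changing the global default tensor type.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Toy data (hypothetical); tensors are created on the CPU by default.
features = torch.arange(20, dtype=torch.float32).reshape(10, 2)
labels = torch.arange(10) % 2
dataset = TensorDataset(features, labels)

# Shuffling happens on the CPU, so a CPU generator matches the sampler.
cpu_generator = torch.Generator().manual_seed(0)
loader = DataLoader(dataset, batch_size=4, shuffle=True, generator=cpu_generator)

# Only the model and the batches are moved to the target device.
model = torch.nn.Linear(2, 2).to(device)

seen = 0
for x, y in loader:
    x, y = x.to(device), y.to(device)
    logits = model(x)
    seen += logits.shape[0]
```

Because nothing here depends on the default tensor type, the same loop runs unchanged on CPU-only and CUDA machines.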

chschroeder added a commit that referenced this issue Jul 18, 2022
Signed-off-by: Christopher Schröder <chschroeder@users.noreply.github.com>