Using Unimol Plus, validation during training fails with RuntimeError: mat1 and mat2 must have the same dtype, but got Float and Half #315

Open
Carpdong opened this issue Jan 22, 2025 · 0 comments

Comments

@Carpdong

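When using Unimol Plus, evaluation on the validation set during training fails with RuntimeError: mat1 and mat2 must have the same dtype, but got Float and Half.
Data preparation followed the instructions in the git repository. Inference.sh is the original script; only the paths were changed.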

2025-01-22 18:31:46 | INFO | unicore_cli.train | begin validation on "valid" subset
2025-01-22 18:31:46 | INFO | unicore.tasks.unicore_task | get EpochBatchIterator for epoch 1
Traceback (most recent call last):
File "/root/miniconda3/envs/kpi_lyd/bin/unicore-train", line 8, in <module>
sys.exit(cli_main())
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/site-packages/unicore_cli/train.py", line 418, in cli_main
distributed_utils.call_main(args, main)
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/site-packages/unicore/distributed/utils.py", line 186, in call_main
distributed_main(int(os.environ["LOCAL_RANK"]), main, args, kwargs)
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/site-packages/unicore/distributed/utils.py", line 160, in distributed_main
main(args, **kwargs)
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/site-packages/unicore_cli/train.py", line 125, in main
valid_losses, should_stop = train(
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/contextlib.py", line 75, in inner
return func(*args, **kwds)
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/site-packages/unicore_cli/train.py", line 233, in train
valid_losses, should_stop = validate_and_save(
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/site-packages/unicore_cli/train.py", line 314, in validate_and_save
valid_losses = validate(args, trainer, task, epoch_itr, valid_subsets)
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/site-packages/unicore_cli/train.py", line 381, in validate
inner_logging_outputs = trainer.valid_step(sample)
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/contextlib.py", line 75, in inner
return func(*args, **kwds)
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/site-packages/unicore/trainer.py", line 823, in valid_step
raise e
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/site-packages/unicore/trainer.py", line 807, in valid_step
_loss, sample_size, logging_output = self.task.valid_step(
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/site-packages/unicore/tasks/unicore_task.py", line 289, in valid_step
loss, sample_size, logging_output = loss(model, sample)
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/root/lyd/unimol_plus/unimol_plus/losses/unimol_plus.py", line 56, in forward
) = model(**sample)
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/site-packages/unicore/distributed/module_proxy_wrapper.py", line 56, in forward
return self.module(*args, **kwargs)
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 1519, in forward
else self._run_ddp_forward(*inputs, **kwargs)
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 1355, in _run_ddp_forward
return self.module(*inputs, **kwargs) # type: ignore[index]
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/root/lyd/unimol_plus/unimol_plus/models/unimol_plus_pcq.py", line 332, in forward
x, pair, pos = one_block(x, pair, pos)
File "/root/lyd/unimol_plus/unimol_plus/models/unimol_plus_pcq.py", line 312, in one_block
attn_bias_3d = self.se3_invariant_kernel(dist.detach(), pair_type)
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/root/lyd/unimol_plus/unimol_plus/models/layers.py", line 386, in forward
edge_feature = self.out_proj(edge_feature)
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/root/lyd/unimol_plus/unimol_plus/models/layers.py", line 433, in forward
x = self.layer1(x)
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/root/miniconda3/envs/kpi_lyd/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 114, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 must have the same dtype, but got Float and Half
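For readers hitting the same message: the traceback ends in `F.linear(input, self.weight, self.bias)`, and the error says the input (`mat1`) is Float while the weight (`mat2`) is Half. The snippet below is a minimal sketch of that mismatch and of one generic way to align the dtypes. It is not a patch for Uni-Mol+ and has not been tested against this repository. The helper name `cast_to_param_dtype`, the layer sizes, and the tensors are all hypothetical, and the root cause in the reporter's setup is not established by the issue.

```python
import torch
import torch.nn as nn


def cast_to_param_dtype(x: torch.Tensor, module: nn.Module) -> torch.Tensor:
    """Hypothetical helper: cast x to the dtype of the module's first parameter."""
    param_dtype = next(module.parameters()).dtype
    return x if x.dtype == param_dtype else x.to(param_dtype)


# Stand-in for a projection layer whose weights were converted to fp16.
layer = nn.Linear(4, 2).half()
# Stand-in for a feature tensor that was computed in fp32.
features = torch.randn(3, 4, dtype=torch.float32)

# Passing the fp32 tensor to the fp16 layer fails with a dtype error.
try:
    layer(features)
    mismatch_raised = False
except RuntimeError as err:
    mismatch_raised = True
    print("forward failed:", err)

# Casting the input to the parameter dtype removes the mismatch.
aligned = cast_to_param_dtype(features, layer)
print(features.dtype, "->", aligned.dtype)
```

In the traceback above, the failing call is `x = self.layer1(x)` at `unimol_plus/models/layers.py`, line 433, so a cast of this kind just before that call is one place to experiment. Whether that is the right fix, or whether the input should already have been Half at that point, is an open question that the issue does not answer.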
