Error when trying to train #5

Closed
jpsml opened this issue Jan 14, 2022 · 6 comments

Comments

@jpsml

jpsml commented Jan 14, 2022

I am getting the following error when I try to run training. How should I proceed to solve it?

(mvp) jpsml@jpsml-ubuntu:~/mvp$ python -m torch.distributed.launch --nproc_per_node=8 --use_env run/train_3d.py --cfg configs/campus/mvp_campus.yaml


Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.


Traceback (most recent call last):
  File "run/train_3d.py", line 34, in <module>
    import dataset
  File "/home/jpsml/mvp/run/../lib/dataset/__init__.py", line 20, in <module>
    from dataset.h36m import H36M as h36m
  File "/home/jpsml/mvp/run/../lib/dataset/h36m.py", line 30, in <module>
    from lib.utils.cameras_cpu import camera_to_world_frame, project_pose
ModuleNotFoundError: No module named 'lib'
[The same traceback is printed seven more times, once for each of the remaining worker processes.]
Traceback (most recent call last):
  File "/home/jpsml/anaconda3/envs/mvp/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/jpsml/anaconda3/envs/mvp/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/jpsml/anaconda3/envs/mvp/lib/python3.6/site-packages/torch/distributed/launch.py", line 261, in <module>
    main()
  File "/home/jpsml/anaconda3/envs/mvp/lib/python3.6/site-packages/torch/distributed/launch.py", line 257, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/home/jpsml/anaconda3/envs/mvp/bin/python', '-u', 'run/train_3d.py', '--cfg', 'configs/campus/mvp_campus.yaml']' returned non-zero exit status 1.

@twangnh
Collaborator

twangnh commented Jan 17, 2022

please add lib to the PYTHONPATH

@jpsml
Author

jpsml commented Jan 17, 2022

thanks

@Taylorminer

@jpsml I have the same problem. I tried to add lib to the PYTHONPATH in two ways: by adding "import sys" and "sys.path.append('/media/chen-group/9400EADF00EAC778/cy/mvp-main/lib/')" to validate_3d.py, and by creating a .pth file in the Python site-packages directory. Both attempts failed. Can you tell me how to fix it?

@jpsml
Author

jpsml commented Feb 8, 2022

run the following in the terminal before running the training:

export PYTHONPATH="${PYTHONPATH}:/home/jpsml/mvp/lib"

in your case you need to replace "/home/jpsml/mvp" with the local path where mvp is located

@Taylorminer

> run the following in the terminal before running the training:
>
> export PYTHONPATH="${PYTHONPATH}:/home/jpsml/mvp/lib"
>
> in your case you need to replace "home/jpsml/mvp" by the local path where mvp is located

I ran that line in the terminal, and print(sys.path) shows that the lib path is in the Python path. But I still get the error:
ModuleNotFoundError: No module named 'lib'
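
A quick way to check this kind of situation is to list which search-path entries actually contain a top-level "lib" directly inside them. The sketch below is only an illustration: `entries_providing` is a hypothetical helper, not part of mvp. It relies on the fact that `from lib.utils... import ...` needs the directory that contains `lib` on the path, and having `lib` itself on the path does not satisfy that.

```python
import os
import sys


def entries_providing(package, search_path=None):
    """Return the search-path entries that directly contain `package`.

    Hypothetical diagnostic helper: an entry counts only if it holds a
    directory named `package` or a file named `package`.py.
    """
    if search_path is None:
        search_path = sys.path
    hits = []
    for entry in search_path:
        base = entry or os.getcwd()  # an empty entry means the current directory
        as_dir = os.path.join(base, package)
        as_file = os.path.join(base, package + ".py")
        if os.path.isdir(as_dir) or os.path.isfile(as_file):
            hits.append(entry)
    return hits


# "lib" is the top-level name in `from lib.utils.cameras_cpu import ...`.
# An empty list here means no entry on sys.path can resolve that import.
print(entries_providing("lib"))
```

If the list is empty even though a path ending in "/lib" shows up in sys.path, the entry is probably one level too deep.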

@jpsml
Author

jpsml commented Feb 10, 2022

sorry, I believe the correct command is the following:

export PYTHONPATH="${PYTHONPATH}:/home/jpsml/mvp"
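
For anyone who prefers to set the path inside the script instead of exporting PYTHONPATH, a minimal sketch follows. It assumes the script sits one level below the repository root (as run/train_3d.py does in the traceback above), and `add_repo_root` is a hypothetical helper name, not something that exists in mvp.

```python
import os
import sys


def add_repo_root(script_file):
    """Put the parent of the script's directory at the front of sys.path.

    Assumption: the script lives in <repo>/run/, so the parent of its
    directory is the folder that contains lib/.
    """
    script_dir = os.path.dirname(os.path.abspath(script_file))
    repo_root = os.path.dirname(script_dir)
    if repo_root not in sys.path:
        sys.path.insert(0, repo_root)
    return repo_root


# In run/train_3d.py or run/validate_3d.py this would be called as
# add_repo_root(__file__), before the first `import dataset`.
```

Note that this appends the repository root, not the lib folder, which matches the corrected export command above.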
