Hello,

I have started to develop a distributed application and I am trying to use MPI dynamic processes for data exchange. The problem is that whenever I run the program, I receive the same error.

I compiled version 4.0.2 from source with GCC 7. I first tried a machine running CentOS 7, then one running Ubuntu 18.04, and finally another running Ubuntu 19.04, all with the same result.

I have seen similar errors in other issues, but they were about different problems, for example https://github.com/open-mpi/ompi/issues/6916

The source code

Server:
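In outline, the server opens a port, publishes it through ompi-server, and blocks in MPI_Comm_accept. A minimal sketch (the service name "server", the variable names, and the exact print strings are illustrative, not the literal listing):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    char port[MPI_MAX_PORT_NAME];
    MPI_Comm client;

    MPI_Init(&argc, &argv);

    /* Open a port and publish it via the ompi-server name service
       (the service name "server" is an assumption). */
    MPI_Open_port(MPI_INFO_NULL, port);
    MPI_Publish_name("server", MPI_INFO_NULL, port);
    printf("Server available at %s\n", port);

    /* Block until a client connects to the published port. */
    printf("Wait for client connection\n");
    MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &client);

    /* ... data exchange with the client would go here ... */

    MPI_Comm_disconnect(&client);
    MPI_Unpublish_name("server", MPI_INFO_NULL, port);
    MPI_Close_port(port);
    MPI_Finalize();
    return 0;
}

Client:

The client looks up the published name and connects to the returned port. A matching minimal sketch (again illustrative, not the literal listing):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    char port[MPI_MAX_PORT_NAME];
    MPI_Comm server;

    MPI_Init(&argc, &argv);

    /* Resolve the port that the server published via ompi-server. */
    printf("Looking for server\n");
    MPI_Lookup_name("server", MPI_INFO_NULL, port);
    printf("server found at %s\n", port);

    /* Connect to the server's port. */
    printf("Wait for server connection\n");
    MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &server);

    /* ... data exchange with the server would go here ... */

    MPI_Comm_disconnect(&server);
    MPI_Finalize();
    return 0;
}

Execution

Once the source code has been compiled, I open three terminals on the same machine and execute the following commands in order:

Terminal 1

ompi-server -r /tmp/ompi-server.txt --no-daemonize

Terminal 2

mpiexec --ompi-server file:/tmp/ompi-server.txt -np 1 ./server

Terminal 3

mpiexec --ompi-server file:/tmp/ompi-server.txt -np 1 ./client

Output

Terminal 1

continues to run without any errors.

Terminal 2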
Server available at 2953052161.0:2641729496
Wait for client connection
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.
Process 1 ([[45060,1],0]) is on host: Node1
Process 2 ([[45085,1],0]) is on host: unknown!
BTLs attempted: self tcp
Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
[CESAR-NITROV:00207] [[45060,1],0] ORTE_ERROR_LOG: Unreachable in file dpm/dpm.c at line 493
[CESAR-NITROV:00207] *** An error occurred in MPI_Comm_accept
[CESAR-NITROV:00207] *** reported by process [2953052161,0]
[CESAR-NITROV:00207] *** on communicator MPI_COMM_WORLD
[CESAR-NITROV:00207] *** MPI_ERR_INTERN: internal error
[CESAR-NITROV:00207] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[CESAR-NITROV:00207] *** and potentially your MPI job)
Terminal 3
Looking for server
server found at 2953052161.0:2641729496
Wait for server connection
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.
Process 1 ([[45085,1],0]) is on host: Node1
Process 2 ([[45060,1],0]) is on host: unknown!
BTLs attempted: self tcp
Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
[CESAR-NITROV:00214] [[45085,1],0] ORTE_ERROR_LOG: Unreachable in file dpm/dpm.c at line 493
[CESAR-NITROV:00214] *** An error occurred in MPI_Comm_connect
[CESAR-NITROV:00214] *** reported by process [2954690561,0]
[CESAR-NITROV:00214] *** on communicator MPI_COMM_WORLD
[CESAR-NITROV:00214] *** MPI_ERR_INTERN: internal error
[CESAR-NITROV:00214] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[CESAR-NITROV:00214] *** and potentially your MPI job)
I have tried all the solutions I could find and I have run out of ideas. I don't know whether this is a bug or whether I forgot to set some parameter or environment variable.
Is there anyone who can help me?
Thank you in advance.
As you can tell, I haven't had time to get around to this problem. Sadly, support for OMPI's runtime environment has declined a great deal over recent years as I've moved on to other things and am getting ready to retire - we just haven't been able to get other folks to pick it up the way we would all like.
I'd suggest trying MPICH as an alternative - I don't know if they can handle your use-case or not, but it is worth a try. If they can't, then your best bet is to downgrade your OMPI installation until you find one that works - you might try the v3 series, or even v2.
This might eventually get addressed, but it probably won't happen in a very timely fashion.
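For comparison, the same test under MPICH would replace ompi-server with Hydra's name server. A rough sketch, assuming MPICH's hydra_nameserver binary and the -nameserver option of its mpiexec (worth double-checking against the MPICH documentation):

hydra_nameserver &                            # start MPICH's name service (assumed binary name)
mpiexec -nameserver localhost -np 1 ./server  # same server binary, names resolved through Hydra
mpiexec -nameserver localhost -np 1 ./client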
@cesarpomar thanks for the exhaustive issue description.
I've recently run into it as well and would like to note that v4.0.1 works fine (with #6446).
So it's something introduced in v4.0.2.
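In case it helps anyone hitting this, a from-source downgrade to v4.0.1 follows the usual build steps; the download URL and install prefix below are assumptions to adapt to your environment:

wget https://download.open-mpi.org/release/open-mpi/v4.0/openmpi-4.0.1.tar.gz
tar xzf openmpi-4.0.1.tar.gz
cd openmpi-4.0.1
./configure --prefix=$HOME/opt/openmpi-4.0.1   # assumed install prefix
make -j4 && make install
export PATH=$HOME/opt/openmpi-4.0.1/bin:$PATH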