Errors with incast, bcast lane #12

Open
csehydrogen opened this issue Nov 10, 2020 · 1 comment

@csehydrogen

I am trying to run an allreduce between two processes using UCG without MPI.
I was able to create the UCG context, the UCP worker, and the UCG group by exchanging the workers' addresses between the processes.
However, creating a ucg_coll_h with ucg_coll_allreduce_init produces a series of incast/bcast errors, followed by a double free and an abort:

[...] select.c:517  UCX  ERROR cannot add incast lane - reached limit (6)
[...] select.c:517  UCX  ERROR cannot add bcast lane - reached limit (6)
[...] ucg_plan.c:388  UCX  WARN  No transports with native broadcast support were found, falling back to P2P transports (slower)
[...] ucg_plan.c:380  UCX  WARN  No transports with native incast support were found, falling back to P2P transports (slower)
free(): double free detected in tcache 2
Aborted (core dumped)

The attached file is a minimal working example for reproducing the problem.
ucx_test.zip

# host1 and host2 are connected with 1G Ethernet and 100G InfiniBand
$ unzip ucx_test.zip; cd ucx_test
$ make
# on host1
$ ./ucg_test 2 0 0 host1 12345 # meaning: 2 processes in total, this process is rank 0, the root is rank 0 and listens at host1:12345
# on host2
$ ./ucg_test 2 1 0 host1 12345 # meaning: 2 processes in total, this process is rank 1, the root is rank 0 and listens at host1:12345
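
For readers without the attachment, here is a minimal sketch of the kind of out-of-band address exchange described above. It is not taken from ucx_test.zip; it assumes the worker address is an opaque byte blob sent over an already connected stream socket (for example a TCP connection to the root at host1:12345) with a 4-byte length prefix. The names oob_send_addr and oob_recv_addr are hypothetical helpers, not part of UCX or UCG.

```c
#include <assert.h>
#include <arpa/inet.h>   /* htonl, ntohl */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Write exactly len bytes to fd; returns 0 on success, -1 on error. */
static int send_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0) return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read exactly len bytes from fd; returns 0 on success, -1 on error or EOF. */
static int recv_all(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0) return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: send an opaque address blob as a 4-byte
 * big-endian length followed by the payload. */
int oob_send_addr(int fd, const void *addr, size_t len)
{
    assert(len <= UINT32_MAX);
    uint32_t wire_len = htonl((uint32_t)len);
    if (send_all(fd, &wire_len, sizeof(wire_len)) != 0) return -1;
    return send_all(fd, addr, len);
}

/* Hypothetical helper: receive one blob. On success *addr_out points to
 * malloc'd memory that the caller must free exactly once. */
int oob_recv_addr(int fd, void **addr_out, size_t *len_out)
{
    uint32_t wire_len;
    if (recv_all(fd, &wire_len, sizeof(wire_len)) != 0) return -1;
    size_t len = ntohl(wire_len);
    void *buf = malloc(len ? len : 1);
    if (buf == NULL) return -1;
    if (recv_all(fd, buf, len) != 0) { free(buf); return -1; }
    *addr_out = buf;
    *len_out = len;
    return 0;
}
```

In a UCX program the blob would presumably be the buffer and length returned by ucp_worker_get_address, with each non-root rank sending its address to the root and receiving the peers' addresses back. The sketch does not retry on EINTR and does not cap the received length, so treat it as an illustration only.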
@csehydrogen

It would be very helpful if someone could provide working examples of collective communications (gather, scatter, allreduce, ...).

shizhibao pushed a commit to shizhibao/xucg that referenced this issue Jan 16, 2021