Ucc integration #591
@kaiyingshan I'm getting the following seg fault. I can't figure out where it is coming from or why. I'm using the latest UCC master branch.
I don't know if this is the same issue that I experienced, which was due to the UCC team size. When I run the example this way, it gives the warning "ucc_team.c:114 UCC WARN minimal size of UCC team is 2, provided 1", and it is caused by
std::cout << std::endl;

/* Cleanup UCC */
UCC_CHECK(ucc_team_destroy(team));
Team destroy is a nonblocking operation and might return UCC_INPROGRESS, so it should be something like this:
ucc_status_t status;
// Keep calling destroy until it stops reporting UCC_INPROGRESS.
while (UCC_INPROGRESS == (status = ucc_team_destroy(team.team))) {
}
// Check the final status once, after the loop has finished.
if (UCC_OK != status) {
  std::cerr << "ucc_team_destroy failed\n";
}
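The polling pattern in this comment can be tried out without a UCC installation. The sketch below is only an illustration under stated assumptions: `Status`, `poll_until_done`, and `make_fake_destroy` are hypothetical stand-ins and not part of the UCC API. The fake operation reports "in progress" a fixed number of times before finishing, which mimics how a nonblocking destroy might behave. The error check happens once, after the loop, because inside the loop the status can only be "in progress".

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for ucc_status_t; only the values needed here.
enum class Status { Ok, InProgress, Error };

// Call a nonblocking operation until it stops reporting InProgress.
// Returns the final status; optionally reports how many calls were made.
Status poll_until_done(const std::function<Status()>& op, int* calls = nullptr) {
  assert(op && "op must be callable");
  Status status;
  int n = 0;
  do {
    status = op();
    ++n;
  } while (status == Status::InProgress);
  if (calls != nullptr) {
    *calls = n;
  }
  return status;
}

// Hypothetical fake "destroy": reports InProgress `pending` times,
// then returns `final_status` on every later call.
std::function<Status()> make_fake_destroy(int pending, Status final_status) {
  return [pending, final_status]() mutable {
    if (pending > 0) {
      --pending;
      return Status::InProgress;
    }
    return final_status;
  };
}
```

With real UCC code, the callable passed to `poll_until_done` would presumably wrap the `ucc_team_destroy` call and map its return value; that mapping is left out here because it depends on the surrounding code.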
RETURN_CYLON_STATUS_IF_UCC_FAILED(ucc_context_config_read(lib, nullptr, &ctx_config));
RETURN_CYLON_STATUS_IF_UCC_FAILED(ucc_context_create(lib, &ctx_params, ctx_config, &uccContext));
while (UCC_OK != ucc_context_progress(uccContext)) {
ucc_context_create is blocking, so there is no need to call ucc_context_progress.
Recently we added support for team size 1 (openucx/ucc#511).
…into ucc-integration # Conflicts: # cpp/src/cylon/net/ucx/ucx_communicator.cpp
…into ucc-integration
…ation # Conflicts: # cpp/src/cylon/net/ucx/ucx_communicator.cpp
@Sergei-Lebedev I'm still getting the following error with team size 1.
@kaiyingshan I reviewed your code and made some changes myself in commit 9a7e9aa. Could you please check that?
@kaiyingshan thank you for doing this