Skip to content

Conversation

bosilca
Copy link
Member

@bosilca bosilca commented May 5, 2021

A possible fix for #8763

@bosilca bosilca requested review from jsquyres and ggouaillardet May 5, 2021 04:39
@bosilca bosilca force-pushed the fix/nb_leaks branch 3 times, most recently from a4caa6f to aa31cf8 Compare October 6, 2021 04:59
@open-mpi open-mpi deleted a comment from ibm-ompi Oct 6, 2021
@open-mpi open-mpi deleted a comment from ibm-ompi Oct 6, 2021
@open-mpi open-mpi deleted a comment from ibm-ompi Oct 6, 2021
@bosilca bosilca added this to the v5.0.0 milestone Oct 6, 2021
@open-mpi open-mpi deleted a comment from jjhursey Oct 6, 2021
@bosilca
Copy link
Member Author

bosilca commented Oct 6, 2021

We need to converge on this and once we reached a consensus we need to apply the same logic to basically all the non-blocking Fortran functions. This includes:

  • ineighbor_alltoallw_f.c,
  • ineighbor_alltoallv_f.c,
  • neighbor_allgatherv_init_f.c,
  • neighbor_alltoallw_init_f.c,
  • neighbor_alltoallv_init_f.c,
  • ineighbor_allgatherv_f.c,
  • igatherv_f.c,
  • iallgatherv_f.c,
  • iscatterv_f.c,
  • ialltoallv_f.c,
  • ialltoallw_f.c,
  • ireduce_scatter_f.c,
  • reduce_scatter_init_f.c,
  • alltoallv_init_f.c,
  • alltoallw_init_f.c,
  • allgatherv_init_f.c,
  • gatherv_init_f.c,
  • scatterv_init_f.c.

Copy link
Contributor

@devreal devreal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, will approve once the other functions are in 👍

@open-mpi open-mpi deleted a comment from ibm-ompi Oct 6, 2021
@open-mpi open-mpi deleted a comment from ibm-ompi Oct 6, 2021
@open-mpi open-mpi deleted a comment from ibm-ompi Oct 6, 2021
Copy link
Contributor

@devreal devreal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A small formatting nitpick, looks good otherwise.

@bosilca bosilca force-pushed the fix/nb_leaks branch 2 times, most recently from 7507a31 to 301c45e Compare October 7, 2021 19:18
@gpaulsen
Copy link
Member

gpaulsen commented Nov 1, 2021

Looks like Mellanox build failed with two compile errors:

                        ^
coll_hcoll_rte.c: In function 'get_coll_handle':
coll_hcoll_rte.c:355:19: error: 'struct <anonymous>' has no member named 'objs'
     ompi_req->data.objs.objs[0]          = NULL;
                   ^
coll_hcoll_rte.c:356:19: error: 'struct <anonymous>' has no member named 'objs'
     ompi_req->data.objs.objs[1]          = NULL;
                   ^

@open-mpi open-mpi deleted a comment from jsquyres Nov 16, 2021
@open-mpi open-mpi deleted a comment from azure-pipelines bot Nov 16, 2021
@open-mpi open-mpi deleted a comment from janjust Nov 16, 2021
@open-mpi open-mpi deleted a comment from azure-pipelines bot Nov 16, 2021
@bosilca
Copy link
Member Author

bosilca commented Nov 16, 2021

I don't have hcoll on mu clusters, but I think I pushed a fix for this compile error. The PR is ready for final review and them merge. @ggouaillardet and @devreal please one last review, maybe we can make it in the pending releases.

@gpaulsen gpaulsen requested a review from devreal November 30, 2021 19:42
@gpaulsen
Copy link
Member

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@bosilca bosilca force-pushed the fix/nb_leaks branch 2 times, most recently from bc791a2 to 301c45e Compare December 28, 2021 19:27
If Fortran and C integers have different sizes we allocate the storage
for the C objects in the MPI API. For nonblocking functions we need to
release these temporary arrays upon completion of the request in order
to avoid memory leaks.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
@gpaulsen
Copy link
Member

@ggouaillardet @devreal can you please re-review?

@gpaulsen
Copy link
Member

gpaulsen commented Feb 9, 2022

@devreal @ggouaillardet ?

@gpaulsen
Copy link
Member

gpaulsen commented Feb 9, 2022

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Signed-off-by: Austen Lauria <awlauria@us.ibm.com>
(cherry picked from commit 4a1c29a)
@awlauria
Copy link
Contributor

@bosilca I fixed the compile error. Does that look right?

@janjust ^^ this touches hcoll.

@awlauria
Copy link
Contributor

If that's right can you squash and rebase on current main?

@awlauria
Copy link
Contributor

awlauria commented Apr 18, 2022

Closing for #10288

It's a squash of the two commits here and a rebase on top of main.

@awlauria awlauria closed this Apr 18, 2022
# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants