Fix OMP offload build memory leaks #720
I don't understand the changes in formatting... It is passing the tests as is, but was also passing the tests before.
Me neither. I just ran indent in the docker container and that's what it came up with...
What don't you understand? I see the old version had incorrect indents (not a multiple of 4), plus missing spaces after commas, etc. In Python the linters do weird line breaks sometimes, but I didn't see anything obvious here (but there are a lot of lines).
For me the odd thing is that the lint test was previously passing... There were some error messages when I ran indent; maybe it's related to that.
I have seen things like this when the local line length limit is different across machines, so one linter breaks the line and then it gets unwrapped in GitHub.
The main changes look to be the 'enter data' and 'target' clauses being added, plus the addition of the 'exit data' and deallocation of arrays?
Yes, you're correct about the main changes.
Was the enter/exit data missing for all the offload options or just for rocm? If for all, we probably should sweep the code and check for similar errors.
Let me check. If there are other errors they should be addressed in this PR.
Just to make sure I am clear, these are local work arrays that need to persist for a while, not the original arrays that should be allocated in allocate_typed or wherever?
They're local arrays that are just needed within the subroutines for the calculations being performed.
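For readers following along, here is a minimal sketch of the pattern being discussed: a local work array that is paired with 'enter data' on allocation and with 'exit data' plus a host free before returning. The function `scaled_sum` and its arguments are hypothetical and are not taken from bml; the actual changes in this PR may look different.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, not a bml routine: scales x into a local work
 * array and returns the sum of the scaled values. */
double
scaled_sum(
    const double *x,
    int n,
    double alpha)
{
    assert(x != NULL && n > 0);

    /* Local work array, needed only inside this function. */
    double *work = malloc(n * sizeof(double));
    double sum = 0.0;

    if (work == NULL)
    {
        return 0.0;
    }

    /* Create the device copy of the work array. */
#pragma omp target enter data map(alloc: work[0:n])

#pragma omp target teams distribute parallel for map(to: x[0:n])
    for (int i = 0; i < n; i++)
    {
        work[i] = alpha * x[i];
    }

#pragma omp target teams distribute parallel for reduction(+: sum) map(tofrom: sum)
    for (int i = 0; i < n; i++)
    {
        sum += work[i];
    }

    /* Release the device copy, then the host copy. Leaving out either
     * step leaks memory on every call. */
#pragma omp target exit data map(delete: work[0:n])
    free(work);

    return sum;
}
```

Without an OpenMP-enabled compiler the pragmas are ignored and the loops run on the host, so the sketch can be compiled and checked either way.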
OK, so probably there isn't a similar issue in other routines, but doesn't hurt to check. Thx!
o Fix leaks seen in running progress benchmarks
  - bml_add_ellpack arrays allocated but not freed
  - bml_multiply_ellpack arrays allocated but not freed
o Fix similar leaks in other subroutines
o Also increase efficiency of a target region in bml_prune_rocsparse_ellpack
Indentation of typed sources is suppressed in indent.sh, which is what the CI uses for the lint check. I was indenting the *_typed.c files without using the script.
I will back out the indent changes.
# Do not indent typed sources. indent-2.2.12 has a regression
# https://lists.gnu.org/archive/html/bug-indent/2023-04/msg00000.html and
# incorrectly aligns the indirection operator `*` in some cases.
if [[ ${file} =~ typed.c ]]; then
continue
fi
I found some other offload build routines where it was needed, ones that aren't used by the progress benchmark. The latest revision addresses these other cases.
I found we aren't lint checking the *_typed.c files. I backed out the formatting changes so we can address this as a separate issue. We might want to see whether we can start lint checking these files again in the future.
@jeanlucf22 @nicolasbock Can this be merged? I have another PR based on this branch that I'd like to open, and I'd like to do that after this is merged to avoid any confusion with the diffs.
o bml_add_ellpack arrays allocated but not freed
o bml_multiply_ellpack arrays allocated but not freed
o Also increase efficiency of a target region in bml_prune_rocsparse_ellpack