-
Notifications
You must be signed in to change notification settings - Fork 60
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
AddressSanitizer: heap-use-after-free in object destructor #492
Comments
This looks very similar to #174 (with a more thorough discussion as well). There are a few tests that we explicitly exclude when we run with AddressSanitizer (marked with podio/tests/unittests/unittest.cpp Line 82 in babec74
We haven't fixed this yet because we were (apparently wrongly) convinced that in normal usage it should not occur. The main reason this occurs is, if an object is kept around longer than the collection it belongs to. One of the usual cases where I encounter this (semi-regularly) is if objects are used in maps or other containers outside of their collections. If these containers are cleared only at the beginning of the next event things might break and containers would need to be cleared at the end of the current event. We have a draft solution in #220 (which would need reviving) that solves some but not all of the issues. |
This will actually also happen in normal usage, I think. Since the order in which elements are destroyed matters in this case, and we cannot guarantee any well defined order (nor is that even possible for types that can form cyclic relations). Also it makes running anything with an Address sanitizer a PITA, so I will try to have a look at this. Referencing #250 here just to have this also show up there. |
I think we were able to mitigate the issue by using Thanks to your pointers, I appreciate the problem a bit. It makes me wonder if an appropriate stopgap measure would be to do refcounting unconditionally. Then one could check refs in collection deallocator and throw some helpful exceptions? The issue is that use-after-free does not necessarily lead to a segfault, which makes such bugs a bit hard to find and fix. Otherwise, we can close this issue as it has served its purpose. |
Actually, it didn't get resolved, the issue has moved into the collection destructor itself:
Moreover, I'm seeing occasional failures when running DD4hep ctest during build on macOS:
|
I am not sure I understand the DD4hep failure fully yet. From having a look into the code and the stacktrace it seems to me this could be a different issue with the same symptoms. Effectively what happens in For the original issue, is it possible to provide some more context in which this happens? E.g. the full stacktrace or some highlevel description and some potential us of maps that have objects or collections as keys or values? I am re-surrecting #220, and I am trying to get to a more complete collection of actual use cases that I have to consider for this. |
The original issue was, to my understanding, triggered like so: eic/EICrecon#949 (comment) |
Yes that looks more or less the same as what I have described in #174, i.e. an object that points into a collection, outlives the collection and the frame in which the collection lives. I suppose this object would only be need to used as long as the Frame lives, i.e. it is only necessary to properly destruct it without invoking use-after-free, after the frame is gone? |
Yes. Hence see the unlink solution, that did work, although another, more rare, crash appeared on our radar. |
Thanks for the clarifications. I think these crashes are in the end all caused by the same underlying issue and we should be able to fix that. I am not yet sure about the best solution, but I have put this more or less on the top of our todo list as I would like to fix this before v1.0. |
It looks like there are some issues with lifetimes (
R in Rust stands for Rewrite) when Object tries to update ref counter in the destructor:podio/python/templates/macros/implementations.jinja2
Lines 35 to 39 in babec74
cc @wdconinc @nathanwbrei
The text was updated successfully, but these errors were encountered: