-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Fix mismatch between thread ID sets and reachable threads #484
Conversation
This makes nonunique thread spawns also spawn the unique version to keep it reachable since thread ID sets refer to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this does fix the issue: https://goblint.cs.ut.ee/vesal-results/latest/ We still have a potential unsoundness / precision improvement with restart, but there the difference is in the int domain.
I ran sv-benchmarks before vs after and didn't see any precision or performance change due to this change either, so it should be good. |
@@ -136,9 +136,9 @@ struct | |||
let n = Base.threadenter n v in | |||
let ((p', s') as composed) = compose current n in | |||
if is_unique composed && S.mem n cs then | |||
(p, S.singleton n) | |||
[(p, S.singleton n); composed] (* also respawn unique version of the thread to keep it reachable while thread ID sets refer to it *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add a comment that this potentially harms precision for creates that are encountered multiple times but not in a loop and is only needed for incremental.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What kind of example do you have in mind where this precision loss can happen? I cannot think of one right now.
And if it does somehow happen, then I think the proper way to get such precision back would be to also track a set of must-created threads, not just the may-created threads that it currently tracks. Then it would be possible to distinguish the loop case (where it isn't must-created) from some other case (where I guess it is must-created?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking something like
// SKIP PARAM: --set ana.path_sens[+] threadflag --set ana.activated[+] threadJoins
#include <pthread.h>
#include <assert.h>
int t_fun(void *arg) {
int i = (int)arg;
return i+1;
}
int crea_and_join(int arg) {
pthread_t id;
int i;
// Not my proudest moment but this can't be a proper pointer or that would escape
pthread_create(&id, NULL, t_fun, (void**)arg);
pthread_join(id, (void**) &i);
assert(i == arg+1);
}
int main(void) {
int t;
crea_and_join(10);
crea_and_join(5);
return 0;
}
But somehow it also works here which is highly confusing. One would expect that now also for the unique thread id
main, t_fun@tests/regression/36-apron/96-exp.c:19:3-19:47, {}
would have to account for the possibility that t_fun
is passed 5
but it doesn't. Or am I missing something here?
Also, for some strange reason id
is still a singleton set in crea_and_join(5)
which is strange as it should contain both thread ids, if I spawn two different threads here, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nvm, my local master was behind. This actually is an example highlighting the precision loss.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The threadflag path-sensitivity and the threadJoins analysis aren't really needed for this, are they?
I now see what you mean, but this precision we're losing here is extremely specific: using a wrapper function twice to distinguish the first and the second thread created from the same location, while casting the argument to an unrealistic type.
I feel like there are numerous cleaner way to get this precision than rely on the previous inconsistent behavior, e.g. having a must-created set in ThreadIdDomain.History
or maybe disabling this double-spawning when threadflag is path-sensitive or having a threadWrapper analysis to distinguish these or having a more general form of inlining.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The threadflag path-sensitivity and the threadJoins analysis aren't really needed for this, are they?
Nope
I feel like there are numerous cleaner way to get this precision than rely on the previous inconsistent behavior, e.g. having a must-created set in ThreadIdDomain.History or maybe disabling this double-spawning when threadflag is path-sensitive or having a threadWrapper analysis to distinguish these or having a more general form of inlining.
The cleanest but probably also expensive way would be to do it as we describe it in the paper.
Overall, you are right, the precision loss here seems to be limited to very peculiar changes, so it probably doesn't do anything in the grand scheme of things.
The more interesting comparison would be what happens to the numbers we reported in the more-traces paper. Actually, I am wondering if there should not be a switch to turn this behavior off, as it is only needed for incremental and it might be surprising to have such a precision loss to a reader of the more traces paper. |
This inconsistency is also present in the non-incremental case. A global thread ID set may contain a unique thread ID, but if you look at that thread function, then it only exists in the non-unique context. So the part of the solution is referring to something which doesn't appear in the solution, because from the final solution the unique context is no longer reachable and is thus pruned by reachability. But reachability doesn't go into the domain elements of particular analyses to find any references to the pruned contexts. |
This makes nonunique thread spawns also spawn the unique version to keep it reachable since thread ID sets refer to it.
It should fix the issue particularly with incremental comparison where a thread ID set refers to a unique thread, which later was found to be nonunique, making the unique context of it not reached and thus pruned.
Compared to using Hoare sets for thread ID sets, this is significantly simpler and also should work with thread joins correctly.
Compared to path splitting according to the created thread set in the threadid analysis, this should be more efficient by not splitting the states.
TODO