Skip to content

Commit

Permalink
Fix deadlock on join() timeout.
Browse files Browse the repository at this point in the history
This closes OpenTSDB#4.
  • Loading branch information
tsuna committed May 20, 2015
1 parent dfc36d8 commit d5e34d9
Showing 1 changed file with 6 additions and 2 deletions.
8 changes: 6 additions & 2 deletions src/Deferred.java
Original file line number Diff line number Diff line change
Expand Up @@ -1130,6 +1130,7 @@ private T doJoin(final boolean interruptible, final long timeout)
try {
while (true) {
try {
boolean timedout = false;
synchronized (signal_cb) {
addBoth((Callback<T, T>) ((Object) signal_cb));
if (timeout == 0) { // No timeout, we can use a simple loop.
Expand Down Expand Up @@ -1174,12 +1175,15 @@ private T doJoin(final boolean interruptible, final long timeout)
// But entering `wait' is pretty much guaranteed to make the
// loop take more than 100ns no matter what.
if (timeleft < 100) {
throw new TimeoutException(this, timeout);
timedout = true;
break;
}
}
}
}
if (signal_cb.result instanceof Exception) {
if (timedout) {

This comment has been minimized.

Copy link
@dvdreddy

dvdreddy May 20, 2015

Very minor suggestion, should we do a signal_cb.result != signal_cb check and then throw the timeoutException ?. That may lead to dirty code but this is the situation which lead to deadlock in our case the callback chain just finished in time to join, better to have result than exception

This comment has been minimized.

Copy link
@tsuna

tsuna May 20, 2015

Author Owner

That's weird, how can you always be in that case? I mean if it's a lucky coincidence then OK, but that shouldn't happen frequently, no?

In any case I applied your suggestion in this revised commit: 3552d39

This comment has been minimized.

Copy link
@dvdreddy

dvdreddy May 20, 2015

We have a internal class that schedules a TimerTask to a HashedWheelTimer task which triggers the callback chain of this Deferred with a partial result at about around the same time [-5ms] as the join timeOut. That is the reason we hit this bug in the first place. And in a busy jvm we are seeing around 1 % timer task triggering at around + 5 ms from the scheduled timeout time

throw new TimeoutException(this, timeout);
} else if (signal_cb.result instanceof Exception) {
throw (Exception) signal_cb.result;
}
return (T) signal_cb.result;
Expand Down

0 comments on commit d5e34d9

Please # to comment.