-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
pg,abci: unblock notice subscribers on DB shutdown #1021
Conversation
// will still be deterministic so nbd to not halt here | ||
a.log.Errorf("failed to parse notice (%.20s...): %v", log, err) | ||
continue // since txid is invalid and won't match any result.TxHash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pls double check me on this continue
@brennanjl . I tend to agree it not necessary to halt, but I don't think we should attempt to use the txid
(which is empty or generally invalid) as a map key below, just continue to receive in the next iteration of the loop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this seems correct.
// will still be deterministic so nbd to not halt here | ||
a.log.Errorf("failed to parse notice (%.20s...): %v", log, err) | ||
continue // since txid is invalid and won't match any result.TxHash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this seems correct.
wg.Wait() | ||
// wait for all logs to be received, or a premature shutdown | ||
select { | ||
case <-ctx.Done(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to worry about this not being supported in comet v0.38? I don't think so because we close the notice subscribers when the database shuts down, but just want to double check that you see it the same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No it's ok. With 0.38 we only have one way to break out of here, and that's the notice signaling that the DB dies. With 1.0 we can also get a signal from cometbft, but that's just going to be a bonus.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm but seems to be a merge conflict
e205a27
to
cd9a85f
Compare
cd9a85f
to
a6ee3b5
Compare
gofmt'd |
Resolved conflicts. |
This resolves a possible hang in
FinalizeBlock
if the DB shuts down / dies while waiting for the completion of the notice stream atwg.Wait()
. When thepg.DB
instance closes up in response to the replication connection terminating, we also make sure to close the subscriber channels to unblock the receivers. To ensure the receivers handle this as an error rather than assuming all notices have been received successfully, an empty string is first sent on the channel. Since all notice messages have a special prefix (pgtx:
), this is easily recognized as an exception.