-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
stats: revert upgrade of hyperloglog library #138358
Conversation
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR? 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
@MattWhelan @dhartunian was the |
I needed a hyperloglog implementation for the UniqueCount metric type, and I noticed that the one we had was several years out of date with what looked like several critical improvements since. So I updated it. It's probably not critical, though if we can't use a recent version successfully, we should either contribute a fix or find a different library. -- Matt WhelanOn Jan 6, 2025, at 6:46 PM, Yahor Yuzefovich ***@***.***> wrote:
@MattWhelan @dhartunian was the hyperloglog version bump driven by performance or something else? How critical is doing that?
—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you were mentioned.Message ID: ***@***.***>
|
It's really risky to have out of date libraries that we can't upgrade. If we do decide to revert the upgrade rather than addressing the issue in another way, I strongly suggest that we add a unit test that detects this issue. Something that asserts a particular binary serialization is probably sufficient. From a cursory look at what we're doing here, we seem to have made ourselves dependent on the unstable 0.0.0 version of the serialized form of this data structure from late 2018, and clearly there's been a compatibility break sometime in the intervening six years. When we depend on the binary serialization of a data structure for correctness, we need tests that verify that. We also ought to have a plan to version and upgrade our binary serializations, so we don't get stuck forever with an outdated library that cannot receive even security-critical updates. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for reverting. New version is not critical.
This commit partially reverts the bump of the `hyperloglog` library from b8c532e. The library is used in two places - for unique counter metrics as well as to compute the distinct count in the table statistics, and the latter usage needs to work across versions. With the library version bump it became possible to hit an index out of range panic when performing the table stats collection in the following case: - the cluster is in the mixed-version state where some nodes are running 24.3 and prior versions while other are running binaries with the library version bump - the coordinator for the job (i.e. the node that has the sample aggregator) is running 24.3 and prior versions. If we want to bump the library version, we'll have to do so with backwards-compatibility in mind for the table stats collection usage. For now, we simply revert the bump for that usage. Release note: None
7f40937
to
1c4d6b6
Compare
With Ricky's help I changed the patch to only revert the library upgrade in the table stats usage. @rickystewart could you please double check this patch to have two versions of the same library in the binary? I agree with Matt that we should be able to upgrade the library in the table stats usage, so I filed #138546 to track that. I'm being lazy and didn't include an explicit regression test for this bug, but we did catch it automatically in our mixed-version roachtests #137749, so we do have necessary test coverage in general. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 8 of 8 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @rickystewart)
TFTRs! bors r+ |
This commit partially reverts the bump of the
hyperloglog
library from b8c532e. The library is used in two places - for unique counter metrics as well as to compute the distinct count in the table statistics, and the latter usage needs to work across versions. With the library version bump it became possible to hit an index out of range panic when performing the table stats collection in the following case:If we want to bump the library version, we'll have to do so with backwards-compatibility in mind for the table stats collection usage. For now, we simply revert the bump for that usage.
Fixes: #137386.
Fixes: #137749.
Release note: None