Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

stats: revert upgrade of hyperloglog library #138358

Merged
merged 1 commit into from
Jan 9, 2025

Conversation

yuzefovich
Copy link
Member

@yuzefovich yuzefovich commented Jan 7, 2025

This commit partially reverts the bump of the hyperloglog library from b8c532e. The library is used in two places - for unique counter metrics as well as to compute the distinct count in the table statistics, and the latter usage needs to work across versions. With the library version bump it became possible to hit an index out of range panic when performing the table stats collection in the following case:

  • the cluster is in the mixed-version state where some nodes are running 24.3 and prior versions while other are running binaries with the library version bump
  • the coordinator for the job (i.e. the node that has the sample aggregator) is running 24.3 and prior versions.

If we want to bump the library version, we'll have to do so with backwards-compatibility in mind for the table stats collection usage. For now, we simply revert the bump for that usage.

Fixes: #137386.
Fixes: #137749.

Release note: None

@yuzefovich yuzefovich requested review from a team as code owners January 7, 2025 02:44
@yuzefovich yuzefovich requested review from arjunmahishi and aa-joshi and removed request for a team January 7, 2025 02:44
Copy link

blathers-crl bot commented Jan 7, 2025

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@yuzefovich yuzefovich removed request for a team, arjunmahishi and aa-joshi January 7, 2025 02:44
@yuzefovich
Copy link
Member Author

@MattWhelan @dhartunian was the hyperloglog version bump driven by performance or something else? How critical is doing that?

@MattWhelan
Copy link
Contributor

MattWhelan commented Jan 7, 2025 via email

@MattWhelan
Copy link
Contributor

It's really risky to have out of date libraries that we can't upgrade.

If we do decide to revert the upgrade rather than addressing the issue in another way, I strongly suggest that we add a unit test that detects this issue. Something that asserts a particular binary serialization is probably sufficient.

From a cursory look at what we're doing here, we seem to have made ourselves dependent on the unstable 0.0.0 version of the serialized form of this data structure from late 2018, and clearly there's been a compatibility break sometime in the intervening six years.

When we depend on the binary serialization of a data structure for correctness, we need tests that verify that. We also ought to have a plan to version and upgrade our binary serializations, so we don't get stuck forever with an outdated library that cannot receive even security-critical updates.

Copy link
Collaborator

@dhartunian dhartunian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reverting. New version is not critical.

This commit partially reverts the bump of the `hyperloglog` library from
b8c532e. The library is used in two
places - for unique counter metrics as well as to compute the distinct
count in the table statistics, and the latter usage needs to work across
versions. With the library version bump it became possible to hit an
index out of range panic when performing the table stats collection in
the following case:
- the cluster is in the mixed-version state where some nodes are running
24.3 and prior versions while other are running binaries with the
library version bump
- the coordinator for the job (i.e. the node that has the sample
aggregator) is running 24.3 and prior versions.

If we want to bump the library version, we'll have to do so with
backwards-compatibility in mind for the table stats collection usage.
For now, we simply revert the bump for that usage.

Release note: None
@yuzefovich yuzefovich requested a review from a team as a code owner January 7, 2025 20:57
@yuzefovich yuzefovich changed the title util,stats: revert upgrade of hyperloglog library stats: revert upgrade of hyperloglog library Jan 7, 2025
@yuzefovich
Copy link
Member Author

With Ricky's help I changed the patch to only revert the library upgrade in the table stats usage. @rickystewart could you please double check this patch to have two versions of the same library in the binary?

I agree with Matt that we should be able to upgrade the library in the table stats usage, so I filed #138546 to track that. I'm being lazy and didn't include an explicit regression test for this bug, but we did catch it automatically in our mixed-version roachtests #137749, so we do have necessary test coverage in general.

Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Great find!

Reviewed 8 of 8 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @rickystewart)

@yuzefovich
Copy link
Member Author

TFTRs!

bors r+

@craig craig bot merged commit e207d7e into cockroachdb:master Jan 9, 2025
22 checks passed
@yuzefovich yuzefovich deleted the revert-hyperloglog branch January 9, 2025 00:38
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
5 participants