sql: v24.3.0: index out of range in processSketchRow when collecting table stats #137386
Comments
CC'ing via the CODEOWNERS-based sentry heuristic:
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
I'm looking into this now and temporarily assigning myself.
I spent a while trying to repro this out-of-bounds error in our hyperloglog library, without success.
It seems like this length of
I thought there may be a bug in the marshalling/unmarshalling of the
I think there are a few potential action items for us:
This appears to be a regression in version 24.3: in #137749 we have 4 occurrences of this problem, and all were running 24.3.x. Perhaps a RESTORE of a backup from an earlier version is required to reproduce it.
I'm able to reproduce this problem with the following steps:
create table t (k int primary key);
insert into t select generate_series(1, 100000);
alter table t split at values (50000);
alter table t experimental_relocate values (array[1], 0), (array[2], 50000);
analyze t;
and boom. At a quick glance, the coordinator node must be running an older version (i.e. doing these steps when connected to n2 doesn't trigger the problem).
I believe the issue is the following: in 2c036cf (which is only present on master, i.e. version 25.1) we upgraded the
The only remaining question for me is why we're seeing this problem in Sentry (the library bump wasn't backported). Perhaps the reports are the result of the same backup roachtest failures? Or is someone doing their testing with a master build in a mixed-version state?
@yuzefovich Great job tracking this down! I incorrectly assumed that the cluster was on 24.3 based on the version of the Sentry report. I did not think about the mixed-version case.
This issue was auto filed by Sentry. It represents a crash or reported error on a live cluster with telemetry enabled.
Sentry Link: https://cockroach-labs.sentry.io/issues/6137369904/?referrer=webhooks_plugin
Panic Message:
Stacktrace (expand for inline code snippets):
src/runtime/asm_amd64.s#L1694-L1696
pkg/util/stop/stopper.go#L497-L499
pkg/jobs/jobs.go#L831-L833
pkg/jobs/adopt.go#L445-L447
pkg/jobs/registry.go#L1639-L1641
pkg/jobs/registry.go#L1638-L1640
pkg/sql/create_stats.go#L709-L711
pkg/sql/internal.go#L1937-L1939
pkg/sql/internal.go#L2010-L2012
pkg/kv/db.go#L1035-L1037
pkg/kv/db.go#L1060-L1062
pkg/kv/db.go#L1097-L1099
pkg/kv/txn.go#L1051-L1053
pkg/sql/internal.go#L2023-L2025
pkg/sql/internal.go#L1936-L1938
pkg/sql/create_stats.go#L759-L761
pkg/sql/distsql_plan_stats.go#L777-L779
pkg/sql/distsql_running.go#L923-L925
pkg/sql/flowinfra/flow.go#L573-L575
pkg/sql/rowexec/sample_aggregator.go#L196-L198
pkg/sql/rowexec/sample_aggregator.go#L349-L351
pkg/sql/rowexec/sample_aggregator.go#L397-L399
external/com_github_axiomhq_hyperloglog/hyperloglog.go#L156-L158
external/com_github_axiomhq_hyperloglog/registers.go#L79-L81
GOROOT/src/runtime/panic.go#L113-L115
GOROOT/src/runtime/panic.go#L769-L771
pkg/sql/flowinfra/flow.go#L608-L610
GOROOT/src/runtime/panic.go#L769-L771
Tags
Jira issue: CRDB-45552