Incorrect results: Bloom filters on UInt8, Int8, UInt16 and Int16 columns always return false negatives #9779
Comments
I just reproduced the bug in the
It turns out that #9770 demonstrates that the unsigned variants are incorrect as well, so I updated the title of this ticket.
Not exactly: as long as #9770 is not merged, bloom filters are not used on
Now that I say it, I realize that I should probably amend that PR (and the existing code) to disable bloom filters entirely on these types, so that DataFusion is slow instead of incorrect.
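The mitigation described above (skip the bloom filter for the affected types and accept a slower scan) could look roughly like the sketch below. It is only an illustration: `ColumnType`, `Pruning`, `bloom_filter_trusted` and `prune_row_group` are hypothetical names, not DataFusion's actual API, and the set of untrusted types is taken from the title of this issue.

```rust
/// Hypothetical column types for illustration; this is NOT DataFusion's `DataType`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ColumnType {
    Int8,
    Int16,
    Int32,
    UInt8,
    UInt16,
    UInt32,
}

/// Decision for one row group after (possibly) consulting its bloom filter.
#[derive(Debug, PartialEq, Eq)]
enum Pruning {
    /// The filter proved the value is absent, so the row group can be skipped.
    Skip,
    /// The value may be present (or the filter is not usable): scan the row group.
    MustScan,
}

/// Assumption: the four types named in this issue are the ones whose
/// bloom filters cannot be trusted.
fn bloom_filter_trusted(ty: ColumnType) -> bool {
    !matches!(
        ty,
        ColumnType::Int8 | ColumnType::Int16 | ColumnType::UInt8 | ColumnType::UInt16
    )
}

/// `filter_answer` is `Some(true)` for "maybe present", `Some(false)` for
/// "definitely absent", and `None` when the row group has no bloom filter.
fn prune_row_group(ty: ColumnType, filter_answer: Option<bool>) -> Pruning {
    if !bloom_filter_trusted(ty) {
        // Slow but correct: ignore whatever the filter says.
        return Pruning::MustScan;
    }
    match filter_answer {
        Some(false) => Pruning::Skip,
        _ => Pruning::MustScan,
    }
}
```

With this shape, `prune_row_group(ColumnType::Int8, Some(false))` yields `Pruning::MustScan`, while the same answer for `ColumnType::Int32` still allows the row group to be skipped.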
This issue came up in the context of the 37.1.0 release (#9904), and I wanted to cross-post here.
Specifically, versions 34.0.0 through 37.0.0 have a bug where
The int8/int16 bloom filter support was added in #7821 and shipped as part of https://github.com/apache/arrow-datafusion/blob/main/dev/changelog/33.0.0.md
We have disabled using bloom filters for int8/int16 columns as of DataFusion 38.0.0 (until we fix the underlying issue).
Describe the bug
Bloom filters on these columns always filter out every value.
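To make the term concrete: a correct bloom filter may report a value as present when it is not (a false positive), but it must never report an inserted value as absent (a false negative). The toy filter below is a minimal sketch of that invariant. `ToyBloom` is a hypothetical type built on the standard library hasher; it is not the Parquet bloom filter and does not reproduce the cause of this bug.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy bloom filter with two hash functions, for illustration only.
struct ToyBloom {
    bits: Vec<bool>,
}

impl ToyBloom {
    fn new(num_bits: usize) -> Self {
        Self { bits: vec![false; num_bits] }
    }

    /// Two bit positions derived from the value, one per seed.
    fn positions(&self, value: i32) -> [usize; 2] {
        let mut out = [0usize; 2];
        for (seed, slot) in out.iter_mut().enumerate() {
            let mut hasher = DefaultHasher::new();
            (seed as u64).hash(&mut hasher);
            value.hash(&mut hasher);
            *slot = (hasher.finish() % self.bits.len() as u64) as usize;
        }
        out
    }

    fn insert(&mut self, value: i32) {
        for p in self.positions(value) {
            self.bits[p] = true;
        }
    }

    /// `false` means "definitely absent"; `true` means "maybe present".
    fn might_contain(&self, value: i32) -> bool {
        self.positions(value).iter().all(|&p| self.bits[p])
    }
}
```

After `insert(7)`, `might_contain(7)` is always `true`. A filter that answers "absent" for every lookup, as described here, behaves like an empty `ToyBloom` and causes matching data to be skipped.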
To Reproduce
#9778 demonstrates this, through correct_bloom_filters: false as macro "parameter".
Expected behavior
No response
Additional context
No response