HllSketch performance improvement for strings
- HLL DataToSketchUDAF: Input strings are now converted to char[] before being passed to HllSketch. This is substantially faster than passing strings because it avoids the UTF-8 conversion. Warning: strings are now effectively hashed with a different hash function, so the same string yields a different hash value in this version than in the previous one. A union of sketches produced by this version and the previous version will therefore treat identical strings as distinct items (no overlap) and produce incorrect results. We recommend upgrading to this version and, if any sketches were created from string inputs and stored, recomputing them from the raw data.
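
To illustrate why the two versions are incompatible, here is a minimal sketch using only the Java standard library. It does not call the sketch library; it only compares the bytes a hash function would receive for the same string when it is encoded as UTF-8 versus when it is passed as a char[]. The class name `StringHashInputDemo`, its helper methods, and the little-endian layout of the char[] bytes are assumptions made for illustration, not the library's internals.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of the sketch library.
final class StringHashInputDemo {

    // Bytes a hash would see if the string is first encoded as UTF-8.
    static byte[] utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Bytes a hash would see if the string is passed as char[]:
    // two bytes per UTF-16 code unit (byte order assumed little-endian here).
    static byte[] charArrayBytes(String s) {
        char[] chars = s.toCharArray();
        ByteBuffer buf = ByteBuffer.allocate(chars.length * 2)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        for (char c : chars) {
            buf.putChar(c);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        String item = "abc";
        byte[] asUtf8 = utf8Bytes(item);
        byte[] asChars = charArrayBytes(item);

        // Different input bytes mean a different hash value for the same string,
        // so sketches built the two ways cannot recognize shared items.
        System.out.println("UTF-8 bytes:  " + Arrays.toString(asUtf8));
        System.out.println("char[] bytes: " + Arrays.toString(asChars));
        System.out.println("same input to hash: " + Arrays.equals(asUtf8, asChars));
    }
}
```

Because the hash input differs even for plain ASCII strings, a stored sketch built from strings by the previous version cannot be meaningfully unioned with one built by this version, which is why recomputing from the raw data is recommended.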