Describe the enhancement requested
Running some simple benchmarks from Python, I was a bit surprised by the performance of group-by aggregations:
10000 groups:
>>> import pyarrow as pa
>>> import pyarrow.compute as pc
>>> n = 10000
>>> a = pa.table({'group': list(range(n))*2, 'key': ['h']*n+['w']*n, 'value': range(n*2)})
>>> %timeit a.group_by('group', use_threads=False).aggregate([('value', 'sum')])
496 μs ± 439 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
>>> %timeit a.group_by('group', use_threads=False).aggregate([(('key', 'value'), 'pivot_wider', pc.PivotWiderOptions(key_names=('h', 'w')))])
708 μs ± 1.34 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
100000 groups:
>>> n = 100000
>>> a = pa.table({'group': list(range(n))*2, 'key': ['h']*n+['w']*n, 'value': range(n*2)})
>>> %timeit a.group_by('group', use_threads=False).aggregate([('value', 'sum')])
5.93 ms ± 11.6 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> %timeit a.group_by('group', use_threads=False).aggregate([(('key', 'value'), 'pivot_wider', pc.PivotWiderOptions(key_names=('h', 'w')))])
8.23 ms ± 28.9 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
I was initially expecting pivot_wider to be much slower than sum, both because it does a secondary grouping using a naive std::unordered_map, and because it does a row-to-column transposition of the grouped values. But pivot_wider turns out to be only roughly 40-50% slower than a simple sum.
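If it helps to see what pivot_wider actually produces, here is a minimal sketch using the same API as above (to my understanding, the result is a struct column with one field per key name, 'h' and 'w' here):
>>> # Each group's (key, value) pairs collapse into a single struct row,
>>> # e.g. group 0 -> {'h': 1, 'w': 2}.
>>> t = pa.table({'group': [0, 0, 1, 1],
...               'key': ['h', 'w', 'h', 'w'],
...               'value': [1, 2, 3, 4]})
>>> t.group_by('group', use_threads=False).aggregate(
...     [(('key', 'value'), 'pivot_wider', pc.PivotWiderOptions(key_names=('h', 'w')))])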
In absolute numbers, it seems group-by summing hovers at around 30-40M rows/second. Given that we're supposed to use a high-performance hash table ("swiss table" with AVX2 optimizations) and the group ids above are trivially distributed integers, this doesn't seem like a very high number.
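For reference, the throughput figure follows directly from the timings above (each table has 2*n rows):
>>> round(2 * 10_000 / 496e-6 / 1e6, 1)    # 10000 groups, sum: M rows/second
40.3
>>> round(2 * 100_000 / 5.93e-3 / 1e6, 1)  # 100000 groups, sum: M rows/second
33.7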
What should be the expectations here? @zanmato1984
Component(s)
C++
I can't speak to the numbers right now. Do you have any flame graphs for sum and pivot_wider? If so, I can take a look before benchmarking them myself (can't promise a timeline, though).
No, I don't have any flame graphs. It's not a pressing issue either, and I don't actually have a need for faster hashing :). I was just surprised and thought I'd share the results in case other people care.
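If anyone does want to profile this, something like py-spy with native sampling should capture the C++ frames (a sketch; bench.py is a hypothetical script containing the snippets above):
$ py-spy record --native -o flamegraph.svg -- python bench.py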