improve std.sort.sortContext to use block sort implementation #11117
Comments
There's a benchmark result in alichraghi/zort. As you can see, insertion sort is ~160 times slower than block sort. We can try other algorithms such as |
@alichraghi Notable for pd quicksort. Not really sure if quadsort (https://github.com/scandum/quadsort) is worth the complexity and memory cost. It would also be nice to have an ASCII table overview of the sorting algorithms, including which ones are simple to parallelize, etc. |
Zort now has pdq, tim, and many more algorithms.
Sorting (ascending) 10000000 usize (Core i7-4600U CPU 2.10GHz), time in seconds:
| algorithm | random | sorted | reverse | ascending saw | descending saw |
| --- | --- | --- | --- | --- | --- |
| tim | 1.932 | 0.010 | 0.603 | 0.325 | 0.482 |
| pdq | 0.400 | 0.011 | 0.108 | 0.408 | 0.404 |
| quick | 0.882 | 0.281 | 0.474 | 1.454 | 49.500 |
| radix | 0.311 | 0.303 | 0.251 | 0.318 | 0.375 |
| twin | 1.126 | 0.085 | 0.431 | 0.285 | 0.674 |
| std_block_merge | 1.402 | 0.063 | 0.477 | 0.518 | 0.746 |
| comb | 1.631 | 0.500 | 0.559 | 0.836 | 1.082 |
| shell | 2.943 | 0.332 | 0.357 | 0.590 | 0.834 |
Possible alternatives of block sort
|
@alichraghi Nice work on zorting algorithms! It might be better to open another issue if there is a need for other sorting algorithms to be included in the std. Since |
Thank you. pdq, tail, tim, and twin are @voroskoi's work.
@SpexGuy has implemented GrailSort in Zig. EDIT: bench result
Sorting (ascending) 10000000 usize, time in seconds:
| algorithm | random | sorted | reverse | ascending saw | descending saw |
| --- | --- | --- | --- | --- | --- |
| pdq | 0.403 | 0.009 | 0.133 | 0.464 | 0.431 |
| grail | 1.650 | 0.717 | 0.769 | 0.939 | 1.043 |
| std_block_merge | 1.725 | 0.064 | 0.513 | 0.367 | 0.688 |
|
I think it would make sense to replace |
This would be great for TigerBeetle (part of tigerbeetle/tigerbeetle#1191). In our workload, bumping the cache to 2MB works out to a ~16% throughput improvement. (We can't use an unstable sort due to how we currently handle duplicates.) It would also be nice to eliminate the stack overflow risk from |
How about adding
to the existing API? I think these functions would be the minimal set of functions needed to implement any comparison sorting routine with
tmp = get(ctx, a);
set(ctx, a, get(ctx, b));
set(ctx, b, tmp);
Moreover, some sorting algorithms might be better implemented in terms of
It would be possible to write the sorting routines so that they only depend on a subset of the functions via |
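To make the get/set idea above concrete, here is a minimal sketch in Zig. It is not an existing std API: `SliceContext`, `swapViaGetSet`, and `insertionViaGetSet` are names invented for illustration, and the element type is fixed to `u32` compared with `<` to keep things short. It shows the three-statement swap from the comment, plus an insertion sort that shifts elements with get/set instead of swapping.

```zig
const std = @import("std");

/// Hypothetical context over a plain slice (illustrative only, not std API).
const SliceContext = struct {
    items: []u32,

    pub fn get(ctx: SliceContext, i: usize) u32 {
        return ctx.items[i];
    }

    pub fn set(ctx: SliceContext, i: usize, value: u32) void {
        ctx.items[i] = value;
    }
};

/// Swap expressed purely through get/set, mirroring the three statements above.
fn swapViaGetSet(ctx: anytype, a: usize, b: usize) void {
    const tmp = ctx.get(a);
    ctx.set(a, ctx.get(b));
    ctx.set(b, tmp);
}

/// Insertion sort over [start, end) that shifts elements with get/set.
/// Each element is written once per shift, where a swap-only version
/// would write twice.
fn insertionViaGetSet(ctx: anytype, start: usize, end: usize) void {
    var i = start + 1;
    while (i < end) : (i += 1) {
        const key = ctx.get(i);
        var j = i;
        while (j > start and key < ctx.get(j - 1)) : (j -= 1) {
            ctx.set(j, ctx.get(j - 1));
        }
        ctx.set(j, key);
    }
}
```

Under these assumptions, the shifting variant hints at why some algorithms might prefer get/set over swap: moving an element into a temporary is exactly what a swap-only context cannot express.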
On second thought |
In #11109 I introduced a new function, std.sort.sortContext, which allows the caller to override the swap function, which makes it possible to call std lib sort in the MultiArrayList.sort method:
zig/lib/std/multi_array_list.zig
Lines 395 to 421 in 0b82c02
This, in turn, made it possible to add a sort method to ArrayHashMap.
However, I did not convert our main std lib sort function to utilize this context; instead it calls insertion sort:
zig/lib/std/sort.zig
Lines 835 to 839 in 0b82c02
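The embedded std source is not reproduced here. As a rough illustration only, an insertion sort driven entirely by a context could look like the sketch below. It assumes the context exposes `lessThan(a, b)` and `swap(a, b)` on indices; `ParallelContext` and `insertionSortContext` are hypothetical names for this example, not the std implementation.

```zig
const std = @import("std");

/// Hypothetical context that keeps two parallel arrays in sync while
/// sorting by `keys` (a stand-in for a struct-of-arrays container).
const ParallelContext = struct {
    keys: []u32,
    tags: []u8,

    pub fn lessThan(ctx: ParallelContext, a: usize, b: usize) bool {
        return ctx.keys[a] < ctx.keys[b];
    }

    pub fn swap(ctx: ParallelContext, a: usize, b: usize) void {
        std.mem.swap(u32, &ctx.keys[a], &ctx.keys[b]);
        std.mem.swap(u8, &ctx.tags[a], &ctx.tags[b]);
    }
};

/// Insertion sort over indices [a, b) that never touches elements
/// directly; it only asks the context to compare and swap.
fn insertionSortContext(a: usize, b: usize, context: anytype) void {
    var i = a + 1;
    while (i < b) : (i += 1) {
        var j = i;
        while (j > a and context.lessThan(j, j - 1)) : (j -= 1) {
            context.swap(j, j - 1);
        }
    }
}
```

Because the sort only sees indices, the same routine works for any container that can compare and swap two positions, which is what makes the quadratic fallback easy to generalize and block sort hard.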
The main problem to solve here is that the block sort implementation does not exclusively swap elements; it also uses a "cache" of elements that it copies items into and out of. In order to generalize this sort implementation, and make it usable from, for example, MultiArrayList, the cache concept needs to either be eliminated (harming performance) or incorporated in a generalized manner into the context (complicating the API). Since we strive for optimality in Zig, we will pay the price of the more complicated API.
On the plus side, this will make the cache size configurable, which is something that we already should be exposing in the API.
The tricky part of this issue will be designing the Context API carefully such that block sort can be implemented in terms of it, but is as simple as possible otherwise for callers.
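One possible shape for such a context, sketched under heavy assumptions: none of the cache hooks below (`cacheLen`, `copy`, `store`, `load`, `itemLessThanCached`) exist in std or are proposed in this issue; they are invented here to show what "incorporating the cache into the context" might involve. The example merges two adjacent sorted runs by first copying the left run into a caller-owned cache, which is the kind of operation block sort performs when a run fits in its cache.

```zig
const std = @import("std");

/// Hypothetical struct-of-arrays context with a caller-sized cache.
/// All method names other than `lessThan` and `swap` are assumptions.
const PairsContext = struct {
    keys: []u32,
    vals: []u8,
    cache_keys: []u32,
    cache_vals: []u8,

    pub fn lessThan(ctx: PairsContext, a: usize, b: usize) bool {
        return ctx.keys[a] < ctx.keys[b];
    }
    pub fn swap(ctx: PairsContext, a: usize, b: usize) void {
        std.mem.swap(u32, &ctx.keys[a], &ctx.keys[b]);
        std.mem.swap(u8, &ctx.vals[a], &ctx.vals[b]);
    }
    /// Number of elements the cache can hold (configurable by the caller).
    pub fn cacheLen(ctx: PairsContext) usize {
        return ctx.cache_keys.len;
    }
    /// Copy the element at index `src` over the element at index `dst`.
    pub fn copy(ctx: PairsContext, dst: usize, src: usize) void {
        ctx.keys[dst] = ctx.keys[src];
        ctx.vals[dst] = ctx.vals[src];
    }
    /// Copy the element at `index` into cache slot `slot`.
    pub fn store(ctx: PairsContext, slot: usize, index: usize) void {
        ctx.cache_keys[slot] = ctx.keys[index];
        ctx.cache_vals[slot] = ctx.vals[index];
    }
    /// Copy cache slot `slot` back to the element at `index`.
    pub fn load(ctx: PairsContext, index: usize, slot: usize) void {
        ctx.keys[index] = ctx.cache_keys[slot];
        ctx.vals[index] = ctx.cache_vals[slot];
    }
    /// Compare an element in the container against a cached element.
    pub fn itemLessThanCached(ctx: PairsContext, index: usize, slot: usize) bool {
        return ctx.keys[index] < ctx.cache_keys[slot];
    }
};

/// Stable merge of sorted runs [start, mid) and [mid, end).
/// Requires the left run to fit in the cache.
fn mergeViaCache(ctx: anytype, start: usize, mid: usize, end: usize) void {
    const left_len = mid - start;
    std.debug.assert(left_len <= ctx.cacheLen());

    // Move the left run out of the way.
    var i: usize = 0;
    while (i < left_len) : (i += 1) {
        ctx.store(i, start + i);
    }

    var slot: usize = 0; // next cached (left) element
    var right = mid; // next right-run element
    var out = start; // next output position; never overtakes `right`
    while (slot < left_len and right < end) : (out += 1) {
        if (ctx.itemLessThanCached(right, slot)) {
            ctx.copy(out, right);
            right += 1;
        } else {
            ctx.load(out, slot); // ties go to the left run (stability)
            slot += 1;
        }
    }
    // Flush what is left in the cache; leftover right elements are already in place.
    while (slot < left_len) {
        ctx.load(out, slot);
        slot += 1;
        out += 1;
    }
}
```

Even this small merge needs five extra hooks on top of `lessThan` and `swap`, which gives a feel for the API cost the issue describes. A real design would presumably look for a smaller set.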