Description
https://github.com/JuliaSymbolics/SymbolicUtils.jl relies on gcd
to simplify fractions.
This is currently the performance bottleneck and while the current implementation was focused on correctness, we should now perform a detailed performance benchmark and eliminate bottlenecks.
Possible improvements are:
- Reduce allocation by making use of MutableArithmetics
- Flip order of terms so that leading monomial can be efficiently dropped in-place
- Specialized implementation of univariate polynomial so that univariate GCD which is the base case of the recursion is as fast as possible (preliminary work in https://github.com/JuliaAlgebra/MultivariatePolynomials.jl/tree/bl/univariate)
It would also be relevant to compare to https://github.com/YingboMa/SIMDPolynomials.jl
Below is the results of #205 after the following PRs:
- Add default term and polynomial representations #199
- Speedup for content #200
- Specialize on function type #208
- Avoid closure for operate! on +/- #209
- Zero-allocation terms/coefficients/monomials for AbstractTermLike #210
- More performance #201
Benchmark 1
SIMDPolynomials v0.1, DynamicPolynomials v0.4.5, TypedPolynomials v0.3.2
Time | Alloc | Memory | |
---|---|---|---|
SIMDPolynomials | |||
DynamicPolynomials | 6.276 ms | 123655 | 4.90 MiB |
TypedPolynomials | 4.040 ms | 94216 | 3.21 MiB |
After #199
Time | Alloc | Memory | |
---|---|---|---|
SIMDPolynomials | 3.880 ms | 95459 | 3.22 MiB |
DynamicPolynomials | 6.026 ms | 121182 | 4.75 MiB |
TypedPolynomials | 3.751 ms | 91794 | 3.10 MiB |
After #200
Time | Alloc | Memory | |
---|---|---|---|
SIMDPolynomials | 3.052 ms | 86958 | 2.59 MiB |
DynamicPolynomials | 4.946 ms | 107489 | 3.84 MiB |
TypedPolynomials | 3.139 ms | 86455 | 2.58 MiB |
After #208
Time | Alloc | Memory | |
---|---|---|---|
SIMDPolynomials | 2.996 ms | 86695 | 2.58 MiB |
DynamicPolynomials | 4.807 ms | 107489 | 3.84 MiB |
TypedPolynomials | 3.050 ms | 86217 | 2.58 MiB |
After #209
Time | Alloc | Memory | |
---|---|---|---|
SIMDPolynomials | 3.024 ms | 82269 | 2.48 MiB |
DynamicPolynomials | 5.273 ms | 107489 | 3.84 MiB |
TypedPolynomials | 2.973 ms | 81791 | 2.48 MiB |
After #210
Time | Alloc | Memory | |
---|---|---|---|
SIMDPolynomials | 3.346 ms | 81929 | 2.47 MiB |
DynamicPolynomials | 5.907 ms | 107321 | 3.83 MiB |
TypedPolynomials | 3.596 ms | 81451 | 2.46 MiB |
After #217
Time | Alloc | Memory | |
---|---|---|---|
SIMDPolynomials | 2.894 ms | 82941 | 2.48 MiB |
DynamicPolynomials | 5.087 ms | 106752 | 3.77 MiB |
TypedPolynomials | 2.890 ms | 82131 | 2.47 MiB |
After #218
Time | Alloc | Memory | |
---|---|---|---|
SIMDPolynomials | 2.991 ms | 82841 | 2.47 MiB |
DynamicPolynomials | 4.929 ms | 106733 | 3.77 MiB |
TypedPolynomials | 3.028 ms | 82123 | 2.47 MiB |
Benchmark 2
SIMDPolynomials v0.1, DynamicPolynomials v0.4.5, TypedPolynomials v0.3.2
Time | Alloc | Memory | |
---|---|---|---|
SIMDPolynomials | 4.234 μs | 94 | 8.06 KiB |
DynamicPolynomials | 262.587 μs | 4817 | 291.62 KiB |
TypedPolynomials | 117.350 μs | 2021 | 91.28 KiB |
After #199
Time | Alloc | Memory | |
---|---|---|---|
SIMDPolynomials | 42.200 μs | 1104 | 48.12 KiB |
DynamicPolynomials | 258.029 μs | 4691 | 284.20 KiB |
TypedPolynomials | 80.639 μs | 1517 | 62.34 KiB |
After #200
Time | Alloc | Memory | |
---|---|---|---|
SIMDPolynomials | 21.231 μs | 623 | 25.92 KiB |
DynamicPolynomials | 151.213 μs | 2656 | 161.92 KiB |
TypedPolynomials | 47.272 μs | 823 | 35.17 KiB |
After #208
Time | Alloc | Memory | |
---|---|---|---|
SIMDPolynomials | 18.073 μs | 582 | 25.17 KiB |
DynamicPolynomials | 153.686 μs | 2656 | 161.92 KiB |
TypedPolynomials | 42.174 μs | 785 | 34.47 KiB |
After #209
Time | Alloc | Memory | |
---|---|---|---|
SIMDPolynomials | 18.729 μs | 450 | 22.64 KiB |
DynamicPolynomials | 156.125 μs | 2656 | 161.92 KiB |
TypedPolynomials | 42.091 μs | 657 | 31.97 KiB |
After #210
Time | Alloc | Memory | |
---|---|---|---|
SIMDPolynomials | 19.147 μs | 437 | 21.84 KiB |
DynamicPolynomials | 178.418 μs | 2647 | 161.22 KiB |
TypedPolynomials | 46.409 μs | 640 | 30.84 KiB |
After #217
Time | Alloc | Memory | |
---|---|---|---|
SIMDPolynomials | 16.274 μs | 423 | 19.61 KiB |
DynamicPolynomials | 149.753 μs | 2494 | 152.22 KiB |
TypedPolynomials | 26.885 μs | 345 | 20.28 KiB |
After #218
Time | Alloc | Memory | |
---|---|---|---|
SIMDPolynomials | 11.822 μs | 351 | 17.55 KiB |
DynamicPolynomials | 139.923 μs | 2433 | 147.75 KiB |
TypedPolynomials | 25.341 μs | 341 | 20.03 KiB |
Benchmark 3
SIMDPolynomials v0.1, DynamicPolynomials v0.4.5, TypedPolynomials v0.3.2
Time | Alloc | Memory | |
---|---|---|---|
SIMDPolynomials | 432.164 μs | 1159 | 588.06 KiB |
DynamicPolynomials | 29.397 ms | 511347 | 30.37 MiB |
TypedPolynomials | 5.521 ms | 116743 | 5.84 MiB |
After #199
Time | Alloc | Memory | |
---|---|---|---|
SIMDPolynomials | 7.483 ms | 195099 | 8.05 MiB |
DynamicPolynomials | 29.439 ms | 502251 | 29.67 MiB |
TypedPolynomials | 3.605 ms | 91338 | 4.38 MiB |
After #200
Time | Alloc | Memory | |
---|---|---|---|
SIMDPolynomials | 2.715 ms | 76147 | 3.05 MiB |
DynamicPolynomials | 13.290 ms | 222506 | 13.10 MiB |
TypedPolynomials | 2.509 ms | 66448 | 3.01 MiB |
After #208
Time | Alloc | Memory | |
---|---|---|---|
SIMDPolynomials | 2.668 ms | 75839 | 3.04 MiB |
DynamicPolynomials | 12.821 ms | 222506 | 13.10 MiB |
TypedPolynomials | 2.332 ms | 66152 | 3.01 MiB |
After #209
Time | Alloc | Memory | |
---|---|---|---|
SIMDPolynomials | 1.780 ms | 34412 | 2.01 MiB |
DynamicPolynomials | 13.720 ms | 222506 | 13.10 MiB |
TypedPolynomials | 1.254 ms | 24725 | 1.97 MiB |
After #210
Time | Alloc | Memory | |
---|---|---|---|
SIMDPolynomials | 1.632 ms | 32239 | 1.89 MiB |
DynamicPolynomials | 17.388 ms | 221419 | 13.02 MiB |
TypedPolynomials | 1.389 ms | 22552 | 1.86 MiB |
After #217
Time | Alloc | Memory | |
---|---|---|---|
SIMDPolynomials | 1.678 ms | 31108 | 1.83 MiB |
DynamicPolynomials | 13.646 ms | 209332 | 12.52 MiB |
TypedPolynomials | 840.497 μs | 5060 | 1.43 MiB |
After #218
Time | Alloc | Memory | |
---|---|---|---|
SIMDPolynomials | 1.398 ms | 30836 | 1.82 MiB |
DynamicPolynomials | 13.404 ms | 209192 | 12.51 MiB |
TypedPolynomials | 820.093 μs | 5044 | 1.43 MiB |
cc @shashi