Detect cases where first eval is slower than subsequent evals #102

Open
LilithHafner opened this issue May 12, 2024 · 2 comments
Comments

@LilithHafner (Owner)
If I have something like @b rand(1000) sort!, the first eval is much slower than subsequent evals within a given sample (sort! leaves its input sorted, so later evals operate on already-sorted data). This violates benchmarking assumptions and produces misleading results. For example, @b rand(1000) sort! reports an unrealistically fast runtime, while @b rand(100_000) sort! runs with evals=1 and is realistic.

See: compintell/Mooncake.jl#140

julia> @be rand(100_000) sort!
Benchmark: 100 samples with 1 evaluation
min    761.379 μs (6 allocs: 789.438 KiB)
median 871.046 μs (6 allocs: 789.438 KiB)
mean   890.113 μs (6 allocs: 789.438 KiB, 2.74% gc time)
max    1.223 ms (6 allocs: 789.438 KiB, 14.46% gc time)

julia> @be rand(1000) sort!
Benchmark: 2943 samples with 7 evaluations
min    2.345 μs (0.86 allocs: 1.429 KiB)
median 3.208 μs (0.86 allocs: 1.429 KiB)
mean   4.221 μs (0.86 allocs: 1.434 KiB, 0.25% gc time)
max    701.837 μs (0.86 allocs: 1.714 KiB, 98.49% gc time)
@gdalle commented May 13, 2024

I guess most of these cases can be detected by systematically running a second evaluation after the first one? Of course, it's debatable whether the benefit outweighs the cost.
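A minimal sketch of that idea, outside of Chairmarks itself (the function name and the 2× ratio threshold are illustrative, not part of any API):

```julia
# Hypothetical check: time the first eval on a fresh input, then a second
# eval on the same (now mutated) input, and flag a large discrepancy.
function first_eval_suspicious(setup, f; ratio=2.0)
    x = setup()
    t1 = @elapsed f(x)   # first eval: e.g. sorting random data
    t2 = @elapsed f(x)   # second eval: e.g. "sorting" already-sorted data
    return t1 > ratio * t2
end

first_eval_suspicious(() -> rand(1000), sort!)  # likely true for this issue's example
```

The cost is roughly two extra evals per benchmark, which is why it only seems worth doing when single evals are cheap relative to the time budget.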

@LilithHafner (Owner, Author) commented Jan 5, 2025

For seconds=0.1 (the default), we'll choose to run only a single eval if the runtime is greater than about 0.02% of the budget. In this case, there isn't actually an issue because evals=1. If the runtime is less than 0.02% of the budget, then it should be pretty cheap to perform this check.

For higher budgets, the situation is even better. For lower budgets, it seems reasonable to perform fewer sanity checks.

This is all assuming that runtime is dominated by evaluating the target function rather than by Chairmarks plumbing or by the setup or teardown functions.
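Concretely, the crossover point under the default budget works out as follows (just the arithmetic from the comment above, not Chairmarks internals):

```julia
budget = 0.1      # default seconds=0.1
frac   = 0.0002   # ≈0.02% of the budget
budget * frac     # 2.0e-5 s, i.e. ~20 μs
```

So targets slower than roughly 20 μs per eval already get evals=1 (no issue), and for targets faster than that, one extra warm-up eval costs well under 0.02% of the budget.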
