
[PoC] Use timestamp_delta to weight samples #166

Open · wants to merge 1 commit into master

Conversation

casperisfine
Contributor

NB: this is just proof-of-concept quality, meant to start a discussion. A releasable version of this patch would require considerably more work.

Problem

We recently noticed that we were overlooking a major hotspot because stackprof's regular reporting is biased.

Here's the stackprof summary of one of our application boots, using stackprof 0.2.17:

==================================
  Mode: cpu(1000)
  Samples: 95925 (13.33% miss rate)
  GC: 30564 (31.86%)
==================================
     TOTAL    (pct)     SAMPLES    (pct)     FRAME
     27790  (29.0%)       27790  (29.0%)     (marking)
      6278   (6.5%)        6277   (6.5%)     RubyVM::InstructionSequence.load_from_binary
...
       311   (0.3%)         311   (0.3%)     Module#using

And now here's the same profiling data, but rendered by Speedscope:

[Screenshot taken 2021-09-03 at 12:12:46: Speedscope rendering of the same profile]

Notice how Module#using accounts for almost nothing according to stackprof, but for 11% according to Speedscope.

This is because stackprof's own reporting is based on the number of samples, whereas Speedscope uses the raw_timestamp_deltas data that is intended for flamegraphs.

And since Module#using goes over the whole heap to flush method caches, I assume the stackprof signal can't fire for a long time while it runs. Based on the stackprof data, Module#using was sampled 311 times, but according to the timestamp deltas each sample accounted for roughly 41ms instead of 1ms.

This is particularly visible for Module#using, but analyzing raw_timestamp_deltas shows that a large proportion of the samples are longer than intended.
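To illustrate, here is a small sketch (not part of this PR, and not stackprof's API beyond the documented `:interval` and `:raw_timestamp_deltas` keys of a profile captured with `raw: true`) that summarizes how far the recorded deltas drift from the requested sampling interval:

```ruby
# Hedged sketch: given stackprof results captured with `raw: true`,
# report the mean timestamp delta and how many samples exceeded
# twice the requested interval (all values are in microseconds).
def delta_stats(profile)
  deltas = profile[:raw_timestamp_deltas]
  interval = profile[:interval]
  over = deltas.count { |d| d > 2 * interval }
  {
    mean: deltas.sum.to_f / deltas.size,
    over_budget: over,
    over_budget_pct: 100.0 * over / deltas.size,
  }
end
```

Running this over our boot profile is what surfaced the skew described above: a minority of samples carry a disproportionate share of the wall-clock time.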

Possible solution

Again, this is just a proof of concept, but if we always computed the timestamp_delta, we could "weight" the samples to somewhat correct this bias.

It's a bit hard to implement because of other profiling modes, such as object, which aren't time based. So maybe instead of weighting the samples, we should simply record the timestamp_delta as part of the profile, and use that instead of the sample count for reporting.
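The idea can be sketched in a few lines. This is a hypothetical illustration, not the patch itself: `samples` is a made-up flat representation of `[top frame name, delta in microseconds]` pairs, standing in for what stackprof would derive from `:raw` and `:raw_timestamp_deltas`:

```ruby
# Hedged sketch: rank frames by accumulated timestamp_delta rather
# than by sample count. Returns [frame, sample count, % of total time]
# rows, sorted by weighted time descending.
def weighted_report(samples)
  by_time = Hash.new(0)
  by_count = Hash.new(0)
  samples.each do |frame, delta_us|
    by_time[frame] += delta_us
    by_count[frame] += 1
  end
  total = by_time.values.sum.to_f
  by_time.sort_by { |_, t| -t }.map do |frame, t|
    [frame, by_count[frame], (100.0 * t / total).round(1)]
  end
end
```

With this kind of weighting, a frame like Module#using with few but very long samples would rise to the top of the report instead of being buried by its sample count.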

@tenderlove @XrXr what do you think?
