-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
This lib seems to block event loop #683
Comments
I'm not able to reproduce multi-second blocks of the event loop. I was able to sometime get long blocks (up to ~750ms) at this same location; but those did not occur consistently -- ~20ms of blocking was more common. Here's an example of what was logged; note that the blocking was always logged after "Successfully created profile WALL.":
Even though the line numbers in the stack trace suggest that the long delay may occur at the line which sends the CreateProfile request, the fact that I only see this when a wall profile is being collected suggests it may be something about the collection of wall profiles. @chrisdostert -- If you would be able to enable debug logging (documented here; if modifying the source code isn't feasible, setting the environment variable |
I added additional logging to local versions of cloud-profiler-nodejs and pprof-nodejs. It looks like the blocking happens just after the call into native code to begin profile collection (here in pprof-nodejs). I definitely see longer blocks on alpine-linux than I do when running in an non-alpine linux docker image (30-80ms, ve ~300ms). My next step is to make a smaller reproducer which depends only on pprof-nodejs (since that seems to be where the issue is). |
Yesterday, I added logging throughout pprof-nodejs, and was able to confirm that sometimes the method used to start the V8 CPU profiler takes a long time to return:
The logging for "Calling cpuProfiler->StartProfiling" and "Finished calling cpuProfiler->StartProfiling" was added around the line on which the V8 CPU profiler was started. I'm still working on creating a smaller reproducer (something which depends only on pprof-nodejs). |
I'm actually less worried on a reproducer; I think I should look into seeing if I can make all calls into C++ asynchronous in pprof-nodejs. Then, issues like this won't block the event loop. |
Making the call to start the V8 CPU profiler asynchronous isn't option. This NAN example has the following comment:
To make starting the CPU profiler asynchronous (following traditional patterns), we would need to start the V8 profiler within the "Execute" method. So, we can't use standard NAN helpers to call It still might be possible to do this... Maybe we can spawn a C++ thread which calls StartProfiling and wait for that thread to finish within Execute. Before I continue looking in to making out call to the V8 profiler asynchronous, though, I'm going to look a bit into the V8 CPU profiler. |
@psmarshall -- Could it be expected that If it's within the range of expected behaviors, then I really want to make our call to start the CPU profiler asynchronous. But if it's not expected, then I might be better off looking at optimizing |
I think |
Yes it's expected that I don't remember ever trying to optimize the startup so there could be wins to find there. We do all this logging on startup on the main thread because we need to access the v8 heap and copy all of the metadata we use for symbolizing functions during profiling to an off-heap data structure, because we access it from a background thread during profiling to keep the overhead of profiling itself low. Given this work has to happen on the main thread I don't think making it async is an option. |
Here is a repro just in V8 tests BTW: https://chromium-review.googlesource.com/c/v8/v8/+/2366695 Profiler startup takes ~4.8 seconds with 500k functions on the heap |
nodejs/diagnostics#444 is related to this I think. |
Environment details
@google-cloud/profiler
version: 4.0.2Steps to reproduce
The text was updated successfully, but these errors were encountered: