googleapis/cloud-profiler-nodejs

100% Idle When Profiling Wall Time on GAE Standard

Closed this issue · 2 comments

Apologies if this isn't the right format to be submitting this in!

I will pre-empt this with the Google App Engine project in question has the following set:

runtime: nodejs10
instance_class: F4_1G
[...]
inbound_services:
  - warmup
[...]

When attempting to profile my Node.js GAE instance, I have followed the prescribed steps and added the example code, as such:

npm install --save @google-cloud/profiler

and in app.js:

require('@google-cloud/profiler').start({
    logLevel: 3,
    disableTime: false,
    disableHeap: false
});

I then deployed to the GAE instance. After testing this for a few hours, I can not for the life of me get it to show anything other than 100% idle CPU:
image
Now I saw #302 which appears to be my issue, however it doesn't seem quite correct as his GCE was actually under light load and therefore 100% idle could be possible (if I'm understanding why it was closed correctly). My issue appears to be that the server is actually under heavy load, with 8 instances running delays of > 5ms, but the profiler claiming they are completely idle.

Oddly, the Heap filter is working fine (as far as I can tell), with it showing all sorts of statistics and such when selecting that view. Wall Time, though, is just always idle.

Any insights into what I'm doing wrong would be much appreciated, if it's actually a bug, or if this isn't even supposed to be used on Google App Engine Standard instances then that too (the documentation says in a few separate places that it is supported though).

I have a few clarifying questions:
First, what type of work is your server doing? The V8 CPU Profiler (which this agent's Wall profiler is built on) only profiles the main event loop. So, if your application does a lot of work reading files, for example, the profiles might be pretty empty.

Next, could you try adding a the filter "Hide stacks: (idles)", like in the following image.
image
It's possible that there is some useful information in the profiles, but that there are enough profiles collected that are 100% idle that the averaged profile is like 99.5% idle, which the UI displays as 100% idle.

Woah. For reference, it is mostly Express, with a huge variety of actions ranging from PDF generation & upload to DB access.

Okay, so turns out this is user error (mine), however I feel like the documentation on the profiler should make it more obvious about hiding the idle stacks, and should almost be enabled by default...

Not sure how it calculates 100% idle, or 99% idle, but regardless I am now able to properly see some stats after adding the filter you specified.

Thank you!