astrofrog/psrecord

Question about memory reporting


Hello,

I discovered this tool very recently and was trying to use it to monitor the memory usage of a deployment tool (Ansible, actually) while it was running. I wanted persistent data that I could review afterwards, rather than having to watch the process by eye.

As such, I want to thank you for making psrecord available and so easy to use! I really like it.

psrecord seems to be a very good fit for what I was looking for, but I found an oddity and wanted to confirm my findings. Note that I'm monitoring with both psrecord and htop in parallel:

  • When htop shows ~3.5GB of memory, psrecord actually records ~6GB
  • The command I run is: psrecord --log memory_usage.log --plot memory_usage_graph.plot "ansible-playbook -i myinventory playbook.yml"

Is it:

  • Because psrecord is summing the memory usage of ansible's child processes (around 5 at a time)?
  • Because htop and psrecord collect their information differently?
  • Because of some other unknown bug?

Because psrecord is summing the memory usage of ansible's child processes (around 5 at a time)?

I don't think so, because I think child processes should not be included unless you pass --include-children.

Because htop and psrecord collect their information differently?

This is possible. Note that psrecord is just a wrapper around psutil, so you could also try using psutil directly to see if that makes a difference. Also, we record both real and virtual memory: is the 6GB value you gave above the real value? (See memory_usage.log for both values.)

Because of some other unknown bug?

I hope not! :) Let me know about the real/virtual values above, and also whether you see the same high value with psutil directly.
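For the psutil comparison suggested above, here is a minimal sketch rather than psrecord's actual code. It samples the real (RSS) and virtual (VMS) memory of a process, first alone and then summed with all of its descendants, which is my understanding of what --include-children does. The helper names memory_mb and watch, the sampling interval, and the MB conversion are assumptions made for illustration.

```python
import time


def memory_mb(proc, include_children=False):
    """Return (real_mb, virtual_mb) for proc, a psutil.Process-like object.

    With include_children=True, the values of all descendants are added
    to those of the parent (assumed to mirror --include-children).
    """
    procs = [proc]
    if include_children:
        procs.extend(proc.children(recursive=True))
    rss = vms = 0
    for p in procs:
        info = p.memory_info()  # namedtuple with .rss and .vms, in bytes
        rss += info.rss
        vms += info.vms
    mb = 1024 ** 2
    return rss / mb, vms / mb


def watch(pid, interval=1.0, samples=10):
    """Print both measurements for the given PID a few times."""
    import psutil  # third-party package: pip install psutil

    proc = psutil.Process(pid)
    for _ in range(samples):
        try:
            alone = memory_mb(proc)
            tree = memory_mb(proc, include_children=True)
        except psutil.NoSuchProcess:
            break  # the process (or a child) exited between calls
        print(f"alone: real={alone[0]:.1f} MB virtual={alone[1]:.1f} MB | "
              f"with children: real={tree[0]:.1f} MB virtual={tree[1]:.1f} MB")
        time.sleep(interval)
```

Calling watch(<pid of ansible-playbook>) while the playbook runs should show whether the high values only appear once children are summed. Keep in mind that the RSS of each process includes pages it shares with others, so a sum over forked children can count the same memory more than once, whereas htop's meter shows system-wide usage.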

Well, in the 6GB case, I think it was about 5.6GB-5.8GB of real memory usage and 6.XGB of virtual memory usage.
At that time, htop was displaying something in the 3.XGB range (for the whole system, not only the ansible process).

Image upload valid for one day only: https://ibb.co/cj9YSR

Here, you can see that psrecord shows ansible using close to 14.7GB of real memory, while htop reports less than 9GB in use at the same time.

That being said, there are many times when both tools' values are very close, but once in a while psrecord goes much higher, stays higher for a while, then comes back down to something close to what I'm observing in htop.

I don't mean to imply that htop is the reference; it's just a pretty standard view for this kind of metric, and I want to make sure my reporting is accurate.

Note that, of course, psrecord reports at a much higher frequency than htop (~1-2 seconds between each update).
This may be part of the explanation, but even then the behavior of the two tools seems inconsistent, hence this question.
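To pin down when the spikes happen without watching htop, the log written by --log can be scanned afterwards. The sketch below assumes the log has whitespace-separated columns in the order elapsed time, CPU (%), real (MB), virtual (MB), with header lines starting with "#". Please check that against your own memory_usage.log. The function name peak_real_memory is made up for this example.

```python
def peak_real_memory(lines):
    """Return (elapsed_seconds, real_mb) of the sample with the highest
    real memory, or None if there are no data lines.

    Assumed column order: elapsed time, CPU (%), real (MB), virtual (MB).
    """
    peak = None
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header
        fields = line.split()
        elapsed, real_mb = float(fields[0]), float(fields[2])
        if peak is None or real_mb > peak[1]:
            peak = (elapsed, real_mb)
    return peak


# Example use (assumes the log file from the command above exists):
# with open("memory_usage.log") as f:
#     print(peak_real_memory(f))
```

The elapsed time of the peak can then be matched against what the playbook was doing at that moment.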