jokkedk/webgrind

Handling large cache files

Closed this issue · 8 comments

What steps will reproduce the problem?

  1. Create a large cachegrind file

What is the expected output? What do you see instead?
When I tried to load a 60MB cachegrind file, my browser got stuck because of the
amount of data returned.

What version of the product are you using? On what operating system?
1.0

Please provide any additional information below
Perhaps you could process the cachegrind file first, split the results
into several HTML files, and save those, so there would be some kind of
simple cache?

The initial amount of data returned to the browser when loading a cachegrind file is determined by the number of different functions
called in the profiled script. In other words, the amount of data returned does not depend directly on the size of the cachegrind file.

What does depend on the size of the cachegrind file is the time it takes to preprocess it. The preprocessing is done once for
each profile, and the result is cached in a temp file.

On my reasonably fast computer, preprocessing profiles of 60MB and larger works fine, so I suspect that something else is causing your
problem.

Are you having this problem with all profiles of this size or is this the only one you have tried?

I have seen a similar problem with a heavy cachegrind file... only 11MB in size, but some
functions get called 11,000 times and Apache ends up eating all available RAM. I
suspect this is just a consequence of loading all those calls into memory rather than an actual
issue with webgrind, but I haven't looked into it more deeply. The next time it happens
I'll attach an example cachegrind file.

I'm also bothered by insanely high memory usage (1GB+) when processing ~60MB callgrind files. The issue seems to be that Preprocessor.php loads ALL of the function call information into RAM before writing any of it back out to its intermediate file. Since webgrind outputs an intermediate flat file, it is difficult to update call counts without this read-all/write-all workflow: you don't know where to seek in the file, you would need to leave space for an arbitrarily large counter, and so on.

My question is, why not output to an intermediate SQLite database instead, so you don't have to hold the entire call graph in memory? I'd be happy to provide a pull request, I just want to make sure I'm not missing something first.
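
Roughly what I have in mind, as a minimal sketch rather than a proposal for webgrind's actual schema or internals: stream the callgrind file and accumulate per-call-site totals in SQLite, so nothing beyond the current line has to stay in PHP memory. (Needs SQLite 3.24+ for the upsert syntax; all table, column, and function names below are illustrative only.)

```php
<?php
// Minimal sketch only — hypothetical table/column names, not webgrind's schema.
$db = new PDO('sqlite:' . sys_get_temp_dir() . '/webgrind-intermediate.sqlite');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE IF NOT EXISTS calls (
    caller TEXT, callee TEXT, cost INTEGER, count INTEGER,
    PRIMARY KEY (caller, callee))');

$upsert = $db->prepare(
    'INSERT INTO calls (caller, callee, cost, count) VALUES (:caller, :callee, :cost, 1)
     ON CONFLICT(caller, callee) DO UPDATE SET
         cost  = cost  + excluded.cost,
         count = count + 1'
);

// This would be invoked from the existing parse loop each time a cfn=/calls=
// pair is seen; counts are accumulated in SQLite instead of a PHP array.
function addCall(PDOStatement $upsert, string $caller, string $callee, int $cost): void
{
    $upsert->execute([':caller' => $caller, ':callee' => $callee, ':cost' => $cost]);
}
```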

Memory usage depends on the number of different functions called and the number of different places each function is called from; it does not depend on the number of times a function is called from the same place. The current approach was designed with memory usage in mind, but in some cases it may still cause issues...
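
To illustrate with a simplified sketch (this is not the actual Preprocessor.php code, just the shape of the data it keeps): the working set holds one entry per function plus one entry per distinct call position, and a repeat call from the same position only increments existing counters.

```php
<?php
// Simplified illustration of the memory model described above — not webgrind's
// actual code. $functions grows with the number of distinct functions and
// distinct call positions, not with the total number of calls made.
$functions = [];

function registerCall(array &$functions, string $caller, int $line, string $callee, int $cost): void
{
    $key = $line . ':' . $callee;  // one slot per distinct call position
    if (!isset($functions[$caller]['calls'][$key])) {
        $functions[$caller]['calls'][$key] = ['count' => 0, 'cost' => 0];
    }
    // A function called 11,000 times from the same line only bumps these two
    // numbers; it does not add 11,000 separate entries.
    $functions[$caller]['calls'][$key]['count']++;
    $functions[$caller]['calls'][$key]['cost'] += $cost;
}
```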

I am not against using a different way of storing the information, or even designing some fast way of using the information directly from the original file.

But it is important to keep an eye on the time it takes to generate the profiling information.

Is there any way you could share this 60MB callgrind file?

I still have this problem. Did anyone else solve it?

Does this problem still occur in master (or release 1.3)? If so, could you supply the cachegrind file (the above links are dead)?

Presuming this was resolved by the binary preprocessor. If someone has a cachegrind file that still exhibits this behaviour, please post a link and reopen this issue.