joerick/pyinstrument

Pyinstrument hiding useful frames

umairanis03 opened this issue · 8 comments

Profiling internal functions with pyinstrument is difficult in non-interactive mode, because by default pyinstrument hides frames whose file path contains /lib/.

I found this old comment from the maintainer on an old related issue:

Normally, using the default --hide param /lib/, if those files are in a path that contains /lib/, they are considered to be third party libraries. This catches anything that has been installed by pip and hides it.

Since almost all of our (D.E. Shaw) Python production code lives under a path containing /lib/, pyinstrument treats it as third-party library code.

These frames are expandable in the HTML output, though.

The workaround we currently use for non-interactive mode: pyinstrument --show '*/prod/*' <your-file.py>
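To make the behaviour described above concrete, here is a minimal sketch of glob-based frame hiding. It does not use pyinstrument's own code: the function `is_hidden`, the pattern `*/lib/*` standing in for the default --hide value, and the example paths are all assumptions chosen for illustration. It only shows why a production path containing /lib/ is caught by the default, and how a --show pattern such as '*/prod/*' overrides that.

```python
from fnmatch import fnmatchcase

# Assumption: the default hide rule behaves like the glob '*/lib/*'.
DEFAULT_HIDE = "*/lib/*"


def is_hidden(path, hide=DEFAULT_HIDE, show=None):
    """Return True if a frame from `path` would be collapsed in the output."""
    # A matching show pattern takes precedence over the hide pattern.
    if show is not None and fnmatchcase(path, show):
        return False
    return fnmatchcase(path, hide)


# Hypothetical file paths, for illustration only.
stdlib_file = "/usr/lib/python3.11/json/decoder.py"
prod_file = "/srv/prod/lib/pricing/model.py"
script_file = "/home/me/project/app.py"

for path in (stdlib_file, prod_file, script_file):
    print(path, is_hidden(path), is_hidden(path, show="*/prod/*"))
```

With the default pattern alone, the production file is hidden exactly like the standard-library file; adding the show pattern makes it visible again while the standard-library file stays hidden.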

One solution we considered internally is to have better global defaults for our installation. What do you think about this idea, or could you suggest better alternatives?

Hmm. Possibly there's a better heuristic I can use than /lib/. Basically I'm trying to detect code that has been installed with pip or conda, or is built in. Ideas welcome!

The other thing I wonder is if you could put some pyinstrument options in a pyproject.toml, similar to how tools like black store options there. Would that work in your project structure?

Might be better to use the internal import machinery tools to have better heuristics about when a package is third-party (or built-in).

Also, this would be harder to do, but it would be useful to have some interactive way on the HTML pages to filter/unhide frames. And just generally speaking, anything that can be done in the HTML pages to make it easier to navigate really big traces would be useful.

Thanks for chipping in @asmeurer! What import mechanisms do you speak of? I've only looked into sys.path, but found that I couldn't easily distinguish between system and user paths.

On the broader point, I'd love to see some examples of the kinds of profile sessions you're seeing that would benefit from more interactivity. Perhaps you could share the pyisession files and the specific thing that you're trying to drill into? I tend to work on smaller-scale projects, so maybe I don't see the need for that kind of feature, but I'd love to improve the functionality for larger projects.

I think this is what you're looking for. It's honestly really complicated and very customizable, but most people don't customize it, so you can probably make some good heuristics: https://docs.python.org/3/reference/import.html (sorry for not giving more details. I'm not really an expert on the Python import system)
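One way to get at the system/user distinction without parsing sys.path is the standard-library sysconfig module, which reports where the interpreter keeps the standard library and where packages get installed. The sketch below is only an illustration of that idea, not something proposed in this thread or implemented in pyinstrument; the `classify` function and its three labels are hypothetical.

```python
import os
import sysconfig


def _is_under(path, root):
    """True if `path` is `root` itself or lives somewhere below it."""
    path = os.path.realpath(path)
    root = os.path.realpath(root)
    try:
        return os.path.commonpath([path, root]) == root
    except ValueError:  # e.g. different drives on Windows
        return False


def classify(filename):
    """Label a file as 'third-party', 'stdlib' or 'user' (hypothetical labels)."""
    paths = sysconfig.get_paths()
    # Check the install directories first: site-packages usually sits
    # inside the standard-library directory.
    for key in ("purelib", "platlib"):
        if _is_under(filename, paths[key]):
            return "third-party"
    for key in ("stdlib", "platstdlib"):
        if _is_under(filename, paths[key]):
            return "stdlib"
    return "user"


install_paths = sysconfig.get_paths()
print(classify(os.path.join(install_paths["purelib"], "requests", "api.py")))
print(classify(os.path.join(install_paths["stdlib"], "json", "decoder.py")))
print(classify("/srv/prod/lib/pricing/model.py"))  # hypothetical production path
```

Under this scheme a production file under /lib/ is labelled as user code, because the decision depends on where the interpreter installs things rather than on the directory name. Editable installs and extra sys.path entries would still need separate handling.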

@joerick Are we going with what @asmeurer has suggested? If not, a configuration option (either through pyproject.toml or otherwise) would work for us. I will also check internally whether we have any preference.

@joerick I discussed this internally: a hierarchical config system like black's would be fine for us. CC @mlucool
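For reference, the lookup half of such a hierarchical scheme could look like the sketch below: start from the directory of the profiled script and walk up until a pyproject.toml is found. This is an assumption about how the option might work, not existing pyinstrument behaviour and not a copy of black's algorithm; the helper name `find_pyproject` is hypothetical.

```python
from pathlib import Path


def find_pyproject(start):
    """Return the nearest pyproject.toml at or above `start`, or None."""
    start = Path(start).resolve()
    # Check the starting directory first, then each parent up to the root.
    for directory in (start, *start.parents):
        candidate = directory / "pyproject.toml"
        if candidate.is_file():
            return candidate
    return None


print(find_pyproject(Path.cwd()))
```

A site-wide default could then be layered underneath by falling back to a fixed, installation-level file when the walk returns None.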

I did have another idea just now... to have another mechanism that captures sys.prefix when the profile is captured and checks whether the files reside in a subpath of that. That might avoid the need for a config option at all. If there aren't any edge cases there, I think that might be my preferred option, as it would avoid the issue for everyone who stores code in a /lib/ folder.
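A rough sketch of what that check could look like, assuming the prefixes are recorded at capture time and compared against each frame's file path. The function names are hypothetical, and including sys.base_prefix (which differs from sys.prefix inside a virtual environment) is an addition of this sketch rather than part of the idea as stated.

```python
import os
import sys


def captured_prefixes():
    """Interpreter prefixes to record when the profile is captured."""
    # Inside a virtual environment sys.prefix points at the venv, while
    # sys.base_prefix points at the interpreter the venv was created from.
    candidates = (sys.prefix, sys.base_prefix, sys.exec_prefix, sys.base_exec_prefix)
    return {os.path.realpath(p) for p in candidates}


def is_under_prefix(filename, prefixes):
    """True if `filename` lies inside any of the recorded prefixes."""
    filename = os.path.realpath(filename)
    for prefix in prefixes:
        prefix = os.path.realpath(prefix)
        try:
            if os.path.commonpath([filename, prefix]) == prefix:
                return True
        except ValueError:  # e.g. different drives on Windows
            continue
    return False


prefixes = captured_prefixes()
print(is_under_prefix(os.path.join(sys.prefix, "lib", "example.py"), prefixes))
print(is_under_prefix("/srv/prod/lib/pricing/model.py", ["/opt/python"]))
```

One edge case to keep in mind: when sys.prefix is a broad directory such as /usr, any user code that happens to live below it would be hidden as well.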

@joerick Did you verify your idea? I am afraid I did not understand how you would differentiate site-wide packages using sys.prefix.