vitsalis/PyCG

Source code analysis with PyCG

jonathanooikwanw opened this issue · 5 comments

Hi, is it possible to use PyCG from within a Python script? I would like to analyze TensorFlow's Python sources, which consist of 2800 Python files. I would like to iterate through the folder and write out a JSON file for each Python script.

Thank you for sharing your tool!

An update: is it possible to use PyCG to analyze the source code of large repositories such as TensorFlow, PyTorch and NumPy?

Please see: #8

@ashwinprasadme Sorry, I don't understand. Is it not possible, then, to analyze large source code repositories?

@jonathanooikwanw PyCG currently lacks the instrumentation to analyze external libraries. The limitation is not about large source code repositories, but about how PyCG handles external imports, as detailed in the issue linked above. Work is in progress to handle these as well.

@jonathanooikwanw Maybe this will help you.
I forked the repository and added a new branch called "output-line-number". It adds line numbers for function calls to the output, adds a --dir argument that scans a whole directory, and skips broken Python files during analysis.
You can try it by running:
$ pycg --dir [absolute path of the directory to analyze] -o [output path]
The result is:

  • a newly generated output path ending in "_pycg"
  • one generated JSON file per source file, named after the source file with the extension ".json" added.

Please note: we have tested this branch only on our local files and have not added tests yet, so I'd be happy if you try it and tell me if any error occurs.
The only problem I ran into is:

  • on a medium-performance server, PyCG may overload the CPU or memory, so the OS may kill the script, or the script may freeze after analyzing some portion of the files. However, I succeeded in analyzing about 40000 files (400MB altogether).
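
For anyone who wants the per-file loop from the original question without the forked branch, here is a minimal sketch that drives the stock `pycg` command from Python, one subprocess per file. It makes several assumptions: `pycg` is on the PATH, and it accepts a positional entry point plus `--package` and `-o`, as I understand the upstream README (check with `pycg --help`). The helper names (`output_path_for`, `build_command`, `analyze_tree`), the 300-second timeout, and the output layout are all hypothetical. The layout only loosely mirrors the "_pycg" and ".json" naming described above. Running each file in its own process with a timeout is one possible way to keep a single frozen or failing file from stopping the whole run.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def output_path_for(source: Path, root: Path, out_root: Path) -> Path:
    """Map root/a/b.py to out_root/a/b.py.json (naming is an assumption)."""
    relative = source.relative_to(root)
    return out_root / relative.parent / (relative.name + ".json")


def build_command(source: Path, root: Path, output: Path) -> List[str]:
    """Build one pycg invocation; the flags are assumed from the PyCG README."""
    return ["pycg", "--package", str(root), str(source), "-o", str(output)]


def analyze_tree(root: Path, out_root: Path, timeout_s: float = 300.0) -> List[Path]:
    """Run pycg once per .py file under root; return the files that failed."""
    failed = []
    for source in sorted(root.rglob("*.py")):
        output = output_path_for(source, root, out_root)
        output.parent.mkdir(parents=True, exist_ok=True)
        try:
            subprocess.run(
                build_command(source, root, output),
                check=True,            # non-zero exit raises CalledProcessError
                timeout=timeout_s,     # a frozen analysis raises TimeoutExpired
                capture_output=True,
            )
        except (subprocess.CalledProcessError, subprocess.TimeoutExpired):
            failed.append(source)      # skip this file and keep going
    return failed


if __name__ == "__main__":
    root = Path(sys.argv[1]).resolve()
    out_root = root.with_name(root.name + "_pycg")  # sibling output directory
    for path in analyze_tree(root, out_root):
        print(f"skipped: {path}", file=sys.stderr)
```

Saved under a name of your choice, the script takes the directory to analyze as its only argument. Files that fail or time out are listed on stderr, so they can be retried separately.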