Get a handle on unfamiliar code by extracting and visualising the natural language programmers used when writing it.
<language>-code <source-file-or-directory>* | code-to-words <stop-word-file> [<stop-word-file>...] | wordcloud -o <output-file>.png
For example:
java-code project/src/ | code-to-words java-stop-words cargo-cult-java-stop-words | wordcloud -o project.png
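To give a feel for what the middle stage of the pipeline has to do, here is a rough sketch of a word-extraction step written with the same kind of tools (sed, tr, grep). It is not the project's code-to-words script: the function names split_words and drop_stop_words are hypothetical, and it assumes a stop-word file lists one word per line.

```shell
# Hypothetical stand-in for a word-extraction stage (not the real code-to-words).

# Split source text into lower-case words, one per line.
split_words() {
  sed -E 's/([a-z0-9])([A-Z])/\1 \2/g' |  # break camelCase: userName -> user Name
    tr -cs '[:alpha:]' '\n' |             # every run of non-letters becomes a newline
    tr '[:upper:]' '[:lower:]' |          # normalise case
    grep -v '^$'                          # drop the empty line left by leading punctuation
}

# Remove every word that appears in the stop-word file given as $1
# (assumed format: one word per line).
drop_stop_words() {
  grep -v -x -F -f "$1"
}

# Usage: build a tiny stop-word file and run one line of Java through both steps.
demo_stop_words=$(mktemp)
printf 'public\nint\nreturn\nget\n' > "$demo_stop_words"
printf 'public int getUserName() { return userName; }\n' |
  split_words | drop_stop_words "$demo_stop_words"
# prints user, name, user, name (one word per line)
```

Matching whole lines as fixed strings (-x -F) keeps a stop word such as "int" from also removing longer words that merely contain it, such as "print".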
Languages supported:
- C:
  - c-code
  - c-stop-words: most C keywords
  - c-primitive-type-stop-words: ignores basic C types (int, char, etc.)
- C++:
  - c++-code
  - c++-stop-words: most C++ keywords
  - c-primitive-type-stop-words: ignores basic C types (int, char, etc.)
- Java:
  - java-code
  - java-stop-words: most keywords
  - java-primitive-type-stop-words: ignores primitive types
  - cargo-cult-java-stop-words: ignores get, set, bean, etc.
- Python:
  - python-code
  - python-stop-words: most keywords
- Ruby:
  - ruby-code
  - ruby-stop-words
Example visualisations of various applications are in the examples/ directory.
To extract text from source code, you need:
- Bash
- GNU sed
- Awk
- Java 1.6
The scripts should work on any desktop Linux. They do not work on macOS unless you install the GNU command-line tools.
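A quick way to tell whether the GNU tools are the ones on your PATH is to ask a tool for its version banner. The sketch below is only a suggestion and not part of this project; the helper name is_gnu is made up for illustration. It relies on GNU utilities printing a banner containing "GNU" for --version, whereas the BSD sed shipped with macOS rejects that option.

```shell
# Hypothetical helper: succeeds if the named command reports itself as a GNU tool.
is_gnu() {
  # Errors (unknown option, missing command) are discarded, so they count as "not GNU".
  "$1" --version 2>/dev/null | grep -q 'GNU'
}

if is_gnu sed; then
  echo "sed: GNU version found"
else
  echo "sed: not GNU sed; install the GNU command-line tools first"
fi
```

The same check can be pointed at make before compiling the wordcloud generator.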
To compile the Java wordcloud generator, you need:
- JDK 1.6
- GNU Make