Spark/GraphX application for analysing Maven dependencies as a graph
This Spark/GraphX application lets you analyse the Maven dependencies dataset described in https://ogirardot.wordpress.com/2013/01/11/state-of-the-mavenjava-dependency-graph/
Input
The application runs on Spark and takes two extra command-line arguments:
- args[0]: File path of input file.
- args[1]: Output directory. (The directory has to exist, and the path must not end with a trailing slash.)
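The argument contract above could be checked up front with something like the following minimal sketch. `ArgsCheck` and `parseArgs` are hypothetical names, not taken from the application, and the sketch assumes the "no trailing" restriction refers to a trailing slash. It does not check that the directory exists, since the path may live on a non-local file system.

```scala
object ArgsCheck {
  // Hypothetical helper: returns (inputPath, outputDir) or an error message.
  def parseArgs(args: Array[String]): Either[String, (String, String)] =
    args match {
      // Assumption: the output directory must not end with a slash.
      case Array(_, output) if output.endsWith("/") =>
        Left("Output directory must not end with a slash: " + output)
      case Array(input, output) =>
        Right((input, output))
      // Anything other than exactly two arguments is rejected.
      case _ =>
        Left("Usage: <input file> <output directory>")
    }
}
```

A `main` method could call `ArgsCheck.parseArgs(args)` and exit with the error message on `Left` before creating the Spark context.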
Output
The application writes the following files as output:
- the 25 top-ranking Maven dependencies (according to the PageRank algorithm), i.e. the (groupId, artifactId, version) combinations on which other projects depend most.
- the vertices of the graph
- the edges of the graph
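The ranking step is not shown in this README, so the following is only a sketch of how a top-25 PageRank list could be computed with GraphX. It assumes that each vertex attribute is a string label for one (groupId, artifactId, version) combination, that edges point from a dependent project to its dependency, and that a convergence tolerance of 0.0001 is acceptable. `TopDependencies` and `top` are hypothetical names.

```scala
import org.apache.spark.graphx.Graph

object TopDependencies {
  // Hypothetical helper: returns the n highest-ranked vertex labels with their ranks.
  def top(graph: Graph[String, Int], n: Int = 25): Array[(String, Double)] = {
    // Run PageRank until the ranks change by less than the tolerance (assumed value).
    val ranks = graph.pageRank(0.0001).vertices

    ranks
      .join(graph.vertices)                            // (id, (rank, label))
      .map { case (_, (rank, label)) => (label, rank) }
      .top(n)(Ordering.by[(String, Double), Double](_._2)) // highest rank first
  }
}
```

Because PageRank rewards incoming edges, the edge direction matters: with edges reversed, the list would favour projects that have many dependencies rather than artifacts that are depended upon.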
Requirements
- Scala 2.10.5
- SBT (a version compatible with Scala 2.10)
- Spark 1.4.0
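As an illustration, these requirements might translate into a `build.sbt` along the following lines. The project name and version are placeholders, and the `provided` scope is an assumption that the job is submitted to an existing Spark installation. The project's actual build file may differ.

```scala
// build.sbt (sketch; name and version are placeholders)
name := "maven-dependency-graph"

version := "0.1.0"

scalaVersion := "2.10.5"

// Spark core and GraphX, matching the Spark version listed above.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"   % "1.4.0" % "provided",
  "org.apache.spark" %% "spark-graphx" % "1.4.0" % "provided"
)
```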