/PageRank-MPI

MPI based implmentation of the page rank algorithm

Primary LanguageGroff

This is MPI implementation of the PageRankalgorithm. If you have stampede, you can enjoy the following MPI Pagerank implementation.

Dataset: We used LiveJournal (https://snap.stanford.edu/data/soc-LiveJournal1.html) for all our tests.

One needs to be aware of the following components to run the code:

1) We want to split the graph using METIS. METIS requires the graph to be undirected, therefore we convert
   our directed graph into undirected graph. Also the graph we are given is in the edgelist form. Therefore,
   we convert the edgelist into the adjacency list (directed and undirected both).
   To undirected just for the sake of metis and directed to get the actual graph, which is to be used in
   the following computations.

   python edgelist_2_adj.py --help (Converts edgelist to actual graph)
   python edgelist_2_adj_metis.py --help (Just for the metis)

   Also, metis needs to have number of nodes and edges on the top of file. So, we need to count
   the edges in the new undirected graph produced from edgelist_2_adj_metis.py

   python calculate_nodes_edges.py --help

2)  graph_splitter.cpp/graph_splitter.py 
    
    These two scripts are just the same functionality, just that cpp version is highly optimized for performance.
    
    Sample Invocation: ./graph_splitter ./datasets/soc-LiveJournal1.txt_adjacency 4847571 ./metis-5.1.0/build/metis-install/bin/gpmetis 2 ./LiveJournalSplits_new/2/
    Result of the these scripts would be something like:

    [nanda@TACC]~/parallel_algo/PageRank-MPI$ ls -al ./LiveJournalSplits_new/2/
    total 625772
    drwx------ 2 nanda G-800722      4096 May 12 09:34 .
    drwx------ 7 nanda G-800722      4096 May 12 09:33 ..
    -rw------- 1 nanda G-800722 320836748 May 12 09:34 soc-LiveJournal1.txt_adjacency.split.0
    -rw------- 1 nanda G-800722 319933375 May 12 09:34 soc-LiveJournal1.txt_adjacency.split.1

3) Last thing and most important thing is to run the page_rank.C. This is where the main action happens.
   Sample Invocation:

   mpic++ -fopenmp Graph.C Node.C page_rank.C -o page_rank_openmp
   ibrun tacc_affinity <path>/parallel_algo/PageRank-MPI/page_rank_openmp <path>/parallel_algo/PageRank-MPI/LiveJournalSplits_new/2/soc-LiveJournal1.txt_adjacency

Other stuff:

1) srun -p development -t 0:30:00 -n 4 -N 4 --pty /bin/bash -l 

    Requests 4 MPI tasks on 4 Nodes on STAMPEDE.

Caution: Very less effort has been spent on the correctness, only preliminary tests were done to verify correctness.