vite: A C++ repository from sg0

*************
-------------
 COMPILATION
-------------
*************

Pass -DDEBUG_PRINTF if detailed diagonostics is required along
program run. This program requires OpenMP and C++11 support,
so pass -fopenmp (for g++)/-qopenmp (for icpc) and -std=c++11/
-std=c++0x. For Cray systems, use CC.

Pass -DUSE_32_BIT_GRAPH if number of nodes in the graph are 
within 32-bit range (2 x 10^9), else 64-bit range is assumed.

Pass -DDONT_CREATE_DIAG_FILES if you dont want to create 2 files
per process with detail diagonostics.

Upon building, the program will generate two binaries:
bin/graphClustering (parallel) and bin/fileConvert (serial).

Please use bin/fileConvert for input graph conversion from 
native formats to a binary format that bin/graphClustering will
be able to read. 

Compiling on Intel KNL:

We made some modifications to the code to port it to Cray XC systems 
with Intel KNL. We use lib-memkind to allocate some heavily used data 
structures to KNL MCDRAM (for usage in flat-mode). 
Please ensure lib-memkind module is loaded and pass 
-DUSE_AUTOHBW_MEMALLOC, and pass -xmic-AVX512 instead of -xHost in 
the Makefile.

************************
------------------------
 INPUT GRAPH CONVERSION
------------------------
************************

Convert input files to binary from various formats. 
Possible options (can be combined):

1. -n               : Input graph in SNAP format
2. -m               : Input graph in Matrix-market format
3. -e               : Input graph in METIS format
4. -p               : Input graph in Pajek format
5. -d               : Input graph in DIMACS format. Pass 0 or 1
                      to indicate undirected/directed graph
6. -s               : Input graph is directed edge list
7. -u               : Input graph is undirected edge list 
                      (Can be used for Graph Challenge official 
                      datasets - http://graphchallenge.mit.edu/) 
8. -x               : Read a number of files with edge list 
                      information, usage e.g.: 	
                      -x "<file-path> <start-chunk> <end-chunk>"
                      Requires conformant file names.
9. -i               : Accept weights as is from the file. If this 
                      option is not passed, then by default the 
                      absolute value of the original weight is 
                      chosen. 
10. -r              : Create random edge weights
11. -w              : Initialize edge weights to 1.0
12. -o              : Output binary file with full path
13. -z              : If the index of input graph is 1-based,
                      then this option makes it 0-based
14. -f              : Option preceding input graph file                      

------------------------
File conversion related
------------------------

Typical example: 
bin/./fileConvert -n -f karate.txt -o karate.bin

If your input file does not have edge weights, and you 
want random edge weights, then pass option "-r". 

bin/./fileConvert -m -r -f karate.mtx -o karate.bin

Note: DIMACS file could be directed or undirected, so
pass 0 if input is directed, or a number > 0 input
graph is undirected. Also, we expect DIMACS inputs to
have weights, therefore passing -r has no effect.

bin/./fileConvert -d 0 -f sample.gr -o sample.bin

Same as the one before, except initializes all weights
to 1, no matter what is in the input file.

bin/./fileConvert -d 0 -w -f sample.gr -o sample.bin

Note: If the input file is index 1-based, then indicate
that by passing -z, such that it will be converted to
0-based internally.

bin/./fileConvert -s -z -f network.dat -o lfrtest.bin

Note: We consider files containing edge list, as 'simple' 
format files (option "-s" for directed, and option "-u" for
undirected). If you have a simple format file (i.e., an 
edge list) that does not have weights, then please pass 
"-w" or "-r" in addition to "-s" or "-u" such that we can 
assign weights internally. By default, we assume edge list
files to have weights.

bin/./fileConvert -f uk2007-edges.txt -s -w -o uk2007.bin

****************************
----------------------------
 SYNTHETIC GRAPH GENERATION
----------------------------
****************************

We have bindings to NetworKit[*], that can help generate random and scale
free graphs. This is mostly serial, but NetworKit internally may use OpenMP. 
We have only tested with NetworKit v4.2, and have no plans on supporting 
other NetworKit versions (we will most probably remove this option from 
future releases).

[*] https://networkit.iti.kit.edu/

Install C++ module of NetworKit, by following instructions from:
https://networkit.iti.kit.edu/get_started.html#build-networkit-from-source

Possible options (can be combined):
1.  -e               : Generate ER graph
2.  -r               : Generate RGG graph
3.  -b               : Generate Barabasi-Albert graph
4.  -h               : Generate hyperbolic graph
5.  -n <...>         : Number of (max) nodes/vertices
6.  -p <...>         : Probability for edge creation (for ER graph)
7.  -k <...>         : Number of clusters for RGG (each unit square is divided into
                       clusters where edges are created within) OR attachments per
                       node for Barabasi-Albert graph
8.  -m <...>         : Initial nodes attached for Barabasi-Albert                      
9.  -z               : Visualize RGG graph (EPS file with graph visualization created)
10. -r               : Create random edge weights 

***********************
-----------------------
 EXECUTING THE PROGRAM
-----------------------
***********************

0. Input file conversion to Vite binary format from the respective native 
file formats: convert the input file to binary format using fileConvert.

1. Set the desired number of processes and threads.  

2. Run distributed parallel Louvain algorithm (graphClustering binary) 
using the binary file created in Step #0:

E.g., pass a binary file using "-f" option and include other heuristics: 
mpiexec -n 2 bin/./graphClustering -c -i -f karate.bin

E.g., generate a random geometric graph by passing #vertices=64:
mpiexec -n 2 bin/./graphClustering -n 64

E.g., generate a random geometric graph by passing #vertices=64
and store the resultant binary graph:
mpiexec -n 2 bin/./graphClustering -n 64 -s rgg64.bin

E.g., read a previously generated random geometric binary graph:
mpiexec -n 2 bin/./graphClustering -f rgg64.bin


Possible options (can be combined):

1. -f <input>    : Input real-world binary file (read #1 above).
2. -c <ncolors>  : Enable coloring, adjacent vertices are processed
                   in different order. Synchronization happens once
                   per <ncolors>.
3. -d <ncolors>  : Enable coloring, adjacent vertices are processes
                   in different order. No synchronization.
4. -i            : Threshold cycling, threshold changes dynamically
                   in every phase of Louvain algorithm.
5. -t 1          : If a vertex is in the same community for past 3
                   iterations, then consider it inactive.
6. -t 3          : If a vertex is in the same community for past 3
                   iterations, then consider it inactive. Also,
                   adds a communication step to gather inactive 
                   vertices and terminate Louvain if >= 90% vertices
                   at an iteration are inactive.
7. -t 2 -a <0-1> : Early termination* using probability alpha.
8. -t 4 -a <0-1> : Early termination* using probability alpha. Also,
                   adds a communication step to gather inactive 
                   vertices and terminate Louvain if >= 90% vertices
                   at an iteration are inactive.
9. -b            : Only valid for real-world inputs. Attempts to 
                   distribute approximately equal number of edges among 
                   processes. Irregular number of vertices owned by a 
                   particular process. Increases the distributed graph 
                   creation time due to serial overheads, but may 
                   significantly improve the overall execution time.
10. -o           : Output communities into a file. This option will result 
                   in Vite dumping the communities (community-per-vertex in 
                   each line, total number of lines == number of vertices) 
                   in a text file named <input-binary-file>.communities in 
                   the same path as the input binary file.
11. -r <nranks>  : This is used to control the number of aggregators in MPI 
                   I/O and is meaningful when an input binary graph file is 
                   passed with option "-f".
                   naggr := (nranks > 1) ? (nprocs/nranks) : nranks;
12. -g <gfile>   : Pass a ground truth file for community comparison. We 
                   expect the ground truth file to contain N lines (equal to 
                   the total #vertices in the graph), while each line containing 
                   a distinct vertex ID and associated community ID, separated by 
                   a space or tab. Ground truth community comparison is performed
                   in a single node, and it uses OpenMP to parallelize. It may take
                   a substantial amount of time for large files.
13. -z           : Only applicable if "-g <gfile>" option is passed. This tells us
                   that the passed ground truth file is 1-based. If this option is
                   not passed, we assume the ground truth to the 0-based.
14. -n <|V|>     : Generate graph in memory (uses a parallel Random Geometric Graph
                   generator).
15. -e <%>       : Used in conjunction with the "-n <|V|>" option to generate RGG. 
                   This option tells the percentage of edges to be added, randomly 
                   connecting vertices across processes. Currently, the maximum number
                   of randomly added edges to RGG cannot exceed INT_MAX (there is no 
                   check to determine this, so exercise caution).
16. -p           : Run a single phase of Louvain.
17. -s <output>  : Only applicable to the generated graphs (for e.g., -n <|V|>),
                   output is a binary file as per the output file path.
18: -j           : Just process (read or generate) the graph and exit without running
                   community detection.

*Note: Option to update inactive vertices percentage is defined as
macro ET_CUTOFF in louvain.hpp, and the default is 2%.

**********************************************
----------------------------------------------
 COMPARING COMMUNITIES WITH GROUND TRUTH DATA
----------------------------------------------
**********************************************

1. Generate benchmark graphs with ground truth
community information. For e.g., we have tested
with LFR benchmark. 
(https://sites.google.com/site/santofortunato/inthepress2)
LFR generates an edge list (index 1-based), along with community
membership of each vertex.

2. Convert the generated network to binary using 
fileConvert binary. 

3. Execute the program as shown in the e.g. below:
mpiexec -n 2 bin/./graphClustering -r 1 -f lfrtest.bin -z -g community.dat

The difference between standard way to execute the program and this way
is that one needs to pass the ground truth vertex-community mapping 
using the "-g" option. Also, the "-z" option tells the program that
the input ground truth file -- `community.dat` is 1-based. If "-z" is
not passed, then we assume that the file is 0-based.

We return some statistics such as F-score, Gini-coefficient, 
false positives/negatives and true positives. We expect that the isolated
vertices will be placed into their respective communities, and not into 
a single one.

However, extra communication per phase is required when these metrics
are to be computed, so the program execution time will increase
significantly. Plus, the community comparison part is currently
multithreaded (uses OpenMP), so everything is brought to a single 
node/PE and then communities are compared. 

***************
---------------
 OUTPUT RESULT
---------------
***************

If -DDONT_CREATE_DIAG_FILES is passed during compilation,
then output is send to stdout.
Otherwise, the output result is dumped per process on files 
named as dat.out.<process-id>. Check dat.out.0 to review 
program diagonostics. 
Output files are cleared with: `make clean`.
sg0/vite