/WMW

A fast heuristic algorithm for community detection in large-scale complex networks.

Primary LanguageC++

Weighted Weak Community Detecion

A fast heuristic algorithm for community detection in large-scale complex networks. This is a reference implementation of the algorithm proposed in http://bdigital.unal.edu.co/69933/. This repository is integral part of the Master's thesis.

Papers:

Please, check the branches for different versions of the algorithm.

Release Notes:

  • Added support for weighted graphs.
  • Weighted Most Weak community definition has been removed since it performs quite similar to the Weighted Weak community definition. So the parameter -C from previous version is no longer required.
  • Community definition and size detection are now merged into a single loop, so we guarantee the final result contains Weak communities with minimum size desired.
  • Self-loops are removed.
  • Zero degree vertices are skipped.
  • The input graph is considered undirected.

To build run in the command line:

$ make clean
$ make

Note: The compiler must be compatible with the C++11 standard.

Command line options:

--quiet -q No verbose.
--input_file -i Input file with the graph in edge list format: node1 node2 [weight].
--weighted -w The input graph is weighted.
--output_file -o Output file with the detected communities in format given by parameter -f.
--output_format -f Output format of the detected communities: 0 for one pair node-community per line. 1 for one community per line
--gml -g Graph in GML format with membership (Useful for visual exploratory analysis in Gephi). Only works if the -O option is not specified
--size_distri_file -z Output file for community size distribution (Community size vs frequency).
--edge_sims_file -s Output file for dynamic structural similarity for each edge in the graph (node1 node2 similarity).
--memb_dist_file -m Output for membership distribution (number of memberships vs frequency).
--node_numbering -n Index of the first node in the graph. Default to 0. Common values {0, 1}.
--min_comm_size -K Minimum community size in the result. Default value 2.
--dss_iterations -I Number of iterations to perform by the Dynamic Structural Similarity. Default to 2.
--overlap -O Detect overlapping communities: 0 for fuzzy communities. 1 for crisp communities with threshold -T.
--crisp_threshold -T Threshold used to generate the crisp overlapping communities. Default to 0.05. Common values {0.001, 0.005, 0.01, 0.02, 0.03, 0.04, 0.05}

Example of Disjoint Community Detection:

./master -i input_graph.txt -o output_memberships.txt -g output_gml.gml

Example of Overlapping Community Detection:

./master -i input_graph.txt -o output_communities.txt -O 1 -T 0.05