DCD (Deep learning-based Community Detection) applies state-of-the-art deep learning techniques to identify communities in large-scale networks. Compared with existing community detection methods, DCD offers a unified solution for many variants of the community detection problem.
DCD provides implementations of 4 community detection algorithms, 1 evaluation method, and two types of networked data:
Function | Description | Input | Output |
---|---|---|---|
K-Means | Baseline (1) | Network node file, network edge file, performance evaluation flag, K | `<node id, community id>` |
MM | Baseline (2) | Network node file, network edge file | `<node id, community id>` |
DCD | DCD | Network node file, network edge file, performance evaluation flag, node attribute flag, K | `<node id, community id>` |
Random network generation | Generate random network datasets | Network size, community size, probability of edges within communities, probability of edges between communities, directed network flag | `<node id, community id>`, network node file, network edge file |
Load Dataset | Load Facebook, citation, or user-provided datasets | Dataset name | Facebook dataset, citation dataset |
The library is compatible with Python 3.6 and 3.7 and requires:
NetworkX >= 2.3
Install it with pip:
pip3 install pydcd
Here is a quick-start example.
Python 3.7.3 (default, January 01 2020, 09:00:00)
[Clang 4.0.1 (tags/RELEASE_401/final)] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from pydcd import DCD, KM, MM
>>> kmeans_detector = KM(10)
>>> kmeans_detector.km_detect_community('fb_nodes.txt','fb_edges.txt','N') # 'N' skips the performance evaluation
>>> mm_detector = MM()
>>> mm_detector.mm_detect_community('fb_nodes.txt','fb_edges.txt','Y') # 'Y' shows the performance evaluation
>>> dcd_detector = DCD() # use the default settings for initialization, or
>>> dcd_detector = DCD(128,64,128,50) # set the number of neurons in the three hidden layers and the output dimension
>>> dcd_detector.dcd_detect_community('fb_nodes_withattributes.txt','fb_edges.txt','Y','N') # the first flag 'Y' means the nodes have attributes
>>> dcd_detector.dcd_detect_community('fb_nodes_noattributes.txt','fb_edges.txt','N','N') # the first flag 'N' means the nodes have no attributes
>>> from pydcd import RandNet # assumed to be exported by pydcd alongside DCD, KM, and MM
>>> rn = RandNet() # random network generator
>>> rn.generate_random_networks(1000,100,0.2,0.05) # undirected network with 1000 nodes and 100 communities
>>> rn.generate_random_networks(1000,100,0.2,0.05,directed=True) # directed network with 1000 nodes and 100 communities
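To make the random network parameters more concrete, here is a minimal pure-Python sketch of a planted-partition generator. It is not PyDCD's implementation: the function name `planted_partition`, the `seed` argument, and the rule that consecutive node ids share a community are assumptions made for illustration. It also assumes the second argument is the number of nodes per community, as the "community size" input suggests.

```python
import random
from itertools import combinations, permutations


def planted_partition(n_nodes, community_size, p_in, p_out, directed=False, seed=None):
    """Return (membership, edges) for a random network with planted communities.

    Hypothetical helper for illustration; not part of pydcd.
    """
    rng = random.Random(seed)
    # Assumption: consecutive blocks of `community_size` node ids form one community.
    membership = {node: node // community_size for node in range(n_nodes)}
    # Ordered pairs for directed networks, unordered pairs otherwise.
    if directed:
        pairs = permutations(range(n_nodes), 2)
    else:
        pairs = combinations(range(n_nodes), 2)
    edges = []
    for u, v in pairs:
        # Use the within-community probability when both endpoints share a community.
        p = p_in if membership[u] == membership[v] else p_out
        if rng.random() < p:
            edges.append((u, v))
    return membership, edges


membership, edges = planted_partition(60, 20, 0.5, 0.02, seed=0)
print(len(set(membership.values())), "communities,", len(edges), "edges")
```

With a within-community probability well above the between-community probability, the planted communities should be dense internally and sparsely connected to each other, which is what makes such networks useful as ground truth for evaluation.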
node file without attributes:
node_id_1
node_id_2
node_id_3
...
node_id_n
node file with attributes:
node_id_1 <tab> value_for_attribute_1 value_for_attribute_2 ... value_for_attribute_m
node_id_2 <tab> value_for_attribute_1 value_for_attribute_2 ... value_for_attribute_m
node_id_3 <tab> value_for_attribute_1 value_for_attribute_2 ... value_for_attribute_m
...
node_id_n <tab> value_for_attribute_1 value_for_attribute_2 ... value_for_attribute_m
edge file:
node_id_1 node_id_2
...
node_id_i node_id_j
...
node_id_m node_id_k
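The formats above can be produced and checked with a few lines of standard-library Python. This is only a sketch: the helper names are hypothetical and not part of pydcd, and it assumes attribute values are separated by single spaces after the tab and that the two node ids in an edge line are separated by a single space. Adjust the separators if your files differ.

```python
import os
import tempfile


def write_node_file(path, node_ids, attributes=None):
    """Write one node per line; attribute values follow a tab, space-separated."""
    with open(path, "w") as f:
        for node in node_ids:
            if attributes is None:
                f.write(f"{node}\n")
            else:
                values = " ".join(str(v) for v in attributes[node])
                f.write(f"{node}\t{values}\n")


def write_edge_file(path, edges):
    """Write one edge per line as two node ids separated by a space (assumed)."""
    with open(path, "w") as f:
        for u, v in edges:
            f.write(f"{u} {v}\n")


def read_node_file(path):
    """Return {node_id: [attribute strings]}; the list is empty without attributes."""
    nodes = {}
    with open(path) as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue
            node, _, rest = line.partition("\t")
            nodes[node] = rest.split()
    return nodes


def read_edge_file(path):
    """Return a list of (source, target) pairs, skipping malformed lines."""
    edges = []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) == 2:
                edges.append((parts[0], parts[1]))
    return edges


with tempfile.TemporaryDirectory() as tmp:
    node_path = os.path.join(tmp, "nodes.txt")
    edge_path = os.path.join(tmp, "edges.txt")
    write_node_file(node_path, ["a", "b", "c"], {"a": [1, 0], "b": [0, 1], "c": [1, 1]})
    write_edge_file(edge_path, [("a", "b"), ("b", "c")])
    print(read_node_file(node_path))
    print(read_edge_file(edge_path))
```

Reading the files back before passing them to a detector is a cheap way to catch mismatched node ids or rows with a different number of attribute values.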
PyDCD is developed by Prof. Kunpeng Zhang, Prof. Shaokun Fan, and Prof. Bruce Golden.
If you find this useful for your research or development, please cite our work.