A collection of graphs with non-overlapping communities and ground-truth partitions. All graphs are converted to GML format; the ground-truth labels are stored in the "gt" attribute.
The largest connected component of each graph is stored in the /gml_connected_subgraphs folder.
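
As a minimal sketch of how these files might be read, the snippet below loads one GML file with NetworkX, collects the "gt" labels, and extracts the largest connected component. It assumes that "gt" is a node attribute and that directed graphs are treated as undirected when computing components; the function names are illustrative and not part of this repository.

```python
import networkx as nx


def load_graph_with_labels(path):
    """Read a GML file and return the graph plus a {node: label} dict.

    Assumption: the ground-truth community is a node attribute named "gt".
    """
    # label="id" keys nodes by their GML id rather than by the "label" field.
    graph = nx.read_gml(path, label="id")
    labels = nx.get_node_attributes(graph, "gt")
    return graph, labels


def summarize(graph, labels):
    """Return (n_nodes, n_edges, n_classes) for a loaded graph."""
    n_classes = len(set(labels.values()))
    return graph.number_of_nodes(), graph.number_of_edges(), n_classes


def largest_connected_component(graph):
    """Return a copy of the subgraph induced by the largest component.

    Assumption: for directed graphs, edge direction is ignored here.
    """
    undirected = graph.to_undirected() if graph.is_directed() else graph
    biggest = max(nx.connected_components(undirected), key=len)
    return graph.subgraph(biggest).copy()
```

A call such as `load_graph_with_labels("gml_connected_subgraphs/karate.gml")` would be the typical use; the file name in that path is a guess and should be replaced with the actual file.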
| | family | name | n_nodes | n_edges | n_classes |
|---|---|---|---|---|---|
| 0 | | as: AS Internet topology. AS Internet topology of June 2009, extracted from data collected by the Archipelago active measurement infrastructure developed by the Cooperative Association for Internet Data Analysis. Marián Boguñá et al. Sustaining the internet with hyperbolic mapping (2010) | 23752 | 58416 | 176 |
| 1 | | citeseer: CiteSeer. Citation network extracted from the CiteSeer digital library. Nodes are publications and the directed edges denote citations. Kurt Bollacker et al. CiteSeer: An autonomous Web agent for automatic retrieval and identification of interesting publications (1998) | 3327 | 4676 | 6 |
| 2 | cora | cora: Cora Citation Network. The original paper describes the CoRA dataset with 2708 nodes. Sen, Prithviraj, et al. Collective classification in network data (2008) | 2708 | 5278 | 7 |
| 3 | cora | cora_full: Cora Citation Network. Nodes represent scientific papers. An edge between two nodes indicates that the left node cites the right node. Lovro Šubelj and Marko Bajec. Model of complex networks based on citation dynamics (2013) | 23166 | 89157 | 70 |
| 4 | cora_full subset | Artificial_Intelligence | 4900 | 13217 | 11 |
| 5 | cora_full subset | Artificial_Intelligence__Machine_Learning | 3445 | 10761 | 7 |
| 6 | cora_full subset | Data_Structures__Algorithms_and_Theory | 1937 | 4457 | 9 |
| 7 | cora_full subset | Databases | 1046 | 3186 | 7 |
| 8 | cora_full subset | Encryption_and_Compression | 864 | 1995 | 3 |
| 9 | cora_full subset | Hardware_and_Architecture | 763 | 1644 | 7 |
| 10 | cora_full subset | Human_Computer_Interaction | 1107 | 2385 | 5 |
| 11 | cora_full subset | Information_Retrieval | 457 | 1157 | 4 |
| 12 | cora_full subset | Networking | 1249 | 4022 | 4 |
| 13 | cora_full subset | Operating_Systems | 2176 | 8731 | 4 |
| 14 | cora_full subset | Programming | 3109 | 10564 | 9 |
| 15 | | dolphins: Dolphin social network. An undirected social network of frequent associations between 62 dolphins in a community living off Doubtful Sound, New Zealand. David Lusseau et al. The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations (2003) | 62 | 159 | 2 |
| 16 | | eu-core: email-Eu-core network. The network was generated using email data from a large European research institution. Jure Leskovec et al. Graph evolution: Densification and shrinking diameters (2007) | 1005 | 16706 | 42 |
| 17 | | eurosis: EuroSiS web mapping study. Mapping interactions between Science in Society actors on the Web of 12 European countries. | 1285 | 7524 | 13 |
| 18 | | football: American College football. Network of American football games between Division IA colleges during regular season Fall 2000. Mark EJ Newman and Michelle Girvan. Finding and evaluating community structure in networks (2004) | 115 | 613 | 12 |
| 19 | | karate: Zachary's karate club. Social network of friendships between 34 members of a karate club at a US university in the 1970s. W. W. Zachary. An information flow model for conflict and fission in small groups (1977) | 34 | 78 | 2 |
| 20 | newsgroup | news_2cl1. Confusion graphs generated from the Newsgroup 20 dataset. Yen, Luh, et al. Graph nodes clustering based on the commute-time kernel (2007) | 400 | 33854 | 2 |
| 21 | newsgroup | news_2cl2 | 398 | 21480 | 2 |
| 22 | newsgroup | news_2cl3 | 399 | 36527 | 2 |
| 23 | newsgroup | news_3cl1 | 600 | 70591 | 3 |
| 24 | newsgroup | news_3cl2 | 598 | 68201 | 3 |
| 25 | newsgroup | news_3cl3 | 595 | 64169 | 3 |
| 26 | newsgroup | news_5cl1 | 998 | 176962 | 5 |
| 27 | newsgroup | news_5cl2 | 999 | 164452 | 5 |
| 28 | newsgroup | news_5cl3 | 997 | 155618 | 5 |
| 29 | newsgroup_0.1. Binarized weights with threshold 0.1 | news_2cl1_0.1 | 398 | 2634 | 2 |
| 30 | newsgroup_0.1 | news_2cl2_0.1 | 398 | 2455 | 2 |
| 31 | newsgroup_0.1 | news_2cl3_0.1 | 398 | 3347 | 2 |
| 32 | newsgroup_0.1 | news_3cl1_0.1 | 599 | 5129 | 3 |
| 33 | newsgroup_0.1 | news_3cl2_0.1 | 598 | 5041 | 3 |
| 34 | newsgroup_0.1 | news_3cl3_0.1 | 595 | 4557 | 3 |
| 35 | newsgroup_0.1 | news_5cl1_0.1 | 998 | 11525 | 5 |
| 36 | newsgroup_0.1 | news_5cl2_0.1 | 999 | 10194 | 5 |
| 37 | newsgroup_0.1 | news_5cl3_0.1 | 997 | 9791 | 5 |
| 38 | | polblogs: Political blogs. A directed network of hyperlinks between weblogs on US politics, recorded in 2005 by Adamic and Glance. Lada A Adamic and Natalie Glance. The political blogosphere and the 2004 US election: divided they blog (2005) | 1490 | 19025 | 2 |
| 39 | | polbooks: Books about US politics. A network of books about US politics published around the time of the 2004 presidential election and sold by the online bookseller Amazon.com. Edges between books represent frequent copurchasing of books by the same buyers. Mark EJ Newman. Modularity and community structure in networks (2006); V. Krebs, unpublished, http://www.orgnet.com/ | 105 | 441 | 3 |
| 40 | sp_school | sp_school_day_1: SocioPatterns: Primary school day 1. Primary School: face-to-face proximity between students and teachers. Juliette Stehlé et al. High-Resolution Measurements of Face-to-Face Contact Patterns in a Primary School (2011) | 236 | 5899 | 11 |
| 41 | sp_school | sp_school_day_2: SocioPatterns: Primary school day 2 | 238 | 5539 | 11 |
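
The newsgroup_0.1 family is described as the newsgroup graphs with weights binarized at threshold 0.1. A rough sketch of one way such a binarization could be done is shown below. It assumes the edge weights are stored in an attribute named "weight" and that edges with weight at or above the threshold are kept; neither detail is confirmed here, and the function name is illustrative.

```python
import networkx as nx


def binarize(graph, threshold=0.1, weight="weight"):
    """Return an unweighted graph that keeps only sufficiently heavy edges.

    Assumptions: weights live in the `weight` edge attribute, edges without
    that attribute count as weight 0, and the comparison is `>= threshold`.
    """
    binary = nx.Graph()
    # Copy every node with its attributes (including "gt", if present).
    binary.add_nodes_from(graph.nodes(data=True))
    # Keep an edge only when its weight reaches the threshold.
    binary.add_edges_from(
        (u, v)
        for u, v, w in graph.edges(data=weight, default=0.0)
        if w >= threshold
    )
    return binary
```

This sketch keeps every node, including any that end up isolated; whether the published files do the same is not stated here.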
If you find this repository useful, please consider citing our paper:
@inproceedings{ivashkin2016logarithmic,
title={Do logarithmic proximity measures outperform plain ones in graph clustering?},
author={Ivashkin, Vladimir and Chebotarev, Pavel},
booktitle={International Conference on Network Analysis},
pages={87--105},
year={2016},
organization={Springer}
}