
🌐 A from-scratch implementation of the PageRank algorithm that ranks web pages in a large network of hyperlinks, using the Stanford web graph dataset.
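To make the idea concrete, here is a minimal sketch of PageRank by power iteration. It is not the notebook's code: the function name `pagerank`, the damping factor 0.85, the uniform redistribution of rank from pages without out-links, and the 0-based node ids are all assumptions chosen for illustration.

```python
import numpy as np

def pagerank(edges, num_nodes, damping=0.85, tol=1e-10, max_iter=100):
    """Illustrative power-iteration PageRank (assumes node ids 0..num_nodes-1)."""
    src = np.array([s for s, _ in edges], dtype=np.int64)
    dst = np.array([d for _, d in edges], dtype=np.int64)
    out_degree = np.bincount(src, minlength=num_nodes).astype(float)
    dangling = out_degree == 0  # pages with no outgoing links
    rank = np.full(num_nodes, 1.0 / num_nodes)  # start from a uniform distribution

    for _ in range(max_iter):
        # Each page splits its current rank equally among its out-links.
        contrib = rank[src] / out_degree[src]
        link_rank = np.bincount(dst, weights=contrib, minlength=num_nodes)
        # Assumption: rank held by dangling pages is spread over all pages.
        dangling_mass = rank[dangling].sum()
        new_rank = damping * (link_rank + dangling_mass / num_nodes)
        new_rank += (1.0 - damping) / num_nodes  # random-jump (teleport) term
        converged = np.abs(new_rank - rank).sum() < tol  # L1 change between steps
        rank = new_rank
        if converged:
            break
    return rank

# Toy graph: a 3-page cycle (0 -> 1 -> 2 -> 0) plus page 3 linking to page 0.
toy_edges = [(0, 1), (1, 2), (2, 0), (3, 0)]
ranks = pagerank(toy_edges, num_nodes=4)
print(ranks)
```

In the toy graph, page 3 receives no links, so its score comes only from the random-jump term. The Stanford node ids would need to be remapped to a contiguous 0-based range before using a sketch like this.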

📚 MASSIVE DATA II

💻 Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas

🏫 Universidad Nacional Autónoma de México


📄 PageRank

Author:

Iván Alejadro Ramos Herrera

📓 Dataset

STANFORD UNIVERSITY WEB

Dataset information:

Nodes represent pages from Stanford University (stanford.edu) and directed edges represent hyperlinks between them. The data was collected in 2002.
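A directed graph like this is commonly stored as a plain-text edge list. The sketch below assumes one `source target` pair per line, separated by whitespace, with `#` marking comment lines; the helper name `parse_edge_list` and the sample lines are hypothetical, not taken from the notebook.

```python
from collections import defaultdict

def parse_edge_list(lines):
    """Parse 'source target' lines into a node set and an out-link dict."""
    out_links = defaultdict(list)
    nodes = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        src, dst = map(int, line.split()[:2])
        out_links[src].append(dst)  # directed edge: src links to dst
        nodes.update((src, dst))
    return nodes, dict(out_links)

# Hypothetical sample in the assumed format.
sample = [
    "# FromNodeId ToNodeId",
    "1 2",
    "1 3",
    "2 3",
]
nodes, out_links = parse_edge_list(sample)
print(len(nodes), out_links)
```

For the real file, the same function could be fed an open file handle instead of a list, since it only iterates over lines.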

Dataset statistics:
Nodes: 281903
Edges: 2312497
Nodes in largest WCC: 255265 (0.906)
Edges in largest WCC: 2234572 (0.966)
Nodes in largest SCC: 150532 (0.534)
Edges in largest SCC: 1576314 (0.682)
Average clustering coefficient: 0.5976
Number of triangles: 11329473
Fraction of closed triangles: 0.002889
Diameter (longest shortest path): 674
90-percentile effective diameter: 9.7
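The values in parentheses are consistent with each component's share of the whole graph. As a quick arithmetic check using only the counts listed above (the variable names are illustrative):

```python
total_nodes, total_edges = 281903, 2312497

# (nodes, edges) for the largest weakly and strongly connected components.
components = {
    "largest WCC": (255265, 2234572),
    "largest SCC": (150532, 1576314),
}

# Share of all nodes and all edges that each component contains.
fractions = {
    name: (round(n / total_nodes, 3), round(e / total_edges, 3))
    for name, (n, e) in components.items()
}

# Average number of outgoing links per page.
mean_out_degree = total_edges / total_nodes

print(fractions)
print(round(mean_out_degree, 2))
```

The mean out-degree of roughly 8 links per page shows how sparse the graph is, which is why a PageRank implementation at this scale would typically work on edge lists or sparse matrices rather than a dense 281903 x 281903 matrix.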

Source (citation): J. Leskovec, K. Lang, A. Dasgupta, M. Mahoney. Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters. Internet Mathematics 6(1) 29-123, 2009.