/pagerank

Some algorithms for computing PageRank

Primary LanguagePythonGNU General Public License v3.0GPL-3.0

PageRank

For a University exam on numerical methods for Markov chains, I had to read and make an oral report on two articles on PageRank algorithms, on the inner-outer iteration method and fast sampling method.

This repository contains my python implementation of some of those algorithms, which I made to make some quick tests. Unfortunately I was not able to get the sampling methods to work - the sampling function is not working as expected (so the fastsample.py file is to be considered broken). Also the Gauss-Seidel method (gs.py) is really slow: I used scipy.sparse.csr_matrix to store the matrixes, and it works very well for methods that require many matrix x vector products, but very poorly with the Gauss-Seidel method that needs to make lots of accesses to single matrix elements.

I tested the programs with the web graphs from http://snap.stanford.edu/data/index.html and the much smaller Harvard500 graph (from which you have to manually remove the first lines)

Example:

python readandsavegraph.py wiki-Vote.txt wiki float64
python pow.py wiki
python innerouter.py wiki
python plotresidues.py wiki