This is a semester project for the Linear Algebra course at Ukrainian Catholic University. It ranks web pages by importance using Google's PageRank algorithm.
The repository contains three implementations of the PageRank algorithm:
- Eigendecomposition
- Power Method with epsilon
- Power Method without epsilon
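
For orientation, here is a minimal sketch of what the three approaches can look like with NumPy. It is not the code in `src/pagerank.py`: the function names, the damping factor of 0.85, the convention that `adj[i, j] = 1` means page `j` links to page `i`, and the reading of "epsilon" as a convergence tolerance are all assumptions made for illustration.

```python
import numpy as np


def google_matrix(adj, damping=0.85):
    """Build a column-stochastic Google matrix (hypothetical helper).

    Assumed convention: adj[i, j] = 1 means page j links to page i.
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    out_degree = adj.sum(axis=0)
    safe = np.where(out_degree > 0, out_degree, 1.0)
    # Pages with no outgoing links are treated as linking to every page.
    stochastic = np.where(out_degree > 0, adj / safe, 1.0 / n)
    return damping * stochastic + (1.0 - damping) / n


def pagerank_eig(G):
    """Eigendecomposition: take the eigenvector for the largest eigenvalue."""
    values, vectors = np.linalg.eig(G)
    principal = np.real(vectors[:, np.argmax(np.real(values))])
    return principal / principal.sum()


def pagerank_power_eps(G, eps=1e-10, max_iter=1000):
    """Power method that stops once the L1 change drops below eps."""
    rank = np.full(G.shape[0], 1.0 / G.shape[0])
    for _ in range(max_iter):
        new_rank = G @ rank
        if np.abs(new_rank - rank).sum() < eps:
            return new_rank
        rank = new_rank
    return rank


def pagerank_power_fixed(G, num_iter=100):
    """Power method with a fixed number of iterations and no tolerance check."""
    rank = np.full(G.shape[0], 1.0 / G.shape[0])
    for _ in range(num_iter):
        rank = G @ rank
    return rank


# Toy graph: page 0 links to 1 and 2, page 1 links to 2, page 2 links to 0.
adj = np.array([[0, 0, 1],
                [1, 0, 0],
                [1, 1, 0]])
G = google_matrix(adj)
ranks = pagerank_power_eps(G)
print(ranks)
```

On this toy graph all three functions should agree up to numerical tolerance, and page 1, which only receives half of page 0's vote, ends up with the lowest rank.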
To run a simple example on generated data:
python3 src/pagerank.py
To compare the performance of each method:
python3 src/timings.py
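
As a rough idea of how such a comparison can be set up, the sketch below times an eigendecomposition and a fixed-iteration power method on a random graph. It does not reproduce `src/timings.py`; the helper names, the matrix size of 200, the 5% link density, and the damping factor are assumptions chosen for the example.

```python
import time

import numpy as np


def power_method(G, num_iter=100):
    """Fixed-iteration power method (illustrative, not the project's code)."""
    rank = np.full(G.shape[0], 1.0 / G.shape[0])
    for _ in range(num_iter):
        rank = G @ rank
    return rank


def eig_method(G):
    """Rank vector from the eigenvector of the largest eigenvalue."""
    values, vectors = np.linalg.eig(G)
    principal = np.real(vectors[:, np.argmax(np.real(values))])
    return principal / principal.sum()


def best_time(func, *args, repeats=5):
    """Return the fastest of several wall-clock runs, in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        timings.append(time.perf_counter() - start)
    return min(timings)


def random_google_matrix(n, damping=0.85, seed=0):
    """Random column-stochastic Google matrix (hypothetical test data)."""
    rng = np.random.default_rng(seed)
    adj = (rng.random((n, n)) < 0.05).astype(float)
    np.fill_diagonal(adj, 0.0)  # ignore self-links
    out_degree = adj.sum(axis=0)
    safe = np.where(out_degree > 0, out_degree, 1.0)
    # Pages without outgoing links are spread uniformly over all pages.
    stochastic = np.where(out_degree > 0, adj / safe, 1.0 / n)
    return damping * stochastic + (1.0 - damping) / n


G = random_google_matrix(200)
results = {
    "eig": best_time(eig_method, G),
    "power": best_time(power_method, G),
}
for name, seconds in results.items():
    print(f"{name}: {seconds:.6f} s")
```

Taking the minimum over several runs reduces noise from other processes; the absolute numbers depend on the machine and the matrix size.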
To run the algorithm on real data:
python3 src/dataset_example.py
For demonstration we used data from open resources. The dataset of parsed links from the Hollins University website is quite old, but it demonstrates how the algorithm works well enough. You don't need to download the dataset, since it is already present in the data folder, though you can do so using this link.
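
Link data of this kind has to be turned into a matrix before PageRank can run on it. The sketch below shows one way to do that, assuming each line holds a `source target` pair of 1-based page ids. The file format in the data folder is not described here, so that layout, the function name, and the `adj[target, source]` convention are all assumptions.

```python
import numpy as np


def edges_to_adjacency(lines, n_pages):
    """Build an adjacency matrix from 'source target' lines (assumed format).

    adj[i, j] = 1 means page j links to page i; ids are assumed to be 1-based.
    """
    adj = np.zeros((n_pages, n_pages))
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        source, target = int(parts[0]) - 1, int(parts[1]) - 1
        adj[target, source] = 1.0
    return adj


# Tiny hand-written example instead of the real dataset.
sample_lines = ["1 2", "1 3", "", "3 1"]
adj = edges_to_adjacency(sample_lines, n_pages=3)
print(adj)
```

For a dataset with thousands of pages a sparse matrix would save memory, but a dense array keeps the example short.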
All of the steps and conclusions are described in this PDF document.