Primary language: Python. License: MIT.

HITS-and-PageRank-Algorithm

A very basic implementation of the HITS (Hyperlink-Induced Topic Search) and PageRank algorithms.

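The authority score of a node $p$ is the sum of the hub scores of the nodes that link to it:

$$a(p) = \sum_{q \,:\, q \to p} h(q)$$

The hub score ("hubbiness") of a node $p$ is the sum of the authority scores of the nodes it links to:

$$h(p) = \sum_{q \,:\, p \to q} a(q)$$
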
The algorithm repeats the following steps for a set number of iterations, or until the scores converge:

  1. Authority Update Round
  2. Hub Update Round
  3. Authority and Hub Scaling

The scaling is done by:
[equation: scaled authority score]
[equation: scaled hub score]
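
The three rounds above can be sketched in plain Python as follows. This is a minimal illustration, not the code in hits.py: the function name `hits`, the adjacency-dict input, and the fixed iteration count are all hypothetical. The scaling step here divides by the largest score, which is one common choice and only an assumption about what this repository does.

```python
def hits(adj, iterations=50):
    """Sketch of HITS. `adj` maps each node to the list of nodes it links to.

    Every link target must also appear as a key of `adj`.
    """
    nodes = list(adj)
    hubs = {n: 1.0 for n in nodes}
    auths = {n: 1.0 for n in nodes}

    # Precompute, for each node, the nodes that link to it.
    incoming = {n: [] for n in nodes}
    for src, targets in adj.items():
        for dst in targets:
            incoming[dst].append(src)

    for _ in range(iterations):
        # 1. Authority update: sum the hub scores of in-neighbours.
        auths = {n: sum(hubs[m] for m in incoming[n]) for n in nodes}
        # 2. Hub update: sum the (new) authority scores of out-neighbours.
        hubs = {n: sum(auths[m] for m in adj[n]) for n in nodes}
        # 3. Scaling (assumed rule): divide by the largest score.
        a_max = max(auths.values()) or 1.0  # avoid dividing by zero
        h_max = max(hubs.values()) or 1.0
        auths = {n: v / a_max for n, v in auths.items()}
        hubs = {n: v / h_max for n, v in hubs.items()}

    return hubs, auths


if __name__ == "__main__":
    graph = {"a": ["b", "c"], "b": ["c"], "c": []}
    hub_scores, auth_scores = hits(graph, iterations=20)
    print(hub_scores)
    print(auth_scores)
```

Without the scaling step the scores grow with every iteration, so some normalization is needed for them to settle; other rules, such as dividing by the sum or by the Euclidean norm, would change the absolute values but not the ranking.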

The PageRank score of a node $p_i$ at iteration $t+1$ is given by:

$$PR_{t+1}(p_i) = \sum_{p_j \in M(p_i)} \frac{PR_t(p_j)}{L(p_j)}$$

where $M(p_i)$ is the set of nodes linking to $p_i$, $L(p_j)$ is the number of outbound links of node $p_j$ (in this case, just its number of links, since the graph is undirected), and $PR_t(p_j)$ is the PageRank of node $p_j$ in the previous iteration.

The implementation also takes a damping parameter $d$, where $0 \le d \le 1$, with a default of 0.85. It models the user's patience while surfing the web: the probability of continuing to follow links rather than jumping to a random page.

$$PR_{t+1}(p_i) = \frac{1 - d}{N} + d \sum_{p_j \in M(p_i)} \frac{PR_t(p_j)}{L(p_j)}$$

where $N$ is the total number of documents/webpages under consideration, in this case the total number of nodes.
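
The damped update can be sketched in plain Python for an undirected graph as below. This is an illustrative sketch under stated assumptions, not the code in pagerank.py: the function name `pagerank`, the symmetric adjacency-dict input, the uniform starting scores, and the fixed iteration count are hypothetical.

```python
def pagerank(adj, d=0.85, iterations=100):
    """Sketch of damped PageRank on an undirected graph.

    `adj` maps each node to its list of neighbours and must be symmetric:
    if v is in adj[u], then u is in adj[v].
    """
    nodes = list(adj)
    n = len(nodes)
    # Assumed starting point: every node gets the same score.
    pr = {v: 1.0 / n for v in nodes}

    for _ in range(iterations):
        new_pr = {}
        for v in nodes:
            # Each neighbour u shares its previous score equally
            # among its own links (its degree, since links are undirected).
            shared = sum(pr[u] / len(adj[u]) for u in adj[v])
            new_pr[v] = (1 - d) / n + d * shared
        pr = new_pr

    return pr


if __name__ == "__main__":
    # A three-node path: a - b - c
    graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
    print(pagerank(graph))
```

As long as no node is isolated, the scores keep summing to 1 across iterations, which makes a quick sanity check for any implementation of this update.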

To use this implementation, clone this repository and use the hits.py or pagerank.py file:

    git clone https://github.com/Bharat123rox/HITS-and-PageRank-Algorithm.git

To use the algorithms:

    c = HITS(G=Graph)            # or: c = HITS(file='graph.txt')
    hubs, authorities = c.compute_hits()
    c = PageRank(G=Graph)        # or: c = PageRank(file='graph.txt')
    pagerank = c.compute_pagerank()