Erdos collaboration graph (node representing Erdos is removed) and Ant colony graph will be used to perfrom the following analysis.
I will partition the nodes of an input graph (either directed or undirected) into clusters (communities) with the
help of the following algorithm. Let the neighbours of node $v$ be denoted by $math v_1 , v_2 , \ldots, v_k$ , and let each node
also have a label $C_v$ that gives the community it belongs to. A node $v$ will determine its label based on the
labels of its neighbours: it will update it to the value that the maximum number of its neighbours have. If
there are multiple possible labels with the same maximum amount of neighbours, then the tie is broken in
a uniformly random way. Because labels will be updated iteratively, we define $C_v (t)$ as the label of a node
in iteration $t$. Nodes can be updated in either a synchronous or asynchronous way. Synchronous updating
means that the label of a node $v$ in iteration $t$ will depend on the labels of its neighbours in the previous
iteration: $C_v(t) = f(C_{v_1}(t-1), C_{v_2}(t-1), \ldots, C_{v_k}(t-1)$.
However, this can lead to the oscillation of labels in certain neighbourhood structures. In the case
of asynchronous updates, the new values will be considered for neighbours whose label has already been
updated. Let $v_{i1} , v_{i2} , \ldots, $v_{im}$ denote the already updated $m$ neighbours of $v$, the asynchronous update is as
follows: $C_v(t) = f(C_{v1}(t), C_{v2}(t), \ldots, C_{vm}(t), C_{v(m+1)}(t-1), \ldots, C_{vk}(t-1))$.
The algorithm iterates until it can update any label. If no label updates happen in an iteration, the
algorithm terminates, and the current labels give the clusters (communities) where each node belongs. The
outline of the algorithm can be seen below:
Initialize labels: For all $v \in V$, let $C_v(0) = v$
Set $t=1$
Determine random update order for $V$
For each $v \in V$ in the update order of 3., determine $C_v(t)$
If no label has been updated, terminate. Otherwise, $t=t + 1$ and go to 3.
Except implementing the algorithm I will aslo perform a $k$-means and a hierarchical clustering of the nodes of the graph (using the number of clusters I got from the algorithm above). The shortest path will be used as a distance between nodes, I will also try different linkage methods and centroids.