Given a directed social graph, predict missing links in order to recommend users (link prediction in a graph).
Data is taken from Facebook's recruiting challenge on Kaggle: https://www.kaggle.com/c/FacebookRecruiting
The data contains two columns, source and destination, with one row for each edge in the graph.
- Data columns (total 2 columns):
- source_node int64
- destination_node int64
- Generated training samples of good and bad links from the given directed graph. For each link, computed features such as number of followers, whether the user is followed back, PageRank, Katz score, Adar index, SVD features of the adjacency matrix, and weight features, then trained an ML model on these features to predict links.
- No low-latency requirement.
- Predicted probabilities are useful for recommending the highest-probability links.
- Both precision and recall are important, so F1 score is a good choice.
- Confusion matrix
- Jaccard distance
- Cosine distance (Otsuka-Ochiai coefficient)
- Page Rank
- Shortest path
- Connected Components
- Adar Index
- Does the person follow back?
- Katz Centrality
- HITS score
- Singular Value Decomposition
- Weight features
- Preferential Attachment (ASSIGNMENT)
- SVD_dot (ASSIGNMENT)
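The sample generation and a few of the features listed above can be sketched in plain Python. This is a minimal illustration on a made-up toy graph, not the project's actual code: the helper names (`sample_missing_edges`, `link_features`), the choice of follower or followee sets for each feature, and the degree used in the Adar index and preferential attachment are all assumptions. The Jaccard and cosine functions return similarity coefficients, and the negative sampling simply picks random non-edges.

```python
import math
import random

# Toy directed graph as (source_node, destination_node) pairs; made-up data.
edges = [(1, 2), (1, 3), (2, 1), (2, 3), (3, 4), (4, 1), (5, 3), (5, 4)]
edge_set = set(edges)

followees = {}  # node -> set of nodes it follows (out-neighbours)
followers = {}  # node -> set of nodes that follow it (in-neighbours)
for src, dst in edges:
    followees.setdefault(src, set()).add(dst)
    followers.setdefault(dst, set()).add(src)
    followees.setdefault(dst, set())
    followers.setdefault(src, set())
nodes = sorted(followees)


def sample_missing_edges(n, seed=0):
    """Draw n distinct ordered pairs that are not edges ('bad link' samples)."""
    rng = random.Random(seed)
    missing = set()
    while len(missing) < n:
        u, v = rng.sample(nodes, 2)  # two different nodes
        if (u, v) not in edge_set:
            missing.add((u, v))
    return sorted(missing)


def jaccard(a, b):
    """Jaccard coefficient of two neighbour sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(a, b):
    """Otsuka-Ochiai coefficient of two neighbour sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def adamic_adar(u, v):
    """Sum of 1/log(follower count) over nodes followed by both u and v."""
    score = 0.0
    for z in followees[u] & followees[v]:
        degree = len(followers[z])
        if degree > 1:  # log(1) == 0 would divide by zero
            score += 1.0 / math.log(degree)
    return score


def link_features(u, v):
    """Feature dictionary for the candidate link u -> v."""
    return {
        "jaccard_followers": jaccard(followers[u], followers[v]),
        "cosine_followees": cosine(followees[u], followees[v]),
        "adar_index": adamic_adar(u, v),
        "preferential_attachment": len(followees[u]) * len(followees[v]),
        "follows_back": int((v, u) in edge_set),
    }


# Label 1 for existing edges, 0 for sampled missing edges.
samples = [(e, 1) for e in edges] + [(e, 0) for e in sample_missing_edges(4)]
rows = [(link_features(u, v), label) for (u, v), label in samples]
```

Each entry of `rows` pairs a feature dictionary with its label, which is the shape a tabular classifier expects once the dictionaries are turned into columns.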
- Random forest
- XGBOOST (ASSIGNMENT)
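Whichever model is used, its predicted probabilities can be scored with a confusion matrix and F1, and ranked to pick recommendations. The sketch below does this in plain Python on made-up labels and probabilities; the 0.5 threshold, the candidate pairs, and the helper names are assumptions for illustration. scikit-learn offers equivalent `sklearn.metrics.confusion_matrix` and `sklearn.metrics.f1_score` functions.

```python
# Made-up ground truth (1 = link exists) and model probabilities.
candidates = [(1, 5), (2, 4), (3, 1), (4, 2), (5, 1), (5, 2)]
y_true = [1, 1, 1, 0, 0, 0]
y_prob = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]

# Assumed decision threshold of 0.5.
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]


def confusion_counts(true, pred):
    """Return [[tn, fp], [fn, tp]]: rows are true labels, columns predictions."""
    tn = fp = fn = tp = 0
    for t, p in zip(true, pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1 and p == 0:
            fn += 1
        elif t == 0 and p == 1:
            fp += 1
        else:
            tn += 1
    return [[tn, fp], [fn, tp]]


def f1_from_counts(matrix):
    """Harmonic mean of precision and recall for the positive class."""
    (_, fp), (fn, tp) = matrix
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


matrix = confusion_counts(y_true, y_pred)
f1 = f1_from_counts(matrix)

# Recommend the two candidate links with the highest predicted probability.
ranked = sorted(zip(y_prob, candidates), reverse=True)
top_links = [pair for _, pair in ranked[:2]]
```

Ranking by probability rather than by the thresholded label is what makes the probability output useful for recommendation.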
- To run the code on low-compute devices, download the processed files from this link: https://drive.google.com/drive/folders/1Q1beCGjWMjZxO-b8BY5x2M0VWNmZmGyC?usp=sharing