tf-idf_wikipedia_Celebrities

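This project uses tf-idf, an NLP technique, to calculate the similarity between different celebrities based on their Wikipedia articles. The interesting pattern is that the highest similarity scores occur between people in the same profession or field of work. This makes sense, because tf-idf down-weights words common to every article and emphasizes the words that distinguish one article from the others.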
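To illustrate the idea, here is a minimal sketch of tf-idf weighting followed by cosine similarity, written in plain Python. It is not the GraphLab code used in this project: the function names, the toy texts, and the choice of cosine similarity as the comparison measure are all assumptions made for the example.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Map each document name to a {word: tf-idf weight} dictionary.

    docs is a dict of name -> raw text. Term frequency is the raw word
    count, and idf is log(number of documents / documents containing the word).
    """
    counts = {name: Counter(text.lower().split()) for name, text in docs.items()}
    n_docs = len(docs)

    # Count how many documents each word appears in.
    doc_freq = Counter()
    for word_counts in counts.values():
        doc_freq.update(word_counts.keys())

    return {
        name: {w: tf * math.log(n_docs / doc_freq[w]) for w, tf in word_counts.items()}
        for name, word_counts in counts.items()
    }


def cosine_similarity(a, b):
    """Cosine similarity between two sparse {word: weight} vectors."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy texts standing in for Wikipedia articles.
toy_docs = {
    "singer_a": "singer released album and toured with band",
    "singer_b": "singer recorded album and won music award",
    "athlete_a": "footballer scored goals and won league title",
}

weights = tf_idf(toy_docs)
sim_singers = cosine_similarity(weights["singer_a"], weights["singer_b"])
sim_mixed = cosine_similarity(weights["singer_a"], weights["athlete_a"])
print(sim_singers, sim_mixed)
```

In this toy data the word "and" appears in every text, so its idf is zero and it contributes nothing to the similarity. The two singers still share "singer" and "album", so they score higher than the singer and athlete pair, which mirrors the same-profession pattern described above.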

GraphLab was used for the analysis and model generation.

You can get the CSV version of the dataset from this link: https://drive.google.com/file/d/0B91JEPO_jfR4ajRNOUY5OUhSUTg/view?usp=sharing