/heartfulness-similar-content-service

📑 Semantic content similarity search experiment on the heartfulness.org mission literature dataset, developed using PyTorch and Azure Machine Learning service.

Primary language: Jupyter Notebook

📑 Similar Content Service - Heartathon 2019

Flair word and document embeddings were applied to a small subset of the given mission literature dataset, and cosine similarity was then computed between the resulting embedding vectors. The top 'k' entries of the similarity vector are mapped to content IDs and returned as 'Similar Content' through a REST API.
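The similarity step described above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the function names (`cosine_similarity`, `top_k_similar`), the content IDs, and the 3-dimensional toy vectors are all assumptions made for the example. In the real pipeline the vectors would come from Flair document embeddings and would be much larger.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def top_k_similar(
    query_id: str,
    embeddings: Dict[str, Sequence[float]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k content IDs most similar to query_id, best first."""
    query = embeddings[query_id]
    scored = [
        (content_id, cosine_similarity(query, vector))
        for content_id, vector in embeddings.items()
        if content_id != query_id  # do not return the query item itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical content IDs with toy vectors standing in for real embeddings.
toy_embeddings = {
    "content-1": [1.0, 0.0, 0.0],
    "content-2": [0.9, 0.1, 0.0],
    "content-3": [0.0, 1.0, 0.0],
    "content-4": [0.7, 0.7, 0.0],
}

similar = top_k_similar("content-1", toy_embeddings, k=2)
print(similar)
```

With these toy vectors, `content-2` ranks first and `content-4` second, because they point in nearly the same direction as `content-1`, while `content-3` is orthogonal to it.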

The tech stack includes Python (PyTorch, Flair, pandas) and Azure Machine Learning service, which is used for training in the cloud and for deploying the model as a web service. Training on the full dataset is in progress.
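For the web service deployment, Azure Machine Learning entry scripts conventionally expose an `init()` function, called once when the service starts, and a `run()` function, called per request. The sketch below follows that convention but is only an assumption about how such a script might look here: the request fields (`content_id`, `k`), the response key (`similar_content`), and the in-memory `_INDEX` of toy vectors are hypothetical, and a real script would load the trained embeddings in `init()` instead.

```python
import json
import math

# Hypothetical in-memory index: content ID -> embedding vector.
_INDEX = {}


def init():
    """Called once at service start; a real script would load the model here."""
    global _INDEX
    _INDEX = {
        "content-1": [1.0, 0.0, 0.0],
        "content-2": [0.9, 0.1, 0.0],
        "content-3": [0.0, 1.0, 0.0],
        "content-4": [0.7, 0.7, 0.0],
    }


def _cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def run(raw_data):
    """Handle one request: a JSON string with 'content_id' and optional 'k'."""
    try:
        request = json.loads(raw_data)
        query_id = request["content_id"]
        query = _INDEX[query_id]
        k = int(request.get("k", 3))
    except (ValueError, KeyError, TypeError) as exc:
        # Malformed JSON, missing field, or unknown content ID.
        return {"error": f"bad request: {exc!r}"}

    scored = sorted(
        ((cid, _cosine(query, vec)) for cid, vec in _INDEX.items() if cid != query_id),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return {"similar_content": [cid for cid, _ in scored[:k]]}


# Local smoke test, no Azure resources involved.
init()
response = run(json.dumps({"content_id": "content-1", "k": 2}))
print(response)
```

Returning an error dictionary instead of raising keeps the service responding with a readable message when a client sends an unknown ID or malformed JSON.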

Complete API Documentation

https://documenter.getpostman.com/view/5756089/SVfGzCVu?version=latest

References

  1. Flair: State-of-the-Art Natural Language Processing Library (NLP)
  2. Contextual String Embeddings for Sequence Labeling
  3. Text Similarities : Estimate the degree of similarity between two texts
  4. Quick review on Text Clustering and Text Similarity Approaches