A machine learning project for detecting plagiarism, deployed with SageMaker and delivered as part of the Udacity Machine Learning Engineer Nanodegree.
Containment and longest common subsequence features measure the similarity between pairs of texts. A Naive Bayes classifier trained on these features achieves 92% accuracy.
The project runs in an AWS environment.
It uses Python, AWS SageMaker, AWS Lambda, Amazon API Gateway, and Locust.io.
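
The two similarity features named above can be sketched in plain Python as follows. This is a minimal illustration, not the project's actual code: the function names `containment` and `lcs_norm`, the lowercase whitespace tokenization, and the normalization by the answer's length are all assumptions made for this example.

```python
from collections import Counter


def ngrams(words, n):
    """Count the word n-grams in a list of tokens."""
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def containment(answer, source, n=1):
    """Fraction of the answer's n-grams that also occur in the source.

    Assumption: texts are lowercased and split on whitespace.
    """
    a = ngrams(answer.lower().split(), n)
    s = ngrams(source.lower().split(), n)
    total = sum(a.values())
    if total == 0:
        return 0.0
    # Counter intersection keeps the minimum count of each shared n-gram.
    shared = sum((a & s).values())
    return shared / total


def lcs_norm(answer, source):
    """Length of the longest common word subsequence, divided by the
    number of words in the answer (assumed normalization)."""
    a = answer.lower().split()
    s = source.lower().split()
    if not a:
        return 0.0
    # Row-by-row dynamic programming over the standard LCS table.
    prev = [0] * (len(s) + 1)
    for wa in a:
        curr = [0]
        for j, ws in enumerate(s, start=1):
            if wa == ws:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1] / len(a)


if __name__ == "__main__":
    answer = "the cat sat on the mat"
    source = "the cat lay on the mat"
    print(containment(answer, source, n=1))
    print(containment(answer, source, n=2))
    print(lcs_norm(answer, source))
```

Both functions return a value between 0 and 1, where higher values mean the answer text overlaps more with the source. Values like these, computed per answer and source pair, could serve as input columns for a classifier such as Naive Bayes.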