Sagemaker-Plagiarism-Detection

Project for Machine Learning Engineer Nanodegree, unit 4 (Machine Learning, Case Studies).

Primary language: Jupyter Notebook

SageMaker Plagiarism Detector Project

This repository was created for the Machine Learning Engineer Nanodegree program. It contains "Project: Plagiarism Detector", a binary classifier that labels each text file in a set as plagiarized or not. Similarity features, including containment and longest common subsequence, measure how closely each student answer text matches its Wikipedia source text.
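
The two similarity measures named above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the function names `ngram_counts`, `containment`, and `lcs_norm_word` are hypothetical, and lowercasing plus whitespace tokenization is an assumed preprocessing step. Containment is taken here as the share of the answer's word n-grams that also appear in the source, and the longest common subsequence (LCS) is computed over words and divided by the answer's word count.

```python
from collections import Counter


def ngram_counts(text, n):
    """Count word n-grams in a lowercased, whitespace-tokenized text (assumed preprocessing)."""
    words = text.lower().split()
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def containment(answer, source, n=1):
    """Fraction of the answer's n-grams that also occur in the source."""
    answer_counts = ngram_counts(answer, n)
    source_counts = ngram_counts(source, n)
    total = sum(answer_counts.values())
    if total == 0:
        return 0.0
    # Counter intersection keeps the minimum count of each shared n-gram.
    shared = sum((answer_counts & source_counts).values())
    return shared / total


def lcs_norm_word(answer, source):
    """Word-level LCS length divided by the number of words in the answer."""
    a = answer.lower().split()
    s = source.lower().split()
    if not a:
        return 0.0
    # Dynamic programming, keeping only the previous row of the LCS table.
    prev = [0] * (len(s) + 1)
    for word_a in a:
        curr = [0]
        for j, word_s in enumerate(s, start=1):
            if word_a == word_s:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1] / len(a)


if __name__ == "__main__":
    answer = "the cat sat on the mat"
    source = "the cat lay on the mat today"
    print(containment(answer, source, n=1))
    print(containment(answer, source, n=2))
    print(lcs_norm_word(answer, source))
```

Both values fall between 0 and 1, with higher values indicating more overlap with the source. Values like these could serve as input features for the binary classifier, one row per answer file.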