/concurrent_doc_classifier_knn

K-Nearest Neighbor Text Categorization - Concurrent Implementation

Primary LanguageJava

doc_classifier_knn

Concurrent Java Implementation of K-Nearest Neighbor Text Categorization

BBC Dataset is used for test(filenames are refactored to [categoryname]-[number] ). Dataset is under "data" folder. http://mlg.ucd.ie/datasets/bbc.html

Run: java -jar doc_classifier_knn.jar -Dir=data -categories=business,entertainment,politics,sport,tech -trainingSize=150 -k=3

Written for experimental purpose. It could be optimized and expanded.