Text-Classification---Reuters

Classifies Reuters articles by their body text, using the Weka JAR from Java.

ExtractFiles.java iterates through all SGM files in a directory (each file contains more than one article) and splits them into separate files, each holding only the heading and body text of ONE article (tags and other markup are removed).
CreateDataset.java creates the ARFF dataset for Weka from a set of text files.
Classifiers.java performs classification on the ARFF dataset using TF-IDF weighting.
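
The first two steps above can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the code in ExtractFiles.java or CreateDataset.java. The class name `ReutersSketch`, the methods `splitArticles` and `toArff`, and the way labels are supplied are all assumptions made for the example. It assumes the usual Reuters-21578 layout, where each article sits inside a `<REUTERS ...>` element with `<TITLE>` and `<BODY>` children, and it assumes class labels contain no spaces or commas.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch; names and structure are assumptions, not the repository's code. */
final class ReutersSketch {

    // Assumed layout: each article is wrapped in <REUTERS ...> ... </REUTERS>.
    private static final Pattern ARTICLE =
            Pattern.compile("<REUTERS\\b.*?</REUTERS>", Pattern.DOTALL);
    private static final Pattern TITLE =
            Pattern.compile("<TITLE>(.*?)</TITLE>", Pattern.DOTALL);
    private static final Pattern BODY =
            Pattern.compile("<BODY>(.*?)</BODY>", Pattern.DOTALL);

    /** Returns one "title\nbody" string per article, with the tags dropped. */
    static List<String> splitArticles(String sgm) {
        List<String> articles = new ArrayList<>();
        Matcher m = ARTICLE.matcher(sgm);
        while (m.find()) {
            String block = m.group();
            String title = firstGroup(TITLE, block);
            String body = firstGroup(BODY, block);
            articles.add((title + "\n" + body).trim());
        }
        return articles;
    }

    /** Returns the first capture of the pattern, or "" if the tag is missing. */
    private static String firstGroup(Pattern p, String text) {
        Matcher m = p.matcher(text);
        return m.find() ? m.group(1).trim() : "";
    }

    /** Builds an ARFF document with one string attribute and one nominal class. */
    static String toArff(String relation, List<String> texts,
                         List<String> labels, List<String> classes) {
        StringBuilder sb = new StringBuilder();
        sb.append("@relation ").append(relation).append("\n\n");
        sb.append("@attribute text string\n");
        sb.append("@attribute class {").append(String.join(",", classes)).append("}\n\n");
        sb.append("@data\n");
        for (int i = 0; i < texts.size(); i++) {
            sb.append('\'').append(escape(texts.get(i))).append("',")
              .append(labels.get(i)).append('\n');
        }
        return sb.toString();
    }

    /** Escapes backslashes and single quotes, and flattens line breaks to spaces. */
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("'", "\\'")
                .replace("\n", " ").replace("\r", " ");
    }

    public static void main(String[] args) {
        // Tiny made-up input; real SGM files carry many more tags per article.
        String sgm = "<REUTERS NEWID=\"1\"><TEXT><TITLE>Cocoa prices</TITLE>"
                   + "<BODY>Prices rose today.</BODY></TEXT></REUTERS>";
        List<String> texts = splitArticles(sgm);
        // The label "cocoa" is supplied by hand here purely for illustration.
        System.out.print(toArff("reuters", texts,
                Arrays.asList("cocoa"), Arrays.asList("cocoa")));
    }
}
```

In Weka, a string attribute like `text` is commonly converted to TF-IDF word vectors with the `StringToWordVector` filter before a classifier is trained. Whether Classifiers.java takes that route is not stated here.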

vishakhpk/Text-Classification---Reuters