/arffBuilder

Document classification with WEKA


Document Classification with WEKA. (Wroclaw, 2017)

Artificial Intelligence Lab, by Prof. Maciej Piasecki

The goals of the lab are to become familiar with representing text documents as vectors of word frequencies, to learn basic feature selection methods and Machine Learning algorithms for classification, and to gain basic skills in using the WEKA environment for Machine Learning.
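As an illustration of the word-frequency representation, here is a minimal sketch in plain Java. The class name `WordFrequencies`, the tokenisation rule (lowercase, split on anything that is not a letter or digit) and the fixed vocabulary are assumptions made for this example; they are not taken from the project's code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of the arffBuilder sources.
final class WordFrequencies {

    // Lowercase the text and split it on any run of characters
    // that are neither letters nor digits (an assumed tokenisation rule).
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Build the frequency vector: position i holds the number of times
    // vocabulary.get(i) occurs in the document.
    static int[] toVector(String text, List<String> vocabulary) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : tokenize(text)) {
            counts.merge(token, 1, Integer::sum);
        }
        int[] vector = new int[vocabulary.size()];
        for (int i = 0; i < vector.length; i++) {
            vector[i] = counts.getOrDefault(vocabulary.get(i), 0);
        }
        return vector;
    }
}
```

With the vocabulary `[cat, dog, the]`, the document "The cat saw the other cat." maps to the vector `[2, 0, 2]`. Words outside the vocabulary are ignored, which is where feature selection comes in: it decides which words are kept as vector positions.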

The main task is to train a Machine Learning classifier that recognises, from the content of a document, which of the selected categories the document belongs to.

The first step in this lab assignment was to build a program that transforms a set of labelled documents into ARFF files that the WEKA environment can read. The program is implemented as a set of tools that work in sequence to reach this goal.
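To make the target format concrete, the sketch below renders word-frequency vectors as ARFF text: a `@relation` line, one numeric `@attribute` per vocabulary word, a nominal class attribute, and a `@data` section with one comma-separated row per document. The class name `ArffSketch`, the method signature and the attribute name `docClass` are hypothetical and chosen only for this example. It also assumes that vocabulary words and category names contain no spaces, commas or quotes, so no ARFF quoting is performed.

```java
import java.util.List;

// Hypothetical helper, not part of the arffBuilder sources.
final class ArffSketch {

    // vectors.get(i) is the frequency vector of document i and
    // labels.get(i) is its category; both lists must have the same size.
    static String toArff(String relation,
                         List<String> vocabulary,
                         List<String> categories,
                         List<int[]> vectors,
                         List<String> labels) {
        StringBuilder sb = new StringBuilder();

        // Header: relation name, one numeric attribute per word,
        // then the nominal class attribute listing every category.
        sb.append("@relation ").append(relation).append("\n\n");
        for (String word : vocabulary) {
            sb.append("@attribute ").append(word).append(" numeric\n");
        }
        sb.append("@attribute docClass {")
          .append(String.join(",", categories))
          .append("}\n\n");

        // Data: one line per document, counts first, label last.
        sb.append("@data\n");
        for (int i = 0; i < vectors.size(); i++) {
            for (int count : vectors.get(i)) {
                sb.append(count).append(',');
            }
            sb.append(labels.get(i)).append('\n');
        }
        return sb.toString();
    }
}
```

For a single document labelled `pets` with vector `[2, 0]` over the vocabulary `[cat, dog]`, the data section contains the row `2,0,pets`. A real tool would also need to quote or escape attribute names that are not plain words.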

Used Technologies:

  • Java SE
  • Eclipse
  • WEKA
  • Git

For more info see: