Michael Merki, Julien Romero, Markus Greiner
November 14, 2016
Provided is the zip file project1_group11.zip. Unzip it:
$ unzip project1_group11.zip
This creates the directory project1_group11, which contains:
- the sources under src
- tinyir.jar as part of lib
- build.sbt
- the directory labelingtestdocs, which contains the three result files for the test documents
- and this README.md file.
Note: We had persistent conflicts between the tinyir and breeze libraries. This is why we decided to build tinyir for Scala 2.11.5 and provide it as a jar.
To run:
$ cd project1_group11
$ sbt "run-main Main <path-to-data-folder>"
The <path-to-data-folder> must contain the directories train, test, and validation.
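The layout requirement above can be checked before starting a long training run. The following is a minimal sketch, not part of the project: the function name check_data_folder is hypothetical, and it only tests that the three directories exist, not what they contain.

```shell
# Hypothetical helper (not part of the project): verify that the data
# folder contains the three directories named in this README.
check_data_folder() {
    data_dir="$1"
    missing=0
    for sub in train test validation; do
        if [ ! -d "$data_dir/$sub" ]; then
            # Report every missing directory on stderr, then fail at the end.
            echo "missing directory: $data_dir/$sub" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

It could be combined with the run command, for example: check_data_folder <path-to-data-folder> && sbt "run-main Main <path-to-data-folder>"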
Options can also be given to set the number of iterations and the learning rate of the logistic regression classifier, and to skip individual classifiers. These options are called:
ITERATION=<nof-iterations-integer>
LEARNING=<learning-rate-double>
SKIP=(BAYES|LINREG|SVM)+
With options, the program can be run like this:
$ sbt "run-main Main <path-to-data-folder> ITERATION=10000 LEARNING=0.001 SKIP=BAYES,SVM"
When run, the program performs the following steps:
- Naive Bayes classifier: trains and generates the list of test documents and their codes
- Logistic regression classifier: trains and generates the list
- SVM: trains and generates the list.
The result files are called ir-project-2016-1-11-[nb|lr|lsvm].txt and are located under labelingtestdocs.
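After a run, it can be useful to see which of the three result files are present. The sketch below expands the [nb|lr|lsvm] pattern into the three file names; the function name list_result_files is hypothetical, and the default directory assumes the command is run from the project directory.

```shell
# Hypothetical helper (not part of the project): report which of the
# three result files named in this README exist in the output directory.
list_result_files() {
    out_dir="${1:-labelingtestdocs}"
    for suffix in nb lr lsvm; do
        file="$out_dir/ir-project-2016-1-11-$suffix.txt"
        if [ -f "$file" ]; then
            echo "found   $file"
        else
            echo "missing $file"
        fi
    done
}
```

Calling list_result_files without an argument checks labelingtestdocs in the current directory; a different directory can be passed as the first argument.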
The project report is in ir-2016-1-report-11.pdf.