git clone https://github.com/HTAustin/CAL.git
- Install the Sofia-ML package: https://code.google.com/archive/p/sofia-ml/
- Build the kissdb indexer:
cd CAL && make
- Set the path to Sofia-ML in doAll_Baseline:
SOFIA="/the/path/to/sofia-ml-read-only/src/sofia-ml"
- Apply word tf-idf features:
bash doAll_Baseline
- Or apply 4-gram tf-idf features:
bash doAll_Baseline_4gram
- The output of BMI is stored in the result/ folder.
- The gain curve can be plotted by analyzing $TOPIC.record.list
- To change the number of threads, edit the variable MAXTHREADS in doAll_Baseline (default=4).
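The two configuration edits above (the Sofia-ML path and the thread count) can also be scripted. The following is a minimal sketch, not part of the repository: `set_var` is a hypothetical helper, and it assumes that doAll_Baseline defines both variables as plain `NAME=value` assignments at the start of a line, as the `SOFIA="..."` line above suggests. The exact form of the MAXTHREADS assignment is an assumption.

```shell
# Hypothetical helper (not shipped with CAL): rewrite a NAME=value
# assignment in a shell script.
# Assumption: the assignment starts at the beginning of a line,
# e.g. MAXTHREADS=4 or SOFIA="/some/path".
set_var() {
  local file="$1" name="$2" value="$3" tmp
  tmp="$(mktemp)" || return 1
  # '|' is used as the sed delimiter so slashes in paths need no escaping.
  # Values containing '|', '&' or backslashes are not handled by this sketch.
  sed "s|^${name}=.*|${name}=${value}|" "$file" > "$tmp" && cat "$tmp" > "$file"
  local status=$?
  rm -f "$tmp"
  return "$status"
}

# Example usage, run from inside the CAL directory:
#   set_var doAll_Baseline SOFIA '"/the/path/to/sofia-ml-read-only/src/sofia-ml"'
#   set_var doAll_Baseline MAXTHREADS 8
```

The helper writes through a temporary file and copies the result back with `cat`, so the script keeps its original permissions and the sketch does not depend on the non-portable `sed -i` flag. Editing the file by hand is equally valid.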
Please feel free to open issues and report bugs.