This is a workshop for XBrain.
- '/data/t_alibaba_data.csv' is the original data file;
- '/data/train.csv' contains the original data for training (months 4 to 7);
- '/data/test.csv' contains the original data for testing (month 8);
- '/src/pre_data.py' is the Python script used to produce 'train.csv' and 'test.csv' (no longer needed);
- '/src/build_truth.py' is used for building the 'groundtruth.txt' file;
- '/src/build_index.py' is used for building the 'user_id.txt' and 'brand_id.txt' files;
- '/src/build_behavior_matrix.py' is used for building the 'behavior_matrix.txt';
- '/src/kNN.py' provides basic methods like kNN-find and topN-votes;
- '/src/predict.py' combines the tools and files above to produce the prediction file 'predict.txt';
- '/src/evaluate.py' is used for evaluating the performance of the algorithm;
- '/result/groundtruth.txt' is easy to generate, and '/result/predict.txt' is our algorithm's output (for now, these two files are only samples);
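
The kNN-find and topN-votes steps mentioned above could look roughly like the sketch below. This is not the code in 'kNN.py'. The function names, the cosine similarity measure, the user-by-brand matrix layout, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length behavior vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def knn_find(matrix, user, k):
    """Return the indices of the k users most similar to `user`."""
    scores = [
        (cosine(matrix[user], row), other)
        for other, row in enumerate(matrix)
        if other != user
    ]
    # Highest similarity first; ties are broken by the smaller user index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [other for _, other in scores[:k]]


def top_n_votes(matrix, neighbors, n):
    """Each neighbor casts one vote per brand it interacted with."""
    votes = Counter()
    for nb in neighbors:
        for brand, value in enumerate(matrix[nb]):
            if value > 0:
                votes[brand] += 1
    # Most votes first; ties are broken by the smaller brand index.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [brand for brand, _ in ranked[:n]]


# Hypothetical toy behavior matrix: rows are users, columns are brands,
# and a positive entry means the user interacted with that brand.
matrix = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
]

neighbors = knn_find(matrix, user=0, k=2)
recommended = top_n_votes(matrix, neighbors, n=2)
print(neighbors, recommended)
```

A real prediction step would probably also drop or down-weight brands the target user has already interacted with. That choice is left open here.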
To do: Design the algorithm and produce a proper 'predict.txt' file ASAP.
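
To check a candidate prediction against the ground truth, a scoring step could be sketched as follows. The metric (precision, recall and F1 over user-brand pairs) is an assumption, since this README does not say what 'evaluate.py' computes. The in-memory dict layout and the sample IDs are also hypothetical, and parsing of 'groundtruth.txt' and 'predict.txt' is left out because their format is not described here.

```python
def evaluate(truth, predicted):
    """Score predictions given as {user_id: set_of_brand_ids} dicts.

    Returns (precision, recall, f1) computed over user-brand pairs.
    """
    hits = 0
    n_pred = 0
    for user, brands in predicted.items():
        n_pred += len(brands)
        # A hit is a predicted brand that also appears in the truth
        # for the same user.
        hits += len(brands & truth.get(user, set()))
    n_truth = sum(len(brands) for brands in truth.values())

    precision = hits / n_pred if n_pred else 0.0
    recall = hits / n_truth if n_truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical sample data, for illustration only.
truth = {"u1": {"b1", "b2"}, "u2": {"b3"}}
predicted = {"u1": {"b1", "b4"}, "u3": {"b3"}}

precision, recall, f1 = evaluate(truth, predicted)
print(precision, recall, f1)
```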
By Sean.