The script json2txt.py converts a JSON file into a text file that can be used as input for TransRec.
For Amazon datasets,
python3 ./json2txt.py reviews_Office_Products_5.json.gz | gzip > reviews_Office_Products_5.txt.gz
For other datasets, use the -d option.
python3 ./json2txt.py -d googlelocal reviews.clean.json.gz | gzip > reviews.clean.txt.gz
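As a rough illustration of the kind of conversion described above, the sketch below turns one JSON-encoded review into one text line. It is not the code of json2txt.py. The field names reviewerID, asin, overall and unixReviewTime are those used in the Amazon 5-core files; the function name review_to_line and the output column order (user, item, rating, time) are assumptions made for this example, since the layout TransRec expects is not stated here.

```python
import json


def review_to_line(raw_line):
    """Turn one JSON-encoded review into a space-separated text line.

    Assumption: the column order (user, item, rating, time) is a guess
    made for illustration, not the documented TransRec input format.
    """
    review = json.loads(raw_line)
    fields = (
        review["reviewerID"],      # user identifier
        review["asin"],            # item identifier
        review["overall"],         # rating
        review["unixReviewTime"],  # timestamp in seconds
    )
    return " ".join(str(f) for f in fields)


# Hypothetical sample record, shortened for readability.
sample = '{"reviewerID": "A1", "asin": "B0", "overall": 5.0, "unixReviewTime": 1400000000}'
print(review_to_line(sample))
```

For real files, the same function could be applied to each line read through gzip.open(path, "rt").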
Paper: http://cseweb.ucsd.edu/~jmcauley/pdfs/recsys17.pdf
Code: https://drive.google.com/file/d/0B9Ck8jw-TZUEVmdROWZKTy1fcEE/view?usp=sharing
- Edit src/main.cpp to use go_TransRec, then rebuild:
make clean
make
You can then run the resulting executable train as follows:
./train reviews_Automotive.txt.gz 5 5 10 0.1 0.1 0.01 0 10000 my_model_path_blah_blah
Data: http://jmcauley.ucsd.edu/data/amazon/
We can use json2txt.py to convert Amazon's 5-core JSON datasets to txt.
Data: http://jmcauley.ucsd.edu/data/googlelocal/googlelocal.tar.gz
Actually, this "JSON" file is not valid JSON, so I wrote a converter to deal with this problem. You can convert reviews.clean.json with the following command.
go run ./toJSON.go < reviews.clean.json > reviews.clean.real.json
If this conversion is too slow, you can compile the Go source with go build ./toJSON.go and run ./toJSON instead.
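To illustrate the kind of problem the converter addresses, here is a minimal Python sketch. It assumes the file holds one Python dict literal per line (single-quoted strings, None, u'' prefixes), which is a guess about why a JSON parser rejects it. toJSON.go may work differently, and the function name literal_line_to_json is hypothetical.

```python
import ast
import json


def literal_line_to_json(line):
    """Parse one Python-literal line and re-encode it as JSON.

    Assumption: each input line is a Python dict literal. ast.literal_eval
    only evaluates literals, so it does not execute arbitrary code.
    """
    record = ast.literal_eval(line)
    return json.dumps(record)


# Hypothetical sample line in the assumed format.
sample = "{'rating': 4.0, 'reviewerName': u'Ann', 'categories': None}"
print(literal_line_to_json(sample))
```

The output of such a step is standard JSON, one object per line, which is the form a JSON parser can read.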
We can use json2txt.py to convert JSON to txt.
Data: http://jmcauley.ucsd.edu/data/epinions/
Data: https://archive.org/details/201309_foursquare_dataset_umn
Data: http://www.cs.ubc.ca/~jamalim/datasets/ --> http://socialcomputing.asu.edu/datasets/Flixster