Issue about dataset format

Question

albat3ross opened this issue 4 years ago · 2 comments

Hello,
While trying to apply the model to other datasets, we got stuck on generating the feature.bin file. Your team mentioned that we could use txt2bin.py to convert the feature files from txt into binary format, but I'm not sure what the feature files should look like in .txt form.
Could you provide a few example lines from a txt feature file? It would be great if there were some example files for reference.
Thank you for your help!

Answer 1 · 2020-10-16T04:09:14.000Z

Please refer to here. The format of each line is an id followed by a feature vector.
ps: We have already released our feature extraction code.
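
For illustration, here is a minimal sketch of writing and reading lines in the "id followed by a feature vector" layout described above. The thread does not state the separator or the number types, so the whitespace separator, the float values, and the helper names `format_feature_line` and `parse_feature_line` are all assumptions. Check the linked example and txt2bin.py for the exact format before relying on this.

```python
from typing import List, Tuple


def format_feature_line(item_id: str, vector: List[float]) -> str:
    """Build one text line: the id, then each feature value.

    Assumption: fields are separated by a single space.
    """
    return " ".join([str(item_id)] + [repr(float(v)) for v in vector])


def parse_feature_line(line: str) -> Tuple[str, List[float]]:
    """Split one text line back into (id, feature vector).

    Assumption: fields are separated by any run of whitespace.
    """
    fields = line.split()
    return fields[0], [float(x) for x in fields[1:]]


if __name__ == "__main__":
    # Hypothetical ids and values, for illustration only.
    rows = [("0", [0.5, -1.25, 3.0]), ("1", [2.0, 0.0, 1.5])]
    for item_id, vector in rows:
        print(format_feature_line(item_id, vector))
```

Writing one such line per item to a .txt file gives a file that can be compared line by line with the linked example.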

Answer 2 · 2020-10-16T21:02:53.000Z

Thank you for the example! It will be very helpful for us.