On 20News data processing
Chen-Hailin opened this issue · 1 comment
Chen-Hailin commented
When I re-run the existing system on 20news, the input text looks like this: "Newsgroups: rec.motorcycles\nPath: cantaloupe.srv.cs.cmu.edu!ro ...etc".
Am I right that you do not discard the header (which often contains the name of the newsgroup label) during data processing?
ZixuanKe commented
Hi Hailin,
Thank you for your interest!
Yes. We didn’t apply any pre-processing to the 20 Newsgroups data, so the headers are kept in the input text.
Feel free to re-open if you have further questions.
Thank you for your time,
Zixuan
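
For readers who want to check how much the header contributes, here is a minimal sketch of stripping it before training. It is not part of this repository's pipeline. It assumes each raw post follows the usual Usenet layout, where the header block ends at the first blank line. The function name `strip_header` and the sample message are hypothetical, made up for illustration.

```python
def strip_header(raw: str) -> str:
    """Return the message body, dropping everything up to the first blank line."""
    # Normalize Windows line endings so "\r\n\r\n" is treated as a blank line too.
    text = raw.replace("\r\n", "\n")
    header, sep, body = text.partition("\n\n")
    # No blank line means header and body cannot be told apart: return the text unchanged.
    return body if sep else text


# Hypothetical message for illustration only (not taken from the dataset).
sample = (
    "Newsgroups: rec.motorcycles\n"
    "Subject: example subject\n"
    "\n"
    "This is the body of the post.\n"
)

body = strip_header(sample)
print(body)
```

This only removes the leading header block. Quoted replies and signatures inside the body can still mention the newsgroup name. If the data is loaded through scikit-learn rather than from raw files, `fetch_20newsgroups(remove=("headers", "footers", "quotes"))` offers a comparable option.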