X matrix
iyedbennour opened this issue · 1 comments
iyedbennour commented
Hi! Can you please describe the process used to create the X matrix contained in the different npz files? How do you transform the Cora papers into features?
Thank you!
abojchevski commented
Sorry for the late reply. The X matrix contains the TF-IDF representation of the text in the paper abstracts. Specifically, I used scikit-learn's TfidfVectorizer (https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfVectorizer.html) on the raw text data.
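For illustration, here is a minimal sketch of what that step could look like. The three abstracts are made-up placeholders and the vectorizer uses scikit-learn's default settings; the exact parameters and preprocessing used to build the npz files are not stated in this thread, so treat this as an assumption rather than the exact pipeline.

```python
from scipy import sparse
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical placeholder abstracts; the real input would be the raw
# text of the Cora paper abstracts, one string per paper.
abstracts = [
    "Neural networks for graph classification.",
    "Probabilistic inference in graphical models.",
    "Reinforcement learning with neural function approximation.",
]

# Default settings are an assumption; the actual parameters are unknown.
vectorizer = TfidfVectorizer()

# X is a sparse matrix with one row per paper and one column per term.
X = vectorizer.fit_transform(abstracts)

print(X.shape)                     # (number of papers, vocabulary size)
print(len(vectorizer.vocabulary_)) # number of distinct terms (columns)
```

Row i of X is then the feature vector of paper i, and `vectorizer.vocabulary_` maps each term to its column index.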
Hope this helps.