Replaced the RCV1 sample with the original data but not getting any results

Question

Opened this issue 3 years ago · 6 comments

Hi. I obtained the original RCV1-V2 dataset, ran it through your preprocessing functions (clean_str and clean_stopwords), and converted it to the required format {'token': List[str], 'label': List[str]}. I built the train, val, and test sets according to the benchmark data split.
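For anyone reproducing this step, here is a minimal sketch of how the converted samples could be checked and serialized. It assumes the data file holds one JSON object per line, which this thread does not confirm; `is_valid_sample` and `to_json_lines` are hypothetical helper names, not functions from the repository.

```python
import json
from typing import Dict, Iterable, List

Sample = Dict[str, List[str]]


def is_valid_sample(sample: Sample) -> bool:
    """Return True if 'token' and 'label' are both non-empty lists of strings."""
    for key in ("token", "label"):
        value = sample.get(key)
        if not isinstance(value, list) or not value:
            return False
        if not all(isinstance(item, str) for item in value):
            return False
    return True


def to_json_lines(samples: Iterable[Sample]) -> str:
    """Serialize samples one JSON object per line (assumed file layout)."""
    lines = []
    for index, sample in enumerate(samples):
        if not is_valid_sample(sample):
            raise ValueError(f"sample {index} is malformed: {sample!r}")
        lines.append(json.dumps(sample))
    return "\n".join(lines)
```

Rejecting samples with an empty token or label list up front makes it easier to tell a data problem apart from a model problem later on.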

Then I ran helper.hierarchical_tree_statistic.py to generate the rcv1-prob.json file.

After that I used your taxonomy file and the "gcn-rcv1-v2.json" config file, in which I only replaced "hierarchy": "sample_rcv1.taxonomy" with "hierarchy": "rcv1.taxonomy". However, I am not getting any result: precision, recall, Micro-F1, and Macro-F1 are all zero, and the loss is nan.
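One thing that may be worth ruling out after swapping the taxonomy file is a mismatch between the labels in the data and the labels in the taxonomy. The thread does not confirm that this is the cause of the zero metrics, so treat the following only as a diagnostic sketch. It assumes each taxonomy line is tab-separated with a parent followed by its children; `taxonomy_labels` and `unknown_labels` are hypothetical helper names.

```python
from typing import Dict, Iterable, List, Set


def taxonomy_labels(lines: Iterable[str]) -> Set[str]:
    """Collect every label name from tab-separated taxonomy lines (assumed format)."""
    labels: Set[str] = set()
    for line in lines:
        parts = line.strip().split("\t")
        labels.update(part for part in parts if part)
    return labels


def unknown_labels(samples: Iterable[Dict[str, List[str]]], known: Set[str]) -> Set[str]:
    """Return labels that appear in the samples but not in the taxonomy."""
    used = {label for sample in samples for label in sample["label"]}
    return used - known
```

If `unknown_labels` returns a non-empty set for any split, the data and the taxonomy disagree and that should be fixed before looking at the model itself.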

Do I need to perform any additional steps when running your code on data other than the sample RCV1 dataset?

Answer 1 · 2021-09-02T06:18:31.000Z

Hi, did you try running this model directly on their preprocessed RCV1-V2 data, which is stored in HiAGM/data? I ran it and got a very poor result, far from the one reported in the paper.

Answer 2 · 2021-09-03T07:56:06.000Z

Yes. Same here!
The sample data gives a result, but obviously not a good one, because it is only sample data.
However, when I took the original RCV1-V2 dataset and used the preprocessing file they provided, precision, recall, Micro-F1, and Macro-F1 were all zero and the loss was nan.
I also tried the WoS dataset with their preprocessing file and still could not get good results!

Answer 3 · 2022-11-28T08:15:13.000Z

Hi, I tried this code on another dataset and have the same problem: precision, recall, Micro-F1, and Macro-F1 are all zero, and the loss is nan. Did you ever solve this problem?

Answer 4 · 2024-03-02T10:56:52.000Z

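Hi. I obtained the original RCV1-V2 dataset using your preprocessing functions (clean_str and clean_stopwors) and converted it to the required format {'token': List[str], 'label': List[str]}. I built the train, val, and test sets according to the benchmark data split.

Then I ran helper.hierarchical_tree_statistic.py to get the rcv1-prob.json file.

Then I used your taxonomy file and the "gcn-rcv1-v2.json" config file (in which I only replaced "hierarchy": "sample_rcv1.taxonomy" with "hierarchy": "rcv1.taxonomy"), but I am not getting a result. Precision, recall, Micro-f1, and Macro-f1 are all zero, and the loss is nan.

Do I need to perform any additional steps when running the code on other data (not the sample rcv1 dataset)?

May I ask how you did this? I keep getting errors when I try.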

Answer 5 · 2024-08-03T11:24:49.000Z

Me too. Precision, recall, and F1-score are all 0.

Answer 6 · 2024-08-04T15:45:10.000Z

You should use the complete version of the dataset. The dataset currently available in this repository is incomplete, which is why you cannot get good results!