Zyh716/WSDM2022-C2CRS

About evaluation

Closed this issue · 7 comments

Hi yuanhang,
first of all, thanks for your work. The results of Dist@2, 3, and 4 for the KGSF dialogue module reported in your paper are different from those reported in the KGSF paper itself. Is this because you and KGSF use different evaluation scripts? If so, could you specify the difference between KGSF's evaluation script and yours?
Thanks a lot! Lucy

Hello, you can see our Dist@N method in this file: https://github.com/Zyh716/WSDM2022-C2CRS/blob/main/crslab/evaluator/standard.py, especially starting from line 118.

We use this method to compute the Dist@N for C2CRS and all baselines (including KGSF).
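
For readers without the repo open, here is a minimal illustrative sketch of a corpus-level Dist@N computation. It is not the exact code from `crslab/evaluator/standard.py`; the function name `dist_n` and the normalization by the total number of generated N-grams are assumptions, and the choice of normalizer (total N-grams vs. sentences or tokens) is exactly the kind of detail that makes Dist scores differ between papers, so check line 118 of the linked file for the authoritative version.

```python
# Illustrative corpus-level Distinct-N: unique N-grams across all generated
# responses divided by the total number of generated N-grams.
from typing import List


def dist_n(responses: List[List[str]], n: int) -> float:
    """Distinct-N over a list of tokenized responses."""
    distinct, total = set(), 0
    for tokens in responses:
        for i in range(len(tokens) - n + 1):
            distinct.add(tuple(tokens[i:i + n]))
            total += 1
    return len(distinct) / total if total > 0 else 0.0


# Example: two short tokenized responses
responses = [["i", "like", "this", "movie"], ["i", "like", "that", "movie"]]
print(dist_n(responses, 2))  # distinct bigrams / total bigrams
```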

Wow! Thanks for your quick reply. I will check it carefully!

Hey yuanhang, sorry to bother you again. When I run sh script/redial/eval/redial_rec_eval.sh, it reports Hit@N results. Does Hit@N = Recall@N in this work? Also, training the recommendation module took me very long, about 2 days (using CPU, default settings in the repo). Is this expected?

  1. Yes, Hit@N = Recall@N in this work (see the sketch below).
  2. On a 3090, it takes approximately 2.5 hours to finish pre-training, 0.5 hours to finish fine-tuning on the recommendation task, and 8 hours to finish fine-tuning on the generation task. You can use these times as a reference. I have not run on CPU, so I don't know how long that would take.
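
Here is a minimal sketch of what Hit@N means in this setting, assuming (as is common for ReDial-style recommendation evaluation) one ground-truth item per turn, in which case Recall@N and Hit@N coincide. The function name `hit_at_n` and the data layout are illustrative, not the repo's API.

```python
# Hit@N: fraction of evaluation turns whose gold item appears in the
# model's top-N ranked recommendations.
from typing import List, Sequence


def hit_at_n(ranked_items: List[Sequence[int]], gold_items: List[int], n: int) -> float:
    """ranked_items[i] is the model's ranking for turn i; gold_items[i] is its true item."""
    hits = sum(1 for ranking, gold in zip(ranked_items, gold_items) if gold in ranking[:n])
    return hits / len(gold_items) if gold_items else 0.0


# Example: 2 of 3 turns have the gold item in the top-2
print(hit_at_n([[5, 3, 9], [1, 2, 8], [7, 4, 6]], [3, 8, 7], 2))  # 0.666...
```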

The reason was that I needed to set CUDA_VISIBLE_DEVICES=0. Thanks again!
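
For reference, a minimal sketch of restricting PyTorch to GPU 0 from inside Python; this assumes the repo's scripts use standard PyTorch device handling and simply sets the same environment variable before CUDA is initialized. Equivalently, prefix the shell command with CUDA_VISIBLE_DEVICES=0.

```python
# CUDA_VISIBLE_DEVICES must be set before CUDA is initialized,
# i.e. before the first torch.cuda call (safest: before importing torch).
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # expose only GPU 0

import torch

print(torch.cuda.is_available())  # True if GPU 0 is visible and usable
print(torch.cuda.device_count())  # 1, since only GPU 0 is exposed
```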

Can you find the redial_context_movie_id2crslab_entityId.json file? The initial nltk path is invalid, so I used the updated path from CRSLab, but I still can't find the redial_context_movie_id2crslab_entityId.json file.