THUDM/KBRD

Show_bias.py not executing

ahtsham58 opened this issue · 14 comments

Hey @qibinc

I am done with all three training steps. I can see the model files and everything else in the "saved" folder.

Now I want to run the last command, "show_bias.py", but I am getting an error. Here is the error description.
Could you please help me?

[ pytorch_preprocess: False ]
[ pytorch_teacher_batch_sort: False ]
[ pytorch_teacher_dataset: None ]
[ pytorch_teacher_task: None ]
[ shuffle: False ]
[ ParlAI Image Preprocessing Arguments: ]
[ image_cropsize: 224 ]
[ image_size: 256 ]
Traceback (most recent call last):
  File "scripts/show_bias.py", line 41, in <module>
    agent = create_agent(opt, requireModelExists=True)
  File "/root/KBRD/parlai/core/agents.py", line 587, in create_agent
    raise RuntimeError('Need to set `model` argument to use create_agent.')
RuntimeError: Need to set `model` argument to use create_agent.

Can you please help?

Hi,
Can you please explain how I can generate dialogue conversations using this KBRD system? My model is already trained.

Hi @ahtsham58 ,

You can run the following command to generate conversations on the test set.

python scripts/display_model.py -t redial -mf saved/both_rgcn_0 -dt test

For details, you can refer to ParlAI's documentation here https://parl.ai/

Hi,
Thanks for sharing the command for conversation generation. I executed it successfully. Currently, it uses the ReDial test data to pick random conversations, and the output looks like this:
[image]

Now I have two questions.
First, the movie names are not mentioned in the conversation; the __unk__ tag is used instead. How can I show the actual movie names in the conversation?

Second, which data file can I use if I want to input my own customized recommendation queries?
That is, I want to converse with the system using my own queries. What should I do in that case?

Thanks in advance!

Hi @ahtsham58 ,

Sorry that I provided the wrong command in the last reply. To obtain conversations, you should run:

python scripts/display_model.py -t redial -mf saved/transformer_rec_both_rgcn_0 -dt test

  1. The __unk__ token is used because we want to separate the evaluation of the dialog part and the recommender part (this way, the movie names won't affect dialog quality metrics such as BLEU). To obtain the recommended movies, please add the line return Output(list(map(str, return_dict["scores"].argmax(dim=1).tolist()))) at the end of eval_step in this file https://github.com/THUDM/KBRD/blob/master/parlai/agents/kbrd/kbrd.py and then run python scripts/display_model.py -t redial -mf saved/both_rgcn_0 -dt test.
  2. Unfortunately, conversing with the system is beyond the scope of this research paper. This code base is only intended to reproduce the results presented in the paper. My suggestion is that you can look into ParlAI's code for interactively conversing with the system here https://github.com/facebookresearch/ParlAI/blob/master/parlai/scripts/interactive.py.
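
To make the one-liner in item 1 more concrete: the expression takes, for each row of scores, the index of the highest-scoring item and converts it to a string. The sketch below reproduces that computation on plain nested lists instead of a tensor. It assumes that scores holds one row per conversation and one column per candidate item, which the reply above does not state, and top_item_per_row is a hypothetical helper name, not part of KBRD or ParlAI.

```python
def top_item_per_row(scores):
    """Return, for each row, the index of its largest value as a string.

    Pure-Python stand-in for scores.argmax(dim=1).tolist() followed by
    map(str, ...); `scores` is assumed to be a list of equal-length rows.
    """
    return [str(max(range(len(row)), key=row.__getitem__)) for row in scores]


# Two hypothetical conversations scored against three candidate items.
example_scores = [
    [0.1, 0.7, 0.2],
    [0.9, 0.05, 0.05],
]
predicted_ids = top_item_per_row(example_scores)
```

The resulting strings are item indices, so they still have to be mapped back to movie names by whatever index-to-movie mapping the dataset uses.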

Hope this helps!

Hi @qibinc ,

Thanks for the detailed answer. I will try it soon.

Hi @qibinc ,
Thanks for the detailed answer. I did the above and it went well.
Actually, I added one conversation to the test set, making sure it uses the same format as the ReDial dataset, with valid movie IDs as well.

Right now there is only one conversation instance in my test_data.jsonl file.

Ideally it should work, but I am getting an error like: KeyError: 'Hey! looking for extremly horrible movie this night'

Could you please tell me what the reason could be, given that the JSON is in a valid format with exactly the same keys and everything? I only changed the message text; the movie IDs are the same as in dbpedia.

Hi @ahtsham58 ,

Thanks for following up and I'm glad something is working for you!

The KeyError is raised because the sentence 'Hey! looking for extremly horrible movie this night' is not in the text_dict.pkl file. If you look into this file (e.g., run import pickle as pkl; d = pkl.load(open('data/redial/text_dict.pkl', 'rb'))), you will find it's a dict where the keys are sentences and the values are the extracted entities.

We use this as a cache of extracted entities, as extracting the entity from sentences on the fly causes a bottleneck during training. To solve this, we decided to save the results to text_dict.pkl.
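
Before running the model on a modified test set, it can help to check which sentences are missing from the cache. The following is a minimal sketch, assuming text_dict.pkl is a plain pickled dict keyed by the exact sentence text as described above; load_text_dict and find_uncached_sentences are hypothetical helper names, not part of KBRD.

```python
import pickle


def load_text_dict(path="data/redial/text_dict.pkl"):
    """Load the sentence -> entities cache (assumed to be a pickled dict)."""
    with open(path, "rb") as f:
        return pickle.load(f)


def find_uncached_sentences(text_dict, sentences):
    """Return the sentences that are not keys of the cache, in input order."""
    return [s for s in sentences if s not in text_dict]


# Usage sketch (the file path is the one mentioned in this thread):
#   text_dict = load_text_dict()
#   missing = find_uncached_sentences(text_dict, my_new_sentences)
```

Any sentence reported as missing would raise the same KeyError, because the lookup is by exact string.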

Therefore, when adding a new sentence, we first need to extract the entities from it (which may be Horror_film in this case). To do this, you'll need to set up https://github.com/dbpedia-spotlight/spotlight-docker. Then, you can refer to this:

import requests
from tqdm import tqdm

# Set up a local dbpedia-spotlight docker https://github.com/dbpedia-spotlight/spotlight-docker
DBPEDIA_SPOTLIGHT_ADDR = "http://0.0.0.0:2222/rest/annotate"
SPOTLIGHT_CONFIDENCE = 0.1

def _id2dbpedia(movie_id):
    # Left unimplemented in this snippet.
    pass

def _text2entities(text):
    # Ask the Spotlight server to annotate the text and reply in JSON.
    headers = {"accept": "application/json"}
    params = {"text": text, "confidence": SPOTLIGHT_CONFIDENCE}
    response = requests.get(DBPEDIA_SPOTLIGHT_ADDR, headers=headers, params=params)
    response = response.json()
    # "Resources" is absent when no entity was found in the text.
    return (
        [f"<{x['@URI']}>" for x in response["Resources"]]
        if "Resources" in response
        else []
    )

Finally, you should add this new entry to text_dict.pkl and everything should go well again!
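
The last step, adding the new entry, could look roughly like the sketch below. It assumes the cache is a plain pickled dict whose values are lists of entity strings in the form returned by _text2entities; add_cache_entry is a hypothetical helper name, and it is worth backing up text_dict.pkl before overwriting it.

```python
import pickle


def add_cache_entry(pkl_path, sentence, entities):
    """Insert or overwrite one sentence -> entities entry in the pickled cache."""
    with open(pkl_path, "rb") as f:
        text_dict = pickle.load(f)
    # The key must be the exact sentence text that appears in the data file.
    text_dict[sentence] = list(entities)
    with open(pkl_path, "wb") as f:
        pickle.dump(text_dict, f)
    return text_dict


# Usage sketch, with the path and sentence taken from this thread:
#   sentence = "Hey! looking for extremly horrible movie this night"
#   add_cache_entry("data/redial/text_dict.pkl", sentence, _text2entities(sentence))
```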

Ohhh, I see.
I am really grateful to you. You were so kind in your response. Bless you!

Can you tell me how to execute the code and what the overall workflow is?
When I run "bash scripts/both.sh 4 4", the message says there is no module named "parlai".

Hi @Ashappyboy ,

Please add the project path (/xx/xx/KBRD) to your $PYTHONPATH
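
To check whether the path change took effect, or to get the same effect inside a single Python session, the project directory can be prepended to sys.path. This is a different technique from setting $PYTHONPATH: it only affects the current process, so it does not help shell scripts such as scripts/both.sh that start their own interpreter. The helper names below are hypothetical, and /xx/xx/KBRD is the placeholder path from the reply above.

```python
import importlib.util
import os
import sys


def ensure_on_path(project_root):
    """Prepend project_root to sys.path if it is not already there."""
    root = os.path.abspath(project_root)
    if root not in sys.path:
        sys.path.insert(0, root)
    return root


def is_importable(module_name):
    """Return True if a top-level module of this name can be found."""
    return importlib.util.find_spec(module_name) is not None


# Usage sketch:
#   ensure_on_path("/xx/xx/KBRD")
#   print(is_importable("parlai"))
```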

Hi @Ashappyboy ,

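  1. Part of KBRD/data/redial is downloaded automatically during training; once that download has finished, put the unpacked pkl files into it.
  2. Please put dbpedia in KBRD/dbpedia.

Sorry that the README is rather rough. I hope this reply makes things clear.

Best wishes to you too!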