AMontgomerie/question_generator

</s> special tokens are coming in answers

Threepointone4 opened this issue · 7 comments

The special character is showing up as part of the answer. One warning I saw that may be related to this:
UserWarning: This sequence already has </s>. In future versions, this behavior may lead to duplicated eos tokens being added.
f"This sequence already has {self.eos_token}. In future versions, this behavior may lead to duplicated eos tokens being added."

My version : transformers==4.1.1

Hi @Threepointone4 @AMontgomerie, I am also facing the same problem. Has it been solved by any chance?

Don't append at the end to the input, the library does that for you.

I meant don't append ""

Don't append </s>

Hello, this is the code I used to generate questions. I have the same problem as @Threepointone4. As my code shows, I am not appending anything to the input, @Punit-Koujalgi.

Here is my code:

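# Assumes the repo's questiongenerator.py is on the import path.
from questiongenerator import QuestionGenerator, print_qa

qg = QuestionGenerator()

with open('owl_rescue.txt', 'r') as a:
    article = a.read()

qa_list = qg.generate(
    article,
    num_questions=10,
    answer_style='all'
)

print_qa(qa_list)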

@AMontgomerie is there any solution for this?

Thanks in advance! I greatly appreciate it!

@ALL sorry for the late response.
Just upgrade to a newer transformers version and this will not happen. With newer transformers versions, skip_special_tokens should also be set to True in the decode call:
self.qg_tokenizer.decode(output[0], skip_special_tokens=True)

I've added skip_special_tokens=True to master now.
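For anyone who cannot upgrade or pull master yet, a possible stopgap is to strip the leftover token strings from the generated text after the fact. The sketch below is only an illustration: strip_special_tokens is a hypothetical helper that is not part of question_generator or transformers, and the token list assumes the T5-style "</s>" and "<pad>" strings. The fix in master remains skip_special_tokens=True in the decode call.

```python
import re

# Hypothetical helper, not part of question_generator or transformers.
# Assumption: the leaked tokens are the T5-style "</s>" and "<pad>" strings.
SPECIAL_TOKENS = ("</s>", "<pad>")


def strip_special_tokens(text, tokens=SPECIAL_TOKENS):
    """Remove literal special-token strings and tidy the leftover whitespace."""
    for token in tokens:
        text = text.replace(token, " ")
    # Collapse the whitespace runs left behind by the removed tokens.
    return re.sub(r"\s+", " ", text).strip()


cleaned = strip_special_tokens("a barn owl</s>")
print(cleaned)
```

This only cleans the final strings; it does not change what the model generates, so it should be dropped once the decode call skips special tokens.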