patil-suraj/question_generation

ValueError: substring not found

bipinkc19 opened this issue · 13 comments

[screenshot of the error]

I can't disclose the text, but this is the error I get with every model.

I have also faced the same issue. Can you check whether your text contains a trademark symbol in the middle of any word?

For example:
interface to Teamcenter™s Business
When I removed the ™ from the string above, it worked fine:
interface to Teamcenter's Business
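A minimal sketch of that kind of cleanup (not from this repo; `clean_text` below is a hypothetical helper) that strips the ™ sign and other non-ASCII symbols before the text is passed to the pipeline:

```python
import re

def clean_text(text: str) -> str:
    """Drop symbols such as ™ that the models fail to process."""
    text = text.replace("\u2122", "")            # remove the ™ sign
    return re.sub(r"[^\x00-\x7F]+", " ", text)   # replace remaining non-ASCII runs with a space

print(clean_text("interface to Teamcenter\u2122s Business"))
# -> "interface to Teamcenters Business"
```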

[screenshot]
If I remove the word "has" and put some other word like "free" in its place, it works.

How do we fix this?
[screenshot]

Okay, I get that; I was commenting on a specific problem. In my case I am using a PDF file of 50 pages. I checked it with a few of the pages and it was throwing the same error. Maybe it happens when the model is not able to generate an answer for a specific piece of text.

If you want an instant solution, I suggest wrapping the call in a try/except (see the sketch below); that way you will at least get some questions and answers.

This is just a temporary solution. I am looking into this issue and will update soon.
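A minimal sketch of that workaround, assuming the pipeline helper from this repo's README (`from pipelines import pipeline`); how you split the input into chunks is up to you:

```python
from pipelines import pipeline  # this repo's pipeline helper, as shown in the README

nlp = pipeline("question-generation")

# Split your document into passages however you like (per paragraph, per page, ...).
chunks = [
    "Python is a programming language created by Guido van Rossum.",
    "Some passage the model cannot handle.",
]

qa_pairs = []
for chunk in chunks:
    try:
        qa_pairs.extend(nlp(chunk))
    except ValueError:
        # "substring not found": no usable question-answer pair was produced
        # for this chunk, so skip it and keep going.
        continue

print(qa_pairs)
```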

Maybe it happens when it can't generate answers for some text and throws an error, like you said.
Thank you for looking into it.

I have opened a pull request; let's see when it gets accepted. In the meantime you can always try the exception-handling approach.

@neelkantnewra how were the inference times for the 50-page PDF? Also, did you try fine-tuning the model?


I don't remember exactly; I am currently busy with another project. Yes, we fine-tuned it, otherwise it gives worse results.

I had the same error. There are some special characters in the input string which the models are unable to process. Just remove the special characters and it will work.


That is one case, but when the model does not find any suitable question-answer pair it returns an empty dict, and since it is empty we get a ValueError. We can solve this with exception handling, as previously mentioned.



@neelkantnewra Can you please guide us with the fine-tuning code on custom data?

When using the text "42 is the answer to life, universe and everything", the same error occurs. How do I solve this?

@liesketrommelen You need to use transformers 3.0.0.
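That is, pin the dependency to that version, e.g. `pip install transformers==3.0.0`.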