What format should the Wikipedia data be in: XML, JSON, or plain text?
Closed this issue · 1 comment
chendi1995 commented
Thanks. I just used the XML, but it failed with: "ValueError: Sentence boundaries unset. You can add the 'sentencizer' component to the pipeline with: nlp.add_pipe(nlp.create_pipe('sentencizer')) Alternatively, add the dependency parser, or set sentence boundaries by setting doc[i].sent_start"
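
For anyone hitting the same error, here is a minimal sketch of the first fix the message suggests. It assumes the spaCy v2 API, since that is the call quoted in the error. The helper name `ensure_sentence_boundaries` is hypothetical and not part of spaCy or this project, and where the `nlp` object is created in this repository is not shown in the thread.

```python
def ensure_sentence_boundaries(nlp):
    """Add a rule-based sentencizer unless the pipeline can already split sentences.

    `nlp` is expected to be a spaCy Language object. The helper name is
    hypothetical and only used for illustration.
    """
    # The error message names two components that set sentence boundaries:
    # the dependency parser and the sentencizer. Skip if either is present.
    if "parser" in nlp.pipe_names or "sentencizer" in nlp.pipe_names:
        return nlp

    # Call taken verbatim from the error message (spaCy v2 API).
    # In spaCy v3 the equivalent call is: nlp.add_pipe("sentencizer")
    nlp.add_pipe(nlp.create_pipe("sentencizer"))
    return nlp


# Usage sketch (assumes spaCy v2 is installed; a blank English pipeline
# is used here only as an example):
#   import spacy
#   nlp = ensure_sentence_boundaries(spacy.blank("en"))
#   doc = nlp("First sentence. Second sentence.")
#   sentences = [sent.text for sent in doc.sents]
```

The sentencizer splits on punctuation only, so it is fast but less accurate than the dependency parser, which is the alternative the message mentions.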