nert-nlp/pastrie

Sentence text is tokenized, should be raw

nschneid opened this issue · 1 comment

I need to figure out the best way to obtain the raw version of each sentence.

Example:

-# text = Just because he taunts our own tubby ( a trend of taunting which started en-masse after GWB s own childish ' Axis of Evil ' bullshit ) does n't make him crazy either ....
+# text = Just because he taunts our own tubby (a trend of taunting which started en-masse after GWB's own childish 'Axis of Evil' bullshit) doesn't make him crazy either....
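
One possible approach, if the original untokenized source is not available, is a rule-based detokenizer. The sketch below is only an illustration and not how the corpus is actually built: the function name `detokenize` and the specific rules (clitics, brackets, sentence punctuation, paired single quotes) are assumptions based on the one example above.

```python
import re


def detokenize(text: str) -> str:
    """Heuristically undo PTB-style tokenization (a rough sketch, not lossless)."""
    # Attach clitics such as n't, 's, 're to the preceding word.
    text = re.sub(r" (n't|'s|'re|'ve|'ll|'d|'m)\b", r"\1", text)
    # Remove the space after opening brackets and before closing ones.
    text = re.sub(r"([(\[{]) ", r"\1", text)
    text = re.sub(r" ([)\]}])", r"\1", text)
    # Remove the space before runs of sentence punctuation.
    text = re.sub(r" ([.,;:!?]+)", r"\1", text)
    # Close up paired standalone single quotes: ' Axis of Evil ' -> 'Axis of Evil'.
    text = re.sub(r"(?<!\S)' (.+?) '(?!\S)", r"'\1'", text)
    return text


print(detokenize("He said ( quietly ) that it does n't matter ."))
```

Run on the tokenized line above, this yields the raw line except for `GWB s`, which stays as-is: the apostrophe is absent from the tokenized text, so no spacing rule can restore it. Cases like that suggest that realigning against the original source text would be more reliable than heuristics alone.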