stanfordnlp/CoreNLP

Expecting value: line 1 column 1 (char 0) when Tregexing on large text

jiangweiatgithub opened this issue · 12 comments

When running Tregex from Python on the large text below, I get the following error, even though I set timeout = 300000 for the server:

src = requests.post( 'http://' + server + ':9008/tregex?pattern=' + pos_string + '&filter=False&properties={"annotators":"tokenize,ssplit,pos,ner,depparse,parse","outputFormat":"json"}', data={'data': src0}, headers={'Connection': 'close'}, timeout = 300).json()
Expecting value: line 1 column 1 (char 0)

Article 5 of the Vienna Convention on Consular Relations lists the following consular functions: “(a) protecting in the receiving State the interests of the sending State and of its nationals, both individuals and bodies corporate, within the limits permitted by international law; (b) furthering the development of commercial, economic, cultural and scientific relations between the sending State and the receiving State and otherwise promoting friendly relations between them in accordance with the provisions of the present Convention; (c) ascertaining by all lawful means conditions and developments in the commercial, economic, cultural and scientific life of the receiving State, reporting thereon to the Government of the sending State and giving information to persons interested; (d) issuing passports and travel documents to nationals of the sending State, and visas or appropriate documents to persons wishing to travel to the sending State; (e) helping and assisting nationals, both individuals and bodies corporate, of the sending State; (f) acting as notary and civil registrar and in capacities of a similar kind, and performing certain functions of an administrative nature, provided that there is nothing contrary thereto in the laws and regulations of the receiving State; (g) safeguarding the interests of nationals, both individuals and bodies corporate, of the sending States in cases of succession mortis causa in the territory of the receiving State, in accordance with the laws and regulations of the receiving State; (h) safeguarding, within the limits imposed by the laws and regulations of the receiving State, the interests of minors and other persons lacking full capacity who are nationals of the sending State, particularly where any guardianship or trusteeship is required with respect to such persons; (i) subject to the practices and procedures obtaining in the receiving State, representing or arranging appropriate representation for nationals of the sending 
State before the tribunals and other authorities of the receiving State, for the purpose of obtaining, in accordance with the laws and regulations of the receiving State, provisional measures for the preservation of the rights and interests of these nationals, where, because of absence or any other reason, such nationals are unable at the proper time to assume the defence of their rights and interests; (j) transmitting judicial and extra-judicial documents or executing letters rogatory or commissions to take evidence for the courts of the sending State in accordance with international agreements in force or, in the absence of such international agreements, in any other manner compatible with the laws and regulations of the receiving State; (k) exercising rights of supervision and inspection provided for in the laws and regulations of the sending State in respect of vessels having the nationality of the sending State, and of aircraft registered in that State, and in respect of their crews; (l) extending assistance to vessels and aircraft mentioned in subparagraph (k) of this article, and to their crews, taking statements regarding the voyage of a vessel, examining and stamping the ship’s papers, and, without prejudice to the powers of the authorities of the receiving State, conducting investigations into any incidents which occurred during the voyage, and settling disputes of any kind between the master, the officers and the seamen in so far as this may be authorized by the laws and regulations of the sending State; (m) performing any other functions entrusted to a consular post by the sending State which are not prohibited by the laws and regulations of the receiving State or to which no objection is taken by the receiving State or which are referred to in the international agreements in force between the sending State and the receiving State”.
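
A note on the error itself: `Expecting value: line 1 column 1 (char 0)` is the message Python's `json` module raises when the body it is asked to decode is empty or not JSON. So `.json()` is most likely being called on a reply that is not JSON (for example a plain-text error or timeout message). Below is a minimal sketch, assuming the same endpoint and parameters as the call above, that surfaces the raw reply instead. `tregex_params` and `decode_corenlp_reply` are hypothetical helper names, not part of CoreNLP or requests.

```python
import json


def tregex_params(pattern):
    """Query parameters for the /tregex call shown above (same values)."""
    props = {
        "annotators": "tokenize,ssplit,pos,ner,depparse,parse",
        "outputFormat": "json",
    }
    # Passing a dict via `params=` lets the HTTP client URL-encode the pattern.
    return {"pattern": pattern, "filter": "False", "properties": json.dumps(props)}


def decode_corenlp_reply(status_code, body_text):
    """Return the parsed JSON reply, or raise an error that shows the raw text."""
    snippet = body_text[:200]
    if status_code != 200:
        raise RuntimeError(f"server returned HTTP {status_code}: {snippet!r}")
    try:
        return json.loads(body_text)
    except json.JSONDecodeError as err:
        raise RuntimeError(f"reply was not JSON ({err}): {snippet!r}") from err


# Possible usage with requests (not executed here; `server`, `pos_string`
# and `src0` are the variables from the original call):
#   resp = requests.post(f"http://{server}:9008/tregex",
#                        params=tregex_params(pos_string),
#                        data={"data": src0}, timeout=300)
#   result = decode_corenlp_reply(resp.status_code, resp.text)
```

With this in place, a failed request would report the HTTP status and the first characters of whatever the server sent back, rather than only the JSON decoding error.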

Which python interface are you using? Stanza?

What CoreNLP settings (if any) did you use? This looks exactly like the kind of sentence which would cause the PCFG to crash (not that the SRParser would deliver particularly useful results)

Out of curiosity, I put this into the non-bert version of the stanza constituency parser and got this:

(ROOT
  (S
    (S
      (NP
        (NP (NN Article) (CD 5))
        (PP
          (IN of)
          (NP
            (NP (DT the) (NNP Vienna) (NNP Convention))
            (PP
              (IN on)
              (NP (NNP Consular) (NNPS Relations))))))
      (VP
        (VBZ lists)
        (NP
          (NP
            (NP (DT the) (VBG following) (JJ consular) (NNS functions))
            (: :)
            (`` “)
            (-LRB- -LRB-)
            (S
              (INTJ (LS a) (-RRB- -RRB-))
              (VP
                (VBG protecting)
                (PP
                  (IN in)
                  (NP (DT the) (VBG receiving) (NN State)))
                (NP
                  (NP (DT the) (NNS interests))
                  (PP
                    (IN of)
                    (NP
                      (NP
                        (NP (DT the) (VBG sending) (NNP State))
                        (CC and)
                        (PP
                          (IN of)
                          (NP (PRP$ its) (NNS nationals))))
                      (, ,)
                      (NP
                        (NP (CC both) (NNS individuals) (CC and) (NNS bodies))
                        (ADJP (JJ corporate)))
                      (, ,)
                      (PP
                        (IN within)
                        (NP
                          (NP (DT the) (NNS limits))
                          (VP
                            (VBN permitted)
                            (PP
                              (IN by)
                              (NP (JJ international) (NN law)))))))))))
            (, ;)
            (-LRB- -LRB-)
            (LS b)
            (-RRB- -RRB-))
          (VP
            (VP
              (VBG furthering)
              (NP
                (NP (DT the) (NN development))
                (PP
                  (IN of)
                  (NP
                    (NP
                      (ADJP (JJ commercial) (, ,) (JJ economic) (, ,) (JJ cultural) (CC and) (JJ scientific))
                      (NNS relations))
                    (PP
                      (IN between)
                      (NP
                        (NP (DT the) (VBG sending) (NNP State))
                        (CC and)
                        (NP (DT the) (VBG receiving) (NN State))))))))
            (CC and)
            (ADVP (RB otherwise))
            (VP
              (VBG promoting)
              (NP
                (NP (JJ friendly) (NNS relations))
                (PP
                  (IN between)
                  (NP (PRP them))))
              (PP
                (IN in)
                (NP
                  (NP (NN accordance))
                  (PP
                    (IN with)
                    (NP
                      (NP (DT the) (NNS provisions))
                      (PP
                        (IN of)
                        (NP (DT the) (JJ present) (NN Convention))))))))))))
    (, ;)
    (-LRB- -LRB-)
    (S
      (INTJ (NN c) (-RRB- -RRB-))
      (VP
        (VP
          (VBG ascertaining)
          (PP
            (IN by)
            (NP
              (NP (DT all) (JJ lawful) (NNS means) (NNS conditions) (CC and) (NNS developments))
              (PP
                (IN in)
                (NP
                  (NP
                    (DT the)
                    (ADJP (JJ commercial) (, ,) (JJ economic) (, ,) (JJ cultural) (CC and) (JJ scientific))
                    (NN life))
                  (PP
                    (IN of)
                    (NP (DT the) (VBG receiving) (NN State))))))))
        (, ,)
        (VP
          (VBG reporting)
          (ADVP (RB thereon))
          (PP
            (IN to)
            (NP
              (NP (DT the) (NN Government))
              (PP
                (IN of)
                (NP (DT the) (VBG sending) (NNP State))))))
        (CC and)
        (VP
          (VBG giving)
          (NP (NN information))
          (PP
            (IN to)
            (NP
              (NP (NNS persons))
              (ADJP (JJ interested)))))))
    (, ;)
    (-LRB- -LRB-)
    (S
      (INTJ (LS d) (-RRB- -RRB-))
      (VP
        (VP
          (VBG issuing)
          (NP
            (NP (NNS passports))
            (CC and)
            (NP (NN travel) (NNS documents)))
          (PP
            (IN to)
            (NP
              (NP
                (NP (NNS nationals))
                (PP
                  (IN of)
                  (NP (DT the) (VBG sending) (NN State))))
              (, ,)
              (CC and)
              (NP (NNS visas))
              (CC or)
              (NP
                (NP (JJ appropriate) (NNS documents))
                (PP
                  (IN to)
                  (NP
                    (NP (NNS persons))
                    (VP
                      (VBG wishing)
                      (S
                        (VP
                          (TO to)
                          (VP
                            (VB travel)
                            (PP
                              (IN to)
                              (NP (DT the) (NN sending) (NNP State))))))))))
              (, ;)
              (-LRB- -LRB-)
              (NP
                (INTJ (NN e) (-RRB- -RRB-))
                (VP
                  (VBG helping)
                  (CC and)
                  (VBG assisting)
                  (NP
                    (NP (NNS nationals))
                    (, ,)
                    (NP
                      (NP (CC both) (NNS individuals) (CC and) (NNS bodies))
                      (ADJP (JJ corporate)))
                    (, ,)
                    (PP
                      (IN of)
                      (NP (DT the) (VBG sending) (NNP State))))))
              (, ;)
              (-LRB- -LRB-)
              (NP (NNP f))
              (-RRB- -RRB-)))
          (S
            (VP
              (VBG acting)
              (PP
                (PP
                  (IN as)
                  (NP
                    (UCP (NN notary) (CC and) (JJ civil))
                    (NN registrar)))
                (CC and)
                (PP
                  (IN in)
                  (NP
                    (NP (NNS capacities))
                    (PP
                      (IN of)
                      (NP (DT a) (JJ similar) (NN kind)))))))))
        (, ,)
        (CC and)
        (VP
          (VBG performing)
          (NP
            (NP (JJ certain) (NNS functions))
            (PP
              (IN of)
              (NP (DT an) (JJ administrative) (NN nature)))))))
    (, ,)
    (S
      (VP
        (VBN provided)
        (SBAR
          (IN that)
          (S
            (NP (EX there))
            (VP
              (VBZ is)
              (NP
                (NP (NN nothing))
                (ADJP
                  (ADJP (JJ contrary))
                  (RB thereto)
                  (PP
                    (IN in)
                    (NP
                      (NP (DT the) (NNS laws) (CC and) (NNS regulations))
                      (PP
                        (IN of)
                        (NP (DT the) (VBG receiving) (NN State))))))))))))
    (, ;)
    (-LRB- -LRB-)
    (S
      (S
        (INTJ (NN g) (-RRB- -RRB-))
        (VP
          (VBG safeguarding)
          (NP
            (NP (DT the) (NNS interests))
            (PP
              (IN of)
              (NP
                (NP (NNS nationals))
                (, ,)
                (NP
                  (NP (CC both) (NNS individuals) (CC and) (NNS bodies))
                  (ADJP (JJ corporate)))
                (, ,)))
            (PP
              (IN of)
              (NP (DT the) (VBG sending) (NNPS States)))
            (PP
              (IN in)
              (NP
                (NP (NNS cases))
                (PP
                  (IN of)
                  (NP
                    (NP (NN succession) (NN mortis) (NN causa))
                    (PP
                      (IN in)
                      (NP
                        (NP (DT the) (NN territory))
                        (PP
                          (IN of)
                          (NP (DT the) (VBG receiving) (NN State)))))))))
            (, ,)
            (PP
              (IN in)
              (NP
                (NP (NN accordance))
                (PP
                  (IN with)
                  (NP
                    (NP (DT the) (NNS laws) (CC and) (NNS regulations))
                    (PP
                      (IN of)
                      (NP (DT the) (VBG receiving) (NN State))))))))))
      (, ;)
      (-LRB- -LRB-)
      (NP
        (INTJ (NN h) (-RRB- -RRB-))
        (NN safeguarding))
      (, ,)
      (PP
        (IN within)
        (NP
          (NP (DT the) (NNS limits))
          (VP
            (VBN imposed)
            (PP
              (IN by)
              (NP
                (NP
                  (NP (DT the) (NNS laws) (CC and) (NNS regulations))
                  (PP
                    (IN of)
                    (NP (DT the) (VBG receiving) (NNP State))))
                (, ,)
                (NP
                  (NP
                    (NP (DT the) (NNS interests))
                    (PP
                      (IN of)
                      (NP
                        (NP (NNS minors))
                        (CC and)
                        (NP
                          (NP (JJ other) (NNS persons))
                          (VP
                            (VBG lacking)
                            (NP (JJ full) (NN capacity)))))))
                  (SBAR
                    (WHNP (WP who))
                    (S
                      (VP
                        (VBP are)
                        (NP
                          (NP
                            (NP (NNS nationals))
                            (PP
                              (IN of)
                              (NP (DT the) (NN sending) (NNP State))))
                          (, ,)
                          (WHADVP
                            (WHADVP (RB particularly))
                            (SBAR
                              (WHADVP (WRB where))
                              (S
                                (NP (DT any) (NN guardianship) (CC or) (NN trusteeship))
                                (VP
                                  (VBZ is)
                                  (VP
                                    (VBN required)
                                    (PP
                                      (IN with)
                                      (NP
                                        (NP (NN respect))
                                        (PP
                                          (IN to)
                                          (NP (JJ such) (NNS persons))))))))))))))))))))
      (, ;)
      (-LRB- -LRB-)
      (LS i)
      (-RRB- -RRB-)
      (ADJP
        (JJ subject)
        (PP
          (IN to)
          (NP
            (NP
              (NP (DT the) (NNS practices) (CC and) (NNS procedures))
              (VP
                (VBG obtaining)
                (PP
                  (IN in)
                  (NP (DT the) (NN receiving) (NN State)))))
            (, ,)
            (VP (VBG representing))
            (CC or)
            (VP
              (VBG arranging)
              (NP
                (NP (JJ appropriate) (NN representation))
                (PP
                  (IN for)
                  (NP
                    (NP (NNS nationals))
                    (PP
                      (IN of)
                      (NP
                        (NP (DT the) (NN sending))
                        (PP
                          (IN State before)
                          (NP
                            (NP (DT the) (NNS tribunals))
                            (CC and)
                            (NP
                              (NP (JJ other) (NNS authorities))
                              (PP
                                (IN of)
                                (NP (DT the) (VBG receiving) (NNP State)))))))))))))))
      (, ,)
      (PP
        (IN for)
        (NP
          (NP (DT the) (NN purpose))
          (PP
            (IN of)
            (S
              (VP
                (VBG obtaining)
                (, ,)
                (PP
                  (IN in)
                  (NP
                    (NP (NN accordance))
                    (PP
                      (IN with)
                      (NP
                        (NP (DT the) (NNS laws) (CC and) (NNS regulations))
                        (PP
                          (IN of)
                          (NP
                            (NP (DT the) (VBG receiving) (NN State) (, ,) (JJ provisional) (NNS measures))
                            (PP
                              (IN for)
                              (NP
                                (NP (DT the) (NN preservation))
                                (PP
                                  (IN of)
                                  (NP
                                    (NP (DT the) (NNS rights) (CC and) (NNS interests))
                                    (PP
                                      (IN of)
                                      (NP
                                        (NP (DT these) (NNS nationals))
                                        (, ,)
                                        (SBAR
                                          (WHADVP (WRB where))
                                          (, ,)
                                          (S
                                            (PP
                                              (IN because)
                                              (IN of)
                                              (NP
                                                (NP (NN absence))
                                                (CC or)
                                                (NP (DT any) (JJ other) (NN reason))))
                                            (, ,)
                                            (NP (JJ such) (NNS nationals))
                                            (VP
                                              (VBP are)
                                              (ADJP
                                                (JJ unable)
                                                (PP
                                                  (IN at)
                                                  (NP
                                                    (NP (DT the) (JJ proper) (NN time))
                                                    (SBAR
                                                      (S
                                                        (VP
                                                          (TO to)
                                                          (VP
                                                            (VB assume)
                                                            (NP
                                                              (NP (DT the) (NN defence))
                                                              (PP
                                                                (IN of)
                                                                (NP (PRP$ their) (NNS rights) (CC and) (NNS interests))))))))))))))))))))))))))))))))
    (, ;)
    (-LRB- -LRB-)
    (S
      (INTJ (NN j) (-RRB- -RRB-))
      (VP
        (VP
          (VBG transmitting)
          (NP
            (ADJP (JJ judicial) (CC and) (JJ extra-judicial))
            (NNS documents)))
        (CC or)
        (VP
          (VBG executing)
          (NP
            (NP (NNS letters))
            (NP (NN rogatory) (CC or) (NNS commissions)))
          (S
            (VP
              (TO to)
              (VP
                (VB take)
                (NP (NN evidence))
                (PP
                  (IN for)
                  (NP
                    (NP (DT the) (NNS courts))
                    (PP
                      (IN of)
                      (NP (DT the) (VBG sending) (NN State)))))
                (PP
                  (IN in)
                  (NP
                    (NP (NN accordance))
                    (PP
                      (IN with)
                      (NP
                        (NP (JJ international) (NNS agreements))
                        (PP
                          (IN in)
                          (NP (NN force)))))))))))))
    (CC or)
    (, ,)
    (PP
      (IN in)
      (NP
        (NP (DT the) (NN absence))
        (PP
          (IN of)
          (NP (JJ such) (JJ international) (NNS agreements)))))
    (, ,)
    (PP
      (PP
        (IN in)
        (NP
          (NP (DT any) (JJ other) (NN manner))
          (ADJP
            (JJ compatible)
            (PP
              (IN with)
              (NP
                (NP
                  (NP
                    (NP (DT the) (NNS laws) (CC and) (NNS regulations))
                    (PP
                      (IN of)
                      (NP (DT the) (VBG receiving) (NN State))))
                  (, ;)
                  (-LRB- -LRB-)
                  (INTJ (LS k))
                  (-RRB- -RRB-))
                (VP
                  (VBG exercising)
                  (NP
                    (NP (NNS rights))
                    (PP
                      (IN of)
                      (NP
                        (NP (NN supervision) (CC and) (NN inspection))
                        (VP
                          (VBN provided)
                          (PP (IN for))
                          (PP
                            (IN in)
                            (NP
                              (NP (DT the) (NNS laws) (CC and) (NNS regulations))
                              (PP
                                (IN of)
                                (NP (DT the) (VBG sending) (NN State)))))
                          (PP
                            (IN in)
                            (NP
                              (NP (NN respect))
                              (PP
                                (IN of)
                                (NP
                                  (NP (NNS vessels))
                                  (VP
                                    (VBG having)
                                    (NP
                                      (NP (DT the) (NN nationality))
                                      (PP
                                        (IN of)
                                        (NP (DT the) (NN sending) (NNP State)))))))))))))))))))
      (, ,)
      (CC and)
      (PP
        (IN of)
        (NP
          (NP (NN aircraft))
          (VP
            (VBN registered)
            (PP
              (IN in)
              (NP (DT that) (NNP State))))))
      (, ,)
      (CC and)
      (PP
        (IN in)
        (NP
          (NP (NN respect))
          (PP
            (IN of)
            (NP (PRP$ their) (NNS crews))))))
    (, ;)
    (-LRB- -LRB-)
    (NN l)
    (-RRB- -RRB-)
    (S
      (VP
        (VBG extending)
        (NP
          (NP
            (NP (NN assistance))
            (PP
              (IN to)
              (NP
                (NP (NNS vessels) (CC and) (NN aircraft))
                (VP
                  (VBN mentioned)
                  (PP
                    (IN in)
                    (NP
                      (NP
                        (NP (NN subparagraph))
                        (-LRB- -LRB-)
                        (NP (NN k))
                        (-RRB- -RRB-))
                      (PP
                        (IN of)
                        (NP (DT this) (NN article)))))))))
          (, ,)
          (CC and)
          (PP
            (IN to)
            (NP (PRP$ their) (NNS crews))))
        (, ,)
        (S
          (VP
            (VP
              (VBG taking)
              (NP
                (NP (NNS statements))
                (PP
                  (VBG regarding)
                  (NP
                    (NP (DT the) (NN voyage))
                    (PP
                      (IN of)
                      (NP (DT a) (NN vessel)))))))
            (, ,)
            (VP
              (VBG examining)
              (CC and)
              (VBG stamping)
              (NP (DT the) (NN ship))
              (NP (POS ’s) (NNS papers)))
            (, ,)
            (CC and)
            (, ,)
            (VP
              (PP
                (IN without)
                (NP
                  (NP (NN prejudice))
                  (PP
                    (IN to)
                    (NP
                      (NP (DT the) (NNS powers))
                      (PP
                        (IN of)
                        (NP
                          (NP (DT the) (NNS authorities))
                          (PP
                            (IN of)
                            (NP (DT the) (VBG receiving) (NNP State)))))))))
              (, ,)
              (S
                (VP
                  (VBG conducting)
                  (NP
                    (NP (NNS investigations))
                    (PP
                      (IN into)
                      (NP
                        (NP (DT any) (NNS incidents))
                        (SBAR
                          (WHNP (WDT which))
                          (S
                            (VP
                              (VP
                                (VBD occurred)
                                (PP
                                  (IN during)
                                  (NP (DT the) (NN voyage))))
                              (, ,)
                              (CC and)
                              (VP
                                (VBG settling)
                                (NP
                                  (NP (NNS disputes))
                                  (PP
                                    (IN of)
                                    (NP (DT any) (NN kind)))
                                  (PP
                                    (IN between)
                                    (NP
                                      (NP (DT the) (NN master))
                                      (, ,)
                                      (NP (DT the) (NNS officers))
                                      (CC and)
                                      (NP (DT the) (NNS seamen)))))
                                (PP
                                  (IN in)
                                  (ADVP
                                    (ADVP (RB so) (RB far))
                                    (SBAR
                                      (IN as)
                                      (S
                                        (NP (DT this))
                                        (VP
                                          (MD may)
                                          (VP
                                            (VB be)
                                            (VP
                                              (VBN authorized)
                                              (PP
                                                (IN by)
                                                (NP
                                                  (NP (DT the) (NNS laws) (CC and) (NNS regulations))
                                                  (PP
                                                    (IN of)
                                                    (NP (DT the) (VBG sending) (NN State))))))))))))))))))))))))))
    (, ;)
    (-LRB- -LRB-)
    (INTJ (NN m))
    (-RRB- -RRB-)
    (S
      (VP
        (VBG performing)
        (NP
          (NP (DT any) (JJ other) (NNS functions))
          (VP
            (VBN entrusted)
            (PP
              (IN to)
              (NP (DT a) (JJ consular) (NN post)))
            (PP
              (IN by)
              (NP
                (NP (DT the) (VBG sending) (NN State))
                (SBAR
                  (WHNP (WDT which))
                  (S
                    (VP
                      (VBP are)
                      (RB not)
                      (VP
                        (VBN prohibited)
                        (PP
                          (IN by)
                          (NP
                            (NP (DT the) (NNS laws) (CC and) (NNS regulations))
                            (PP
                              (IN of)
                              (NP (DT the) (VBG receiving) (NN State)))))))))))))))
    (CC or)
    (SBAR
      (SBAR
        (WHPP
          (IN to)
          (WHNP (WDT which)))
        (S
          (NP
            (NP (DT no) (NN objection)))
          (VP
            (VBZ is)
            (VP
              (VBN taken)
              (PP
                (IN by)
                (NP (DT the) (VBG receiving) (NN State)))))))
      (CC or)
      (SBAR
        (WHNP (WDT which))
        (S
          (VP
            (VBP are)
            (VP
              (VBN referred)
              (PP (IN to))
              (PP
                (IN in)
                (NP
                  (NP (DT the) (JJ international) (NNS agreements))
                  (PP
                    (IN in)
                    (NP (NN force)))
                  (PP
                    (IN between)
                    (NP
                      (NP (DT the) (NN sending) (NNP State))
                      (CC and)
                      (NP (DT the) (NN receiving) (NNP State)))))))))))
    ('' ”)
    (. .)))

I would say it's not attaching the list items at the correct level, FWIW
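
One rough way to check the attachment level, sketched in Python under the assumption that the tree is available as the bracketed string printed above: record the nesting depth of every `-LRB-` node, since each list item starts with one. `label_depths` is a hypothetical helper, not part of Stanza or CoreNLP.

```python
def label_depths(bracketed, wanted):
    """Return (label, depth) for every node whose label is in `wanted`.

    Depth counts enclosing brackets, so the ROOT node has depth 1.
    """
    tokens = bracketed.replace("(", " ( ").replace(")", " ) ").split()
    depth = 0
    found = []
    for i, tok in enumerate(tokens):
        if tok == "(":
            depth += 1
            # The token right after an opening bracket is the node label.
            label = tokens[i + 1] if i + 1 < len(tokens) else ""
            if label in wanted:
                found.append((label, depth))
        elif tok == ")":
            depth -= 1
    return found


# Toy tree: the second list marker is nested one level deeper than the first.
toy = "(ROOT (S (-LRB- -LRB-) (NP (-LRB- -LRB-))))"
print(label_depths(toy, {"-LRB-"}))
```

If the list items were attached consistently, their opening brackets would mostly share one depth. The count also picks up any parenthesis that is not a list marker, such as the one in "subparagraph (k)".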

My Ubuntu command line to run the server is:
java -Xmx128g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9008 -timeout 300000

Actually, it is the Tregex feature that I have trouble with.

What version CoreNLP? What is the output if you run it just on that sentence?

What python interface are you using?

It is stanford-corenlp-4.5.4 on an Ubuntu VM (with 128 GB RAM) hosted on Windows Server 2016.

I have no trouble with shorter paragraphs, but I get this error message for larger text:
HTTPConnectionPool(host='192.168.1.5', port=9008): Read timed out. (read timeout=300)

I tried it on https://corenlp.run/, and got the following error:
image

I will have to try this out later today, then

The problem here is that the PCFG goes completely haywire with text this long. What will work is if you do the following:

  • download the "Extra English" models
  • run the server with the extra command line flag -parse.model edu/stanford/nlp/models/srparser/englishSR.ser.gz
  • you could use edu/stanford/nlp/models/srparser/englishSR.beam.ser.gz instead for more accurate but somewhat slower results

However, if you explain what you're using the Tregex interface for, I can probably find a way for you to connect the more accurate Stanza parser to Tregex instead. (Actually, that does technically already exist if you pass in a list of trees instead of the text, but that still requires the server to be running, and it's possible to run Tregex from the command line directly.)
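
The server-side change from the list above could be scripted roughly as follows. This is only a sketch: it combines the launch command posted earlier in the thread with the suggested `-parse.model` flag, and it assumes the "Extra English" models jar is already on the classpath. `corenlp_server_command` is a hypothetical helper name.

```python
import shlex


def corenlp_server_command(
    port=9008,
    timeout_ms=300000,
    heap="128g",
    parse_model="edu/stanford/nlp/models/srparser/englishSR.ser.gz",
):
    """Build the argument list for launching the CoreNLP server."""
    return [
        "java", f"-Xmx{heap}", "-cp", "*",
        "edu.stanford.nlp.pipeline.StanfordCoreNLPServer",
        "-port", str(port),
        "-timeout", str(timeout_ms),
        # Flag suggested above: use the shift-reduce parser, not the PCFG.
        "-parse.model", parse_model,
    ]


# Print the command as a shell line; it could also be handed to
# subprocess.Popen from the CoreNLP directory.
print(shlex.join(corenlp_server_command()))
```

Switching to the beam model mentioned above would only mean passing `parse_model="edu/stanford/nlp/models/srparser/englishSR.beam.ser.gz"`.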

I will definitely try them out! I have been trying to extract common phrases and collocations from a parallel English-Chinese corpus that contains paragraphs of up to one or two thousand words. The extracted strings will later be used in an alignment process.

I have now tried both edu/stanford/nlp/models/srparser/englishSR.beam.ser.gz and edu/stanford/nlp/models/srparser/chineseSR.beam.ser.gz. Everything appears to work now, and I did not notice any slowness. :-)