yzhangcs/parser

How to easily get VP/NP/PP phrase?

bayesrule opened this issue · 3 comments

Hi,

I'm new to this great tool. Besides parsing itself, is there a convenient way to extract useful elements (e.g. the noun phrases in the input) from the parsing result?

@bayesrule Hi, the results predicted by the parser are stored as nltk.Tree objects,
so you can access the nonterminals directly through the nltk API:

>>> from supar import Parser
>>> par = Parser.load('crf-con-en')
>>> tree = par.predict("She enjoys playing tennis .".split())[0].trees
>>> tree.productions()
[TOP -> S, S -> NP VP _, NP -> _, _ -> 'She', VP -> _ S, _ -> 'enjoys', S -> VP, VP -> _ NP, _ -> 'playing', NP -> _, _ -> 'tennis', _ -> '.']
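To pull out specific phrase types such as NP/VP/PP, one option is to filter the subtrees by label. The following is only a minimal sketch: the tree is built by hand with `Tree.fromstring` to mirror the productions above (so it runs without loading a model), and the helper name `phrases` is made up for illustration rather than being part of supar or nltk.

```python
from nltk import Tree

# Hand-written bracketing that mirrors the productions shown above.
# With a real parser, the tree would come from the prediction instead.
tree = Tree.fromstring(
    "(TOP (S (NP (_ She)) (VP (_ enjoys) (S (VP (_ playing) (NP (_ tennis))))) (_ .)))"
)


def phrases(tree, labels):
    """Return (label, text) pairs for every subtree whose label is in `labels`.

    Hypothetical helper: it relies only on nltk's Tree.subtrees, Tree.label
    and Tree.leaves.
    """
    return [
        (sub.label(), " ".join(sub.leaves()))
        for sub in tree.subtrees(lambda t: t.label() in labels)
    ]


found = phrases(tree, {"NP", "VP", "PP"})
print(found)
```

Since `subtrees` walks the tree top-down, nested phrases (here the inner VP and NP) are returned after the phrases that contain them.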

In addition, supar provides a utility for factorizing a tree into labeled spans:

>>> import supar
>>> supar.utils.Tree.factorize(tree)
[(0, 5, 'TOP'), (0, 5, 'S'), (0, 1, 'NP'), (1, 4, 'VP'), (2, 4, 'S'), (2, 4, 'VP'), (3, 4, 'NP')]
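The factorized spans can also be mapped back to text with plain Python. This sketch assumes each triple is `(start, end, label)` with half-open token offsets, which is consistent with the output above (`(0, 5, 'TOP')` covers all five tokens) but is not confirmed here by the supar documentation; the helper name `phrases_from_spans` is hypothetical.

```python
def phrases_from_spans(tokens, spans, labels):
    """Return (label, text) pairs for spans whose label is in `labels`.

    Assumes each span is (start, end, label) with `end` exclusive.
    """
    return [
        (label, " ".join(tokens[start:end]))
        for start, end, label in spans
        if label in labels
    ]


tokens = "She enjoys playing tennis .".split()
# Spans copied from the factorize output shown above.
spans = [(0, 5, 'TOP'), (0, 5, 'S'), (0, 1, 'NP'), (1, 4, 'VP'),
         (2, 4, 'S'), (2, 4, 'VP'), (3, 4, 'NP')]

result = phrases_from_spans(tokens, spans, {"NP", "VP", "PP"})
print(result)
```

Working from spans keeps the token offsets available, which may be handy if the phrases need to be aligned back to the input.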

This issue is stale because it has been open for 30 days with no activity.

This issue was closed because it has been inactive for 7 days since being marked as stale.