How to easily get VP/NP/PP phrases?
bayesrule opened this issue · 3 comments
bayesrule commented
Hi,
I'm new to this great tool. Beyond parsing itself, is there a convenient way to extract useful elements (e.g. the noun phrases in the input) from the parsing result?
yzhangcs commented
@bayesrule Hi, the results predicted by the parser are stored as nltk.Tree objects, so you can access the nonterminals directly via the nltk API:
>>> import supar
>>> from supar import Parser
>>> par = Parser.load('crf-con-en')
>>> tree = par.predict("She enjoys playing tennis .".split())[0].trees
>>> tree.productions()
[TOP -> S, S -> NP VP _, NP -> _, _ -> 'She', VP -> _ S, _ -> 'enjoys', S -> VP, VP -> _ NP, _ -> 'playing', NP -> _, _ -> 'tennis', _ -> '.']
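To pull out only the NP/VP/PP phrases asked about, one option is to filter the subtrees by label. The snippet below is a minimal sketch using only nltk: the tree is a hand-written bracketing that mirrors the productions above (it stands in for the parser output, so no model is needed), and `phrases` is a hypothetical helper name, not part of supar or nltk.

```python
from nltk import Tree

# Hand-written stand-in for the predicted tree; it mirrors the productions
# shown above, with "_" as the placeholder preterminal label.
tree = Tree.fromstring(
    "(TOP (S (NP (_ She)) (VP (_ enjoys) (S (VP (_ playing) (NP (_ tennis))))) (_ .)))"
)


def phrases(tree, labels=("NP", "VP", "PP")):
    """Return (label, text) pairs for every subtree whose label is in `labels`.

    Hypothetical helper for illustration; subtrees are visited in pre-order.
    """
    wanted = set(labels)
    return [
        (sub.label(), " ".join(sub.leaves()))
        for sub in tree.subtrees(lambda t: t.label() in wanted)
    ]


for label, text in phrases(tree):
    print(label, "->", text)
```

Nested phrases are all reported, so "playing tennis" shows up as its own VP in addition to the enclosing "enjoys playing tennis". This sentence has no PP, so none is returned.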
Besides, supar also provides some utility functions for factorizing a tree into labeled spans:
>>> supar.utils.Tree.factorize(tree)
[(0, 5, 'TOP'), (0, 5, 'S'), (0, 1, 'NP'), (1, 4, 'VP'), (2, 4, 'S'), (2, 4, 'VP'), (3, 4, 'NP')]
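The spans above can be mapped back to text by slicing the input words. This is a small sketch that assumes each tuple is `(start, end, label)` with an exclusive end index, which is what the `(0, 5, 'TOP')` span over a five-token sentence suggests; `spans_to_phrases` is a hypothetical helper, not a supar function.

```python
# Input tokens and the factorized spans copied from the output above.
words = "She enjoys playing tennis .".split()
spans = [(0, 5, 'TOP'), (0, 5, 'S'), (0, 1, 'NP'), (1, 4, 'VP'),
         (2, 4, 'S'), (2, 4, 'VP'), (3, 4, 'NP')]


def spans_to_phrases(words, spans, labels=("NP", "VP", "PP")):
    """Keep spans whose label is in `labels` and join the covered words.

    Assumes half-open (start, end) indices into `words`.
    """
    wanted = set(labels)
    return [(label, " ".join(words[i:j])) for i, j, label in spans if label in wanted]


for label, text in spans_to_phrases(words, spans):
    print(label, "->", text)
```

Working from spans rather than the tree is handy when you also need token offsets, for example to align phrases with other annotations.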
github-actions commented
This issue is stale because it has been open for 30 days with no activity.
github-actions commented
This issue was closed because it has been inactive for 7 days since being marked as stale.