dlwh/puck

Training/Building a grammar for a language other than English

kk00ss opened this issue · 5 comments

Considering that:
"We have provided the cascade of grammars used in the Berkeley Parser for English."
Is there a way to obtain grammars for other languages for which a grammar has already been created?
I've downloaded the Berkeley Parser grammars; is there a list of grammars available for Puck?
Thanks

dlwh commented

There's only English right now.

JimSw2016 commented

Can we use, for example, the German grammar (ger_sm5.gr) provided by the Berkeley Parser with Puck, after converting it to text format?

It seems to work, except that these files are missing:

num.binary
num.unary
numstates
unary

Could you provide instructions on how to create these files from the converted text files?

dlwh commented

I think it won't work all that well on unknown/rare words because of the way I did the lexicon, but otherwise it should. If it basically works, I can help with getting the lexicon patched in.

Hi David,
I have compared the extracted text files with Puck's files:
(a) ger_sm5.grammar -> same format as Puck's wsj_2.gr.binary
(b) ger_sm5.lexicon -> same format as Puck's wsj_2.gr.lexicon
(c) ger_sm5.splits -> same format as Puck's wsj_2.gr.hierarchy
(d) ger_sm5.words -> same format as Puck's wsj_2.gr.words

If you have time, please consider creating the missing files:
num.binary
num.unary
numstates
unary

If this process works, it will open up new possibilities for the Berkeley Parser community through the GPU parsing that you pioneered.
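
For anyone trying the same thing, the file correspondence reported in this thread can be sketched as a small Python script. This is only an illustration and not part of Puck or the Berkeley Parser. The function name `stage_grammar`, the `ger_sm5.gr` output prefix, and the idea that the four missing files would carry the same prefix are all assumptions, modelled on the `wsj_2.gr.*` names mentioned above. The script only copies and renames files; it does not know how to generate the missing ones.

```python
import shutil
from pathlib import Path

# Correspondence reported in this thread:
# suffix of the converted Berkeley text file -> suffix of the Puck file.
BERKELEY_TO_PUCK = {
    "grammar": "binary",
    "lexicon": "lexicon",
    "splits": "hierarchy",
    "words": "words",
}

# Files reported in this thread as having no Berkeley counterpart.
STILL_MISSING = ["num.binary", "num.unary", "numstates", "unary"]


def stage_grammar(src_dir, dst_dir, berkeley_name="ger_sm5", puck_name="ger_sm5.gr"):
    """Copy converted Berkeley text files to Puck-style names.

    Hypothetical helper. Returns (copied, missing), two lists of
    destination file names.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)

    copied = []
    for berkeley_suffix, puck_suffix in BERKELEY_TO_PUCK.items():
        source = src / f"{berkeley_name}.{berkeley_suffix}"
        if source.is_file():
            target = dst / f"{puck_name}.{puck_suffix}"
            shutil.copyfile(source, target)
            copied.append(target.name)

    # Assumption: the missing files would use the same "<name>.gr." prefix.
    missing = [
        f"{puck_name}.{suffix}"
        for suffix in STILL_MISSING
        if not (dst / f"{puck_name}.{suffix}").exists()
    ]
    return copied, missing
```

Calling `stage_grammar("berkeley_text", "puck_grammar")` on a directory holding the four converted files would copy them under Puck-style names and return the list of files that still have to be produced some other way.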

Did anything happen here? Does it work for German now?