Parsing arbitrary numbers
This seems silly, but the hardest thing I've come across so far is how to parse arbitrary numbers in GF!
For example:
"cats have 4 legs".
"there are twenty people in the room"
I've found Numerals.gf in the lib, but for the life of me I can't figure out how to use it. It compiles fine but then fails to parse any numbers.
hello @robclouth
which module are you importing?
> i present/LangEng.gfo
Languages: LangEng
Lang> p "cats have 4 dogs"
PhrUtt NoPConj (UttS (UseCl (TTAnt TPres ASimul) PPos (PredVP (DetCN (DetQuant IndefArt NumPl) (UseN cat_N)) (ComplSlash (SlashV2a have_V2) (DetCN (DetQuant IndefArt (NumCard (NumDigits (IDig D_4)))) (UseN dog_N)))))) NoVoc
Lang> p "there are twenty cats in the roof"
PhrUtt NoPConj (UttS (UseCl (TTAnt TPres ASimul) PPos (ExistNP (AdvNP (DetCN (DetQuant IndefArt (NumCard (NumNumeral (num (pot2as3 (pot1as2 (pot1 n2))))))) (UseN cat_N)) (PrepNP in_Prep (DetCN (DetQuant DefArt NumSg) (UseN roof_N))))))) NoVoc
PhrUtt NoPConj (UttS (UseCl (TTAnt TPres ASimul) PPos (ExistNP (DetCN (DetQuant IndefArt (NumCard (NumNumeral (num (pot2as3 (pot1as2 (pot1 n2))))))) (AdvCN (UseN cat_N) (PrepNP in_Prep (DetCN (DetQuant DefArt NumSg) (UseN roof_N)))))))) NoVoc
PhrUtt NoPConj (UttS (UseCl (TTAnt TPres ASimul) PPos (ExistNPAdv (DetCN (DetQuant IndefArt (NumCard (NumNumeral (num (pot2as3 (pot1as2 (pot1 n2))))))) (UseN cat_N)) (PrepNP in_Prep (DetCN (DetQuant DefArt NumSg) (UseN roof_N)))))) NoVoc
(I've changed your sentences slightly just so they fit the GF test lexicon).
Came here to say exactly what @odanoburu said :) It works out of the box in the Lang module.
Just a small addition: if you parse multi-digit numerals in the normal GF shell (i.e. no C runtime, not through an external program that uses e.g. Python/Java/... bindings), then you need to insert the bind token &+ between the digits. Here's an example:
Lang> p -cat=NP "4 dogs"
DetCN (DetQuant IndefArt (NumCard (NumDigits (IDig D_4)))) (UseN dog_N)
Lang> p -cat=NP "40 dogs"
The parser failed at token 1: "40"
Lang> p -cat=NP "4 &+ 0 dogs"
DetCN (DetQuant IndefArt (NumCard (NumDigits (IIDig D_4 (IDig D_0))))) (UseN dog_N)
Lang> p -cat=NP "four hundred and thirty &+ - &+ three dogs"
DetCN (DetQuant IndefArt (NumCard (NumNumeral (num (pot2as3 (pot2plus (pot0 n4) (pot1plus n3 (pot0 n3)))))))) (UseN dog_N)
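If you'd rather not type the bind tokens by hand, you could preprocess the input before handing it to the shell. Here is a minimal sketch in Python; the function name `insert_bind_tokens` is made up for illustration and is not part of GF, and it only covers the two patterns from the session above (multi-digit numbers and hyphenated number words).

```python
import re

BIND = "&+"  # GF's bind token, as typed in the shell session above


def insert_bind_tokens(sentence: str) -> str:
    """Hypothetical helper: rewrite tokens so the plain GF shell can parse them."""
    out = []
    for token in sentence.split():
        if re.fullmatch(r"[0-9]{2,}", token):
            # Multi-digit number: "40" -> "4 &+ 0"
            out.append(f" {BIND} ".join(token))
        elif re.fullmatch(r"[A-Za-z]+(-[A-Za-z]+)+", token):
            # Hyphenated word: "thirty-three" -> "thirty &+ - &+ three"
            out.append(token.replace("-", f" {BIND} - {BIND} "))
        else:
            # Everything else (including single digits) is left untouched
            out.append(token)
    return " ".join(out)


print(insert_bind_tokens("40 dogs"))
print(insert_bind_tokens("four hundred and thirty-three dogs"))
```

The resulting strings are what you would then pass to `p -cat=NP "..."`. Note that this rewrites every hyphenated word, not just numerals, so it may need narrowing for a real lexicon.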
If you linearise such a tree, you can use the -bind flag as an argument for l:
Lang> l DetCN (DetQuant IndefArt (NumCard (NumDigits (IIDig D_4 (IIDig D_0 (IDig D_0)))))) (UseN dog_N)
4 &+ 0 &+ 0 dogs
Lang> l -bind DetCN (DetQuant IndefArt (NumCard (NumDigits (IIDig D_4 (IIDig D_0 (IDig D_0)))))) (UseN dog_N)
400 dogs
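If you end up with an unbound linearisation string outside the shell (say, copied from output produced without -bind), the same gluing can be approximated in a few lines. This is only a sketch that mimics the result shown above, not GF's own implementation; `apply_bind` is a hypothetical name.

```python
import re


def apply_bind(linearisation: str) -> str:
    """Hypothetical helper: glue together the tokens on either side of each "&+"."""
    # Drop every bind token along with the whitespace surrounding it
    return re.sub(r"\s*&\+\s*", "", linearisation)


print(apply_bind("4 &+ 0 &+ 0 dogs"))
```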
Ahhh! Thanks so much. I've been using AllEng.gf. I hadn't encountered LangEng.gf.