GF-exttools

GF resource grammars, using an external morphological analyser and CG. The directory hungarian contains a working mini resource grammar and instructions on how to use it. (Finnish is just fragments of code I wrote in 2015; it doesn't compile.)

Why?

This is mostly a showcase of sharing resources. Instead of writing a morphology in GF, we can reuse an existing finite-state morphology and concentrate only on the syntax.

A traditional GF grammar would store an inflection table in the entries for nouns and verbs, and the function for building sentences would pick the right forms for the subject, verb and object. See the example below:

lincat S  = Str ;
lincat NP = { s : Case => Str ; agr : Agr } ;
lincat V2 = { s : Agr => Str ; compl : Case } ;

fun MakeSentence : NP -> V2 -> NP -> S ;
lin MakeSentence subj verb obj = subj.s ! Nom
                              ++ verb.s ! subj.agr
                              ++ obj.s ! verb.compl ;

The function takes a subject, a verb and an object, and produces a sentence. We can assume that NP contains a large inflection table covering all combinations of number and case. The verb likewise contains the whole conjugation, including tense, mood, person, aspect, and even non-finite forms (e.g. singing).
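To make the table-based layout concrete, here is a minimal sketch of what such entries could look like. It is a hypothetical toy resource module, not code from this repository: the parameter types, the English forms and the names TableSketch, i_NP and love_V2 are all assumptions chosen for illustration, and only the record types mirror the lincats above.

```gf
-- Hypothetical toy module for illustration; not part of this repository.
resource TableSketch = {

  param
    Number = Sg | Pl ;
    Case   = Nom | Acc ;
    Person = P1 | P2 | P3 ;
    Agr    = Ag Person Number ;

  oper
    -- Same shapes as the lincats for NP and V2 above.
    NP : Type = { s : Case => Str ; agr : Agr } ;
    V2 : Type = { s : Agr => Str ; compl : Case } ;

    -- Every case form is spelled out and stored in the entry.
    i_NP : NP = {
      s   = table { Nom => "I" ; Acc => "me" } ;
      agr = Ag P1 Sg
    } ;

    -- Every agreement form is stored as well.
    love_V2 : V2 = {
      s     = table { Ag P3 Sg => "loves" ; _ => "love" } ;
      compl = Acc
    } ;
}
```

With only two cases and six agreement values the tables stay small. In a language with many cases, and with tense and mood added to the verb table, each entry stores correspondingly more forms.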

Storing such a lexicon makes a grammar big and slow. Writing the morphology in GF is usually the first step in starting a resource grammar, and a new grammar writer can easily spend weeks studying old grammar books. Even if the language already has a finite-state morphology, its representation is very different from a GF morphology and hardly serves as inspiration. Another option is to generate the forms for the GF lexicon with an existing morphological analyser. This frees the grammar writer from spending time on elegant morphological rules, but does not solve the problem of grammar blowup.

In contrast, here is a version of the MakeSentence function that uses existing tags. This time the entries for nouns and verbs do not contain tables, but tags, represented as simple strings.

lincat S  = Str ;
lincat NP = { s : Str ; agr : Str } ;
lincat V2 = { s : Str ; compl : Str } ;

fun MakeSentence : NP -> V2 -> NP -> S ;
lin MakeSentence subj verb obj = glue subj.s "<nom>"
                              ++ glue verb.s subj.agr
                              ++ glue obj.s verb.compl ;
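For comparison, a tag-based entry could be sketched as follows. This is again a hypothetical toy resource module: the lemmas, the tag strings and the names TagSketch, i_NP, love_V2 and mkS are assumptions for illustration and are not taken from the Hungarian grammar in this repository. It assumes that glue comes from the GF Prelude, where it joins two strings with a BIND token.

```gf
-- Hypothetical toy module for illustration; lemmas and tags are assumptions.
resource TagSketch = open Prelude in {

  oper
    -- Same shapes as the tag-based lincats above: plain strings, no tables.
    NP : Type = { s : Str ; agr : Str } ;
    V2 : Type = { s : Str ; compl : Str } ;

    -- An entry holds a lemma with its fixed tags, plus the tags it contributes.
    i_NP    : NP = { s = "I<prn>" ; agr = "<p1><sg>" } ;
    love_V2 : V2 = { s = "love<vblex><pres>" ; compl = "<acc>" } ;

    -- Same logic as MakeSentence: attach case and agreement tags to the lemmas.
    mkS : NP -> V2 -> NP -> Str = \subj,verb,obj ->
         glue subj.s "<nom>"
      ++ glue verb.s subj.agr
      ++ glue obj.s verb.compl ;
}
```

To try it in the GF shell, one option is to load the module with `i -retain TagSketch.gf` and evaluate `cc mkS i_NP love_V2 i_NP`. Each entry is a constant-size record regardless of how many forms the word has, which is the point of the approach.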

An example of a linearisation for Hungarian:

Miniresource> gr -cat=S | l -bind
a<det><def> ház<n><sg><nom> nem<adv> szeret<vblex><past><p3><sg> én<prn><pers><p1><mf><sg><acc>