GrammaticalFramework/GF

A command for asserting linearizations


As I'm making changes to the Turkish resource grammar, I'm starting to worry that they are breaking things I implemented earlier.

When I consulted @inariksit, she told me that the current practice is to use resource.gfs and keep a “gold standard” against which its linearizations are diffed.

I think it would be much better to have a built-in solution for this, such as an assertLin command that takes an abstract expression, a concrete grammar name, and an expected output, and checks that the linearization equals the expected output. Then we could have a -test flag for gf so that, when we run gf -run on tests.gfs, it prints the results of all the assertLins in tests.gfs nicely and tells us whether all tests have succeeded.
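To make the proposal concrete, here is a minimal Python sketch of the check that an assertLin command would perform, written as an external script rather than as part of GF. Everything in it is an assumption for illustration: the names `LinCase`, `gf_linearizer`, `run_cases` and `report` are hypothetical, and the wrapper assumes that `gf --run <pgf>` reads commands from stdin and prints only the linearization to stdout.

```python
import subprocess
from typing import Callable, Iterable, List, NamedTuple, Tuple


class LinCase(NamedTuple):
    tree: str      # abstract syntax expression
    lang: str      # concrete grammar name
    expected: str  # expected linearization


# A linearizer maps (tree, concrete grammar name) to a string.
Linearizer = Callable[[str, str], str]


def gf_linearizer(pgf_path: str) -> Linearizer:
    """Build a linearizer that pipes one `linearize` command into the gf shell.

    Assumption: `gf --run <pgf>` reads commands from stdin and writes only
    the linearization to stdout.
    """
    def linearize(tree: str, lang: str) -> str:
        result = subprocess.run(
            ["gf", "--run", pgf_path],
            input=f"linearize -lang={lang} {tree}\n",
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()
    return linearize


def run_cases(cases: Iterable[LinCase],
              linearize: Linearizer) -> List[Tuple[LinCase, str]]:
    """Return every case whose actual linearization differs from the expected one."""
    failures = []
    for case in cases:
        actual = linearize(case.tree, case.lang)
        if actual != case.expected:
            failures.append((case, actual))
    return failures


def report(total: int, failures: List[Tuple[LinCase, str]]) -> str:
    """Format one block per failing case, followed by a pass count."""
    lines = [
        f"FAIL {case.lang}: {case.tree}\n"
        f"  expected: {case.expected}\n"
        f"  actual:   {actual}"
        for case, actual in failures
    ]
    lines.append(f"{total - len(failures)}/{total} tests passed")
    return "\n".join(lines)
```

With a compiled grammar this would be called as `run_cases(cases, gf_linearizer("Grammar.pgf"))`, where the file name is a placeholder. Passing the linearizer in as a function keeps the comparison logic testable without a GF installation.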

Though this doesn't have substantial benefits compared to file-diffing, it could be one step towards a unified practice for testing resource grammars, which seems to be nonexistent at this point.

If I'm not missing anything, this should also be easy to implement.

Thoughts or comments? @inariksit @krangelov

Hi @ayberkt,

there is a new way to test the new and old versions of the same grammar against each other, and it only outputs the linearisations that differ: you can see the documentation here.

I've been using it myself, and it has saved me from a lot of mistakes. For instance, I was fixing German agreement of reflexives, which should only affect some functions in VerbGer. While doing that, I added a field to VPSlash, and to save typing the new field everywhere that VPSlash is used, I modified some opers in ResGer. Before starting my changes, I had compiled a PGF of the German grammar, which I compared to the new version, and found out that my changes to the opers in ResGer broke e.g. CleftNP in IdiomGer. I would never have thought to check something like that on my own! :-P
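The core idea of that workflow, comparing what the old and new grammar produce for the same trees and showing only the differences, can be sketched in a few lines of Python. This is not the tool from the documentation, just an illustration under assumptions: `changed_linearizations` is a hypothetical name, and each grammar version is assumed to have already been reduced to a mapping from tree to linearization.

```python
from typing import Dict, Optional, Tuple


def changed_linearizations(
    old: Dict[str, str],
    new: Dict[str, str],
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Map each tree whose output changed to its (old, new) linearizations.

    A tree that exists in only one version is reported with None on the
    other side, so added and removed functions show up as well.
    """
    changed = {}
    for tree in sorted(old.keys() | new.keys()):
        before, after = old.get(tree), new.get(tree)
        if before != after:
            changed[tree] = (before, after)
    return changed
```

Trees whose linearization is identical in both versions are left out of the result, so an edit that was meant to be local produces a short report, and any unexpected entry points at a side effect.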

Of course, your suggestion is still relevant too; I can see why people would like to specify their own set of test cases that should work, in addition to the automatically generated test cases. I'll keep this on a vague TODO-sometime-in-the-future list; I'm now much more comfortable with GF internals, so it feels like a feasible task. Naturally, if someone wants to take it as a TODO-right-now item, I won't object!