A Clojure library for generating and parsing expressions from grammars and lexicons.

Menard

A Clojure library for generation and parsing of natural language expressions.

Acknowledgements

All errors in the code are my own, not those of anyone cited here.

HPSG

This library is based on a linguistic theory called HPSG (Head-Driven Phrase Structure Grammar). More about HPSG:

[Fehringer 1999]

Dutch grammar and lexicon based on Carole Fehringer, "A Reference Grammar of Dutch", Cambridge University Press, 1999, referred to here as "F. <section>" or "F. pp. <pages>".

[Oosterhoff 2009]

Dutch grammar and lexicon based on Janneke Oosterhoff, "Basic Dutch: A Grammar and Workbook", 2009, referred to here as "O. Unit <unit> Note <note>".

Verbix

Uses verb conjugations from Verbix: http://www.verbix.com

Demo

The demo generates a Dutch sentence and a semantically equivalent* English sentence. First, a Dutch sentence is generated for each specification listed in expressions.edn; then the semantics of that Dutch sentence are used to generate an English sentence with the same* semantics.

* Approximately the same semantics, modulo various bugs and misunderstandings on my part.

$ ./src/scripts/demo.sh

The output will look like this example, although you'll get your own uniquely generated set of sentences.

License

Copyright © 2018 Eugene Koontz

Distributed under the Eclipse Public License, the same as Clojure. Please see the epl-v10.html file at the top level of this repo.

Name

The story Pierre Menard, autor del Quijote, by Jorge Luis Borges, tells of a writer, Pierre Menard, who attempted to rewrite Cervantes' Don Quixote, not by copying it letter by letter from the original text, but by somehow reproducing the mental and experiential state of Cervantes that was necessary to write it, and then writing it de novo from that state. I see this as reminiscent of this project's approach: it encodes semantic representations, and then uses grammar rules and lexical entries to generate natural language expressions from those representations.

Initially, I named this project babel, after another of Borges' stories, The Library of Babel, but that name was already taken by a well-known JavaScript tool. I later rewrote and simplified the ideas from my 'babel' project in a new GitHub repository, calling it 'babylon', before discovering that this name also belonged to a well-known JavaScript project. I arrived at the current name by renaming the 'babylon' repository to 'menard'.

I've found that there is another project called pierre-menard, which as it happens is also written in Clojure. It is now archived and its last commit was in 2012, so I think it is safe to reappropriate this name for my own project now, in 2020.

Inspirations

Besides the linguistics and Borges references mentioned above, I am including some links below that I find interesting and related to the ideas I am exploring here.