This repository holds the rewrite and working implementation of the interface and code for reconstructing Proto-Burmese, as found here:
Xun Gong, & Nathan Hill. (2020). Materials for an Etymological Dictionary of Burmish. Zenodo. https://doi.org/10.5281/zenodo.4311182
To run while developing (with Python 3):
cd server
export FLASK_APP=server
flask run
# opens API on localhost:5000
and then (with Node >=14)
cd cognate-app
npm i
npm run dev
# opens interface on localhost:8080
You must have libfoma0 and libfoma0-dev installed for the API to work.
sudo apt-get install libfoma0 libfoma0-dev
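After installing, a quick way to confirm that the shared library is visible to Python is to ask the standard library's loader lookup for it. This is only a minimal sketch: it assumes the library is registered under the name foma, and the helper name foma_library_path is made up for this example. It does not check that the Python bindings themselves import correctly.

```python
import ctypes.util


def foma_library_path():
    """Return the name/path the system loader reports for libfoma, or None.

    Assumption: the shared library is registered under the name "foma".
    """
    return ctypes.util.find_library("foma")


if __name__ == "__main__":
    path = foma_library_path()
    if path is None:
        print("libfoma not found; install libfoma0 and libfoma0-dev")
    else:
        print(f"found foma library: {path}")
```

If this prints that the library was not found, the API will most likely fail as soon as it tries to compile an FST.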
For much more in-depth instructions, see SETUP.md.
Note that, due to this error, version 0.9.18 of foma can raise uncaught errors during FST compilation, which may surface as a 500 error from the server. Please check which version of the package the above command installs. If the packaged version gives you problems on Linux, I strongly recommend downloading and building the most recent version yourself:
~# wget https://github.com/mhulden/foma/archive/refs/heads/master.zip
~# unzip master.zip
~# cd ./foma-master/foma/
~/foma-master/foma# make <-- make sure to install the libraries listed below
~/foma-master/foma# make install
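To decide whether the packaged version is the one affected, you can compare the version string your package manager reports against 0.9.18. The sketch below is an illustration only: the function names are hypothetical, the sample strings in the comments are made-up examples of Debian-style version strings, and it deliberately compares numeric tuples because a plain string comparison would rank 0.10.0 below 0.9.18.

```python
import re

# The foma version this README warns about.
AFFECTED = (0, 9, 18)


def parse_version(text):
    """Extract the leading dotted number, e.g. '0.9.18+r243-1' -> (0, 9, 18)."""
    # Drop an optional epoch prefix such as "1:" if the package manager adds one.
    text = text.split(":", 1)[-1]
    match = re.match(r"(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def is_affected(text):
    """Return True if the version string refers to foma 0.9.18."""
    return parse_version(text)[:3] == AFFECTED


if __name__ == "__main__":
    for sample in ("0.9.18+r243-1", "0.10.0"):  # hypothetical version strings
        status = "affected" if is_affected(sample) else "ok"
        print(f"{sample}: {status}")
```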
I would love to have this working on operating systems other than Linux distros; however, getting foma to work with the Python bindings is, frankly, a complete mess (especially on Windows). I will say that one can successfully compile foma from source using Cygwin (using the 0.10.0 version on GitHub). Attempting to build the 0.9.18 version (whose source can be found here) with more recent versions of gcc (which is what Cygwin installs by default) will throw errors.
If you attempt to build with Cygwin, note that you will likely need to edit the Makefile to remove the -ltermcap flag (as mentioned here). Also ensure that you install the Devel packages necessary for foma (graphviz, flex, bison, zlib, libncurses-dev, libreadline-dev).
If you would like to run the entire app at once (e.g. on a server or Linux machine) so that you can use it, please install Caddy. Follow their setup instructions, and once the caddy command is available, you can:
- Make sure your Caddyfile is set to the location of the site you are using (i.e. a domain or simply :5000 for personal use). This is the first line of the Caddyfile.
- Start up the server and interface.
    cd server
    gunicorn server:app
    # starts api on :8000
    cd cognate-app
    npm i
    npm run build
    npm run start
    # starts interface on :8080
- Run Caddy to handle proxying api requests.
    caddy adapt
    caddy run
    # opens interface at specified port
- Visit the location you specified to use CAPR.
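Before running Caddy, it can help to confirm that both processes are actually accepting connections. The following is a rough sketch, not part of the repository: the helper names are invented for illustration, and the ports (8000 for the API, 8080 for the interface) are taken from the comments in the commands above, so adjust them if your setup differs.

```python
import socket

# Ports as given in the start-up commands: gunicorn on 8000, interface on 8080.
SERVICES = {"api": 8000, "interface": 8080}


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host="127.0.0.1"):
    """Map each service name to whether its port accepts connections."""
    return {name: is_listening(host, port) for name, port in SERVICES.items()}


if __name__ == "__main__":
    for name, up in check_services().items():
        print(f"{name}: {'up' if up else 'not reachable'}")
```

This only checks that something is listening on each port; it does not verify that the API routes respond correctly.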
When writing FSTs in the editor, you must write them for the languages below (unless you go ahead and change the code). Each language has a corresponding .bin name that must be assigned at the end of that language's section of the transducer, e.g. save stack lashi.bin.
Language Name | .bin Name |
---|---|
Old_Burmese | burmese.bin |
Achang_Longchuan | ngochang.bin |
Xiandao | xiandao.bin |
Maru | maru.bin |
Bola | bola.bin |
Atsi | atsi.bin |
Lashi | lashi.bin |
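If you want to sanity-check a transducer script before pasting it into the editor, something like the sketch below can list which expected .bin names are never saved. The mapping is copied from the table above; the function names and the sample script are hypothetical, and the check is a plain text match on lines of the form save stack NAME, not a real foma parse.

```python
# Language name -> .bin name, copied from the table above.
BIN_NAMES = {
    "Old_Burmese": "burmese.bin",
    "Achang_Longchuan": "ngochang.bin",
    "Xiandao": "xiandao.bin",
    "Maru": "maru.bin",
    "Bola": "bola.bin",
    "Atsi": "atsi.bin",
    "Lashi": "lashi.bin",
}


def saved_bins(script):
    """Collect the file names from lines of the form 'save stack NAME'."""
    names = set()
    for line in script.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[:2] == ["save", "stack"]:
            names.add(parts[2])
    return names


def missing_bins(script):
    """Return the expected .bin names that the script never saves, sorted."""
    return sorted(set(BIN_NAMES.values()) - saved_bins(script))


if __name__ == "__main__":
    # Hypothetical script that only covers Lashi.
    example = "define Rule a -> b ;\nregex Rule ;\nsave stack lashi.bin\n"
    print("missing:", missing_bins(example))
```

An empty result means every language in the table has a matching save stack line.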
.
├── cognate-app/
│ └── [svelte code for cognate reassignment and fst editor]
├── orthoprofiles/
│ └── [orthographical profiles for pipeline stages]
├── pipeline/
│ └── [wordlist to tokenized lexicon, run through lexstat to find initial cognates]
├── reconstruct/
│ └── [initial fsts for pipeline usage]
└── server/
  └── [all api routes and associated functions]
You can read more about each individual folder in their respective READMEs.
List, J.-M. and R. Forkel (2022): LingRex: Linguistic Reconstruction with LingPy. [Computer software, Version 1.2.0]. Geneva: Zenodo. DOI: 10.5281/zenodo.1544943
List, J.-M. and R. Forkel (2021): LingPy. A Python library for quantitative tasks in historical linguistics. Version 2.6.9. Max Planck Institute for Evolutionary Anthropology: Leipzig. https://lingpy.org
Hulden, M. (2009). Foma: a finite-state compiler and library. In Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics (pp. 29–32).