Various corpora that I have compiled.
Māori-language corpus. There are variants with and without tohu tō (macrons).
Compiled by:
- Parsing mi.wikipedia.org pages
Tools:
- Wikipedia: https://pypi.python.org/pypi/wikipedia/
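
The steps above can be sketched roughly as follows. This is a minimal illustration, not the exact script used to build the corpus: the helper names `strip_macrons` and `fetch_page_text` are hypothetical, and the macron-free variant is assumed to be derived by removing the combining macron after Unicode decomposition.

```python
import unicodedata

COMBINING_MACRON = "\u0304"


def strip_macrons(text):
    """Return text with macrons removed, e.g. for the macron-free variant."""
    # NFD splits a macronised vowel into base letter + combining macron.
    decomposed = unicodedata.normalize("NFD", text)
    without_macrons = decomposed.replace(COMBINING_MACRON, "")
    # Recompose so any other accented characters stay in their usual form.
    return unicodedata.normalize("NFC", without_macrons)


def fetch_page_text(title):
    """Fetch the plain-text content of one mi.wikipedia.org page (needs network)."""
    import wikipedia  # third-party; imported here so strip_macrons works without it

    wikipedia.set_lang("mi")  # point the library at the Māori-language Wikipedia
    return wikipedia.page(title).content
```

For example, `strip_macrons(fetch_page_text(title))` would give the macron-free text of a single page, where `title` is any page title that exists on mi.wikipedia.org.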
Morphemes and the meanings they contribute to words.
Compiled by:
- Scraping http://www.cognatarium.com
Tools:
- Requests: http://docs.python-requests.org/en/latest/
- lxml: http://lxml.de/
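
A scrape of this kind might look like the sketch below. The page layout is an assumption made purely for illustration (one table row per morpheme, with the morpheme in the first cell and its meaning in the second); the real site's markup may differ, and the function names `parse_morphemes` and `fetch_morphemes` are hypothetical.

```python
import lxml.html


def parse_morphemes(html_text):
    """Extract (morpheme, meaning) pairs from HTML.

    Assumes a hypothetical layout: one <tr> per morpheme, with the
    morpheme in the first <td> and its meaning in the second.
    """
    tree = lxml.html.fromstring(html_text)
    pairs = []
    for row in tree.xpath("//tr"):
        cells = [cell.text_content().strip() for cell in row.xpath("./td")]
        # Skip header rows (which use <th>) and rows without both cells.
        if len(cells) >= 2 and cells[0]:
            pairs.append((cells[0], cells[1]))
    return pairs


def fetch_morphemes(url):
    """Download a page and parse it (needs network access)."""
    import requests  # imported here so parse_morphemes can be used offline

    response = requests.get(url, timeout=10)
    response.raise_for_status()  # fail loudly on HTTP errors
    return parse_morphemes(response.text)
```

Keeping the parsing separate from the download makes it possible to check the extraction logic against a saved copy of a page before running it against the live site.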