A collection of Natural Language Processing resources for the Baltic languages (Latvian, Lithuanian and Estonian)
- fastText language identification, covers Estonian (et), Lithuanian (lt) and Latvian (lv)
- fastText pre-trained word vectors (1): bin, text. The word vectors were trained on Common Crawl and Wikipedia using fastText. See the documentation at fasttext.cc
- fastText pre-trained word vectors (2): bin+text, text. The word vectors were trained on Wikipedia using fastText. See the documentation at fasttext.cc
- Also available for the Latgalian language: bin+text, text
- Polyglot Latvian word embeddings (scroll down in the table) polyglot embeddings
- AI Lab at the University of Latvia
- Dr. Comp. Sc. Inguna Skadiņa: Publications, CV
- ALKSNIS dependency treebank The Lithuanian dependency treebank ALKSNIS v3.0 (Vytautas Magnus University).
- spaCy Lithuanian multi-task CNN trained on UD Lithuanian ALKSNIS and the TokenMill.lt news corpus. Assigns context-specific token vectors, POS tags, dependency parses and named entities. Three different models and the label scheme are included in the documentation.
- fastText pre-trained word vectors (1): bin, text. The word vectors were trained on Common Crawl and Wikipedia using fastText. See the documentation at fasttext.cc
- fastText pre-trained word vectors (2): bin+text, text. The word vectors were trained on Wikipedia using fastText. See the documentation at fasttext.cc
- Also available for the Samogitian language: bin+text, text
- Polyglot Lithuanian word embeddings (scroll down in the table) polyglot embeddings
- Rasa NLU COVID model An open-source model for building an AI assistant to help disseminate information about the virus, how to stay safe, and where to seek help.
- fastText pre-trained word vectors (1): bin, text. The word vectors were trained on Common Crawl and Wikipedia using fastText. See the documentation at fasttext.cc
- fastText pre-trained word vectors (2): bin+text, text. The word vectors were trained on Wikipedia using fastText. See the documentation at fasttext.cc
- Polyglot Estonian word embeddings (scroll down in the table) polyglot embeddings
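
The `text` downloads of the fastText vectors listed above are plain UTF-8 files: the first line holds the vocabulary size and the vector dimension, and every following line holds a word followed by its vector components separated by spaces. As a minimal sketch of how such a file could be read without installing anything, the snippet below parses that layout using only the Python standard library and compares two words by cosine similarity. The helper names `load_vec` and `cosine` are hypothetical (they are not part of fastText), and the three-word sample with two-dimensional vectors is made up for illustration; real files have 300 dimensions.

```python
import io
import math
from typing import Dict, List, Optional, TextIO


def load_vec(handle: TextIO, limit: Optional[int] = None) -> Dict[str, List[float]]:
    """Parse fastText text-format vectors from an open text handle.

    Hypothetical helper: reads the "<count> <dim>" header, then one
    "<word> <v1> ... <vdim>" entry per line. `limit` caps how many
    words are kept, which is handy for very large files.
    """
    header = handle.readline().split()
    dim = int(header[1])
    vectors: Dict[str, List[float]] = {}
    for line in handle:
        if limit is not None and len(vectors) >= limit:
            break
        parts = line.rstrip().split(" ")
        if len(parts) < dim + 1:
            continue  # skip blank or malformed lines
        # The last `dim` fields are the vector; everything before is the word.
        word = " ".join(parts[:-dim])
        vectors[word] = [float(value) for value in parts[-dim:]]
    return vectors


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Made-up sample in the same layout as a real .vec file (values are illustrative only).
SAMPLE = """3 2
suns 1.0 0.0
kaķis 0.8 0.6
māja 0.0 1.0
"""

vectors = load_vec(io.StringIO(SAMPLE))
print(cosine(vectors["suns"], vectors["kaķis"]))
print(cosine(vectors["suns"], vectors["māja"]))
```

For a real download, the same function can be given an open file instead of the in-memory sample, for example `load_vec(open(path, encoding="utf-8"), limit=50000)`, where `path` points at a decompressed text-format vector file. The `limit` argument matters in practice because the full files contain a very large vocabulary and a pure-Python dictionary of lists is memory-hungry.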