Pinned Repositories
Deep_Learning_in_LangTech_course
Materials for the University of Turku course TKO_8965 Deep Learning in Human Language Technology (previously named TKO_2101 Natural Language Processing)
FIN-bench
Evaluation of Finnish generative models
FinBERT
BERT model trained from scratch on Finnish
finngen-tools
Tools for training causal language models for Finnish
Finnish-dep-parser
The Finnish dependency parsing pipeline being developed by the Turku NLP group. Documentation:
Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
ocr-correction
Post-processing OCR errors with seq2seq models
Text_Mining_Course
Stuff for the Text Mining course
Turku-neural-parser-pipeline
A neural parsing pipeline for segmentation, morphological tagging, dependency parsing and lemmatization with pre-trained models for more than 50 languages. Top ranker in the CoNLL-18 Shared Task.
wikibert
BERT models for many languages created from Wikipedia texts
TurkuNLP Group - IT Department - University of Turku's Repositories
TurkuNLP/Finnish-dep-parser
The Finnish dependency parsing pipeline being developed by the Turku NLP group. Documentation:
TurkuNLP/wikibert
BERT models for many languages created from Wikipedia texts
TurkuNLP/ocr-correction
Post-processing OCR errors with seq2seq models
TurkuNLP/turku-ner-corpus
Open broad-coverage corpus for Finnish named entity recognition.
TurkuNLP/bert-eval
TurkuNLP/pubmed_parses
Syntactic parses and named entity recognition for PubMed abstracts and PubMed Central full documents
TurkuNLP/Finnish_PropBank
Finnish Proposition Bank
TurkuNLP/biBERT
Finnish-English bilingual BERT models
TurkuNLP/BINF_Programming
Stuff for the BINF programming course (@fginter)
TurkuNLP/CAFA3
University of Turku CAFA3 project
TurkuNLP/BioCreativeVI_BioID_assignment
TurkuNLP/BioCreativeVI_CHEMPROT_RE
Deep learning-based systems for biomedical relation extraction: recognizing the statements of relations between chemical compounds/drugs and genes/proteins from biomedical literature. The code is developed for our participation in the BioCreative VI Task 5 (CHEMPROT) challenge. Contact: farmeh@utu.fi
TurkuNLP/conll17-system
Instructions for the TurkuNLP system in the CoNLL 2017 Shared Task on Multilingual Parsing from Raw Text to Universal Dependencies.
TurkuNLP/Corpus-linguistics
Code and data for the examples and use cases described in the article "Määrällinen korpuslingvistiikka" to be published in the book "Kielentutkimuksen metodologian käsikirja" in Finnish.
TurkuNLP/WAC-XII
Data presented in the paper "From Web Crawl to Clean Register-Annotated Corpora"
TurkuNLP/BHE
End-to-end system for bacteria habitat extraction: named-entity recognition (NER), named-entity normalization, and relation extraction. Contact: farmeh@utu.fi
TurkuNLP/Cell-line-recognition
Cell line name recognition and normalization
TurkuNLP/deepfin-tools
DeepFin tools
TurkuNLP/Digi_menetelmat
Workshop for the course Introduction to Digital Humanities: "Digital humanities in linguistics: text mining"
TurkuNLP/korona-tweets
Stuff for our korona-tweets
TurkuNLP/SRNNMT
Sentence representation for translation finding
TurkuNLP/bert-for-core
TensorFlow code and pre-trained models for BERT
TurkuNLP/CAFA4
TurkuNLP/csc-guide
Guide to using CSC resources
TurkuNLP/hockey-text-generation-corpus
Finnish hockey event text generation data.
TurkuNLP/ML_Linguistics
Machine learning in linguistics
TurkuNLP/multiling-cnn
Simple multi/cross-lingual CNN text classifier
TurkuNLP/Patient-text-tool
A web-based tool for searching and annotating patient medical reports
TurkuNLP/S24_idx
Code to push S24 data into Solr
TurkuNLP/s2_tokenizer