Pinned Repositories
Deep_Learning_in_LangTech_course
Materials for the University of Turku course TKO_8965 Deep Learning in Human Language Technology (previously named TKO_2101 Natural Language Processing)
FIN-bench
Evaluation of Finnish generative models
FinBERT
BERT model trained from scratch on Finnish
finngen-tools
Tools for training causal language models for Finnish
Finnish-dep-parser
The Finnish dependency parsing pipeline being developed by the Turku NLP group. Documentation:
Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
ocr-correction
Post-processing OCR errors with seq2seq models
Text_Mining_Course
Stuff for the Text Mining course
Turku-neural-parser-pipeline
A neural parsing pipeline for segmentation, morphological tagging, dependency parsing and lemmatization with pre-trained models for more than 50 languages. Top ranker in the CoNLL-18 Shared Task.
wikibert
BERT models for many languages created from Wikipedia texts
TurkuNLP Group - IT Department - University of Turku's Repositories
TurkuNLP/Finnish-dep-parser
The Finnish dependency parsing pipeline being developed by the Turku NLP group. Documentation:
TurkuNLP/wikibert
BERT models for many languages created from Wikipedia texts
TurkuNLP/bert-eval
TurkuNLP/pubmed_parses
Syntactic parses and named entity recognition for PubMed abstracts and PubMed Central full documents
TurkuNLP/IR_Course
Stuff for the upcoming IR course 2017
TurkuNLP/Finnish_PropBank
Finnish Proposition Bank
TurkuNLP/biBERT
Finnish-English bilingual BERT models
TurkuNLP/BINF_Programming
Stuff for the BINF programming course (@fginter)
TurkuNLP/CAFA3
University of Turku CAFA3 project
TurkuNLP/BioCreativeVI_BioID_assignment
TurkuNLP/BioCreativeVI_CHEMPROT_RE
Deep learning-based systems for biomedical relation extraction: recognizing the statements of relations between chemical compounds/drugs and genes/proteins from biomedical literature. The code is developed for our participation in the BioCreative VI Task 5 (CHEMPROT) challenge. Contact: farmeh@utu.fi
TurkuNLP/conll17-system
Instructions for the TurkuNLP system in the CoNLL 2017 Shared Task on Multilingual Parsing from Raw Text to Universal Dependencies.
TurkuNLP/BHE
End-to-end system for bacteria habitat extraction: named-entity recognition (NER), named-entity normalization, relation extraction. Contact: farmeh@utu.fi
TurkuNLP/Cell-line-recognition
Cell line name recognition and normalization
TurkuNLP/deepfin-tools
DeepFin tools
TurkuNLP/Digi_menetelmat
Workshop for the Introduction to Digital Humanities course: "Digital humanities in linguistic research: text mining"
TurkuNLP/korona-tweets
Stuff for our korona-tweets
TurkuNLP/SRNNMT
Sentence representation for translation finding
TurkuNLP/TDT_editor
The tree editor used to annotate the Turku Dependency Treebank. Vintage code, but putting it online in case someone finds it in any way useful.
TurkuNLP/bert-for-core
TensorFlow code and pre-trained models for BERT
TurkuNLP/CAFA4
TurkuNLP/csc-guide
Guide to using CSC resources
TurkuNLP/dep2dep
Treebank transformation tool. Put here publicly in the hope of being useful. The code originates from 2008, and therefore some apologies and disclaimers apply. Has been used in numerous treebank transformation efforts.
TurkuNLP/hockey-text-generation-corpus
Finnish hockey event text generation data.
TurkuNLP/ML_Linguistics
Machine learning in linguistics
TurkuNLP/multiling-cnn
Simple multi/cross-lingual CNN text classifier
TurkuNLP/Patient-text-tool
A web-based tool for searching and annotating patient medical reports
TurkuNLP/S24_idx
Code to push S24 data into Solr
TurkuNLP/s2_tokenizer
TurkuNLP/smt-pronouns