tschomacker
Post-Graduate Research Assistant @ Hamburg University of Applied Sciences
Hamburg, Germany
Pinned Repositories
aligned-narrative-documents
A collection of scripts for creating a document-aligned corpus of German narrative texts from four sources of Simple Language texts and three sources of Standard Language texts.
BertSum
A fork of BertSum that uses Stanford Stanza for tokenization, which makes it possible to tokenize a wide variety of languages.
churchtools-birthdays
A simple tool that automatically sends a list of people who had a birthday within the last week, generated from a ChurchTools database.
draft
fake-news-detection-bot
generalizing-passages-identification-bert
Automatic Identification of Generalizing Passages in German Fictional Texts using BERT with Monolingual and Multilingual Training Data
germeval.github.io
longmbart-1
news-scraper
A program for downloading online articles and saving them in a SQLite database.
tschomacker's Repositories
tschomacker/aligned-narrative-documents
A collection of scripts for creating a document-aligned corpus of German narrative texts from four sources of Simple Language texts and three sources of Standard Language texts.
tschomacker/BertSum
A fork of BertSum that uses Stanford Stanza for tokenization, which makes it possible to tokenize a wide variety of languages.
tschomacker/churchtools-birthdays
A simple tool that automatically sends a list of people who had a birthday within the last week, generated from a ChurchTools database.
tschomacker/draft
tschomacker/fake-news-detection-bot
tschomacker/generalizing-passages-identification-bert
Automatic Identification of Generalizing Passages in German Fictional Texts using BERT with Monolingual and Multilingual Training Data
tschomacker/germeval.github.io
tschomacker/longmbart-1
tschomacker/news-scraper
A program for downloading online articles and saving them in a SQLite database.
tschomacker/pyrouge-first-use
First use of ROUGE 1.5.5 / pyrouge in Python
tschomacker/textgrid-domain-adaptation-dataset
A small script that masks TextGrid texts sentence by sentence and combines them into one dataset. This dataset can be used for masked language modeling, and thus for pre-training and domain adaptation.
tschomacker/tschomacker