Pinned Repositories
nizza
Neural word alignment models
NLP-progress
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
shmoo-decoder
MT Marathon 2022 project: The Shmoo Decoder
tensor2tensor
A library for generalized sequence to sequence models
tensor2tensor-usr
Additional models for tensor2tensor.
translate
Translate - a PyTorch Language Library
ucam-scripts
C4_200M-synthetic-dataset-for-grammatical-error-correction
This dataset contains synthetic training data for grammatical error correction. The corpus is generated by corrupting clean sentences from C4 using a tagged corruption model. The approach and the dataset are described in more detail by Stahlberg and Kumar (2021) (https://www.aclweb.org/anthology/2021.bea-1.4/).
NLP-progress
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
tensor2tensor
Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
fstahlberg's Repositories
fstahlberg/shmoo-decoder
MT Marathon 2022 project: The Shmoo Decoder
fstahlberg/tensor2tensor-usr
Additional models for tensor2tensor.
fstahlberg/nizza
Neural word alignment models
fstahlberg/tensor2tensor
A library for generalized sequence to sequence models
fstahlberg/ucam-scripts
fstahlberg/NLP-progress
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
fstahlberg/translate
Translate - a PyTorch Language Library