Language Machines

NLP research group at the Centre for Language Studies, Radboud University Nijmegen

Nijmegen, The Netherlands

Pinned Repositories

CLIN28_ST_spelling_correction
Scripts that were used for preparing and converting the Wikipedia documents that are part of the CLIN28 shared task on spelling correction
Language: Python 10 10 144
frog
Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl, the Tilburg memory-based learning software package.
Language: C++ 78 15 10311
LamaEvents
Lama Events is a calendar application listing events in the near future. The events are detected in, and selected from, the Dutch Twitter stream by a fully automatic procedure.
Language: HTML 10 4 273
libfolia
FoLiA library for C++
Language: C++ 16 10 566
LuigiNLP
A workflow system for Natural Language Processing.
Language: Python 21 6 25
PICCL
A set of workflows for corpus building through OCR, post-correction and normalisation
Language: Python 49 7 647
ticcltools
Tools for TICCL
Language: C++ 14 7 474
timbl
TiMBL implements several memory-based learning algorithms.
Language: C++ 53 8 139
ucto
Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation and splits sentences. It offers several other basic preprocessing steps, such as changing case, all of which you can use to make your text suited for further processing such as indexing, part-of-speech tagging, or machine translation. Ucto comes with tokenisation rules for several languages and can easily be extended to suit other languages. It has been incorporated for tokenizing Dutch text in Frog, our Dutch morpho-syntactic processor. http://ilk.uvt.nl/ucto
Language: C++ 69 12 9314
uctodata
Datafiles for the tokenizer ucto.
Language: Shell 9 6 95

Language Machines's Repositories

LanguageMachines/LuigiNLP
A workflow system for Natural Language Processing.
Language: Python 21 6 25
LanguageMachines/CLIN28_ST_spelling_correction
Scripts that were used for preparing and converting the Wikipedia documents that are part of the CLIN28 shared task on spelling correction
Language: Python 10 10 144
LanguageMachines/LamaEvents
Lama Events is a calendar application listing events in the near future. The events are detected in, and selected from, the Dutch Twitter stream by a fully automatic procedure.
Language: HTML 10 4 273
LanguageMachines/quoll
Language: Python 3 6 120
LanguageMachines/ICDAR2017-PostOCR-Ticcl
Wrapper scripts for processing ICDAR2017 PostOCR data given a TICCL ranked input list
Language: Python 2 5 01
LanguageMachines/bp-som
BP-SOM: A hybrid of back-propagation learning in multi-layered perceptrons and self-organizing maps
Language: C++ 1 10 1
LanguageMachines/homebrew-lamachine
Brew formulas for installing NLP software developed by the Language Machines research group
Language: Ruby 1 5 0
LanguageMachines/paramsearch
Automated parameter optimisation for Timbl
Language: C 1 6 01
LanguageMachines/svn-timblmanual
Copy from the old ILK SVN
Language: TeX 0 5 00
LanguageMachines/clin28
Language: TeX 6 0
LanguageMachines/clst-webservices-meta
CLST webservices software metadata, only for those webservices/webapplications that are not included in LaMachine
7 0
LanguageMachines/CRoaring
Roaring bitmaps in C (and C++)
Language: C 5 0
LanguageMachines/fambl
Family Memory Based Learning (original in ILK SVN)
Language: C 6 0
LanguageMachines/GloVe
GloVe model for distributed word representation
Language: C 5 0
LanguageMachines/json
JSON for Modern C++
Language: C++ 2 0
LanguageMachines/knngraph
KNN graph software originally in TiCC SVN
Language: Python 4 0
LanguageMachines/SB-tokenizer
Language: Perl 5 0
LanguageMachines/SoNaR
Language: Python
LanguageMachines/svn-mbmt
Language: C
LanguageMachines/svn-sonar
Old SoNaR material from the TiCC SVN
Language: C++ 5 0
LanguageMachines/svn-ticclopstools
Old ticclopstools from the TiCC SVN
Language: C++ 5 0
LanguageMachines/tadpole
The good old predecessor of Frog
Language: Lex 5 0
LanguageMachines/wikinerdata
Script to collect data from Wikipedia and automatically annotate the linked named entities with their named entity type.
Language: Jupyter Notebook 6 0
LanguageMachines/word2vec
This tool provides an efficient implementation of the continuous bag-of-words and skip-gram architectures for computing vector representations of words. These representations can be subsequently used in many natural language processing applications and for further research.
Language: C 5 0
