Pinned Repositories
cosycat
Collaborative Synchronized Corpus Annotation Tool
cosycat-wiki
Wiki repo for the main cosycat repo (https://github.com/emanjavacas/cosycat)
git-course
Materials for a one-day git tutorial
gysbert-eval
hierarchical-lm
Training sentence-level text generators at different scales and with optional sentence-level conditions
language-model-playground
macberth-eval
pie
A fully fledged PyTorch package for morphological analysis, tailored to morphologically rich and historical languages.
seqmod
Sequence modelling with PyTorch
urban-tweeters
A Clojure project for visualising the language of tweets in cities
emanjavacas's Repositories
emanjavacas/cosycat
Collaborative Synchronized Corpus Annotation Tool
emanjavacas/language-model-playground
emanjavacas/cosycat-wiki
Wiki repo for the main cosycat repo (https://github.com/emanjavacas/cosycat)
emanjavacas/text-reuse
emanjavacas/ankura
Anchor-based topic modeling
emanjavacas/ark-twokenize-py
Python port of the Twokenize class of ark-tweet-nlp
emanjavacas/bash-dotfiles
emanjavacas/bilstm-aux
Bidirectional Long Short-Term Memory (bi-LSTM) tagger in DyNet, hierarchical (with word and character embeddings)
emanjavacas/casket
persistent storage for ML experiment results
emanjavacas/config-dotfiles
emanjavacas/emacs-dotfile
emanjavacas/emanjavacas.github.io
emanjavacas/football
emanjavacas/Geste
A corpus of chansons de geste
emanjavacas/gh-src
emanjavacas/lm-augmentation
Data-augmentation & Self-learning experiments with RNNLMs
emanjavacas/mulisera
Multilingual image sentence ranking
emanjavacas/multilemma
emanjavacas/NCRFpp
NCRF++, an Open-source Neural Sequence Labeling Toolkit. It includes character LSTM/CNN, word LSTM/CNN and softmax/CRF components. (code for COLING/ACL 2018 paper)
emanjavacas/pandora
A Tagger-Lemmatizer for Natural Languages
emanjavacas/potter-reuse
emanjavacas/python-exercises
Python exercises (some adapted from other sources, some of my own)
emanjavacas/seq2seq.pytorch
Sequence-to-Sequence learning using PyTorch
emanjavacas/skip-thoughts
Sent2Vec encoder and training code from the paper "Skip-Thought Vectors"
emanjavacas/squawka-scraper
Scrapy crawler to get football (soccer) match reports from Squawka.
emanjavacas/text-matcher
A simple text reuse detection CLI tool.
emanjavacas/text-nn
Discriminative text classification with PyTorch
emanjavacas/tfkld
Code for the [EMNLP 2013 paper](http://www.cc.gatech.edu/~jeisenst/papers/ji-emnlp-2013.pdf)
emanjavacas/treebank_data
Perseus Treebank Data
emanjavacas/Tutorial_workflow
Condorcet 2019 workshop (Paris): Digital Philology and Medieval Text Processing Workflow