Pinned Repositories
bicleaner
Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus.
bicleaner-ai
Bicleaner fork that uses neural networks
arch-install
Simple bash script to install Arch Linux.
clean
A tool for downloading and cleaning parallel corpora
heliport
Fast and accurate language identifier
image_omr
Optical Music Recognition with RNNs in Keras
paraphrasing
A repository of paraphrasing-related tools: Sent2vec and paraphrase generation.
terminology
Tools to annotate parallel data with terminology for NMT forced translation
tmxt
Transform TMX to text
ZJaume's Repositories
ZJaume/clean
A tool for downloading and cleaning parallel corpora
ZJaume/heliport
Fast and accurate language identifier
ZJaume/image_omr
Optical Music Recognition with RNNs in Keras
ZJaume/paraphrasing
A repository of paraphrasing-related tools: Sent2vec and paraphrase generation.
ZJaume/terminology
Tools to annotate parallel data with terminology for NMT forced translation
ZJaume/tmxt
Transform TMX to text
ZJaume/arch-install
Simple bash script to install Arch Linux.
ZJaume/bicleaner
Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus.
ZJaume/Computer-Vision
Computer vision repository
ZJaume/cyrillic-transliteration
Transliterate Cyrillic script to Latin script and vice versa.
ZJaume/datasketch
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble
ZJaume/diceware-cat
Catalan dictionaries for generating Diceware passwords
ZJaume/Domain_Adaptation
InDomain detection is a tool designed to extract in-domain data from large collections of data.
ZJaume/dotfiles
My dotfiles
ZJaume/escape-unk
Escape unknown symbols in SentencePiece vocabularies
ZJaume/fastspell
Targeted language identifier, based on FastText and Hunspell.
ZJaume/gaoya
Locality Sensitive Hashing
ZJaume/Infinity-For-Reddit
A Reddit client for Android
ZJaume/LanguagePack
A language pack project for AnySoftKeyboard
ZJaume/lttoolbox
Finite state compiler, processor and helper tools used by apertium
ZJaume/mutnmt
ZJaume/sacrebleu
Reference BLEU implementation that auto-downloads test sets and reports a version string to facilitate cross-lab comparisons
ZJaume/serde-fancy-regex
A serde-regex fork to (de)serialize fancy-regex regular expressions
ZJaume/splitters
A CLI for Rust SRX sentence segmentation rules as a Python package.
ZJaume/srx
A mostly compliant Rust implementation of the Segmentation Rules eXchange (SRX) 2.0 standard for text segmentation.
ZJaume/students
Efficient teacher-student models and scripts to make them