Primary language: Python. License: Apache License 2.0 (Apache-2.0).

kpca_embeddings

Python implementation of "KPCA Embeddings: an Unsupervised Approach to Learn Vector Representations of Finite Domain Sequences" by Eduardo Brito, Rafet Sifa and Christian Bauckhage.

You can train KPCA embeddings for various tasks, such as splice-junction recognition on DNA sequences from the "Molecular Biology (Splice-junction Gene Sequences) Data Set" in the UCI Machine Learning Repository, or German verb classification on the TIGER treebank. The exact hyperparameter combinations are listed in the reference paper.
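To give a rough idea of what training such embeddings involves, here is a minimal sketch of kernel PCA over a set of sequences. It is not this repository's API: the function names (`dice_similarity`, `fit_kpca`, `transform`), the choice of the Dice coefficient over character bigrams as the kernel value, and the toy DNA strings are all assumptions made for illustration. The similarity functions, kernels and hyperparameters actually used are described in the reference paper.

```python
import numpy as np


def ngrams(seq, n=2):
    """Set of character n-grams of a sequence."""
    return {seq[i:i + n] for i in range(len(seq) - n + 1)}


def dice_similarity(a, b, n=2):
    """Dice coefficient over n-gram sets (assumed similarity, for illustration)."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga and not gb:
        return 1.0
    return 2.0 * len(ga & gb) / (len(ga) + len(gb))


def kernel_matrix(xs, ys, n=2):
    """Pairwise similarities between two lists of sequences."""
    return np.array([[dice_similarity(x, y, n) for y in ys] for x in xs])


def fit_kpca(train, n_components=2, n=2):
    """Fit kernel PCA on the training sequences (hypothetical helper)."""
    K = kernel_matrix(train, train, n)
    m = K.shape[0]
    one = np.full((m, m), 1.0 / m)
    # Center the kernel matrix in feature space.
    Kc = K - one @ K - K @ one + one @ K @ one
    eigvals, eigvecs = np.linalg.eigh(Kc)
    # Keep the largest eigenvalues, dropping non-positive ones.
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1e-10
    alphas = eigvecs[:, keep] / np.sqrt(eigvals[keep])
    return {"train": list(train), "K": K, "alphas": alphas, "n": n}


def transform(model, seqs):
    """Project sequences (seen or unseen) onto the learned components."""
    K = model["K"]
    k = kernel_matrix(seqs, model["train"], model["n"])
    # Center the new kernel rows consistently with the training kernel.
    kc = k - k.mean(axis=1, keepdims=True) - K.mean(axis=0)[None, :] + K.mean()
    return kc @ model["alphas"]


# Toy data, invented for this example.
train = ["ACGTAC", "ACGTTC", "TTGACA", "TTGACC", "GGCATG"]
model = fit_kpca(train, n_components=2)
train_embeddings = transform(model, train)
new_embedding = transform(model, ["ACGTAA"])
```

Because unseen sequences are embedded by comparing them only with the training sequences, the cost of embedding one new sequence grows with the size of the training set, not with the size of the whole sequence domain.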

We also used KPCA embeddings in our entry to the Dutch spelling correction task of the 28th Computational Linguistics in the Netherlands conference (CLIN 28). The code of the Dutch spell checker presented there is also included in this repository.