Resources for doing NLP in Polish
If you'd like to contribute, please edit this document. You can do it directly from GitHub (if you're logged in).
- morfologik-stemming (BSD-3) - dictionary-based; doesn't support out-of-vocabulary words.
- stempel (Apache-style) - old, but still works; can handle some out-of-vocabulary words.
- pystempel (mixed) - Python port of stempel with improved stemming tables.
- Estem - an Erlang wrapper (not a port) for the Stempel stemmer.
- pl_stemmer - a Python stemmer based on Porter's Algorithm.
- polish-stem - a Python stemmer using Finite State Transducers.
Know of another stemmer? Please open an issue.
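
The difference between a dictionary-based stemmer and one that copes with out-of-vocabulary words can be sketched roughly as below. This is a toy illustration only: `LEXICON`, `SUFFIXES`, `dictionary_stem` and `fallback_stem` are hypothetical names invented for this example, the word lists are far too small for real use, and none of it reflects the actual API or internals of the libraries listed above (real stemmers rely on large dictionaries or learned stemming tables rather than a hand-written suffix list).

```python
from typing import Optional

# Hypothetical mini-lexicon mapping inflected forms to a base form.
LEXICON = {
    "kotami": "kot",
    "kotów": "kot",
    "domach": "dom",
    "domu": "dom",
}

# Hypothetical, hand-picked suffixes; a stand-in for learned stemming rules.
SUFFIXES = ("ami", "ach", "ów", "om", "u", "y", "a")


def dictionary_stem(word: str) -> Optional[str]:
    """Pure dictionary lookup: returns None for out-of-vocabulary words."""
    return LEXICON.get(word.lower())


def fallback_stem(word: str) -> str:
    """Try the lexicon first, then strip the longest matching suffix."""
    known = dictionary_stem(word)
    if known is not None:
        return known
    lowered = word.lower()
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        # Keep at least three characters so short words are left alone.
        if lowered.endswith(suffix) and len(lowered) - len(suffix) >= 3:
            return lowered[: -len(suffix)]
    return lowered


if __name__ == "__main__":
    for word in ("kotami", "stołami"):
        print(word, dictionary_stem(word), fallback_stem(word))
```

Here `dictionary_stem` gives up on any word missing from the lexicon, while `fallback_stem` still produces a (possibly imperfect) stem for it. That trade-off is worth keeping in mind when choosing between the stemmers above.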
- morfologik - stemmer, morphology analyser, grammar analyser, autocompleter
- psi-toolkit - stemmer, morphology analyser, grammar analyser, many others
- spaCy - framework for Industrial-Strength NLP in Python with models for Polish (from Sigmoidal and from IPIPAN)
- D. Weiss: A Survey of Freely Available Polish Stemmers (2005): comparison of different stemmers, their results and efficiency.
To the extent possible under law, Adam Stankiewicz has waived all copyright and related or neighboring rights to this work.