This repository unifies stemming components for different languages behind a single API, selected by language code.
This package is distributed via Packagist for Composer, so Composer must be installed in order to use it.
composer require nadar/stemming
Using the stemmer for your desired language:
<?php
include 'vendor/autoload.php';
$stemmed = \Nadar\Stemming\Stemm::stem('drinking', 'en');
echo $stemmed; // output: "drink"
If the given language is not supported, the original word is returned unchanged.
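The fallback for an unknown language can be sketched as follows. The code `'xx'` is only a placeholder chosen for illustration and is assumed not to match any bundled stemmer.

```php
<?php
include 'vendor/autoload.php';

$word = 'drinking';

// 'xx' is a hypothetical language code with no stemmer behind it (assumption).
$result = \Nadar\Stemming\Stemm::stem($word, 'xx');

// With no matching language, the input should come back untouched.
var_dump($result === $word);
```

This makes it safe to pass user-supplied language codes straight through, as long as unstemmed output is acceptable for unsupported languages.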
You can also stem a whole phrase:
echo \Nadar\Stemming\Stemm::stemPhrase('I am playing drums', 'en');
Certain words are on an ignore list that applies to all languages; see Stemm::$ignore. You can adjust that list by assigning to it, for example Stemm::$ignore = ['foo', 'bar'].
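A minimal sketch of overriding the ignore list is shown below. It assumes that a word on the list is returned unstemmed; that behaviour is not spelled out here, so verify it against Stemm::$ignore and its use in the source before relying on it.

```php
<?php
include 'vendor/autoload.php';

use Nadar\Stemming\Stemm;

// Overwrite the ignore list; the assignment replaces any existing entries
// and applies to every language.
Stemm::$ignore = ['drinking'];

// Assumption: an ignored word is passed through without stemming.
echo Stemm::stem('drinking', 'en');
```

Because the assignment replaces the whole list, merge with the current value (for example `array_merge(Stemm::$ignore, ['drinking'])`) if the existing entries should be kept.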
- German Stemming: https://github.com/arisro/german-stemmer (Copyright (c) 2013 Aris Buzachis (buzachis.aris@gmail.com))
- English Stemming: https://tartarus.org/martin/PorterStemmer/php.txt (Copyright (c) 2005 Richard Heyes (http://www.phpguru.org/))
To test the library, run:
./vendor/bin/phpunit tests
To fix your code so it conforms to PSR-2, run:
./vendor/bin/php-cs-fixer fix src/