yooper/php-text-analysis

Entity Extraction returns empty array

Zera97 opened this issue · 4 comments

Hey,

I've started working with your wrapper for the "Stanford Named Entity Extraction", but all I get back is an empty array. There are no error messages either.

This is my code:

            use TextAnalysis\Documents\TokensDocument;
            use TextAnalysis\Taggers\StanfordNerTagger;
            use TextAnalysis\Tokenizers\WhitespaceTokenizer;

            // Paths to the manually installed jar and classifier (prefix redacted).
            $jarpath = "[HIDDEN]/stanford-yooper/stanford-ner.jar";
            $classifierPath = "[HIDDEN]/stanford-yooper/classifiers/english.all.3class.distsim.crf.ser.gz";

            $engText = "Marquette County is a county located in the Upper Peninsula of the US state of Michigan. As of the 2010 census, the population was 67,077.";

            // Tokenize on whitespace, then tag the resulting tokens.
            $document = new TokensDocument((new WhitespaceTokenizer())->tokenize($engText));
            $tagger = new StanfordNerTagger($jarpath, $classifierPath);
            $output = $tagger->tag($document->getDocumentData());
            var_dump($output); // empty array

Thank you for your response!

I installed the jar and classifier manually according to the Wiki, since I couldn't figure out how to use the command. I also tested the jar from the shell, and that worked.

If you could give me some advice on how to use the shell command, I would try to install the NER files that way.
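For anyone retracing the manual install, a pre-flight check along these lines may help narrow things down. It is only a sketch: `check_ner_setup` is a hypothetical helper (not part of php-text-analysis), and the file names are taken from the code above. It assumes the tagger needs `java` on the PATH plus a readable jar and classifier.

```shell
# Hypothetical helper, not part of php-text-analysis.
# Usage: check_ner_setup /path/to/stanford-yooper
check_ner_setup() {
    ner_dir="$1"
    jar="$ner_dir/stanford-ner.jar"
    classifier="$ner_dir/classifiers/english.all.3class.distsim.crf.ser.gz"

    # The jar can only run if a java binary is reachable on PATH.
    command -v java >/dev/null 2>&1 || { echo "java not found on PATH" >&2; return 1; }
    # Both files must be readable by the user running the check.
    [ -r "$jar" ] || { echo "jar not readable: $jar" >&2; return 1; }
    [ -r "$classifier" ] || { echo "classifier not readable: $classifier" >&2; return 1; }
    echo "setup looks complete"
}
```

If the check passes, the jar itself can be exercised with Stanford NER's own command line, for example `java -cp stanford-ner.jar edu.stanford.nlp.ie.crf.CRFClassifier -loadClassifier classifiers/english.all.3class.distsim.crf.ser.gz -textFile sample.txt`, where `sample.txt` is any text file of your own. Running the check as the same user that runs PHP is worthwhile, since that user may have a different PATH or different file permissions.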

@Zera97, I will reach out to you later today with a write-up on how to make it work.

Cheers,

I'm about to install the jar and classifier manually myself, according to the Wiki. Is there anything else I need to know? : )

Here's a fix for your directions:

    php textconsole pta:package:install stanford_ner_tagger

should be:

    php textconsole pta:install:package stanford_ner_tagger

and you need to be in the ./vendor/yooper/php-text-analysis directory.

Hopefully this helps someone else.
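The corrected command and the directory requirement can be combined into one small wrapper. This is a sketch, assuming a standard Composer layout with the package under `vendor/`; `install_ner` is a hypothetical name, and the project root is passed as an argument.

```shell
# Hypothetical wrapper around the corrected install command.
# Usage: install_ner /path/to/project   (defaults to the current directory)
install_ner() {
    project_root="${1:-.}"
    pkg_dir="$project_root/vendor/yooper/php-text-analysis"

    [ -d "$pkg_dir" ] || { echo "package directory not found: $pkg_dir" >&2; return 1; }

    # Run inside a subshell so the caller's working directory is unchanged.
    ( cd "$pkg_dir" && php textconsole pta:install:package stanford_ner_tagger )
}
```

Calling `install_ner .` from the directory that contains `composer.json` should then run the installer from the right place.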