yooper/php-text-analysis

How can I use the TF-IDF?

nafre opened this issue · 5 comments

nafre commented

Hi, I was experimenting around and found that this library has a TFIDF implementation. Can someone show me an example to get this to work?

What should I put for the DocumentAbstract $document and the $token? And how can I see the result?

Thanks.

@nafre Hopefully, this example will help.

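        use TextAnalysis\Collections\DocumentArrayCollection;
        use TextAnalysis\Documents\TokensDocument;
        use TextAnalysis\Indexes\TfIdf;

        // Wrap each tokenized text in a document.
        $docs = [
            new TokensDocument(tokenize($text1)),
            new TokensDocument(tokenize($text2)),
            new TokensDocument(tokenize($text3))
        ];

        // Build the TF-IDF index over the whole collection.
        $docCollection = new DocumentArrayCollection($docs);
        $tfIdf = new TfIdf($docCollection);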

@nafre, let me know if you have any more questions. I am closing this issue.

nafre commented

Thanks. Yeah just one more question.

TextAnalysis\Indexes\TfIdf:
function getTfIdf(DocumentAbstract $document, $token, $mode = 1)

Would it make sense to call this function to get the values for the tfIDF?
If it does, when I want to call this function, what do I put in for the parameters?

The source is here. https://github.com/yooper/php-text-analysis/blob/master/src/Indexes/TfIdf.php

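Yes, getTfIdf will get you the TF-IDF weight of a given token in a given document. Pass one of the documents from the collection as $document and the term you are interested in as $token.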
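Going by the signature quoted above, a call would presumably look like `$tfIdf->getTfIdf($docs[0], 'someword')`, where `$docs[0]` is one of the `TokensDocument` objects from the earlier example and `'someword'` is a placeholder token; `$mode` can be left at its default of 1. To give a feel for the kind of number that comes back, here is a library-free sketch. It assumes the common definition (raw count of the token in the document, times `log(N / df)`); `tfIdfSketch()` is a hypothetical helper, not part of php-text-analysis, and the library's own modes may weight terms differently.

```php
<?php
// Hypothetical helper for illustration only; not the library's implementation.
// Assumption: tf = raw count in the document, idf = log(N / df).

/**
 * @param string[]   $docTokens tokens of the document being scored
 * @param string[][] $allDocs   token lists for every document in the collection
 */
function tfIdfSketch(array $docTokens, string $token, array $allDocs): float
{
    // Term frequency: how often the token occurs in this document.
    $tf = count(array_keys($docTokens, $token, true));

    // Document frequency: how many documents contain the token at least once.
    $df = 0;
    foreach ($allDocs as $tokens) {
        if (in_array($token, $tokens, true)) {
            $df++;
        }
    }

    // A token that is absent from the document or the collection gets weight 0.
    if ($tf === 0 || $df === 0) {
        return 0.0;
    }

    return $tf * log(count($allDocs) / $df);
}

$allDocs = [
    ['the', 'cat', 'sat'],
    ['the', 'dog', 'barked'],
    ['the', 'cat', 'purred', 'cat'],
];

// 'cat' occurs twice in the third document and in 2 of 3 documents: 2 * log(3/2).
printf("%.4f\n", tfIdfSketch($allDocs[2], 'cat', $allDocs));

// 'the' occurs in every document, so idf = log(1) = 0 and the weight is 0.
printf("%.4f\n", tfIdfSketch($allDocs[2], 'the', $allDocs));
```

A higher value means the token is frequent in that document but rare across the collection, which is why a word that appears in every document scores zero under this definition.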

nafre commented

Alright Thanks.