Error trying to train, WordCountVectorizer missing parameter $maxDocumentFrequency
bavamont opened this issue · 3 comments
I am getting this error when I try to train using your train.php example (https://github.com/RubixML/Sentiment/blob/master/train.php):
Fatal error: Uncaught TypeError: Argument 3 passed to Rubix\ML\Transformers\WordCountVectorizer::__construct() must be of the type int, object given....
In your example on Line 44 you have:
new WordCountVectorizer(10000, 3, new NGram(1, 2)),
But the constructor for WordCountVectorizer expects this:
public function __construct(
int $maxVocabulary = PHP_INT_MAX,
int $minDocumentFrequency = 1,
int $maxDocumentFrequency = PHP_INT_MAX,
?Tokenizer $tokenizer = null
)
What would be your recommended parameters for WordCountVectorizer for your example to work best?
Good catch! Did you upgrade versions recently? We added the $maxDocumentFrequency
parameter in 0.1.0-rc5. Thanks for the reminder, I am going to update the train script!
Let's try a setting of 5000 for maxDocumentFrequency. Let me know if you get better results with a different setting.
Also, if you'd like, you can join our channel on Telegram: https://t.me/RubixML
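For anyone landing here with the same error, here is a minimal sketch of why the old three-argument call fails and what the updated call could look like. The classes below are hypothetical stand-ins that copy only the constructor signature quoted above; they are not the real Rubix ML classes, and with the library installed you would import the real ones instead. The value 5000 is just the starting point suggested in this thread, not a tuned setting.

```php
<?php

// Stand-in types (assumption): they mirror only the constructor signature
// quoted in this issue and are NOT the real Rubix ML implementations.
interface Tokenizer
{
}

class NGram implements Tokenizer
{
    public $min;
    public $max;

    public function __construct(int $min = 1, int $max = 2)
    {
        $this->min = $min;
        $this->max = $max;
    }
}

class WordCountVectorizer
{
    public $maxVocabulary;
    public $minDocumentFrequency;
    public $maxDocumentFrequency;
    public $tokenizer;

    public function __construct(
        int $maxVocabulary = PHP_INT_MAX,
        int $minDocumentFrequency = 1,
        int $maxDocumentFrequency = PHP_INT_MAX,
        ?Tokenizer $tokenizer = null
    ) {
        $this->maxVocabulary = $maxVocabulary;
        $this->minDocumentFrequency = $minDocumentFrequency;
        $this->maxDocumentFrequency = $maxDocumentFrequency;
        $this->tokenizer = $tokenizer;
    }
}

// Old call: the tokenizer is passed as argument 3, which is now the
// int $maxDocumentFrequency slot, so PHP throws a TypeError.
try {
    new WordCountVectorizer(10000, 3, new NGram(1, 2));
    $oldCallFails = false;
} catch (TypeError $e) {
    $oldCallFails = true;
}

// Updated call: 5000 fills the new third parameter and the tokenizer
// moves to the fourth position.
$vectorizer = new WordCountVectorizer(10000, 3, 5000, new NGram(1, 2));

var_dump($oldCallFails);
var_dump($vectorizer->maxDocumentFrequency);
```

Against the real library, the corresponding change on line 44 of train.php would be to insert the third argument: new WordCountVectorizer(10000, 3, 5000, new NGram(1, 2)).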
Thank you @andrewdalpino !
I’ll try it with 5000 for maxDocumentFrequency.
Thanks again!