Killed while initializing feature
Jun-jie-Huang opened this issue · 3 comments
Jun-jie-Huang commented
Hi, I'm running your code to train the 'substoke' model on my 80G corpus, but the process was killed.
Here's a picture of the error. I modified run.sh like this:
path_input=/data2/private/huangjunjie/COS960
path_out=.
rm -rf ./bin
cp -rf ./word2vec/bin .
./bin/word2vec substoke -input ${path_input}/SogouT_all -infeature ./Simplified_Chinese_Feature/sin_chinese_feature.txt -output ${path_out}/cw2vec_vector -lr 0.025 -dim 300 -ws 5 -epoch 5 -minCount 10 -neg 5 -loss ns -minn 3 -maxn 18 -thread 20 -t 1e-4 -lrUpdateRate 100
dalinvip commented
An 80G corpus? Maybe it ran out of memory and was killed.
Jun-jie-Huang commented
It only uses 20% of the memory, and I can run 'fasttext' on the same 80G corpus without problems, so I don't think it's out of memory. It's killed while initializing the stoke feature.
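One way to settle whether the kernel's OOM killer was responsible is to search the kernel log for its messages. The following is a minimal sketch, not something from the cw2vec repository: the helper name `find_oom_kills` is hypothetical, and it assumes a Linux machine whose kernel log uses the usual "Out of memory: Killed process ..." and "oom-kill" wording.

```shell
# Hypothetical helper: filter kernel log text on stdin for OOM-killer messages.
# Assumes the standard Linux wording ("Out of memory", "oom-kill",
# "Killed process"); other systems may log differently.
find_oom_kills() {
  grep -iE 'out of memory|oom-kill|killed process'
}

# Typical use (reading the kernel log may require root):
#   dmesg | find_oom_kills
```

If this prints a line naming the word2vec process, memory was the cause even if average usage looked low. If it prints nothing, the kill probably came from somewhere else, such as a resource limit or a crash.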
dalinvip commented
Try modifying max_vocab_size and rebuilding: https://github.com/bamtercelboo/cw2vec/blob/master/word2vec/src/include/dictionary.h#L43.
I have not trained on such a large dataset, so the problem may be a limit set in the code. You can try changing it.
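As rough arithmetic for why a vocabulary limit interacts with memory: a dense table of 32-bit floats costs rows × dim × 4 bytes. The sketch below assumes float32 storage and a single table; whether cw2vec allocates its vectors exactly this way is not confirmed here, and `embedding_bytes` is a hypothetical helper name. The dimension 300 is taken from the `-dim 300` flag in the command above.

```shell
# Hypothetical helper: bytes needed for a dense float32 table.
# $1 = number of rows (e.g. vocabulary entries), $2 = vector dimension.
embedding_bytes() {
  echo $(( $1 * $2 * 4 ))
}

# One million rows at -dim 300:
embedding_bytes 1000000 300   # prints 1200000000 (about 1.2 GB)
```

Under these assumptions, every additional million rows at dimension 300 adds about 1.2 GB, so a larger vocabulary limit should be weighed against available RAM.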