
lda2vec preprocess does not handle memory well on large inputs

hxsnow10 opened this issue · 1 comment

When I run preprocessing on 6 GB of plain text with 3 threads and max_len=40000, it exhausts my 120 GB of memory plus 180 GB of swap.

I suspect some part of the preprocessing does not scale to large datasets.

I have hit MemoryError several times and have had to fall back to a smaller amount of data for now.
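
To make the scale concrete, here is a rough sketch of how the footprint could grow. It assumes (I have not confirmed this in the lda2vec source) that preprocessing allocates one dense, padded array of 64-bit token ids with shape (n_docs, max_len). The document count below is hypothetical, and `padded_array_bytes` and `iter_batches` are illustrative helpers, not lda2vec functions.

```python
from itertools import islice


def padded_array_bytes(n_docs, max_len, itemsize=8):
    """Bytes needed for a dense (n_docs, max_len) array.

    itemsize=8 assumes 64-bit token ids (an assumption, see above).
    """
    return n_docs * max_len * itemsize


def iter_batches(items, batch_size):
    """Yield lists of at most batch_size items without loading everything."""
    it = iter(items)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    n_docs = 1_000_000  # hypothetical document count
    max_len = 40_000    # the max_len used in this report
    gib = padded_array_bytes(n_docs, max_len) / 2**30
    print(f"dense padded array: about {gib:.0f} GiB")

    # Processing documents in small batches keeps only one batch in memory.
    for batch in iter_batches(["doc a", "doc b", "doc c"], batch_size=2):
        print(len(batch), "documents in this batch")
```

If that assumption holds, a smaller max_len or batch-wise processing would shrink the peak allocation far more than reducing the thread count.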