clab/fast_align

Crash with larger corpus


I have an English->German corpus of ~7GB built from PatTR, Europarl, and News Commentary, with a maximum sentence length of 80 characters. fast_align crashes during iteration 1.

ITERATION 1

.................................................. [50000]
.................................................. [100000]
.................................................. [150000]
.................................................. [200000]
.................................................. [250000]

...
.................................................. [5650000]
Killed: 9

This is a memory problem: the process is killed (signal 9, SIGKILL) because of its memory consumption.
I worked around it by running on a machine with much more memory.
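
For anyone who wants to confirm the same diagnosis on their own corpus, here is a minimal sketch (not part of fast_align) that runs a command, reports whether it was killed by a signal, and prints the peak resident memory of the child process. The helper names `run_and_report` and `describe_exit` are made up for illustration, and it assumes a POSIX system where Python's `resource` module is available.

```python
import resource
import signal
import subprocess


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable string."""
    if returncode < 0:
        # On POSIX, subprocess reports death-by-signal N as return code -N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {-returncode} ({name})"
    return f"exited with status {returncode}"


def run_and_report(cmd, stdout_path=None):
    """Run cmd and return (return code, peak RSS over waited-for children).

    ru_maxrss is in kilobytes on Linux and in bytes on macOS, so the caller
    has to interpret the unit for their platform.
    """
    if stdout_path is not None:
        # Redirect the child's stdout to a file (e.g. the alignment output).
        with open(stdout_path, "wb") as out:
            proc = subprocess.run(cmd, stdout=out)
    else:
        proc = subprocess.run(cmd)
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return proc.returncode, peak
```

A call such as `run_and_report(["./fast_align", "-i", "corpus.en-de", "-d", "-o", "-v"], "forward.align")` would then show whether the run ended with SIGKILL and how much memory it reached before that; `corpus.en-de` and `forward.align` are placeholder file names. A return code of -9 together with a peak close to the machine's physical memory is consistent with the process being killed for running out of memory.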