Memory leak when using the Python wrapper and the input string is too long
ankokumoyashi opened this issue · 0 comments
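A memory leak occurs when all of the following conditions are fulfilled:
- the Python wrapper is used (the "-C (allocate sentence)" option is ON)
- the same lattice instance is reused in every loop iteration
- the input is 5534 bytes or larger, as printed by sys.getsizeof in the script below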
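The byte count in the last condition comes from `sys.getsizeof`, which measures the Python string object rather than the encoded text. The sketch below compares that figure with the UTF-8 length for the two inputs used in this report. It assumes the wrapper passes the sentence to MeCab as UTF-8, which this report does not confirm; `utf8_length` is a hypothetical helper name, and `BUF_SIZE` mirrors the define quoted later in this report.

```python
import sys

BUF_SIZE = 8192  # value of the BUF_SIZE define quoted in this report


def utf8_length(text):
    """Return the number of bytes in the UTF-8 encoding of text."""
    return len(text.encode("utf-8"))


for count in (2729, 2730):
    sentence = "あ" * count
    print(
        "chars:", count,
        "getsizeof:", sys.getsizeof(sentence),  # differs between Python versions
        "utf-8 bytes:", utf8_length(sentence),
        "headroom to BUF_SIZE:", BUF_SIZE - utf8_length(sentence),
    )
```

Because 'あ' takes 3 bytes in UTF-8, the two inputs encode to 8187 and 8190 bytes, both just under the 8192-byte BUF_SIZE, so the exact threshold inside MeCab may sit slightly below BUF_SIZE.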
How to reproduce
- versions
  - Python 3.5.1
  - MeCab 0.996
- code
import os
import sys

import MeCab
import psutil

pid = os.getpid()
py = psutil.Process(pid)


class CheckMemoryLeak():
    def __init__(self):
        # One lattice instance, reused for every call
        self.lattice = MeCab.Lattice()

    def mecab_set_sentence(self, text):
        self.lattice.set_sentence(text)


if __name__ == '__main__':
    Mecab = CheckMemoryLeak()
    sentence = 'あ' * 2730
    print('input bytes:', sys.getsizeof(sentence))
    while True:
        Mecab.mecab_set_sentence(sentence)
        memoryUse = py.memory_info()[0]  # resident set size in bytes
        print('memory use:', memoryUse)
- result
input bytes: 5534
memory use: 13950976
... (about 10 calls to mecab_set_sentence)
memory use: 14221312
... (about 10 calls to mecab_set_sentence)
memory use: 14491648
... (after 30 seconds)
memory use: 2043158528
However, when the input is one character shorter, memory use stays flat:
sentence = 'あ' * 2729
- result
input bytes: 5532
memory use: 13950976
... (about 10 calls to mecab_set_sentence)
memory use: 14155776
... (after 30 seconds)
memory use: 14155776
... (after 10 minutes)
memory use: 14155776
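The comparison above can also be run with a bounded loop instead of `while True`, which makes the two cases easier to compare side by side. This is a minimal sketch, not part of the original report: `memory_growth` and `compare_inputs` are hypothetical helper names, and `compare_inputs` needs MeCab and psutil installed.

```python
def memory_growth(action, read_memory, iterations=1000):
    """Call action() repeatedly and return how much read_memory() grew."""
    before = read_memory()
    for _ in range(iterations):
        action()
    return read_memory() - before


def compare_inputs(iterations=1000):
    """Print RSS growth for the two input lengths used in this report."""
    # Imported here so memory_growth stays usable without these packages.
    import os
    import MeCab
    import psutil

    process = psutil.Process(os.getpid())
    lattice = MeCab.Lattice()  # same instance reused, as in the report
    for count in (2729, 2730):
        sentence = "あ" * count
        growth = memory_growth(
            lambda: lattice.set_sentence(sentence),
            lambda: process.memory_info().rss,
            iterations,
        )
        print("chars:", count, "rss growth in bytes:", growth)
```

Calling `compare_inputs()` should show near-zero growth for 2729 characters and steadily increasing growth for 2730 if the behavior described here is reproduced.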
Probable Cause
- The code does not check that the byte length of input_str is less than or equal to BUF_SIZE.
- The leak probably occurs when a string larger than BUF_SIZE is allocated after a buffer of BUF_SIZE bytes has already been allocated.
- BUF_SIZE, MIN_INPUT_BUFFER_SIZE, and MAX_INPUT_BUFFER_SIZE cannot be changed through a configuration file or command-line options; only input-buffer-size can be set that way.
Lines 42 to 53 in 3a07c4e
Temporary solution
- Edit BUF_SIZE
Lines 72 to 74 in 3a07c4e
- before
#define MIN_INPUT_BUFFER_SIZE 8192
#define MAX_INPUT_BUFFER_SIZE (8192*640)
#define BUF_SIZE 8192
- after
#define MIN_INPUT_BUFFER_SIZE 16384
#define MAX_INPUT_BUFFER_SIZE (16384*640)
#define BUF_SIZE 16384
- rebuild and reinstall
make
sudo make install
Proposed solution
The underlying problem is that execution continues silently even while memory is leaking.
- Warn when the input string exceeds BUF_SIZE, in the Python wrapper as well.
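Until such a check exists in MeCab itself, a caller-side guard could approximate it. The sketch below is only an illustration: `set_sentence_checked` is a hypothetical helper, not part of the MeCab Python wrapper, and it assumes the sentence reaches MeCab as UTF-8. The `limit` is a parameter because the leaking input in this report (2730 'あ' characters) is 8190 bytes in UTF-8, so the effective threshold may be slightly below BUF_SIZE.

```python
import warnings

BUF_SIZE = 8192  # assumed to match the BUF_SIZE define in the MeCab build


def set_sentence_checked(lattice, text, limit=BUF_SIZE):
    """Hypothetical guard: warn if text reaches limit bytes, then set it."""
    encoded_length = len(text.encode("utf-8"))
    if encoded_length >= limit:
        warnings.warn(
            "sentence is %d bytes in UTF-8, limit is %d"
            % (encoded_length, limit),
            RuntimeWarning,
        )
    # Any object with a set_sentence method works, e.g. MeCab.Lattice().
    lattice.set_sentence(text)
    return encoded_length
```

With the script above, `self.lattice.set_sentence(text)` would become `set_sentence_checked(self.lattice, text, limit=8190)`; the value 8190 is taken from the leaking case in this report and is not a documented constant.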