hillu/go-yara

Repeatedly loading rules without restarting the program keeps increasing memory usage

vectian opened this issue · 12 comments

I use go-yara in my program. I want to reload a batch of rules on command, without restarting the program. Monitoring memory usage with `top`, I see that after each reload the memory keeps increasing and is never released.

hillu commented

My best guess is that the Go garbage collector does not collect the Rules object (which would cause yr_rules_destroy to be called), but without a reproducer, it is hard to tell.

Is the size of the memory leak similar to the size of the compiled ruleset?

hillu commented

Can you share some code sample that demonstrates the behavior?

sample: https://github.com/qtmee/yara-memory-leak.git

Browser enter http://localhost:3000/i to simulate the recreate command

hillu commented

Does adding runtime.GC() after replacing the Rules instance fix the problem for you?

How about explicitly calling s.Rules.Destroy() right before replacing it?

I have tried both approaches; memory usage still does not drop.

hillu commented

I have added a few endpoints to your sample in order to trigger GC and to log malloc statistics using the malloc_stats(3) function. Here's what I have found:
This is the initial state (before any rules have been parsed):

Arena 0:
system bytes     =     135168
in use bytes     =       4752
Arena 1:
system bytes     =     135168
in use bytes     =       2896
Arena 2:
system bytes     =     135168
in use bytes     =       2896
Arena 3:
system bytes     =     135168
in use bytes     =       2896
Arena 4:
system bytes     =     135168
in use bytes     =       2896
Total (incl. mmap):
system bytes     =     675840
in use bytes     =      16336
max mmap regions =          0
max mmap bytes   =          0

It turns out that having YARA parse a ruleset even just once will considerably grow Arena 4:

Arena 4:
system bytes     =   16281600
in use bytes     =   15946016

Forcing GC will cause the "in use bytes" value to drop, but the "system bytes" value will stay at its previous level.

Arena 5:
system bytes     =   16281600
in use bytes     =      48112

Note that for some reason, Arena 4 is now called Arena 5, but the "system bytes" value stays the same. (I don't know enough about the GNU libc heap implementation to explain this.)

Parsing the ruleset multiple times without forcing a GC will cause the rulesets to be allocated to multiple arenas.

Also note that 45216 bytes have apparently not been freed.

malloc(3) keeps freed buffers around for subsequent reuse; that is one reason the "system bytes" values do not drop. What looks like a HUGE memory leak may still contain a genuine leak in the code, but it is not as large as one would think. Most of the apparent growth comes from the GNU libc malloc(3) implementation and, probably, heap fragmentation.

The Dgraph developers seem to have run into similar effects and their solution is to use jemalloc. One can override the default malloc(3) implementation with jemalloc using LD_PRELOAD on Linux:

: ; LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 ./main

Of course this is not a proper solution but it may help you in the short term. I'll try to figure out how to fix this with the standard malloc(3).

Thank you very much for your answer. I will keep an eye on this problem and try the workaround you have suggested.

hillu commented

I have taken a closer look at the various malloc tuning parameters documented in the GNU libc manual, but they don't seem to make any difference.

Rather than setting LD_PRELOAD, you can link jemalloc in at build time, though:

; go build -ldflags="-linkmode=external -extldflags=-ljemalloc"

Do I need to modify the go-yara source code directly, i.e. change the malloc calls used in cgo to je_calloc? I have tried that but failed. Could you help me with this? Thank you.

hillu commented

No, you don't have to modify anything. Just install libjemalloc-dev from your distribution and use the -ldflags parameter I gave above.

This problem has been solved. We no longer compile the YARA rules directly at load time; the rules are precompiled, and the precompiled rules are loaded on each reload.
