hillu/go-yara

Linux Process Scan Question

ozanh opened this issue · 18 comments

ozanh commented

Hello,
Thank you for the great work. I was able to compile go-yara statically under Linux and Windows (using msys2), and it works.

I am new to YARA, and my question is about scanning processes using the yara executable or go-yara. Both Scanner.ScanProc and yr_scanner_scan_proc show the same behavior on my laptop (Ubuntu 18.04 amd64): the RSS of the scanned (target) process grows almost up to its VSS. This makes it impossible to scan all available processes, because I run out of memory. It happens even with the simplest YARA rule, one that just searches for a string.

From what I see in YARA's Windows code, it does not scan uncommitted memory pages of a process, but on Linux the story is different.

I really need guidance on how to scan all processes without crashing Linux systems. One can try scanning chrome/firefox/Xorg processes and watch with top/htop how their memory usage changes.

hillu commented

The problem lies within YARA itself: Virtual memory addresses are determined via /proc/$PID/maps and then the memory contents are copied from /proc/$PID/mem before being scanned. (I had the idea to use mmap there, but the /proc filesystem only supports access via regular read()/write() operations.)

I am not sure why this would increase the RSS; perhaps it has to do with large uninitialized memory-mapped regions?
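
The /proc/$PID/maps lookup described above can be illustrated with a small parser. This is a minimal sketch in Go, not YARA's actual C implementation; the `Mapping` type and `ParseMapsLine` function are hypothetical names introduced here for illustration only.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Mapping holds the fields of one /proc/$PID/maps line.
// (Hypothetical type, not part of YARA or go-yara.)
type Mapping struct {
	Start, End uint64 // virtual address range [Start, End)
	Perms      string // e.g. "r-xp"; 4th character is 'p' (private) or 's' (shared)
	Offset     uint64 // offset into the backing file
	Dev        string // "major:minor" of the backing device
	Inode      uint64 // 0 for anonymous mappings
	Path       string // backing file or pseudo-name such as "[heap]"; may be empty
}

// ParseMapsLine parses a single line in the /proc/$PID/maps format.
func ParseMapsLine(line string) (Mapping, error) {
	f := strings.Fields(line)
	if len(f) < 5 {
		return Mapping{}, fmt.Errorf("malformed maps line: %q", line)
	}
	addr := strings.SplitN(f[0], "-", 2)
	if len(addr) != 2 {
		return Mapping{}, fmt.Errorf("malformed address range: %q", f[0])
	}
	start, err := strconv.ParseUint(addr[0], 16, 64)
	if err != nil {
		return Mapping{}, err
	}
	end, err := strconv.ParseUint(addr[1], 16, 64)
	if err != nil {
		return Mapping{}, err
	}
	offset, err := strconv.ParseUint(f[2], 16, 64)
	if err != nil {
		return Mapping{}, err
	}
	inode, err := strconv.ParseUint(f[4], 10, 64)
	if err != nil {
		return Mapping{}, err
	}
	m := Mapping{Start: start, End: end, Perms: f[1], Offset: offset, Dev: f[3], Inode: inode}
	if len(f) > 5 {
		// Simplification: runs of whitespace inside a path are collapsed.
		m.Path = strings.Join(f[5:], " ")
	}
	return m, nil
}

func main() {
	// Sample lines in the maps format, taken from the htop listing in this thread.
	lines := []string{
		"55b6b742f000-55b6b74b3000 rw-p 00000000 00:00 0                          [heap]",
		"7f9d81f8b000-7f9d8218b000 ---p 001e7000 103:03 4860186                   /lib/x86_64-linux-gnu/libc-2.27.so",
	}
	for _, l := range lines {
		m, err := ParseMapsLine(l)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%x-%x %s %d KiB %q\n", m.Start, m.End, m.Perms, (m.End-m.Start)/1024, m.Path)
	}
}
```

A scanner that copies every such range from /proc/$PID/mem reads End-Start bytes per mapping, regardless of how much of the range is actually resident.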

ozanh commented

I guess so (uninitialized memory-mapped regions), but I cannot do a full process scan until I find a way to safely distinguish uninitialized memory-mapped regions. I also tried scanning processes with some proprietary tools that use go-yara (I used the strings command to find the included libraries), and they do not blow up the memory; perhaps they just scan the heap and stack segments. I will dig deeper and share my findings if I can.
I would be glad if you kept this issue open to attract some interest.

hillu commented

If the problem has something to do with uninitialized mmap'd regions, I think I may have found a solution / workaround.

hillu commented

Could you paste the contents of /proc/$PID/maps of one of those processes that would end up growing and exhausting all virtual memory when scanned with YARA?

ozanh commented

To reproduce the problem I start an htop executable whose VIRT is 36.5M, RES is 6.3K, and SHR is 4K. htop uses many shared libraries, as shown below, and has small memory requirements. Alternatively, chromium/firefox can be scanned as well. Glad to hear you have a workaround 👍

I was considering skipping memory-mapped shared libraries (which have inode numbers), because skipping shared files does not consume extra physical memory and the files can be scanned individually with a file scan. I am not sure about this, though. PS: /proc/$PID/map_files contains symlinks to the files that are mapped into memory regions.

verigraf@verigraf:~$ htop & echo $!
[1] 12209
12209
verigraf@verigraf:~$ cat /proc/12209/maps
55b6b6e07000-55b6b6e2f000 r-xp 00000000 103:03 1966093                   /usr/bin/htop
55b6b702f000-55b6b7031000 r--p 00028000 103:03 1966093                   /usr/bin/htop
55b6b7031000-55b6b7034000 rw-p 0002a000 103:03 1966093                   /usr/bin/htop
55b6b7034000-55b6b7035000 rw-p 00000000 00:00 0 
55b6b742f000-55b6b74b3000 rw-p 00000000 00:00 0                          [heap]
7f9d81760000-7f9d81ba0000 r--p 00000000 103:03 1975092                   /usr/lib/locale/locale-archive
7f9d81ba0000-7f9d81ba3000 r-xp 00000000 103:03 4860216                   /lib/x86_64-linux-gnu/libdl-2.27.so
7f9d81ba3000-7f9d81da2000 ---p 00003000 103:03 4860216                   /lib/x86_64-linux-gnu/libdl-2.27.so
7f9d81da2000-7f9d81da3000 r--p 00002000 103:03 4860216                   /lib/x86_64-linux-gnu/libdl-2.27.so
7f9d81da3000-7f9d81da4000 rw-p 00003000 103:03 4860216                   /lib/x86_64-linux-gnu/libdl-2.27.so
7f9d81da4000-7f9d81f8b000 r-xp 00000000 103:03 4860186                   /lib/x86_64-linux-gnu/libc-2.27.so
7f9d81f8b000-7f9d8218b000 ---p 001e7000 103:03 4860186                   /lib/x86_64-linux-gnu/libc-2.27.so
7f9d8218b000-7f9d8218f000 r--p 001e7000 103:03 4860186                   /lib/x86_64-linux-gnu/libc-2.27.so
7f9d8218f000-7f9d82191000 rw-p 001eb000 103:03 4860186                   /lib/x86_64-linux-gnu/libc-2.27.so
7f9d82191000-7f9d82195000 rw-p 00000000 00:00 0 
7f9d82195000-7f9d82332000 r-xp 00000000 103:03 4860219                   /lib/x86_64-linux-gnu/libm-2.27.so
7f9d82332000-7f9d82531000 ---p 0019d000 103:03 4860219                   /lib/x86_64-linux-gnu/libm-2.27.so
7f9d82531000-7f9d82532000 r--p 0019c000 103:03 4860219                   /lib/x86_64-linux-gnu/libm-2.27.so
7f9d82532000-7f9d82533000 rw-p 0019d000 103:03 4860219                   /lib/x86_64-linux-gnu/libm-2.27.so
7f9d82533000-7f9d82558000 r-xp 00000000 103:03 4854917                   /lib/x86_64-linux-gnu/libtinfo.so.5.9
7f9d82558000-7f9d82758000 ---p 00025000 103:03 4854917                   /lib/x86_64-linux-gnu/libtinfo.so.5.9
7f9d82758000-7f9d8275c000 r--p 00025000 103:03 4854917                   /lib/x86_64-linux-gnu/libtinfo.so.5.9
7f9d8275c000-7f9d8275d000 rw-p 00029000 103:03 4854917                   /lib/x86_64-linux-gnu/libtinfo.so.5.9
7f9d8275d000-7f9d8278a000 r-xp 00000000 103:03 4854834                   /lib/x86_64-linux-gnu/libncursesw.so.5.9
7f9d8278a000-7f9d8298a000 ---p 0002d000 103:03 4854834                   /lib/x86_64-linux-gnu/libncursesw.so.5.9
7f9d8298a000-7f9d8298b000 r--p 0002d000 103:03 4854834                   /lib/x86_64-linux-gnu/libncursesw.so.5.9
7f9d8298b000-7f9d8298c000 rw-p 0002e000 103:03 4854834                   /lib/x86_64-linux-gnu/libncursesw.so.5.9
7f9d8298c000-7f9d829b5000 r-xp 00000000 103:03 4860116                   /lib/x86_64-linux-gnu/ld-2.27.so
7f9d82b94000-7f9d82b98000 rw-p 00000000 00:00 0 
7f9d82bb5000-7f9d82bb6000 r--p 00029000 103:03 4860116                   /lib/x86_64-linux-gnu/ld-2.27.so
7f9d82bb6000-7f9d82bb7000 rw-p 0002a000 103:03 4860116                   /lib/x86_64-linux-gnu/ld-2.27.so
7f9d82bb7000-7f9d82bb8000 rw-p 00000000 00:00 0 
7ffe3444e000-7ffe3446f000 rw-p 00000000 00:00 0                          [stack]
7ffe34494000-7ffe34497000 r--p 00000000 00:00 0                          [vvar]
7ffe34497000-7ffe34498000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0                  [vsyscall]

hillu commented

Skipping shared libraries would be problematic because their contents may have been overwritten in process virtual memory, without those changes hitting disk. This might (!) work for mappings that lack the p (private) flag, though.
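
The distinction drawn above, between private file mappings (whose pages may have been modified in memory only) and mappings without the p flag, could be expressed as a predicate on the permission string and inode from /proc/$PID/maps. This is a minimal sketch under that assumption; `MaySkipFileBacked` is a hypothetical name, and whether skipping shared mappings is actually safe remains unverified in this thread.

```go
package main

import "fmt"

// MaySkipFileBacked reports whether a mapping is file-backed (non-zero inode)
// and shared ('s' as the 4th permission character). Private mappings ('p')
// are never skipped, because their pages may differ from the file on disk.
// (Hypothetical helper, not part of YARA or go-yara.)
func MaySkipFileBacked(perms string, inode uint64) bool {
	if inode == 0 || len(perms) < 4 {
		return false
	}
	return perms[3] == 's'
}

func main() {
	fmt.Println(MaySkipFileBacked("r-xp", 4860186)) // private library mapping
	fmt.Println(MaySkipFileBacked("rw-s", 1966093)) // shared file mapping
	fmt.Println(MaySkipFileBacked("rw-p", 0))       // anonymous mapping
}
```

Pseudo-files that cannot be scanned on disk would presumably need separate handling before such a check could be relied on.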

Using a small C program (https://gist.github.com/hillu/fb07ee4b23b7700873d80f71710b4203), I noticed that Xorg processes seem to have large mappings to pseudo-files /SYSV00000000 (probably SysV shared memory?), mappings that are neither file-backed nor swap-backed and that have no page present in VM (e.g. 7b9f10021000-7b9f14000000 ---p 00000000 00:00 0).

I believe that mmap'd pages that have been modified (e.g. by patching library code after process initialization) should show up as F (or possibly s) in the output of my tool. So, we might get away with the following for file mappings:

  • Open file
  • Verify that device/inode match the entry from /proc/$PID/maps
  • If yes:
    • Duplicate mapping
    • Copy pages that are marked as "present" from /proc/$PID/mem
  • If no: fall back to previous behavior: copy the whole mapping from /proc/$PID/mem
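
The "present" check in the steps above can be sketched with the kernel's /proc/$PID/pagemap interface, which exposes one 64-bit entry per virtual page. The following Go sketch is an illustration, not the code from the gist or from YARA. It assumes a little-endian host, the names `DecodePagemapEntry`, `PagemapOffset` and `ReadPagemap` are hypothetical, and how the flags correspond to the F and s markers of the tool mentioned above is not confirmed here.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"os"
	"unsafe"
)

// Flag bits of a /proc/$PID/pagemap entry, per the kernel's pagemap documentation.
const (
	pagePresent  uint64 = 1 << 63 // page is resident in RAM
	pageSwapped  uint64 = 1 << 62 // page is in swap
	pageFileAnon uint64 = 1 << 61 // page is file-mapped or shared-anonymous
)

// PageState is the decoded form of one pagemap entry (hypothetical type).
type PageState struct {
	Present, Swapped, FileOrSharedAnon bool
}

// DecodePagemapEntry extracts the flag bits from a raw pagemap entry.
func DecodePagemapEntry(e uint64) PageState {
	return PageState{
		Present:          e&pagePresent != 0,
		Swapped:          e&pageSwapped != 0,
		FileOrSharedAnon: e&pageFileAnon != 0,
	}
}

// PagemapOffset returns the byte offset of the entry for vaddr in the pagemap file.
func PagemapOffset(vaddr, pageSize uint64) int64 {
	return int64(vaddr/pageSize) * 8
}

// ReadPagemap returns the state of every page in [start, end) of process pid.
// start and end are expected to be page-aligned.
func ReadPagemap(pid int, start, end, pageSize uint64) ([]PageState, error) {
	if end <= start {
		return nil, nil
	}
	f, err := os.Open(fmt.Sprintf("/proc/%d/pagemap", pid))
	if err != nil {
		return nil, err
	}
	defer f.Close()
	n := (end - start) / pageSize
	raw := make([]byte, n*8)
	if _, err := f.ReadAt(raw, PagemapOffset(start, pageSize)); err != nil {
		return nil, err
	}
	states := make([]PageState, n)
	for i := range states {
		// Assumption: entries are in host byte order and the host is little-endian.
		states[i] = DecodePagemapEntry(binary.LittleEndian.Uint64(raw[i*8:]))
	}
	return states, nil
}

func main() {
	buf := []byte{1} // touched, so the page holding it should be resident
	pageSize := uint64(os.Getpagesize())
	addr := uint64(uintptr(unsafe.Pointer(&buf[0]))) &^ (pageSize - 1)
	states, err := ReadPagemap(os.Getpid(), addr, addr+pageSize, pageSize)
	if err != nil {
		fmt.Println("pagemap not readable here:", err)
		return
	}
	fmt.Printf("own page at %#x: %+v\n", addr, states[0])
}
```

A scanner following the list above would copy only pages whose entry is marked present (and possibly swapped) from /proc/$PID/mem and take the rest from the duplicated file mapping.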

hillu commented

Here are a few more ideas / notes:

  • mmap'd device files (/dev/dri/card0 etc.) should be ignored altogether; /dev/zero etc. might be an exception, though.
  • mappings with no real file (see previous comment) should be replaced with a zeroed area (or a /dev/zero mapping) of equal size.
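
The two notes above amount to a per-mapping decision, which could be sketched as follows. This is an illustration only: `Action` and `PlanMapping` are hypothetical names, and matching device files by a /dev/ path prefix is a simplification (a real implementation would more likely stat the file and check its type).

```go
package main

import (
	"fmt"
	"strings"
)

// Action is what a scanner does with one mapping (hypothetical type).
type Action int

const (
	CopyFromMem Action = iota // read the contents from /proc/$PID/mem
	Skip                      // ignore the mapping entirely
	ZeroFill                  // substitute a zeroed buffer of equal size
)

func (a Action) String() string {
	return [...]string{"CopyFromMem", "Skip", "ZeroFill"}[a]
}

// PlanMapping decides how to treat a mapping, given its path from
// /proc/$PID/maps and whether any of its pages is present or swapped.
func PlanMapping(path string, anyPagePresent bool) Action {
	switch {
	case path == "/dev/zero":
		return CopyFromMem // the possible exception among device files
	case strings.HasPrefix(path, "/dev/") && !strings.HasPrefix(path, "/dev/shm/"):
		return Skip // assumption: everything else under /dev is a device file
	case path == "" && !anyPagePresent:
		return ZeroFill // no backing file and nothing resident
	default:
		return CopyFromMem
	}
}

func main() {
	fmt.Println(PlanMapping("/dev/dri/card0", true))
	fmt.Println(PlanMapping("", false))
	fmt.Println(PlanMapping("[heap]", true))
}
```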

ozanh commented

Thank you Hilko, I will try to understand all of this, and I need to run some tests before implementing it. It will take some time for me to digest everything 😄

hillu commented

I haven't understood every detail myself yet, but I think that there's quite a bit of potential to make libyara's process scanning feature considerably safer.

hillu commented

@ozanh, please have a look at the changes I have made for https://github.com/hillu/yara/tree/proc-linux-improvements and tell me how much of an improvement they cause in your use-case.

ozanh commented

Hi @hillu, I built your branch, but I got the following error during make check and also during process scanning. To be sure, I tried this on both ubuntu18 amd64 and ubuntu20 arm64. Am I missing something?

==================================
   yara 4.1.0: ./test-suite.log
==================================

# TOTAL: 15
# PASS:  14
# SKIP:  0
# XFAIL: 0
# FAIL:  1
# XPASS: 0
# ERROR: 0

.. contents:: :depth: 2

FAIL: test-rules
================

yr_rules_scan_proc: Got unexpected error 4
--- PASS 1 ---
FAIL test-rules (exit status: 1)

ozan@ubuntu:~/go/yara/hillu$ sudo yara rule.yar 63811
error scanning 63811: error: 4

hillu commented

Oh, that was quick. Thanks!

My dev environment is Debian/unstable (amd64). The unit test works fine here, as does a scan of a simple bash process using a simple single-string rule. I have now encountered the same problem while scanning firefox, chromium, Xorg processes, though. Need to investigate, will get back to you.

hillu commented

@ozanh Does the last commit (65cc1f95a8eab93d98d0a947139687ed822a39bf) fix your issues?

ozanh commented

@hillu I tested under ubuntu20 arm64 with SSL support and all YARA modules enabled, and it works great 👍 with a trivial firefox/Xorg process scan 🥳 👌 🎉; resident memory is almost unaffected. I have not checked what you magically did, but I will check the commits and do other tests later. Vielen Dank (thank you very much).

hillu commented

@ozanh My fix consists of adding code to deal with a case I hadn't thought of before: file mappings may extend past the end of a file, and accessing such mappings leads to an access violation and ultimately to a YARA_COULD_NOT_MAP_FILE error.
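
The case described above can be illustrated by computing how much of a mapping is actually backed by file content. This is a minimal sketch, not the code from the commit; `BackedLen` is a hypothetical helper, and the numbers in `main` are made up for illustration.

```go
package main

import "fmt"

// BackedLen returns how many bytes of the mapping [mapStart, mapEnd), which
// starts at fileOffset in a file of fileSize bytes, are backed by file content.
// Anything beyond that length lies past the end of the file.
// (Hypothetical helper, not part of YARA or go-yara.)
func BackedLen(mapStart, mapEnd, fileOffset, fileSize uint64) uint64 {
	if mapEnd <= mapStart || fileOffset >= fileSize {
		return 0
	}
	mapLen := mapEnd - mapStart
	avail := fileSize - fileOffset
	if avail < mapLen {
		return avail
	}
	return mapLen
}

func main() {
	// Hypothetical example: a 16 KiB mapping of a file that is only 10 KiB long.
	n := BackedLen(0x1000, 0x5000, 0, 0x2800)
	fmt.Printf("%d of %d mapped bytes are backed by the file\n", n, 0x5000-0x1000)
}
```

A scanner that duplicates a file mapping would then only touch the backed part of the range and treat the remainder separately, for example by reading it from /proc/$PID/mem or zero-filling it.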

hillu commented

Let's see where VirusTotal/yara#1470 gets us. This is probably post-YARA-4.1.0 material, but since the patch is self-contained, you'd be able to get away with simply replacing libyara/proc/linux.c in local builds if you need it before YARA 4.2.0.

ozanh commented

I will probably need it before v4.2.0. I tested go-yara with the v4.1.0 master branch under arm64, but go-yara failed as expected because it currently supports 4.0.4. Replacing linux.c in older versions should be OK, I guess, since that file has not received any updates since your ptrace fix. I am subscribed to the PR to see what happens. Thank you.

hillu commented

My PR fixing the issue within YARA itself is awaiting review, so I think we can close this issue here.