mandiant/stringsifter

numpy.core._exceptions.MemoryError

OevreFlataeker opened this issue · 2 comments

Running

flarestrings <big 2 GB file> | rank_strings

crashes with an out-of-memory exception:

(python37) daubsi@bigigloo:/tmp$ flarestrings <bigfile> | rank_strings
Traceback (most recent call last):
  File "/home/daubsi/.conda/envs/python37/bin/rank_strings", line 11, in <module>
    load_entry_point('stringsifter', 'console_scripts', 'rank_strings')()
  File "/tmp/stringsifter/stringsifter/rank_strings.py", line 138, in argmain
    args.scores, args.batch)
  File "/tmp/stringsifter/stringsifter/rank_strings.py", line 31, in main
    input_strings.readlines()])
numpy.core._exceptions.MemoryError: Unable to allocate array with shape (19412352,) and data type <U45056

More than 10 GB of memory is free on the machine.
Running on Ubuntu 14.04 with Python 3.7.4
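
For scale, the size of the failed allocation can be estimated from the shape and dtype reported in the traceback. NumPy's fixed-width Unicode dtype (`<U45056`) stores 4 bytes per character and pads every element to the full width, so a single very long string sets the size of every row. The sketch below is plain arithmetic on the numbers in the error message; the variable names are illustrative and not taken from stringsifter.

```python
# Numbers copied from the MemoryError message in the traceback.
num_strings = 19_412_352      # array shape: one element per input line
chars_per_element = 45_056    # dtype <U45056: width set by the longest string
bytes_per_char = 4            # NumPy stores fixed-width Unicode as UCS-4

# Every element is padded to the full width, so the total is a simple product.
bytes_per_element = chars_per_element * bytes_per_char
required_bytes = num_strings * bytes_per_element

free_bytes = 10 * 1024**3     # the "more than 10 GB free" from the report

print(f"per element: {bytes_per_element} bytes")
print(f"total:       {required_bytes / 1e12:.2f} TB")
print(f"shortfall:   {required_bytes / free_bytes:.0f}x the free memory")
```

The product comes to roughly 3.5 TB, which would explain why the allocation fails even with 10 GB free.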

Is the tool supposed to work on smaller files only? The standard "strings" utility had no issue getting the strings from the binary.

The stringsifter suite (flarestrings and rank_strings) is intended for binary triage of executable files suspected of being malware. For such files, a few tens of MB is considered large; a 2 GB file would be an extraordinary case. However, if large-file analysis is a common use case, please let us know and we'll investigate it.

A possible workaround would be to split the large binary into smaller segments and process them separately, printing the string scores using the rank_strings -s option. Using the scores, the ranked output could be reassembled with a separate script or spreadsheet to form a report covering the whole file.
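
The reassembly step could be sketched roughly as follows. This is a minimal illustration, not part of stringsifter: it assumes each segment's scored output has one `score,string` pair per line (check the actual output format of your version), that higher scores rank first, and that scores from different segments are comparable. The helper names `parse_scored_line` and `merge_scored` are hypothetical.

```python
from typing import Iterable, List, Tuple


def parse_scored_line(line: str) -> Tuple[float, str]:
    """Split one 'score,string' line (assumed format) into its parts."""
    # Split on the first comma only, since the string itself may contain commas.
    score_text, _, string = line.rstrip("\n").partition(",")
    return float(score_text), string


def merge_scored(segment_outputs: Iterable[Iterable[str]]) -> List[Tuple[float, str]]:
    """Combine scored lines from several segments into one ranked list."""
    merged = []
    for lines in segment_outputs:
        for line in lines:
            if not line.strip():
                continue  # skip blank lines
            merged.append(parse_scored_line(line))
    # Assumption: a higher score means a more relevant string.
    merged.sort(key=lambda pair: pair[0], reverse=True)
    return merged
```

Each element of `segment_outputs` can be an open file holding the saved scored output for one segment, for example `merge_scored(open(p) for p in paths)`. Strings that straddle a segment boundary may be cut in two, so overlapping the segments slightly is worth considering.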

Thanks. I was trying to analyze a memory dump file, as I usually do with the standard "strings" command, hence the large size. Will try your suggestion!