stacscan/stacs

Invalid start byte

Closed this issue · 4 comments

Hi @darkarnium,

We got a report on EMBA about a failed scan with a STACS error trace here. I can reproduce it with the attached file, which comes from the deep extractor and is somehow corrupted.

┌──(m1k3㉿emba)-[~/github-repos/emba_forked]
└─$ stacs --rule-pack /home/m1k3/github-repos/emba_forked/external/stacs-rules/credential.json --skip-unprocessable /home/m1k3/firmware-stuff/emba_logs_test/firmware/patool_extraction/470ABBI4C0.bin_binwalk_extracted/_470ABBI4C0.bin.extracted/189830_binwalk_extracted/_189830.extracted/1393A94 
2022-11-21 14:00:16,209 - 1507053 - [INFO] STACS running with 10 threads
2022-11-21 14:00:16,209 - 1507053 - [INFO] STACS uses libarchive (licenses may be found at https://github.com/libarchive/libarchive/blob/master/COPYING)
2022-11-21 14:00:16,209 - 1507053 - [INFO] STACS uses yara (licenses may be found at https://github.com/VirusTotal/yara-python/blob/master/LICENSE)
2022-11-21 14:00:16,209 - 1507053 - [INFO] Attempting to load rule pack from /home/m1k3/github-repos/emba_forked/external/stacs-rules/credential.json
2022-11-21 14:00:16,210 - 1507053 - [INFO] Using cache directory at /tmp/1669035616210497
2022-11-21 14:00:16,210 - 1507053 - [INFO] Attempting to get a list of files to scan from /home/m1k3/firmware-stuff/emba_logs_test/firmware/patool_extraction/470ABBI4C0.bin_binwalk_extracted/_470ABBI4C0.bin.extracted/189830_binwalk_extracted/_189830.extracted/1393A94
2022-11-21 14:00:16,279 - 1507053 - [INFO] Found 1 files for analysis
Traceback (most recent call last):
  File "/usr/local/bin/stacs", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/stacs/scan/entrypoint/cli.py", line 143, in main
    getattr(stacs.scan.scanner, scanner).run(targets, pack, workers=threads)
  File "/usr/local/lib/python3.10/dist-packages/stacs/scan/scanner/rules.py", line 222, in run
    findings.extend(future.result())
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.10/dist-packages/stacs/scan/scanner/rules.py", line 176, in matcher
    findings.extend(generate_findings(target, match))
  File "/usr/local/lib/python3.10/dist-packages/stacs/scan/scanner/rules.py", line 147, in generate_findings
    location = generate_location(target, offset)
  File "/usr/local/lib/python3.10/dist-packages/stacs/scan/scanner/rules.py", line 132, in generate_location
    line_number += fin.read(CHUNK_SIZE).count("\n")
  File "/usr/lib/python3.10/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 9752: invalid start byte

1393A94.zip

Hey @m-1-k-3,

Thanks for the report! I'll have a look into this and get back to you.

Hey there,

Just as an update: the cause of this issue is that the scanned data was identified as potentially being text, but it also contains binary data. The fix is relatively simple. We'll add a new exception handler and return the location of the finding in place of the line number, which is what we already do for "real" binary files.
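For anyone following along, here is a rough sketch of that idea. It is not the actual STACS patch. The function name `describe_location`, the return shape, and the `CHUNK_SIZE` value are all assumptions made for illustration. It reads in binary mode and decodes incrementally so that the offset stays a byte offset.

```python
import codecs

CHUNK_SIZE = 65536  # Assumed value; STACS may use a different chunk size.


def describe_location(path, offset):
    """Hypothetical helper: return ("line", n) when the bytes before
    `offset` decode as UTF-8, otherwise fall back to ("offset", offset)."""
    # An incremental decoder buffers multi-byte characters that are split
    # across chunk boundaries instead of failing on them.
    decoder = codecs.getincrementaldecoder("utf-8")()
    line_number = 1
    remaining = offset
    try:
        with open(path, "rb") as fin:
            while remaining > 0:
                chunk = fin.read(min(CHUNK_SIZE, remaining))
                if not chunk:
                    break  # Offset is past the end of the file.
                # Raises UnicodeDecodeError on invalid bytes such as 0xff.
                line_number += decoder.decode(chunk).count("\n")
                remaining -= len(chunk)
    except UnicodeDecodeError:
        # "Text" that contains binary data: report the byte offset,
        # the same way a real binary file would be reported.
        return ("offset", offset)
    return ("line", line_number)
```

One difference from the code in the traceback: this sketch only decodes the bytes before the offset, so binary data after the finding would not trigger the fallback.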

I'll get a bug-fix release out shortly.

STACS 0.4.14 has just been released, and it includes a fix for this issue. I cannot reproduce the error on the new version, but please let me know if you run into any problems with the fix :)

Thanks again for the report!

Thanks @darkarnium for the quick fix. I will integrate the new version into EMBA soon.