gwillem/magento-malware-scanner

Py2 doesn't handle the magesec rules

Closed this issue · 5 comments

[*] https://magesec.org:443 "GET /download/yara-standard.yar HTTP/1.1" 200 159703
Traceback (most recent call last):
  File "/usr/local/bin/mwscan", line 9, in <module>
    load_entry_point('mwscan==20180307.122431', 'console_scripts', 'mwscan')()
  File "/usr/local/lib/python2.7/dist-packages/mwscan/scan.py", line 243, in main
    rules, whitelist = provider(args=args).get()
  File "/usr/local/lib/python2.7/dist-packages/mwscan/ruleset.py", line 124, in get
    rawrules = self.get_rules()
  File "/usr/local/lib/python2.7/dist-packages/mwscan/ruleset.py", line 46, in get_rules
    return self._recursive_fetch(self.rules_url)
  File "/usr/local/lib/python2.7/dist-packages/mwscan/ruleset.py", line 145, in _recursive_fetch
    data = self._httpget(url)
  File "/usr/local/lib/python2.7/dist-packages/mwscan/ruleset.py", line 109, in _httpget
    return resp.content.decode()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 154508: ordinal not in range(128)

Plus one on this bug, I just ran into it today.

Traceback (most recent call last):
  File "/usr/bin/mwscan", line 5, in <module>
    from pkg_resources import load_entry_point
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 2655, in <module>
    working_set.require(__requires__)
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 648, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 546, in resolve
    raise DistributionNotFound(req)
pkg_resources.DistributionNotFound: requests>=0.8.2

I was able to fix the issue by installing python-requests and reinstalling mwscan:

yum install python-requests

pip install --no-cache-dir --upgrade mwscan

This was on a CentOS 6 machine, for anybody else who may run into this issue.

Thanks @curiouscrusher! Would you mind sending in a PR for the docs? See Troubleshooting under docs/usage.md

I have upgraded everything but am still getting this error from cron:

Traceback (most recent call last):
  File "/usr/local/bin/mwscan", line 9, in <module>
    load_entry_point('mwscan==20180307.122431', 'console_scripts', 'mwscan')()
  File "/usr/local/lib/python2.7/dist-packages/mwscan/scan.py", line 243, in main
    rules, whitelist = provider(args=args).get()
  File "/usr/local/lib/python2.7/dist-packages/mwscan/ruleset.py", line 124, in get
    rawrules = self.get_rules()
  File "/usr/local/lib/python2.7/dist-packages/mwscan/ruleset.py", line 46, in get_rules
    return self._recursive_fetch(self.rules_url)
  File "/usr/local/lib/python2.7/dist-packages/mwscan/ruleset.py", line 145, in _recursive_fetch
    data = self._httpget(url)
  File "/usr/local/lib/python2.7/dist-packages/mwscan/ruleset.py", line 109, in _httpget
    return resp.content.decode()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 158905: ordinal not in range(128)

This error still persists, but now for cached rulesets.

Traceback (most recent call last):
  File "/opt/mwscan/bin/mwscan", line 11, in <module>
    sys.exit(main())
  File "/opt/mwscan/lib/python3.5/site-packages/mwscan/scan.py", line 243, in main
    rules, whitelist = provider(args=args).get()
  File "/opt/mwscan/lib/python3.5/site-packages/mwscan/ruleset.py", line 139, in get
    rawrules = self.get_rules()
  File "/opt/mwscan/lib/python3.5/site-packages/mwscan/ruleset.py", line 48, in get_rules
    rawrules = self._recursive_fetch(self.rules_url)
  File "/opt/mwscan/lib/python3.5/site-packages/mwscan/ruleset.py", line 160, in _recursive_fetch
    data = self._httpget(url)
  File "/opt/mwscan/lib/python3.5/site-packages/mwscan/ruleset.py", line 99, in _httpget
    mtime, cachedcontent = self._get_cache_timestamp_content(cachefile)
  File "/opt/mwscan/lib/python3.5/site-packages/mwscan/ruleset.py", line 89, in _get_cache_timestamp_content
    cachedcontent = fh.read()
  File "/opt/mwscan/lib/python3.5/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 159242: ordinal not in range(128)

Problem starts here:

if resp.status_code == 200:
    with open(cachefile, 'wb') as fh:
        fh.write(resp.content)

    # py3 vs py2
    if type(resp.content) is bytes:
        return resp.content.decode('utf-8', errors='ignore')
    else:
        return resp.content

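The fresh download is decoded as UTF-8, but the cache file holds the raw bytes and is later read back in text mode with the default (here ASCII) codec. The ruleset should either be saved already decoded or be decoded when it is read back from the cache.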
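
A rough sketch of the second option (decode on read), assuming the cache keeps storing the raw bytes exactly as `fh.write(resp.content)` does today. The helper names `decode_rules`, `write_cache`, and `read_cache` are hypothetical and not part of mwscan; the actual fix may well look different.

```python
def decode_rules(raw):
    """Return rule text, decoding bytes explicitly as UTF-8.

    Mirrors the py3-vs-py2 branch in the snippet above: bytes are
    decoded, anything already decoded is passed through unchanged.
    """
    if isinstance(raw, bytes):
        return raw.decode('utf-8', errors='ignore')
    return raw


def write_cache(cachefile, raw):
    """Store the raw response bytes (hypothetical helper)."""
    with open(cachefile, 'wb') as fh:
        fh.write(raw)


def read_cache(cachefile):
    """Read the cache in binary mode and decode it the same way as a
    fresh download, so the locale's default codec is never involved."""
    with open(cachefile, 'rb') as fh:
        return decode_rules(fh.read())
```

Opening the cache with `'rb'` rather than in text mode is the important part: a text-mode `open()` without an explicit `encoding` falls back to the locale's preferred encoding, which is ASCII in many cron environments and would explain why the error only shows up there.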