gwillem/magento-malware-scanner

Move build artefacts out of repo

Closed this issue · 18 comments

Note to mwscan users: update your install, or you will not get new rules anymore!

  • The grep URL has changed from git.io/mwscan.txt to mwscan.s3.amazonaws.com/mwscan.txt
  • If using the mwscan package, try sudo pip3 install --upgrade mwscan (or sudo pip install --upgrade mwscan).

See the updated docs for sample crons.

What is this change about?

Let the CI pipeline build the signatures, instead of including them in the repo (redundantly).

Pro: This will unclutter many PRs
Con: Installation instructions need to change, people need to update their mwscan code as the URL is hardcoded and currently points to github.

Plan:

  • Instruct Travis to build rules and upload them to S3 upon commit to master. Done: https://mwscan.s3.amazonaws.com/mwscan.yar
  • Change built rules name to mwscan.txt and mwscan.yar (from all-confirmed).
  • Update all references to all-confirmed, eg in travis test scripts
  • Change URL in ruleset.py
  • Update basic instructions/URL for grep usage
  • Do not bundle rules anymore in pip/deb package and remove DEFAULT_RULES_FILE
  • Make mwscan ruleset the default one
  • Ensure that scanning continues, even if S3 is unreachable (except of course when there is no cached version of the rules)
  • Add build/* to .gitignore so PRs will not clutter any further.
  • Verify that mwscan without arguments still does a sane thing (ie download the latest default ruleset and use that)
  • Update screenshot in docs
  • Release new pip package
  • Add wildcard rule that will fail on everything, to warn sysadmins to upgrade.

Mwscan users (e.g. Byte) should:

  • Once steps above are completed, install new pip package and/or build new deb with new S3 rule URL

@timmuller @thomasbrockmeier did I forget anything? BTW I think you would only have to rebuild a new deb package, the Hypernode mwscan cron looks like it can stay the same.

To warn users of the old rules URL, I could update the Github rules file to add a single signature with name "UPDATE YOUR MWSCAN PACKAGE" which would trigger on all files.

What to do.. Fail fast and hard? Or let unsuspecting sysadmins silently use the deprecated rules files for years?

@jeroenvermeulen what do you think?

I vote for "fail fast and hard", because I hate false security.

To build the rules, we use tools/validate_signatures.py, as CONTRIBUTING.md instructs. I think that is not the most logical name for the command to build the rules. However it is a logical name to test the rules. Or am I misunderstanding something?

One other thing: at this moment we are maintaining our rules in the master branch in our fork, is this the best way? How will we push to S3? The changes compared to your master are only because of the rules.

I think that is not the most logical name for the command to build the rules.

Indeed :) However, validate_signatures.py itself calls build_rules.py before validation.
NB, because the new Yara rules file will be build by Travis (and the existing .yar will be removed from this repository), there is no need to build rules when contributing. It would still be useful to test the rules locally though.

One other thing: at this moment we are maintaining our rules in the master branch in our fork, is this the best way

Absolutely! You would still be able to use mwscan -s magehost. But upstream PRs for signatures are still most welcome though.

Does that answer your question? The main objective this issue tries to solve, is to deduplicate the rules in this repo and to streamline merging of PRs.

@gwillem Is it the plan we send a PR every time we update rules/magehost.yar or rules/magehost_whitelist.json? Will you configure your travis to deploy to S3, or is it better we use an own S3 bucket? Testing with:

Wildcard rules were added for the (now deprecated) all-confirmed.txt & .yar. Hopefully they don't cause too much inconvenience. On the upside: mwscan maintenance has been much simplified with this change!

https://github.com/nbs-system/php-malware-finder

I tested the magento scanner and this one.
this second brought better results!
both using Yara.

@albertobraschi Thanks, will give it a try. If it works well I will add it to the list of scanners we run every night at MageHost.

@albertobraschi These projects have different design goals: NBS's PMF optimizes for coverage while mwscan optimizes for accuracy. So PMF tends to have more hits, while MWscan has fewer false positives. But perhaps you want to share which malware was missed by mwscan so we can include it?

@jeroenvermeulen FYI the NBS ruleset is supported by mwscan (-s nbs) and runs faster on stock Yara, see https://github.com/nbs-system/php-malware-finder/issues/45

@albertobraschi Tested NBS on our customers' Magento installs. Lots of false positives as @gwillem predicted. I did some work to whitelist vanilla Magento installs, but the guys don't accept my PR, they don't trust me.
I think the NBS will not work for us.

I have used both. mwscan and nbs.
mwscan had few false positives, on the other hand, it did not cover all malware.
nbs had a few false positives, but it covered all the malwares.

this is the log from nbs
https://gist.github.com/albertobraschi/c77f060a04c5c4f9486c60751ad05cd7
and in mwscan were found thing of some 3 files... more or less.

I am no longer with the files to test again and do a comparison.
I'm sorry.

@albertobraschi obfuscated PHP isn't necessarily malware.. unfortunately, many legitimate extension vendors use obfuscation techniques to "protect" their code.
Anyway, this issue was about a different topic. Please reopen a new issue if you want to discuss further.