fox-it/log4j-finder

Scanning issue with very large zip files

dpritipalsingh opened this issue · 6 comments

I would like to report an issue on Windows, regarding the scanning of very large zip files.
This seems to be problematic for very large zip files, around 2-4 GB in size, that contain a large number of files and/or a very large jar file.
The tool seems to keep scanning for a couple of hours (on this zip file alone), after which I manually abort.
When the zip is extracted, the tool has no problem scanning the jar file(s).
All the zips are Oracle-related software.
As a workaround I use the -e (exclude) option to exclude these very large zip files.
I just wanted to tip you on this issue.
Perhaps you can add an option: do not scan zip files and/or disable the scanning of very large zip files.
I appreciate all the good work you guys have done.

Hi, thanks for reporting! Do you know your Python version, or are you running the binaries?

There is an issue with Python < 3.7 where ZipFile objects cannot seek in memory so the workaround was to read the file into memory. I assume this could be part of the problem if that is the case.
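To illustrate the two approaches mentioned above, here is a minimal sketch of opening a zip or jar nested inside another zip. It is not taken from log4j-finder; the helper name `open_nested_zip` is hypothetical. It relies on the fact that members opened from a `zipfile.ZipFile` became seekable in Python 3.7, so older versions need the whole member read into memory first.

```python
import io
import sys
import zipfile


def open_nested_zip(outer: zipfile.ZipFile, member: str) -> zipfile.ZipFile:
    """Open an archive stored as `member` inside the already opened `outer` zip.

    Hypothetical helper for illustration only.
    """
    if sys.version_info >= (3, 7):
        # Members opened from a seekable archive support seek() here,
        # so the nested archive can be read without loading it whole.
        return zipfile.ZipFile(outer.open(member))
    # Older Pythons: the member is not seekable, so buffer it in memory.
    return zipfile.ZipFile(io.BytesIO(outer.read(member)))
```

Note that the in-memory fallback costs as much RAM as the uncompressed nested archive, which matters for multi-gigabyte jars.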

I'm using the pre-compiled binary for Windows, version 1.2.0, and don't have Python installed. I would really appreciate an additional scan option to skip zip files of a specified size and larger (e.g. >1 GB) in order to improve the scan speed.
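A size-based skip like the one requested here could be sketched as follows. This is only an illustration under assumptions: the function name `should_skip_archive` and the `max_bytes` parameter are hypothetical and do not correspond to an existing log4j-finder option.

```python
import os


def should_skip_archive(path: str, max_bytes: int) -> bool:
    """Return True when the file at `path` is larger than `max_bytes`.

    Hypothetical helper; not part of log4j-finder.
    """
    try:
        return os.path.getsize(path) > max_bytes
    except OSError:
        # Unreadable or vanished files are not skipped here, so the
        # caller's normal error handling still sees them.
        return False
```

For example, `should_skip_archive(path, 1024 ** 3)` would skip anything larger than 1 GiB before the archive is opened at all.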

That is interesting, I will check if we can make such a feature but I would also like to know what causes it. Is it possible to share one of the zip files?

Unfortunately, I cannot share these files. I'm running a debug scan with -vv at the moment to see if it logs anything useful (with 2>&1), if that will help? The problem is that when it scans very large zip files, the scan speed degrades significantly depending on the size of the zip file and probably its contents. After 2-4 hours I just manually abort because it takes way too long for one zip file.
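When the archive itself cannot be shared, a summary of its structure may still help narrow down the cause. The sketch below is an assumption about what could be useful, not something log4j-finder provides; the function name `summarize_zip` and the list of nested-archive extensions are hypothetical. It only reads the zip's central directory, so it should be quick even for large files.

```python
import zipfile

# Extensions treated as nested archives (an illustrative choice).
NESTED_EXTENSIONS = (".jar", ".war", ".ear", ".zip")


def summarize_zip(path: str) -> dict:
    """Collect entry count, total sizes, and the largest member of a zip."""
    with zipfile.ZipFile(path) as archive:
        infos = archive.infolist()
    largest = max(infos, key=lambda info: info.file_size, default=None)
    return {
        "entries": len(infos),
        "compressed_bytes": sum(info.compress_size for info in infos),
        "uncompressed_bytes": sum(info.file_size for info in infos),
        "nested_archives": sum(
            1 for info in infos
            if info.filename.lower().endswith(NESTED_EXTENSIONS)
        ),
        "largest_member": largest.filename if largest else None,
    }
```

Posting the resulting dictionary (with file names redacted if necessary) shows whether the slowdown correlates with entry count, a single huge jar, or deep nesting.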

The Oracle WebLogic 12.2.1.4.0 zip (containing 1 jar) that causes issues can be downloaded from the Oracle URL described here (login required): https://github.com/oracle/docker-images/blob/main/OracleFMWInfrastructure/dockerfiles/12.2.1.4/fmw_12.2.1.4.0_infrastructure_Disk1_1of1.zip.download
Additionally, scanning a 4.5 GB zip file containing Oracle 12 installation files and jars (and not containing the file mentioned above) took 10 hours to complete successfully.

Thanks, I got the download link.