Dataset of packed PE files

This is a fork of an existing dataset, with some samples sanitized (e.g. UPX-packed samples found in the `not-packed` folder, or samples with the same hash present in both the packer and `not-packed` folders). It also includes a folder named `outliers` containing samples that we identified as likely to disturb our models, i.e. samples that were sorted among the not-packed ones while exhibiting characteristics of packed data. This dataset can be used for training machine learning models tailored to detecting packing in PE files.
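The duplicate-hash check mentioned above can be sketched in a few lines of Python. This is only a minimal illustration, not the procedure actually used to sanitize the dataset: the helper names `sha256_of` and `find_cross_folder_duplicates` are hypothetical, SHA-256 is an arbitrary choice of hash, and the layout (one top-level `not-packed` folder next to other top-level folders) is an assumption.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_cross_folder_duplicates(root, not_packed_name="not-packed"):
    """Map each hash seen both inside and outside `not_packed_name` to its paths.

    Assumption: `root` holds one top-level folder named `not_packed_name`
    and the remaining samples live under other top-level folders.
    """
    root = Path(root)
    by_hash = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_hash[sha256_of(path)].append(path.relative_to(root))

    duplicates = {}
    for file_hash, paths in by_hash.items():
        # True when a path sits under the not-packed folder, False otherwise.
        sides = {p.parts[0] == not_packed_name for p in paths}
        if sides == {True, False}:
            duplicates[file_hash] = paths
    return duplicates
```

Calling `find_cross_folder_duplicates("path/to/dataset")` returns a dictionary keyed by hash; each value lists the relative paths that share that content across the two sides, which can then be reviewed by hand before removing anything.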

⭐ Related Projects

You may also like these:

Example of visualization created with Bintropy: