A ground-truth dataset of 7840 repositories, half of which contain malware and half of which do not.