tvquynh/api_import_dataset
This dataset is part of my Master's research on malware detection and classification using the XGBoost library. It contains 1.55 million records, each with 1000 API import features extracted from the EMBER 2017 v2 and 2018 datasets. All duplicate records have been removed.

Instructions:

* FEATURES *

Column name: sha256
Description: SHA256 hash of the sample
Type: string

Column name: appeared
Description: Date the sample appeared
Type: date (yyyy-mm format)

Column name: label
Description: Whether the sample is malware or goodware
Type: integer

Column name: GetProcAddress
Description: Most imported function (1st)
Type: 0 (Not imported) or 1 (Imported)

...

Column name: LookupAccountSidW
Description: Least imported function (1000th)
Type: 0 (Not imported) or 1 (Imported)
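The column layout described above can be read with a few lines of Python. This is only a minimal sketch: it assumes the data is available as a CSV file with a header row, which is not stated here, and the helper name `split_rows` and the two sample rows are made up for illustration. Which integer value of `label` means malware is also not specified, so the sample values are placeholders.

```python
import csv
import io

# Non-feature columns named in the description above.
META_COLUMNS = ("sha256", "appeared", "label")


def split_rows(csv_file):
    """Yield (sha256, appeared, label, features) for each record.

    Assumes a CSV with a header row (an assumption, not a documented format).
    Every column that is not a metadata column is treated as a 0/1 feature.
    """
    reader = csv.DictReader(csv_file)
    feature_names = [c for c in reader.fieldnames if c not in META_COLUMNS]
    for row in reader:
        features = [int(row[name]) for name in feature_names]
        yield row["sha256"], row["appeared"], int(row["label"]), features


# Made-up sample showing only 2 of the 1000 feature columns.
sample = io.StringIO(
    "sha256,appeared,label,GetProcAddress,LookupAccountSidW\n"
    "aaaa,2017-01,1,1,0\n"
    "bbbb,2018-03,0,1,1\n"
)
rows = list(split_rows(sample))
```

The resulting feature lists and labels could then be passed to a classifier such as XGBoost; for the real file, replace the in-memory sample with an opened file handle.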
GPL-3.0