Android Malware Detection Method Based on Graph Attention Networks and Deep Fusion of Multimodal Features

1. Overview

    This is the Python implementation of the paper "Android Malware Detection Method Based on Graph Attention Networks and Deep Fusion of Multimodal Features". The code is provided by the Massive Data Processing and Information Security Group of the State Key Laboratory of Software Development Environment, Beihang University.

    Citation information: Chen, S., Lang, B., Liu, H., Chen, Y., Song, Y. (2024). Android malware detection method based on graph attention networks and deep fusion of multimodal features. Expert Systems with Applications, 237, Article 121617. https://doi.org/10.1016/j.eswa.2023.121617

2. Folder Description

The folders under this path are mainly used to store the datasets, intermediate files, and result files. The purpose of each folder is as follows:

  1. code/android_malware_detection: Stores the project source code
  2. dataset: Stores the original APK files of the datasets
  3. 3rd: Stores the third-party library detection results
  4. apk_tool: Stores the dex files and AndroidManifest.xml obtained by parsing with Apktool
  5. permission: Stores the extracted 59-dimensional permission features
  6. java: Stores the zip archives of the Java source code decompiled by jadx
  7. tokenResult: Stores the tokens extracted from the whole APK files
  8. java_src_tmp: Temporarily stores the Java files decompiled by jadx. It should be placed on a solid-state drive; see "README.md" under "code/android_malware_detection" for details
  9. class_set_call_graph: Stores the node call relationships of the extracted CSCG
  10. tokenClassSet: Stores the tokens of the CSCG nodes
  11. model: Stores the node LSI features, the processed data, and the model files. It contains the final model files we trained for TF-IDF, LSI, and the deep neural network.

3. Other notices

    The current project is a demo built on a subset of the AMD_AndroZoo dataset. All intermediate files of the demo have been retained.

    The files of the real datasets are too large to upload, so only the corresponding 3 sets of model files have been retained.

    If you do not need the demo, you only need to download the "code" folder and make sure the other folders listed above exist.
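
    One way to make sure the folders exist is sketched below. It is only an illustration, not part of the project code: the helper name ensure_folders is hypothetical, and it assumes that all data folders from section 2 sit directly under one root directory next to "code". If java_src_tmp lives on a separate solid-state drive, adjust its path accordingly.

```python
from pathlib import Path

# Folder names copied from the list in section 2.
# "code" is left out because it is the folder being downloaded.
DATA_FOLDERS = [
    "dataset",
    "3rd",
    "apk_tool",
    "permission",
    "java",
    "tokenResult",
    "java_src_tmp",
    "class_set_call_graph",
    "tokenClassSet",
    "model",
]


def ensure_folders(root):
    """Create any missing data folders under root; return the names created."""
    root = Path(root)
    created = []
    for name in DATA_FOLDERS:
        folder = root / name
        if not folder.is_dir():
            # parents=True also creates root itself if it is missing.
            folder.mkdir(parents=True, exist_ok=True)
            created.append(name)
    return created
```

    Calling ensure_folders with the directory that contains "code" creates whatever is missing and leaves existing folders untouched, so it is safe to run more than once.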

Note

For a detailed explanation of the code, see "README.md" under "code/android_malware_detection".