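Champion solution for Track 2 (malicious code detection) of the DataCon2019 Big Data Security Analysis Competition :rose::rose:; a detailed write-up of the approach is available on Zhihu. Third-place solution for Track 5 (malicious code analysis) of the DataCon2020 Big Data Security Analysis Competition; a detailed write-up of the approach is also available on Zhihu. The code was written under competition time pressure and is fairly messy, so please bear with it!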
- [deep_learning_model.ipynb]
- [call_pid_tfidf_stacking.ipynb]
- [exinfos.ipynb]
- [explore.ipynb]
- [feature_engineering.ipynb]
- [new_feature_engineering.ipynb]
- [out_of_fold.ipynb]
- [ret_value_stacking.ipynb]
- [stacking.ipynb]
- [test.ipynb]
- [feature_engineering.ipynb]
- [for_cluster_kmeans.py]
- [get_call_name_tfidf_features.py]
- [plot_comparison.py]
- [yield_call_name_api_name_exinfos_tsne.py]
- [DBSCAN.py]
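- [get_id.py]: gets the file names of the test set
- [get_raw_test_data.py]: gets the raw strings of the test set
- [get_raw_train_data.py]: gets the raw strings of the training set
- [test_train_model.py]: tests the trained model
- [yield_end_result.py]: generates the final submission results
- [yield_features.py]: builds the feature matrix from the raw strings
- [yield_train_model.py]: trains and produces the model
- [plot.py]: plotting module
- [t_sne.py]: dimensionality-reduction visualization module
- [lgb_cv.py]: LightGBM model with cross-validation
- [xgb_bagging.py]: XGBoost model with Bagging
- [bagging.py]: classic Bagging framework code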
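
For readers unfamiliar with the Bagging idea mentioned in the list above, here is a minimal, standard-library-only sketch. It is not the code in `bagging.py`. The base learner (a nearest-centroid classifier) and all names (`train_centroid_model`, `bagging_fit`, `bagging_predict`) are hypothetical stand-ins chosen for illustration. The repository itself pairs Bagging with XGBoost, which is not reproduced here.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]   # (feature vector, label)
Model = Callable[[Sequence[float]], int]


def train_centroid_model(data: List[Sample]) -> Model:
    """Hypothetical base learner: a nearest-centroid classifier."""
    sums, counts = {}, Counter()
    for x, y in data:
        counts[y] += 1
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def predict(x: Sequence[float]) -> int:
        # Choose the label whose centroid is closest (squared Euclidean distance).
        return min(
            centroids,
            key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
        )

    return predict


def bagging_fit(data: List[Sample], n_models: int = 11, seed: int = 0) -> List[Model]:
    """Train n_models base learners, each on its own bootstrap resample."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw len(data) samples with replacement.
        boot = [rng.choice(data) for _ in data]
        models.append(train_centroid_model(boot))
    return models


def bagging_predict(models: List[Model], x: Sequence[float]) -> int:
    """Aggregate the base learners' predictions by majority vote."""
    votes = Counter(m(x) for m in models)
    return votes.most_common(1)[0][0]
```

A usage example on toy data (the feature values are made up for illustration): `models = bagging_fit(data, n_models=11, seed=42)` followed by `bagging_predict(models, [0.2, 0.2])`. Using an odd number of models avoids tied votes in the two-class case.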