https://tianchi.aliyun.com/competition/entrance/231694/introduction
- Report
- PPT
- Plot (data, trend, target)
- Word cloud
- Box plot
- Bar chart
- Scatter plot
- Feature importance (chi-square features, LGB)
- Performance ceiling
- Algorithms and mathematical models
- In-depth domain and industry knowledge
-
Statistical Feature
-
Model Feature
-
v1 (a single feature built with TF-IDF)
- api sequence sorted by tid and index, grouped by file_id
-
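The v1 feature above can be sketched roughly as follows. This is a minimal illustration, not the original implementation: it assumes the log is a pandas DataFrame with `file_id`, `tid`, `index`, and `api` columns, and the toy rows, the helper name `build_api_documents`, and the vectorizer settings are all hypothetical.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy stand-in for the competition log; the values are made up for illustration.
df = pd.DataFrame({
    "file_id": [1, 1, 1, 2, 2],
    "tid":     [20, 10, 10, 5, 5],
    "index":   [0, 1, 0, 1, 0],
    "api":     ["RegOpenKeyExW", "NtClose", "LdrLoadDll", "NtClose", "CreateFileW"],
})

def build_api_documents(log: pd.DataFrame) -> pd.Series:
    """Return one space-joined API string per file_id, ordered by tid, then index."""
    ordered = log.sort_values(["file_id", "tid", "index"])
    return ordered.groupby("file_id")["api"].agg(" ".join)

docs = build_api_documents(df)

# Treat every whitespace-separated API name as one token and keep its casing.
vectorizer = TfidfVectorizer(token_pattern=r"\S+", lowercase=False)
tfidf = vectorizer.fit_transform(docs)  # sparse matrix: one row per file_id
```

Each row of `tfidf` can then be fed to a classifier, with `docs.index` giving the matching `file_id` values.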
v2
- tid_count
- tid_distinct_count
- api_distinct_count
- tid_api_count_max
- tid_api_count_min
- tid_api_count_mean
- tid_api_distinct_count_max
- tid_api_distinct_count_min
- tid_api_distinct_count_mean
-
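One plausible reading of the v2 feature names is sketched below with pandas. The interpretation is an assumption: `tid_api_*` is taken to mean "compute a per-thread statistic within each file, then aggregate it across that file's threads". The toy data and the helper name `build_stat_features` are hypothetical.

```python
import pandas as pd

# Toy log: file 1 has two threads, file 2 has one.
df = pd.DataFrame({
    "file_id": [1, 1, 1, 1, 2, 2],
    "tid":     [10, 10, 10, 20, 5, 5],
    "api":     ["A", "A", "B", "A", "C", "C"],
})

def build_stat_features(log: pd.DataFrame) -> pd.DataFrame:
    """Build the nine v2 columns, one row per file_id."""
    # File-level counts.
    per_file = log.groupby("file_id").agg(
        tid_count=("tid", "count"),              # number of logged rows
        tid_distinct_count=("tid", "nunique"),   # number of distinct threads
        api_distinct_count=("api", "nunique"),   # number of distinct APIs
    )
    # Thread-level counts inside each file.
    per_thread = log.groupby(["file_id", "tid"]).agg(
        tid_api_count=("api", "count"),
        tid_api_distinct_count=("api", "nunique"),
    )
    # Aggregate the thread-level counts across the threads of each file.
    across_threads = per_thread.groupby(level="file_id").agg(["max", "min", "mean"])
    across_threads.columns = ["_".join(col) for col in across_threads.columns]
    return per_file.join(across_threads)

features = build_stat_features(df)
```

The resulting frame is indexed by `file_id`, so it can be joined with other per-file features before training.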
v3
- v1 + v2
- N-Gram
- TF-IDF
- XGBoost
- NB-LR
- Chi-square test
- NumPy
- Pandas
- Scikit-learn
- SciPy
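As an illustration of how the N-Gram, TF-IDF, and chi-square items above can fit together, here is a small scikit-learn sketch. It is not the pipeline used in the competition: the documents and labels are invented, `k=5` is arbitrary, and `LogisticRegression` is only a stand-in for the XGBoost and NB-LR models listed above.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical per-file API sequences (already sorted and space-joined) with toy labels.
docs = [
    "LdrLoadDll NtClose RegOpenKeyExW NtClose",
    "LdrLoadDll NtClose RegOpenKeyExW",
    "CreateFileW WriteFile NtClose",
    "CreateFileW WriteFile WriteFile NtClose",
]
labels = [0, 0, 1, 1]

pipeline = make_pipeline(
    # 1- to 3-grams over API names, weighted by TF-IDF.
    TfidfVectorizer(token_pattern=r"\S+", lowercase=False, ngram_range=(1, 3)),
    # Keep the k n-grams with the highest chi-square score against the labels.
    SelectKBest(chi2, k=5),
    # Stand-in classifier; swap in another estimator as needed.
    LogisticRegression(max_iter=1000),
)
pipeline.fit(docs, labels)

selected = int(pipeline.named_steps["selectkbest"].get_support().sum())
predictions = pipeline.predict(docs)
```

The chi-square test requires non-negative inputs, which TF-IDF weights satisfy, so the selector can sit directly after the vectorizer.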
- Alibaba Cloud Security Malware Detection: online score 0.443705, invited to share the competition approach
- gitHappyboy/ML
- API based sequence and statistical features in a combined malware detection architecture
- Google Machine Learning Crash Courses
- XGBoost multiclass_classification demo
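- Applications of TF-IDF and Cosine Similarity (1): Automatic Keyword Extraction
- Deep Learning Basics (9) -- Softmax (Multiclass Classification and Evaluation Metrics)
- Multiclass Classification and Model Tuning with Random Forest
- Malware Detection with Machine Learning: The Alibaba Cloud Malware Detection Competition as an Example
- python – Multiclass Classification with LightGBM
- Text Classification with LR
- Feature Selection Methods: Chi-Square Test and Information Gain
- The 3rd Alibaba Cloud Security Algorithm Challenge
- one-vs-rest, one-vs-one, and Their Implementation in sklearn
- An Overview of Model Ensembling Methods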