2021 APTOS Big Data Competition
https://tianchi.aliyun.com/competition/entrance/531929/information?lang=zh-cn
请在 ./data 目录下建立这两个文件夹,
test-set-img
: 放测试集图片training-set-img
: 放训练集图片
请直接向本项目 commit, 不用 fork 再 Pull requests
数据处理的过程
拍平图片文件夹:
flat-img-dir.py
把原来具有文件夹层次的图片目录拍平了, 所有的图片直接放在 data/training-set-img/
或 data/test-set-img/
里.
训练集:
clean-training-set.py
: 清洗raw-data/TrainingAnnotation.csv
, 得到data/training-set.csv
generate-training-set-with-img.py
: 从data/training-set.csv
, 加上对图片的索引, 得到training-set-with-img.json
待提交答案的输入, 官方将它误称作测试集, 我们也继续叫测试集算了:
clean-test-set.py
: 由raw-data/PreliminaryValidationSet_Info.csv
, 得到data/test-set.csv
- 注: 无需清洗, 只是改变了 gender 的意义(改为 0 表示 男, 1 表示女)
generate-test-set-with-img.py
: 从data/test-set.csv
, 加上对图片的索引, 得到training-set-with-img.json
data/training-set-with-img.json
后续训练应使用 它长这样:
[
{
"patient ID": "0000-1315L",
"gender": 1,
"age": 76,
"diagnosis": 6,
"preVA": 0.3,
"anti-VEGF": 0,
"preCST": 227,
"preIRF": 0,
"preSRF": 0,
"prePED": 0,
"preHRF": 0,
"VA": 0.4,
"continue injection": 0,
"CST": 237,
"IRF": 0,
"SRF": 0,
"PED": 0,
"HRF": 0,
"img_before": [
"./data/training-set-img/0000-1315L_1000.jpg",
"./data/training-set-img/0000-1315L_1001.jpg",
"./data/training-set-img/0000-1315L_1002.jpg",
"./data/training-set-img/0000-1315L_1003.jpg",
"./data/training-set-img/0000-1315L_1004.jpg",
"./data/training-set-img/0000-1315L_1005.jpg"
],
"img_after": [
"./data/training-set-img/0000-1315L_2000.jpg",
"./data/training-set-img/0000-1315L_2001.jpg",
"./data/training-set-img/0000-1315L_2002.jpg",
"./data/training-set-img/0000-1315L_2003.jpg",
"./data/training-set-img/0000-1315L_2004.jpg",
"./data/training-set-img/0000-1315L_2005.jpg"
]
},
{
"patient ID": "0000-0656L",
"gender": 0,
"age": 64,
"diagnosis": 2,
"preVA": 0.32,
"anti-VEGF": 1,
"preCST": 201,
"preIRF": 0,
"preSRF": 1,
"prePED": 1,
"preHRF": 0,
"VA": 0.4,
"continue injection": 1,
"CST": 191,
"IRF": 0,
"SRF": 0,
"PED": 1,
"HRF": 0,
"img_before": [
"./data/training-set-img/0000-0656L_1000.jpg",
"./data/training-set-img/0000-0656L_1001.jpg",
"./data/training-set-img/0000-0656L_1002.jpg",
"./data/training-set-img/0000-0656L_1003.jpg",
"./data/training-set-img/0000-0656L_1004.jpg",
"./data/training-set-img/0000-0656L_1005.jpg"
],
"img_after": [
"./data/training-set-img/0000-0656L_2000.jpg",
"./data/training-set-img/0000-0656L_2001.jpg",
"./data/training-set-img/0000-0656L_2002.jpg",
"./data/training-set-img/0000-0656L_2003.jpg",
"./data/training-set-img/0000-0656L_2004.jpg",
"./data/training-set-img/0000-0656L_2005.jpg"
]
},
......
}