Deep learning method for automatic diagnosis of ovarian cancer using CT images and clinical information.
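Manually split the raw data into a training set and an independent test set, and save each in its own folder, e.g. train_data and test_data. Each folder has the layout shown below, with two subfolders, r0 and non-r0. Every sample consists of a dcm (DICOM) series and a nii (NIfTI) segmentation file: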
├── ./non-r0
│   ├── ./non-r0/xxx
│   │   ├── ./non-r0/xxx/IM0
│   │   ...
│   │   └── ./non-r0/xxx/IM42
│   └── ./non-r0/xxx_Merge.nii
└── ./r0
    ├── ./r0/xxx
    │   ├── ./r0/xxx/IM0
    │   ...
    │   └── ./r0/xxx/IM42
    └── ./r0/xxx_Merge.nii
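Before converting, it can help to check that every series folder has a matching segmentation file. The following is a minimal sketch, not part of the repository: `list_samples` is a hypothetical helper, and it assumes the mask for a series folder `xxx` is named `xxx_Merge.nii` and sits next to it, as in the tree above.

```python
from pathlib import Path


def list_samples(root):
    """Return (class_name, series_dir, mask_file) for every complete sample.

    Assumes the layout shown above: <root>/<class>/<xxx>/ holds the DICOM
    slices and <root>/<class>/<xxx>_Merge.nii holds the segmentation.
    """
    samples = []
    for class_name in ("r0", "non-r0"):
        class_dir = Path(root) / class_name
        if not class_dir.is_dir():
            continue
        for series_dir in sorted(p for p in class_dir.iterdir() if p.is_dir()):
            mask_file = class_dir / (series_dir.name + "_Merge.nii")
            # Skip series that have no segmentation file next to them.
            if mask_file.is_file():
                samples.append((class_name, series_dir, mask_file))
    return samples


if __name__ == "__main__":
    for class_name, series_dir, mask_file in list_samples("../dataset/raw_data/train_data"):
        print(class_name, series_dir.name, mask_file.name)
```

Series folders without a mask are skipped silently here; raising an error instead may be preferable if missing annotations should stop the pipeline.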
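Run convert_to_npy.py to convert the raw data to hdf5 format, where the image and mask keys index the image and the segmentation annotation, respectively. Before running it, set the input path and the save path as shown below (the raw data must already be in place):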
# training data
input_path = '../dataset/raw_data/train_data'
save_path = '../dataset/npy_data/train_data'
convert_to_npy(input_path,save_path)
# test data
input_path = '../dataset/raw_data/test_data'
save_path = '../dataset/npy_data/test_data'
convert_to_npy(input_path,save_path)
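For convenience, the data paths and their labels are saved in a csv file, as shown below. The id and label columns hold the absolute path of each sample and its label, respectively: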
id,label
xxxxx.hdf5,0
xxxxx.hdf5,1
Run tools.py. As before, set the input path and the csv save path first, as follows:
os.makedirs('./csv_file', exist_ok=True)  # do not fail if the folder already exists
input_path = os.path.abspath('../dataset/npy_data/train_data')
csv_path = './csv_file/index.csv'
make_label_csv(input_path,csv_path)
input_path = os.path.abspath('../dataset/npy_data/test_data')
csv_path = './csv_file/test_index.csv'
make_label_csv(input_path,csv_path)
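The source does not show how make_label_csv is implemented. As a rough sketch of what such an index builder could look like, the hypothetical `write_index_csv` below walks the class subfolders and writes one id,label row per .hdf5 file. Two points are assumptions: that the converted files keep the r0 / non-r0 subfolder layout, and the mapping of folder names to the integer labels 0 and 1.

```python
import csv
import os

# Assumed mapping from class folder to integer label; the source does not state it.
ASSUMED_LABELS = {"non-r0": 0, "r0": 1}


def write_index_csv(input_path, csv_path, labels=ASSUMED_LABELS):
    """Write an id,label csv listing every .hdf5 file under input_path/<class>/."""
    rows = []
    for class_name, label in labels.items():
        class_dir = os.path.join(input_path, class_name)
        if not os.path.isdir(class_dir):
            continue
        for name in sorted(os.listdir(class_dir)):
            if name.endswith(".hdf5"):
                # Store absolute paths, matching the id column described above.
                rows.append((os.path.abspath(os.path.join(class_dir, name)), label))
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "label"])
        writer.writerows(rows)
    return rows
```

Calling `write_index_csv(input_path, csv_path)` with the paths from the snippet above would produce a file in the same id,label shape as the example.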
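Edit config.py. The default configuration is shown below. Make sure version numbers do not collide; it is best to assign them by a rule that is easy to remember. I currently use the model's position in the model list (starting from 1):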
__all__ = ['r3d_18', 'se_r3d_18','da_18','da_se_18','r3d_34','da_se_34','vgg16_3d','vgg19_3d']
NET_NAME = 'r3d_18'
VERSION = 'v1.0'
DEVICE = '1'
# Must be True for pre-training and inference
PRE_TRAINED = False
# 1,2,3,4,5
CURRENT_FOLD = 1
GPU_NUM = len(DEVICE.split(','))
FOLD_NUM = 5
CKPT_PATH = './ckpt/{}/fold{}'.format(VERSION,CURRENT_FOLD)
# Arguments for initializing the trainer
INIT_TRAINER = {
    'net_name':NET_NAME,
    'lr':1e-3,
    'n_epoch':80,
    'channels':1,
    'num_classes':3,
    'input_shape':(32,256,256),
    'crop':48,
    'scale':(-100,200),
    'use_roi':False or 'roi' in VERSION, # interface reserved in data_loader.py, not implemented yet
    'batch_size':2,
    'num_workers':2,
    'device':DEVICE,
    'pre_trained':PRE_TRAINED,
    'weight_path':WEIGHT_PATH,
    'weight_decay': 0.0001,
    'momentum': 0.9,
    'gamma': 0.1,
    'milestones': [30,60,90],
    'T_max':5,
    'use_fp16':True
}
# Arguments for running the trainer
SETUP_TRAINER = {
    'output_dir':'./ckpt/{}'.format(VERSION),
    'log_dir':'./log/{}'.format(VERSION),
    'optimizer':'AdamW',
    'loss_fun':'Cross_Entropy',
    'class_weight':None,
    'lr_scheduler':'MultiStepLR' # MultiStepLR
}
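Two of the settings above can be made concrete with a short sketch. `version_for` is a hypothetical helper (not in the repository) that applies the versioning rule described earlier, using the 1-based position of the model in the list as the major version. `multistep_lr` reproduces the closed form of a MultiStepLR-style schedule, where the learning rate is multiplied by gamma each time the epoch reaches a milestone. The minor version field is an assumption for illustration.

```python
from bisect import bisect_right

MODEL_LIST = ['r3d_18', 'se_r3d_18', 'da_18', 'da_se_18',
              'r3d_34', 'da_se_34', 'vgg16_3d', 'vgg19_3d']


def version_for(net_name, minor=0):
    """Hypothetical helper: 1-based position in MODEL_LIST as the major version."""
    return 'v{}.{}'.format(MODEL_LIST.index(net_name) + 1, minor)


def multistep_lr(base_lr, milestones, gamma, epoch):
    """Learning rate at `epoch` (0-based) under a MultiStepLR-style schedule."""
    # Count how many milestones have been reached, then decay once per milestone.
    return base_lr * gamma ** bisect_right(sorted(milestones), epoch)


if __name__ == "__main__":
    print(version_for('r3d_18'))  # v1.0, matching the default VERSION
    for epoch in (0, 29, 30, 59, 60, 79):
        print(epoch, multistep_lr(1e-3, [30, 60, 90], 0.1, epoch))
```

With the default values, the rate stays at 1e-3 for epochs 0 to 29, drops by a factor of 10 at epoch 30 and again at epoch 60. Since n_epoch is 80, the milestone at 90 would never be reached under this schedule.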
Edit run.py and fill in the paths of the index files generated in Step 2, as follows:
if 'train' in args.mode:
    csv_path = './converter/csv_file/index.csv'
    ...
elif 'inf' in args.mode:
    test_csv_path = './converter/csv_file/test_index.csv'
This step is straightforward.
Single-fold training:
python run.py -m train
Multi-fold training (5 folds by default):
python run.py -m train-cross
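How run.py turns the -m value into a set of folds is not shown in this section. The sketch below is one plausible reading, under stated assumptions: the `--mode` long option, `build_parser`, and `folds_to_run` are hypothetical names, and it assumes that train uses the fold given by CURRENT_FOLD while train-cross loops over all FOLD_NUM folds.

```python
import argparse


def build_parser():
    """Build a parser with a -m option; the --mode long form is an assumption."""
    parser = argparse.ArgumentParser()
    parser.add_argument('-m', '--mode', default='train',
                        help="'train', 'train-cross', or a mode containing 'inf'")
    return parser


def folds_to_run(mode, current_fold=1, fold_num=5):
    """Return the fold numbers a given mode would train on (1-based)."""
    if mode == 'train-cross':
        # Cross-validation: train once per fold.
        return list(range(1, fold_num + 1))
    if 'train' in mode:
        # Single-fold training uses only the configured fold.
        return [current_fold]
    # Inference or any other mode trains on no fold.
    return []
```

For example, `folds_to_run(build_parser().parse_args(['-m', 'train-cross']).mode)` gives the five folds 1 through 5, while the plain train mode gives only the fold configured as CURRENT_FOLD.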