Cellphone-Behavior

TOP 2 solution for the "2020创青春·交子杯" XW Bank (新网银行) FinTech Challenge, AI Algorithm Track.

Models

CNN

1. Baseline: score around 0.711.

Following https://mp.weixin.qq.com/s/r7Ai8FVSPRB71PVghYk75A, switching the optimizer raised the score to 0.7296349206349205:

model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['acc'])

Online: 0.7296349206349205

  2. Open-source baseline code ot_code: online ~0.74
23/23 [==============================] - 1s 43ms/step - loss: 0.1183 - acc: 0.9578 - val_loss: 1.1392 - val_acc: 0.7455 - lr: 7.8125e-06
Epoch 178/500
23/23 [==============================] - 1s 43ms/step - loss: 0.1159 - acc: 0.9597 - val_loss: 1.1376 - val_acc: 0.7476 - lr: 7.8125e-06
Epoch 00178: early stopping

Online: 0.7456984126984126

Switching the optimizer to RMSprop: online score 0.7463015873015874.

Epoch 161/500
23/23 [==============================] - 1s 44ms/step - loss: 0.1667 - acc: 0.9412 - val_loss: 0.9308 - val_acc: 0.7510 - lr: 7.8125e-06
Epoch 162/500
23/23 [==============================] - 1s 44ms/step - loss: 0.1751 - acc: 0.9403 - val_loss: 0.9285 - val_acc: 0.7510 - lr: 7.8125e-06
Epoch 00162: early stopping

The logs above show that RMSprop outperforms Adam.

3. bp_cnn: add batch normalization after each convolutional layer.

X = Conv2D(filters=128,
           kernel_size=(3, 3),
           activation='relu',
           padding='same')(X)
X = BatchNormalization()(X)

Training improved noticeably:

Epoch 153/500
23/23 - 1s - loss: 0.0745 - acc: 0.9794 - val_loss: 0.8036 - val_acc: 0.7716 - lr: 7.8125e-06
Epoch 154/500
23/23 - 1s - loss: 0.0777 - acc: 0.9775 - val_loss: 0.8064 - val_acc: 0.7709 - lr: 7.8125e-06
Epoch 00154: early stopping

Online: 0.7661587301587302

After removing two features: online 0.7644.

  4. Trying different normalization layers (see the drop-in sketch after these results).
  • Group Normalization (TensorFlow Addons)
23/23 - 2s - loss: 0.0185 - acc: 0.9971 - val_loss: 0.8559 - val_acc: 0.8073 - lr: 1.5625e-05
Epoch 157/500
23/23 - 2s - loss: 0.0173 - acc: 0.9971 - val_loss: 0.8587 - val_acc: 0.8107 - lr: 7.8125e-06
Epoch 158/500
23/23 - 2s - loss: 0.0186 - acc: 0.9964 - val_loss: 0.8590 - val_acc: 0.8073 - lr: 7.8125e-06
Epoch 00158: early stopping

Online score: 0.764825396825397

  • Instance Normalization (TensorFlow Addons)
23/23 - 2s - loss: 2.8817 - acc: 0.1030 - val_loss: 2.8588 - val_acc: 0.1021 - lr: 0.0010
Epoch 2/500
23/23 - 2s - loss: 2.8613 - acc: 0.0945 - val_loss: 2.8557 - val_acc: 0.1021 - lr: 0.0010
Epoch 3/500
23/23 - 2s - loss: 2.8608 - acc: 0.0965 - val_loss: 2.8554 - val_acc: 0.1021 - lr: 0.0010
Epoch 4/500
23/23 - 2s - loss: 2.8594 - acc: 0.0989 - val_loss: 2.8557 - val_acc: 0.1008 - lr: 0.0010

This attempt failed: accuracy stayed around 0.10 and never improved.

  • Layer Normalization (TensorFlow Core)
Epoch 00112: ReduceLROnPlateau reducing learning rate to 6.25000029685907e-05.
23/23 - 2s - loss: 0.0067 - acc: 0.9995 - val_loss: 0.9610 - val_acc: 0.7970 - lr: 1.2500e-04
Epoch 113/500
23/23 - 2s - loss: 0.0052 - acc: 0.9998 - val_loss: 0.9456 - val_acc: 0.8038 - lr: 6.2500e-05
Epoch 114/500
23/23 - 2s - loss: 0.0045 - acc: 0.9995 - val_loss: 0.9422 - val_acc: 0.8018 - lr: 6.2500e-05
Epoch 00114: early stopping

Online score: 0.7607777777777778
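For reference, a minimal sketch of how each variant drops into the conv block above. This is a sketch under assumed shapes, not the exact training code; groups=16 is an illustrative value and must divide the channel count.

import tensorflow as tf
import tensorflow_addons as tfa
from tensorflow.keras.layers import Conv2D, Input

# Minimal context: a conv block like the one above;
# (60, 8, 1) = (seq_len, n_features, 1) is an assumed input shape.
inp = Input(shape=(60, 8, 1))
X = Conv2D(128, (3, 3), padding='same', activation='relu')(inp)

# Drop-in alternatives to BatchNormalization; pick exactly one.
X = tfa.layers.GroupNormalization(groups=16)(X)  # Group Norm (tfa); 16 divides 128
# X = tfa.layers.InstanceNormalization()(X)      # Instance Norm (tfa)
# X = tf.keras.layers.LayerNormalization()(X)    # Layer Norm (core TensorFlow)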

  5. GlobalAveragePooling2D vs GlobalMaxPooling2D
Epoch 179/500
23/23 - 1s - loss: 0.0718 - acc: 0.9762 - val_loss: 0.8834 - val_acc: 0.7936 - lr: 1.9531e-06
Epoch 00179: early stopping
accuracy_score 0.7949245541838135 acc_combo 0.8232738911751242
5-fold mean acc score: 0.7934734597517326
5-fold mean combo score: 0.8238260577813273

GlobalAveragePooling2D (online 0.7686349206349207) outperforms GlobalMaxPooling2D.
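The only difference between the two runs is the pooling head that collapses the final feature map to a vector; a minimal sketch of the swap, with an assumed input shape:

from tensorflow.keras.layers import (Input, Conv2D,
                                     GlobalAveragePooling2D, GlobalMaxPooling2D)

inp = Input(shape=(60, 8, 1))  # assumed (seq_len, n_features, 1) shape
X = Conv2D(128, (3, 3), padding='same', activation='relu')(inp)

# Average pooling summarizes each feature map over the whole window;
# max pooling keeps only its single strongest activation.
x = GlobalAveragePooling2D()(X)  # the better variant here
# x = GlobalMaxPooling2D()(X)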

LSTM

  1. Bidirectional LSTM (a minimal sketch follows the scores below)
23/23 - 1s - loss: 0.1118 - acc: 0.9609 - val_loss: 1.7949 - val_acc: 0.6543 - lr: 1.2500e-04
Epoch 00104: early stopping
accuracy_score 0.6694101508916324 acc_combo 0.7127833300672791
5-fold mean acc score: 0.6626436732978505
5-fold mean combo score: 0.7082775474080174

Online score: 0.7023650793650793
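A minimal sketch of the bidirectional variant, reusing the seq_len/fea_size inputs from the LSTM-FCN code below; the layer sizes are illustrative, not the exact experiment:

from tensorflow.keras.layers import Input, Bidirectional, LSTM, Dense
from tensorflow.keras.models import Model

seq_len, fea_size = 60, 8  # assumed sequence length and feature count

inp = Input(shape=(seq_len, fea_size))
# Bidirectional runs the LSTM forward and backward over the sequence
# and concatenates both final states (output dim 2 * 64).
x = Bidirectional(LSTM(64))(inp)
pred = Dense(19, activation='softmax')(x)
model = Model(inp, pred)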

2. LSTM-FCN

Paper: LSTM Fully Convolutional Networks for Time Series Classification

  • Configuration 1
from tensorflow.keras.layers import (Input, LSTM, Dropout, Conv1D,
                                     BatchNormalization, Activation,
                                     GlobalAveragePooling1D, Dense,
                                     concatenate)
from tensorflow.keras.models import Model

def LSTM_FCN():
    inp = Input(shape=(seq_len, fea_size), name="input_layer")

    # LSTM branch
    x = LSTM(64)(inp)
    x = Dropout(0.8)(x)

    # FCN branch: three Conv1D blocks with kernel sizes 8, 5, 3
    # y = Permute((2, 1))(inp)  # the paper's dimension shuffle, unused here
    y = Conv1D(128, 8, padding='same', kernel_initializer='he_uniform')(inp)
    y = BatchNormalization()(y)
    y = Activation('relu')(y)

    y = Conv1D(256, 5, padding='same', kernel_initializer='he_uniform')(y)
    y = BatchNormalization()(y)
    y = Activation('relu')(y)

    y = Conv1D(128, 3, padding='same', kernel_initializer='he_uniform')(y)
    y = BatchNormalization()(y)
    y = Activation('relu')(y)

    y = GlobalAveragePooling1D()(y)

    # concatenate both branches and classify into the 19 behavior classes
    x = concatenate([x, y])
    pred = Dense(19, activation='softmax')(x)
    return Model(inp, pred)
Results:
accuracy_score 0.7887517146776406 acc_combo 0.8202038016852817
5-fold mean acc score: 0.7870284342677916
5-fold mean combo score: 0.818158419984462
  • Configuration 2: identical to configuration 1 except the Conv1D kernel sizes are 5, 4, 3 instead of 8, 5, 3.

Offline: lstm_acc 0.7962136532999378, combo 0.8261682540488406

  • Online 0.754; code in lstm.py.
  • Adding class weights caused overfitting.
  3. MLSTM-FCN (code: mlstm_fcn.py)
  • Original generate_model
accuracy_score 0.7764060356652949 acc_combo 0.8090012411000044
acc_scores: [0.8163125428375599, 0.7930089102124743, 0.7908093278463649, 0.789437585733882, 0.7764060356652949]
combo_scores: [0.842749436992068, 0.8260060706942118, 0.822294075380494, 0.8216735253772278, 0.8090012411000044]
5-fold mean acc score: 0.7931948804591152
5-fold mean combo score: 0.8243448699088012

Online: 0.761

  • Inputs: train_lstm, train_lstm_inv, and train_features
accuracy_score 0.8168724279835391 acc_combo 0.8431314912796375
acc_scores: [0.8122001370801919, 0.818368745716244, 0.8237311385459534, 0.821673525377229, 0.8168724279835391]
combo_scores: [0.8394203466170552, 0.8462090799308052, 0.8495656149977121, 0.8504474492128801, 0.8431314912796375]
5-fold mean acc score: 0.8185691949406314
5-fold mean combo score: 0.8457547964076181

Online: 0.776

Multi Input

Dual-input CNN (forward and reversed sequences), implemented in har.py: online 0.7726190476190478, offline result file har_acc0.8123967315118028_combo0.8401833538228309.csv. A sketch of the idea follows.
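A minimal sketch of the dual-input idea under assumed shapes (60 timesteps, 8 sensor channels); har.py is the authoritative version, and the filter counts here are illustrative:

from tensorflow.keras.layers import (Input, Conv2D, GlobalAveragePooling2D,
                                     Dense, concatenate)
from tensorflow.keras.models import Model

def conv_branch(inp):
    # one small CNN branch per direction
    x = Conv2D(64, (3, 3), padding='same', activation='relu')(inp)
    return GlobalAveragePooling2D()(x)

fwd = Input(shape=(60, 8, 1))  # forward sequence
rev = Input(shape=(60, 8, 1))  # the same sequence reversed in time
x = concatenate([conv_branch(fwd), conv_branch(rev)])
pred = Dense(19, activation='softmax')(x)
model = Model([fwd, rev], pred)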

HAR: fusing a CNN branch (sequences) with a Dense branch (features)

Improvements:

  • Feeding both time-series data and engineered features as inputs; features cannot simply be added by brute force.
  • Slightly adjusted BN and Dropout inside the CNN, removing some BN layers.
  • Adjusted padding: pad each frame using its first 20 rows.
  • 5 random seeds × 5-fold CV: 0.788063492063492; the model is fairly stable.
  • ★ Class weights computed from class frequencies; adding them gave 0.788666 online (see the sketch after this list).
  • Data augmentation with jitter: better offline, online 0.779.
  • After adding LSTM-FCN: 0.781.
  • ★ Data augmentation: augmenting each fold's data separately is crucial, otherwise the model overfits badly on the full data. Online 0.794.
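A minimal sketch of the two starred ideas; y is the integer label vector and X_tr/y_tr one fold's training split (all names and sigma are illustrative assumptions):

import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Class weights from class frequencies, passed to model.fit(...).
classes = np.unique(y)
weights = compute_class_weight('balanced', classes=classes, y=y)
class_weight = dict(zip(classes, weights))

# Jitter augmentation applied inside each fold only, so the fold's
# validation split never sees augmented copies.
def jitter(x, sigma=0.05):  # sigma is an assumed value
    return x + np.random.normal(0.0, sigma, size=x.shape)

X_aug = np.concatenate([X_tr, jitter(X_tr)], axis=0)
y_aug = np.concatenate([y_tr, y_tr], axis=0)
# model.fit(X_aug, to_categorical(y_aug), class_weight=class_weight, ...)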

ResNet

resnet_acc0.798683635276431_combo0.82761327304097.csv
resnet_proba_t_0.798683635276431.csv
resnet_proba_x_0.798683635276431.csv

Online: 0.7457777777777777
Multi-Scale Convolutional Neural Networks for Time Series Classification

Existing time-series classification methods separate feature extraction from classification and cannot capture features that live at different time scales, so the authors propose the MCNN model.
A single input series is transformed by downsampling, moving averages, and similar operations into several series of different lengths; convolving over each of these extracts features at different time scales. A sketch of these transformations follows.
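A minimal sketch of the multi-scale transformations described above; the branch parameters (downsampling rates, window lengths) are illustrative:

import numpy as np

def downsample(x, k):
    # keep every k-th timestep, shortening the series by a factor of k
    return x[::k]

def moving_average(x, w):
    # smooth with a window of length w (multi-frequency branch)
    return np.convolve(x, np.ones(w) / w, mode='valid')

x = np.random.randn(60)  # one univariate input series
branches = [x,
            downsample(x, 2), downsample(x, 3),
            moving_average(x, 3), moving_average(x, 5)]
# MCNN convolves each branch separately and concatenates the features.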

Failed attempts

  • Adding features.
  • LSTM: not exactly a failure; the current network is quite simple, and intuitively LSTM should beat CNN here.
  • Hyperparameter tuning.
  • Adding min-max normalization hurt performance:
from sklearn.preprocessing import MinMaxScaler

print('Scaler....')
# fit one scaler per sensor column on the full data, then rescale train
for col in ['acc_x', 'acc_y', 'acc_z', 'acc_xg', 'acc_yg', 'acc_zg', 'mod', 'modg']:
    scaler = MinMaxScaler().fit(data[[col]])
    train[[col]] = scaler.transform(train[[col]])
  • Adding more features: angle features.

# angle features (commented out)
# data['acc_x_cos(x)'] = data['acc_x'] / data['mod']
# data['acc_y_cos(y)'] = data['acc_y'] / data['mod']
# data['acc_z_cos(z)'] = data['acc_z'] / data['mod']
# data['acc_x_angle(x)'] = np.arccos(data['acc_x_cos(x)'])
# data['acc_y_angle(y)'] = np.arccos(data['acc_y_cos(y)'])
# data['acc_z_angle(z)'] = np.arccos(data['acc_z_cos(z)'])
# 2020.7.8: squared acceleration magnitudes
data['mod2'] = data.acc_x ** 2 + data.acc_y ** 2 + data.acc_z ** 2
data['modg2'] = data.acc_xg ** 2 + data.acc_yg ** 2 + data.acc_zg ** 2

92/92 - 2s - loss: 0.0624 - acc: 0.9803 - val_loss: 0.8870 - val_acc: 0.7840 - lr: 3.1250e-05
Epoch 00117: early stopping
accuracy_score 0.7839506172839507 acc_combo 0.8136063753347688

Online: 0.7614603174603174

  • Data augmentation with ImageDataGenerator (a usage sketch follows the snippet):
datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    height_shift_range=0.1,  # randomly shift "images" vertically (fraction of total height)
    horizontal_flip=True,    # randomly flip images horizontally
    vertical_flip=True       # randomly flip images vertically
)
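Presumably wired up along these lines (a sketch; the X_train/y_train names, shapes, and batch size are assumptions):

# The stacked sensor windows are treated as single-channel "images"
# of shape (seq_len, n_features, 1), so the image generator applies.
model.fit(datagen.flow(X_train, y_train, batch_size=64),
          validation_data=(X_val, y_val),
          epochs=500)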

Offline 0.792, online 0.7577619047619049.

  • LSTM+CNN
92/92 - 2s - loss: 0.0656 - acc: 0.9798 - val_loss: 1.0423 - val_acc: 0.7599 - lr: 7.8125e-06
Epoch 138/500
92/92 - 2s - loss: 0.0594 - acc: 0.9832 - val_loss: 1.0431 - val_acc: 0.7613 - lr: 7.8125e-06
Epoch 00138: early stopping
accuracy_score 0.766803840877915 acc_combo 0.800509504213206

Online: 0.758

  • X = GlobalMaxPooling2D()(X)
Epoch 119/500
23/23 - 1s - loss: 0.1941 - acc: 0.9311 - val_loss: 0.9021 - val_acc: 0.7359 - lr: 6.2500e-05
Epoch 00119: early stopping
accuracy_score 0.7414266117969822 acc_combo 0.7816317199033243
5-fold mean acc score: 0.728880483560249
5-fold mean combo score: 0.7694991110919478

Online: 0.7402857142857142

  • VGG19 pretrained model
accuracy_score 0.10150891632373114 acc_combo 0.1966490299823643
5-fold mean acc score: 0.35815603637043997
5-fold mean combo score: 0.4347271162644451
  • Adding word2vec embedding features
feainput = Input(shape=(w2v_fea,))  # w2v_fea: embedding dimension
dense = Dense(32, activation='relu')(feainput)
dense = BatchNormalization()(dense)
dense = Dropout(0.2)(dense)
dense = Dense(64, activation='relu')(dense)
dense = Dropout(0.2)(dense)
dense = Dense(128, activation='relu')(dense)
dense = Dropout(0.2)(dense)
dense = Dense(256, activation='relu')(dense)
dense = BatchNormalization()(dense)

  • Disout: similar in spirit to Dropout, but offline results were worse than Dropout; online 0.787, and convergence was noticeably slower.

  • EMN (Time series classification with Echo Memory Networks): offline 0.444+, probably because our implementation diverges too much from the original paper.

  • MultiVariateCNN: one separate input per variable.

  • SRCNN

Paper: Image Super-Resolution Using Deep Convolutional Networks

References