stevewyl/tensorflow_tips

Some useful TensorFlow training tips

tensorflow_tips

Mixed precision

When creating the model's inputs and variables, use tf.float16 as the dtype:

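    import tensorflow as tf  # TensorFlow 1.x graph-mode API

    nbatch, nin, nout = 64, 128, 10  # example sizes; substitute your own
    dtype = tf.float16
    data = tf.placeholder(dtype, shape=(nbatch, nin))
    weights = tf.get_variable('weights', (nin, nout), dtype)
    biases = tf.get_variable('biases', (nout,), dtype,
                             initializer=tf.zeros_initializer())
    logits = tf.matmul(data, weights) + biases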

Keep trainable variables in float32, and cast them to float16 where the model uses them:

    tf.cast(tf.get_variable(..., dtype=tf.float32), tf.float16)
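
One reason to store trainable variables in float32 is that small weight updates can be rounded away entirely in float16. The following is a minimal numeric sketch of that effect, not TensorFlow code: it uses the standard-library `struct` module to round values to half and single precision, and the helper names (`to_float16`, `to_float32`) and the update size of 1e-4 are assumptions chosen for illustration.

```python
import struct


def to_float16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack('e', struct.pack('e', x))[0]


def to_float32(x):
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack('f', struct.pack('f', x))[0]


update = 1e-4          # hypothetical per-step update (learning rate * gradient)
w16 = to_float16(1.0)  # weight stored in float16
w32 = to_float32(1.0)  # weight stored in float32

for _ in range(10):
    # Near 1.0 the float16 spacing is about 9.8e-4, so adding 1e-4
    # rounds straight back to the old value.
    w16 = to_float16(w16 + update)
    # float32 has enough mantissa bits to accumulate the update.
    w32 = to_float32(w32 + update)

print(w16)  # the float16 weight has not moved
print(w32)  # the float32 weight has grown by roughly 10 * update
```

With float32 storage, the cast to float16 happens only for the forward and backward computation, so the accumulated value is never lost.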

Make sure the loss is computed in float32:

    tf.losses.softmax_cross_entropy(target, tf.cast(logits, tf.float32))
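
A loss typically reduces many per-example values into one number, and reductions are one place where float16 loses precision quickly. The sketch below illustrates this with plain Python rather than TensorFlow. It assumes a simple sequential sum (real kernels may accumulate differently), and the helper names and the batch of 4096 losses of 1.0 are hypothetical.

```python
import struct


def to_float16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack('e', struct.pack('e', x))[0]


def to_float32(x):
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack('f', struct.pack('f', x))[0]


per_example_loss = 1.0  # hypothetical per-example loss value
n = 4096                # hypothetical number of examples

total16 = 0.0
total32 = 0.0
for _ in range(n):
    # Round after every addition to mimic accumulating in each precision.
    total16 = to_float16(total16 + per_example_loss)
    total32 = to_float32(total32 + per_example_loss)

# At 2048 the float16 spacing becomes 2, so adding 1.0 no longer changes the sum.
print(total16 / n)  # mean computed from the float16 accumulator
print(total32 / n)  # mean computed from the float32 accumulator
```

Casting the logits to float32 before the loss, as in the line above, avoids this kind of stalled accumulation.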

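Apply loss scaling: multiply the loss by a scale factor (128 is a common choice) before computing gradients, then divide the resulting gradients by the same factor. Scaling keeps small gradient values from underflowing to zero in float16.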

    loss, params = ...
    scale = 128
    grads = [grad / scale for grad in tf.gradients(loss * scale, params)]
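
To see why the scale factor helps, the sketch below rounds a single small gradient value to float16 with and without scaling. It is a numeric illustration only, not TensorFlow code. The gradient value 1e-8 and the helper `to_float16` are assumptions, and the final division is assumed to happen in higher precision, as it would when `params` are float32 variables.

```python
import struct


def to_float16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack('e', struct.pack('e', x))[0]


scale = 128
true_grad = 1e-8  # hypothetical gradient, below the smallest float16 value

# Without scaling, the gradient underflows to zero in float16.
unscaled = to_float16(true_grad)

# With scaling, the value lands in the representable float16 range.
scaled = to_float16(true_grad * scale)

# Unscale outside float16 so the small value survives.
recovered = scaled / scale

print(unscaled)   # the gradient is lost
print(recovered)  # close to true_grad, up to float16 rounding error
```

The recovered value differs from the true gradient by a few percent because the scaled value still falls in the subnormal float16 range. A larger scale would reduce that error, at the risk of overflowing large gradients.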