/tf.extra_losses

Large-Margin Softmax Loss, Angular Softmax Loss, Additive Margin Softmax, ArcFaceLoss And FocalLoss In Tensorflow

Primary LanguageC++MIT LicenseMIT

Large-Margin Softmax Loss, Angular Softmax Loss, Additive Margin Softmax, ArcFaceLoss And FocalLoss In Tensorflow

This repository contains core codes of the reimplementation of the following papers in TensorFlow:

If your goal is to reproduce the results in the original paper, please use the official codes:

For using these Ops on your own machine:

  • copy the header file "cuda_config.h" from "your_python_path/site-packages/external/local_config_cuda/cuda/cuda/cuda_config.h" to "your_python_path/site-packages/tensorflow/include/tensorflow/stream_executor/cuda/cuda_config.h".

  • run the following script:

mkdir build
cd build && cmake ..
make
  • run "test_op.py" and check the numeric errors to test your install

  • follow the below codes snippet to integrate this Op into your own code:

    • For Large Margin Softmax Loss:
     op_module = tf.load_op_library(so_lib_path)
     large_margin_softmax = op_module.large_margin_softmax
    
     @ops.RegisterGradient("LargeMarginSoftmax")
     def _large_margin_softmax_grad(op, grad, _):
       '''The gradients for `LargeMarginSoftmax`.
       '''
       inputs_features = op.inputs[0]
       inputs_weights = op.inputs[1]
       inputs_labels = op.inputs[2]
       cur_lambda = op.outputs[1]
       margin_order = op.get_attr('margin_order')
    
       grads = op_module.large_margin_softmax_grad(inputs_features, inputs_weights, inputs_labels, grad, cur_lambda[0], margin_order)
       return [grads[0], grads[1], None, None]
    
     var_weights = tf.Variable(initial_value, trainable=True, name='lsoftmax_weights')
     result = large_margin_softmax(features, var_weights, labels, global_step, 4, 1000., 0.000025, 35., 0.)
     loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=result[0]))
    • For Angular Softmax Loss:
     op_module = tf.load_op_library(so_lib_path)
     angular_softmax = op_module.angular_softmax
    
     @ops.RegisterGradient("AngularSoftmax")
     def _angular_softmax_grad(op, grad, _):
       '''The gradients for `AngularSoftmax`.
       '''
       inputs_features = op.inputs[0]
       inputs_weights = op.inputs[1]
       inputs_labels = op.inputs[2]
       cur_lambda = op.outputs[1]
       margin_order = op.get_attr('margin_order')
    
       grads = op_module.angular_softmax_grad(inputs_features, inputs_weights, inputs_labels, grad, cur_lambda[0], margin_order)
       return [grads[0], grads[1], None, None]
    
     var_weights = tf.Variable(initial_value, trainable=True, name='asoftmax_weights')
     normed_var_weights = tf.nn.l2_normalize(var_weights, 1, 1e-10, name='weights_normed')
     result = angular_softmax(features, normed_var_weights, labels, global_step, 4, 1000., 0.000025, 35., 0.)
     loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=result[0]))
    • For others just refer to this script.

All the codes was tested under TensorFlow 1.6, Python 3.5, Ubuntu 16.04 with CUDA 8.0. The outputs of these Ops in C++ had been compared with the original caffe codes' outputs, and the bias could be ignored. The gradients of this Op had been checked using tf.test.compute_gradient_error and tf.test.compute_gradient. While the others are implemented following the official implementation in Python Ops.

If you encountered some linkage problem when generating or loading *.so, you are highly recommended to read this section in the official tourial to make sure you were using the same C++ ABI version.

Any contributions to this repo is welcomed.

MIT License