# A Simple Summary of Loss Functions in Machine Learning
In the context of an optimization algorithm, the function used to evaluate a candidate solution is referred to as the objective function.
Typically, with neural networks, we seek to minimize a loss function (the objective function) so as to search for the candidate solution that fits the data best.
Param | Description
---|---
n | Number of training examples
M | Number of classes
i | Index of the ith training example in the data set
c | Class label
y(i) | Ground truth label for the ith training example
y_hat(i) | Prediction for the ith training example
## Regression Losses

### Mean Squared Error (MSE)

- often abbreviated as 'mse'; also known as Quadratic Loss / L2 Loss
- measured as the average of the squared differences between predictions and actual observations
- gradients are easy to calculate, since the loss is smooth and differentiable everywhere
- squaring penalizes large errors heavily, so it is sensitive to outliers
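As a minimal sketch, MSE over a batch can be computed with NumPy (the function name `mse` is my own, not from any particular library):

```python
import numpy as np

def mse(y, y_hat):
    """Mean squared error: average of squared differences between
    ground truth y and predictions y_hat."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return np.mean((y - y_hat) ** 2)
```

For example, `mse([1, 2, 3], [1, 2, 4])` averages a single squared error of 1 over three examples, giving 1/3.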
### Mean Absolute Error (MAE)

- also known as L1 Loss
- measured as the average of the absolute differences between predictions and actual observations
- more robust to outliers
- harder to optimize with gradients, since the derivative is not defined at zero and has constant magnitude everywhere else
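A matching sketch for MAE (again, `mae` is my own helper name):

```python
import numpy as np

def mae(y, y_hat):
    """Mean absolute error: average of absolute differences between
    ground truth y and predictions y_hat."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return np.mean(np.abs(y - y_hat))
```

For example, `mae([1, 2, 3], [2, 2, 5])` averages absolute errors 1, 0, and 2, giving 1.0.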
## Classification Losses
### Hinge Loss / Multi-class SVM Loss

- the score of the correct category should be greater than the score of each incorrect category by some safety margin (usually one); every incorrect category that violates this margin contributes to the loss
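A sketch of the multi-class hinge loss for a single example, assuming raw class scores as input (the function name `hinge_loss` and its signature are my own):

```python
import numpy as np

def hinge_loss(scores, correct, margin=1.0):
    """Multi-class hinge loss for one example: penalize every incorrect
    class whose score is not below the correct class's score by at
    least `margin`."""
    scores = np.asarray(scores, dtype=float)
    # Margin violation for each class relative to the correct class.
    gaps = np.maximum(0.0, scores - scores[correct] + margin)
    gaps[correct] = 0.0  # the correct class incurs no penalty
    return gaps.sum()
```

With scores `[3.0, 1.0, 2.5]` and correct class 0, only class 2 violates the margin (by 0.5), so the loss is 0.5.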