Support-Vector-Machine

A support vector machine (SVM) is a supervised learning model used for classification.

Development tools and techniques

  • Python
  • PyCharm

Algorithm

The core idea of SVM is to find a decision boundary that separates the classes.


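We want to find the decision boundary $\mathbf{w} \cdot \mathbf{x} + b = 0$ whose margin is maximized for the training set $\{(\mathbf{x}_i, y_i)\}_{i=1}^{N}$.
First, we transform maximizing the margin $2 / \|\mathbf{w}\|$ into minimizing $\|\mathbf{w}\|^2 / 2$ with the constraint $y_i(\mathbf{w} \cdot \mathbf{x}_i + b) \ge 1$, where every class label $y_i$ in the training data must be 1 or -1:

$$\min_{\mathbf{w}, b} \ \frac{\|\mathbf{w}\|^2}{2} \quad \text{subject to} \quad y_i(\mathbf{w} \cdot \mathbf{x}_i + b) \ge 1, \quad i = 1, \dots, N$$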
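As a small illustration of the objective and constraint above, the following sketch evaluates a candidate boundary on a toy data set. The function name `margin_and_feasibility` and the two sample points are assumptions made for this example, not part of the original project.

```python
import numpy as np

def margin_and_feasibility(w, b, X, y, tol=1e-12):
    """Return (margin width, primal objective, whether all constraints hold)."""
    w = np.asarray(w, dtype=float)
    margin = 2.0 / np.linalg.norm(w)        # width of the margin, 2 / ||w||
    objective = 0.5 * np.dot(w, w)          # quantity to minimize, ||w||^2 / 2
    # Every instance must satisfy y_i (w . x_i + b) >= 1.
    feasible = bool(np.all(y * (X @ w + b) >= 1.0 - tol))
    return margin, objective, feasible

# Toy data (illustrative only): one point per class on the x-axis.
X_toy = np.array([[2.0, 0.0], [0.0, 0.0]])
y_toy = np.array([1.0, -1.0])
margin, objective, feasible = margin_and_feasibility([1.0, 0.0], -1.0, X_toy, y_toy)
```

For this toy set, the boundary with w = (1, 0) and b = -1 lies halfway between the two points, so both constraints hold with equality.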

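Use the Lagrange function (primal Lagrangian) to solve this problem, with one multiplier $\lambda_i$ per constraint:

$$L_P = \frac{1}{2}\|\mathbf{w}\|^2 - \sum_{i=1}^{N} \lambda_i \left( y_i(\mathbf{w} \cdot \mathbf{x}_i + b) - 1 \right)$$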

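Take the derivative of $L_P$ with respect to $\mathbf{w}$ and $b$ and set each to zero:

$$\frac{\partial L_P}{\partial \mathbf{w}} = 0 \ \Rightarrow \ \mathbf{w} = \sum_{i=1}^{N} \lambda_i y_i \mathbf{x}_i, \qquad \frac{\partial L_P}{\partial b} = 0 \ \Rightarrow \ \sum_{i=1}^{N} \lambda_i y_i = 0$$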

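Because the Lagrange multipliers are unknown, we still cannot solve for $\mathbf{w}$ and $b$. Lagrange multipliers for equality constraints are free parameters that can take any value, but the constraints here are inequalities. Therefore, we add the Karush–Kuhn–Tucker (KKT) conditions, which constrain the Lagrange multipliers to be non-negative:

$$\lambda_i \ge 0, \qquad \lambda_i \left[ y_i(\mathbf{w} \cdot \mathbf{x}_i + b) - 1 \right] = 0$$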

The constraint $\lambda_i \left[ y_i(\mathbf{w} \cdot \mathbf{x}_i + b) - 1 \right] = 0$, together with $\lambda_i \ge 0$, states that the Lagrange multiplier $\lambda_i$ must be zero unless the training instance satisfies $y_i(\mathbf{w} \cdot \mathbf{x}_i + b) = 1$. The training instances whose $\lambda_i$ is larger than 0 are known as support vectors, and only the support vectors define the decision boundary.
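The KKT conditions can be checked numerically for a candidate solution. The sketch below is only an illustration: the helper `kkt_report`, the tolerance `tol`, and the three toy points are assumptions introduced here.

```python
import numpy as np

def kkt_report(lam, w, b, X, y, tol=1e-8):
    """Check the KKT conditions and list the support vectors (lambda_i > 0)."""
    slack = y * (X @ w + b) - 1.0            # y_i (w . x_i + b) - 1, must be >= 0
    return {
        "multipliers_nonnegative": bool(np.all(lam >= -tol)),
        "constraints_satisfied": bool(np.all(slack >= -tol)),
        "complementary_slackness": bool(np.all(np.abs(lam * slack) <= tol)),
        "support_vectors": np.flatnonzero(lam > tol).tolist(),
    }

# Toy data (illustrative only): the third point lies well outside the margin.
X_toy = np.array([[2.0, 0.0], [0.0, 0.0], [-3.0, 0.0]])
y_toy = np.array([1.0, -1.0, -1.0])
lam_toy = np.array([0.5, 0.5, 0.0])
report = kkt_report(lam_toy, np.array([1.0, 0.0]), -1.0, X_toy, y_toy)
```

The third point has a zero multiplier, so removing it would leave the boundary unchanged; only the first two points act as support vectors.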

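To simplify, we transform the problem into a function of the Lagrange multipliers only (the dual problem).
Substitute $\mathbf{w} = \sum_{i} \lambda_i y_i \mathbf{x}_i$ and $\sum_{i} \lambda_i y_i = 0$ into $L_P$:

$$L_D = \sum_{i=1}^{N} \lambda_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \lambda_i \lambda_j y_i y_j \, \mathbf{x}_i \cdot \mathbf{x}_j$$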

Find the $\lambda$ that maximizes $L_D$ subject to $\lambda_i \ge 0$ and $\sum_{i=1}^{N} \lambda_i y_i = 0$.
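The dual objective depends on the data only through dot products, so it can be evaluated from the Gram matrix. The following is a minimal sketch; the name `dual_objective` and the toy values are assumptions for illustration.

```python
import numpy as np

def dual_objective(lam, X, y):
    """L_D = sum_i lam_i - 1/2 sum_ij lam_i lam_j y_i y_j (x_i . x_j)."""
    v = lam * y                  # element-wise lambda_i * y_i
    gram = X @ X.T               # gram[i, j] = x_i . x_j
    return float(lam.sum() - 0.5 * v @ gram @ v)

# Toy data (illustrative only): one point per class, multipliers with sum(lam * y) = 0.
X_toy = np.array([[2.0, 0.0], [0.0, 0.0]])
y_toy = np.array([1.0, -1.0])
lam_toy = np.array([0.5, 0.5])
value = dual_objective(lam_toy, X_toy, y_toy)
```

For these multipliers the dual value is 0.5, which matches the primal objective $\|\mathbf{w}\|^2 / 2$ at w = (1, 0) for the same two points.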

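Take the derivative of $L_D$ with respect to $\lambda_k$:

$$\frac{\partial L_D}{\partial \lambda_k} = 1 - y_k \sum_{j=1}^{N} \lambda_j y_j \, \mathbf{x}_j \cdot \mathbf{x}_k$$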

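One multiplier, written $\lambda_d$ here, must be decided by the other N-1 multipliers because of $\sum_{i} \lambda_i y_i = 0$ (using $y_d^2 = 1$):

$$\lambda_d = -y_d \sum_{i \ne d} \lambda_i y_i$$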

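The gradient of $L_D$ with respect to a free multiplier $\lambda_k$ ($k \ne d$) is composed of two parts, one through $\lambda_k$ itself and one through the dependent multiplier $\lambda_d = -y_d \sum_{i \ne d} \lambda_i y_i$:

$$\frac{d L_D}{d \lambda_k} = \frac{\partial L_D}{\partial \lambda_k} + \frac{\partial L_D}{\partial \lambda_d} \frac{\partial \lambda_d}{\partial \lambda_k} = \frac{\partial L_D}{\partial \lambda_k} - y_d y_k \frac{\partial L_D}{\partial \lambda_d}$$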
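The two-part gradient can be written compactly with NumPy. This is a sketch under the assumption that one multiplier (index `d`, a name chosen here) is eliminated through the equality constraint; the function names are hypothetical.

```python
import numpy as np

def dual_objective(lam, X, y):
    """Dual objective L_D for multipliers lam."""
    v = lam * y
    return float(lam.sum() - 0.5 * v @ (X @ X.T) @ v)

def partial_grad(lam, X, y):
    """Partial derivatives of L_D, treating every multiplier as independent."""
    return 1.0 - y * ((X @ X.T) @ (lam * y))

def reduced_grad(lam, X, y, d):
    """Gradient w.r.t. the free multipliers when lam[d] = -y[d] * sum_{i != d} lam_i y_i.

    Chain rule: dL/dlam_k = dL/dlam_k (direct) - y_d * y_k * dL/dlam_d.
    The entry at index d comes out as zero because lam[d] is not a free variable.
    """
    g = partial_grad(lam, X, y)
    return g - y[d] * y * g[d]

# Toy values (illustrative only) that satisfy sum(lam * y) = 0.
X_toy = np.array([[2.0, 1.0], [0.0, 0.5], [-3.0, 0.0], [1.0, -1.0]])
y_toy = np.array([1.0, -1.0, -1.0, 1.0])
lam_toy = np.array([0.3, 0.5, 0.2, 0.4])
grad = reduced_grad(lam_toy, X_toy, y_toy, d=3)
```

Because $L_D$ is quadratic in the multipliers, a central finite difference along each free coordinate (with the dependent multiplier recomputed each time) should agree with `reduced_grad` up to rounding error, which is a convenient sanity check.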

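Use gradient descent on $-L_D$ (equivalently, gradient ascent on $L_D$) to find the free multipliers $\lambda_k$ and the dependent multiplier $\lambda_d$. Let $\eta$ be the learning rate:

$$\lambda_k \leftarrow \lambda_k + \eta \, \frac{d L_D}{d \lambda_k} \quad (k \ne d), \qquad \lambda_d \leftarrow -y_d \sum_{i \ne d} \lambda_i y_i$$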

Once we get $\lambda$, we can use $\mathbf{w} = \sum_{i=1}^{N} \lambda_i y_i \mathbf{x}_i$ to calculate $\mathbf{w}$.

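To calculate $b$, we use $\lambda_i \left[ y_i(\mathbf{w} \cdot \mathbf{x}_i + b) - 1 \right] = 0$ to obtain $b_i = y_i - \mathbf{w} \cdot \mathbf{x}_i$ for each support vector and then average these values, where $SV$ is the set of support vectors and $N_{SV}$ is its size:

$$b = \frac{1}{N_{SV}} \sum_{i \in SV} \left( y_i - \mathbf{w} \cdot \mathbf{x}_i \right)$$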
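Recovering $\mathbf{w}$ and $b$ from the multipliers can be sketched as follows. The function name `recover_w_b` and the threshold `tol` used to decide which multipliers count as non-zero are assumptions for this example.

```python
import numpy as np

def recover_w_b(lam, X, y, tol=1e-8):
    """Compute w = sum_i lam_i y_i x_i and b as the mean of b_i over support vectors."""
    w = (lam * y) @ X                       # weighted sum of the training points
    is_sv = lam > tol                       # support vectors have lambda_i > 0
    # For a support vector, y_i (w . x_i + b) = 1 and y_i is +1 or -1,
    # so b_i = y_i - w . x_i.
    b = float(np.mean(y[is_sv] - X[is_sv] @ w))
    return w, b, is_sv

# Toy data (illustrative only).
X_toy = np.array([[2.0, 0.0], [0.0, 0.0], [-3.0, 0.0]])
y_toy = np.array([1.0, -1.0, -1.0])
lam_toy = np.array([0.5, 0.5, 0.0])
w, b, is_sv = recover_w_b(lam_toy, X_toy, y_toy)
predictions = np.sign(X_toy @ w + b)        # predicted class for each training point
```

Averaging over all support vectors, rather than using a single one, reduces the effect of numerical error in any individual multiplier.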

Training detail

  • Each $\lambda_i$ must be within (0, C)

  • At each iteration, we choose a surviving $\lambda_i$ as the dependent variable
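The training details above can be combined into one loop. The following is a minimal sketch, not the project's actual implementation. Several choices in it are assumptions made for illustration: the feasible starting point, picking as dependent variable the multiplier with the most room inside the range, clipping to the closed interval [0, C], and halving the step whenever the dependent multiplier would leave the range.

```python
import numpy as np

def train_dual_svm(X, y, C=30.0, lr=0.005, n_iter=100):
    """Gradient ascent on L_D with one dependent multiplier per iteration."""
    n = len(y)
    gram = X @ X.T
    # Assumed starting point: strictly inside the range and sum(lam * y) = 0.
    c = min(1.0, 0.5 * C)
    lam = np.where(y > 0, c / np.sum(y > 0), c / np.sum(y < 0))
    for _ in range(n_iter):
        # Assumed rule: the dependent multiplier is the one farthest from both bounds.
        room = np.minimum(lam, C - lam)
        d = int(np.argmax(room))
        if room[d] <= 0.0:
            break                                   # no multiplier is left inside the range
        g = 1.0 - y * (gram @ (lam * y))            # partial derivatives of L_D
        direction = g - y[d] * y * g[d]             # chain rule through lam[d]
        others = np.delete(np.arange(n), d)
        step = lr
        for _ in range(30):
            new = np.clip(lam + step * direction, 0.0, C)
            new[d] = -y[d] * np.sum(new[others] * y[others])   # restore sum(lam * y) = 0
            if 0.0 <= new[d] <= C:
                lam = new
                break
            step *= 0.5                             # assumed safeguard: shrink the step
    return lam

# Synthetic, well-separated data (illustrative only).
rng = np.random.default_rng(0)
X_demo = np.vstack([rng.normal(2.0, 0.5, (10, 2)), rng.normal(-2.0, 0.5, (10, 2))])
y_demo = np.array([1.0] * 10 + [-1.0] * 10)
lam_demo = train_dual_svm(X_demo, y_demo)
```

By construction every accepted update keeps all multipliers in [0, C] and keeps $\sum_i \lambda_i y_i = 0$. Whether a fixed learning rate converges within a given number of iterations depends on the scale of the data, so the defaults here simply mirror the values listed under Practice.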

Practice

  • Number of class label: 2

  • Number of data: 100

  • C: 30

  • Learning rate: 0.005

  • Iteration: 100

  • Variation of loss

    (Figures: loss over iterations 0 ~ 99, iterations 0 ~ 12, and iterations 12 ~ 99)
  • Result

    • There are 3 support vectors

Reference

  • AI course of international management department in NTUST
  • Introduction to Data Mining, by Pang-Ning Tan, Michael Steinbach, and Vipin Kumar