SVM – Support Vector Machine

SVM is a Supervised Machine Learning model, grouped with the Linear Models, that can be used for both Classification and Regression.

  • It performs well in high-dimensional spaces [many features].
  • It remains effective when the number of dimensions is larger than the number of observations.

It's an advanced ML model that handles overfitting quite well. Its training is formulated as a constrained optimization problem, which is solved with techniques such as Lagrange multipliers.

In SVM, to avoid overfitting, we choose a Soft Margin instead of a Hard one, i.e. we intentionally allow some data points to enter the margin (but we do penalize them), so that the classifier doesn't overfit the training sample.
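
To make the penalty concrete, here is a minimal sketch of the slack that a soft margin assigns to each point. The hyperplane (w, b) and the four sample points are made up for illustration; they do not come from a fitted model, and labels are assumed to be +1 / -1.

```python
# Slack (hinge penalty) for a fixed, hypothetical hyperplane w.x + b = 0.
# A point has zero slack when it lies on or outside its own side of the
# margin, i.e. when y * (w.x + b) >= 1.

def slack(w, b, x, y):
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(0.0, 1.0 - y * score)

w, b = (1.0, 0.0), 0.0   # assumed hyperplane: the vertical line x1 = 0

points = [
    ((2.0, 1.0), +1),    # well outside the margin
    ((0.5, 0.0), +1),    # correct side, but inside the margin
    ((-0.5, 0.0), +1),   # wrong side of the hyperplane
    ((-3.0, 2.0), -1),   # well outside the margin
]

slacks = [slack(w, b, x, y) for x, y in points]
print(slacks)  # [0.0, 0.5, 1.5, 0.0]
```

In the usual soft-margin formulation the sum of these slacks is multiplied by a penalty weight and added to the objective, so points inside the margin are tolerated but never free.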

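Here come two important parameters that control overfitting in SVM: C, which sets how heavily points inside the margin are penalized, and Gamma (γ), the kernel coefficient used by the rbf, poly and sigmoid kernels. With rbf, a larger gamma makes the influence of each training point more local, which makes overfitting more likely.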
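
A small experiment can show how gamma changes the fit. This is only a sketch, assuming scikit-learn is installed; the synthetic dataset, the three gamma values and the train/test split are arbitrary choices made for illustration.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic, noisy two-class data (illustrative only).
X, y = make_moons(n_samples=400, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

results = {}
for gamma in (0.01, 1.0, 100.0):
    clf = SVC(kernel="rbf", C=1.0, gamma=gamma).fit(X_train, y_train)
    # Store (train accuracy, test accuracy) for this gamma.
    results[gamma] = (clf.score(X_train, y_train), clf.score(X_test, y_test))

for gamma, (train_acc, test_acc) in results.items():
    print(f"gamma={gamma:<6} train={train_acc:.2f} test={test_acc:.2f}")
```

A training accuracy that keeps rising with gamma while the test accuracy stalls or drops is the usual sign of overfitting.
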

When SVM is used
  • for Classification, it is known as SVC [Support Vector Classifier]
  • for Regression, it is known as SVR [Support Vector Regressor]
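
Both variants are available in scikit-learn as `sklearn.svm.SVC` and `sklearn.svm.SVR`. The sketch below uses tiny made-up datasets just to show that the classifier returns class labels while the regressor returns a continuous value; the parameter values are arbitrary.

```python
from sklearn.svm import SVC, SVR

# Classification: two well-separated toy groups (made-up data).
X_cls = [[0, 0], [0, 1], [1, 0], [3, 3], [3, 4], [4, 3]]
y_cls = [0, 0, 0, 1, 1, 1]
clf = SVC(kernel="linear").fit(X_cls, y_cls)
labels = clf.predict([[0.5, 0.5], [3.5, 3.5]])
print(labels)  # predicted class labels

# Regression: noiseless points on the line y = 2x (made-up data).
X_reg = [[0], [1], [2], [3], [4]]
y_reg = [0, 2, 4, 6, 8]
reg = SVR(kernel="linear", C=100.0, epsilon=0.1).fit(X_reg, y_reg)
value = reg.predict([[5]])[0]
print(round(value, 1))  # a continuous value, close to 10
```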

Consider that we want to build an ML model that can classify 2 groups, e.g. green circles vs blue squares [in the chart above]. We need a central line, i.e. a hyperplane, to separate them. However, to select the best hyperplane we need some support.

Support comes from support vectors, i.e. the actual observations from the data that are closest to the Hyperplane. Through them we get 2 extra supporting hyperplanes, i.e. the Negative and Positive Hyperplanes.

The distance between these 2 hyperplanes is called the gutter width [the margin]. The hyperplane midway between these 2 hyperplanes is called the Maximum Margin Hyperplane.

  • Support Vectors are the data points that help you create the linear boundaries [Hyperplanes] separating the 2 classes.
  • Machine means the ML model here.
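
The support vectors and the gutter width can be read off a fitted linear model. The following is a sketch, assuming scikit-learn and NumPy; the six points are made up, and a very large C is used only to approximate a hard margin on this separable toy data.

```python
import numpy as np
from sklearn.svm import SVC

# Made-up, linearly separable toy data with labels -1 / +1.
X = np.array([[0, 0], [0, 1], [1, 0], [3, 3], [3, 4], [4, 3]], dtype=float)
y = np.array([-1, -1, -1, 1, 1, 1])

# A very large C approximates a hard margin on separable data.
clf = SVC(kernel="linear", C=1e6).fit(X, y)

w = clf.coef_[0]        # normal vector of the hyperplane w.x + b = 0
b = clf.intercept_[0]
gutter_width = 2.0 / np.linalg.norm(w)

print(clf.support_vectors_)  # the observations closest to the hyperplane
print(gutter_width)          # distance between the two supporting hyperplanes

# Support vectors sit on the negative (-1) or positive (+1) hyperplane,
# so their decision values are approximately -1 or +1.
print(clf.decision_function(clf.support_vectors_))
```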

WIDEST STREET APPROACH: Here we talk about Upper and Lower margins [also known as Decision Boundaries] and hence a gutter width between the two.

The optimization objective here is to maximize the gutter width, so that we get a clear separation between the 2 classes.
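
Written out, and assuming labels y_i in {-1, +1} and a hyperplane w·x + b = 0 (notation introduced here for illustration, it is not used elsewhere in this post), the widest-street idea for perfectly separable data is usually stated as follows:

```latex
\text{gutter width} = \frac{2}{\lVert w \rVert}

\max_{w,\,b} \ \frac{2}{\lVert w \rVert}
\quad \Longleftrightarrow \quad
\min_{w,\,b} \ \frac{1}{2}\lVert w \rVert^{2}
\qquad \text{subject to} \quad
y_i\,(w \cdot x_i + b) \ge 1 \quad \text{for all } i
```

Maximizing the width is the same as minimizing the length of w, and the constraint keeps every observation on or outside its own supporting hyperplane.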


#python class for SVC

class sklearn.svm.SVC(
*, 
C=1.0, 
kernel='rbf', 
degree=3, 
gamma='scale', 
coef0=0.0, 
shrinking=True, 
probability=False, 
tol=0.001, 
cache_size=200, 
class_weight=None, 
verbose=False, 
max_iter=-1, 
decision_function_shape='ovr', 
break_ties=False, 
random_state=None)

source - https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html
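
As a quick usage sketch, assuming scikit-learn is installed, the class can be instantiated with its defaults or with a few keyword arguments overridden. The override values below are arbitrary and chosen only for illustration.

```python
from sklearn.svm import SVC

# Print a few defaults as reported by the installed scikit-learn version.
default_clf = SVC()
print(default_clf.kernel, default_clf.C, default_clf.gamma)

# All parameters are keyword-only (that is what the leading * means),
# so they must be passed by name.
custom_clf = SVC(C=0.5, kernel="poly", degree=2, class_weight="balanced")
params = custom_clf.get_params()
print(params["kernel"], params["degree"], params["C"])
```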

  • The above SVC class uses:
    • Kernel functions like
      • Linear
      • Polynomial
      • RBF [Radial Basis Function]
      • Sigmoid
    • C is the Inverse Regularization Parameter, i.e. the strength of the regularization is inversely proportional to C
    • decision_function_shape [used for the multi-class strategy] can take the values
      • 'ovr' – One vs Rest
      • 'ovo' – One vs One
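
The effect of C and of decision_function_shape can be checked on synthetic data. This is a sketch, assuming scikit-learn and NumPy; the blob datasets and the two C values are arbitrary illustrative choices.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two overlapping synthetic groups, so the margin has to be soft.
X2, y2 = make_blobs(n_samples=200, centers=2, cluster_std=2.5, random_state=0)

widths = {}
for C in (0.01, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X2, y2)
    widths[C] = 2.0 / np.linalg.norm(clf.coef_[0])  # gutter width
print(widths)  # the smaller C (stronger regularization) gives the wider gutter

# Four synthetic classes: 'ovr' returns one score per class,
# 'ovo' one score per pair of classes (4 * 3 / 2 = 6).
X4, y4 = make_blobs(n_samples=200, centers=4, random_state=0)
shapes = {}
for shape in ("ovr", "ovo"):
    clf = SVC(kernel="linear", decision_function_shape=shape).fit(X4, y4)
    shapes[shape] = clf.decision_function(X4[:5]).shape
print(shapes)  # {'ovr': (5, 4), 'ovo': (5, 6)}
```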

SVM behaves as a Non-Linear Model when we use a kernel other than Linear.
Non-Linear Kernels like poly, rbf and sigmoid separate the classes using non-linear curves, and are to be used when it's not possible to separate the data with a straight line.
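
To see the difference between kernels, the sketch below fits each one on data that no straight line can separate. It assumes scikit-learn is installed; the ring-shaped dataset is synthetic and every other parameter is left at its default.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Synthetic data: one class forms a ring around the other,
# so a straight line cannot separate them.
X, y = make_circles(n_samples=300, factor=0.3, noise=0.05, random_state=0)

scores = {}
for kernel in ("linear", "poly", "rbf", "sigmoid"):
    clf = SVC(kernel=kernel).fit(X, y)
    scores[kernel] = clf.score(X, y)  # accuracy on the training data

for kernel, acc in scores.items():
    print(f"{kernel:<8} {acc:.2f}")
```

On data shaped like this, the rbf kernel should score clearly higher than the linear one. How the other kernels fare depends on their own parameters [degree, gamma, coef0].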

Rahul Aggarwal
http://guardiancoder.in

Senior Data Scientist and Gen-AI Engineer #DataScience #AI #RNN #CNN #GenAI #ChatGPT #LLMs
