Hyperplanes and Support Vector Machines (SVMs)

Brief Explanation

Hydroplanes on Lake Washington

(Yes.. I KNOW.. HYDROplanes are different from HYPERplanes..)

Support Vector Machines (SVMs) are powerful supervised learning models used for classification and regression.

At the core of SVMs lie two key concepts: hyperplanes and support vectors.

Understanding these two concepts will demystify why SVMs work so well in complex classification problems.


1. What is a Hyperplane?

A hyperplane is a decision boundary that separates different classes in an SVM model.

  • In 2D, it’s a straight line.
  • In 3D, it’s a flat plane.
  • In higher dimensions, it’s a flat surface with one fewer dimension than the data (hard to draw, but the same idea).

The key idea of SVMs is to find the optimal hyperplane: the one that separates the classes with the maximum margin, i.e., the largest possible distance to the closest data points of either class.
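
In equation form, a hyperplane is just a single linear equation in the feature vector x. Here is a small sketch in the usual notation, where w is the weight vector and b the bias (scikit-learn exposes these as coef_ and intercept_):

    w \cdot x + b = 0, \qquad \text{margin width} = \frac{2}{\lVert w \rVert}

The margin is the distance between the two parallel planes w \cdot x + b = +1 and w \cdot x + b = -1 that pass through the closest points of each class, so maximizing the margin amounts to minimizing \lVert w \rVert.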

Example: Visualizing a Hyperplane

Imagine you’re classifying red and blue points on a graph. A simple straight line (hyperplane) can separate them like this:

Red   |  Blue
------|------
Red   |  Blue

But what if the points are mixed up and not linearly separable? That’s where support vectors and kernels come in!


2. What are Support Vectors?

Support vectors are the data points closest to the hyperplane. These points define the margin of separation.

  • Fewer support vectors usually means a simpler decision boundary; if a large share of the training points end up as support vectors, the model may be overfitting.
  • These points are critical: moving or removing them would change the hyperplane’s position!

Why are Support Vectors Important?

  • They define the margin that the SVM maximizes, which improves generalization.
  • They are the most influential points in the dataset; the remaining points don’t affect the decision boundary at all.
  • With a soft margin, they let SVMs tolerate noisy points and outliers better than many other models.

Example in Python

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
import matplotlib.pyplot as plt
import numpy as np

# Generate a simple 2-feature, 2-class sample dataset
X, y = datasets.make_classification(n_features=2, n_classes=2, n_redundant=0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train SVM model
svm_model = SVC(kernel='linear')
svm_model.fit(X_train, y_train)

# Recover the hyperplane w·x + b = 0 from the fitted model
w = svm_model.coef_[0]
b = svm_model.intercept_[0]
x_values = np.linspace(X[:, 0].min(), X[:, 0].max(), 100)
y_values = -(w[0] / w[1]) * x_values - (b / w[1])

# Plot the data, the hyperplane, and the support vectors (circled)
plt.scatter(X[:, 0], X[:, 1], c=y, cmap='coolwarm')
plt.plot(x_values, y_values, 'k-')
plt.scatter(svm_model.support_vectors_[:, 0], svm_model.support_vectors_[:, 1],
            s=120, facecolors='none', edgecolors='k')
plt.title("SVM Hyperplane and Support Vectors")
plt.show()
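
Once the model is fitted, scikit-learn also exposes the support vectors directly, so you can inspect exactly which points are holding the margin in place. A quick follow-up using the svm_model from above:

# Indices of the training points that became support vectors
print(svm_model.support_)

# The support vectors themselves (their coordinates in feature space)
print(svm_model.support_vectors_)

# Number of support vectors per class
print(svm_model.n_support_)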

3. How SVMs Find the Best Hyperplane

SVMs don’t just find any hyperplane; they find the one that maximizes the margin. This is done through:

  • Hard Margin SVM: used when the data is perfectly (linearly) separable and no misclassification is allowed.
  • Soft Margin SVM: allows some misclassification when the data is noisy or overlapping (sketched just below).
  • Kernel Trick: maps non-linearly separable data into a higher-dimensional space where a hyperplane can separate the classes.
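
The soft-margin behavior is controlled by SVC’s C parameter. Here is a minimal sketch (reusing X_train and y_train from the earlier example) of how C trades margin width against misclassification:

from sklearn.svm import SVC  # already imported above; repeated so the snippet stands alone

# Small C: softer margin, more misclassification tolerated, usually more support vectors
soft_svm = SVC(kernel='linear', C=0.1)
soft_svm.fit(X_train, y_train)

# Large C: stricter margin, misclassification penalized heavily (closer to a hard margin)
strict_svm = SVC(kernel='linear', C=100)
strict_svm.fit(X_train, y_train)

print("Support vectors with C=0.1:", len(soft_svm.support_vectors_))
print("Support vectors with C=100:", len(strict_svm.support_vectors_))

In practice, C is usually tuned with cross-validation rather than set by hand.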

Example: Using Kernels for Complex Boundaries

# Train SVM with a non-linear kernel
svm_model_rbf = SVC(kernel='rbf', gamma='scale')
svm_model_rbf.fit(X_train, y_train)

# Predict and evaluate
accuracy = svm_model_rbf.score(X_test, y_test)
print("Non-Linear SVM Accuracy:", accuracy)

4. Comparing Hyperplanes in Different Dimensions

Dimension | Type of Hyperplane
----------|-------------------
2D        | Straight line
3D        | Flat plane
4D+       | Multi-dimensional flat surface

The beauty of SVMs is that they can handle any number of dimensions, as long as you use the right kernel function.
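
For example, here’s a minimal sketch on the classic iris dataset, which has 4 features, so the decision boundary is a hyperplane living in 4-dimensional space (dataset and split chosen purely for illustration):

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Iris has 4 features, so the separating surface is a hyperplane in 4D
X_iris, y_iris = datasets.load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X_iris, y_iris, test_size=0.2, random_state=42)

svm_4d = SVC(kernel='rbf', gamma='scale')
svm_4d.fit(X_tr, y_tr)
print("Accuracy on 4-feature data:", svm_4d.score(X_te, y_te))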

