Questions and Answers
What is a key objective in establishing an optimal separating hyperplane?
In the context of non-separable data, what is the purpose of using a soft margin?
What mathematical concept is primarily used to gauge the performance of a model in margin-based classifiers?
Which statement about maximizing margin in machine learning is accurate?
What is an inherent challenge of creating a separating hyperplane for non-separable data?
Study Notes
Machine Learning - EEC3501
- The lecture introduces the concept of a linear classifier that distinguishes between two classes of data points.
- Data points are represented graphically with stars and circles.
- A linear decision boundary is needed to separate the classes effectively.
- The decision boundary is a line in 2D; in general, in a d-dimensional space it is a hyperplane of dimension d − 1.
- The hyperplane is defined by f(x) = w^T x + b = 0, where x is a data point, w is a weight vector, and b is a bias parameter (a small sketch of this decision function appears after these notes).
- An optimal hyperplane maximizes the distance to the nearest data point of either class; this distance is the margin.
- The margin is therefore determined by the closest training point from either class.
- Keeping the classifier away from the training points leads to better generalization on unseen test data.
- The decision hyperplane (in the 2D case, a line) is perpendicular to w.
- A unit vector w* = w/||w|| in the direction of w is used in the geometric derivation.
- Correct classification of a data point x^(i) means sign(w^T x^(i) + b) = t^(i).
- Equivalently, the classification condition can be written t^(i)(w^T x^(i) + b) > 0.
- Requiring t^(i)(w^T x^(i) + b) ≥ C for some C > 0 separates the data with a margin; rescaling w and b lets us fix C = 1.
- Maximizing the margin 1/||w|| is then equivalent to minimizing ||w||², subject to the constraint t^(i)(w^T x^(i) + b) ≥ 1 for every training point i.
- Training points whose algebraic margin t^(i)(w^T x^(i) + b) equals exactly 1 are called support vectors.
- If the data are not linearly separable, slack variables ξ_i ≥ 0 can be introduced to allow points to fall inside or beyond the margin.
- The constraint becomes t^(i)(w^T x^(i) + b) ≥ 1 − ξ_i, and the total slack Σ_i ξ_i is penalized, i.e., added to the minimization objective.
- The hyperparameter γ controls the trade-off between maximizing the margin and penalizing margin violations.
- The hinge loss function is introduced: L_hinge(y, t) = max(0, 1 − ty), where y = w^T x + b is the model output.
- It is zero for points classified beyond the margin and grows linearly with the amount of violation, so it measures how well the hyperplane fits the data.
- Soft-margin SVM is thus a linear classifier trained with the hinge loss function plus an L2 regularization term (a minimal training sketch follows these notes).
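The decision function and the geometric margin from the notes can be sketched in a few lines of NumPy. This is a minimal illustration, not course code: the parameter values, the toy point, and the function names are assumptions made for the example.

```python
import numpy as np

# Illustrative parameters for a 2D hyperplane f(x) = w^T x + b = 0.
w = np.array([2.0, -1.0])   # weight vector, normal to the hyperplane
b = 0.5                     # bias parameter

def decision(x):
    """Signed score f(x) = w^T x + b; its sign is the predicted class."""
    return w @ x + b

def geometric_margin(x, t):
    """Signed distance of x from the hyperplane, positive when the
    point is correctly classified: t * f(x) / ||w||."""
    return t * decision(x) / np.linalg.norm(w)

x = np.array([1.0, 2.0])    # a toy data point
t = 1                       # its label, taken from {-1, +1}
print(np.sign(decision(x)))     # predicted class
print(geometric_margin(x, t))   # distance to the decision boundary
```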
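For the soft-margin case, one simple way (not necessarily the lecture's) to minimize the L2-regularized hinge loss is full-batch subgradient descent on the objective ½||w||² + γ Σ_i max(0, 1 − t_i(w^T x_i + b)). The toy dataset, learning rate, and epoch count below are invented for illustration; a practical implementation would normally use an off-the-shelf solver.

```python
import numpy as np

def train_soft_margin_svm(X, t, gamma=1.0, lr=0.01, epochs=200):
    """Minimize 0.5*||w||^2 + gamma * sum_i max(0, 1 - t_i (w^T x_i + b))
    by full-batch subgradient descent. Labels t must be in {-1, +1};
    gamma trades margin width against total slack."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        scores = X @ w + b
        # Points violating the margin (inside it or misclassified)
        # contribute a nonzero hinge-loss subgradient.
        violating = t * scores < 1
        grad_w = w - gamma * (t[violating] @ X[violating])
        grad_b = -gamma * t[violating].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Tiny linearly separable toy set (hypothetical data).
X = np.array([[2.0, 2.0], [1.5, 2.5], [-1.0, -1.0], [-2.0, -1.5]])
t = np.array([1, 1, -1, -1])
w, b = train_soft_margin_svm(X, t)
print(np.sign(X @ w + b))  # should recover the labels t
```

Note that only the margin-violating points contribute to the subgradient: at the optimum, the points with t(w^T x + b) ≤ 1 are exactly the ones playing the role of support vectors.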
Description
This quiz explores the concept of linear classifiers used to distinguish between two classes of data points. It covers the geometric representation of data points, the formulation of decision boundaries or hyperplanes, and the importance of maximizing margins for better generalization. Test your understanding of the principles underlying linear classification in machine learning.