Support Vector Machine (SVM)
Summary
This document provides a detailed overview of support vector machines (SVMs). It covers the basic concepts through to common applications, highlighting their use in linear and nonlinear classification. The guide explains SVM terminology and the types of SVMs, and includes diagrams to illustrate the concepts discussed.
Full Transcript
Support Vector Machine (SVM)

What is it? A numerical classifier that draws a single decision boundary maximizing the margin between two classes of data. A classifier is a machine learning model; a kernel helps find a hyperplane in a higher-dimensional space without greatly increasing the computational cost.

A Support Vector Machine (SVM) is a powerful supervised machine learning algorithm widely used for both linear and nonlinear classification, as well as regression and outlier detection, though it is best suited to classification tasks. SVMs are highly adaptable, making them suitable for applications such as text classification, image classification, spam detection, handwriting identification, and face detection.

The primary objective of the SVM algorithm is to identify the optimal hyperplane in an N-dimensional feature space that separates the data points into different classes. The algorithm ensures that the margin between the hyperplane and the closest points of each class, known as support vectors, is maximized.

The dimension of the hyperplane depends on the number of features: with two input features the hyperplane is simply a line, and with three input features it becomes a 2-D plane. Beyond three features, the hyperplane becomes increasingly hard to visualize.

How does the Support Vector Machine Algorithm Work?

Consider two independent variables, x1 and x2, and one dependent variable represented as either a blue circle or a red circle.
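The two-feature setup above can be sketched in code. This is a minimal example, not the document's own method: it assumes scikit-learn's `SVC` and uses a small hypothetical dataset where class 0 plays the role of the blue circles and class 1 the red circles.

```python
# Minimal sketch: fitting a linear SVM on two features (x1, x2).
# Assumes scikit-learn; the data points are hypothetical.
import numpy as np
from sklearn.svm import SVC

# Class 0 ("blue circles") clusters low, class 1 ("red circles") clusters high
X = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.0],
              [6.0, 5.0], [7.0, 8.0], [8.0, 6.0]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

# The learned hyperplane is w·x + b = 0; the support vectors are the
# training points closest to it
w, b = clf.coef_[0], clf.intercept_[0]
print(clf.support_vectors_)
print(clf.predict([[2.0, 2.0], [7.0, 7.0]]))
```

Because the toy data is well separated, the fitted boundary classifies both new points with their nearby cluster.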
In this scenario the hyperplane is a line, because we are working with two features (x1 and x2). Many lines (hyperplanes) can separate the data points; the challenge is to determine the best one, the line that maximizes the separation margin between the red and blue circles. SVM chooses exactly this line: the maximum-margin hyperplane, also referred to as the hard margin, selected by maximizing the distance between the hyperplane and the nearest data point on each side.

What if the data are not linearly separable? SVM solves this by creating a new variable using a kernel. For a point xi on the line, we create a new variable yi as a function of its distance from the origin o; plotting the data with this extra variable lifts it into a space where a linear separation exists. A non-linear function that creates such a new variable is referred to as a kernel.

Support Vector Machine Terminology

- Hyperplane: the decision boundary used to separate data points of different classes in a feature space. For linear classification it is the linear equation wx + b = 0.
- Support Vectors: the data points closest to the hyperplane. These points are critical in determining the hyperplane and the margin.
- Margin: the distance between the support vectors and the hyperplane. The primary goal of the SVM algorithm is to maximize this margin, as a wider margin typically results in better classification performance.
- Kernel: a mathematical function used to map input data into a higher-dimensional feature space, allowing the SVM to find a hyperplane when the data points are not linearly separable in the original space. Common kernel functions include linear, polynomial, radial basis function (RBF), and sigmoid.
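The "new variable from the distance to the origin" idea can be sketched directly. This is an illustrative lift done by hand (scikit-learn assumed; the data is synthetic): an inner cluster and an outer ring are not linearly separable in (x1, x2), but adding the distance from the origin as a third feature makes them separable by a plane.

```python
# Sketch of the kernel-style lift described in the text: add
# y_i = distance of x_i from the origin o as an extra feature.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Class 0: inner cluster near the origin; class 1: ring of radius 3
inner = rng.normal(0.0, 0.5, size=(20, 2))
angles = rng.uniform(0.0, 2 * np.pi, 20)
outer = np.c_[3 * np.cos(angles), 3 * np.sin(angles)]
X = np.vstack([inner, outer])
y = np.array([0] * 20 + [1] * 20)

# New variable: distance of each point from the origin
dist = np.linalg.norm(X, axis=1, keepdims=True)
X_lifted = np.hstack([X, dist])          # now a 3-D feature space

clf = SVC(kernel="linear").fit(X_lifted, y)
print(clf.score(X_lifted, y))            # separable after the lift
```

In practice the kernel trick achieves this implicitly, without ever computing the lifted coordinates.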
- Hard Margin: the maximum-margin hyperplane that perfectly separates the data points of different classes without any misclassifications.
- Soft Margin: when the data contain outliers or are not perfectly separable, SVM uses the soft margin technique. This method introduces a slack variable for each data point to allow some misclassifications while balancing margin maximization against margin violations.

Types of Support Vector Machine

Based on the nature of the decision boundary, Support Vector Machines can be divided into two main types:

- Linear SVM: uses a linear decision boundary to separate the classes. Linear SVMs are suitable when the data can be precisely linearly separated, meaning a single straight line (in 2D) or hyperplane (in higher dimensions) can entirely divide the data points into their respective classes. The decision boundary is the hyperplane that maximizes the margin between the classes.
- Non-Linear SVM: used when the data cannot be separated by a straight line (in the 2D case). Kernel functions transform the original input data into a higher-dimensional feature space where the data points can be linearly separated; a linear SVM in this transformed space corresponds to a nonlinear decision boundary in the original space.

Popular Kernel Functions in SVM

The SVM kernel is a function that takes a low-dimensional input space and transforms it into a higher-dimensional space, i.e. it converts non-separable problems into separable problems. It is mostly useful in non-linear separation problems. Simply put, the kernel performs complex data transformations and then finds the process to separate the data based on the labels or outputs defined.
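A quick way to see the effect of the kernel choice is to fit the same data with several kernels. This sketch assumes scikit-learn; the kernel names ("linear", "poly", "rbf") and the soft-margin parameter `C` are scikit-learn's.

```python
# Hedged sketch: comparing kernels on concentric-circle data, which is
# not linearly separable in the original 2-D space (scikit-learn assumed).
from sklearn.datasets import make_circles
from sklearn.svm import SVC

X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

scores = {}
for kernel in ["linear", "poly", "rbf"]:
    clf = SVC(kernel=kernel, C=1.0).fit(X, y)  # C controls the soft margin
    scores[kernel] = clf.score(X, y)
    print(kernel, round(scores[kernel], 2))
```

The linear kernel cannot separate the rings, while the RBF kernel fits them easily; lowering `C` would loosen the soft margin and tolerate more violations.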
Advantages of Support Vector Machine (SVM)

1. High-Dimensional Performance: SVM excels in high-dimensional spaces, making it suitable for image classification and gene expression analysis.
2. Nonlinear Capability: using kernel functions such as RBF and polynomial, SVM effectively handles nonlinear relationships.
3. Outlier Resilience: the soft margin allows SVM to tolerate outliers, enhancing robustness in spam detection and anomaly detection.
4. Binary and Multiclass Support: SVM is effective for both binary and multiclass classification, suitable for applications such as text classification.
5. Memory Efficiency: the model depends only on the support vectors, making SVM memory efficient compared to many other algorithms.

Disadvantages of Support Vector Machine (SVM)

1. Slow Training: SVM can be slow on large datasets, affecting performance in data mining tasks.
2. Parameter Tuning Difficulty: selecting the right kernel and adjusting parameters such as C requires careful tuning.
3. Noise Sensitivity: SVM struggles with noisy datasets and overlapping classes, limiting its effectiveness in some real-world scenarios.
4. Limited Interpretability: the complexity of the hyperplane in higher dimensions makes SVM less interpretable than some other models.
5. Feature Scaling Sensitivity: proper feature scaling is essential; otherwise SVM models may perform poorly.
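The feature-scaling point can be made concrete. This is a hedged sketch (scikit-learn assumed, data synthetic): one feature carries the signal while another is pure noise on a vastly larger scale, so an unscaled RBF kernel's distances are dominated by the noise feature. Standardizing inside a pipeline fixes this.

```python
# Sketch of why feature scaling matters for an RBF-kernel SVM
# (scikit-learn assumed; the dataset is synthetic).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(1)
n = 200
# Feature 1 separates the classes; feature 2 is noise on a huge scale
signal = np.r_[rng.normal(-1, 0.3, n // 2), rng.normal(1, 0.3, n // 2)]
noise = rng.normal(0.0, 1e4, n)
X = np.c_[signal, noise]
y = np.r_[np.zeros(n // 2, int), np.ones(n // 2, int)]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

unscaled = SVC(kernel="rbf").fit(X_tr, y_tr)
scaled = make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X_tr, y_tr)

print("unscaled:", unscaled.score(X_te, y_te))
print("scaled:  ", scaled.score(X_te, y_te))
```

Fitting the scaler inside the pipeline also ensures the test data is transformed with statistics learned from the training split only.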