Questions and Answers
What is unsupervised learning?
A type of machine learning where the model is trained on unlabelled data to discover hidden patterns.
What is supervised learning?
A type of machine learning where the model is trained on labelled data to map inputs to outputs.
What is regression in machine learning?
A predictive modeling task aimed at predicting a continuous outcome.
What is classification?
A predictive modeling task that assigns input data to discrete categories or classes.
What is underfitting in machine learning?
When a model is too simple to capture the underlying patterns in the data, so it performs poorly on both training and test data.
What is overfitting?
When a model fits the training data too well, learning noise specific to it, so it performs poorly on test data.
What is model selection?
The process of comparing candidate models and choosing the one that generalises best to unseen data.
What is a training dataset?
The data used to fit the model, allowing it to learn relationships between inputs and outputs.
What is a validation dataset?
A separate dataset used to tune hyperparameters and check how well the model generalises during development.
What is a test dataset?
Data held out to evaluate the trained model's performance on previously unseen examples.
What is cross-validation?
A technique that splits the data into K folds, training on K-1 folds and validating on the remaining fold in turn.
What does the no free lunch theorem state?
That no single machine learning algorithm performs best on every problem; the best choice depends on the task and data.
What are model parameters?
Internal variables, such as weights, that the model learns from the training data.
What is a parametric model?
A model that assumes a fixed functional form and has a fixed number of parameters regardless of dataset size.
What is a nonparametric model?
A model that does not assume a fixed functional form and whose complexity can grow with the amount of data.
What is a likelihood function?
A function that measures how well a given set of parameters explains the observed data.
What is maximum likelihood estimation (MLE)?
A method for estimating model parameters by maximising the likelihood function.
Study Notes
Machine Learning Principles
- Unsupervised Learning: Model learns from unlabelled data to find hidden patterns or structures. Example: K-means clustering algorithm groups similar data points together.
- Supervised Learning: Model learns from labelled data, mapping inputs to outputs based on provided examples. Examples include linear regression (predicting continuous values), logistic regression (predicting categorical outcomes), and support vector machines (separating data points into classes).
- Regression: Predicting a continuous outcome based on input variables.
- Classification: Assigning input data into discrete categories or classes.
- Underfitting: Model is too simple to capture the underlying patterns in data. The model performs poorly on both training and testing data. Solution: increase model complexity or add more features.
- Overfitting: The model fits the training data too well, learning noise and details specific to the training set, so it performs poorly on test data. Solutions: gather more data, simplify the model, or apply regularisation; cross-validation helps detect overfitting.
- Model Selection: Choosing the best machine learning model for a specific task. This involves comparing models, evaluating performance, and selecting the one that generalises best to unseen data. Cross-validation is a key tool for model selection.
- Training Dataset: Data used to train the model, allowing the model to determine relationships between inputs and outputs.
- Validation Dataset: A separate set of data used to tune model parameters (hyperparameters). It helps prevent overfitting by evaluating how well the model generalises to unseen data.
- Test Dataset: Data used to evaluate the performance of the trained model on previously unseen data.
- Cross-Validation: A technique for evaluating a model by splitting the data into K folds. The model trains on K-1 folds and validates on the remaining fold, rotating through all K folds. This gives a more reliable estimate of how well the model generalises to new data.
- No Free Lunch Theorem: No single machine learning algorithm is the best for every problem. The best performing algorithm depends on the specific task and dataset.
- Model Parameters: Internal variables of the model that are learned from the training data, such as weights. These parameters define the model's behavior and how it makes predictions.
- Parametric Model: Assumes a specific form for the function mapping inputs to outputs and has a fixed number of parameters, regardless of the size of the dataset. Examples include linear regression and logistic regression.
- Nonparametric Model: Does not assume a fixed form for the function; its effective number of parameters can grow as more data becomes available. Examples include k-nearest neighbours and decision trees.
- Likelihood Function: Measures how likely a given set of parameters explains the observed data.
- Maximum Likelihood Estimation (MLE): A method for estimating the parameters of a model by maximizing the likelihood function.
- Bias Term: A constant term added to the model (the b in y = mx + b) that shifts the fitted function so it need not pass through the origin, allowing the model to fit the data better and potentially improve prediction accuracy.
- Dummy Variable: A variable that converts a categorical feature into a numerical format. It is essential for models that require numerical inputs, transforming a categorical variable into a binary representation (e.g., 0 or 1) to represent the presence or absence of a particular characteristic.
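The K-means algorithm mentioned under unsupervised learning can be sketched with NumPy. This is a minimal illustration of Lloyd's algorithm on synthetic data; the cluster count, iteration cap, and blob positions are arbitrary choices for the example, not part of the notes above:

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Minimal K-means: alternately assign points to the nearest
    centroid and recompute centroids as cluster means."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest centroid (squared distance).
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid as the mean of its assigned points.
        new = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids, labels

# Two well-separated blobs of unlabelled points.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(5, 0.3, (50, 2))])
centroids, labels = kmeans(X, k=2)
```

With well-separated groups like this, the recovered centroids land near the true blob centres even though no labels were provided.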
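The underfitting/overfitting trade-off described above can be demonstrated with polynomial fits of different degrees. The degrees (1, 3, 14), the sine target, and the noise level are illustrative assumptions for this sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples from a smooth underlying function.
def f(x):
    return np.sin(2 * np.pi * x)

x_train = rng.uniform(0, 1, 15)
y_train = f(x_train) + rng.normal(0, 0.2, 15)
x_test = rng.uniform(0, 1, 200)
y_test = f(x_test) + rng.normal(0, 0.2, 200)

def rmse(deg):
    """Fit a degree-`deg` polynomial on the training set and
    return (train RMSE, test RMSE)."""
    coefs = np.polyfit(x_train, y_train, deg)
    tr = np.sqrt(np.mean((np.polyval(coefs, x_train) - y_train) ** 2))
    te = np.sqrt(np.mean((np.polyval(coefs, x_test) - y_test) ** 2))
    return tr, te

under_tr, under_te = rmse(1)   # too simple: high error on both sets
good_tr, good_te = rmse(3)     # about right
over_tr, over_te = rmse(14)    # memorises noise: tiny train error
```

The degree-1 model underfits (high error everywhere), while the degree-14 model drives training error toward zero but does worse than the degree-3 model on held-out data.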
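K-fold cross-validation as described above can be sketched as follows. The ordinary-least-squares model, fold count, and synthetic data are illustrative assumptions:

```python
import numpy as np

def kfold_scores(X, y, k=5, seed=0):
    """Estimate generalisation error by K-fold cross-validation:
    train on K-1 folds, validate on the held-out fold, and rotate
    through all K folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), k)
    scores = []
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        # Fit y ~ Xw + b by least squares on the training folds.
        A = np.c_[X[train], np.ones(len(train))]
        w, *_ = np.linalg.lstsq(A, y[train], rcond=None)
        pred = np.c_[X[val], np.ones(len(val))] @ w
        scores.append(np.mean((pred - y[val]) ** 2))
    return scores

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)
scores = kfold_scores(X, y, k=5)
mean_mse = np.mean(scores)
```

Averaging the K validation scores gives a single estimate of out-of-sample error, which is also what makes cross-validation useful for model selection.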
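The link between the likelihood function and MLE can be illustrated for linear regression: under an assumed Gaussian noise model, the least-squares fit maximises the log-likelihood, so perturbing its parameters can only lower the score. The slope, intercept, and noise scale below are arbitrary example values:

```python
import numpy as np

def gaussian_log_likelihood(w, x, y, sigma=1.0):
    """Log-likelihood of (slope, intercept) = (w[0], w[1]) under the
    model y = w[0]*x + w[1] + Gaussian noise with std dev sigma."""
    resid = y - (w[0] * x + w[1])
    n = len(x)
    return (-0.5 * n * np.log(2 * np.pi * sigma**2)
            - 0.5 * np.sum(resid**2) / sigma**2)

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = 2.0 * x + 1.0 + 0.1 * rng.normal(size=50)

# Least-squares fit: the MLE under the Gaussian noise assumption.
w_mle = np.polyfit(x, y, 1)   # returns [slope, intercept]

ll_mle = gaussian_log_likelihood(w_mle, x, y)
ll_off = gaussian_log_likelihood(w_mle + np.array([0.5, 0.0]), x, y)
```

Because maximising the Gaussian log-likelihood is equivalent to minimising the sum of squared residuals, the perturbed parameters always score strictly lower.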
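Dummy-variable encoding as described above can be sketched in plain Python; sorting the categories alphabetically is an arbitrary choice for reproducibility:

```python
def one_hot(values):
    """Convert a list of categorical values into dummy (0/1) columns,
    one column per category, in sorted category order."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return encoded, categories

encoded, cats = one_hot(["red", "green", "red", "blue"])
```

Each row contains exactly one 1, marking the presence of that row's category, which is what numeric-only models need in place of the raw strings.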
Description
This quiz covers fundamental principles of machine learning, including unsupervised and supervised learning, regression, and classification. It also addresses common issues such as underfitting and overfitting, and how models are selected and evaluated on training, validation, and test data.