Document Details


Uploaded by UserReplaceableWashington1055

Université de Namur

Katrien Beuls

Tags

machine learning, machine learning landscape, supervised learning, artificial intelligence

Summary

This presentation provides an overview of the machine learning landscape: the main types of machine learning systems, supervised and unsupervised methods, typical applications, and the main challenges such as poor-quality data, overfitting, and underfitting, illustrated throughout with examples.

Full Transcript


The machine learning landscape
ICYBM101 Machine Learning and Data Mining, 23 September 2024
Prof. dr. Katrien Beuls, Faculté d'informatique, Université de Namur
https://unamur.be/info

What is machine learning?

"[Machine Learning is the] field of study that gives computers the ability to learn without being explicitly programmed." (Arthur Samuel, 1959)

"A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E." (Tom Mitchell, 1997)

Example

Why use machine learning?
The traditional approach
The machine learning approach
Automatically adapting to change
Complex problems
Machine learning can help humans learn: data mining

Why use machine learning?
» Problems for which existing solutions require a lot of hand-tuning or long lists of rules
» Complex problems for which there is no good solution at all using a traditional approach
» Fluctuating environments: a machine learning system can adapt to new data
» Getting insights about complex problems and large amounts of data

Examples of applications (1)
» Analysing images of products on a production line to automatically classify them
» Image classification, typically performed using convolutional neural networks (CNNs) or sometimes transformers
» Detecting tumors in brain scans
» Semantic image segmentation, where each pixel in the image is classified, typically using CNNs or transformers

Examples of applications (2)
» Automatically classifying news articles
» Natural language processing (NLP), and more specifically text classification, which can be tackled using recurrent neural networks (RNNs) and CNNs, but transformers work even better
» Automatically flagging offensive comments on discussion forums
» Also text classification, using the same NLP tools
» Summarising long documents automatically
» A branch of NLP called text summarisation, again using the same tools

Examples of applications (3)
» Creating a chatbot or a personal assistant
» Forecasting your company's revenue next year, based on many performance metrics
» Making your app react to voice commands
» Detecting credit card fraud
» Segmenting clients based on their purchases so that you can design a different marketing strategy for each segment
» Representing a complex, high-dimensional dataset in a clear and insightful diagram
» Recommending a product that a client may be interested in, based on past purchases
» Building an intelligent bot for a game

Types of machine learning systems
» Whether or not they are trained with human supervision (supervised, unsupervised, semisupervised, and reinforcement learning)
» Whether or not they can learn incrementally on the fly (online versus batch learning)
» Whether they work by simply comparing new data points to known data points, or instead detect patterns in the training data and build a predictive model, much like scientists do (instance-based versus model-based learning)

Training supervision

Supervised learning: Classification
A labeled training set for spam classification (an example of supervised learning).
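To make the spam-classification example concrete, here is a minimal sketch of a supervised text classifier in scikit-learn (the library pointed to later in these slides). The tiny labeled message set, the bag-of-words features via CountVectorizer, and the pipeline setup are illustrative assumptions rather than the setup used in the course; logistic regression is one of the supervised algorithms listed below.

```python
# Minimal supervised classification sketch: spam vs. ham on a toy labeled training set.
# The messages and labels below are invented purely for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

messages = [
    "Win a free prize now!!!",
    "Lowest prices, buy now",
    "Are we still meeting tomorrow?",
    "Please find the report attached",
]
labels = ["spam", "spam", "ham", "ham"]

# Bag-of-words features feeding a logistic regression classifier.
spam_filter = make_pipeline(CountVectorizer(), LogisticRegression())
spam_filter.fit(messages, labels)

print(spam_filter.predict(["Claim your free prize"]))  # likely ['spam'] on this toy data
```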
Supervised learning: Regression
A regression problem: predict a value, given an input feature (there are usually multiple input features, and sometimes multiple output values).

Most important supervised learning algorithms
» k-Nearest Neighbours
» Linear Regression
» Logistic Regression
» Support Vector Machines (SVMs)
» Decision Trees and Random Forests
» Neural networks

Unsupervised learning
» Clustering
» Visualisation algorithms
» Anomaly detection
» Association rule learning

Semisupervised learning

Self-supervised learning

Reinforcement learning

Batch and online learning

Batch learning

Online learning
Out-of-core learning; incremental learning

Learning rate
» How fast your online learning system should adapt to changing data
» High learning rate: rapid adaptation, but quick forgetting
» Low learning rate: more inertia, but less sensitive to noise
» Big challenge: bad data!
» Monitor your system closely and promptly switch learning off
» Monitor the input data and react to abnormal data

Instance-based vs. model-based learning

Instance-based learning

Model-based learning

Model-based learning: Example
life_satisfaction = θ0 + θ1 × GDP_per_capita

Terminology: model

https://scikit-learn.org
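As a concrete companion to the model-based learning example above, here is a minimal sketch that fits the linear model life_satisfaction = θ0 + θ1 × GDP_per_capita with scikit-learn's LinearRegression. The four (GDP per capita, life satisfaction) pairs are invented for illustration only, not the real data used in the accompanying notebook.

```python
# Minimal model-based learning sketch: fit life_satisfaction = θ0 + θ1 × GDP_per_capita.
# The data points below are invented for illustration only.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[25_000.0], [35_000.0], [45_000.0], [55_000.0]])  # GDP per capita (USD)
y = np.array([5.8, 6.4, 7.0, 7.3])                              # life satisfaction score

model = LinearRegression()
model.fit(X, y)

theta0, theta1 = model.intercept_, model.coef_[0]
print(f"life_satisfaction ≈ {theta0:.3f} + {theta1:.6f} × GDP_per_capita")
print(model.predict([[40_000.0]]))  # prediction for a new, unseen GDP value
```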
Main challenges of Machine Learning

Insufficient quantity of training data
The importance of data versus algorithms.
Michele Banko and Eric Brill. 2001. Scaling to Very Very Large Corpora for Natural Language Disambiguation. In Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics, pages 26–33, Toulouse, France. Association for Computational Linguistics.

Non-representative training data
Life satisfaction revisited; sampling noise.

Poor quality data
» If some instances are clearly outliers, it may help to simply discard them or try to fix the errors manually
» If some instances are missing a few features (e.g., 5% of your customers did not specify their age), you must decide whether you want to ignore this attribute altogether, ignore these instances, fill in the missing values (e.g., with the median age), or train one model with the feature and one model without it, and so on

Irrelevant features: garbage in, garbage out
» Feature engineering
» Feature selection
» Feature extraction
» Creating new features by gathering new data

Overfitting the training data

Overfitting the training data: possible solutions
» Simplify the model by selecting one with fewer parameters (e.g., a linear model rather than a high-degree polynomial model), by reducing the number of attributes in the training data, or by constraining the model (regularisation)
» Gather more training data
» Reduce the noise in the training data (e.g., fix data errors and remove outliers)

Overfitting the training data: regularisation
» Regularisation hyperparameter

Underfitting the training data
» Occurs when the model is too simple to learn the underlying structure of the data
» Main options to fix this problem:
» Selecting a more powerful model, with more parameters
» Feeding better features to the learning algorithm (feature engineering)
» Reducing the constraints on the model (e.g., reducing the regularisation hyperparameter)

Testing and validating
» Split your data into two sets: a training set and a test set
» Generalisation error: error rate on new cases

Hyperparameter tuning and model selection
» Validation set

Testing and validating: n-fold cross-validation
[Figure: 3-fold cross-validation and final test. The dataset is divided into four subsets; in each fold, two subsets are used for training and one for validation, while the fourth subset is held out. After the cross-validation evaluation, the three training/validation subsets are used together to train the final model, which is evaluated once on the held-out test subset.]

Fold 1:     Train       Train       Validation   (test subset held out)
Fold 2:     Train       Validation  Train        (test subset held out)
Fold 3:     Validation  Train       Train        (test subset held out)
Final test: Train       Train       Train        Test

(A minimal code sketch of this train/test and cross-validation workflow follows at the end of this transcript.)

No free lunch (NFL) theorem
A model is a simplification of reality. The simplification is based on assumptions (model bias), and those assumptions fail in certain situations.
David H. Wolpert. The Lack of A Priori Distinctions Between Learning Algorithms. Neural Computation 1996; 8 (7): 1341–1390. doi: https://doi.org/10.1162/neco.1996.8.7.1341

Homework 1: Follow the Jupyter notebook of Chapter 1
https://github.com/ageron/handson-ml3

Homework 2: Exercises Chapter 1
1. How would you define machine learning?
2. Can you name four types of applications where it shines?
3. What is a labeled training set?
4. What are the two most common supervised tasks?
5. Can you name four common unsupervised tasks?
6. ...
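To tie together the testing-and-validating and cross-validation slides above, here is a minimal sketch using scikit-learn's train_test_split and cross_val_score. The synthetic data, the choice of a linear model, and the 3-fold setting are illustrative assumptions, not the exact workflow from the course notebook.

```python
# Minimal testing-and-validating sketch: train/test split plus 3-fold cross-validation.
# The synthetic regression data is generated here purely for illustration.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score, train_test_split

rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * X[:, 0] + rng.normal(scale=1.0, size=200)  # noisy linear target

# Hold out a test set; its error estimates the generalisation error.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# 3-fold cross-validation on the training set only (the test set stays untouched).
model = LinearRegression()
cv_scores = cross_val_score(model, X_train, y_train, cv=3)
print("3-fold CV R² scores:", cv_scores)

# Final fit on all training data, then a single evaluation on the held-out test set.
model.fit(X_train, y_train)
print("Test-set R²:", model.score(X_test, y_test))
```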
