Machine Learning for Beginners PDF
Alexandria University
Noha A Yousri
Summary
This document provides a beginner's overview of machine learning. It covers the fundamental concepts, types of machine learning, and examples. The document is focused on explaining machine learning concepts, suitable for introductory learning.
Full Transcript
MACHINE LEARNING FOR BEGINNERS
Noha A Yousri, Professor of Bioinformatics & Machine Learning, Computer and Systems Engineering, Alexandria University.

OUTLINE
• What is machine learning?
• Examples of machine learning.
• Types of machine learning algorithms.
• Supervised learning.
• Unsupervised learning.
• Supervised vs unsupervised learning.
• Features or dimensions.
• Distances between objects.
• K-means clustering with examples.
• Summary.

WHAT IS MACHINE LEARNING?
Programming a machine or robot by teaching it to think like human professionals: it is given examples, then tested on its ability to predict or classify data.

EXAMPLES OF MACHINE LEARNING
Example: Surgical robots (robots aiding in surgery). Steps:
1-A surgeon gives all the important details of a surgical procedure to a programmer.
2-The programmer transforms those steps into a learning algorithm.
3-The learning algorithm is fed to the machine (the surgical robot).
4-The learning algorithm is fed thousands of examples to teach the surgical robot how to run the surgical procedure.
5-The programmer tests the surgical robot hundreds to thousands of times to make sure it runs the procedure with 100% accuracy.
6-The surgeon tests the surgical robot thousands of times to make sure it is 100% accurate.

EXAMPLES OF MACHINE LEARNING
Example: Drones
• Drones are flying robots, also called unmanned aerial vehicles or unpiloted aircraft.
• They are used for surveillance, monitoring, taking images and videos, delivering packages, etc.
• They are remotely controlled or automated.
• Automated drones use machine learning to learn how to make decisions.
• Autopilot system: the autopilot system is a specialized component that allows autonomous drones to operate without constant human control. It consists of software and hardware that process sensor data, execute flight plans, and make autonomous decisions.
• The autopilot system requires an onboard computer: an autonomous drone needs a powerful onboard computer capable of running sophisticated algorithms and processing large amounts of data in real time. This computer is responsible for decision-making, path planning, and executing autonomous flight tasks.

EXAMPLES OF MACHINE LEARNING
• Text mining and Natural Language Processing (NLP): learning from text data to extract important information, for example aggregating news from different news sources into one site.
• Social networks: learning user behavior in order to predict useful feeds and products and to suggest friends with similar interests.
• Bioinformatics: learning from large data sets, such as human genome data (sequences of DNA), to extract important biological information related to diseases and so be able to predict disease.
• Financial systems: learning from historical financial data to predict profit or loss in a company.
• Weather forecast: learning from archived weather data to predict the weather.

TYPES OF MACHINE LEARNING: SUPERVISED LEARNING
Source: https://logicmojo.com/supervised-and-unsupervised-learning

What is a classifier?
A supervised learning method that has two steps:
1-Training step: thousands of examples are fed to a model that learns how to differentiate between two or more groups (classes).
2-Testing step: another, independent set of examples is used to test the model. The result is evaluated by a measure of accuracy.

Classification: in the example figure (training data), images of shapes are the input to the machine learning model, and the labels of those images (the shape names) are the output data. The labelling process is called "annotation". Based on these input and output data, the model learns to predict the category of unseen image data: rectangle, circle, triangle, or hexagon.
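The two steps of a classifier can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not name a specific classifier, so a simple nearest-centroid rule is swapped in here, and the two numeric features (corner count and a roundness score), the data values, and the function names `train`, `predict`, and `accuracy` are all hypothetical.

```python
import math


def train(examples):
    """Training step: average the feature vectors of each class (one centroid per label)."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(model, features):
    """Return the label whose centroid is closest to the given feature vector."""
    return min(model, key=lambda label: math.dist(model[label], features))


def accuracy(model, examples):
    """Testing step: fraction of independent examples that are classified correctly."""
    correct = sum(predict(model, features) == label for features, label in examples)
    return correct / len(examples)


# Hypothetical annotated data: (number of corners, roundness score) with a shape label.
training_set = [
    ((0, 0.95), "circle"), ((0, 0.90), "circle"), ((1, 0.85), "circle"),
    ((3, 0.55), "triangle"), ((3, 0.60), "triangle"), ((4, 0.50), "triangle"),
]
# Independent examples that were not used for training.
test_set = [((0, 0.92), "circle"), ((3, 0.58), "triangle")]

model = train(training_set)
test_accuracy = accuracy(model, test_set)
print(f"accuracy on the test set: {test_accuracy:.2f}")
```

A real classifier would be trained on far more examples, as the training step above notes; the tiny data set here only keeps the sketch readable.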
Figure source (training data): https://databasetown.com/supervised-learning-algorithms/#google_vignette

TYPES OF MACHINE LEARNING: UNSUPERVISED LEARNING
Source: https://logicmojo.com/supervised-and-unsupervised-learning
• Making decisions based on patterns or criteria found in the data, without training a model on labelled examples.
• Clustering: unsupervised learning that divides data into groups based on a certain criterion.

SUPERVISED VS UNSUPERVISED LEARNING
• Supervised learning requires labeled data with input features and corresponding output labels, while unsupervised learning aims to discover patterns or structures in unlabeled data without predefined output labels.

FEATURES OR DIMENSIONS
• In machine learning, the data input is a set of objects.
• Features are used to describe the objects in a data set. Examples:
• Fruit can be described by: color, texture, shape, size, weight, etc.
• A student in a class can be described by: grades, number of courses, level of study, etc.
• Cars can be described by: brand, size, color, power, model, etc.
• A human being can be described by: age, gender, weight, height, nationality, birth place, etc.
• Important note: we may use all features for classification or clustering, or select only the features that make the learning process more efficient.
• The number of features used determines the number of dimensions of the data set.

To learn a model from training data, we need to find the decision boundary that best separates the two classes.
[Figure: a two-dimensional example, taking two features only, with the decision boundary marked.]
Decision boundaries can take many shapes according to the complexity of the problem. In two dimensions the boundary can be a simple line, and in higher dimensions it can be a hyperplane.
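As a small illustration of the line and hyperplane cases, a straight decision boundary can be written as an equation. The symbols are notation assumed here rather than taken from the slides: $x_1, \dots, x_n$ are the feature values of an object, and $w_1, \dots, w_n$ and $b$ are numbers learned from the training data.

```latex
% Two features (two dimensions): the boundary is a line
w_1 x_1 + w_2 x_2 + b = 0

% n features (n dimensions): the boundary is a hyperplane
w_1 x_1 + w_2 x_2 + \dots + w_n x_n + b = 0
```

Objects for which the left-hand side is positive fall on one side of the boundary, and objects for which it is negative fall on the other side.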
Figures source: https://pythonnumericalmethods.berkeley.edu/_images/25.02.01-classification-example.png

DISTANCES BETWEEN OBJECTS
• Euclidean distance is used to compute the distance between two objects, as illustrated below:
1-Square the difference between the values of each feature (each dimension).
2-Sum the squared differences of all features (dimensions).
3-Take the square root of the sum calculated in step 2.

In this example we only have two features (Height and Age).

Student number   Height   Age
1                180      20
2                175      23
3                169      27
4                173      22
5                160      19
6                155      16
7                162      22
8                158      23

$d(1,2) = \sqrt{(180 - 175)^2 + (20 - 23)^2}$
$d(1,3) = \sqrt{(180 - 169)^2 + (20 - 27)^2}$
$d(4,5) = \sqrt{(173 - 160)^2 + (22 - 19)^2}$

Quiz: Calculate d(7,8).

Student number   Height   Age
1                180      20
2                175      23
3                169      27
4                173      22
5                160      19
6                170      16
7                162      22
8                168      17

Are there groups of students according to age and height? How can we predict the groups in this data, or group the data into two groups? Can we measure how similar two points on the graph are? (More similar students should be in the same group.)
Use Euclidean distance to find the distance between two points (two students): the larger the distance, the less similar they are, and vice versa.

UNSUPERVISED LEARNING: CLUSTERING
• Given a two-dimensional data set of the age and height of individuals, we want to divide the students into 2 groups.
• K-means is a popular clustering technique that groups data objects according to their distances to a center.
• K-means needs as input: 1-a data set (objects described by features; in this case age and height are the features), 2-the number of clusters (k).
• The number of clusters (k) determines the number of groups we want to identify in the given data. In this example, k=2.

K-MEANS CLUSTERING
• 1-Randomly pick any two points in space (we have k=2) as initial centers (red and orange points in figure 1).
• 2-Calculate the distances between each point and each of the two centers (using Euclidean distance).
(figures 1 and 2)
• 3-Assign each point to the nearest center: it is assigned either to center 1 (orange) or to center 2 (red), and only to one of them (figure 3).
• 4-Recompute the features of each center as the average of the features of the points that belong to that center (i.e. each center moves towards the points that are assigned to it) (figure 4).
• Repeat steps 2-4 until there is no change in the assignment of points to centers (figures 5-7).

[Figures 1-7: successive K-means iterations. Panel captions: "Distances computed between each of the centers and all points"; "Assign each point to the nearest center"; "Compute the features of the centers from the points that belong to their cluster".]

After the step in figure 7, the next step will calculate the distances of all points from each center. It is easy to see that the points will be assigned to the same centers as in figure 7, so there will be no change in the clusters, and we conclude that the clusters in figure 7 are the final clusters.

SUMMARY
• Machine learning has a wide variety of applications in many fields.
• Machine learning algorithms can be supervised or unsupervised.
• Supervised learning uses labelled data while unsupervised learning uses unlabelled data.
• An example of supervised learning is classification and an example of unsupervised learning is clustering.
• Classification uses a training set to train a model and then a test set to test (evaluate) this model.
• The number of features determines the number of dimensions of the data set.
• Distances between objects can be calculated using Euclidean distance.
• K-means is an iterative clustering technique that groups (clusters) points into k clusters, according to the distances of the points from the centers of the clusters.
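As a closing illustration, the Euclidean distance steps and the K-means steps described earlier can be sketched in Python on the second student table (height and age, k=2). This is a minimal sketch, not a definitive implementation. The function names `euclidean` and `kmeans` are hypothetical. The slides pick the initial centers at random, but students 1 and 5 are used here as the starting centers (an assumption) so that the run is repeatable.

```python
import math

# Second student table from the text: (height, age) for students 1 to 8.
students = [(180, 20), (175, 23), (169, 27), (173, 22),
            (160, 19), (170, 16), (162, 22), (168, 17)]


def euclidean(p, q):
    """Square the per-feature differences, sum them, then take the square root."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def kmeans(points, initial_centers, max_iterations=100):
    """Repeat the assign and recompute steps until no assignment changes."""
    centers = list(initial_centers)
    assignment = None
    for _ in range(max_iterations):
        # Steps 2-3: compute distances and assign each point to its nearest center.
        new_assignment = [
            min(range(len(centers)), key=lambda c: euclidean(p, centers[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no change in assignment, so the clusters are final
        assignment = new_assignment
        # Step 4: move each center to the average of the points assigned to it.
        for c in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old center if no point was assigned to it
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return assignment, centers


# Distance between students 1 and 2, following the three distance steps.
d_1_2 = euclidean(students[0], students[1])
print(f"d(1,2) = {d_1_2:.2f}")

# Assumption: start from students 1 and 5 instead of two random points.
assignment, centers = kmeans(students, [students[0], students[4]])
for number, cluster in enumerate(assignment, start=1):
    print(f"student {number} -> cluster {cluster}")
print("final centers:", centers)
```

A different choice of initial centers can lead to a different final grouping, which is why practical K-means runs are often repeated from several random starting points.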