Introduction to Machine Learning - Jamhuriya University

Chapter 5: Introduction to Machine Learning & Algorithms
Facilitator: Ahmed Osman
Jamhuriya University of Science & Technology (JUST)

Objectives
• Understand the basics and importance of machine learning.
• Explore the machine learning life cycle and its key steps.
• Distinguish between supervised, unsupervised, and reinforcement learning.
• Identify real-world applications of machine learning.
• Learn to find and prepare datasets for ML projects.
• Understand and apply linear regression in practical scenarios.
• Become familiar with key terminology and common pitfalls in ML.
• Gain hands-on experience through practical ML projects.

Introduction to Machine Learning
What do you know about Machine Learning? (Discussion)

Machine Learning (ML) is a subset of Artificial Intelligence (AI) that focuses on developing computer programs that can access data and learn from it, improving their performance automatically over time. Rather than following explicitly programmed rules, an ML system uses algorithms to identify patterns in data and to make decisions or predictions based on them.

In other words, ML is a type of AI technology that enables software applications to improve their prediction accuracy without being explicitly programmed to do so.

Why Machine Learning?
The applications of machine learning show why it is important. One example:
Customer Assistance: Providing information, answering questions, or guiding people at events, shopping malls, or exhibitions.
Machine learning life cycle process
Machine learning gives computer systems the ability to learn automatically without being explicitly programmed. How a machine learning system works can be described using the machine learning life cycle: a cyclic process for building an efficient machine learning project. The main purpose of the life cycle is to find a solution to the problem or project.

Overview of the steps:
• Gathering Data: Collect raw data from various sources.
• Data Preparation: Clean, format, and organize the data for analysis.
• Data Wrangling: Transform and structure data to make it usable.
• Analyze Data: Explore and understand the patterns in the data.
• Train Model: Use algorithms to teach the model based on the data.
• Test Model: Evaluate the model's accuracy and performance.
• Deployment: Deploy the model for real-world use.

The steps in detail:
1. Gathering Data: The first step of the machine learning life cycle. The goal is to identify the data sources and obtain all the data relevant to the problem.
2. Data Preparation: We put the data in a suitable place and prepare it for use in machine learning training. This step includes data exploration.
3. Data Wrangling: The process of cleaning raw data and converting it into a usable format. It deals with missing values, duplicate data, invalid data, and noise.
4. Data Analysis: The cleaned and prepared data is passed on to the analysis step, which involves selecting analytical techniques, building models, and reviewing the results.
5. Train Model: We train the model on the data to improve its performance and obtain a better outcome for the problem.
6. Test Model: Testing determines the percentage accuracy of the model against the requirements of the project or problem.
7. Deployment: The last step of the life cycle, where we deploy the model in a real-world system.

Types of Machine learning
Machine learning involves showing a large volume of data to a machine so that it can learn to make predictions, find patterns, or classify data. The three types of machine learning are supervised, unsupervised, and reinforcement learning.
1. Supervised Learning: Uses a labeled training set to teach models to yield the desired output.
2. Unsupervised Learning: Uses machine learning algorithms to analyze and cluster unlabeled datasets.
3. Reinforcement Learning: An area of machine learning concerned with taking suitable actions to maximize reward in a particular situation.

How Rewards and Penalties Work
1. Rewards
   • Positive feedback given when the agent performs a favorable action.
   • Encourages the agent to repeat that action in similar scenarios.
   • Examples: a self-driving car reaching its destination without accidents earns a reward; a robot successfully picking up an object gets a reward.
2. Penalties
   • Negative feedback given when the agent performs an unfavorable action.
   • Discourages the agent from repeating that action in the future.
   • Examples: a self-driving car hitting an obstacle incurs a penalty; a game-playing agent losing a life in the game receives a penalty.

Mechanism: The environment assigns a reward value (positive or negative) after each action. The agent uses this feedback to update its strategy through algorithms such as Q-Learning or Policy Gradient.
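The reward-and-penalty mechanism can be illustrated with a minimal sketch of tabular Q-Learning, one of the algorithms named above. Everything specific here is an assumption made for illustration and does not come from the slides: the five-cell corridor environment, the reward values (+1 at the right end, -1 at the left end), and the hyperparameters ALPHA, GAMMA, and EPSILON.

```python
import random

# Hypothetical environment: a corridor of 5 cells (0..4).
# The agent starts in cell 2. Reaching cell 4 earns a reward (+1);
# reaching cell 0 incurs a penalty (-1). Both end the episode.
N_STATES = 5
ACTIONS = [-1, +1]                      # move left or move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1   # assumed hyperparameters


def step(state, action):
    """Environment: return (next_state, reward, done) for one move."""
    nxt = state + action
    if nxt == N_STATES - 1:
        return nxt, 1.0, True     # reward
    if nxt == 0:
        return nxt, -1.0, True    # penalty
    return nxt, 0.0, False


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    # Q-table: estimated long-term reward for each (state, action) pair
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 2, False
        while not done:
            # Explore occasionally, otherwise take the best-known action
            if rng.random() < EPSILON:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            nxt, reward, done = step(state, action)
            # Q-Learning update: move the estimate toward
            # reward + discounted value of the best next action
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            target = reward + GAMMA * best_next
            q[(state, action)] += ALPHA * (target - q[(state, action)])
            state = nxt
    return q


q_table = train()
# Learned policy: the preferred action in each non-terminal cell
policy = {s: max(ACTIONS, key=lambda a: q_table[(s, a)])
          for s in range(1, N_STATES - 1)}
print(policy)
```

With these settings the learned policy should prefer moving right (toward the reward) in every non-terminal cell, and the Q-value for stepping into the penalty cell should be negative.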
Over time, the agent learns an optimal policy to maximize cumulative reward.

[Figures: Types of machine learning; Reinforcement machine learning. Source: https://www.geeksforgeeks.org/types-of-machine-learning/]

Types of Machine Learning and Algorithms
Reinforcement learning can be used for:
1. Robotics for industrial automation
2. ML and data processing
3. Creating training systems

What is an Algorithm?
An algorithm is a step-by-step procedure or a set of rules designed to solve a specific problem or perform a task. It takes an input, processes it according to predefined steps, and produces an output.

Key characteristics of an algorithm:
• Finite: It must have a clear end and finish after a specific number of steps.
• Definite: Each step must be well-defined and unambiguous.
• Input: It may require one or more inputs to start the process.
• Output: It produces at least one output (result).
• Effective: Each step should be basic and executable within a finite time.

Daily example: Making tea: Boil water -> Add tea leaves -> Pour in milk and sugar -> Stir and serve.

Programming example: Sorting numbers in ascending order using the Bubble Sort algorithm:
1. Compare adjacent numbers.
2. Swap them if they are in the wrong order.
3. Repeat until the list is sorted.

In Machine Learning:

Supervised Machine learning
Supervised learning is effective for a variety of business purposes, including sales forecasting, inventory optimization, and fraud detection.
Some examples of use cases include:
• Predicting real estate prices
• Classifying whether bank transactions are fraudulent
• Finding disease risk factors
• Determining whether loan applicants are low-risk or high-risk

Find Dataset Repositories in Online Resources
Dataset finders:
• Kaggle: This data science platform has many interesting, user-contributed datasets for cognitive computing.
• UCI Machine Learning Repository: A go-to resource for open datasets for decades. Users can access the data without registering.
• Google Dataset Search: Indexes over 25 million datasets from across the internet.

Searching for a dataset on Kaggle:
1. Open https://www.kaggle.com/
2. Go to https://www.kaggle.com/datasets, where you may contribute your own dataset or download any uploaded dataset.
3. Example dataset (from the slide): https://www.kaggle.com/datasets/hanifalirsyad/coffee-scrap-coffeereview

1st project: Simple Linear Regression (Student Grade Prediction project using a Linear Regression model)

What is Linear regression?
Linear regression is one of the easiest and most popular machine learning algorithms. It is a statistical method used for predictive analysis. Linear regression makes predictions for continuous (real-valued, numeric) variables such as student grade, sales, salary, age, or product price.

The linear regression algorithm models a linear relationship between a dependent variable (y) and one or more independent variables (x), hence the name linear regression.
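As a small illustration of that linear relationship, simple linear regression fits a line y = a0 + a1 * x, where a0 is the intercept and a1 the slope. The sketch below uses the standard closed-form least-squares formulas on made-up data (hours studied versus grade); the data values and the function name fit_simple_linear_regression are assumptions for illustration, not the course notebook's data or code.

```python
# Hypothetical data: hours studied (x) and final grade (y).
hours = [1, 2, 3, 4, 5]
grades = [52, 58, 61, 70, 74]


def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = a0 + a1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = covariance of x and y divided by the variance of x
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return intercept, slope


a0, a1 = fit_simple_linear_regression(hours, grades)
predicted = a0 + a1 * 6   # predicted grade after 6 hours of study
print(f"grade = {a0:.2f} + {a1:.2f} * hours")
print(f"predicted grade for 6 hours: {predicted:.2f}")
```

For this made-up data the fitted line is approximately grade = 46.2 + 5.6 * hours, so each additional hour of study is associated with about 5.6 more grade points.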
Because the relationship is linear, the model describes how the value of the dependent variable changes as the value of the independent variable changes.

The linear regression model provides a sloped straight line representing the relationship between the variables.
[Figure: linear regression line]

Types of Linear Regression
Linear regression can be further divided into two types:
• Simple Linear Regression: A single independent variable is used to predict the value of a numerical dependent variable.
• Multiple Linear Regression: More than one independent variable is used to predict the value of a numerical dependent variable.
Source: https://www.javatpoint.com/linear-regression-in-machine-learning

Machine Learning is Like a Baby: Key Similarities
• Learning from Experience: Just as a baby learns by observing, experimenting, and receiving feedback, machine learning models learn by analyzing data and improving based on the patterns they find.
• Trial and Error: Babies try different actions, like reaching for an object, and learn what works. Similarly, machine learning models make predictions and adjust when errors are identified.
• Guidance: In supervised learning, the model is given labeled data, like a baby being taught by a parent. Over time, the model (or baby) becomes better at recognizing patterns or making decisions.
• Exploration: Babies explore their environment to understand it, just as unsupervised learning models analyze data to find structure or clusters without prior guidance.
• Reinforcement: Babies are encouraged to repeat good actions (e.g., clapping) and avoid bad ones (e.g., touching something hot). This mirrors reinforcement learning, where actions are reinforced with rewards or penalties.

Common Terminologies in Model Building
Understanding Key Concepts in Machine Learning

What is Sklearn?
Scikit-learn (sklearn) is a robust library for machine learning in Python. It provides tools for:
• Classification
• Regression
• Clustering
• Dimensionality reduction

Key Terms
• x_train: Training part of the independent variables.
• x_test: Testing part of the independent variables.
• y_train: Training labels (dependent variable).
• y_test: Testing labels (dependent variable).
Example: If test_size=0.4, 40% of the data is used for testing and 60% is used for training.

Key Terminologies in Model Building
• Model: A trained file used to recognize patterns in data.
• random_state: Controls the shuffling in train_test_split(). A fixed value (e.g., random_state=42) ensures reproducibility.
• model.fit(): Trains the model using data (X, y).
• predict(): Predicts labels for new data (X_new).

Additional Terminologies
• Overfitting: The model is too closely tailored to the training data and fails on new data.
• Underfitting: The model is too simplistic and fails to capture patterns.
• Feature Scaling: Normalizing data using standardization or min-max scaling.
• Validation Set: Used for hyperparameter tuning to improve generalization.
• Epochs & Batch Size: Critical for optimization in deep learning algorithms.

Common Pitfalls in Model Building
• Skipping Data Preprocessing: Leads to poor model performance.
• Ignoring Data Imbalances: Results in biased models.
• Not Tuning Hyperparameters: Reduces model accuracy.
• Overfitting: The model performs well on training data but poorly on testing data.
• Underfitting: The model fails to capture the complexity of the data.

1st project: Simple Linear Regression (Student Grade Prediction project using a Linear Regression model)
Refer to the Jupyter Notebook that we created in the ...

2nd project: Multiple Linear Regression (more than one independent variable) (Sales project using a Linear Regression model)

Sales project using Linear Regression Model
In this session, we will use a practical regression project to implement the entire machine learning pipeline. It is based on a real-world advertising dataset. The overall task is to predict future sales, given past records of advertising spend and sales.

Linear regression is an algorithm that models a linear relationship between independent variables and a dependent variable in order to predict the outcome of future events. The dataset used here describes the sales of a product together with the advertising cost incurred by the business on various advertising platforms. The columns in the dataset are:
1. TV: Advertising cost in dollars spent on TV.
2. Radio: Advertising cost in dollars spent on radio.
3. Newspaper: Advertising cost in dollars spent on newspapers.
4. Sales: Number of units sold.

You can download the dataset from:
https://raw.githubusercontent.com/amankharwal/Website-data/master/advertising.csv
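The pipeline for the sales project (split the data, train, predict, evaluate) can be sketched as follows. This is a minimal sketch under stated assumptions, not the course notebook: it generates synthetic data with the same four columns instead of downloading the CSV, the coefficients used to generate that data are invented, and the fit uses the normal equations in plain Python so that each step is visible. In practice a library such as scikit-learn would normally handle the split and the fit. The helper names solve, fit, and predict are illustrative.

```python
import random


def solve(a, b):
    """Solve the linear system a * w = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]    # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest entry in this column to the diagonal
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):                       # back substitution
        known = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - known) / m[i][i]
    return w


def fit(x, y):
    """Train: least-squares weights [intercept, w_tv, w_radio, w_newspaper]."""
    rows = [[1.0] + list(features) for features in x]  # 1.0 multiplies the intercept
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(weights, features):
    """Predict sales for one row of (TV, Radio, Newspaper) spend."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], features))


# Synthetic stand-in for the advertising data (assumed relationship plus noise)
rng = random.Random(42)   # fixed seed, in the spirit of random_state=42
x, y = [], []
for _ in range(200):
    tv = rng.uniform(0, 300)
    radio = rng.uniform(0, 50)
    newspaper = rng.uniform(0, 100)
    x.append((tv, radio, newspaper))
    y.append(5.0 + 0.05 * tv + 0.10 * radio + rng.gauss(0, 0.5))

# Train/test split with test_size = 0.2
indices = list(range(len(x)))
rng.shuffle(indices)
cut = int(len(indices) * 0.8)
x_train = [x[i] for i in indices[:cut]]
y_train = [y[i] for i in indices[:cut]]
x_test = [x[i] for i in indices[cut:]]
y_test = [y[i] for i in indices[cut:]]

weights = fit(x_train, y_train)
mae = sum(abs(predict(weights, f) - t) for f, t in zip(x_test, y_test)) / len(y_test)
print("intercept and coefficients:", [round(w, 3) for w in weights])
print("mean absolute error on the test set:", round(mae, 3))
```

Because the synthetic sales values ignore newspaper spend, the fitted Newspaper coefficient should come out close to zero, while the TV and Radio coefficients should land near the values used to generate the data.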
