CS-323-AI-LEC1-FA-24 (1).pptx
NED University of Engineering and Technology, Karachi
CS-323: Artificial Intelligence
Module 1: Machine Learning Basics

What is ML?
Machine learning (ML) is a branch of artificial intelligence (AI) and computer science that focuses on using data and algorithms to enable AI to imitate the way that humans learn, gradually improving its accuracy.

In traditional programming, a computer follows a set of predefined instructions to perform a task. In machine learning, the computer is instead given a set of examples (data) and a task to perform, and it is up to the computer to figure out how to accomplish the task based on those examples.

This ability to learn from data and improve over time makes machine learning powerful and versatile. It is the driving force behind many of the technological advancements we see today, from voice assistants and recommendation systems to self-driving cars and predictive analytics.

At its core, machine learning is about creating and implementing algorithms that make decisions and predictions from data. These algorithms are designed to improve their performance over time, becoming more accurate and effective as they process more data.

For instance, if we want a computer to recognize images of cats, we don't provide it with specific instructions on what a cat looks like. Instead, we give it thousands of images of cats and let the machine learning algorithm figure out the common patterns and features that define a cat. As the algorithm processes more images, it gets better at recognizing cats, even in images it has never seen before.

Self test
What is your understanding of ML? How is it different from a programmed computer?

ML-How it works?
Developing the right ML model to solve a problem requires diligence, experimentation and creativity. Although the process can be complex, it can be summarized into a seven-step plan for building an ML model.
Understand the business problem and define success criteria. Convert the group's knowledge of the business problem and project objectives into a suitable ML problem definition. Consider why the project requires machine learning, the best type of algorithm for the problem, any requirements for transparency and bias reduction, and the expected inputs and outputs.

Understand and identify data needs. Determine what data is necessary to build the model and assess its readiness for model ingestion. Consider how much data is needed, how it will be split into test and training sets, and whether a pretrained ML model can be used.

How ML Works?

Step 1: Data collection
The first step in the machine learning process is data collection. Data is the lifeblood of machine learning: the quality and quantity of your data directly affect your model's performance. Data can be collected from various sources such as databases, text files, images, audio files, or even scraped from the web. Once collected, the data needs to be prepared for machine learning. This involves organizing the data in a suitable format, such as a CSV file or a database, and ensuring that the data is relevant to the problem you're trying to solve.

Step 2: Data preprocessing
Data preprocessing is a crucial step in the machine learning process. It involves cleaning the data (removing duplicates, correcting errors), handling missing data (either by removing it or filling it in), and normalizing the data (scaling features to a common range). Preprocessing improves the quality of your data and ensures that your machine learning model can interpret it correctly. This step can significantly improve the accuracy of your model.

Data Processing: Key Takeaways
Data collection: Collect all the data you need for your models, whether from your own organization or from public or paid sources.
Data cleaning: Turn the messy raw data into clean, tidy data ready for analysis.
Feature engineering: Manipulate the datasets to create variables (features) that improve your model's prediction accuracy. Create the same features in both the training set and the testing set.
Split the data: Randomly divide the records in the dataset into a training set and a testing set. For a more reliable assessment of model performance, generate multiple training and testing sets using cross validation.

Step 3: Choosing the right model
Once the data is prepared, the next step is to choose a machine learning model. There are many types of models to choose from, including linear regression, decision trees, and neural networks. The choice of model depends on the nature of your data and the problem you're trying to solve. Factors to consider include the size and type of your data, the complexity of the problem, and the computational resources available.

Step 4: Training the model
After choosing a model, the next step is to train it using the prepared data. Training involves feeding the data into the model and allowing it to adjust its internal parameters to better predict the output. During training, it's important to avoid overfitting (where the model performs well on the training data but poorly on new data) and underfitting (where the model performs poorly on both the training data and new data).

Step 5: Evaluating the model
Once the model is trained, it's important to evaluate its performance before deploying it. This involves testing the model on new data it hasn't seen during training. Common metrics include accuracy (for classification problems), precision and recall (for binary classification problems), and mean squared error (for regression problems).

Step 6: Hyperparameter tuning and optimization
After evaluating the model, you may need to adjust its hyperparameters to improve its performance.
This process is known as hyperparameter tuning or hyperparameter optimization. Techniques include grid search (where you try out different combinations of hyperparameter values) and cross validation (where you divide your data into subsets, repeatedly train the model on all but one subset, and validate it on the held-out subset to check that it performs well on different data).

Step 7: Predictions and deployment
Once the model is trained and optimized, it's ready to make predictions on new data. This involves feeding new data into the model and using the model's output for decision-making or further analysis. Deploying the model involves integrating it into a production environment where it can process real-world data and provide real-time insights. The practices for deploying and maintaining models in production are often called MLOps.

Deployment: Summary
Deploy the model: Embed the model you chose in dashboards, applications, or wherever you need it.
Monitor model performance: Regularly test the performance of your model as your data changes, to detect model drift.
Improve your model: Continuously iterate and improve your model after deployment. Replace your model with an updated version to improve performance.

ML-Workflow

ML Process-Summary
The learning system of a machine learning algorithm can be broken into three main parts.

A Decision Process: In general, machine learning algorithms are used to make a prediction or classification. Based on some input data, which can be labeled or unlabeled, your algorithm will produce an estimate about a pattern in the data.

An Error Function: An error function evaluates the prediction of the model. If there are known examples, an error function can make a comparison to assess the accuracy of the model.

A Model Optimization Process: If the model can fit better to the data points in the training set, then weights are adjusted to reduce the discrepancy between the known example and the model estimate.
The algorithm will repeat this iterative "evaluate and optimize" process, updating weights autonomously until a threshold of accuracy has been met.

Applications

Speech recognition: Also known as automatic speech recognition (ASR), computer speech recognition, or speech-to-text, this capability uses natural language processing (NLP) to translate human speech into a written format. Many mobile devices incorporate speech recognition into their systems to conduct voice search (e.g. Siri) or to improve accessibility for texting.

Customer service: Online chatbots are replacing human agents along the customer journey, changing the way we think about customer engagement across websites and social media platforms. Chatbots answer frequently asked questions (FAQs) about topics such as shipping, or provide personalized advice, cross-selling products or suggesting sizes for users. Examples include virtual agents on e-commerce sites; messaging bots on Slack and Facebook Messenger; and tasks usually done by virtual assistants and voice assistants.

Computer vision: This AI technology enables computers to derive meaningful information from digital images, videos, and other visual inputs, and then take the appropriate action. Powered by convolutional neural networks, computer vision has applications in photo tagging on social media, radiology imaging in healthcare, and self-driving cars in the automotive industry.

Recommendation engines: Using past consumption behavior data, AI algorithms can help to discover data trends that can be used to develop more effective cross-selling strategies. Recommendation engines are used by online retailers to make relevant product recommendations to customers during the checkout process.

Robotic process automation (RPA): Also known as software robotics, RPA uses intelligent automation technologies to perform repetitive manual tasks.
Automated stock trading: Designed to optimize stock portfolios, AI-driven high-frequency trading platforms make thousands or even millions of trades per day without human intervention.

Fraud detection: Banks and other financial institutions can use machine learning to spot suspicious transactions. Anomaly detection can identify transactions that look atypical and deserve further investigation.

Self test
What are the key components of ML?

Types of ML

Supervised ML

Supervised Machine Learning Models
Supervised learning is when a model is trained on a labeled dataset. Labeled datasets have both input and output parameters. In supervised learning, algorithms learn to map inputs to the correct outputs. Both the training and validation datasets are labeled.

Steps Involved in Supervised Learning:
First, determine the type of training dataset.
Collect the labeled training data.
Split the dataset into a training dataset, a test dataset, and a validation dataset.
Determine the input features of the training dataset, which should carry enough information for the model to predict the output accurately.
Determine a suitable algorithm for the model, such as a support vector machine or a decision tree.
Execute the algorithm on the training dataset. Sometimes a validation set, which is a subset of the training data, is needed to tune control parameters.
Evaluate the accuracy of the model on the test set. If the model predicts the correct outputs, the model is accurate.

In this type of algorithm, there is a target variable that we wish to predict. This target variable depends on various independent variables. A function is generated from this set of variables that gives the desired outputs. The model is trained until it achieves the desired accuracy on the training dataset.
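The steps above (collect labeled data, split it, run an algorithm on the training set, evaluate on the test set) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a method taken from the slides: the toy two-class dataset, the 75/25 split, and the 1-nearest-neighbour classifier are all choices made here for brevity, and the helper names (make_toy_dataset, train_test_split, predict, accuracy) are hypothetical. For a nearest-neighbour model, "training" simply means storing the labeled examples.

```python
import random

def make_toy_dataset(n_per_class=20, seed=0):
    """Build labeled data: ((x1, x2), label) pairs for two well-separated classes."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_per_class):
        data.append(((rng.uniform(0, 1), rng.uniform(0, 1)), 0))  # class 0 near the origin
        data.append(((rng.uniform(5, 6), rng.uniform(5, 6)), 1))  # class 1 far from it
    return data

def train_test_split(data, test_fraction=0.25, seed=0):
    """Shuffle a copy of the data and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def predict(train, x):
    """1-nearest-neighbour: return the label of the closest training example."""
    nearest = min(
        train,
        key=lambda ex: (ex[0][0] - x[0]) ** 2 + (ex[0][1] - x[1]) ** 2,
    )
    return nearest[1]

def accuracy(train, test):
    """Fraction of test examples whose predicted label matches the true label."""
    correct = sum(1 for x, y in test if predict(train, x) == y)
    return correct / len(test)

dataset = make_toy_dataset()
train_set, test_set = train_test_split(dataset)
test_accuracy = accuracy(train_set, test_set)
print(f"train={len(train_set)} test={len(test_set)} accuracy={test_accuracy:.2f}")
```

Because the two toy classes are far apart, every test point lands next to training points of its own class; real data is rarely this clean, which is why the held-out test set matters.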
A supervised learning algorithm analyzes the training data and produces an inferred function, which can be used for mapping new examples. In the best case, the algorithm correctly determines the class labels for unseen instances. This requires the learning algorithm to generalize from the training data to unseen situations in a "reasonable" way.

Advantages of Supervised Learning
High Accuracy: Since the algorithm is trained on labeled data, it typically provides high accuracy in predictions.
Clear Objective: The goal is well-defined, making it easier to measure the model's performance.
Versatile: Can be applied to various domains, including finance, healthcare, and marketing.

Disadvantages of Supervised Learning
Requires Labeled Data: Obtaining a labeled dataset can be time-consuming and expensive.
Limited Generalization: The model may not perform well on unseen data if the training data is not representative of real-world scenarios.
Prone to Overfitting: The model may become too tailored to the training data, losing its ability to generalize to new data.

Types of Supervised ML
Regression
Classification

Classification
Classification algorithms are used when the output variable is categorical, meaning it takes one of two or more discrete classes. Two-class examples include Yes-No, Male-Female, and True-False.

Classification assigns labels to input data and is often used for tasks with distinct categories, for instance classifying images of fruits into categories like apples, oranges, and bananas. In short, classification predicts a discrete or categorical output.

Examples of classification algorithms:
Logistic Regression
K-Nearest Neighbours
Support Vector Machines
Kernel SVM
Naïve Bayes
Decision Tree Classification
Random Forest Classification

Evaluation
Accuracy: Accuracy is the percentage of predictions that the model makes correctly.
It is calculated by dividing the number of correct predictions by the total number of predictions.
Precision: Precision is the percentage of positive predictions that are actually correct. It is calculated by dividing the number of true positives by the total number of positive predictions.
Recall: Recall is the percentage of all positive examples that the model correctly identifies. It is calculated by dividing the number of true positives by the total number of positive examples.
F1 score: The F1 score combines precision and recall into a single number. It is calculated as the harmonic mean of precision and recall.
Confusion matrix: A confusion matrix is a table that shows the number of predictions for each class, along with the actual class labels. It can be used to visualize the performance of the model and identify areas where the model is struggling.

Regression: In this type of supervised learning, the model learns the relationship among variables and estimates the value of a dependent variable when the independent variables are given. For example, consider predicting the price of a house: it is common knowledge that the larger the house, the more expensive it tends to be. To predict the price, we give the regression model a dataset of house areas and their prices; then, when a new house's area is sent in as input, the model can predict its price.

Here are the key terms that help us define the metrics:
True Positive (TP): actual value Positive and predicted value Positive
True Negative (TN): actual value Negative and predicted value Negative
False Positive (FP) / Type I Error: actual value Negative and predicted value Positive
False Negative (FN) / Type II Error: actual value Positive and predicted value Negative

Evaluation
Mean Squared Error (MSE): MSE measures the average squared difference between the predicted values and the actual values.
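Written as a formula, the MSE definition reads as follows. The symbols are assumptions introduced here, not taken from the slides: n is the number of examples, y_i is the actual value and \hat{y}_i the predicted value for example i. The numbers in the worked line are made up purely for illustration.

```latex
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2

% Worked example with actual values (3, 5, 2) and predictions (2, 5, 4):
\mathrm{MSE} = \frac{(3-2)^2 + (5-5)^2 + (2-4)^2}{3} = \frac{1 + 0 + 4}{3} = \frac{5}{3} \approx 1.67
```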
Lower MSE values indicate better model performance.
Root Mean Squared Error (RMSE): RMSE is the square root of MSE, which expresses the error in the same units as the target variable and can be read as the typical size of a prediction error. As with MSE, lower RMSE values indicate better model performance.
Mean Absolute Error (MAE): MAE measures the average absolute difference between the predicted values and the actual values. It is less sensitive to outliers than MSE or RMSE.
R-squared (Coefficient of Determination): R-squared measures the proportion of the variance in the target variable that is explained by the model. Higher R-squared values indicate better model fit.

Examples
Credit Card Fraud Detection: A crucial application of machine learning in the financial sector. The goal is to build models that automatically identify and flag transactions that are likely to be fraudulent, based on patterns and anomalies in credit card transactions, helping financial institutions and credit card companies prevent or minimize losses.

Image Classification: When images are labeled with the objects they contain (e.g., "cat", "dog", "car"), they form the basis of a supervised learning problem in computer vision. The model is trained on a labeled dataset in which each input (an image) is associated with an output label (the object in the image). The goal is for the model to learn a mapping from inputs to outputs so that, given a new, unseen image, it can accurately classify the object it contains.

Cryptocurrency Prediction: Predicting the future prices or trends of cryptocurrencies involves using machine learning models to analyze historical market data and other relevant factors.
In this predictive task, historical data, including past cryptocurrency prices, trading volumes, and market indicators, serves as the training data for the model. The model learns patterns and relationships within the data to make informed predictions about future price movements.

Car Price Prediction: Predicting the selling price of cars from features like brand, model, age, mileage, and additional attributes is a classic example of regression in machine learning. A model is trained to understand the relationship between the attributes of cars and their selling prices.

Heart Disease Prediction: A critical application of machine learning in which the objective is to build a model that assesses the likelihood of an individual having heart disease based on a range of health-related features.

Unsupervised ML
Unsupervised learning in artificial intelligence is a type of machine learning that learns from data without human supervision. Unlike supervised learning, unsupervised machine learning models are given unlabeled data and allowed to discover patterns and insights without any explicit guidance or instruction.

Unsupervised learning models work autonomously to identify the innate structure of data that has not been labeled. It is important to keep in mind that validating the output still calls for some level of human involvement. For instance, an unsupervised learning model can determine that customers who shop online tend to purchase multiple items from the same category at the same time.
However, a human analyst would need to check that it makes sense for a recommendation engine to pair Item

How Does Unsupervised Learning Work?
Data Collection: Gather a dataset without any output labels.
Training Phase: Feed the unlabeled data into the machine learning algorithm. The algorithm analyzes the data to find hidden patterns or structures.
Pattern Recognition: The algorithm groups similar data points together or reduces the dimensionality of the data for easier interpretation.

Types of Unsupervised Learning
Unsupervised learning can be categorized into two main types:
Clustering: The algorithm groups similar data points together based on their features. For example, grouping customers with similar buying habits for targeted marketing campaigns.
Dimensionality Reduction: The algorithm reduces the number of features in the dataset while retaining the most important information. This is useful for visualizing high-dimensional data or speeding up subsequent machine learning tasks.

Advantages of Unsupervised Learning
No Labeled Data Required: Can work with unlabeled data, which is often more readily available.
Discover Hidden Patterns: Can uncover structures and relationships within the data that may not be apparent through manual analysis.
Scalable: Can handle large datasets more efficiently.

Disadvantages of Unsupervised Learning
Less Accurate: Since there are no labels to guide the learning process, the results may be less accurate than those of supervised learning.
Interpretability: The results can be harder to interpret and may require domain expertise to make sense of the identified patterns.
Evaluation Challenges: Without labels, it is difficult to quantitatively evaluate the model's performance.

Examples: Unsupervised learning
Customer Segmentation: Grouping customers with similar purchasing behaviors for targeted marketing.
Anomaly Detection: Identifying unusual patterns in network traffic that could indicate a security breach.
Image Compression: Reducing the amount of data needed to represent an image while preserving its essential features, for example by clustering similar colors together or by using techniques like PCA.

Comparison

Data Requirement
Supervised Learning: Requires a labeled dataset, where each example is paired with the correct output.
Unsupervised Learning: Works with unlabeled data, relying solely on the input features to identify patterns.

Algorithm Complexity
Supervised Learning: Generally involves more straightforward algorithms since the learning process is guided by the labeled data. Examples include linear regression, logistic regression, and decision trees.
Unsupervised Learning: Often involves more complex algorithms due to the lack of guidance from labels. Examples include k-means clustering, hierarchical clustering, and principal component analysis (PCA).

Accuracy and Performance
Supervised Learning: Typically offers higher accuracy and performance on prediction tasks because the model is trained with explicit labels.
Unsupervised Learning: May have lower accuracy in terms of specific predictions but excels at discovering hidden structures and patterns within the data.

Semi Supervised
Semi-supervised learning is a type of machine learning that uses a small amount of labeled data together with a large amount of unlabeled data to train models. It combines supervised learning, which uses labeled training data, and unsupervised learning, which uses unlabeled training data. In other words, it is partially supervised and partially unsupervised.

Example
Suppose a bucket contains three kinds of fruit: apple, banana and orange. Someone captured images of all three but labeled only the orange and banana images. Here, the model will first classify a new apple image as neither a banana nor an orange. Then someone will observe these predictions and label them as apples.
Retraining the model with that label then gives it the ability to classify apple images as apples.

Reinforcement Learning
Reinforcement learning (RL) is a type of machine learning process that focuses on decision making by autonomous agents. An autonomous agent is any system that can make decisions and act in response to its environment independent of direct instruction by a human user. Robots and self-driving cars are examples of autonomous agents. In reinforcement learning, an autonomous agent learns to perform a task by trial and error in the absence of any guidance from a human user.

Reinforcement learning essentially consists of the relationship between an agent, environment, and goal. The literature widely formulates this relationship in terms of the Markov decision process (MDP).

Key Concepts
There are a few key concepts in RL that are important to understand before diving into the algorithms:
Environment: The environment is the world in which the agent operates. It can be anything from a physical environment, such as the space a robot moves in, to a virtual environment, such as a game, a simulation, or a computer program.
Agent: The agent is the decision-maker that takes actions in the environment. Its goal is to maximize the total reward it receives from the environment.
State: The state is the current situation of the environment that the agent observes. It includes all the relevant information about the environment that the agent needs to make decisions.
Action: The action is the decision that the agent takes based on the current state of the environment. It can be anything from moving left or right in a game to buying or selling a stock in the financial market.
Reward: The reward is the feedback that the agent receives from the environment for taking a certain action in a certain state. The goal of the agent is to maximize the total reward it receives over time.
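The key concepts above can be made concrete with a very small agent-environment loop. The sketch below is an illustration under stated assumptions, not something specified in the slides: the environment is a made-up five-cell corridor with a reward of 1 at the right end, and the learning rule is tabular Q-learning with epsilon-greedy action selection, a standard RL algorithm chosen here because it fits in a few lines. All names (step, choose_action, train) and parameter values are hypothetical.

```python
import random

N_STATES = 5        # states 0..4 along a corridor; state 4 is the goal
ACTIONS = (-1, +1)  # action 0 moves left, action 1 moves right

def step(state, action):
    """Environment: apply the action, return (next_state, reward, done)."""
    next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def choose_action(q_row, epsilon, rng):
    """Agent: explore with probability epsilon, otherwise pick a best-valued action."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    best = max(q_row)
    return rng.choice([a for a, v in enumerate(q_row) if v == best])  # random tie-break

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Run Q-learning and return the table of action values, indexed q[state][action]."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(200):  # cap on episode length
            action = choose_action(q[state], epsilon, rng)
            next_state, reward, done = step(state, action)
            # Target: immediate reward plus discounted value of the best next action.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q

q_table = train()
# Greedy policy: the highest-valued action in each non-goal state.
policy = [max(range(len(ACTIONS)), key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

With these settings the greedy policy should choose action 1 (move right) in every non-goal state, since that is the shortest route to the reward.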
The reinforcement learning agent learns about a problem by interacting with its environment. The environment provides information on its current state. The agent then uses that information to determine which action(s) to take. If an action obtains a reward signal from the environment, the agent is encouraged to take that action again when in a similar future state. This process repeats for every new state thereafter. Over time, the agent learns from rewards and punishments to take actions within the environment that meet a specified goal.

Because an RL agent has no manually labeled input data guiding its behavior, it must explore its environment, attempting new actions to discover those that receive rewards. From these reward signals, the agent learns to prefer actions for which it was rewarded in order to maximize its gain. But the agent must continue exploring new states and actions as well, so that it can use that experience to improve its decision-making.

RL algorithms thus require an agent to both exploit knowledge of previously rewarded state-actions and explore other state-actions. It must continuously try new actions while also preferring single actions (or chains of actions) that produce the largest cumulative reward.

Markov Decision Process
In Markov decision processes, state space refers to all of the information provided by an environment's state. Action space denotes all possible actions the agent may take within a state.

Test
What is Machine learning?
a) The selective acquisition of knowledge through the use of computer programs
b) The selective acquisition of knowledge through the use of manual programs
c) The autonomous acquisition of knowledge through the use of computer programs
d) The autonomous acquisition of knowledge through the use of manual programs

Test
What is the key difference between supervised and unsupervised learning?
a) Supervised learning requires labeled data, while unsupervised learning does not.
b) Supervised learning predicts labels, while unsupervised learning discovers patterns.
c) Supervised learning is used for classification, while unsupervised learning is used for regression.
d) Supervised learning is always more accurate than unsupervised learning.