Learn AI with Python

Summary

This book is a guide to learning artificial intelligence (AI) with Python. It explores machine learning and deep learning techniques, provides functional code, and covers various AI concepts and their implementation in Python for different applications.

Full Transcript


Learn AI with Python: Explore Machine Learning and Deep Learning Techniques for Building Smart AI Systems Using Scikit-Learn, NLTK, NeuroLab, and Keras

Gaurav Leekha
www.bpbonline.com

FIRST EDITION 2022
Copyright © BPB Publications, India
ISBN: 978-93-91392-611

All Rights Reserved. No part of this publication may be reproduced, distributed, or transmitted in any form or by any means, or stored in a database or retrieval system, without the prior written permission of the publisher, with the exception of the program listings, which may be entered, stored, and executed in a computer system but may not be reproduced by means of publication, photocopy, recording, or any electronic or mechanical means.

LIMITS OF LIABILITY AND DISCLAIMER OF WARRANTY
The information contained in this book is true and correct to the best of the author's and publisher's knowledge. The author has made every effort to ensure the accuracy of this publication, but the publisher cannot be held responsible for any loss or damage arising from any information in this book. All trademarks referred to in the book are acknowledged as properties of their respective owners, but BPB Publications cannot guarantee the accuracy of this information.

Dedicated to Aarav Leekha
My son, the beat of my heart, and the energy of my soul.

About the Author
Gaurav Leekha is a deep learning researcher. He has seven years of academic experience teaching technical courses, along with 5+ years of freelance technical content creation on a variety of topics related to machine learning, deep learning, artificial intelligence, and web development technologies. He has authored several research papers published in renowned journals, serves as a reviewer for prominent journals, and has been the technical reviewer for various online courses. He has also earned multiple certifications in the field of machine learning and deep learning.
Outside work, Gaurav likes to cook for his family, play with his eight-year-old son, and practice vipassana meditation.

About the Reviewer
Bharat Sikka is the author of the book "Elements of Deep Learning for Computer Vision" and a data scientist based in Mumbai, India. Over the years, he has implemented artificial intelligence algorithms in domains such as financial risk, fraud and governance, and computer vision, among others, and is currently working as a Data Scientist at the State Bank of India. He has a thorough knowledge and understanding of programming languages such as Python, R, MATLAB, and Octave for machine learning and deep learning, and of data visualization and analysis in Python and R as well as through Power BI and Tableau. Bharat holds an MS in Data Science and Analytics from Royal Holloway, University of London, and a BTech in Information Technology from Symbiosis International University, and has earned multiple certifications, including MOOCs, in varied fields including machine learning. He is a science fiction fanatic, loves travelling, and is a great cook.

Acknowledgement
First, I would like to express my gratitude to God, whose blessings inspired me to write this book. I strongly believe in sharing my knowledge and helping others to succeed. This book would not have happened without the support of my caring parents, my loving wife, and my genius son; I take this opportunity to thank them for their continued support. My sincere thanks to my elder brother and sister for always encouraging and believing in me. Words are not enough to express my gratitude to Dr. Rajesh Kumar Aggarwal, Professor, Department of Computer Engineering, NIT Kurukshetra, for his continued guidance and insightful comments, and for his time and support in motivating me and others towards the path of spirituality and humanity. Finally, I thank my friends who trusted in my abilities and knowledge to write this book.
Preface

Artificial Intelligence has existed for a long time and has proven to be a disruptive force in a modern world driven by data and automation. From newspapers to TV channels, the hype around AI is ubiquitous, and thanks to huge improvements in the field, AI, along with its subfields Machine Learning and Deep Learning, has become a buzzword in recent years. AI is used extensively across many fields, such as robotics, object detection, image recognition, speech recognition, self-driving vehicles, humanoid robots, recommender systems, chatbots, virtual personal assistants, and so on.

The primary goal of this book is to let you explore some real-world scenarios and understand where and which algorithms to use in each context. This exciting recipe-based book also contains functional code written in Python. Over the 10 chapters in this book, you will learn the following:

Chapter 1 covers the basics of Artificial Intelligence and explains all the important terms and definitions. It also explains various fields of study in AI and applications of AI in various industries, and it will assist you in installing the Python programming language on different platforms.

Chapter 2 covers the basics of Machine Learning and its different learning styles. It also introduces you to the most popular machine learning algorithms and their implementation using Python.

Chapter 3 deals with the supervised machine learning tasks of classification and regression. It covers the various steps to build a classifier and a regressor using Python, and discusses the performance metrics used to evaluate classification and regression models.

Chapter 4 deals with clustering, an unsupervised machine learning task. It covers some important ML clustering algorithms and their implementation using Python, and discusses the metrics used to evaluate the performance of clustering algorithms.
Chapter 5 covers logic programming, with implementation examples useful for solving real-life problems.

Chapter 6 discusses, in depth, what Natural Language Processing (NLP) is and how to implement it in Python. It introduces you to Python's Natural Language Toolkit (NLTK) and then shows how to implement various important NLP concepts using NLTK.

Chapter 7 describes the working of an automatic speech recognition (ASR) system. It also covers the various steps to build a speech recognizer using Python.

Chapter 8 discusses Artificial Neural Networks (ANNs) in detail. It then covers building some useful neural networks, such as single-layer and multilayer neural networks, in Python.

Chapter 9 is a key chapter that discusses, in detail, reinforcement learning and its building blocks, namely the agent and the environment. It describes how to construct an environment and an agent using the Python programming language.

Chapter 10 is another key chapter, covering the basics of deep learning and convolutional neural networks (CNNs). It explains the evolution of CNNs and how they support complex object detection in images, and shows how to build an image classifier using a CNN in Python.

Downloading the code bundle and coloured images:
Please follow the link to download the Code Bundle and the Coloured Images of the book: https://rebrand.ly/9c16d0

Errata
We take immense pride in our work at BPB Publications and follow best practices to ensure the accuracy of our content and provide our subscribers with an engaging reading experience. Our readers are our mirrors, and we use their input to reflect on and improve upon any human errors that may have occurred during the publishing process.
To help us maintain quality and reach out to any readers who might be having difficulties due to unforeseen errors, please write to us at: [email protected]

Your support, suggestions, and feedback are highly appreciated by the BPB Publications family.

Did you know that BPB offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.bpbonline.com and, as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details. At www.bpbonline.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on BPB books and eBooks.

BPB is searching for authors like you
If you're interested in becoming an author for BPB, please visit www.bpbonline.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.

The code bundle for the book is also hosted on GitHub at https://github.com/bpbpublications/Learn-AI-with-Python. In case there's an update to the code, it will be updated on the existing GitHub repository. We also have other code bundles from our rich catalog of books and videos available at https://github.com/bpbpublications. Check them out!

PIRACY
If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author
If there is a topic that you have expertise in, and you are interested in either writing or contributing to a book, please visit www.bpbonline.com.

REVIEWS
Please leave a review.
Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at BPB can understand what you think about our products, and our authors can see your feedback on their book. Thank you! For more information about BPB, please visit www.bpbonline.com.

Table of Contents

1. Introduction to AI and Python
   Introduction
   Structure
   Objectives
   Introduction to Artificial Intelligence (AI)
   Why to learn AI?
   Understanding intelligence
   Types of intelligence
   Various fields of study in AI
   Applications of AI in various industries
   How does artificial intelligence learn?
   AI agents and environments
   What is an agent?
   What is an agent's environment?
   AI and Python – how do they relate?
   What is Python?
   Why choose Python for building AI applications?
   Python3 – installation and setup
   Windows
   Linux
   Ubuntu
   Linux Mint
   CentOS
   Fedora
   Installing and compiling Python from Source
   macOS/Mac OS X
   Conclusion
   Questions

2. Machine Learning and Its Algorithms
   Introduction
   Structure
   Objectives
   Understanding Machine Learning (ML)
   The Landscape of Machine Learning Algorithms
   Components of a Machine Learning algorithm
   Different learning styles in machine learning algorithms
   Supervised learning
   Unsupervised learning
   Semi-supervised learning
   Reinforcement learning
   Popular machine learning algorithms
   Linear regression
   Logistic regression
   Decision tree algorithm
   Random forest
   Naïve Bayes algorithm
   Support Vector Machine (SVM)
   k-Nearest Neighbor (kNN)
   K-Means clustering
   Conclusion
   Questions

3.
Classification and Regression Using Supervised Learning
   Introduction
   Structure
   Objectives
   Classification
   Various steps to build a classifier using Python
   Step 1 – Import ML library
   Step 2 – Import dataset
   Step 3 – Organizing data into training and testing set
   Step 4 – Creating ML model
   Step 5 – Train the model
   Step 6 – Predicting test set result
   Step 7 – Evaluating the accuracy
   Lazy learning versus eager learning
   Performance metrics for classification
   Confusion matrix
   Accuracy
   Precision
   Recall
   Specificity
   F1 score
   Regression
   Various steps to build a regressor using Python
   Step 1 – Import ML library
   Step 2 – Import dataset
   Step 3 – Organizing data into training and testing set
   Step 4 – Creating ML model
   Step 5 – Train the model
   Step 6 – Plotting the regression line
   Step 7 – Calculating the variance
   Performance metrics for regression
   Mean Absolute Error (MAE)
   Mean Squared Error (MSE)
   R-Squared (R2)
   Adjusted R-squared (R2)
   Conclusion
   Questions

4. Clustering Using Unsupervised Learning
   Introduction
   Structure
   Objectives
   Clustering
   Various methods to form clusters
   Important ML clustering algorithms
   K-means clustering algorithm
   Mean-shift clustering algorithm
   Hierarchical clustering algorithm
   Performance metrics for clustering
   Silhouette analysis
   Davies–Bouldin index
   Dunn index
   Conclusion
   Questions

5. Solving Problems with Logic Programming
   Introduction
   Structure
   Objectives
   Logic programming
   Building blocks of logic programming
   Useful Python packages for logic programming
   Implementation examples
   Checking and generating prime numbers
   Solving the puzzles
   Conclusion
   Questions

6.
Natural Language Processing with Python
   Introduction
   Structure
   Objective
   Natural Language Processing (NLP)
   Working of NLP
   Phases/logical steps in NLP
   Implementing NLP
   Installing Python's NLTK Package
   Installing NLTK
   Downloading NLTK corpus
   Understanding tokenization, stemming, and lemmatization
   Tokenization
   Stemming
   Lemmatization
   Difference between lemmatization and stemming
   Understanding chunking
   Importance of chunking
   Understanding Bag-of-Words (BoW) model
   Why the BoW algorithm?
   Implementing the BoW algorithm using Python
   Understanding stop words
   When to remove stop words?
   Removing stop words using the NLTK library
   Understanding vectorization and transformers
   Vectorization techniques
   Transformers
   Some examples
   Predicting the category
   Gender finding
   Conclusion
   Questions

7. Implementing Speech Recognition with Python
   Introduction
   Structure
   Objective
   Basics of speech recognition
   Working of the speech recognition system
   Building a speech recognizer
   Difficulties while developing a speech recognition system
   Visualization of audio signals
   Characterization of the audio signal
   Monotone audio signal generation
   Extraction of features from speech
   Recognition of spoken words
   Conclusion
   Questions

8. Implementing Artificial Neural Network (ANN) with Python
   Introduction
   Structure
   Objective
   Understanding of Artificial Neural Network (ANN)
   A biological neuron
   Working of ANN
   The basic structure of ANN
   Types of ANNs
   Optimizers for training the neural network
   Gradient descent
   Stochastic Gradient Descent (SGD)
   Mini-Batch Gradient Descent
   Stochastic Gradient Descent with Momentum
   Adam (Adaptive Moment Estimation)
   Regularization
   Regularization techniques
   Installing useful Python package for ANN
   Examples of building some neural networks
   Perceptron-based classifier
   Single-layer neural networks
   Multi-layer neural networks
   Vector quantization
   Conclusion
   Questions

9.
Implementing Reinforcement Learning with Python
   Introduction
   Structure
   Objective
   Understanding reinforcement learning
   Workflow of reinforcement learning
   Markov Decision Process (MDP)
   Working of Markov Decision Process (MDP)
   Difference between reinforcement learning and supervised learning
   Implementing reinforcement learning algorithms
   Reinforcement learning algorithms
   Types of reinforcement learning
   Benefits of reinforcement learning
   Challenges with reinforcement learning
   Building blocks of reinforcement learning
   Agent
   Environment
   Constructing an environment using Python
   Constructing an agent using Python
   Conclusion
   Questions

10. Implementing Deep Learning and Convolutional Neural Network
   Introduction
   Structure
   Objective
   Understanding Deep Learning
   Machine learning versus deep learning
   Elucidation of Convolutional Neural Networks
   The Architecture of Convolutional Neural Network
   Localization and object recognition with deep learning
   Deep learning models
   Image classification using CNN in Python
   Conclusion
   Questions

Index

CHAPTER 1
Introduction to AI and Python

Introduction

What is the very first thing that comes into your mind when you think of Artificial Intelligence (AI)? It may be an automated machine, robots, or an image of the brain with some processing. If yes, then your understanding of AI is appropriate but vague. So, you may be wondering: what exactly is AI? This chapter provides a brief overview of AI. It covers various fields of study in AI, real-life applications of AI, and agents and environments. This chapter also introduces the Python programming language, one of the most popular languages used by developers today for building AI applications, and highlights Python's features, its installation, and the steps to run a Python script.
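Before diving in, it helps to confirm which interpreter is active on your machine, since the installation steps covered later in this chapter, and the libraries used throughout the book (scikit-learn, NLTK, Keras), assume Python 3. A minimal sanity check, assuming some Python interpreter is already installed, might look like this:

```python
# Report which Python interpreter is active and confirm it is Python 3.
import sys

print(sys.version)      # full version string of the running interpreter
print(sys.executable)   # filesystem path of the interpreter binary

# The libraries used in this book require Python 3.
assert sys.version_info.major == 3, "Please install Python 3"
```

If the assertion fails, a Python 2 interpreter is being used; install Python 3 for your platform as described later in this chapter.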
Structure

In this chapter, we will discuss the following topics:
- Introduction to Artificial Intelligence (AI)
- Why to learn AI?
- Understanding intelligence
- Various fields of study in AI
- Applications of AI in various industries
- How does artificial intelligence learn?
- AI – agents and environments
- AI and Python – how do they relate?
- Python3 – installation and setup

Objectives

After studying this chapter, you will understand the basics of AI. You will also learn about various fields of study in AI and its applications in various industries. You will be able to install Python 3 on Windows, Linux, and Mac OS X, and you will understand the reasons for choosing Python for AI projects.

Introduction to Artificial Intelligence (AI)

John McCarthy, an American computer scientist who was a pioneer and an inventor, coined the term Artificial Intelligence (AI) in his 1955 proposal for the 1956 Dartmouth Conference, the first artificial intelligence conference. According to him, AI is "the science and engineering of making intelligent machines, especially intelligent computer programs."

As we can see, Artificial Intelligence is composed of two words: Artificial, which means man-made, and Intelligence, which means thinking power. Hence, we can say that AI means man-made thinking power. We can define AI as: "A branch of information technology by which we can create intelligent machines that can think like a human, behave like a human, and are also able to make decisions on their own."

AI is accomplished by studying how humans think, learn, and decide while trying to solve a problem, and then using this outcome as a base for developing intelligent machines. The best part of AI is that we do not have to preprogram a machine; instead, we can create a machine with programmed algorithms that can work with its own intelligence.

Why to learn AI?

Are machines capable of thinking? This is a simple question that is very difficult to answer.
Different researchers have defined terms such as thought or intelligence in different ways; when we look more closely at AI, this is just one of the problems we encounter. But one thing is clear: current progress in the development of algorithms, combined with greater processing power and exponential growth in the amount of available data, means that it is now possible to build AI systems that perform tasks previously viewed as the exclusive domain of human beings.

Some of the capabilities of AI, because of which we should learn it, are as follows:

- AI is capable of learning through data: In our day-to-day life, we deal with huge amounts of data, and our minds cannot keep track of it all. AI's capability of learning through data helps us to automate things.

- AI is capable of teaching itself: In this digital era, data itself keeps changing at a rapid pace, so the knowledge derived from such data must also be updated constantly. To fulfill this purpose, a system should be intelligent, and AI can help us create such intelligent systems.

- AI can respond in real time: If you use the internet regularly, you are probably using real-time applications in fields such as e-commerce, healthcare, retail, manufacturing, self-driving cars, and so on. AI, with the help of neural networks, can analyze data more deeply and hence respond to situations based on real-time conditions.

- AI can achieve a greater degree of accuracy: Deep learning, a subset of machine learning, extends the potential of AI to more complex tasks that can only be computed through multiple steps. These tasks are often performed with a greater degree of accuracy.

Understanding intelligence

To build AI applications (smart systems that can think and act like a human), it is necessary to understand the concept of intelligence. As discussed before, different researchers have defined terms such as thought or intelligence in different ways.
Let's define intelligence keeping in mind the scope of AI:

- Ability to take decisions: From a set of many deciding factors, it is important to take the optimal, correct, and accurate decision. This measures intelligence in a generic way as well as in terms of AI.

- Ability to prove results: Another important factor that measures intelligence is the ability to explain why a particular decision has been chosen.

- Ability to think logically: Can everything in this world be proved by mathematical formulae or proofs? No; as humans, for many things, we need to apply our common sense, think logically, and draw conclusions. This ability also measures intelligence.

- Ability to learn and improve: How do we develop our experiences? Whenever we learn something new, we add to our experience, and these experiences help us make better decisions and find better opportunities in the future. This also measures intelligence, both in a generic way and in terms of AI, because the more we learn from the external environment, the more we are able to improve ourselves.

Types of intelligence

According to Howard Gardner, an American developmental psychologist, there are eight distinct intelligences.

- Linguistic–verbal intelligence: It is the ability to speak, recognize, and use the mechanisms of phonology, syntax, and semantics. Some of the characteristics of people with linguistic–verbal intelligence are:
  - They enjoy reading and writing.
  - They can explain things very well.
  - They are good at debating or giving persuasive speeches.
  - They are also good at remembering written and spoken information.
  For example: writers, narrators, teachers, and journalists.

- Musical intelligence: It is the ability to create, communicate, and understand pitch, rhythm, and the meaning of sounds. Some of the characteristics of people with musical intelligence are:
  - They enjoy singing as well as playing musical instruments.
  - They can recognize musical patterns and tones easily.
  - They are good at remembering songs and melodies.
  - They have a great understanding of musical structure, rhythm, and notes.
  For example: musicians, singers, music teachers, and composers.

- Logical–mathematical intelligence: It is the ability to use and understand relationships in the absence of actions or objects, and to understand complex as well as abstract ideas. Some of the characteristics of people with logical–mathematical intelligence are:
  - They have excellent problem-solving skills.
  - They enjoy thinking about abstract ideas.
  - They are good at, and enjoy, conducting scientific experiments.
  For example: mathematicians, engineers, computer programmers, and scientists.

- Visual–spatial intelligence: It is the ability to perceive visual information, transform it, re-create images without reference to the objects, and construct, move, and rotate 3-dimensional images. Some of the characteristics of people with visual–spatial intelligence are:
  - They enjoy drawing and painting.
  - They recognize patterns very easily.
  - They are good at interpreting pictures, graphs, and charts.
  - They are also good at putting puzzles together.
  For example: architects, artists, astronauts, and physicists.

- Bodily-kinesthetic intelligence: It is the ability to use part or all of the body to solve problems, and it involves control of fine and gross motor skills. Some of the characteristics of people with bodily-kinesthetic intelligence are:
  - They have excellent physical coordination.
  - They enjoy creating things by themselves.
  - They are good at dancing and sports.
  For example: players, builders, actors, and dancers.

- Intra-personal intelligence: It is the ability to distinguish among one's own feelings, intentions, and motivations. Some of the characteristics of people with intra-personal intelligence are:
  - They are good at analyzing their strengths and weaknesses.
  - They enjoy analyzing and learning through theories and ideas.
  - They have excellent self-awareness.
  - They understand the basis for their motivations as well as their feelings.
  For example: writers, theorists, and scientists.

- Inter-personal intelligence: Unlike intra-personal intelligence, it is the ability to recognize and make distinctions among others' feelings, beliefs, and intentions. Some of the characteristics of people with inter-personal intelligence are:
  - They are good at communicating verbally.
  - They are also skilled at nonverbal communication.
  - They always tend to create positive relationships with others.
  - They are also good at resolving conflicts in groups.
  For example: psychologists, philosophers, and politicians.

- Naturalistic intelligence: It is the ability to explore the environment and to learn about other species. Individuals with this type of intelligence are said to be highly aware of even the smallest changes. Some of the characteristics of people with naturalistic intelligence are:
  - They are interested in studying subjects such as botany, biology, and zoology.
  - They enjoy camping, gardening, hiking, and outdoor activities.
  - They do not enjoy learning topics that have no relation to nature.
  For example: farmers, gardeners, biologists, and conservationists.

A system is artificially intelligent if it is equipped with at least one, and possibly all, of these intelligences.

Various fields of study in AI

As soon as we start thinking about AI, various terms such as Machine Learning (ML), Deep Learning (DL), Natural Language Processing (NLP), Data Science, Statistical Analysis, Artificial Neural Networks (ANNs), Genetic Algorithms, and so on come to mind. But seen broadly, AI is not an isolated domain; it is an umbrella covering every technology that helps transcend human capabilities. Let's have a look at some of the fields of study within AI:

- Machine Learning (ML): Machine Learning, one of the most popular fields of study, is a subset of AI that allows machines to learn on their own, much as humans learn from their experiences.
  It learns from a dataset and makes predictions.

- Deep Learning (DL): Deep Learning is a subset of ML concerned with algorithms inspired by the function of the brain, called Artificial Neural Networks (ANNs). It makes the computation of multi-layer neural networks possible.

- Logic: According to the Oxford dictionary, logic is "reasoning conducted or assessed according to strict principles of validity." To an extent, it carries the same meaning in AI as well: we can define logic as the proof of validation behind any reason provided. But why is it important to include logic in AI? Because we want our system (agent) to think like a human, and to do so, it should be capable of making decisions based on the current situation.

- Knowledge Representation: We humans are best at understanding, reasoning, and interpreting knowledge, and based on our knowledge we can perform various actions in the real world. How machines can do all these things is the subject of Knowledge Representation (KR). KR is concerned with how AI agents think and how thinking contributes to their intelligent behavior. Intelligence is dependent on knowledge, because an AI agent can act accurately on an input only when it has some knowledge or experience about that input.

Applications of AI in various industries

Artificial intelligence, machine learning, and deep learning are here, growing, and with each passing day making machines smarter and smarter. In fact, they are becoming a disruptive force that is redefining today's world. They have come roaring out of high-tech labs to become something that we use every day without even realizing it, and the acceleration we have seen in recent years shows no signs of slowing down. With applications ranging from heavy industry to healthcare, the presence and importance of AI and ML technology are being felt across a broad spectrum of industries.
Let's have a look at the top five fast-growing industries that are reaping tremendous benefits from this technology:

- Education: Education is the backbone of any nation. AI is improving the education system by replacing traditional techniques with personalized, immersive learning techniques; this helps teachers tailor instruction to students' weaknesses. Two immersive-learning technologies are Augmented Reality (AR) and Virtual Reality (VR). Augmented reality is software that uses the device's camera to overlay digital elements onto the real world; it facilitates teachers and trainers in performing, in a safe environment, tasks they previously could not. Virtual reality, on the other hand, creates a 360-degree digital environment that allows students to interact directly with the study material through e-learning resources on mobile devices.

- Healthcare: One of the latest advancements in healthcare is Google's Medical Brain, which is enabled with a new type of AI algorithm and is used to make predictions about the likelihood of death among patients. AI is also helping the laboratory segment of healthcare with ML-enabled laboratory robots that can study new molecules and reactions. In recent years, cancer has been one of the leading causes of death; companies like Infervision have developed AI-based systems, trained with suitable algorithms, to review CT scans and detect early signs of cancer. During the coronavirus pandemic, AI has also been used to forecast infections, deaths, and recovery timelines of COVID-19.

- Automobiles: Driverless or self-driving vehicles are not science fiction anymore; with huge advancements in AI, they have become a reality. Tech giants such as Google, Apple, Amazon, Cisco, Intel, and Bosch are leading the R&D in autonomous driving.
  Automobile companies such as General Motors (GM), Tesla, BMW, and Mercedes are also serious players in the self-driving vehicle game. Autonomai, enabled with deep learning and AI capabilities, is an autonomous middleware platform developed by the Indian company Tata Elxsi. The day is not far when we will see and use self-driving vehicles on Indian roads as well.

- E-Commerce: In this e-commerce and digital era, we all have experience of online shopping, and we sometimes buy stuff that is not required at all or that we seldom use. The new strategy of e-commerce companies is to sell products to their customers even before the customers realize the need for them. Companies achieve this by learning their customers' preferences and by using various other levers, such as attractive deals, special coupons, and discounts. This strategy is called 'purchase recommendations' or 'intuitive selling', and it is based purely on AI algorithms. For example, by using AI algorithms and computer vision, Amazon Go is redefining the way of shopping in supermarkets: it automatically adds items to the customer's virtual cart and, once the customer leaves the store, charges their Amazon account. So, no more lines at checkout.

- Digital marketing: Imagine how effective marketing becomes if most of the time-consuming tasks, such as identifying the right prospects, segmenting and targeting audiences, building a winning content strategy, and scheduling releases, could be driven without human intervention. AI is bringing this power into marketing automation through tools like Boomtrain, Phrasee, Persado, Adext, RankBrain, chatbots, and so on.

You may be wondering whether AI, ML, and DL have any applications in day-to-day life, or whether they are meant for industrial use only. Look around, and you can see and feel AI-powered things and devices.
Following are some cool AI applications enhancing our lifestyle:

Virtual personal assistants (they are intelligent): Most of us interact with virtual personal assistants like Siri, Alexa, Cortana, and Google Now on a regular basis to get the information we want. It is AI technology with whose help these VPAs continually learn information about us so that they can provide better services. In fact, we can now use Google Assistant to talk to tulip flowers. Google and Wageningen University made it possible by mapping tulip signals to human signals on Google Assistant’s existing Neural Machine Translation. Google Assistant has now added Tulipish as a language and offers translation between it and dozens of human languages. So, now we can say, “Okay Google, talk to my tulip.” Astonishing, right?

Video games: I am sure everyone has memories of classic video games like Road Rash, Pac-Man, Super Mario, Virtua Fighter, and the Nokia Snake game. Today’s video games like Fortnite, Call of Duty, Grand Theft Auto, and Far Cry are far more advanced because they are empowered by AI algorithms. These algorithms make today’s games look highly realistic: the characters in the game understand a gamer’s behavior, learn from stimuli, and change their traits accordingly. Such features keep players coming back to play again and again.

Humanoids (human-like robots for humans): Around two years ago, a humanoid robot named Sophia became the first-ever robot to have a nationality. Yes, in October 2017, Sophia, created by a Hong Kong firm named Hanson Robotics, was granted Saudi Arabia’s citizenship. This AI-powered robot can imitate human gestures and facial expressions and can initiate discussions on predefined topics as well. That’s why we can call Sophia ‘a social humanoid robot’. In fact, India is also not far behind. We have Rashmi, the world’s first Hindi-speaking humanoid robot, who has been hosting a show on Red FM since December 2018.
She was created by Ranjit Srivastava of Ranchi. We can call Rashmi ‘the Indian sister of Sophia’.

Maps and directions: Everyone has Google Maps or a similar app on their smartphone for finding directions and routes, so there is no more fear of getting lost. Here, too, AI is helping us out. Google uses Graph Neural Networks (GNNs), an ML architecture, to reduce the percentage of inaccurate Estimated Times of Arrival (ETAs).

Cab-service ride-sharing feature: In today’s scenario, one of the best ways to commute is through cab services like Ola, Uber, and so on. To save money, many of us share our rides with other passengers. But have you ever wondered: How does the cab-service app find a booking from a person going on the same route as yours? In a shared ride, how is an individual’s fare determined? All such queries have only one answer – AI algorithms.

How does artificial intelligence learn?

Today, AI helps various industries, as discussed in the preceding section. These AIs are often self-taught: they work from a simple set of instructions to create a unique set of rules and strategies. So how exactly does a machine learn? There are various ways to build self-teaching programs, but they all rely on the following three basic types of machine learning:

Supervised learning: It takes data samples (usually called training data) and the associated output (usually called labels or responses) with each data sample during the training process of the model. The main objective of supervised learning is to learn an association between the input training data and the corresponding labels.

Unsupervised learning: Unsupervised learning methods (as opposed to supervised learning methods) do not require any pre-labeled training data. In such methods, the machine learning model or algorithm tries to learn patterns and relationships from the given raw data without any supervision.
Although there is a lot of uncertainty in the results of these models, we can still obtain a lot of useful information, such as previously unknown patterns in the data, features that can be useful for categorization, and so on.

Reinforcement learning: In reinforcement learning algorithms, a trained agent interacts with a specific environment. The job of the agent is to observe the current state of that environment, take actions based on what it observes, and learn from the feedback (rewards) it receives in return.

AI agents and environments

AI is all about practical reasoning: reasoning in order to do something. An AI system is composed of an agent and its environment. The agents act in their environment, and the environment may contain other agents.

What is an agent?

An agent may be defined as anything that can perceive its environment through sensors and act upon that environment through effectors. An agent, having mental properties such as knowledge, belief, and intention, runs in a cycle of perceiving, thinking, and acting. Examples of agents are:

A human agent has sensory organs like eyes, ears, tongue, skin, and nose, which work as sensors. On the other hand, it has hands, legs, and a vocal tract, which work as effectors.

A robotic agent has cameras and infrared range finders, which act as sensors. On the other hand, it has various motors acting as effectors.

A software agent has keystrokes, file contents, and received network packets, which work as sensors. On the other hand, it has sent network packets and content displayed on the screen, which work as effectors.

Figure 1.1: Agent and its environment

For an AI agent, the following are the four important rules:

Rule 1: It must have the ability to perceive the environment.
Rule 2: It must use its observations to make decisions.
Rule 3: The decisions it makes should result in an action.
Rule 4: Every action it takes must be a rational action.
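The perceive–think–act cycle behind these rules can be sketched in a few lines of plain Python. This is a hypothetical illustration only: the `VacuumAgent` class, its rule table, and the two-square toy environment are invented here, not taken from the book.

```python
# A minimal sketch of the perceive-think-act cycle (hypothetical example):
# the agent senses a state, maps it to an action via a condition-action
# rule table, and acts on the environment through its effectors.

class VacuumAgent:
    """Toy agent for a two-square vacuum world (squares 'A' and 'B')."""

    RULES = {
        ("A", "dirty"): "suck",
        ("B", "dirty"): "suck",
        ("A", "clean"): "move_right",
        ("B", "clean"): "move_left",
    }

    def perceive(self, environment, location):
        # Sensor: read the status of the current square.
        return (location, environment[location])

    def decide(self, percept):
        # Think: look up the condition-action rule for this percept.
        return self.RULES[percept]

def run_episode(environment, location, steps=4):
    agent = VacuumAgent()
    actions = []
    for _ in range(steps):
        percept = agent.perceive(environment, location)
        action = agent.decide(percept)
        actions.append(action)
        # Effector: the action changes the environment or the location.
        if action == "suck":
            environment[location] = "clean"
        elif action == "move_right":
            location = "B"
        elif action == "move_left":
            location = "A"
    return actions

actions = run_episode({"A": "dirty", "B": "dirty"}, "A")
print(actions)  # -> ['suck', 'move_right', 'suck', 'move_left']
```

Starting on a dirty square A, the agent sucks, moves right, sucks the dirty square B, and moves back, exactly the condition-to-action mapping described above.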
Agent terminology

The performance measure of an agent: It may be defined as the criteria determining how successful an agent is.

The behavior of an agent: It may be defined as the action that an agent performs after any given sequence of percepts.

Percept: An agent’s perceptual input at a given instance is called a percept.

Percept sequence: It may be defined as the history of everything an agent has perceived till now.

Sensor: A sensor, through which an agent observes its environment, is a device that detects changes in the environment and sends that information to other devices.

Effectors: They are the devices affecting the environment. They can be hands, legs, arms, fingers, a display screen, a sent network packet, wings, fins, and so on.

Actuators: Actuators, responsible for moving and controlling a system, are the components of machines that convert energy into motion. Examples of actuators are electric motors, gears, rails, and so on.

Rationality and rational agent

The rationality of an agent is concerned with the performance measure of that agent. As we know, the agent should perform actions to obtain useful information. So, in simple words, we can define rationality as the status of being sensible, reasonable, and having a good sense of judgment. The rationality of any agent depends upon the following four factors:

The Performance Measure (PM) of an agent.
The agent’s Percept Sequence (PS).
The agent’s Prior Knowledge (PK) about the environment.
The Actions (A) an agent can carry out.

(PM, PS, PK, A)

An ideal rational agent is one that has clear preferences, models uncertainty, and is capable of taking the expected actions to maximize its performance measure, based on its percept sequence and built-in knowledge base. A rational agent is said to always perform the right actions. Here, the right actions are the actions that cause the agent to be most successful for the given percept sequence.
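Two of these factors can be made concrete with a small sketch: a Performance Measure (PM) scores the Actions (A) an agent carried out against the Percept Sequence (PS) it observed. The scoring scheme and the toy percepts below are hypothetical, invented purely for illustration.

```python
# Hypothetical sketch: scoring an agent's action history against its
# percept sequence with a performance measure (PM). The +10/-1 values
# are arbitrary choices for this toy example.

def performance_measure(percept_sequence, actions):
    """PM: reward cleaning a dirty square, penalize wasted sucking."""
    score = 0
    for (location, status), action in zip(percept_sequence, actions):
        if status == "dirty" and action == "suck":
            score += 10      # useful work
        elif status == "clean" and action == "suck":
            score -= 1       # wasted effort
    return score

# PS: the history of everything the agent has perceived so far.
percepts = [("A", "dirty"), ("A", "clean"), ("B", "dirty")]
# A: the actions the agent actually carried out.
actions = ["suck", "move_right", "suck"]

print(performance_measure(percepts, actions))  # -> 20
```

A rational agent, in these terms, is one whose chosen actions maximize this score given its percept sequence and prior knowledge.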
Structure of an AI agent

The main task of artificial intelligence is to create and design an agent program that implements the agent function. The structure of an AI agent can thus be viewed as the combination of architecture and agent program:

Agent = Architecture + Agent program

Architecture, agent function, and agent program are the three main terms involved in the structure of an AI agent.

Architecture: It is the machinery the AI agent executes on.

Agent function: It may be defined as the map from the percept sequence to an action:

f : P* → A

Agent program: It is an implementation of the agent function that executes on the physical architecture to produce the agent function’s behavior.

P.E.A.S representation

P.E.A.S representation is a model in which the properties of an AI agent or rational agent can be grouped. It consists of four terms:

P: Performance measure
E: Environment
A: Actuators
S: Sensors

As discussed, the performance measure is the objective for the success of an agent’s behavior. Let’s see two examples of agents with their P.E.A.S representation:

Self-driving vehicles: The P.E.A.S representation for a self-driving vehicle will be:

P (Performance): Safety, time, legal driving, and comfort.
E (Environment): Road, road signs, other vehicles, and pedestrians.
A (Actuators): Accelerator, steering, brake, clutch, signal, and horn.
S (Sensors): Camera, speedometer, GPS, sonar, and accelerometer.

Vacuum cleaner: The P.E.A.S representation for a vacuum cleaner will be:

P (Performance): Cleanliness, battery life, efficiency, and security.
E (Environment): Room, wooden floor, carpet, and other obstacles like shoes, beds, tables, and so on.
A (Actuators): Brushes, wheels, and vacuum extractors.
S (Sensors): Camera, cliff sensor, dirt detection sensor, bump sensor, and infrared wall sensor.
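A P.E.A.S description is just structured data, so it can be captured directly in Python. The dictionary layout below is our own sketch (not a standard API); the entries mirror the two examples above.

```python
# Hypothetical sketch: P.E.A.S properties of an agent held as plain
# dictionaries, with a helper that summarizes them.

self_driving_car = {
    "performance": ["safety", "time", "legal driving", "comfort"],
    "environment": ["road", "road signs", "other vehicles", "pedestrians"],
    "actuators": ["accelerator", "steering", "brake", "clutch", "signal", "horn"],
    "sensors": ["camera", "speedometer", "GPS", "sonar", "accelerometer"],
}

vacuum_cleaner = {
    "performance": ["cleanliness", "battery life", "efficiency", "security"],
    "environment": ["room", "wooden floor", "carpet", "obstacles"],
    "actuators": ["brushes", "wheels", "vacuum extractor"],
    "sensors": ["camera", "cliff sensor", "dirt detection sensor",
                "bump sensor", "infrared wall sensor"],
}

def describe(name, peas):
    """Return a one-line P.E.A.S summary for an agent."""
    return (f"{name}: P={len(peas['performance'])} measures, "
            f"E={len(peas['environment'])} elements, "
            f"A={len(peas['actuators'])} actuators, "
            f"S={len(peas['sensors'])} sensors")

print(describe("self-driving car", self_driving_car))
```

Writing the four properties down this way is a useful first step when designing any agent: it forces you to name what the agent is scored on, what it lives in, and how it senses and acts.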
Types of agents

Based on their degree of perceived intelligence and capability, agents can be grouped into the following four classes:

Simple reflex agent
Model-based reflex agent
Goal-based agent
Utility-based agent

Simple reflex agent: They choose actions based only on the current percept and ignore the rest of the percept history. They work on the condition–action rule, which is a rule that maps a state (that is, a condition) to an action. If the condition is true, the action is taken; otherwise, it is not. For example, a room-cleaner agent works only if there is dirt in the room. Their environment must be fully observable.

Figure 1.2: Simple reflex agent

Model-based reflex agent: In order to choose their actions, they use a model of the world. They must keep track of an internal state, adjusted by each percept, that depends on the percept history. They can handle partially observable environments. The model is the knowledge about how things happen in the world. The internal state is a representation of the unobserved aspects of the current state, which depends upon the percept history. In order to update the agent’s state, it requires the following information:

How does the world evolve?
How do the agent’s actions affect the world?

Figure 1.3: Model-based reflex agent

Goal-based agent: They choose their actions and take decisions based on how far they currently are from their goal – a description of a desirable situation. Every action of such agents is intended to reduce the distance from the goal. This goal-based approach is more flexible than the reflex approach because the knowledge supporting a decision is explicitly modeled, which allows for modifications.

Figure 1.4: Goal-based agents

Utility-based agent: They are developed with their end uses as building blocks. Utility-based agents are used when we must decide which is the best among multiple possible alternatives. They choose their actions and take decisions based on a preference (utility) for every state.
Sometimes, achieving the desired goal is not enough, because goals are inadequate when:

We have conflicting goals and only a few of them can be achieved.
There is some uncertainty about whether a goal can be achieved.

Figure 1.5: Utility-based agents

What is an agent’s environment?

Everything in the world that surrounds the agent is called the agent’s environment. It is not a part of the agent itself, but the situation in which the agent is present. In simple words, we can say that the environment is where an agent lives and operates. It is the environment that provides the agent with something to sense and act upon.

Nature of environments

Several aspects, such as the shape and frequency of the data, the nature of the problem, and the volume of knowledge available at any given time, distinguish one type of AI environment from another. Anyone who wants to tackle a specific AI problem should first understand the characteristics of AI environments. From that perspective, based on the nature of the environment, we use several categories to group AI problems.

Fully observable versus partially observable: Fully observable AI environments are those in which, at any given time, an agent’s sensors can sense or access the complete state of the environment. They are simpler to handle, as there is no need to maintain an internal state to keep track of history. Image recognition operates in a fully observable AI environment. Partially observable AI environments are those in which, at any given time, an agent’s sensors cannot sense or access the complete state of the environment. Self-driving vehicle scenarios operate in a partially observable AI environment.

Complete versus incomplete: Complete AI environments are those in which, at any given time, an agent has enough information to complete a branch of the problem. Chess is a classic example of such an AI environment. On the other hand, incomplete AI environments are those in which agents can’t anticipate many moves in advance.
In such environments, the agents focus, at any given time, on finding a good equilibrium state. Poker is a classic example of an incomplete AI environment.

Static versus dynamic: A static AI environment is an environment that cannot change itself while an agent is deliberating. That is the reason such environments are easy to deal with: the agent does not need to continually look at the world while deciding on an action. Speech analysis and crossword puzzles are problems operating in a static AI environment. In contrast, a dynamic AI environment is an environment that can change itself while an agent is deliberating. In dynamic environments, at each action, agents need to continually look at the world. Taxi driving and vision AI systems in drones are some problems operating in dynamic AI environments.

Discrete versus continuous: A discrete AI environment is an environment in which there is a finite (although arbitrarily large) number of percepts and actions that can be performed within it. Games such as Chess and Go come under discrete environments. In contrast, a continuous environment relies on unknown and rapidly changing data sources. Vision AI systems in drones and self-driving vehicles are examples of continuous AI environments.

Deterministic versus stochastic: As the name implies, in a deterministic AI environment, the agent’s current state and selected action completely determine the next state of the environment. As in a fully observable environment, the agent does not need to worry about uncertainty in a deterministic environment either. Most real-world AI environments are not deterministic in nature. On the other hand, a stochastic AI environment is an environment whose next state cannot be completely determined by the agent; it is random in nature. Self-driving vehicles operate in stochastic environments.

Single-agent versus multi-agent: As the name implies, in a single-agent AI environment, there is only one agent involved, operating by itself.
In contrast, in a multi-agent AI environment, multiple agents are involved and operating.

AI and Python – how do they relate?

There is a lot of confusion among researchers and developers about which programming language to choose for building AI applications. The list may include LISP, Prolog, Python, Java, C#, and a few more as well. The choice of a programming language depends upon many factors, such as ease of coding, personal preference, and available resources. Although the skills of the developer always matter more than any programming language, here we are going to justify just one of them for AI: the Python programming language.

What is Python?

Python, created by Guido van Rossum in 1991, is an object-oriented, high-level, interpreted programming language that focuses on Rapid Application Development (RAD) and the Don’t Repeat Yourself (DRY) principle. Due to its ease of learning and adaptation, Python has become one of the fastest-growing programming languages. Python’s ever-evolving libraries make it a good choice for any project, whether in IoT, data science, AI, or mobile apps.

Why choose Python for building AI applications?

The Python programming language is favored by developers for a whole range of applications, but what makes it a particularly good fit for applications and projects involving AI? Let’s have a look:

A great library and framework ecosystem: One of the aspects that makes Python the most popular language for AI is its abundance of libraries and frameworks. A library is a module or group of modules, published through sources like PyPI, that includes pre-written code, saving development time and letting users perform different actions or access certain functionality. As we know, ML and DL require continuous data processing, and Python’s libraries let us access, handle, and transform data.
The following are some of the widespread libraries we can use for building AI applications:

Scikit-learn: It is very useful for handling basic ML algorithms like clustering, regression, classification, and so on.

Pandas: It is used for high-level data structures and analysis. It allows the filtering and merging of data and can also be used for gathering data from external resources.

Keras: A very useful Python library for deep learning. It uses the GPU in addition to the CPU, hence allowing fast calculations.

TensorFlow: Another useful library for deep learning. It allows efficient training and use of artificial neural networks (ANNs) with massive datasets.

NLTK: One of the most useful Python libraries for working with natural language processing and computational linguistics.

Matplotlib: It is used for visualization. We can easily create 2D plots, histograms, and charts with Matplotlib.

Caffe: Another very useful library for deep learning. It allows switching between the CPU and the GPU.

Ease of use and simplicity: Python is almost unrivaled when it comes to ease of use and simplicity, particularly for novice AI developers. This has several advantages for ML and DL:

ML and DL both rely on extremely complex algorithms and multi-stage workflows. The less developers have to worry about the intricacies of coding, the more they can focus on finding solutions to their problems.

The simple syntax of Python makes development faster than in many other programming languages, so a developer can quickly test algorithms without having to fully implement them.

In addition to the preceding benefits, Python’s easily readable code is invaluable for collaborative coding.

Low-entry barrier: Learning the Python programming language is very easy because it resembles everyday English.
That’s the reason data scientists can quickly pick up Python and start using it for developing AI applications without spending too much time learning the language.

Flexibility: Due to its flexibility, Python is a great choice for AI:

Developers have the option to choose either object-oriented programming or scripting.
There is no need to recompile the source code; developers can implement changes and check the results quickly.
It can be combined with other programming languages.

Moreover, Python’s flexibility also allows a developer to choose from various programming styles, such as the imperative, functional, object-oriented, or procedural style.

Platform agnostic: Python is also a very versatile language: it is platform-independent and can run on any platform, including Windows, macOS, Linux, Unix, and 21 others. With some small-scale changes in the code, you can get your code running on a new OS. Again, this saves development time and the money spent on testing across various platforms.

The abundance of community support: It is always very helpful when there is strong community support built around a language. Python is an open-source programming language, which means that it is supported by a lot of resources. Moreover, a lot of Python documentation is available online, as well as in Python forums where developers can discuss errors, solve problems, and help each other out.

Python 3 – installation and setup

Let’s see how to set up a working Python 3 distribution on Windows, macOS, and Linux.

Windows

Installing Python on Windows does not involve much more than downloading the Python installer from the Python.org website and running it. Follow these steps to install Python 3 on Windows:

1. Downloading the Python installer

1. First, open a browser window and go to the Python.org website. Now, navigate to the download page for Windows.

2.
Underneath the heading at the top that says Download the latest version for Windows, click on the link for the latest Python 3 release – Python 3.x.x (as of this writing, the latest is Python 3.8.0).

3. Scroll to the bottom and select either of the following:

Windows x86-64 executable installer for 64-bit.
Windows x86 executable installer for 32-bit.

2. Running the installer

After downloading the installer, we simply need to run it by double-clicking on the downloaded file. A dialog box should appear that looks something like the following screenshot:

Figure 1.6: Python setup

In order to make sure that the interpreter will be placed in your execution path, check the box that says Add Python 3.x.x to PATH.

Linux

There is a very high chance your Linux distribution already has Python installed, but it probably won’t be the latest version, and it may be Python 2 instead of Python 3. To find out which version(s) you have, try the following commands in a terminal window:

python --version
python2 --version
python3 --version

One or more of the preceding commands should respond with a version, as shown in the following:

$ python3 --version
Python 3.8.0

If the version shown is Python 2, or a version of Python 3 that is not the latest one (3.8.0 as of this writing), then you should install the latest version. The procedure for installing the latest Python version will depend on the Linux distribution you are running.

Ubuntu

The instructions for installing Python vary depending on the Ubuntu distribution version. First, we need to determine our local Ubuntu version by running the following command:

$ lsb_release -a

We will get output like the following for the preceding command:

No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.4 LTS
Release: 16.04
Codename: xenial

Check the version number you see under Release in the output.
Depending on the version number of your Ubuntu distribution, follow the instructions:

Ubuntu 16.10 and 17.04 have Python 3.x.x in the Universe repository. We can install it with the following commands:

$ sudo apt-get update
$ sudo apt-get install python3.8

Once installed, we can invoke it with the command python3.8.

Ubuntu 14.04 and 16.04 do not have Python 3.x.x in the Universe repository; hence, we need to get it from a Personal Package Archive (PPA). For instance, to install Python from the PPA named deadsnakes, use the following commands:

$ sudo add-apt-repository ppa:deadsnakes/ppa
$ sudo apt-get update
$ sudo apt-get install python3.8

Once installed, we can invoke it with the command python3.8.

Linux Mint

We can follow the preceding instructions for Ubuntu 14.04, as Mint and Ubuntu use the same package management system. The PPA named deadsnakes works with Mint as well.

CentOS

The IUS Community provides newer versions of software for Enterprise Linux distros, that is, Red Hat Enterprise Linux and CentOS. We can use their work to install Python 3. Before installing, we must first update our system with the yum package manager. Use the following commands to do so:

$ sudo yum update
$ sudo yum install yum-utils

After that, we can install the CentOS IUS package by using the following command:

$ sudo yum install https://centos7.iuscommunity.org/ius-release.rpm

Now, with the help of the following commands, we can install Python and pip:

$ sudo yum install python36u
$ sudo yum install python36u-pip

Fedora

Fedora has a roadmap to switch to Python 3, which indicates that the current version and the next few versions will all ship with Python 2 as the default; however, Python 3 will be installed.
If the python3 installed on Fedora is not Python 3.8, you can use the following command to install it:

$ sudo dnf install python3.8

Installing and compiling Python from source

It may be that our Linux distribution does not have the latest version of Python. In that case, we can use the following steps to build and compile Python from source ourselves:

1. Downloading the source code

1. To start with, we need to get the Python source code. As we did for Windows, we can go to the Downloads page of Python.org and look for the latest source release (3.8.0) of Python 3.

2. Once the version is selected, there is a Files section at the bottom of the page. We need to select the Gzipped source tarball and download it to our machine.

3. Those who prefer a command-line method can use wget to download it to their current directory:

$ wget https://www.python.org/ftp/python/3.8.0/Python-3.8.0.tgz

2. Preparing the system

1. In order to build Python from scratch, we need to follow some steps that are specific to our Linux distribution. Although the goal of these steps is the same on all distributions, if yours does not use apt-get, you might need to translate them for your Linux distribution.

2. Before getting started, we need to update the system packages on our machine. For apt-based systems (Debian, Ubuntu, and so on), use the following commands:

$ sudo apt-get update
$ sudo apt-get upgrade

3. Next, make sure that your system has the tools needed to build Python. The commands for apt-based systems like Debian, Ubuntu, and so on are as follows:

$ sudo apt-get install -y make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev

4.
The commands for yum-based systems like CentOS are as follows:

$ sudo yum -y groupinstall development
$ sudo yum -y install zlib-devel

3. Building and compiling Python

1. As we have the prerequisites and the TAR file, we can unpack the source into a directory. The following command will create a new directory called Python-3.8.0:

$ tar xvf Python-3.8.0.tgz

2. Now change into the directory:

$ cd Python-3.8.0

3. Now, in order to prepare the build, we need to run the ./configure tool as follows:

$ ./configure --enable-optimizations --with-ensurepip=install

4. Next, build the Python programs using make, where the -j option splits the build into parallel steps to speed up the compilation:

$ make -j 8

5. As we want to install a new version of Python, we will use the altinstall target here so that the system’s version of Python is not overwritten. As this installs Python into system directories, we should run it as root:

$ sudo make altinstall

4. Verifying the installation

1. Finally, with the help of the following command, we can test whether our new Python version is installed:

$ python3.8 -V
Python 3.8.0

macOS/Mac OS X

If you are using Mac OS X, the best way to install Python is through the Homebrew package manager. If you don’t have Homebrew, you can install it by navigating to http://brew.sh/. Once you have installed Homebrew, use the following command to install Python 3:

$ brew install python3

Conclusion

The intent of this chapter was to familiarize you with the foundations of artificial intelligence and Python before deep diving into building AI applications. This chapter introduced the capabilities of AI, with a focus on the important fields of study under AI: machine learning, deep learning, logic, artificial neural networks (ANNs), and knowledge representation are some of the most important areas covered under AI. Keeping in mind the scope of AI, this chapter also defined intelligence.
AI is no longer a science-fiction term; it has become a reality. Industries are deploying AI and its subsets, namely machine learning and deep learning, for more productive and profitable solutions. In this chapter, we tried to convey the presence and importance of AI technology in five fast-growing industries, and how these industries are reaping the benefits of this technology. Next, we also explored some cool AI applications enhancing our day-to-day life. The subject of AI is all about practical reasoning and is composed of agents and their environments. Concepts relevant to agents and environments have also been covered in this chapter, including the structure of agents, types of agents, rationality, and the nature of environments. We briefly described the relationship between AI and Python, including the features of Python that make it one of the most suitable languages for building AI applications. Finally, you learned how to install and set up Python on various platforms, namely Windows, macOS/Mac OS X, and Linux. We brought everything covered in this chapter together to help you understand the basic concepts of AI and its impact on our lifestyle, and to introduce the Python programming language. This gets you ready for the next chapters, where you will implement AI algorithms with Python.

Questions

1. What is Artificial Intelligence (AI)? Describe some AI applications and explain how they are enhancing our lifestyle.
2. What are the most important fields of study within AI?
3. What is an AI agent? Explain its structure and types.
4. What is a rational agent? What are the factors on which the rationality of any agent depends?
5. Write down the features of the Python programming language that make it a good fit for building AI applications.

______________________________

1 Gardner, Howard. Frames of Mind: The Theory of Multiple Intelligences. New York: Basic Books, 1983.
CHAPTER 2
Machine Learning and Its Algorithms

Introduction

Do you find any similarity between the steam engine, the age of science, and digital technology? They are known as the first three industrial revolutions, responsible for fundamentally transforming our society and the world around us. In this digital era, we are experiencing such a revolution for the fourth time. This fourth industrial revolution is powered by Artificial Intelligence (AI), Machine Learning (ML), Deep Learning, the Internet of Things (IoT), and edge computing, along with increasing computing power such as quantum computing. Data, or information, is the driver and fuel of this industrial revolution. No doubt, with better computational power and more storage resources, this data is increasing day by day at a very rapid pace. For businesses and organizations, the real challenge is to make sense of this huge amount of data. That is why they are trying to build intelligent systems using methodologies from ML, one of the most exciting fields of computer science. We can see ML as the application and science of algorithms that provide meaning to data. This chapter provides a brief overview of ML and its model. It also addresses various ML methods, and, using the Python programming language, we will implement some of the most useful ML algorithms.
Structure

Understanding machine learning
The landscape of machine learning algorithms
Components of a machine learning algorithm
Different learning styles in machine learning algorithms
Supervised learning
Unsupervised learning
Semi-supervised learning
Reinforcement learning
Popular machine learning algorithms
Linear regression
Logistic regression
Decision tree
Random forest
Naïve Bayes
Support Vector Machine (SVM)
k-Nearest Neighbor (KNN)
k-means clustering

Objectives

After studying this chapter, you should be able to implement various popular machine learning algorithms, namely Linear Regression, Logistic Regression, Decision Tree, Random Forest, Support Vector Machine (SVM), Naïve Bayes, k-Nearest Neighbor, and k-means clustering, in the Python programming language. You will also learn about the different learning styles used in ML algorithms, such as supervised, unsupervised, semi-supervised, and reinforcement learning.

Understanding Machine Learning (ML)

Machine learning, a subset of AI, is the practice of enabling computer systems to extract patterns out of raw data by using an algorithm or a method. ML algorithms allow computer systems to learn from experience without explicit programming or any human intervention. To give you an example, a spam filter, one of the first applications of ML, can easily determine whether an email is important or spam.

The landscape of machine learning algorithms

The field of machine learning consists of learning algorithms that help the machine learn from data and improve its performance over time. Also, based on its interaction with the environment or input data, there are different ways an algorithm can model a problem. In this section, we will go through the components of an ML algorithm and the different learning styles, or learning models, that an ML algorithm can have, and we will also take a tour of the most popular ML algorithms.
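The spam filter mentioned above is a classic example of extracting a pattern from raw data, and it can be sketched in a few lines with scikit-learn. This is only an illustrative sketch (scikit-learn is assumed to be installed, and the four tiny emails are made up here; a real filter would train on thousands of labelled messages):

```python
# A minimal spam-filter sketch with scikit-learn: raw emails are turned
# into word counts, and a naive Bayes model learns which words are
# associated with the 'spam' and 'ham' labels.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# A tiny, made-up training set (hypothetical data for illustration).
emails = [
    "win a free prize now",               # spam
    "limited offer click here",           # spam
    "meeting rescheduled to monday",      # ham
    "please review the attached report",  # ham
]
labels = ["spam", "spam", "ham", "ham"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(emails, labels)

print(model.predict(["free prize offer"]))       # -> ['spam']
print(model.predict(["monday meeting report"]))  # -> ['ham']
```

Even on four examples, the model has extracted a pattern (which words co-occur with which label) rather than being explicitly programmed with rules, which is exactly the point of the definition above.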
Components of a Machine Learning algorithm

Before deep diving into the components of an ML algorithm, we must understand ML through the very interesting definition given by Professor Mitchell in 1997:

"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."

The preceding definition focuses on the following three parameters:

(T): Task
(P): Performance
(E): Experience

These three are the main components of any learning algorithm, as shown in the following figure:

Figure 2.1: Components of ML algorithm

Let's simplify the definition of ML: ML is that field of AI which consists of learning algorithms that improve their performance (P) at executing some task (T) over time with experience (E). The three main components of an ML algorithm are described as follows:

Task (T): A task should be defined in a two-fold manner. From a problem's perspective, a task T can be defined as the real-world problem to be solved. The problem can be anything, such as finding the best marketing strategy or predicting house prices. From the ML perspective, however, defining a task is quite different, because it is difficult to solve ML-based tasks using the traditional programming approach. A task T is called a machine learning-based task if it is based on the workflow that the system should follow to operate on sample data points. These sample data points typically consist of data attributes or features. Classification, regression, anomaly detection, clustering, translation, and so on, are some of the tasks that can be classified as ML tasks.

Experience (E): In layman's terms, experience is the knowledge a person gains by doing something or observing someone else do it. In the case of machine learning-based tasks, it is the knowledge gained from the sample data points provided to the ML algorithm.
After getting the data points, the ML algorithm runs iteratively and learns from the inherent patterns. Such learning by the ML algorithm or model is called experience (E), which will be used to solve task T. There are various ways of learning and gaining experience, including supervised, unsupervised, semi-supervised, and reinforcement learning. We will discuss them in the next section.

Performance (P): How do we know if our ML model, which is supposed to perform a task T while learning or gaining experience E from sample data points over time, is performing well or not? This is where the third component of the ML algorithm comes into the picture. This component is called performance (P), a quantitative metric used to measure how well the ML model is performing task T with experience E. Accuracy score, F1 score, precision, recall, specificity, and the confusion matrix are some of the performance metrics we can choose from to measure the performance of our ML model.

Different learning styles in machine learning algorithms

Let's look at the following four different learning styles in ML algorithms.

Supervised learning

Supervised learning methods are the most commonly used ML methods. During the training process of the model, such a method takes data samples (usually called training data) along with the output (usually called labels or responses) associated with each data sample. The main objective of supervised learning is to understand the association between the input training data and the corresponding labels. Let's understand it with an example.
Suppose we have:

Input variable: x
Output variable: Y

In order to learn the mapping function from the input to the output, we need to apply an algorithm whose main objective is to approximate the mapping function so well that we can easily predict the output variable (Y) for new input data:

Y = f(x)

These methods are called supervised learning methods because the ML model learns from training data where the desired output is already known. Logistic regression, k-Nearest Neighbors (KNN), decision tree, and random forest are some of the well-known supervised machine learning algorithms. Based on the type of ML-based task, supervised learning methods can be divided into the following two major classes:

Classification: The main objective of classification-based tasks is to predict categorical output responses based on the input data being provided. The output depends on what the ML model learned in the training phase. Categorical means unordered and discrete values; hence, the output responses will belong to a specific discrete category. For example, predicting high-risk patients and discriminating them from low-risk patients is a classification task. Suppose that for newly admitted patients, an emergency room in a hospital measures 12 variables (such as blood sugar, blood pressure, age, weight, and so on). After measuring these variables, a decision is to be taken on whether or not to put the patient in the ICU. A simple condition could be that high priority should be given to patients who are likely to survive more than a month.

Regression: The main objective of regression-based tasks is to predict continuous numerical output responses based on the input data being provided. The output depends on what the ML model learned in the training phase. Similar to classification, with the help of regression, we can predict the output responses for unseen data instances, but with continuous numerical output values.
Predicting the price of houses is one of the most common real-world examples of regression.

Unsupervised learning

Unsupervised learning methods (as opposed to supervised learning methods) do not require any pre-labeled training data. In such methods, the machine learning model or algorithm tries to learn patterns and relationships from the given raw data without any supervision. Although there is a lot of uncertainty in the results of these models, we can still obtain a lot of useful information, such as unknown patterns in the data and features that can be useful for categorization. To make it clearer, suppose we have:

Input variable: x

There would be no corresponding output variable. For learning, the algorithm needs to discover the interesting patterns in the data on its own. k-means clustering, hierarchical clustering, and Hebbian learning are some of the well-known unsupervised machine learning algorithms. Based on the type of ML-based task, unsupervised learning methods can be categorized into the following broad areas:

Clustering: Clustering, one of the most useful unsupervised machine learning methods, is used to find similarity and relationship patterns among data samples. Once the relationship patterns are found, it groups the data samples into clusters having similar features. The following figure illustrates the working of clustering methods:

Figure 2.2: Clustering Method

Association: Another useful unsupervised machine learning method is association. Association analyzes a large dataset in order to find patterns representing interesting relationships between a variety of items. For example, analyzing customer shopping patterns comes under association. It is also known as Association Rule Mining or Market Basket Analysis.

Anomaly detection: Sometimes, we need to find and eliminate observations that do not occur generally. In that case, the most useful unsupervised ML method is anomaly detection.
It uses learned knowledge to differentiate between anomalous and normal data points. k-means clustering, mean shift clustering, and k-Nearest Neighbors (KNN) are some of the algorithms that can detect anomalous data based on its features.

Dimensionality reduction: As the name implies, dimensionality reduction is used to reduce the number of feature variables for every data sample by selecting the principal features. One of the main reasons for using a dimensionality reduction method is the problem of feature space complexity (the curse of dimensionality). This problem arises when we start analyzing and extracting features, possibly millions of them, from our data samples. Principal Component Analysis (PCA) is one of the most popular dimensionality reduction methods.

Semi-supervised learning

Semi-supervised machine learning methods fall between supervised and unsupervised machine learning methods. In simple words, they are neither fully supervised nor fully unsupervised. For training, such methods use a small amount of pre-labeled, annotated data and lots of unlabeled data. The following are two approaches that one can follow to implement semi-supervised learning methods:

Approach-I: In this approach, we first use the small amount of annotated and labeled data to build a supervised model. Once done, we apply the model to the large amount of unlabeled data to get more labeled samples. We then train the model on these labeled samples and repeat the process.

Approach-II: In this approach, we first use unsupervised methods to cluster similar data samples and then annotate these groups. Once annotated, we can use them to train the model.

Reinforcement learning

Reinforcement machine learning methods are a bit different from supervised, unsupervised, and semi-supervised machine learning methods. In these kinds of learning algorithms, a trained agent interacts with a specific environment.
The job of the agent is to observe the environment and then take actions regarding the current state of that environment. Let's understand the working of reinforcement learning methods in the following steps:

1. Prepare an agent with some set of strategies.
2. Observe the environment's current state.
3. Based on the current state of the environment, select the optimal policy and perform a suitable action accordingly.
4. The agent gets a reward or penalty based on the action it took in the current state of the environment.
5. If needed, update the set of strategies.
6. Repeat the process until the agent learns and adopts the optimal policy.

Popular machine learning algorithms

There are so many ML algorithms that it is easy to feel overwhelmed when algorithm names are thrown around. We are often expected to know what these algorithms are and where they actually fit. So, let's take a tour of various popular ML algorithms and their implementation in the Python programming language.

Linear regression

A statistical model that attempts to model the linear relationship between a dependent variable and a given set of independent (explanatory) variables, by fitting a linear equation to the observed data, is called linear regression. What does a linear relationship between variables mean? It means that with a change (increase or decrease) in the value of one or more independent variables, the value of the dependent variable will also change (increase or decrease) accordingly. Mathematically, a linear regression line has an equation of the form:

Y = mX + b

Where Y is the dependent variable and X is the independent or explanatory variable. The slope of the regression line is m; it represents the effect X has on Y. b is the Y-intercept: when X = 0, Y = b.

Types of linear regression

Simple Linear Regression (SLR) and Multiple Linear Regression (MLR) are the two types of linear regression.
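The Y = mX + b equation above can be sanity-checked with NumPy's built-in least-squares fit. The data points here are made up so that the true line is known in advance:

```python
import numpy as np

# Points lying exactly on Y = 2X + 1, so the fit should recover m = 2 and b = 1.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2 * x + 1

m, b = np.polyfit(x, y, deg=1)  # degree-1 polynomial = a straight line
print(m, b)  # approximately 2.0 and 1.0
```

The hand-rolled implementation that follows computes the same slope and intercept from first principles.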
Let's learn about them and their implementation using Python.

Simple Linear Regression (SLR)

SLR, the most basic version of linear regression, predicts a response using a single feature. It assumes that the two variables are linearly related.

Implementing simple linear regression in Python: Let's see how we can implement SLR in the Python programming language. In the following example, we will use a small dataset. We can also use a dataset from the scikit-learn library, which will be used in our next example:

Example-1:

#Importing necessary packages
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

#Defining a function for calculating values needed for Simple Linear Regression (SLR)
def coef_estimation(x, y):
    n = np.size(x) #calculating number of observations 'n'
    mean_x, mean_y = np.mean(x), np.mean(y) #calculating mean of x and y vectors
    cross_xy = np.sum(y*x) - n*mean_y*mean_x #calculating cross-deviation and deviation about x
    cross_xx = np.sum(x*x) - n*mean_x*mean_x
    reg_b_1 = cross_xy / cross_xx #calculating regression coefficients, i.e., b
    reg_b_0 = mean_y - reg_b_1*mean_x
    return(reg_b_0, reg_b_1)

#Defining a function for plotting the regression line
def plot_regression_line(x, y, b):
    plt.scatter(x, y, color = "r", marker = "o", s = 20) #plotting actual points as scatter plot
    y_pred = b[0] + b[1]*x #predicting response vector
    plt.plot(x, y_pred, color = "g") #plotting the regression line and labels on it
    plt.xlabel('x')
    plt.ylabel('y')
    plt.show()

#Defining the main() function to provide the dataset and call the preceding functions
def main():
    x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])
    y = np.array([100, 300, 350, 500, 750, 850, 900, 950, 1250, 1350, 1400, 1550, 1600, 1650, 1700])
    b = coef_estimation(x, y)
    print("Estimated coefficients:\nreg_b_0 = {} \nreg_b_1 = {}".format(b[0], b[1]))
    plot_regression_line(x, y, b)

if __name__ == "__main__":
    main()

We will get the following output for the preceding Python program:

Output:

Estimated coefficients:
reg_b_0 = 187.08333333333337
reg_b_1 = 118.03571428571429

Figure 2.3: Simple Linear Regression

Example-2: In this example, we will implement SLR by using the diabetes dataset from scikit-learn:

#Importing necessary packages
import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error, r2_score
%matplotlib inline

#Loading the dataset and creating its object
diabetes_data = datasets.load_diabetes()

#Using one feature
X = diabetes_data.data[:, np.newaxis, 2]

#Splitting the data into training and testing sets
X_train = X[:-35]
X_test = X[-35:]

#Splitting the target into training and testing sets
y_train = diabetes_data.target[:-35]
y_test = diabetes_data.target[-35:]

#Creating linear regression object
SLR_reg = linear_model.LinearRegression()

#Training the model using the training sets
SLR_reg.fit(X_train, y_train)

#Making predictions by using the testing set
y_pred = SLR_reg.predict(X_test)

# Printing Regression Coefficient, Mean Squared Error(MSE), Variance Score.
Also plotting the regression line and labels on it:

print('Coefficients: \n', SLR_reg.coef_)
print("Mean squared error: %.2f" % mean_squared_error(y_test, y_pred))
print('Variance score: %.2f' % r2_score(y_test, y_pred))
plt.scatter(X_test, y_test, color='red')
plt.plot(X_test, y_pred, color='green', linewidth=3)
plt.xticks(())
plt.yticks(())
plt.show()

We will get the following output for the preceding Python program:

Output:

Coefficients:
[963.82249207]
Mean squared error: 3487.66
Variance score: 0.26

Figure 2.4: Simple Linear Regression for diabetes dataset

Multiple Linear Regression (MLR)

MLR, the extension of SLR, predicts a response, that is, a dependent variable, using two or more features or independent variables. Suppose a dataset has n observations and p features; then, the regression line for these features can be calculated with the help of the following equation:

yi = b0 + b1xi1 + b2xi2 + … + bpxip

Where yi is the predicted response value, and b0, b1, b2, …, bp are the regression coefficients. The MLR model also includes an error term known as the residual error (ei). This changes the preceding equation as follows:

yi = b0 + b1xi1 + b2xi2 + … + bpxip + ei

Implementing multiple linear regression in Python:

#Importing necessary packages
import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets, linear_model, metrics
%matplotlib inline

#Loading the dataset and creating its object
#Note: load_boston() was removed in scikit-learn 1.2; on newer versions, use an
#older scikit-learn release or substitute another dataset such as fetch_california_housing()
boston_data = datasets.load_boston(return_X_y=False)

#Defining feature matrix X and response vector y
X = boston_data.data
y = boston_data.target

#Splitting the data into training and testing sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.6, random_state=1)

#Creating regression object and training the model
MLR_reg = linear_model.LinearRegression()
MLR_reg.fit(X_train, y_train)

# Printing Regression Coefficient, and Variance Score.
Also plotting the regression line and labels on it:

print('Coefficients: \n', MLR_reg.coef_)
print('Variance score: {}'.format(MLR_reg.score(X_test, y_test)))
plt.style.use('bmh')
plt.scatter(MLR_reg.predict(X_train), MLR_reg.predict(X_train) - y_train, color = "green", s = 20, label = 'Train_data')
plt.scatter(MLR_reg.predict(X_test), MLR_reg.predict(X_test) - y_test, color = "blue", s = 10, label = 'Test_data')
plt.hlines(y = 0, xmin = 0, xmax = 50, color = 'red', linewidth = 1.25)
plt.legend(loc = 'upper right')
plt.title("Residual errors")
plt.show()

We will get the following output for the preceding Python program:

Output:

Coefficients:
[-7.95572889e-02 7.11808367e-02 5.82382970e-02 1.48237233e+00
-1.67360287e+01 2.95000985e+00 2.33290549e-02 -1.35721280e+00
3.13822151e-01 -1.16929875e-02 -8.07436236e-01 6.67075368e-03
-6.71019667e-01]
Variance score: 0.7325323805669589

Figure 2.5: Multiple Linear Regression

Logistic regression

Logistic regression, a supervised learning classification algorithm, predicts the probability of a target (dependent) variable. Being a classification algorithm, the dependent or target variable can have two possible classes (1 or 0), where 1 stands for success/yes and 0 stands for failure/no. In simple words, the target variable is binary in nature. Logistic regression is one of the simplest machine learning algorithms and can be used for various classification problems such as diabetes prediction, cancer detection, spam detection, and so on.
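Under the hood, logistic regression passes a linear combination of the features through the sigmoid function, which squashes any real-valued score into a probability between 0 and 1. A minimal sketch follows; the weights, bias, and feature values are chosen arbitrarily for illustration:

```python
import numpy as np

def sigmoid(z):
    # Maps any real score z to the open interval (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

# Arbitrary illustrative weights, bias, and one two-feature sample.
w = np.array([1.5, -0.5])
b = 0.2
x = np.array([2.0, 1.0])

p = sigmoid(np.dot(w, x) + b)  # estimated P(class = 1 | x)
print(round(p, 3))  # the linear score 2.7 maps to roughly 0.937
```

Training the model means finding the weights w and bias b that make these probabilities match the training labels as closely as possible, which is exactly what scikit-learn's LogisticRegression does in the example below.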
In the following example, we will implement logistic regression on the digits dataset, which can be loaded from sklearn.datasets:

Implementing the logistic regression algorithm in Python:

#Importing necessary packages
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

#Loading the digits dataset
from sklearn.datasets import load_digits
digits_dataset = load_digits()

# Printing total images and labels in the dataset
print(digits_dataset.data.shape)
print(digits_dataset.target.shape)

Output:

(1797, 64)
(1797,)

The preceding output shows that there are 1797 images (8 by 8 images for a dimensionality of 64) and 1797 labels (integers from 0 to 9).

#Let's have a look at the training data
plt.figure(figsize=(20,4))
for index, (image, label) in enumerate(zip(digits_dataset.data[0:10], digits_dataset.target[0:10])):
    plt.subplot(1, 10, index + 1)
    plt.imshow(np.reshape(image, (8,8)), cmap=plt.cm.gray)
    plt.title('Training: %i\n' % label, fontsize = 20)

Figure 2.6: Training data (0-9 digits) for Logistic Regression

#Splitting the dataset into training and testing sets
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(digits_dataset.data, digits_dataset.target, test_size=0.30, random_state=0) #70% data for training and 30% data for testing

#Import the LogisticRegression class from sklearn and use the fit method to train the model
from sklearn.linear_model import LogisticRegression
logRegression = LogisticRegression() #increase max_iter if a convergence warning appears
logRegression.fit(x_train, y_train)

#Predicting for a single image and for a batch of images
logRegression.predict(x_test[0].reshape(1,-1))
logRegression.predict(x_test[0:10])
y_pred = logRegression.predict(x_test)

#Calculating performance metrics (Confusion matrix, Classification Report and Accuracy).
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
print('Confusion Matrix:-\n', confusion_matrix(y_test, y_pred))
print('Classification Report:-\n', classification_report(y_test, y_pred))
print('Accuracy:-\n', accuracy_score(y_test, y_pred))

We will get the following output for the preceding Python program:

Output:

Confusion Matrix:-
[[45 0 0 0 0 0 0 0 0 0]
[0 47 0 0 0 0 2 0 3 0]
[0 0 51 2 0 0 0 0 0 0]
[0 0 1 52 0 0 0 0 0 1]
[0 0 0 0 48 0 0 0 0 0]
[0 1 0 0 0 55 1 0 0 0]
[0 1 0 0 0 0 59 0 0 0]
[0 1 0 1 1 0 0 50 0 0]
[0 3 1 0 0 0 0 0 55 2]
[0 0 0 1 0 1 0 0 2 53]]
Classification Report:-
precision recall f1-score support
0 1.00 1.00 1.00 45
1 0.89 0.90 0.90 52
2 0.96 0.96 0.96 53
3 0.93 0.96 0.95 54
4 0.98 1.00 0.99 48
5 0.98 0.96 0.97 57
6 0.95 0.98 0.97 60
7 1.00 0.94 0.97 53
8 0.92 0.90 0.91 61
9 0.95 0.93 0.94 57
accuracy 0.95 540
macro avg 0.96 0.96 0.96 540
weighted avg 0.95 0.95 0.95 540
Accuracy:-
0.9537037037037037

The preceding output shows that our model gives 95.37% accuracy.

Decision tree algorithm

Decision trees are among the most powerful supervised learning classification algorithms. A decision tree works based on a tree structure that has the following two main entities:

Decision nodes: Where the data is split.
Leaves: Where we get the output.

Let's look at the following binary tree that predicts whether a person is fit or not. In order to predict this, we need to provide various pieces of information, such as the person's eating habits, age, exercise habits, and so on.

Figure 2.7: Decision Tree

Implementing the decision tree algorithm in Python: Let's see how we can implement a Decision Tree Classifier in the Python programming language. For this, we are going to use the Pima Indians Diabetes dataset. You can download it from https://archive.ics.uci.edu/ml/datasets/diabetes and save it to your system.
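Before running the full example, it helps to see how a decision node actually chooses where to split: a common criterion is Gini impurity, which is 0 for a pure node and highest for an evenly mixed one. A small sketch, with made-up label lists for illustration:

```python
from collections import Counter

def gini(labels):
    # Gini impurity: 1 - sum of squared class proportions.
    counts = Counter(labels)
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in counts.values())

print(gini(["fit", "fit", "fit", "fit"]))        # pure node -> 0.0
print(gini(["fit", "fit", "unfit", "unfit"]))    # evenly mixed -> 0.5
print(gini(["fit", "unfit", "unfit", "unfit"]))  # mostly one class -> 0.375
```

At each node, the tree greedily picks the feature and threshold whose split produces the largest drop in impurity, which is what DecisionTreeClassifier does by default in the example that follows.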
#Importing necessary packages
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

#Download the Pima-Indians-Diabetes dataset and read it using Pandas as follows
Data_column_names = ['pregnant', 'glucose', 'bp', 'skin', 'insulin', 'bmi', 'pedigree', 'age', 'label']
Dataset_pima_diabetes = pd.read_csv(r"C:\Users\Desktop\pima-indians-diabetes.csv", header=None, names=Data_column_names)

#With the help of the following script, you can look at the dataset
Dataset_pima_diabetes.head()

Figure 2.8: Pima-Indians Dataset

#Splitting the dataset into features and target variable
feature_columns = ['pregnant', 'insulin', 'bmi', 'age', 'glucose', 'bp', 'pedigree', 'skin'] #Features
X = Dataset_pima_diabetes[feature_columns]
#Target variable
y = Dataset_pima_diabetes.label

#Splitting the dataset for training and testing purposes. Here we are splitting the dataset into 80% training data and 20% testing data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

#Train the model.
We are using the DecisionTreeClassifier() class of Scikit-learn:

DT_classifier = DecisionTreeClassifier()
DT_classifier = DT_classifier.fit(X_train, y_train)

#Make predictions from the trained model
y_pred = DT_classifier.predict(X_test)

#Calculating performance metrics (Confusion matrix, Classification Report and Accuracy) of our decision tree classifier
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
print('Confusion Matrix:-\n', confusion_matrix(y_test, y_pred))
print('Classification Report:-\n', classification_report(y_test, y_pred))
print('Accuracy:-\n', accuracy_score(y_test, y_pred))

We will get the following output for the preceding Python program:

Output:

Confusion Matrix:-
[[78 21]
[25 30]]
Classification Report:-
precision recall f1-score support
0 0.76 0.79 0.77 99
1 0.59 0.55 0.57 55
accuracy 0.70 154
macro avg 0.67 0.67 0.67 154
weighted avg 0.70 0.70 0.70 154
Accuracy:-
0.7012987012987013

You can see that the accuracy of our decision tree classifier is around 70%. With the help of the following code, we can also visualize the decision tree:

#Visualizing our decision tree
import graphviz
from sklearn import tree
dot_data = tree.export_graphviz(DT_classifier, out_file=None, feature_names=feature_columns, class_names=True)
graph = graphviz.Source(dot_data)
graph.render("DTVisualize", view=True)

As output, the preceding code will generate a PDF file named DTVisualize.pdf containing the decision tree for the Pima-Indians-Diabetes dataset. We set the view parameter of graph.render() to True so that it opens the file as well. If you do not want it to open the file automatically, you can set it to False.

Output:

'DTVisualize.pdf'

Random forest

Random forest, a supervised machine learning classification algorithm, creates decision trees on data samples and, after getting the prediction from each of them, selects the best solution by means of voting.
It reduces overfitting by averaging the results, which is why we get better results compared with using a single decision tree. The following figure illustrates the working of the Random Forest algorithm:

Figure 2.9: Working of Random Forest Algorithm

Random forest starts with the selection of random samples from the dataset. It then constructs a decision tree for every sample and gets the prediction result from each of them. Once it has the predictions, it votes among them and selects the most voted prediction as the final result.

Implementing the Random forest algorithm in Python: In the following example, we will implement the Random Forest algorithm on the same dataset, that is, the Pima Indians Diabetes dataset, on which we implemented the Decision Tree Classifier. Let's implement it and see the variation in the accuracy result:

#Importing necessary packages
import pandas as pd
from sklearn.model_selection import train_test_split

#Download the Pima-Indians-Diabetes dataset and read it using Pandas as follows
Data_column_names = ['pregnant', 'glucose', 'bp', 'skin', 'insulin', 'bmi', 'pedigree', 'age', 'label']
Dataset_pima_diabetes = pd.read_csv(r"C:\Users\Desktop\pima-indians-diabetes.csv", header=None, names=Data_column_names)

#Splitting the dataset into features and target variable
feature_columns = ['pregnant', 'insulin', 'bmi', 'age', 'glucose', 'bp', 'pedigree', 'skin'] # Features
X = Dataset_pima_diabetes[feature_columns]
# Target variable
y = Dataset_pima_diabetes.label

#Splitting the dataset for training and testing purposes. Here we are splitting the dataset into 80% training data and 20% testing data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

#Train the model.
We are using the RandomForestClassifier() class of Scikit-learn:

from sklearn.ensemble import RandomForestClassifier
RF_classifier = RandomForestClassifier()
RF_classifier = RF_classifier.fit(X_train, y_train)

#Make predictions from the trained model
y_pred = RF_classifier.predict(X_test)

#Calculating performance metrics (Confusion matrix, Classification Report and Accuracy) of our random forest classifier
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
print('Confusion Matrix:-\n', confusion_matrix(y_test, y_pred))
print('Classification Report:-\n', classification_report(y_test, y_pred))
print('Accuracy:-\n', accuracy_score(y_test, y_pred))

We will get the following output for the preceding Python program:

Output:

Confusion Matrix:-
[[89 10]
[24 31]]
Classification Report:-
precision recall f1-score support
0 0.79 0.90 0.84 99
1 0.76 0.56 0.65 55
accuracy 0.78 154
macro avg 0.77 0.73 0.74 154
weighted avg 0.78 0.78 0.77 154
Accuracy:-
0.7792207792207793

You can see that the accuracy of our random forest classifier is around 78%, which is better than the accuracy of the single decision tree classifier on the same dataset.
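Note that a single train/test split can be noisy, so an accuracy gap between two classifiers is best confirmed with cross-validation. The following sketch compares the two models on a synthetic dataset from scikit-learn rather than the Pima data, which is assumed to live on your disk:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# A synthetic stand-in for a tabular binary classification problem.
X, y = make_classification(n_samples=500, n_features=8, random_state=1)

# 5-fold cross-validated accuracy for each model.
dt_scores = cross_val_score(DecisionTreeClassifier(random_state=1), X, y, cv=5)
rf_scores = cross_val_score(RandomForestClassifier(random_state=1), X, y, cv=5)

print("Decision tree mean accuracy:", round(dt_scores.mean(), 3))
print("Random forest mean accuracy:", round(rf_scores.mean(), 3))
```

Averaging over five folds gives a steadier estimate than one 80/20 split, and on most tabular problems the forest's averaged trees will hold or extend their edge over a single tree.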
