
Deep Learning by John Paul Mueller and Luca Massaron Deep Learning For Dummies® Published by: John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030-5774, www.wiley.com Copyright © 2019 by John Wiley & Sons, Inc., Hoboken, New Jersey Media and software compilation copyright © 2019 by John Wiley & Sons, Inc. All rights reserved. Published simultaneously in Canada No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without the prior written permission of the Publisher. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions. Trademarks: Wiley, For Dummies, the Dummies Man logo, Dummies.com, Making Everything Easier, and related trade dress are trademarks or registered trademarks of John Wiley & Sons, Inc. and may not be used without written permission. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book. LIMIT OF LIABILITY/DISCLAIMER OF WARRANTY: THE PUBLISHER AND THE AUTHOR MAKE NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE ACCURACY OR COMPLETENESS OF THE CONTENTS OF THIS WORK AND SPECIFICALLY DISCLAIM ALL WARRANTIES, INCLUDING WITHOUT LIMITATION WARRANTIES OF FITNESS FOR A PARTICULAR PURPOSE. NO WARRANTY MAY BE CREATED OR EXTENDED BY SALES OR PROMOTIONAL MATERIALS. THE ADVICE AND STRATEGIES CONTAINED HEREIN MAY NOT BE SUITABLE FOR EVERY SITUATION. THIS WORK IS SOLD WITH THE UNDERSTANDING THAT THE PUBLISHER IS NOT ENGAGED IN RENDERING LEGAL, ACCOUNTING, OR OTHER PROFESSIONAL SERVICES. 
IF PROFESSIONAL ASSISTANCE IS REQUIRED, THE SERVICES OF A COMPETENT PROFESSIONAL PERSON SHOULD BE SOUGHT. NEITHER THE PUBLISHER NOR THE AUTHOR SHALL BE LIABLE FOR DAMAGES ARISING HEREFROM. THE FACT THAT AN ORGANIZATION OR WEBSITE IS REFERRED TO IN THIS WORK AS A CITATION AND/OR A POTENTIAL SOURCE OF FURTHER INFORMATION DOES NOT MEAN THAT THE AUTHOR OR THE PUBLISHER ENDORSES THE INFORMATION THE ORGANIZATION OR WEBSITE MAY PROVIDE OR RECOMMENDATIONS IT MAY MAKE. FURTHER, READERS SHOULD BE AWARE THAT INTERNET WEBSITES LISTED IN THIS WORK MAY HAVE CHANGED OR DISAPPEARED BETWEEN WHEN THIS WORK WAS WRITTEN AND WHEN IT IS READ.

For general information on our other products and services, please contact our Customer Care Department within the U.S. at 877-762-2974, outside the U.S. at 317-572-3993, or fax 317-572-4002. For technical support, please visit https://hub.wiley.com/community/support/dummies.

Wiley publishes in a variety of print and electronic formats and by print-on-demand. Some material included with standard print versions of this book may not be included in e-books or in print-on-demand. If this book refers to media such as a CD or DVD that is not included in the version you purchased, you may download this material at http://booksupport.wiley.com. For more information about Wiley products, visit www.wiley.com.

Library of Congress Control Number is available from the publisher: 2019937505

ISBN 978-1-119-54304-6 (pbk); ISBN 978-1-119-54303-9 (ebk); ISBN ePDF 978-1-119-54302-2 (ebk)

Manufactured in the United States of America

10 9 8 7 6 5 4 3 2 1

Contents at a Glance

Introduction
Part 1: Discovering Deep Learning
CHAPTER 1: Introducing Deep Learning
CHAPTER 2: Introducing the Machine Learning Principles
CHAPTER 3: Getting and Using Python
CHAPTER 4: Leveraging a Deep Learning Framework
Part 2: Considering Deep Learning Basics
CHAPTER 5: Reviewing Matrix Math and Optimization
CHAPTER 6: Laying Linear Regression Foundations
CHAPTER 7: Introducing Neural Networks
CHAPTER 8: Building a Basic Neural Network
CHAPTER 9: Moving to Deep Learning
CHAPTER 10: Explaining Convolutional Neural Networks
CHAPTER 11: Introducing Recurrent Neural Networks
Part 3: Interacting with Deep Learning
CHAPTER 12: Performing Image Classification
CHAPTER 13: Learning Advanced CNNs
CHAPTER 14: Working on Language Processing
CHAPTER 15: Generating Music and Visual Art
CHAPTER 16: Building Generative Adversarial Networks
CHAPTER 17: Playing with Deep Reinforcement Learning
Part 4: The Part of Tens
CHAPTER 18: Ten Applications that Require Deep Learning
CHAPTER 19: Ten Must-Have Deep Learning Tools
CHAPTER 20: Ten Types of Occupations that Use Deep Learning
Index

Table of Contents

INTRODUCTION
About This Book
Foolish Assumptions
Icons Used in This Book
Beyond the Book
Where to Go from Here

PART 1: DISCOVERING DEEP LEARNING

CHAPTER 1: Introducing Deep Learning
Defining What Deep Learning Means
Starting from Artificial Intelligence
Considering the role of AI
Focusing on machine learning
Moving from machine learning to deep learning
Using Deep Learning in the Real World
Understanding the concept of learning
Performing deep learning tasks
Employing deep learning in applications
Considering the Deep Learning Programming Environment
Overcoming Deep Learning Hype
Discovering the start-up ecosystem
Knowing when not to use deep learning

CHAPTER 2: Introducing the Machine Learning Principles
Defining Machine Learning
Understanding how machine learning works
Understanding that it’s pure math
Learning by different strategies
Training, validating, and testing data
Looking for generalization
Getting to know the limits of bias
Keeping model complexity in mind
Considering the Many Different Roads to Learning
Understanding there is no free lunch
Discovering the five main approaches
Delving into some different approaches
Awaiting the next breakthrough
Pondering the True Uses of Machine Learning
Understanding machine learning benefits
Discovering machine learning limits

CHAPTER 3: Getting and Using Python
Working with Python in this Book
Obtaining Your Copy of Anaconda
Getting Continuum Analytics Anaconda
Installing Anaconda on Linux
Installing Anaconda on MacOS
Installing Anaconda on Windows
Downloading the Datasets and Example Code
Using Jupyter Notebook
Defining the code repository
Getting and using datasets
Creating the Application
Understanding cells
Adding documentation cells
Using other cell types
Understanding the Use of Indentation
Adding Comments
Understanding comments
Using comments to leave yourself reminders
Using comments to keep code from executing
Getting Help with the Python Language
Working in the Cloud
Using the Kaggle datasets and kernels
Using the Google Colaboratory

CHAPTER 4: Leveraging a Deep Learning Framework
Presenting Frameworks
Defining the differences
Explaining the popularity of frameworks
Defining the deep learning framework
Choosing a particular framework
Working with Low-End Frameworks
Caffe2
Chainer
PyTorch
MXNet
Microsoft Cognitive Toolkit/CNTK
Understanding TensorFlow
Grasping why TensorFlow is so good
Making TensorFlow easier by using TFLearn
Using Keras as the best simplifier
Getting your copy of TensorFlow and Keras
Fixing the C++ build tools error in Windows
Accessing your new environment in Notebook

PART 2: CONSIDERING DEEP LEARNING BASICS

CHAPTER 5: Reviewing Matrix Math and Optimization
Revealing the Math You Really Need
Working with data
Creating and operating with a matrix
Understanding Scalar, Vector, and Matrix Operations
Creating a matrix
Performing matrix multiplication
Executing advanced matrix operations
Extending analysis to tensors
Using vectorization effectively
Interpreting Learning as Optimization
Exploring cost functions
Descending the error curve
Learning the right direction
Updating

CHAPTER 6: Laying Linear Regression Foundations
Combining Variables
Working through simple linear regression
Advancing to multiple linear regression
Including gradient descent
Seeing linear regression in action
Mixing Variable Types
Modeling the responses
Modeling the features
Dealing with complex relations
Switching to Probabilities
Specifying a binary response
Transforming numeric estimates into probabilities
Guessing the Right Features
Defining the outcome of incompatible features
Solving overfitting using selection and regularization
Learning One Example at a Time
Using gradient descent
Understanding how SGD is different

CHAPTER 7: Introducing Neural Networks
Discovering the Incredible Perceptron
Understanding perceptron functionality
Touching the nonseparability limit
Hitting Complexity with Neural Networks
Considering the neuron
Pushing data with feed-forward
Going even deeper into the rabbit hole
Using backpropagation to adjust learning
Struggling with Overfitting
Understanding the problem
Opening the black box

CHAPTER 8: Building a Basic Neural Network
Understanding Neural Networks
Defining the basic architecture
Documenting the essential modules
Solving a simple problem
Looking Under the Hood of Neural Networks
Choosing the right activation function
Relying on a smart optimizer
Setting a working learning rate

CHAPTER 9: Moving to Deep Learning
Seeing Data Everywhere
Considering the effects of structure
Understanding Moore’s implications
Considering what Moore’s Law changes
Discovering the Benefits of Additional Data
Defining the ramifications of data
Considering data timeliness and quality
Improving Processing Speed
Leveraging powerful hardware
Making other investments
Explaining Deep Learning Differences from Other Forms of AI
Adding more layers
Changing the activations
Adding regularization by dropout
Finding Even Smarter Solutions
Using online learning
Transferring learning
Learning end to end

CHAPTER 10: Explaining Convolutional Neural Networks
Beginning the CNN Tour with Character Recognition
Understanding image basics
Explaining How Convolutions Work
Understanding convolutions
Simplifying the use of pooling
Describing the LeNet architecture
Detecting Edges and Shapes from Images
Visualizing convolutions
Unveiling successful architectures
Discussing transfer learning

CHAPTER 11: Introducing Recurrent Neural Networks
Introducing Recurrent Networks
Modeling sequences using memory
Recognizing and translating speech
Placing the correct caption on pictures
Explaining Long Short-Term Memory
Defining memory differences
Walking through the LSTM architecture
Discovering interesting variants
Getting the necessary attention

PART 3: INTERACTING WITH DEEP LEARNING

CHAPTER 12: Performing Image Classification
Using Image Classification Challenges
Delving into ImageNet and MS COCO
Learning the magic of data augmentation
Distinguishing Traffic Signs
Preparing image data
Running a classification task

CHAPTER 13: Learning Advanced CNNs
Distinguishing Classification Tasks
Performing localization
Classifying multiple objects
Annotating multiple objects in images
Segmenting images
Perceiving Objects in Their Surroundings
Discovering how RetinaNet works
Using the Keras-RetinaNet code
Overcoming Adversarial Attacks on Deep Learning Applications
Tricking pixels
Hacking with stickers and other artifacts

CHAPTER 14: Working on Language Processing
Processing Language
Defining understanding as tokenization
Putting all the documents into a bag
Memorizing Sequences that Matter
Understanding semantics by word embeddings
Using AI for Sentiment Analysis

CHAPTER 15: Generating Music and Visual Art
Learning to Imitate Art and Life
Transferring an artistic style
Reducing the problem to statistics
Understanding that deep learning doesn’t create
Mimicking an Artist
Defining a new piece based on a single artist
Combining styles to create new art
Visualizing how neural networks dream
Using a network to compose music

CHAPTER 16: Building Generative Adversarial Networks
Making Networks Compete
Finding the key in the competition
Achieving more realistic results
Considering a Growing Field
Inventing realistic pictures of celebrities
Enhancing details and image translation

CHAPTER 17: Playing with Deep Reinforcement Learning
Playing a Game with Neural Networks
Introducing reinforcement learning
Simulating game environments
Presenting Q-learning
Explaining Alpha-Go
Determining if you’re going to win
Applying self-learning at scale

PART 4: THE PART OF TENS

CHAPTER 18: Ten Applications that Require Deep Learning
Restoring Color to Black-and-White Videos and Pictures
Approximating Person Poses in Real Time
Performing Real-Time Behavior Analysis
Translating Languages
Estimating Solar Savings Potential
Beating People at Computer Games
Generating Voices
Predicting Demographics
Creating Art from Real-World Pictures
Forecasting Natural Catastrophes

CHAPTER 19: Ten Must-Have Deep Learning Tools
Compiling Math Expressions Using Theano
Augmenting TensorFlow Using Keras
Dynamically Computing Graphs with Chainer
Creating a MATLAB-Like Environment with Torch
Performing Tasks Dynamically with PyTorch
Accelerating Deep Learning Research Using CUDA
Supporting Business Needs with Deeplearning4j
Mining Data Using Neural Designer
Training Algorithms Using Microsoft Cognitive Toolkit (CNTK)
Exploiting Full GPU Capability Using MXNet

CHAPTER 20: Ten Types of Occupations that Use Deep Learning
Managing People
Improving Medicine
Developing New Devices
Providing Customer Support
Seeing Data in New Ways
Performing Analysis Faster
Creating a Better Work Environment
Researching Obscure or Detailed Information
Designing Buildings
Enhancing Safety

INDEX

Introduction

When you talk to some people about deep learning, they think of some deep dark mystery, but deep learning really isn’t a mystery at all. You use it every time you talk to your smartphone, so you have it with you every day. In fact, you find deep learning used everywhere. For example, you see it when using many applications online and even when you shop. You are surrounded by deep learning and don’t even realize it, which makes learning about deep learning essential because you can use it to do so much more than you might think possible.

Other people have another view of deep learning that has no basis in reality. They think that somehow deep learning will be responsible for some dire apocalypse, but that really isn’t possible with today’s technology.
More likely is that someone will find a way to use deep learning to create fake people in order to commit crimes or to bilk the government out of thousands of dollars. However, killer robots are most definitely not part of the future.

Whether you’re part of the mystified crowd or the killer robot crowd, we hope that you’ll read Deep Learning For Dummies with the goal of understanding what deep learning can actually do. This technology can probably do a lot more in the way of mundane tasks than you think possible, but it also has limits, and you need to know about both.

About This Book

When you work through Deep Learning For Dummies, you gain access to a lot of example code that will run on a standard Mac, Linux, or Windows system. You can also run the code online using something like Google Colab. (We provide pointers on how to get the information you need to do this.) Special equipment, such as a GPU, will make the examples run faster. However, the point of this book is that you can create deep learning code no matter what sort of machine you have as long as you’re willing to wait for some of it to complete. (We tell you which examples take a long time to run.)

The first part of this book gives you some starter information so that you don’t get completely lost before you start. You discover how to install the various products you need and gain an understanding of some essential math. The beginning examples are more along the lines of standard regression and machine learning, but you need this basis to gain a full appreciation of just what deep learning can do for you.

After you get past these initial bits of information, you start to do some pretty amazing things. For example, you discover how to generate your own art and perform other tasks that you might have assumed to require a lot of coding and some special hardware to accomplish.
By the end of the book, you’ll be amazed by what you can do, even if you don’t have an advanced machine learning or deep learning degree.

To make absorbing the concepts even easier, this book uses the following conventions:

»» Text that you’re meant to type just as it appears in the book is in bold. The exception is when you’re working through a step list: Because each step is bold, the text to type is not bold.
»» When you see words in italics as part of a typing sequence, you need to replace that value with something that works for you. For example, if you see “Type Your Name and press Enter,” you need to replace Your Name with your actual name.
»» Web addresses and programming code appear in monofont. If you’re reading a digital version of this book on a device connected to the Internet, you can click or tap the web address to visit that website, like this: http://www.dummies.com.
»» When you need to type command sequences, you see them separated by a special arrow, like this: File ➪ New File. In this example, you go to the File menu first and then select the New File entry on that menu.

Foolish Assumptions

You might find it difficult to believe that we’ve assumed anything about you. After all, we haven’t even met you yet! Although most assumptions are indeed foolish, we made these assumptions to provide a starting point for the book.

You need to be familiar with the platform you want to use because the book doesn’t offer any guidance in this regard. (Chapter 3 does, however, provide Anaconda installation instructions, and Chapter 4 helps you install the TensorFlow and Keras frameworks used for this book.) To give you the maximum information about Python concerning how it applies to deep learning, this book doesn’t discuss any platform-specific issues. You really do need to know how to install applications, use applications, and generally work with your chosen platform before you begin working with this book.
You must know how to work with Python. You can find a wealth of tutorials online (see https://www.w3schools.com/python/ and https://www.tutorialspoint.com/python/ as examples).

This book isn’t a math primer. Yes, you see many examples of complex math, but the emphasis is on helping you use Python to perform deep learning tasks rather than teaching math theory. We include some examples that also discuss the use of machine learning as it applies to deep learning. Chapters 1 and 2 give you a better understanding of precisely what you need to know to use this book successfully.

This book also assumes that you can access items on the Internet. Sprinkled throughout are numerous references to online material that will enhance your learning experience. However, these added sources are useful only if you actually find and use them.

Icons Used in This Book

As you read this book, you see icons in the margins that indicate material of interest (or not, as the case may be). This section briefly describes each icon in this book.

Tips are nice because they help you save time or perform some task without a lot of extra work. The tips in this book are time-saving techniques or pointers to resources that you should try so that you can get the maximum benefit from Python or from performing deep learning–related tasks.

We don’t want to sound like angry parents or some kind of maniacs, but you should avoid doing anything that’s marked with a Warning icon. Otherwise, you might find that your application fails to work as expected, you get incorrect answers from seemingly bulletproof algorithms, or (in the worst-case scenario) you lose data.

Whenever you see this icon, think advanced tip or technique. You might find these tidbits of useful information just too boring for words, or they could contain the solution you need to get a program running. Skip these bits of information whenever you like.
If you don’t get anything else out of a particular chapter or section, remember the material marked by this icon. This text usually contains an essential process or a bit of information that you must know to work with Python or to perform deep learning–related tasks successfully.

Beyond the Book

This book isn’t the end of your Python or deep learning experience; it’s really just the beginning. We provide online content to make this book more flexible and better able to meet your needs. That way, as we receive e-mail from you, we can address questions and tell you how updates to either Python or its associated add-ons affect book content. In fact, you gain access to all these cool additions:

»» Cheat sheet: You remember using crib notes in school to make a better mark on a test, don’t you? You do? Well, a cheat sheet is sort of like that. It provides you with some special notes about tasks that you can do with Python, machine learning, and data science that not every other person knows. You can find the cheat sheet by going to www.dummies.com, searching this book’s title, and scrolling down the page that appears. The cheat sheet contains really neat information such as the most common programming mistakes that cause people woe when using Python.
»» Updates: Sometimes changes happen. For example, we might not have seen an upcoming change when we looked into our crystal ball during the writing of this book. In the past, this possibility simply meant that the book became outdated and less useful, but you can now find updates to the book by searching this book’s title at www.dummies.com. In addition to these updates, check out the blog posts with answers to reader questions and demonstrations of useful book-related techniques at http://blog.johnmuellerbooks.com/.
»» Companion files: Hey! Who really wants to type all the code in the book and reconstruct all those neural networks manually?
Most readers would prefer to spend their time actually working with Python, performing machine learning or deep learning tasks, and seeing the interesting things they can do, rather than typing. Fortunately for you, the examples used in the book are available for download, so all you need to do is read the book to learn how to use Python for deep learning. You can find these files at www.dummies.com. Search this book’s title, and on the page that appears, scroll down to the image of the book cover and click it. Then click the More about This Book button and on the page that opens, go to the Downloads tab.

Where to Go from Here

It’s time to start your Python for deep learning adventure! If you’re completely new to Python and its use for deep learning tasks, you should start with Chapter 1 and progress through the book at a pace that allows you to absorb as much of the material as possible.

If you’re a novice who’s in an absolute rush to get going with Python for deep learning as quickly as possible, you can skip to Chapter 3 with the understanding that you may find some topics a bit confusing later. Skipping to Chapter 4 is okay if you already have Anaconda (the programming product used in the book) installed, but be sure to at least skim Chapter 3 so that you know what assumptions we made when writing this book.

This book relies on a combination of TensorFlow and Keras to perform deep learning tasks. Even if you’re an advanced reader, you need to go to Chapter 4 to discover how to configure the environment used for this book. Failure to configure the environment according to instructions will almost certainly cause failures when you try to run the code.

Part 1: Discovering Deep Learning

IN THIS PART...

Understand how deep learning impacts the world around us.
Consider the relationship between deep learning and machine learning.
Create a Python setup of your own.
Define the need for a framework in deep learning.
IN THIS CHAPTER
»» Understanding deep learning
»» Working with deep learning
»» Developing deep learning applications
»» Considering deep learning limitations

Chapter 1
Introducing Deep Learning

You have probably heard a lot about deep learning. The term appears all over the place and seems to apply to everything. In reality, deep learning is a subset of machine learning, which in turn is a subset of artificial intelligence (AI). The first goal of this chapter is to help you understand what deep learning is really all about and how it applies to the world today. You may be surprised to learn that deep learning isn’t the only game in town; other methods of analyzing data exist. In fact, deep learning meets a specific set of needs when it comes to data analysis, so you might be using other methods and not even know it.

Deep learning is just a subset of AI, but it’s an important subset. You see deep learning techniques used for a number of tasks, but not every task. In fact, some people associate deep learning with tasks that it can’t perform. The next step in discovering deep learning is to understand what it can and can’t do for you.

As part of working with deep learning in this book, you write applications that rely on deep learning to process data and then produce a desired output. Of course, you need to know a little about the programming environment before you can do much. Even though Chapter 3 discusses how to install and configure Python, the language used to demonstrate deep learning in this book, you first need to know a little more about the options available to you.

The chapter closes with a discussion of why deep learning shouldn’t be the only data processing technique in your toolkit. Yes, deep learning can perform amazing tasks when used appropriately, but it can also cause serious problems when applied to problems that it doesn’t support well.
Sometimes you need to look to other technologies to perform a given task, or figure out which technologies to use with deep learning to provide a more efficient and elegant solution to specific problems.

Defining What Deep Learning Means

An understanding of deep learning begins with a precise definition of terms. Otherwise, you have a hard time separating the media hype from the realities of what deep learning can actually provide. Deep learning is part of both AI and machine learning, as shown in Figure 1-1. To understand deep learning, you must begin at the outside: you start with AI, work your way through machine learning, and finally define deep learning. The following sections help you through this process.

FIGURE 1-1: Deep learning is a subset of machine learning, which is a subset of AI.

Starting from Artificial Intelligence

Saying that AI is an artificial intelligence doesn’t really tell you anything meaningful, which is why so many discussions and disagreements arise over this term. Yes, you can argue that what occurs is artificial, not having come from a natural source. However, the intelligence part is, at best, ambiguous. People define intelligence in many different ways. However, you can say that intelligence involves certain mental exercises composed of the following activities:

»» Learning: Having the ability to obtain and process new information.
»» Reasoning: Being able to manipulate information in various ways.
»» Understanding: Considering the result of information manipulation.
»» Grasping truths: Determining the validity of the manipulated information.
»» Seeing relationships: Divining how validated data interacts with other data.
»» Considering meanings: Applying truths to particular situations in a manner consistent with their relationship.
»» Separating fact from belief: Determining whether the data is adequately supported by provable sources that can be demonstrated to be consistently valid.

The list could easily get quite long, but even this list is relatively prone to interpretation by anyone who accepts it as viable. As you can see from the list, however, intelligence often follows a process that a computer system can mimic as part of a simulation:

1. Set a goal based on needs or wants.
2. Assess the value of any currently known information in support of the goal.
3. Gather additional information that could support the goal.
4. Manipulate the data such that it achieves a form consistent with existing information.
5. Define the relationships and truth values between existing and new information.
6. Determine whether the goal is achieved.
7. Modify the goal in light of the new data and its effect on the probability of success.
8. Repeat Steps 2 through 7 as needed until the goal is achieved (found true) or the possibilities for achieving it are exhausted (found false).

Even though you can create algorithms and provide access to data in support of this process within a computer, a computer’s capability to achieve intelligence is severely limited. For example, a computer is incapable of understanding anything because it relies on machine processes to manipulate data using pure math in a strictly mechanical fashion. Likewise, computers can’t easily separate truth from mistruth. In fact, no computer can fully implement any of the mental activities described in the list that describes intelligence.

When thinking about AI, you must consider the goals of the people who develop an AI. The goal is to mimic human intelligence, not replicate it. A computer doesn’t truly think, but it gives the appearance of thinking. However, a computer actually provides this appearance only in the logical/mathematical form of intelligence.
A computer is moderately successful in mimicking visual-spatial and bodily-kinesthetic intelligence. A computer has a low, passable capability in interpersonal and linguistic intelligence. Unlike humans, however, a computer has no way to mimic intrapersonal or creative intelligence.

Considering the role of AI

As described in the previous section, the first concept that’s important to understand is that AI doesn’t really have anything to do with human intelligence. Yes, some AI is modeled to simulate human intelligence, but that’s what it is: a simulation. When thinking about AI, notice that an interplay exists between goal seeking, data processing used to achieve that goal, and data acquisition used to better understand the goal. AI relies on algorithms to achieve a result that may or may not have anything to do with human goals or methods of achieving those goals. With this in mind, you can categorize AI in four ways:

»» Acting humanly: When a computer acts like a human, it best reflects the Turing test, in which the computer succeeds when differentiation between the computer and a human isn’t possible (see http://www.turing.org.uk/scrapbook/test.html for details). This category also reflects what the media would have you believe that AI is all about. You see it employed for technologies such as natural language processing, knowledge representation, automated reasoning, and machine learning (all four of which must be present to pass the test).

The original Turing Test didn’t include any physical contact. The newer, Total Turing Test does include physical contact in the form of perceptual ability interrogation, which means that the computer must also employ both computer vision and robotics to succeed. Modern techniques include the idea of achieving the goal rather than mimicking humans completely.
For example, the Wright brothers didn’t succeed in creating an airplane by precisely copying the flight of birds; rather, the birds provided ideas that led to aerodynamics, which in turn eventually led to human flight. The goal is to fly. Both birds and humans achieve this goal, but they use different approaches.

»» Thinking humanly: When a computer thinks as a human, it performs tasks that require intelligence (as contrasted with rote procedures) from a human to succeed, such as driving a car. To determine whether a program thinks like a human, you must have some method of determining how humans think, which the cognitive modeling approach defines. This model relies on three techniques:

  Introspection: Detecting and documenting the techniques used to achieve goals by monitoring one’s own thought processes.
  Psychological testing: Observing a person’s behavior and adding it to a database of similar behaviors from other persons given a similar set of circumstances, goals, resources, and environmental conditions (among other things).
  Brain imaging: Monitoring brain activity directly through various mechanical means, such as Computerized Axial Tomography (CAT), Positron Emission Tomography (PET), Magnetic Resonance Imaging (MRI), and Magnetoencephalography (MEG).

After creating a model, you can write a program that simulates the model. Given the amount of variability among human thought processes and the difficulty of accurately representing these thought processes as part of a program, the results are experimental at best. This category of thinking humanly is often used in psychology and other fields in which modeling the human thought process to create realistic simulations is essential.

»» Thinking rationally: Studying how humans think using some standard enables the creation of guidelines that describe typical human behaviors. A person is considered rational when following these behaviors within certain levels of deviation.
A computer that thinks rationally relies on the recorded behaviors to create a guide as to how to interact with an environment based on the data at hand. The goal of this approach is to solve problems logically, when possible. In many cases, this approach would enable the creation of a baseline technique for solving a problem, which would then be modified to actually solve the problem. In other words, the solving of a problem in principle is often different from solving it in practice, but you still need a starting point.

»» Acting rationally: Studying how humans act in given situations under specific constraints enables you to determine which techniques are both efficient and effective. A computer that acts rationally relies on the recorded actions to interact with an environment based on conditions, environmental factors, and existing data. As with rational thought, rational acts depend on a solution in principle, which may not prove useful in practice. However, rational acts do provide a baseline upon which a computer can begin negotiating the successful completion of a goal.

HUMAN VERSUS RATIONAL PROCESSES

Human processes differ from rational processes in their outcome. A process is rational if it always does the right thing based on the current information, given an ideal performance measure. In short, rational processes go by the book and assume that “the book” is actually correct. Human processes involve instinct, intuition, and other variables that don’t necessarily reflect the book and may not even consider the existing data. As an example, the rational way to drive a car is to always follow the laws. However, traffic isn’t rational. If you follow the laws precisely, you end up stuck somewhere because other drivers aren’t following the laws precisely. To be successful, a self-driving car must therefore act humanly, rather than rationally.

You find AI used in a great many applications today.
The only problem is that the technology works so well that you don’t even know it exists. In fact, you might be surprised to find that many devices in your home already make use of this technology. The uses for AI number in the millions, all safely out of sight even when they’re quite dramatic in nature. Here are just a few of the ways in which you might see AI used:

»» Fraud detection: You get a call from your credit card company asking whether you made a particular purchase. The credit card company isn’t being nosy; it’s simply alerting you to the fact that someone else could be making a purchase using your card. The AI embedded within the credit card company’s code detected an unfamiliar spending pattern and alerted someone to it.

»» Resource scheduling: Many organizations need to schedule the use of resources efficiently. For example, a hospital may have to determine where to put a patient based on the patient’s needs, availability of skilled experts, and the amount of time the doctor expects the patient to be in the hospital.

»» Complex analysis: Humans often need help with complex analysis because there are literally too many factors to consider. For example, the same set of symptoms could indicate more than one problem. A doctor or other expert might need help making a diagnosis in a timely manner to save a patient’s life.

»» Automation: Any form of automation can benefit from the addition of AI to handle unexpected changes or events. A problem with some types of automation today is that an unexpected event, such as an object in the wrong place, can actually cause the automation to stop. Adding AI to the automation can allow the automation to handle unexpected events and continue as though nothing happened.

»» Customer service: The customer service line you call today may not even have a human behind it.
The automation is good enough to follow scripts and use various resources to handle the vast majority of your questions. With good voice inflection (provided by AI as well), you may not even be able to tell that you’re talking with a computer.

»» Safety systems: Many of the safety systems found in machines of various sorts today rely on AI to take over the vehicle in a time of crisis. For example, many automatic braking systems rely on AI to stop the car based on all the inputs that a vehicle can provide, such as the direction of a skid.

»» Machine efficiency: AI can help control a machine in such a manner as to obtain maximum efficiency. The AI controls the use of resources so that the system doesn’t overshoot speed or other goals. Every ounce of power is used precisely as needed to provide the desired services.

Focusing on machine learning

Machine learning is one of a number of subsets of AI and the only one this book discusses. In machine learning, the goal is to create a simulation of human learning so that an application can adapt to uncertain or unexpected conditions. To perform this task, machine learning relies on algorithms to analyze huge datasets.

Currently, machine learning can’t provide the sort of AI that the movies present (a machine can’t intuitively learn as a human can); it can only simulate specific kinds of learning, and only in a narrow range at that. Even the best algorithms can’t think, feel, present any form of self-awareness, or exercise free will. Characteristics that are basic to humans are frustratingly difficult for machines to grasp because of these limits in perception. Machines aren’t self-aware.

What machine learning can do is perform predictive analytics far faster than any human can. As a result, machine learning can help humans work more efficiently. The current state of AI, then, is one of performing analysis, but humans must still consider the implications of that analysis: making the required moral and ethical decisions.
The essence of the matter is that machine learning provides just the learning part of AI, and that part is nowhere near ready to create an AI of the sort you see in films.

The main point of confusion between learning and intelligence is that people assume that simply because a machine gets better at its job (it can learn), it’s also aware (has intelligence). Nothing supports this view of machine learning. The same phenomenon occurs when people assume that a computer is purposely causing problems for them. The computer can’t assign emotions and therefore acts only upon the input provided and the instructions contained within an application to process that input. A true AI will eventually occur when computers can finally emulate the clever combination used by nature:

»» Genetics: Slow learning from one generation to the next
»» Teaching: Fast learning from organized sources
»» Exploration: Spontaneous learning through media and interactions with others

To keep machine learning concepts in line with what the machine can actually do, you need to consider specific machine learning uses. It’s useful to view uses of machine learning outside the normal realm of what many consider the domain of AI. Here are a few uses for machine learning that you might not associate with an AI:

»» Access control: In many cases, access control is a yes-or-no proposition. An employee smartcard grants access to a resource in much the same way as people have used keys for centuries. Some locks do offer the capability to set times and dates that access is allowed, but such coarse-grained control doesn’t really answer every need. By using machine learning, you can determine whether an employee should gain access to a resource based on role and need. For example, an employee can gain access to a training room when the training reflects an employee role.
»» Animal protection: The ocean might seem large enough to allow animals and ships to coexist without problem. Unfortunately, many animals get hit by ships each year. A machine learning algorithm could allow ships to avoid animals by learning the sounds and characteristics of both the animal and the ship. (The ship would rely on underwater listening gear to track the animals through their sounds, which you can actually hear a long distance from the ship.)

»» Predicting wait times: Most people don’t like waiting when they have no idea of how long the wait will be. Machine learning allows an application to determine waiting times based on staffing levels, staffing load, complexity of the problems the staff is trying to solve, availability of resources, and so on.

Moving from machine learning to deep learning

Deep learning is a subset of machine learning, as previously mentioned. In both cases, algorithms appear to learn by analyzing huge amounts of data (although learning can occur even with tiny datasets in some cases). However, deep learning varies in the depth of its analysis and the kind of automation it provides. You can summarize the differences between the two like this:

»» A completely different paradigm: Machine learning is a set of many different techniques that enable a computer to learn from data and to use what it learns to provide an answer, often in the form of a prediction. Machine learning relies on different paradigms such as using statistical analysis, finding analogies in data, using logic, and working with symbols. Contrast the myriad techniques used by machine learning with the single technique used by deep learning, which mimics human brain functionality. It processes data using computing units, called neurons, arranged into ordered sections, called layers. The technique at the foundation of deep learning is the neural network.
»» Flexible architectures: Machine learning solutions offer many knobs (adjustments) called hyperparameters that you tune to optimize algorithm learning from data. Deep learning solutions use hyperparameters, too, but they also use multiple user-configured layers (the user specifies number and type). In fact, depending on the resulting neural network, the number of layers can be quite large and form unique neural networks capable of specialized learning: Some can learn to recognize images, while others can detect and parse voice commands. The point is that the term deep is appropriate; it refers to the large number of layers potentially used for analysis. The architecture consists of the ensemble of different neurons and their arrangement in layers in a deep learning solution.

»» Autonomous feature definition: Machine learning solutions require human intervention to succeed. To process data correctly, analysts and scientists use a lot of their own knowledge to develop working algorithms. For instance, in a machine learning solution that determines the value of a house by relying on data containing the wall measurements of different rooms, the machine learning algorithm won’t be able to calculate the surface area of the house unless the analyst specifies how to calculate it beforehand. Creating the right information for a machine learning algorithm is called feature creation, which is a time-consuming activity. Deep learning doesn’t require humans to perform any feature-creation activity because, thanks to its many layers, it defines its own best features. That’s also why deep learning outperforms machine learning in otherwise very difficult tasks such as recognizing voice and images, understanding text, or beating a human champion at the Go game (the digital form of the board game in which you capture your opponent’s territory).
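The house example in the last bullet can be made concrete. What follows is a minimal sketch, in plain Python, of the manual feature-creation step an analyst might perform before handing data to a machine learning algorithm. The `Room` class, the `floor_area` helper, and the measurements are hypothetical, invented purely for illustration; they don't come from the book's downloadable code, and the sketch assumes every room is a simple rectangle.

```python
from dataclasses import dataclass


@dataclass
class Room:
    """Hypothetical raw record: wall measurements of one rectangular room, in meters."""
    name: str
    width_m: float
    length_m: float


def floor_area(rooms):
    """Engineered feature: total floor area, which the raw wall data never states directly."""
    return sum(room.width_m * room.length_m for room in rooms)


# Made-up measurements for a small house (illustrative values only).
house = [
    Room("kitchen", 3.0, 4.0),
    Room("bedroom", 4.0, 5.0),
    Room("bathroom", 2.0, 2.5),
]

# The analyst adds the derived features before any learning takes place.
features = {"room_count": len(house), "floor_area_m2": floor_area(house)}
print(features)
```

In a deep learning solution, you would instead feed the raw measurements to the network and rely on its layers to work out a useful combination on their own, provided that you have enough examples to learn from.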
You need to understand a number of issues with regard to deep learning solutions, the most important of which is that the computer still doesn’t understand anything and isn’t aware of the solution it has provided. It simply provides a form of feedback loop and automation conjoined to produce desirable outputs in less time than a human could manually produce precisely the same result by manipulating a machine learning solution.

The second issue is that some benighted people have insisted that the deep learning layers are hidden and not accessible to analysis. This isn’t the case. Anything a computer can build is ultimately traceable by a human. In fact, the General Data Protection Regulation (GDPR) (https://eugdpr.org/) requires that humans perform such analysis (see the article at https://www.pcmag.com/commentary/361258/how-gdpr-will-impact-the-ai-industry for details). The requirement to perform this analysis is controversial, but current law says that someone must do it.

The third issue is that self-adjustment goes only so far. Deep learning doesn’t always ensure a reliable or correct result. In fact, deep learning solutions can go horribly wrong (see the article at https://www.theverge.com/2016/3/24/11297050/tay-microsoft-chatbot-racist for details). Even when the application code doesn’t go wrong, the devices used to support the deep learning can (see the article at https://www.pcmag.com/commentary/361918/learning-from-alexas-mistakes?source=SectionArticles for details). Even so, with these problems in mind, you can see deep learning used for a number of extremely popular applications, as described at https://medium.com/@vratulmittal/top-15-deep-learning-applications-that-will-rule-the-world-in-2018-and-beyond-7c6130c43b01.

Using Deep Learning in the Real World

Make no mistake: People do use deep learning in the real world to perform a broad range of tasks.
For example, many automobiles today use a voice interface. The voice interface can perform basic tasks, even right from the outset. However, the more you talk to it, the better the voice interface performs. The interface learns as you talk to it, picking up not only the manner in which you say things but also your personal preferences. The following sections give you a little information on how deep learning works in the real world.

Understanding the concept of learning

When humans learn, they rely on more than just data. Humans have intuition, along with an uncanny grasp of what will and what won’t work. Part of this inborn knowledge is instinct, which is passed from generation to generation through DNA. The way humans interact with input is also different from what a computer will do. When dealing with a computer, learning is a matter of building a database consisting of a neural network that has weights and biases built into it to ensure proper data processing. The neural network then processes data, but not in a manner that’s even remotely the same as what a human will do.

Performing deep learning tasks

Humans and computers are best at different tasks. Humans are best at reasoning, thinking through ethical solutions, and being emotional. A computer is meant to process data (lots of data) really fast. You commonly use deep learning to solve problems that require looking for patterns in huge amounts of data: problems whose solution is nonintuitive and not immediately noticeable. The article at http://www.yaronhadad.com/deep-learning-most-amazing-applications/ tells you about 30 different ways in which people are currently using deep learning to perform tasks. In just about every case, you can sum up the problem and its solution as processing huge amounts of data quickly, looking for patterns, and then relying on those patterns to discover something new or to create a particular kind of output.
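To make the earlier mention of weights and biases a little less abstract, here is a minimal sketch of two tiny layers of neurons written in plain Python. Treat it as an illustration under stated assumptions rather than as the way this book builds models (the book relies on TensorFlow and Keras): the function names are hypothetical, the sigmoid activation is an assumption chosen for simplicity, and the weights and biases are arbitrary hand-picked numbers, whereas training would normally adjust them.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron(inputs, weights, bias):
    """One neuron: weighted sum of the inputs plus a bias, passed through an activation."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return sigmoid(total)


def layer(inputs, weight_rows, biases):
    """One layer: several neurons that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]


# Arbitrary illustrative values; training would normally adjust these numbers.
inputs = [0.5, -1.0]
hidden = layer(inputs, [[1.0, 2.0], [-1.0, 0.5]], [0.0, 1.0])
output = layer(hidden, [[1.0, -1.0]], [0.0])
print(hidden, output)
```

Notice that nothing here resembles understanding: each neuron only multiplies, adds, and squashes numbers. Whatever the network appears to learn is stored entirely in the values of those weights and biases.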
Employing deep learning in applications

Deep learning can be a stand-alone solution, as illustrated in this book, but it’s often used as part of a much larger solution and mixed with other technologies. For example, mixing deep learning with expert systems is not uncommon. The article at https://www.sciencedirect.com/science/article/pii/0167923694900213 describes this mixture to some degree. However, real applications are more than just numbers generated from some nebulous source. When working in the real world, you must also consider various kinds of data sources and understand how those data sources work. A camera may require a different sort of deep learning solution to obtain information from it, while a thermometer or proximity detector may output simple numbers (or analog data that requires some sort of processing to use). Real-world solutions are messy, so you need to be prepared with more than one solution to problems in your toolkit.

Considering the Deep Learning Programming Environment

You may automatically assume that you must jump through a horrid set of hoops and learn esoteric programming skills to delve into deep learning. It’s true that you gain flexibility by writing applications using one of the programming languages that work well for deep learning needs. However, Deep Learning Studio (see the article at https://towardsdatascience.com/is-deep-learning-without-programming-possible-be1312df9b4a for details) and other products like it are enabling people to create deep learning solutions without programming. Essentially, such solutions involve describing what you want as output by defining a model graphically. These kinds of solutions work well for straightforward problems that others have already had to solve, but they lack the flexibility to do something completely different, that is, a task that requires something more than simple analysis.
Deep learning solutions in the cloud, such as that provided by Amazon Web Services (AWS) (https://aws.amazon.com/deep-learning/), can give you additional flexibility. These environments also tend to make the development environment simpler by providing as much or as little support as you want. In fact, AWS provides support for various kinds of serverless computing (https://aws.amazon.com/serverless/) in which you don’t worry about any sort of infrastructure. However, these solutions can become quite expensive. Even though they give you greater flexibility than using a premade solution, they still aren’t as flexible as using an actual development environment.

You have other nonprogramming solutions to consider as well. For example, if you want power and flexibility, but don’t want to program to get it, you could rely on a product such as MATLAB (https://www.mathworks.com/help/deeplearning/ug/deep-learning-in-matlab.html), which provides a deep learning toolkit. MATLAB and certain other environments do focus more on the algorithms you want to use, but to gain full functionality from them, you need to write scripts as a minimum, which means that you’re dipping your toe into programming to some extent. A problem with these environments is that they can also be lacking in the power department, so some solutions may take longer than you expect.

At some point, no matter how many other solutions you try, serious deep learning problems will require programming. When reviewing the choices online, you often see AI, machine learning, and deep learning all lumped together. However, just as the three technologies work at different levels, so do the programming languages that you require. A good deep learning solution will require the use of multiprocessing, preferably using a Graphics Processing Unit (GPU) with lots of cores. Your language of choice must also support the GPU through a compatible library or package.
So, just choosing a language usually isn’t enough; you need to investigate further to ensure that the language will actually meet your needs. With this caution in mind, here are the top languages (in order of popularity, as of this writing) for deep learning use (as defined at https://www.datasciencecentral.com/profiles/blogs/which-programming-language-is-considered-to-be-best-for-machine):

»» Python
»» R
»» MATLAB (the scripting language, not the product)
»» Octave

The only problem with this list is that other developers have other opinions. Python and R normally appear at the top of everyone’s lists, but after that you can find all sorts of opinions. The article at https://www.geeksforgeeks.org/top-5-best-programming-languages-for-artificial-intelligence-field/ gives you some alternative ideas. When choosing a language, you usually have to consider these issues:

»» Learning curve: Your experiences have a lot to say about what you find easiest to learn. Python is probably the best choice for someone who has programmed for a number of years, but R might be the better choice for someone who has already experienced functional programming. MATLAB or Octave might work best for a math professional.

»» Speed: Any sort of deep learning solution will require a lot of processing power. Many people say that because R is a statistical language, it offers more in the way of statistical support and usually provides a faster result. Actually, Python’s support for great parallel programming probably offsets this advantage when you have the required hardware.

»» Community support: Many forms of community support exist, but the two that are most important for deep learning are help in defining a solution and access to a wealth of premade programming aids. Of the four, Octave probably provides the least in the way of community support; Python provides the most.
»» Cost: How much a language costs depends on the kind of solution you choose and where you run it. For example, MATLAB is a proprietary product that requires purchase, so you have something invested immediately when using MATLAB. However, even though the other languages are free at the outset, you can find hidden costs, such as running your code in the cloud to gain access to GPU support.

»» DNN Frameworks support: A framework can make working with your language significantly easier. However, you have to have a framework that works well with all other parts of your solution. The two most popular frameworks are TensorFlow and PyTorch. Oddly enough, Python is the only language that supports both, so it offers you the greatest flexibility. You use Caffe with MATLAB and TensorFlow with R.

»» Production ready: A language has to support the kind of output needed for your project. In this regard, Python shines because it’s a general-purpose language. You can create any sort of application needed with it. However, the more specific environments provided by the other languages can be incredibly helpful with some projects, so you need to consider all of them.

Overcoming Deep Learning Hype

Previous parts of this chapter discuss some issues with the perception of deep learning, such as some people’s belief that it appears everywhere and does everything. The problem with deep learning is that it has been a victim of its own media campaign. Deep learning solves specific sorts of problems. The following sections help you avoid the hype associated with deep learning.

Discovering the start-up ecosystem

Using a deep learning solution is a lot different from creating a deep learning solution of your own. The infographic at https://www.analyticsvidhya.com/blog/2018/08/infographic-complete-deep-learning-path/ gives you some ideas on how to get started with Python (a process this book simplifies for you).
The educational requirements alone can take a while to fulfill. However, after you have worked through a few projects on your own, you begin to realize that the hype surrounding deep learning extends all the way to the start of setup. Deep learning isn’t a mature technology, so trying to use it is akin to building a village on the moon or deep diving the Mariana Trench. You’re going to encounter issues, and the technology will constantly change on you.

Some of the methods used to create deep learning solutions need work, too. The concept of a computer actually learning anything in the human sense is false, as is the idea that computers have any form of sentience at all. The reason that Microsoft, Amazon, and other vendors have problems with deep learning is that even their engineers have unrealistic expectations. Deep learning comes down to math and pattern matching (really fancy math and pattern matching, to be sure), but the idea that it’s anything else is simply wrong.

Knowing when not to use deep learning

Deep learning is only one way to perform analysis, and it’s not always the best way. For example, even though expert systems are considered old technology, you can’t really create a self-driving car without one for the reasons described at https://aitrends.com/ai-insider/expert-systems-ai-self-driving-cars-crucial-innovative-techniques/. A deep learning solution turns out to be way too slow for this particular need. Your car will likely contain a deep learning solution, but you’re more likely to use it as part of the voice interface.

AI in general and deep learning in particular can make the headlines when the technology fails to live up to expectations. For example, the article at https://www.techrepublic.com/article/top-10-ai-failures-of-2016/ provides a list of AI failures, some of which relied on deep learning as well.
It’s a mistake to think that deep learning can somehow make ethical decisions or that it will choose the right course of action based on feelings (which no machine has). Anthropomorphizing the use of deep learning will always be a mistake. Some tasks simply require a human.

Speed and the capability to think like a human are the top issues for deep learning, but there are many more. For example, you can’t use deep learning if you don’t have sufficient data to train it. In fact, the article at https://www.sas.com/en_us/insights/articles/big-data/5-machine-learning-mistakes.html offers a list of five common mistakes that people make when getting into machine learning and deep learning environments. If you don’t have the right resources, deep learning will never work.

IN THIS CHAPTER
»» Considering what machine learning involves
»» Understanding the methods used to achieve machine learning
»» Using machine learning for the correct reasons

Chapter 2
Introducing the Machine Learning Principles

As discussed in Chapter 1, the concept of learning for a computer is different from the concept of learning for humans. However, Chapter 1 doesn’t really describe machine learning, the kind of learning a computer uses, in any depth. After all, what you’re really looking at is an entirely different sort of learning that some people would view as a combination of math, pattern matching, and data storage. This chapter begins by pointing the way to a deeper understanding of how machine learning works.

However, an explanation of machine learning doesn’t completely help you understand what’s going on when you work with it. How machine learning works is also important, which is the subject of the next section of the chapter. In this section, you discover that no perfect methods exist for performing analysis. You may have to experiment with your analysis to get the expected output.
In addition, different approaches to machine learning are available, and each has advantages and disadvantages.

The third part of the chapter takes what you’ve discovered in the previous two parts and helps you apply it. No matter how you shape your data and perform analysis on it, machine learning is the wrong approach in some cases and will never provide you with useful output. Knowing the right uses for machine learning is essential if you want to receive consistent output that helps you perform interesting tasks. The whole purpose of machine learning is to learn something interesting from the data and then to do something interesting with it.

Defining Machine Learning

Here’s a short definition of machine learning: It’s an application of AI that can automatically learn and improve from experience without being explicitly programmed to do so. The learning occurs as a result of analyzing ever-increasing amounts of data, so the basic algorithms don’t change, but the code’s internal weights and biases used to select a particular answer do. Of course, nothing is quite this simple. The following sections discuss more about what machine learning is so that you can understand its place within the world of AI and what deep learning acquires from it.

Data scientists often refer to the technology used to implement machine learning as algorithms. An algorithm is a series of step-by-step operations, usually computations, that can solve a defined problem in a finite number of steps. In machine learning, the algorithms use a series of finite steps to solve the problem by learning from data.

Understanding how machine learning works

Machine learning algorithms learn, but it’s often hard to find a precise meaning for the term learning because different ways exist to extract information from data, depending on how the machine learning algorithm is built.
Generally, the learning process requires huge amounts of data that provide an expected response given particular inputs. Each input/response pair represents an example, and more examples make it easier for the algorithm to learn. That’s because each input/response pair fits within a line, cluster, or other statistical representation that defines a problem domain. Learning is the act of optimizing a model, which is a mathematical, summarized representation of the data itself, such that it can predict or otherwise determine an appropriate response even when it receives input that it hasn’t seen before. The more accurately the model can come up with correct responses, the better the model has learned from the data inputs provided. An algorithm fits the model to the data, and this fitting process is training.

Figure 2-1 shows an extremely simple graph that simulates what occurs in machine learning. In this case, starting with input values of 1, 4, 5, 8, and 10 and pairing them with their corresponding outputs of 7, 13, 15, 21, and 25, the machine learning algorithm determines that the best way to represent the relationship between the input and output is the formula 2x + 5. This formula defines the model used to process the input data (even new, unseen data) to calculate a corresponding output value. The trend line (the model) shows the pattern formed by this algorithm, such that a new input of 3 will produce a predicted output of 11. Even though most machine learning scenarios are much more complicated than this (and the algorithm can’t create rules that accurately map every input to a precise output), the example gives you a basic idea of what happens. Rather than have to individually program a response for an input of 3, the model can compute the correct response based on input/response pairs that it has learned.

FIGURE 2-1: Visualizing a basic machine learning scenario.
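
To make the Figure 2-1 scenario concrete, here is a minimal Python sketch that fits a straight line to the five input/response pairs and then predicts the response for an unseen input of 3. The text doesn’t say which algorithm produced the trend line, so ordinary least squares is an assumption here, and the names fit_line and predict are hypothetical helpers invented for this illustration.

```python
# Illustrative sketch only: ordinary least squares is an ASSUMED fitting
# method; the text does not say which algorithm produced Figure 2-1.
# fit_line and predict are hypothetical helper names.

def fit_line(inputs, outputs):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(inputs)
    mean_x = sum(inputs) / n
    mean_y = sum(outputs) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel out.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(inputs, outputs))
    variance = sum((x - mean_x) ** 2 for x in inputs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned line (slope, intercept) to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Training examples from the text: paired inputs and expected responses.
inputs = [1, 4, 5, 8, 10]
outputs = [7, 13, 15, 21, 25]

model = fit_line(inputs, outputs)  # this fitting step is the "training"
slope, intercept = model
print(f"model: {slope:.2f}x + {intercept:.2f}")
print(f"prediction for 3: {predict(model, 3):.2f}")
```

Running the sketch prints a model of 2.00x + 5.00 and a prediction of 11.00 for the input 3, matching the formula and the predicted output described for Figure 2-1. Notice that nothing in the code mentions the input 3 ahead of time; the response comes from the slope and intercept that the fitting step derived from the examples.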
Understanding that it’s pure math

The central idea behind machine learning is that you can represent reality by using a mathematical function that the algorithm doesn’t know in advance, but which it can guess after seeing some data (always in the form of paired inputs and outputs). You can express reality and all its challenging complexity in terms of unknown mathematical functions that machine learning algorithms find and make available as a modification of their internal mathematical function. That is, every machine learning algorithm is built around a modifiable math function. The function can be modified because it has internal parameters or weights for such a purpose. As a result, the algorithm can tailor the function to specific information taken from data. This concept is the core idea for all kinds of machine learning algorithms.

Learning in machine learning is purely mathematical, and it ends by associating certain inputs with certain outputs. It has nothing to do with understanding what the algorithm has learned. (When humans analyze data, we build an understanding of the data to a certain extent.) The learning process is often described as training because the algorithm is trained to match the correct answer (the output) to every question offered (the input). (Machine Learning For Dummies, by John Paul Mueller and Luca Massaron, [Wiley], describes how this process works in detail.) In spite of lacking deliberate understand
