Elective III - Machine Learning Reviewer.pdf
Elective III - Machine Learning Reviewer

Introduction to ML
► Machine learning is about extracting knowledge from data
► Research fields: statistics, artificial intelligence, and computer science
► Also known as predictive analytics or statistical learning
► Early days of “intelligent” applications: hand-coded rules of “if” and “else” decisions to process data
► Successful ML: automating decision-making processes

Two Types of ML
► Supervised learning - the user provides the algorithm with pairs of inputs and desired outputs
  ► The algorithm is able to create an output for an input it has never seen before without any help from a human
  ► Ex. Spam classification
  ► Examples of Supervised Machine Learning Tasks
    ► Identifying the zip code from handwritten digits on an envelope
    ► Determining whether a tumor is benign based on a medical image
    ► Detecting fraudulent activity in credit card transactions
► Unsupervised learning - only the input data is known, and no known output data is given to the algorithm
  ► Harder to understand and evaluate
  ► Examples:
    ► Identifying topics in a large collection of text data
    ► Segmenting customers into groups with similar preferences
    ► Detecting abnormal access patterns to a website

Essential Libraries
► scikit-learn - contains machine learning algorithms
► NumPy - functionality for multidimensional arrays, linear algebra operations, the Fourier transform, and pseudorandom number generators
► SciPy - advanced linear algebra routines, mathematical function optimization, signal processing, special mathematical functions, and statistical distributions
► matplotlib - the primary scientific plotting library in Python; produces publication-quality visualizations such as line charts, histograms, scatter plots, and so on
► pandas - a Python library for data wrangling and analysis, built around a data structure called the DataFrame

Types of Supervised Machine Learning
► Classification tasks
  ► The goal is to predict a class label, which is a choice from a predefined list of possibilities.
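The supervised-learning idea above (learn from input/output pairs, then predict a class label for an input never seen before) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the two features (link count, exclamation-mark count), the toy values, and the helper name predict are made up for this example, and the nearest-neighbor rule is just one simple classifier, not a method prescribed by this reviewer.

```python
import math

# Toy training set (made-up values, for illustration only).
# Each input is (number_of_links, number_of_exclamation_marks);
# each desired output is the class label supplied by the user.
training_pairs = [
    ((0, 0), "not spam"),
    ((1, 0), "not spam"),
    ((0, 1), "not spam"),
    ((7, 5), "spam"),
    ((9, 8), "spam"),
    ((6, 9), "spam"),
]


def predict(features, pairs=training_pairs):
    """Return the label of the training input closest to `features`."""
    # Pick the (input, label) pair whose input has the smallest
    # Euclidean distance to the new, unseen input.
    _, nearest_label = min(
        pairs, key=lambda pair: math.dist(pair[0], features)
    )
    return nearest_label


# Inputs the algorithm has never seen before.
print(predict((8, 6)))  # closest training input is (7, 5)
print(predict((1, 1)))  # closest training inputs are (1, 0) and (0, 1)
```

Because the label comes from a fixed list ("spam" or "not spam"), this is a binary classification task. In practice a library such as scikit-learn would supply the classifier instead of a hand-written rule.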
  ► Types of classification
    ► Binary classification - distinguishes between 2 classes (yes or no, positive or negative, spam or not spam)
    ► Multiclass classification - classification between more than two classes
► Regression tasks
  ► The goal is to predict a continuous number
  ► Ex. Predicting a person’s annual income from their education and their age
  ► Predicted value: an amount of income, which can be any numeric value

Generalization, Overfitting and Underfitting
► Generalization - a model generalizes if it can make accurate predictions on unseen data
► We want to build a model that can generalize as accurately as possible
► Overfitting occurs when you fit a model too closely to the particularities of the training set and obtain a model that works well on the training set but is not able to generalize to new data
► Underfitting refers to a model that can neither model the training data nor generalize to new data

Preprocessing Data Using Different Techniques
► Data Normalization - a rescaling of the data from the original range so that all values fall within the new range of 0 to 1: x_scaled = (x - min) / (max - min)
► Data Standardization - scales each input variable separately by subtracting the mean (called centering) and dividing by the standard deviation, so that the distribution has a mean of zero and a standard deviation of one: z = (x - mean) / standard deviation
► Label Encoding - transforming word labels into numerical form so that the algorithm can operate on them

AN OVERVIEW OF BUSINESS INTELLIGENCE, ANALYTICS, AND DECISION SUPPORT

Changing Business Environment & Computerized Decision Support
Companies are moving aggressively to computerized support of their operations ⇨ Business Intelligence

Business Pressures–Responses–Support Model
Business pressures: the result of today's competitive business climate
Responses: to counter the pressures
Support: to better facilitate the process

The Business Environment
The environment in which organizations operate today is becoming more and more complex, creating both opportunities and problems. Example: globalization.
Business environment factors: markets, consumer demands, technology, and societal...

Organizational Responses
Be reactive, anticipative, adaptive, and proactive
Managers may take actions, such as:
◦ Employ strategic planning.
◦ Use new and innovative business models.
◦ Restructure business processes.
◦ Participate in business alliances.
◦ Improve corporate information systems.
◦ ... more [in your book]

Managerial Decision Making
Management is a process by which organizational goals are achieved by using resources.
◦ Inputs: resources
◦ Output: attainment of goals
◦ Measure of success: outputs / inputs
Management ≅ Decision Making
Decision making: selecting the best solution from two or more alternatives

The Nature of Managers’ Work
Mintzberg's 10 Managerial Roles
Interpersonal
1. Figurehead
2. Leader
3. Liaison
Informational
4. Monitor
5. Disseminator
6. Spokesperson
Decisional
7. Entrepreneur
8. Disturbance handler
9. Resource allocator
10. Negotiator

Decision-Making Process
Managers usually make decisions by following a four-step process (a.k.a. the scientific approach):
1. Define the problem (or opportunity).
2. Construct a model that describes the real-world problem.
3. Identify possible solutions to the modeled problem and evaluate the solutions.
4. Compare, choose, and recommend a potential solution to the problem.

Information Systems Support for Decision Making
◦ Group communication and collaboration
◦ Improved data management
◦ Managing data warehouses and Big Data
◦ Analytical support
◦ Overcoming cognitive limits in processing and storing information
◦ Knowledge management
◦ Anywhere, anytime support

An Early Decision Support Framework
Degree of Structuredness (Simon, 1977)
Decisions are classified as:
◦ Highly structured (a.k.a. programmed)
◦ Semi-structured
◦ Highly unstructured (i.e., nonprogrammed)
Types of Control (Anthony, 1965)
◦ Strategic planning (top-level, long-range)
◦ Management control (tactical planning)
◦ Operational control

The Concept of DSS
DSS - interactive computer-based systems, which help decision makers utilize data and models to solve unstructured problems (Gorry and Scott-Morton, 1971)
Decision support systems couple the intellectual resources of individuals with the capabilities of the computer to improve the quality of decisions.

DS as an Umbrella Term
Evolution of DS into Business Intelligence

Framework for Business Intelligence (BI)
BI is an evolution of decision support concepts over time
◦ Then: Executive Information System
◦ Now: Everybody’s Information System (BI)
BI systems are enhanced with additional visualizations, alerts, and performance measurement capabilities
The term BI emerged from industry

Definition of BI
BI is an umbrella term that combines architectures, tools, databases, analytical tools, applications, and methodologies
BI is a content-free expression, so it means different things to different people
BI's major objective is to enable easy access to data (and models) to provide business managers with the ability to conduct analysis
BI helps transform data, to information (and knowledge), to decisions, and finally to action

A Brief History of BI
The term BI was coined by the Gartner Group in the mid-1990s
However, the concept is much older
◦ 1970s - MIS reporting - static/periodic reports
◦ 1980s - Executive Information Systems (EIS)
◦ 1990s - OLAP, dynamic, multidimensional, ad-hoc reporting -> coining of the term “BI”
◦ 2010s - Inclusion of AI and Data/Text Mining capabilities; Web-based Portals/Dashboards, Big Data, Social Media, Analytics
◦ 2020s - yet to be seen

The Architecture of BI
A BI system has four major components:
◦ a data warehouse, with its source data
◦ business analytics, a collection of tools for manipulating, mining, and analyzing the data in the data warehouse
◦ business performance management (BPM) for monitoring and analyzing performance
◦ a user interface (e.g., dashboard)

Analytics Overview
Analytics? Something new or just a new name for...

A Simple Taxonomy of Analytics (proposed by INFORMS)
◦ Descriptive Analytics
◦ Predictive Analytics
◦ Prescriptive Analytics

Analytics or Data Science?