BCIS Notes PDF
Summary
These notes provide an overview of management information systems (MIS), focusing on data as a strategic asset and insight as currency. Information systems are presented as organized collections of people, procedures, software, databases, and devices that support decision-making. The notes also touch upon the three phenomena powering the fourth industrial revolution - data explosion, increasing storage capacity, and increasing computing power.
Full Transcript
BCIS NOTES

LESSON 1: MANAGEMENT INFORMATION SYSTEMS
- Data is the new oil (a strategic asset)
- Insight is the new currency
- Information system: a system designed to collect, store, and process data and to output information.
- Management Information Systems: an organized collection of people, procedures, software, databases, and devices that provide routine information for decision-making.

THREE PHENOMENA THAT POWER THE FOURTH INDUSTRIAL REVOLUTION:
- Data Explosion
- Increasing Storage Capacity
- Increasing Computing Power
- Sensors are everywhere, collecting and sending data about us, our activities, and our environments
- Early computers could execute 5,000 instructions per second
- Our phones can now execute 25 billion instructions per second

There are about 10 sensors in a phone:
1. Camera
2. Pedometer
3. Light Sensor
4. Thermometer
5. Fingerprint Sensor
6. Microphone
7. Gyroscope
8. Accelerometer
9. Magnetometer
10. Proximity Sensor

NETFLIX VS BLOCKBUSTER
- Blockbuster was a membership-based movie rental company
- Customers went in-store to browse and borrow physical discs
- Discs had to be returned within a certain period
- Netflix started as a mail-order DVD rental company
- Customers paid a fixed subscription fee to request movies through the website
- Netflix paid the postage for discs sent to customers and charged no late fees

FIRST MOVER ADVANTAGE
- First mover advantage is the competitive benefit a company gains by being the first to enter a new market or industry. It allows the company to build strong brand recognition, establish customer loyalty, capture significant market share, and secure critical resources before competitors arrive. However, this advantage isn't guaranteed, as fast followers can learn from the first mover's mistakes and enter the market with improvements.
POWER OF THE LONG TAIL
- The "power of the long tail" refers to the business strategy of selling a large number of niche products, each with relatively low demand, rather than focusing solely on a few bestsellers. This concept, popularized by Chris Anderson, highlights how the internet and digital platforms have made it easier and more cost-effective to reach diverse audiences with specific tastes. By offering an extensive variety of products, companies can tap into a broader market, collectively generating significant revenue from niche items that wouldn't be viable in traditional, physical retail settings.
- Selection attracts customers, and
- the internet allows large-selection inventory efficiencies that offline firms can’t match

GREAT EXPECTATIONS FOR AI
- 91% of survey respondents expect new business value from AI implementations in the coming five years

INVESTMENTS IN AI, 2021
- 99% of survey respondents say they’re investing in AI

LESSON 2: FIVE FORCES OF COMPETITION, ETHICS
1. Potential new entrants - The ease or difficulty with which new competitors can enter the market and challenge established companies.
2. Threat of substitute products or services - The likelihood that customers will switch to alternatives, which can limit growth and profitability.
3. Power of suppliers - The power suppliers have to influence the price of inputs, which can affect profitability for companies.
4. Power of buyers - The ability of customers to influence prices and demand better quality or service.
5. Rivalry among existing competitors - The intensity of competition among existing competitors in the industry, which can impact pricing, customer loyalty, and innovation.

The power of buyers and the power of suppliers are two forces in Porter's Five Forces model, and they represent opposing influences in an industry:

Power of Buyers: This refers to the ability of customers to influence pricing, demand higher quality, or negotiate better terms.
Buyers have more power when they can easily switch to other suppliers, purchase in large volumes, or if the products are undifferentiated. When buyer power is strong, companies may face pressure to lower prices or improve product offerings.

Power of Suppliers: This refers to the influence suppliers have over the prices of inputs, the quality of goods, and the terms of supply. Suppliers have more power when there are few alternatives, their products are essential or differentiated, or switching suppliers is costly for the company. When supplier power is strong, companies may face increased costs, which can squeeze their profit margins.

Price transparency reduces information asymmetry.
Information asymmetry - the disparity of information available to different parties in the market.

Planned Obsolescence
The practice of designing products to break quickly or become obsolete in the short to mid term. Forms include:
- Perceived obsolescence (fashion)
- Artificially limiting useful life (light bulbs, the Phoebus Cartel)
- Purposely frail design (batteries)
- Prevention of repairs (Apple)
- Systemic obsolescence (computer ports)
- Programmed obsolescence (printer inks)

Legal vs Ethical
1. Legality - accordance with the law
- Subject to regulations
- Punishable by law
2. Ethics - accordance with personal “morality” or a Code of Conduct
- Social, religious, or personal norms
- Reputation
The two are not synonymous, but they are also not mutually exclusive.

WHAT DO YOU CONSIDER ETHICAL?
1. Utilitarian - greatest benefit for the greatest number of people
- Trolley problem
2. Deontological - adherence to “Duty” regardless of consequences
- Honesty despite unfavorable outcomes
3. Virtue Ethics - emphasizes an individual’s character and personality
- Courage, honesty, and wisdom
4. Ethics of Care - importance of interpersonal relationships
- Empathy and compassion
5.
Rights-based - uphold the rights of people
- Civil rights and democracy

ACTING WITH INTEGRITY
- Whenever we do business, we need to establish a set of standards that we uphold (i.e. ethics)
Ethics → Decisions → Success
Integrity
- Unity between what we say and what we do
- Strongly adhering to a code of ethics
- Implies trustworthiness and incorruptibility

ETHICS AND PROFITABILITY
- Focused and happy employees
- Worthwhile products
- Services that meet consumer demand
- Expert management teams
Simply put, being ethical is good business!

LESSON 3: ETHICAL AI
1. Explainability - complex black boxes that are not interpretable
- can only monitor input and output
- in machine learning, a black box refers to a process where we know the inputs and outputs but not the internal workings that produce the output. This lack of transparency raises issues, as it's unclear how the AI makes decisions.
2. Responsibility - attribution of consequences
- especially true for autonomous systems
3. Fairness - reduction of discrimination
- AI can propagate biases
4. Misuse - reduction of risks
- introduction of security measures
5. Privacy - surveillance raises concerns about privacy rights
- protection of personal information
6. Security - protection of well-being
- safety regulations across domains (employment and welfare, cyber, military)
7. Equity - accessibility for all
- inclusion across sectors (including the marginalized)

LESSON 4: COMPUTERS
1. Data Explosion
2. Increasing Storage Capacity
3. Increasing Computing Power

HISTORY OF COMPUTERS
1. 200 BC - 70 BC - The first known computer is the Antikythera Mechanism
- computing devices were mechanical
2. 1613 - the word “computer” is first used, but as a job title
- computers were people
3. 1822 - Charles Babbage begins designs for a Difference Engine
- the difference engine was able to interpolate functions using polynomial approximations
4. 1833 - Charles Babbage begins designs for the Analytical Engine
5.
1847-1849 - Charles Babbage begins design for Difference Engine No. 2
6. 1897 - the word “computer” is first used to describe a machine
7. 1936 - Alan Turing writes “On Computable Numbers, with an Application to the Entscheidungsproblem” and, in effect, invents computer science
- Turing developed the foundations of theoretical computer science

Alan Turing
- theorized the possibility of building a general-purpose computer, a Turing machine, that can accept inputs, store and process data, and then generate outputs from it
- Turing famously theorized that machines could ultimately fool humans into thinking they are human
- “Can machines think?” - Computing Machinery and Intelligence, 1950. Lays the groundwork for Artificial Intelligence research
While Alan Turing theorized the general-purpose computer, Charles Babbage had earlier designed mechanical computing engines (the Analytical Engine was never completed in his lifetime).

CAPTCHA - Completely Automated Public Turing Test To Tell Computers and Humans Apart

COMPUTER
- a machine or device that performs processes, calculations, and operations based on instructions
- has the ability to accept data (input), process it, and then produce outputs
- 15 years ago, the most popular form was the desktop computer
- Now, it’s phones

OTHER COMPUTER COMPONENTS
Peripherals vs Internals

Processor
- Composed of micro (now nano) transistors (7-10 nm)
- Brain of a computing device
- Typically measured using number of cores (individual processing units) and clock speed (GHz)
- Applies to both the CPU (Central Processing Unit) and the GPU (Graphics Processing Unit)

RAM
- Random-Access Memory
- Short-term, temporary storage (volatile)
- Higher RAM -> larger datasets
- Measured in capacity (GB) and speed (MHz)

Motherboard - the primary circuit board in a computer; the motherboard provides the means for other hardware components to communicate
Storage - where data is stored long-term
Power Supply Unit (PSU) - converts power from the outlet into a usable form for the computer components; it also distributes power to these components
Cooling System -
ensures that core components remain within safe temperature limits; can be fans, heat sinks, or even liquid cooling

OTHER COMPUTER COMPONENTS
- Internal
  - Audio Card
  - Network Card
- Peripheral (typically I/O devices)
  - Webcams
  - Mouse and Keyboard
  - Monitor
  - External Storage
  - Audio Devices

HOW DO COMPUTERS STORE DATA?
- Bits and Bytes
- Computers express data as bits, each either one or zero, or as bytes, each a group of eight bits
- You can think of a byte as a single character you type on a keyboard

BINARY NUMBERS
- Binary - 2 states, given by 0 or 1
- We can convert between binary (base 2), decimal (base 10), and hexadecimal (base 16)
- To convert binary to decimal, multiply each digit by 2^x, where x is the position from the right starting at 0, and add the results. For example, 1011 = 1×2^3 + 0×2^2 + 1×2^1 + 1×2^0 = 11.

HOW DO COMPUTERS STORE DATA?
1. Combination of electric components: Transistors and Capacitors
2. Transistors - process electrical signals
3. Capacitors - store electrical signals (little batteries)

TRANSISTORS AND CAPACITORS
- Very small devices
- Very, very small devices (7 to 15 nanometers)

HOW SMALL IS A NANOMETER?
1. Meter - arm’s length
2. Centimeter - width of a fingernail
3. Millimeter - thickness of a credit card
4. Micrometer (or micron) - a single strand of human hair is generally 50-100 micrometers wide
5. Nanometer - a DNA double helix is about 2 nm in diameter

MOORE’S LAW
1. Gordon Moore, co-founder of Intel
2. “Number of transistors on integrated circuits (or chips) was doubling approximately every two years”
- This would imply an exponential increase in computing power
3. Is not an actual physical or natural law but rather an observation

DO YOU THINK THIS WILL CONTINUE FOREVER?
NO! THERE IS A PHYSICAL LIMIT TO OUR DEVICES.

BEYOND MOORE’S LAW
SIZE
- Physical limits: reduction to atomic scales, where traditional semiconductor behavior starts to break down
- Quantum effects: extremely small scales introduce quantum effects such as electron tunneling, leading to data corruption and other errors
- Manufacturing challenges: more precision is required in manufacturing, increasing the complexity and cost of chip fabrication
HEAT
- Increased density: more transistors in a small space increases heat density. Transistors dissipate heat, which can become a challenge when there are billions on a small chip
- Thermal stress: excessive heat induces thermal stress on the chip, reducing its lifespan and reliability
- Cooling challenges: cooling solutions become more complex and more expensive; traditional air cooling becomes inadequate, requiring liquid or even more advanced cooling solutions
POWER
- Higher power consumption: more transistors -> more power consumption, even if individual transistors might be using less power
- Battery life: increased power consumption translates to reduced battery life
- Leakage current: shrinking transistors lead to more leakage current (current that flows even when the transistor is off). This increases power consumption, especially in idle states
- Voltage scaling challenges: reducing voltage to save power can lead to reliability issues, as lower voltages can be more susceptible to variations and noise

QUANTUM COMPUTERS
- Use the principles of quantum theory; i.e. instead of bits, we use quantum bits or qubits
- Physically, qubits can be implemented as, for example, the spin states of subatomic particles

1 Byte: 8 bits
1 Kilobyte (KB): one thousand bytes
1 Megabyte (MB): one million bytes
1 Gigabyte (GB): one billion bytes
1 Terabyte (TB): one trillion bytes
1 Petabyte (PB): one quadrillion bytes
1 Exabyte (EB): one quintillion bytes
1 Zettabyte (ZB): one sextillion bytes

PRICE ELASTICITY
- Measures the responsiveness of the quantity demanded or supplied of a good to a change in its price

FIVE WAVES OF COMPUTING
1. Mainframes (1950s - 1970s) - This was the first wave, where large and expensive mainframe computers dominated. They were primarily used by large institutions and governments
2.
Minicomputers (Late 1960s - 1980s) - These were smaller and less expensive than mainframes, making them more accessible to medium-sized businesses
3. Personal Computers [PCs] (Late 1970s - 2000s) - This wave marked the advent of individual computing power on desktops, initially driven by companies like Apple and IBM. Microsoft’s Windows and Intel’s processors (often referred to as the Wintel dominance) played a significant role in popularizing PCs in households and businesses worldwide
4. Internet and Cloud Computing (1990s - Present) - This wave started with the popularization of the internet and later moved into the concept of cloud computing. Data and applications began to shift from local machines to centralized data centers accessible via the internet. Companies like Amazon, Google, and Microsoft became significant players in offering cloud infrastructure and services
5. Ubiquitous Computing (2000s - Present) - Sometimes referred to as “pervasive computing,” this wave is characterized by the seamless integration of computing capabilities into everyday objects and environments. Devices like smartphones, tablets, smartwatches, and IoT (Internet of Things) devices exemplify this trend. Instead of one single device (like a PC), users might interact with multiple computing devices throughout their day, often simultaneously

PROLIFERATION OF UBIQUITOUS PRODUCTS
- When demand for a product is price elastic, consumers buy more of it as it becomes cheaper
- Entire new markets open up

GLOW CAP
- “Smart” pill bottle that alerts users if they miss a dose
- drug adherence is at just 50 percent
- $290 billion in increased medical costs are due to patients missing their meds

APPLE INC.
- One of the most agile surfers of the fifth wave of computing

AMAZON
1. 1995 - one terabyte (1 TB) of storage for the entire corporate database
2. 2003 - launched “Search Inside the Book” by digitizing images and text from thousands of books in their catalog, leading to a 7% increase in sales for included titles
3.
2007 - released Kindle, their first e-reader
- Sony had already released an electronic-paper e-reader, the Sony Librie, in 2004
4. 2009 - for books available on Kindle, sales are already at 35% of the same books in print

From a strategic viewpoint, current trends indicate that limitations in today’s cost or performance could be overcome tomorrow. This realization opens doors for those who can identify and harness the potential of emerging technologies. As technology progresses, it paves the way for novel industries, business strategies, and products, but it can also render established companies and traditional practices obsolete. It’s imperative for leaders to consistently monitor technological trends and their direction to seize opportunities and sidestep potential upheavals.

LESSON 5: BUSINESS PROCESS
1. Set of goal-directed activities to accomplish a task
a. Personal
- Opening a bank account
- Booking a flight
b. Industry
- Fulfilling an order
- Stocking up on inventories
2. Not necessarily computerized

WHY DO WE NEED TO IDENTIFY OUR BUSINESS PROCESSES?
1. Efficient
2. Reliable
3. Convenient
4. Effective
5. Economic

AS-IS PROCESS
1. “Now” or current state
2. Identify potential gaps or issues with the current operation
3. Process map / flowchart to visualize the steps

PROCESS MAPPING
1. Swim lane process map (cross-functional flowchart)
2. Distinguishes roles, capabilities, and responsibilities
3. Clear and easy way to communicate complex principles

TO-BE PROCESS
1. “Future” or redesigned state
2. Takes into consideration deficiencies identified in the As-Is process
3. Use process map / flowchart to integrate business process redesigns

SCOPING PROCESS IMPROVEMENTS
1. Research
- Identify main goals and objectives
- Identify products and services
Current state: What does the client see as the current state of the situation/project?
Future state: What is the vision of the client for the end-point of the situation/project?
Barriers: What barriers does the client envision will hinder reaching the vision?
Enablers: What is the client already doing to reach the vision? What does the client think will help?

HOW TO DO THE SCOPING?
1. Personal Interviews
- Subject Matter Expert / Domain Expert
2. Direct Observation
- Site visits
- On-the-ground interviews
3. Surveys
- Formal written responses
4. Focus Group Discussion
- Representatives from relevant stakeholders

SMART
1. SPECIFIC - make your goal specific and narrow for more effective planning
2. MEASURABLE - make sure your goal and progress are measurable
3. ACHIEVABLE - make sure you can reasonably accomplish your goal within a certain time frame
4. RELEVANT - your goal should align with your values and long-term objectives
5. TIME-BASED - set a realistic but ambitious end date to clarify task prioritization and increase motivation

LESSON 6: DATA GATHERING
Enterprise Software: CRM, SCM, ERP

CRM - Customer Relationship Management: manage a company’s interactions with current and potential customers
- Contact management: store and retrieve contact information
- Sales management: track leads, prospects, opportunities, and sales
- Customer service: resolve customer inquiries and complaints
- Marketing: manage marketing campaigns and track their effectiveness
- Analytics: understand customer behavior and sales trends

SCM - Supply Chain Management: manage and optimize the flow of goods, data, and finances in a supply chain from supplier to manufacturer to wholesaler to retailer to consumer
- Procurement: source and order raw materials
- Production planning: schedule manufacturing processes
- Inventory management: control warehousing and levels of inventory
- Logistics: coordinate transportation of goods
- Demand forecasting: predict future demand to plan for supply needs

ERP - Enterprise Resource Planning: integrate and manage the main business processes of an organization, typically using a single, unified system
- Finance: handle accounting, budgeting, and financial reporting
- Human Resources: manage personnel, payroll, and training
- Manufacturing: control production, quality, and inventory
- Sales and Marketing: handle sales orders, customer management, and marketing campaigns
- Procurement: manage suppliers and purchase orders

TRANSACTION PROCESSING SYSTEMS (TPS)
- Information processing system for collection, modification, and retrieval of all transaction data
- Typical examples are Point-of-Sale (POS) systems and Automated Teller Machines (ATMs)
- The biggest challenge of a typical POS system is that you can't tag sales to specific customers

LOYALTY SYSTEMS
- We exchange our information for financial incentives

TRANSACTION DATA IN WEBSITES?
1. COOKIES - Unique data that websites use to identify specific users without the use of usernames and passwords

WHY USE COOKIES?
- Session Management: Cookies let websites recognize users and recall their individual login information and preferences, such as sports news versus politics.
- Personalization: Customized advertising. You may view certain items or parts of a site, and ad cookies use this data to help build targeted ads that you might enjoy. They’re also used for language preferences.
- Tracking: Shopping sites use cookies to track items users previously viewed. They also track and monitor performance analytics, like how many times you visited a page or how much time you spent on a page.

PIXEL TRACKING
- a 1x1 pixel image URL is embedded in the email
- when the email is opened, the image is loaded
- this lets the tracking server know that the email has been accessed
- aside from whether the email was read, the server can also identify the time it was opened and the corresponding cookies

DATA AGGREGATORS
- Organization that collects data from one or more sources, provides some value-added processing, and repackages the result in a usable form

DATA AGGREGATION PROCESS
1. Retrieving data from multiple sources: A data aggregator gathers data from several sources, such as different databases, spreadsheets, and HTML files.
2. Cleaning and preparing the input data: The collected data is filtered and preprocessed to remove any inconsistencies, errors, or invalid values. This step ensures that the data is accurate and consistent before being aggregated. Next, the filtered data is converted into a format that makes aggregation easier.
3. Combining and organizing data: The processed data is merged into a single dataset. The final step involves joining, concatenating, and summarizing data into a meaningful and easier-to-read form. Generally, this process includes producing simplified views, calculating summary statistics, or creating pivot tables.

PRIVACY REGULATION
- General Data Protection Regulation (GDPR)
- European Union (EU) regulation on information privacy
- Adopted on April 27, 2016
- Became a model for many other laws around the world

8 USER RIGHTS UNDER GDPR
1. Right to Avoid Automated Decision-Making
2. Right to Object
3. Right to Information
4. Right to Access
5. Right to Rectification
6. Right to Erasure
7. Right to Restriction of Processing
8.
Right to Data Portability

LESSON 7: THE INTERNET AND THE WEB

THE INTERNET VS THE WEB
THE INTERNET
- Infrastructure of the global network that connects smaller local networks together
- Uses the TCP/IP protocol to link devices worldwide
THE WEB
- Vast amount of information and online content accessed via the internet
- Uses the HTTP (HTTPS) protocol to transfer information

THE FIRST NETWORK
1. In 1969, the ARPANET (Advanced Research Projects Agency Network) was created between four “nodes”:
- UCLA
- Stanford Research Institute
- UC Santa Barbara
- University of Utah
2. The first message sent was “LO”
- It was meant to be “LOGIN”, but the system crashed after the O

SNEAKERNET
- Transferring digital data using physical devices
- Can go up to 2 terabytes per minute or more

THE INTERNET
- Decentralized system
- no center
- no owner
- Devices just connect to the network via their ISP (Internet Service Provider)

IF THERE ARE NO OWNERS, HOW DO WE MAKE SURE THAT WE CAN CONNECT TO EACH OTHER?
PROTOCOLS!

TCP / IP
1. Transmission Control Protocol / Internet Protocol
- TCP - communications standard that allows devices to exchange messages
- IP - method for sending data from one device to another
2. Built into all devices that connect to the internet

TCP
1. Reliable Data Transfer: ensures data integrity through error checking and acknowledgments
2. Connection-Oriented: establishes a connection between sender and receiver before data is sent, ensuring ordered and intact data delivery
3. Flow Control: manages data transmission rate to match the recipient’s processing capacity
4. Congestion Control: adjusts the sending rate to prevent network overload and ensure stability

HOW DOES IT WORK?
- Breaks messages into packets (or datagrams)
- This avoids having to resend the entire message if a problem occurs during transmission
- Packets are reassembled at the destination
- Every packet can take a different route between source and destination

IP
1. Addressing: assigns unique IP addresses to devices for identification and location
2.
Routing: directs packets between source and destination through optimal paths
3. Packet-Based: transmits data in packets, allowing efficient and flexible data transfer
4. Unreliable Delivery: offers a best-effort delivery service without guarantees for packet delivery or order

HOW DOES IT WORK?
- relay work is done via special computers called routers
- routers talk to each other using IP addresses
- they talk to each other all the time
- this keeps the internet decentralized and robust

TRACING YOUR PACKETS
- Tracert for Windows
- Traceroute for Linux and Mac
- Shows the time taken by each of three packets sent to every hop along the route

WHAT CONNECTS ROUTERS AND COMPUTERS
- Transmission can happen over different types of connections
EXAMPLES:
1. Satellite Internet
2. Fixed Wireless Access
3. Fiber Optic Internet
4. Cable Broadband
5. DSL Internet

Satellite
- Global Coverage: satellite internet can provide connectivity in remote and rural areas where other types of internet services are unavailable
- Latency Issues: due to the long distance signals must travel to and from satellites in space, there can be noticeable latency or delay in data transmission
- Weather Dependent: the connection quality can be affected by weather conditions, leading to potential service interruptions or slower speeds during storms

Wi-Fi
- Wireless Connectivity: Wi-Fi allows devices to connect to the internet wirelessly within a specific range, offering flexibility and mobility
- Variable Speeds: speeds can be influenced by the number of connected devices, interference, and the distance between the device and the Wi-Fi source
- Security Concerns: without proper security measures, Wi-Fi networks can be vulnerable to unauthorized access and cyber threats

Cable
- Stable Speeds: cable internet generally offers stable and consistent internet speeds, although it can be affected by the number of simultaneous users in the area
- Physical Connection: it requires a physical cable to connect the user to the internet, limiting mobility but often offering faster speeds compared to Wi-Fi
- Availability: cable internet service availability is dependent on the infrastructure in the area, often making it unavailable in remote locations

Cellular
- Mobility: cellular internet provides wireless connectivity on the go, making it ideal for mobile devices like smartphones and tablets
- Network Congestion: speeds can be affected by the number of users on the network, especially in densely populated areas or during peak usage times
- Coverage Area: internet accessibility and speed can vary based on the cellular coverage in the area, with potential service gaps in remote or obstructed locations

OTHER RELEVANT DEVICES
1. Modem - converts analog signals from telephone or cable lines to digital signals readable by devices
2. Switch - connects multiple devices within a Local Area Network (LAN)
3. Access Point - extends the wireless coverage of a network
4. Firewall - network security device that monitors and filters incoming and outgoing traffic
5. Gateway - “gate” between two networks, often connecting a local network to the internet

THE URL
- URL (uniform resource locator) defines the address that your web browser will look for
http://www.nytimes.com/tech/index.html
1. http: - application transfer protocol
2. www. - host name
3. nytimes - domain name
4. .com - top-level domain
5. /tech - path (case sensitive)
6. /index.html - file (case sensitive)

PROTOCOLS
1. set of rules for communication - grammar and vocabulary
2. HTTP: HyperText Transfer Protocol
3. HTTPS: HyperText Transfer Protocol Secure
4. FTP: File Transfer Protocol
5. SMTP: Simple Mail Transfer Protocol

LESSON 8: ANALYTICS AND DATABASES
Big Data - extremely large datasets that are too large or complex to be dealt with by traditional data-processing application software; to put it simply, these are datasets that we cannot process on a single workstation.

The three V’s of big data (often characterized by):
1. Volume - quantity of data
2.
Velocity - speed of generation
3. Variety - diversity and types of data
What the three V’s leave out is underutilized data, or dark data. We’ve been generating a lot of datasets and we have storage for them. We store and access them through databases.

Databases - organized lists or collections of lists of data
- Stored and accessed electronically using a database management system, sometimes referred to as database software
- Managed by database programmers or database administrators

Types of Databases:
1. Document - hierarchical data organization
2. Relational - connected tables
- they use a key column to connect tables
3. Analytical - multidimensional arrays
- hard to manage because there is so much data and we only see the surface information
4. Graphs - nodes and edges
- nodes are the individual “points” and edges are the lines connecting them
5. Key value - stored in dictionaries for look-up operations
- you have a key that corresponds to a certain value
6. Column-Family - stored in schema-free columns
- like an unstructured Excel sheet, because there are no labels; the government uses this because each person has many pieces of information

Relational database - most popular format of databases; consists of multiple tables connected to each other
Schema - blueprint or structure of how data is organized and how relationships are handled
Table - individual file or array of data
Column - field that defines the data
Row - record of a single instance
Key - field used to connect or relate tables with each other

SQL - Structured Query Language; often pronounced “sequel”. SQL is the most common language for creating and manipulating databases. A simple query for getting data uses three clauses: SELECT, FROM, WHERE.

Data Warehouse and Data Marts
- Running analytics on your database can bog down the system
Data warehouse - set of databases designed to support decision making in an organization
Data mart - database focused on addressing the concerns of a specific problem

Production DB vs.
Staging DB
Production database - the live database behind the actual output; if you change something in it, the change shows in the actual output in real time.
Staging database - the simulation or editing “database” where you make changes so that people cannot see what you're changing.

Things to consider:
1. Data relevance - what data is needed, and does it meet current and future goals?
2. Data sourcing - can we even get the data we need? Is it clean, complete, accurate, and valuable?
3. Data quantity - how much data is needed?
4. Data quality - can our data be trusted?
5. Data hosting - where will the systems be housed? What are the hardware and networking requirements?
6. Data governance - what rules and processes are needed to manage data from its creation through its retirement? Are there operational issues? Legal issues? Privacy issues? How do we handle security and access?

What are the possibilities once you have an efficient working database? Analytics.

Analytics - extensive use of data, statistical and quantitative analysis, explanatory and predictive models, and fact-based management to drive decisions and actions

Three levels of analytical maturity:
1. Descriptive analytics - what happened?
2. Predictive analytics - what will happen?
3. Prescriptive analytics - what should we do to make it happen?

Business Intelligence - catchall term combining aspects of reporting, data exploration, ad hoc queries, and sophisticated data modeling and analysis
- Essentially the management of business information
Two of the most popular BI solutions:
1. Tableau
2. Power BI

Advanced Analytics (Use Cases):
1. Customer segmentation - figuring out which customers are likely to be the most valuable to a firm
2. Marketing and promotion targeting - identifying which customers will respond to which offers at which price at what time.
3.
Market basket analysis - determining which products customers buy together, and how an organization can use this information to cross-sell more products or services.
4. Collaborative filtering - personalizing an individual customer’s experience based on the trends and preferences identified across similar customers.
5. Customer churn - determining which customers are likely to leave, and what tactics can help the firm avoid unwanted defections.
6. Fraud detection - uncovering patterns consistent with criminal activity.
7. Financial modeling - building trading systems to capitalize on historical trends.
8. Hiring and promotion - identifying characteristics consistent with employee success in the firm’s various roles.
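
To make the market basket analysis use case listed above more concrete, here is a minimal sketch (not taken from the notes) that counts which products show up together in the same basket. It uses Python's built-in sqlite3 module and a SELECT ... FROM ... WHERE query of the kind Lesson 8 mentions. The purchases table, its two columns, the sample rows, and the top_pairs helper are all hypothetical names chosen for illustration, and the sketch assumes a product appears at most once per basket.

```python
import sqlite3

# Hypothetical sample data: (basket_id, product). Not from the notes.
PURCHASES = [
    (1, "bread"), (1, "butter"), (1, "milk"),
    (2, "bread"), (2, "butter"),
    (3, "bread"), (3, "milk"),
    (4, "butter"), (4, "milk"), (4, "bread"),
]


def top_pairs(purchases, min_count=2):
    """Return (product_a, product_b, baskets_together) rows, most frequent first."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute("CREATE TABLE purchases (basket_id INTEGER, product TEXT)")
    conn.executemany("INSERT INTO purchases VALUES (?, ?)", purchases)
    rows = conn.execute(
        """
        SELECT a.product, b.product, COUNT(*) AS together
        FROM purchases AS a
        JOIN purchases AS b ON a.basket_id = b.basket_id  -- basket_id acts as the key
        WHERE a.product < b.product                       -- count each pair only once
        GROUP BY a.product, b.product
        HAVING COUNT(*) >= ?
        ORDER BY together DESC, a.product, b.product
        """,
        (min_count,),
    ).fetchall()
    conn.close()
    return rows


PAIRS = top_pairs(PURCHASES)
for product_a, product_b, together in PAIRS:
    print(f"{product_a} + {product_b}: {together} baskets")
```

With the sample rows above, bread and butter share three baskets, bread and milk share three, and butter and milk share two. A real analysis would likely go further (for example, comparing pair counts against how often each product sells on its own), but the self-join on a shared key is the core idea.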