IT Chapter Four 2024-2025 PDF
Cairo University
Paul Bocij, Andrew Greasley and Simon Hickie
Summary
This textbook, Business Information Systems, covers the topic of databases and analytics. It explores the concepts of different types of databases (e.g., flat-file, free-form, hypertext), and the advantages of database systems. It also introduces data warehouses and big data, along with data mining techniques.
Full Transcript
Copyright 2019, 2015, 2008 Pearson Education, Inc. All Rights Reserved

CHAPTER 4 Databases and analytics

Learning objectives:
After studying this chapter, the students will be able to:
LO 1. Understand the use of database application software
LO 2. Understand the concept of data warehouses
LO 3. Understand the concept of analytics and the use of big data

Learning Objective 1: Understand the use of database application software
This objective covers the following points:
LO 1.1- What is a database?
LO 1.2- Business-level advantages of databases
LO 1.3- An overview of database types
LO 1.4- Database management systems

1.1- What is a database?
- A database is a collection of related information stored in an organized way so that specific items can be selected and retrieved quickly.

1.2- Business-level advantages of databases:
The main business benefits of databases derive from the way that databases are designed for sharing information. They are superior for:
1- Multi-user access:
- Multi-user access means allowing different people in the business to access the same data simultaneously, such as a manager and another member of staff accessing a single customer’s data.
2- Distributed access:
- Distributed access means users in different departments of the business can readily access data.
3- Speed:
- When working with large volumes of information, such as the customers of a bank, only databases are designed to produce reports or retrieve the information about a single customer rapidly.
4- Data quality:
- For data quality, sophisticated validation checks can be performed when data are entered to ensure their integrity.
5- Security:
- Access to different types of data can readily be limited to different members of staff. For example, in a car dealership database the manager of a single branch could be restricted to sales data for that branch.
6- Space efficiency:
- Space efficiency can be achieved by splitting a database into different tables when it is designed, which avoids storing the same data repeatedly, so less space is needed.

1.3- An overview of database types:
- Approaches to the design of databases include: file processing databases, database management systems, data warehouses and databases for big data.
- The following provides a brief overview of each of these approaches:

1.3.1- File processing databases:
- Early data processing systems were based around numerous files containing large amounts of data related to daily business transactions.
- As a result, many organizations found themselves in a position where they held large amounts of valuable data but were unable to maximise their use of them.
- A major problem stemmed from the fact that the data held were often stored in different formats.
- Notice that in order to make use of these data, it was usually necessary to create specialized computer programs, often at great expense.
- With file processing databases, the main types of databases are:
1- Flat-file database:
- A flat-file database is a self-contained database that contains only one type of record (or table) and cannot access data held in other database files.
2- Free-form database:
- A free-form database allows users to store information in the form of notes or passages of text.
- Notice that information is organized and retrieved by using categories or key words.
3- Hypertext database:
- In a hypertext database, information is stored as a series of objects that can consist of text, graphics, numerical data and multimedia data.
- Any object can be linked to any other, allowing users to store disparate information in an organized manner.

1.4- Database management systems (DBMS):
1- Database management system concept:
- A DBMS is one or more computer programs that allow users to enter, store, organize, manipulate and retrieve data from a database.
2- Characteristics of database management systems:
- Some of the major characteristics of the database management systems (DBMS) approach include the following:
a- Programs included a range of general-purpose tools and utilities for producing reports or extracting data.
b- The availability of general-purpose tools enabled non-technical users to access data.
- Users were able to analyze data, extract records and produce reports with little support from technical staff.
c- The use of a DBMS encouraged organizations to introduce standards for developing and operating their databases.

Learning Objective 2: Understand the concept of data warehouses
This objective covers the following points:
LO 2.1- Data warehouses concept
LO 2.2- Data marts concept
LO 2.3- Data warehousing

2.1- Data warehouses concept:
- Data warehouses are large database systems containing current and historical data that can be analysed to produce information to support organisational decision making.

2.2- Data marts concept:
- A data mart is a smaller, departmental version of a data warehouse, which may be easier to manage than a company-scale data warehouse.
- It should be noted that:
** Data marts do not aim to hold information across an entire company, but rather focus on one department.
** A data warehouse can consist of many data marts supporting different (smaller) operations.

2.3- Data warehousing
1. Data warehousing concept:
Data warehousing is the process of creating and maintaining a data warehouse. Figure 4.1, P.140 indicates the major steps in the data warehousing process.
Figure 4.1 The data warehousing process.
There are three main elements in the data warehousing process:
First: The data warehouse takes information from internal and external sources, such as operational systems which record sales or transactions with customers. Notice that data can come from sources such as:
** legacy databases holding historical data,
** operational systems such as enterprise resource planning (ERP),
** electronic point-of-sale (EPOS) data from customer transactions, and
** data from electronic data interchange (EDI) systems.
Second: Extracting data from one or more databases and transforming those data into a suitable format for the data warehouse.
Third: Loading those data into the data warehouse.
Note: ETL (Extraction, Transformation, and Load) software extracts data from one or more databases, transforms those data into a suitable format for the data warehouse and loads those data into the data warehouse.

Learning Objective 3: Understand the concept of analytics and the use of big data
This objective covers the following points:
LO 3.1- Analytics concept
LO 3.2- Types of analytics
LO 3.3- Big data concept
LO 3.4- Data mining concept
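Before turning to these points, the ETL note in section 2.3 can be made concrete with a small sketch. It is an illustration only, not the textbook's method: the two sources (an EPOS feed and a legacy CSV export), their field names, the sample figures and the `sales` table are all hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source 1: EPOS transactions (ISO dates, amounts in cents).
EPOS_ROWS = [
    {"branch": "Leeds", "date": "2024-03-01", "amount_cents": 1250},
    {"branch": "York", "date": "2024-03-01", "amount_cents": 830},
]

# Hypothetical source 2: legacy CSV export (DD/MM/YYYY dates, decimal amounts).
LEGACY_CSV = (
    "branch,sale_date,amount\n"
    "Leeds,28/02/2023,19.99\n"
    "York,01/03/2023,5.00\n"
)


def extract():
    """Extract: read the raw records from each source unchanged."""
    legacy_rows = list(csv.DictReader(io.StringIO(LEGACY_CSV)))
    return EPOS_ROWS, legacy_rows


def transform(epos_rows, legacy_rows):
    """Transform: convert both sources to one format (branch, ISO date, cents)."""
    unified = [(r["branch"], r["date"], r["amount_cents"]) for r in epos_rows]
    for r in legacy_rows:
        day, month, year = r["sale_date"].split("/")
        cents = round(float(r["amount"]) * 100)
        unified.append((r["branch"], f"{year}-{month}-{day}", cents))
    return unified


def load(conn, rows):
    """Load: write the unified rows into the (stand-in) warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(branch TEXT, sale_date TEXT, amount_cents INTEGER)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(*extract()))

# Current and historical data can now be analysed together.
totals = warehouse.execute(
    "SELECT branch, SUM(amount_cents) FROM sales GROUP BY branch ORDER BY branch"
).fetchall()
print(totals)  # [('Leeds', 3249), ('York', 1330)]
```

The transform step is where the "different formats" problem of file processing systems is resolved: once dates and amounts share one representation, a single query can cover both the operational and the legacy data.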
3.1- Analytics concept:
Analytics refers to data-driven analysis (of quantitative and qualitative data) using various methods, such as data mining and statistical techniques, to provide better decision making.
Notice that:
** analytics is built upon various approaches to data-driven analysis.
** analytics can be viewed as the integration of business intelligence/information systems, statistics, and modelling and optimisation tools.

3.2- Types of analytics:
- In terms of the practice of analytics, techniques can be categorized into three types:
1- Descriptive analytics:
- Descriptive analytics answers the question of what has happened and what is happening through approaches such as:
** business intelligence,
** web analytics, and
** statistical techniques.
2- Predictive analytics:
- Predictive analytics answers the question of what will be happening through approaches using statistical techniques such as:
** regression analysis,
** data mining, and
** forecasting techniques.
3- Prescriptive analytics:
- Prescriptive analytics answers the question of what should be happening (i.e. recommending a course of action and the likely outcome of that action) through approaches such as:
** linear programming,
** decision trees, and
** simulation.

The different types of data that are used in analytics include:
1- Structured data:
- Structured data are what might be considered the traditional data that are produced by IT systems and include financial and customer data that can be defined in a field within a database file.
- For example, customer data fields might include customer number, name, and address.
2- Unstructured data:
- Unstructured data represent the majority of business-related data and include video, graphic images, web sites, e-mail and social media posts.
3- Semi-structured data:
- Semi-structured data represent data that have elements of structure but also contain arbitrary aspects.
- For example, e-mails represent semi-structured data in that the sender's name and timestamp are provided but the content of the e-mail is unstructured.

3.3- Big data concept:
- The term big data refers to the large datasets that are enabled by IT systems which support, capture and disseminate these data.
- The real value of big data stems from the ability to analyse large datasets using analytical tools.
- This analysis has been enabled by innovations such as improved computer network speeds and storage on cloud computing platforms.
- A particular emphasis of the analysis of big data is the use of unstructured data such as e-mail exchanges, social media posts, video and voice recordings.

3.4- Data mining concept:
- Data mining in its broadest sense is a process that uses statistical, mathematical, artificial intelligence and other techniques to extract useful information from large databases.
- Notice that under this wide definition most types of data analysis can be classified as data mining.
- In its original definition, data mining is used to identify patterns or trends in the data held in data warehouses, which can be used for improved profitability.

END of CHAPTER FOUR