Module 4: Data Resource Management PDF
Summary
This document presents a module on data resource management and fundamentals of information systems. It covers topics such as learning objectives, data concepts, big data, and various types of databases. The module provides a conceptual understanding of databases and data structures involved in database management.
Full Transcript
Module 4: DATA RESOURCE MANAGEMENT
Fundamentals of Information Systems

Learning objectives
- Explain the business value of implementing data resource management processes and technologies in an organization.
- Outline the advantages of a database management approach to managing the data resources of a business compared to a file processing approach.
- Explain how database management software helps business professionals and supports the operations and management of a business.
- Illustrate each of the following concepts:
  - Major types of databases
  - Data warehouses and data mining
  - Logical data elements
  - Fundamental database structures
  - Database access methods
  - Database development

Data Concepts
Data, Information, Knowledge
Knowledge is a human belief or perception about the relationships among facts or concepts in a given area. It is a general awareness or possession of information, facts, ideas, truths, or principles.

Big Data
- Refers to massively large data sets that conventional data processing technologies do not have sufficient power to analyze.
- Big data is a field that treats ways to analyze, systematically extract information from, or otherwise deal with data sets that are too large or complex to be dealt with by traditional data-processing application software.
- Examples:
  - Civil registry and statistics
  - Exploratory biomedical research
  - Targeting of consumers for ads

Why is data management important?
Most common defects in data resource management:
No control of redundant data. Redundant data could make the data set inconsistent.

Why is data management important?
Violation of data integrity. You can find that Alex Wilson received a grade in MKT211; however, you cannot find Alex Wilson in the student roster. That is, the grade roster and the student roster are not consistent.
Suppose we have a data integrity control that enforces rules such as "no student can receive a grade unless she/he has registered and paid tuition". Then such a violation of data integrity can never happen.

Why is data management important?
Relying on human memory to store and search for needed data. The third common mistake in data resource management is the overuse of human memory for data search. A human can remember what data are stored and where they are stored, but can also make mistakes. If a piece of data is stored in a place nobody remembers, it is effectively lost. As a result of relying on human memory to store and search for needed data, the entire data set eventually becomes disorganized.

Why do we need a database?
To avoid the above common flaws in data resource management, database technology must be applied. Though not good for replacing databases, spreadsheets can be ideal tools for analyzing the data stored in a database. A spreadsheet package can be connected to a specific table or query in a database and used to create charts or perform analysis on that data. Databases can be organized in many different ways by using different models. The data model of a database is the logical structure of data items and their relationships.

Foundation data concepts
Logical data elements
- Character: a single alphabetical, numeric, or other symbol
- Field or data item: a grouping of characters that represents an attribute (a characteristic or quality) of some entity. Examples: salary, job title
- Record: related fields of data; a collection of attributes that describes an entity; fixed length or variable length
- File or table: a group of related records

Database
A database is an organized collection of related data. It is an organized collection because, in a database, all data is described and associated with other data.
- An integrated collection of logically related data elements
- Consolidates records into a common pool of data elements
- Data is independent of the application program using them and of the type of storage device

Database examples

Database Structure
- Hierarchical
- Network
- Relational
- Object-oriented
- Multi-dimensional

Hierarchical Structure
- Early DBMS structure
- Records arranged in a tree-like structure
- Relationships are one-to-many
- The data element or record at the highest level of the hierarchy is called the root element.

Network Structure
- Used in some mainframe DBMS packages
- Many-to-many relationships
- The network model can access a data element by following one of several paths, because any data element or record can be related to any number of other data elements.

Relational Structure
- Most widely used structure
- Data elements are stored in tables
- A row represents a record; a column is a field
- Can relate data in one file with data in another, if both files share a common data element

Relational Data Model
In the example, we have a table of student data, with each row representing a student record and each column representing one field of the student record. A special field, or a combination of fields, that uniquely identifies a record is called the primary key (or key). A key is usually the unique identification number of the record.
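As a rough illustration of the primary key idea, the sketch below builds a small student table in an in-memory SQLite database with Python's standard sqlite3 module. The column layout, the student_id values, and the major column are assumptions made for this example; they are not taken from the module's figure. It shows that two students may share a name, while a second record that reuses an existing key is rejected.

```python
import sqlite3


def build_student_table():
    """Create an in-memory table of student records (hypothetical layout)."""
    conn = sqlite3.connect(":memory:")
    # Each row is a record, each column is a field.
    # student_id is the primary key: it must be unique for every record.
    conn.execute(
        """
        CREATE TABLE student (
            student_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL,
            major      TEXT
        )
        """
    )
    # Two different students can share a name; the key tells them apart.
    conn.executemany(
        "INSERT INTO student VALUES (?, ?, ?)",
        [
            (1001, "Alex Wilson", "Marketing"),  # sample values, assumed
            (1002, "Alex Wilson", "Finance"),
        ],
    )
    conn.commit()
    return conn


def try_insert(conn, record):
    """Insert one record; return False if the primary key already exists."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?)", record)
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        # The database refuses a duplicate key instead of storing it.
        conn.rollback()
        return False


if __name__ == "__main__":
    conn = build_student_table()
    print(try_insert(conn, (1001, "Jamie Cruz", "Finance")))   # key 1001 is taken
    print(try_insert(conn, (1003, "Alex Wilson", "Finance")))  # new key, same name
```

Using an identification number rather than the name as the key is a design choice: the name field can repeat, so it cannot single out one record on its own.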
Relational Operations
- Select: create a subset of records that meet a stated criterion. Example: employees earning more than $30,000
- Join: combine two or more tables temporarily; the result looks like one big table
- Project: create a subset of columns in a table

Multi-dimensional Structure
- Variation of the relational model
- Uses multidimensional structures to organize data
- Data elements are viewed as being in cubes
- Popular for analytical databases that support Online Analytical Processing (OLAP)
Figure labels: Time, Region, Product type, Sales channel

Object-Oriented Structure
- An object consists of:
  - Data values describing the attributes of an entity
  - Operations that can be performed on the data
- Encapsulation: combine data and operations
- Inheritance: new objects can be created by replicating some or all of the characteristics of parent objects

Evaluation of Database Structures
- Hierarchical
  - Works for structured, routine transactions
  - Cannot handle many-to-many relationships
- Network
  - More flexible than hierarchical
  - Unable to handle ad hoc requests
- Relational
  - Easily responds to ad hoc requests
  - Easier to work with and maintain
  - Not as efficient or quick as hierarchical or network

Database Development
Database Administrator (DBA)
- In charge of enterprise database development
- Improves the integrity and security of organizational databases
- Uses Data Definition Language (DDL) to develop and specify data contents, relationships, and structure
- Stores these specifications in a data dictionary or a metadata repository

Data Dictionary
- Contains data about data (metadata)
- Relies on a specialized software component to manage a database of data definitions
- Contains information on:
  - The names and descriptions of all types of data records and their interrelationships
  - Requirements for end users' access and use of application programs
  - Database maintenance
  - Security

Database Development
Conceptual Design
End users must identify the key data elements that are needed to perform their specific business activities.
An Entity Relationship Diagram is a graphical model of the various files and their relationships contained within a database system.

Database Design Process
- Data relationships are represented in a data model that supports a business process
- This model is the schema or subschema on which to base:
  - The physical design of the database
  - The development of application programs to support business processes
- Logical Design: relationships among data elements
  - Schema: overall logical view of relationships
  - Subschema: logical view for specific end users
  - Data models for DBMS
- Physical Design: lists the actual structure of data
  - How data are to be physically stored and accessed on storage devices

Logical and Physical Database Views

Designing a Database
Scenario: Suppose a university wants to create a School Database to track data. After interviewing several people, the design team learns that the goal of implementing the system is to give better insight into students' performance and academic resources. From this, the team decides that the system must keep track of the students, their grades, courses, and classrooms.

Database Design Example
A primary key must be selected for each table in a relational database. This key is a unique identifier for each record in the table. For example, in the STUDENT table, it might be possible to use the student name as a way to identify a student, but names are not guaranteed to be unique, so a dedicated identifier such as a student ID number is the safer choice. A foreign key is a field in one table that refers to the primary key of another table, linking the two.

Proposed Discussion Activity
- Name a database you interact with frequently. What would some of the field names be?
- Research the following:
  - What are the different types of databases?
  - What is a data warehouse? What is data mining?
  - What are the components of a database management system?
Submit your answers in a Word file using Midterm_Act1_Lastname as your filename.
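One possible way to sketch the School Database design example is shown below, assuming Python's standard sqlite3 module and an in-memory database. All table names, column names, and sample rows (including the student name Dana Reyes and the classroom Room 101) are hypothetical and chosen only for illustration. With foreign key enforcement switched on, a grade for a student who is not in the roster is refused, which is the kind of data integrity rule discussed earlier in this module.

```python
import sqlite3


def build_school_db():
    """Create a small in-memory school database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE student (
            student_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL
        );
        CREATE TABLE course (
            course_id TEXT PRIMARY KEY,
            classroom TEXT
        );
        -- Each grade row points at one student and one course.
        CREATE TABLE grade (
            student_id   INTEGER NOT NULL REFERENCES student(student_id),
            course_id    TEXT    NOT NULL REFERENCES course(course_id),
            letter_grade TEXT    NOT NULL,
            PRIMARY KEY (student_id, course_id)
        );
        """
    )
    conn.execute("INSERT INTO student VALUES (1, 'Dana Reyes')")
    conn.execute("INSERT INTO course VALUES ('MKT211', 'Room 101')")
    conn.commit()
    return conn


def record_grade(conn, student_id, course_id, letter_grade):
    """Store a grade; return False if a foreign or primary key rule is broken."""
    try:
        conn.execute(
            "INSERT INTO grade VALUES (?, ?, ?)",
            (student_id, course_id, letter_grade),
        )
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()
        return False


def grade_report(conn):
    """Join the three tables through their shared key fields."""
    rows = conn.execute(
        """
        SELECT s.name, g.course_id, c.classroom, g.letter_grade
        FROM grade AS g
        JOIN student AS s ON s.student_id = g.student_id
        JOIN course  AS c ON c.course_id  = g.course_id
        ORDER BY s.name, g.course_id
        """
    )
    return rows.fetchall()


if __name__ == "__main__":
    conn = build_school_db()
    print(record_grade(conn, 1, "MKT211", "A"))   # student 1 is in the roster
    print(record_grade(conn, 99, "MKT211", "B"))  # student 99 is not
    print(grade_report(conn))
```

The grade table uses a combination of fields (student and course) as its primary key, so one student can hold at most one grade per course in this sketch.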
Database Illustration
Figure labels: Field, Record, File / Table, Primary Key, Data sources

Types of Databases
- Operational
- Distributed
- External
- Hypermedia

Operational Database
- Stores detailed data needed to support business processes and operations
- Also called subject area databases (SADB), transaction databases, and production databases
- Database examples: customer, human resource, inventory
Examples of operational databases that can be created and managed for a small business by microcomputer database management software like Microsoft Access.

Distributed Database
- Distributed databases are copies or parts of databases stored on servers at multiple locations
- Improves database performance at worksites
- Advantages
  - Protection of valuable data
  - Data can be distributed into smaller databases
  - Each location has control of its local data
  - All locations can access any data, anywhere
- Disadvantages
  - Maintaining data accuracy: a change to a table at one location must be made at all other locations
- Replication
  - Look at each distributed database and find changes
  - Apply changes to each distributed database
  - Very complex
Reference: phoenixnap.com

External Database
- Databases available for a fee from commercial online services, or free from the Web
- Examples: hypermedia databases, statistical databases, bibliographic and full-text databases
- Search engines like Google or Yahoo are external databases

Hypermedia Database
A hypermedia database contains:
- Hyperlinked pages of multimedia
- Interrelated hypermedia page elements, rather than interrelated data records

Data Warehouse
- Stores static data that has been extracted from other databases in an organization
- Central source of data that has been cleaned, transformed, and cataloged
- Data is used for data mining, analytical processing, analysis, research, and decision support
- Data warehouses may be divided into data marts: subsets of data that focus on specific aspects of a company (a department or business process)

Data Warehouse Components
Applications and Data Marts
Source: panoply.io
A data warehouse is a large centralized repository of data that contains information from many sources within an organization. A data mart is a subset of a data warehouse oriented to a specific business line.

Data Mining
Data in data warehouses are analyzed to reveal hidden patterns and trends:
- Market-basket analysis to identify new product bundles
- Find the root cause of quality or manufacturing problems
- Prevent customer attrition
- Acquire new customers
- Cross-sell to existing customers
- Profile customers with more accuracy

Data Mining Process
How data mining extracts business knowledge from a data warehouse

Examples of Data Mining
- Marketing: use of Twitter
- Banking: an AI system that processes and analyzes documents faster
- Government:
  - Revealing trends to counter the threat of IEDs (Improvised Explosive Devices), suicide bombers in Syria and Pakistan, and even infiltration of allied governments by spies
  - Finding patterns of roadside bombs in Afghanistan to predict attacks and the placement of bombs
- Healthcare: how overtreatment happens in the case of diabetic patients
- Education: a way to identify dyslexic kids quite early on
- Retail industry: improving customer experience by using Pinterest pins to check which products are trending and push them to physical stores
Source: https://prowebscraper.com/blog/data-mining-examples/

Traditional File Processing
- Data are organized, stored, and processed in independent files
- Each business application is designed to use specialized data files containing specific types of data records

Traditional File Processing
Example of file processing systems in banking. Note the use of separate computer programs and independent data files in a file processing approach to the savings, installment loan, and checking account applications.
Problems with Traditional File Processing
- Data redundancy
- Lack of data integration
- Data dependence (files, storage devices, software)
- Lack of data integrity or standardization

Database Management Approach
- The foundation of modern methods of managing organizational data
- Consolidates data records formerly in separate files into databases
- Data can be accessed by many different application programs
- A database management system (DBMS) is the software interface between users and databases

Database Management Approach
An example of a database management approach in a banking information system. Note how the savings, checking, and installment loan programs use a database management system to share a customer database. Note also that the DBMS allows a user to make direct, ad hoc interrogations of the database without using application programs.

Database Management System
The DBMS is the main software tool of the database management approach because it controls the creation, maintenance, and use of the databases of an organization and its end users. It is used to:
- Create new databases and database applications
- Maintain the quality of the data in an organization's databases
- Use the databases of an organization to provide the information needed by end users

Questions?