Week_01_Chapter_01_and_04_Relational.pptx

Full Transcript

Introduction Modern Database Management 12th Edition Jeffrey A. Hoffer, V. Ramesh, Heikki Topi Why do we need data? Information: Context helps Graphical displays turn data into users understand data useful information th...

Introduction Modern Database Management 12th Edition Jeffrey A. Hoffer, V. Ramesh, Heikki Topi Why do we need data? Information: Context helps Graphical displays turn data into users understand data useful information that managers can use for decision making and interpretation Definitions: Database: organized collection of logically related data Data: stored representations of meaningful objects and events  Structured: numbers, text, dates  Unstructured: images, video, documents Information: data processed to increase knowledge in the person using the data 3 Definition: Metadata  Data that describe the properties and context of user data Data about the data Descriptions of the properties or characteristics of the data, including data types, field sizes, allowable values, and data context Why do we need databases? How large are typical databases? How large are typical databases?  Personal databases Designed to support one user Reside on Personal computers/Laptops/smartphones Provide the ability to store data However the data cannot easily be shared Steward - Personal Database Manag How large are typical databases?  Personal databases  Workgroup databases Drawback: A lot of functionality is programmed on the client Each member of the workgroup has a computers computer They are linked by means of a network Each computer has a copy of the user interface application (client) The database is on the server How large are typical databases?  Personal databases  Workgroup databases  Departmental/Divisional databases The user interface is accessible on user’s computers (or smartphones) The application/Web server contains the business logic required to do business transactions It ‘talks’ to the database server How large are typical databases?  Personal databases  Workgroup databases  Departmental/Divisional databases  Enterprise resource planning (ERP) systems  Enterprise applications (transactional)  Data warehousing implementations They support the entire organization An organization may have multiple databases How large are the largest databases? 1. World Data Center for Climate 220 terabytes of data promptly available on the web 7. YouTube 6 petabytes worth of extra information 100 million videos watched per day 65,000 videos added each day At least 45 terabytes of videos 2. NERSC – No. 1 – The World Data Centre for Climate Data of nuclear energy research, – No. 2 – National Energy Research Scientific Computing Center – No. 3 – AT&T high energy material science tests, – No. 4 – Google recreations of the early universe – No. 5 – Sprint 8. Amazon 2.8 petabytes of data – No. 6 – ChoicePoint / LexisNexis – No. 7 – YouTube – No. 8 – Amazon 59 million – No. 9 – Central Intelligence Agency (CIA) active – No. 10 – Library of Congress customers More than 42 terabytes of data 3. AT&T AT&T 323 terabytes of information SAP 1.9 trillion phone call largest data warehouse at 12.1 petabytes (PB) records What do we do with a database?  Enterprise Resource Planning (ERP) Recording every single event that may take place in an  Central db holds all enterprise functions organization: (manufacturing, finance, sales, marketing, - Orders inventory, accounting, human resources) - Payments - Class registration..  Contain the current operational data of the enterprise Harrah’s (now part of the Caesar’s  Data Warehouse Entertainment casino empire) understands  Integrated from various operational databases how much money particular gamblers can afford to lose in a day before they won’t  Contains historical data come back the next day. A Wal-Mart manager on the East Coast prominently displayed an on-sale computer, driving a spike in sales. Alerted to the sales anomaly by EDW analysts, Wal-Mart corporate managers quickly notified all store managers to display the computer in a similar fashion, highlighting the combination of a computer and printer for one low price. By the end of the History of databases History of databases History of databases The database now consists of a collection of records connected to one another through links (pointers). Network Model: Hierarchical Model: If an account belongs to more than one customers, the data will have to be replicated. Note that: the set of all customer and account records is organized in the form of a rooted tree, where the root of the tree is a dummy node. A hierarchical database is a collection of such rooted trees, and hence forms a forest. History of databases Codd’s contribution was revolutionary: 1. He separated the data from the physical information storage, and 2. He described a framework for storing and retrieving The breakthrough earned data using Codd simple rowsa Turing and Award. tables History of databases 1970s: Two major relational database system prototypes were created between the years 1974 and 1977, and they were the Ingres, which was developed at UBC, and System R, created at IBM San Jose 1980s: Structured Query Language, or SQL, became the standard query language and Relational database systems became a commercial success 1990s: new client tools for application development were released, and these included the Oracle Developer, PowerBuilder, VB, and others. - The advent of the Internet led to exponential growth of the database industry. Average desktop users began to use client- server database History of databases 2010s: NoSQL databases are becoming popular. - They reject the ‘relational structuring of data’ and offer increased speed - The jury is undecided over whether NoSQL will supplant the relational model. Relational Model – Data Structure: In Flat Files everything is stored in one table: In Relational Databases the data are separated into Relations, such that every table describes one and only STUDENT one topic (ENTITY): COURSE TRAINER data data data In Relational Databases the relationships between the data are maintained by Foreign keys:  A table may have one primary key and any number of Foreign keys  A foreign Key is the Primary key of another table. Relational Model – Data Structure: In Relational Databases the data are separated into Relations, such that every table describes one and only one topic (ENTITY): In Relational Databases the relationships between the data are maintained by Foreign keys. In Relational Databases data redundancy is minimized. Relational Model – Data Structure: In Relational Databases using Foreign keys it is easy to recover the relationships among the data: Example: Which students are taking classes held by Charles Hill? Relational Model – Data Structure: In Relational Databases using Foreign keys it is easy to recover the relationships among the data: Example: Which students are taking classes held by Charles Hill? Relational Model in more details: Relatio ns Is a Relation the same as a Table? A relation is a named, two-dimensional table of data A relation consists of rows (records-instances) and columns (attribute or field) the same as a table - But a Relation has additional Requirements:  It must have a unique name.  Every attribute value must be atomic (single valued)  Every row must be unique (can’t have two rows with exactly the same values for all their fields).  Attributes (columns) in tables must have unique names.  The order of the columns must be irrelevant.  The order of the rows must be irrelevant Relational Model in more details: Relatio ns Example: Is this a relation? Relational Model in more details: Relatio ns Example: So, if we do not allow multivalued attributes we get a Relation: EMPLOYEE(EmployeeID, EmployeeName, DeptName, Salary, CourseTitle, DateCompleted) Relational Model in more details: Key Attributes o Keys are special fields that serve two main purposes and are 1. Primary keys 2. Foreign keys o Keys can be simple (a single field) or composite (more than one field) o Keys usually are used as indexes to speed up the response to user queries Relational Model in more details: Key Attributes 1. Primary keys  are unique identifiers of the relation  They guarantee that all rows are unique  Examples include employee numbers, social security numbers, etc  Can be simple or Composite Relational Model in more details: Key Attributes 2. Foreign keys  are identifiers that enable a dependent relation to refer to its parent relation SAMPLE SCHEMA Primary Key Foreign Key (The Customer that places an order must already exist in the Customer table) Combined, these are a composite primary key (uniquely identifies the order line)…individually they are foreign keys Quiz Identify all primary and foreign keys For the foreign keys also identify the dependent and parent relation Quiz 2 Identify all primary and foreign keys For the foreign keys also identify the dependent and parent relation Relational Model: Consists of: 1. Data structure Relations (tables), rows, columns Country_T City_T Primary Key Primary Foreign Key Relational Model: Consists of: 1. Data structure Tables (relations), rows, columns 2. Data integrity Mechanisms for implementing business rules that maintain integrity of manipulated data Relational Model – Data Integrity: Integrity constraints They are Rules They limit acceptable values and actions Their purpose is to facilitate maintaining the accuracy and integrity of data in the database. Relational Model – Data Integrity: Integrity constraints are of 3 major types: o Domain Constraints  Allowable values for an attribute o Entity Integrity  No primary key attribute may be null. All primary key fields MUST have data o Referential Integrity  A rule that states that either each foreign key value must match a primary key value in another relation or the foreign key value must be null  Example: The CountryCode in City_T must match the Code in Country_T or be null i.e. You cannot have cities from countries that do not exist in Country_T Country_T City_T Relational Model – Data Integrity: o Referential Integrity in more detail: Referential Integrity constraint can be used with several rules that protect the integrity of data For example: Delete Rules Restrict–don’t allow delete of “parent” side if related rows exist in “dependent” side: Cannot delete a customer if there are orders he placed in the ORDER relation Cascade–automatically delete “dependent” side rows that correspond with the “parent” side row to be deleted Delete the customer and the orders he placed in the ORDER relation Set-to-Null–set the foreign key in the dependent side to null if deleting from the parent side  not allowed for weak entities Delete the customer and set the Customer ID of his orders to NULL 36 Quiz Having the Referential Integrity constraint in mind: Can you delete the city Kabul? Can you delete the Country Afghanistan? Country_T City_T Primary Key Primary Foreign Key Quiz – 2 Having the Referential Integrity constraint in mind: What will happen if you try to delete the Country Afghanistan and the Delete Rule of this Foreign Key is set to: A. Restrict B. Cascade C. Set-to-Null Country_T City_T Primary Key Primary Foreign Key Relational Model: Consists of: 1. Data structure Tables (relations), rows, columns 2. Data integrity Mechanisms for implementing business rules that maintain integrity of manipulated data 3. Data manipulation Powerful SQL operations for retrieving and modifying data SQL ENVIRONMENT DDL - Data Definition Language DML - Data Manipulation Language DCL - Data Control Language  DDL, DML, DCL are used in different stages of the database development process Why Databases? What is the difference with File Processing Systems? Database Management System This is the difference: How it is achieved? - Central repository of shared data A Database Management System (DBMS) i.e. a software system that is used - Data is managed by a controlling agent to create, maintain, and provide - Stored in a standardized, convenient form controlled access to user databases Order Filing System Central database Invoicing DBMS Contains employee, System order, inventory, pricing, and Payroll customer data System DBMS manages data resources like an operating system manages hardware resources File Processing Systems Vs DBMS: Program-Data Dependence Program-Data Independence  All programs maintain metadata for each file  The separation of data description they use (metadata) from the application  Changes in the data file means changes in the programs that access the data is called program that is used to access it data Independence  Difficult to locate all the programs that access  Changes in the metadata do not lead to this file changes in the application programs  Error prone process Change of a field’s length in Customer data file means changes in the code of up to 5 programs used to access the data 43 File Processing Systems Vs DBMS: Duplication of Data Planned data redundancy  Different programs have separate copies of the same  Depends on the design of the data database  Different departments are not always synchronized  Good design means little to no  Waste of storage space redundancy  Increased effort to maintain all the data  No reliable metadata: for example the same field may have different name, datatype or length in different files  The biggest problem:  Data changes in one file could cause inconsistencies  Compromises in data integrity Duplicate Data 44 File Processing Systems Vs DBMS: Data Inconsistencies Improved Data  Capital and Lower Cases Consistency  Dates on different formats  Reliable metadata: for example  Names with different order of first and the same field may have different last name, datatype or length in different files  Undefined values of different formats Enforcement of standards  One central management by the DBA Improved data quality  Enforced by database constraints 45 File Processing Systems Vs DBMS: Limited Data Sharing Improved data sharing  A database is designed as a  No centralized control of data shared corporate resource  Cloud facilities are not always  Access is granted to users by permitted, when dealing with sensitive the DBA data Limitation in available Information: It is difficult to create a report requested by the managers, if the data needed for Improved data accessibility the report must be drawn from different and responsiveness departments  Easy access to data with SQL (Structured Query Language) Example: A non registered customer cannot place an order 46 File Processing Systems Vs DBMS: Lengthy Development Times Increased application  For a new application new data are development productivity  Initial requirements may be high required  But new applications are  The Programmers must design new file developed extremely fast formats  The Programmers must design new programs for accessing the new data  TheWaste Reduced program Programmers of time andmust write market: in today’s maintenance documentation forisall time that …. money  Easy to make changes to data and their metadata Excessive Program Maintenance  All the above combined together means an excessive amount of time and money dedicated to maintaining data files. Could be up to 80% of information systems budget 47 Database Management System More Advantages Increased Data Security  Access is controlled by the dba  Access is restricted to only specific data for each user  Concurrent access Control mechanisms Example: Imagine two users accessing the same data at the same time. They both try to change the same field. Which of the two entries should be maintained? Improved decision support  Some databases are designed for fast decision support (DW) The Database Approach – Definitions 49 lements of the Database Approach: 1. Data models  Graphical system capturing the nature and relationships of data  Enterprise Data Model–high-level entities and relationships for the organization  Project Data Model–more detailed view, matching data structure in database or data warehouse Enterprise Data Model Project Data Model lements of the Database Approach: 2. Entity Types  Noun form describing a person, place, object, event, or concept  Composed of attributes Entity types: CUSTOMER, ORDER, PRODUCT Attributes: The data you are interested in capturing about an entity. Example: Customer name or address Data are recorded for many customers. Each customer’s information is an Entity Instance of CUSTOMER entity type lements of the Database Approach: 3. Relationships  Between entities  Can be one-to-one (1:1), one-to-many (1:M) or many-to- many (M:N) One customer may place many orders, but each order is placed by a single customer  One-to-many relationship lements of the Database Approach: 3. Relationships  Between entities  Usually one-to-many (1:M) or many-to-many (M:N) One order has many order lines; each order line is associated with a single order  One-to-many relationship lements of the Database Approach: 3. Relationships  Between entities  Usually one-to-many (1:M) or many-to-many (M:N) One product can be in many order lines, each order line refers to a single product  One-to-many relationship lements of the Database Approach: 3. Relationships  Between entities  Usually one-to-many (1:M) or many-to-many (M:N) Therefore, one order involves many products and one product is involved in many orders  Many-to-many relationship lements of the Database Approach: 4. Relational Databases  Entities are represented as Relations  Relationships are represented by means of common field values. They are referred to as: foreign keys Attribu te ENTITY Instanc e ENTITY (TYPE)

Use Quizgecko on...
Browser
Browser