Database Systems PDF - Introduction To Database Systems

Document Details

Uploaded by PrivilegedMatrix

New York University

2011

Dr. Jean-Claude Franchitti

Tags

database systems, database management systems, computer science, introduction to databases

Summary

This document is a presentation that gives an overview of database systems. It lists the course objectives and key material, and covers a course introduction, an introduction to database systems, database system architecture, and a summary. It also includes information about the instructor and the course materials.

Full Transcript

Database Systems
Session 1 – Main Theme
Introduction to Database Systems

Dr. Jean-Claude Franchitti
New York University
Computer Science Department
Courant Institute of Mathematical Sciences

Presentation material partially based on textbook slides
Fundamentals of Database Systems (6th Edition)
by Ramez Elmasri and Shamkant Navathe
Slides copyright © 2011

Agenda
1. Instructor and Course Introduction
2. Introduction to Database Systems
3. Database System Architecture
4. Summary and Conclusion

Who am I? - Profile -
- 27 years of experience in the Information Technology industry, including twelve years of experience working for leading IT consulting firms such as Computer Sciences Corporation
- PhD in Computer Science from University of Colorado at Boulder
- Past CEO and CTO
- Held senior management and technical leadership roles in many large IT Strategy and Modernization projects for Fortune 500 corporations in the insurance, banking, investment banking, pharmaceutical, retail, and information management industries
- Contributed to several high-profile ARPA and NSF research projects
- Played an active role as a member of the OMG, ODMG, and X3H2 standards committees and as a Professor of Computer Science at Columbia initially and New York University since 1997
- Proven record of delivering business solutions on time and on budget
- Original designer and developer of jcrew.com and the suite of products now known as IBM InfoSphere DataStage
- Creator of the Enterprise Architecture Management Framework (EAMF) and main contributor to the creation of various maturity assessment methodologies
- Developed partnerships between several companies and New York University to incubate new methodologies (e.g., EA maturity assessment methodology developed in Fall 2008), develop proof of concept software, recruit skilled graduates, and increase the companies’ visibility

How to reach me?
Come on…what else did you expect?
- Cell: (212) 203-5004
- Email: [email protected]
- AIM, Y! IM, ICQ: jcf2_2003
- MSN IM: [email protected]
- LinkedIn: http://www.linkedin.com/in/jcfranchitti
Woo hoo…find the word of the day…
- Twitter: http://twitter.com/jcfranchitti
- Skype: [email protected]

What is the class about?
- Course description and syllabus:
  - http://www.nyu.edu/classes/jcf/CSCI-GA.2433-001
  - http://cs.nyu.edu/courses/fall11/CSCI-GA.2433-001/
- Textbooks:
  - Fundamentals of Database Systems (6th Edition), Ramez Elmasri and Shamkant Navathe, Addison Wesley, ISBN-10: 0-1360-8620-9, ISBN-13: 978-0136086208, 6th Edition (04/10)

Icons / Metaphors
- Information
- Common Realization
- Knowledge/Competency Pattern
- Governance
- Alignment
- Solution Approach

Course Objectives
- Gain understanding of fundamental concepts of state-of-the-art databases (more precisely called: Database Management Systems)
- Get to know some of the tools used in the design and implementation of databases
- Know enough so that it is possible to read/skim a database system manual and
  - Start designing and implementing small databases
  - Start managing and interacting with existing small to large databases
- Experiment and practice with industry leading vendor solutions:
  - CA’s Erwin for design of relational databases
  - Oracle, IBM DB2, and other DB products for writing relational queries

Key Material Covered (1/2)
- Methodology used for modeling a business application during the database design process, focusing on the entity-relationship model and entity-relationship diagrams
- Relational model and implementing an entity-relationship diagram
- Relational algebra (using SQL syntax)
- SQL as data manipulation language
- SQL as data definition language
- Refining a relational implementation, including the normalization process and the algorithms to achieve normalization

Key Material Covered (2/2)
- Physical design of the database using various file organization and indexing techniques for efficient query processing
- Concurrency Control
- Recovery
- Query execution
- Data warehouses
- Additional topics may be covered as time allows; these topics are covered in greater depth in other courses, but PowerPoint presentations for them will still be provided
- The course material is partially derived from the textbook slides and material covered as part of the Database Systems course offered at NYU Courant in previous semesters

Software Requirements
- Microsoft Windows XP (Professional Ed.) / Vista / 7
- Software tools will be available from the Internet or from the course Web site under demos as a choice of freeware or commercial tools
  - Database Modeling Tools
  - Database Management Software Tools
  - etc.
- References will be provided on the course Web site

Section Outline
- Introduction
- An Example
- Characteristics of the Database Approach
- Actors on the Scene
- Workers behind the Scene
- Advantages of Using the DBMS Approach
- A Brief History of Database Applications
- When Not to Use a DBMS

Overview
- Traditional database applications
  - Store textual or numeric information
- Multimedia databases
  - Store images, audio clips, and video streams digitally
- Geographic information systems (GIS)
  - Store and analyze maps, weather data, and satellite images
- Data warehouses and online analytical processing (OLAP) systems
  - Extract and analyze useful business information from very large databases
  - Support decision making
- Real-time and active database technology
  - Control industrial and manufacturing processes

Introduction (1/3)
- Database
  - Collection of related data
  - Known facts that can be recorded and that have implicit meaning
  - Mini-world or Universe of Discourse (UoD)
    - Represents some aspect of the real world
  - Logically coherent collection of data with inherent meaning
  - Built for a specific purpose
- Example of a large commercial database
  - Amazon.com
- Database management system (DBMS)
  - Collection of programs
  - Enables users to create and maintain a database
- Defining a database
  - Specify the data types, structures, and constraints of the data to be stored

Introduction (2/3)
- Meta-data
  - Database definition or descriptive information
  - Stored by the DBMS in the form of a database catalog or dictionary
- Manipulating a database
  - Query and update the database miniworld
  - Generate reports
- Sharing a database
  - Allow multiple users and programs to access the database simultaneously
- Application program
  - Accesses database by sending queries to DBMS
- Query
  - Causes some data to be retrieved

Introduction (3/3)
- Transaction
  - May cause some data to be read and some data to be written into the database
- Protection includes:
  - System protection
  - Security protection
- Maintain the database system
  - Allow the system to evolve as requirements change over time

Database System Environment

High-Level Example - Metadata
- UNIVERSITY database
  - Information concerning students, courses, and grades in a university environment
- Data records
  - STUDENT
  - COURSE
  - SECTION
  - GRADE_REPORT
  - PREREQUISITE
- Specify structure of records of each file by specifying data type for each data element
  - String of alphabetic characters
  - Integer
  - etc.
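
As a rough illustration of defining a database and of meta-data stored in a catalog, the sketch below declares the five UNIVERSITY record types with Python's built-in sqlite3 module and then reads their structure back from SQLite's own catalog. The record type names come from the slide; every column name and data type is a hypothetical choice made for this example, and SQLite stands in for whichever DBMS the course actually uses.

```python
import sqlite3

# Hypothetical column names and types: the slides only name the record types
# and say that each data element gets a data type (string, integer, etc.).
SCHEMA = """
CREATE TABLE STUDENT (
    Name           TEXT NOT NULL,
    Student_number INTEGER PRIMARY KEY,
    Class          INTEGER,
    Major          TEXT
);
CREATE TABLE COURSE (
    Course_name   TEXT NOT NULL,
    Course_number TEXT PRIMARY KEY,
    Credit_hours  INTEGER
);
CREATE TABLE SECTION (
    Section_identifier INTEGER PRIMARY KEY,
    Course_number      TEXT NOT NULL,
    Semester           TEXT,
    Year               INTEGER
);
CREATE TABLE GRADE_REPORT (
    Student_number     INTEGER NOT NULL,
    Section_identifier INTEGER NOT NULL,
    Grade              TEXT
);
CREATE TABLE PREREQUISITE (
    Course_number       TEXT NOT NULL,
    Prerequisite_number TEXT NOT NULL
);
"""


def create_university(conn):
    """Define the database: hand the structure and data types to the DBMS."""
    conn.executescript(SCHEMA)


def list_tables(conn):
    """Read the table names back from SQLite's catalog (sqlite_master)."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


def describe(conn, table):
    """Return (column name, declared type) pairs for one table.

    PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
    """
    return [(row[1], row[2]) for row in conn.execute(f"PRAGMA table_info({table})")]


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    create_university(connection)
    for table_name in list_tables(connection):
        print(table_name, describe(connection, table_name))
```

The point of the sketch is that the structure is declared once to the DBMS and can then be queried like any other data, which is what the catalog (meta-data) idea amounts to in practice.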
High-Level Example – Database Implementation (1/2)
- Construct UNIVERSITY database
  - Store data to represent each student, course, section, grade report, and prerequisite as a record in appropriate file
- Relationships among the records
- Manipulation involves querying and updating
- Examples of queries:
  - Retrieve the transcript
  - List the names of students who took the section of the ‘Database’ course offered in fall 2008 and their grades in that section
  - List the prerequisites of the ‘Database’ course
- Examples of updates:
  - Change the class of ‘Smith’ to sophomore
  - Create a new section for the ‘Database’ course for this semester
  - Enter a grade of ‘A’ for ‘Smith’ in the ‘Database’ section of last semester
- Phases for designing a database:
  - Requirements specification and analysis
  - Conceptual design
  - Logical design
  - Physical design

High-Level Example – Database Implementation (2/2)

Characteristics of the Database Approach
- Traditional file processing
  - Each user defines and implements the files needed for a specific software application
- Database approach
  - Single repository maintains data that is defined once and then accessed by various users
- Main characteristics of database approach
  - Self-describing nature of a database system
  - Insulation between programs and data, and data abstraction
  - Support of multiple views of the data
  - Sharing of data and multiuser transaction processing

Self-Describing Nature of a Database System
- Database system contains complete definition of structure and constraints
- Meta-data
  - Describes structure of the database
- Database catalog used by:
  - DBMS software
  - Database users who need information about database structure

Insulation Between Programs and Data
- Program-data independence
  - Structure of data files is stored in DBMS catalog separately from access programs
- Program-operation independence
  - Operations specified in two parts:
    - Interface includes operation name and data types of its arguments
    - Implementation can be changed without affecting the interface

Data Abstraction
- Data abstraction
  - Allows program-data independence and program-operation independence
- Conceptual representation of data
  - Does not include details of how data is stored or how operations are implemented
- Data model
  - Type of data abstraction used to provide conceptual representation

Support of Multiple Views of the Data
- View
  - Subset of the database
  - Contains virtual data derived from the database files but is not explicitly stored
- Multiuser DBMS
  - Users have a variety of distinct applications
  - Must provide facilities for defining multiple views

Sharing of Data and Multiuser Transaction Processing
- Allow multiple users to access the database at the same time
- Concurrency control software
  - Ensure that several users trying to update the same data do so in a controlled manner
    - Result of the updates is correct
- Online transaction processing (OLTP) application
- Transaction
  - Central to many database applications
  - Executing program or process that includes one or more database accesses
  - Isolation property
    - Each transaction appears to execute in isolation from other transactions
  - Atomicity property
    - Either all the database operations in a transaction are executed or none are

Actors on the Scene
- Database administrators (DBA) are responsible for:
  - Authorizing access to the database
  - Coordinating and monitoring its use
  - Acquiring software and hardware resources
- Database designers are responsible for:
  - Identifying the data to be stored
  - Choosing appropriate structures to represent and store this data
- End users
  - People whose jobs require access to the database
  - Types
    - Casual end users
    - Naive or parametric end users
    - Sophisticated end users
    - Standalone users
- System analysts
  - Determine requirements of end users
- Application programmers
  - Implement these specifications as programs

Workers behind the Scene
- DBMS system designers and implementers
  - Design and implement the DBMS modules and interfaces as a software package
- Tool developers
  - Design and implement tools
- Operators and maintenance personnel
  - Responsible for running and maintenance of hardware and software environment for database system

Advantages of Using the DBMS Approach (1/3)
- Controlling redundancy
  - Data normalization
  - Denormalization
    - Sometimes necessary to use controlled redundancy to improve the performance of queries
- Restricting unauthorized access
  - Security and authorization subsystem
  - Privileged software
- Providing persistent storage for program objects
  - Complex object in C++ can be stored permanently in an object-oriented DBMS
  - Impedance mismatch problem
    - Object-oriented database systems typically offer data structure compatibility
- Providing storage structures and search techniques for efficient query processing
  - Indexes
  - Buffering and caching
  - Query processing and optimization

Advantages of Using the DBMS Approach (2/3)
- Providing backup and recovery
  - Backup and recovery subsystem of the DBMS is responsible for recovery
- Providing multiple user interfaces
  - Graphical user interfaces (GUIs)
- Representing complex relationships among data
  - May include numerous varieties of data that are interrelated in many ways
- Enforcing integrity constraints
  - Referential integrity constraint
    - Every section record must be related to a course record
  - Key or uniqueness constraint
    - Every course record must have a unique value for Course_number
  - Business rules
  - Inherent rules of the data model

Advantages of Using the DBMS Approach (3/3)
- Permitting inferencing and actions using rules
  - Deductive database systems
    - Provide capabilities for defining deduction rules
    - Inferencing new information from the stored database facts
  - Trigger
    - Rule activated by updates to the table
  - Stored procedures
    - More involved procedures to enforce rules
- Additional implications of using the database approach
  - Reduced application development time
  - Flexibility
  - Availability of up-to-date information
  - Economies of scale

A Brief History of Database Applications (1/2)
- Early database applications using hierarchical and network systems
  - Large numbers of records of similar structure
- Providing data abstraction and application flexibility with relational databases
  - Separates physical storage of data from its conceptual representation
  - Provides a mathematical foundation for data representation and querying
- Object-oriented applications and the need for more complex databases
  - Used in specialized applications: engineering design, multimedia publishing, and manufacturing systems

A Brief History of Database Applications (2/2)
- Interchanging data on the Web for e-commerce using XML
  - Extensible Markup Language (XML): primary standard for interchanging data among various types of databases and Web pages
- Extending database capabilities for new applications
  - Extensions to better support specialized requirements for applications
  - Enterprise resource planning (ERP)
  - Customer relationship management (CRM)
- Databases versus information retrieval
  - Information retrieval (IR)
    - Deals with books, manuscripts, and various forms of library-based articles

When Not to Use a DBMS
- More desirable to use regular files for:
  - Simple, well-defined database applications not expected to change at all
  - Stringent, real-time requirements that may not be met because of DBMS overhead
  - Embedded systems with limited storage capacity
  - No multiple-user access to data

Summary
- Database
  - Collection of related data (recorded facts)
- DBMS
  - Generalized software package for implementing and maintaining a computerized database
- Several categories of database users
- Database applications have evolved
  - Current trends: IR, Web
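
As a closing illustration for this section, the referential integrity constraint, the key constraint, and the atomicity property described earlier can be sketched with Python's built-in sqlite3 module. This is only a minimal sketch under stated assumptions: SQLite stands in for a full multiuser DBMS, `Course_number` is the only column name taken from the slides, and the remaining column names, the course number 'CS3380', and the helper function names are hypothetical.

```python
import sqlite3


def build_course_db():
    """Create COURSE and SECTION tables with key and referential constraints."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE COURSE ("
        " Course_number TEXT PRIMARY KEY,"  # key (uniqueness) constraint
        " Course_name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE SECTION ("
        " Section_identifier INTEGER PRIMARY KEY,"
        # Referential integrity: every section must point at an existing course.
        " Course_number TEXT NOT NULL REFERENCES COURSE(Course_number),"
        " Semester TEXT,"
        " Year INTEGER)"
    )
    conn.execute("INSERT INTO COURSE VALUES ('CS3380', 'Database')")
    conn.commit()
    return conn


def add_course(conn, number, name):
    """Insert one course; return False if the key constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute("INSERT INTO COURSE VALUES (?, ?)", (number, name))
        return True
    except sqlite3.IntegrityError:
        return False


def add_sections(conn, rows):
    """Insert several sections as one transaction: all of them or none."""
    try:
        with conn:
            conn.executemany("INSERT INTO SECTION VALUES (?, ?, ?, ?)", rows)
        return True
    except sqlite3.IntegrityError:
        return False


def count_sections(conn):
    return conn.execute("SELECT COUNT(*) FROM SECTION").fetchone()[0]


if __name__ == "__main__":
    db = build_course_db()
    print(add_sections(db, [(1, "CS3380", "Fall", 2008)]))
    # The second row refers to a course that does not exist, so the whole
    # batch is rolled back, including its valid first row.
    print(add_sections(db, [(2, "CS3380", "Spring", 2009),
                            (3, "NO_SUCH", "Spring", 2009)]))
    print(count_sections(db))
```

The design choice worth noting is that the rules live in the schema rather than in each application program, so every program that touches the data is held to the same constraints.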
Section Outline
- Data Models, Schemas, and Instances
- Three-Schema Architecture and Data Independence
- Database Languages and Interfaces
- The Database System Environment
- Centralized and Client/Server Architectures for DBMSs
- Classification of Database Management Systems

Database System Concepts and Architecture
- Basic client/server DBMS architecture
  - Client module
  - Server module

Data Models, Schemas, and Instances
- Data abstraction
  - Suppression of details of data organization and storage
  - Highlighting of the essential features for an improved understanding of data
- Data model
  - Collection of concepts that describe the structure of a database
  - Provides means to achieve data abstraction
- Basic operations
  - Specify retrievals and updates on the database
- Dynamic aspect or behavior of a database application
  - Allows the database designer to specify a set of valid operations allowed on database objects

Categories of Data Models (1/2)
- High-level or conceptual data models
  - Close to the way many users perceive data
- Low-level or physical data models
  - Describe the details of how data is stored on computer storage media
- Representational data models
  - Easily understood by end users
  - Also similar to how data is organized in computer storage
- Entity
  - Represents a real-world object or concept
- Attribute
  - Represents some property of interest
  - Further describes an entity
- Relationship among two or more entities
  - Represents an association among the entities
- Entity-Relationship model

Categories of Data Models (2/2)
- Relational data model
  - Used most frequently in traditional commercial DBMSs
- Object data model
  - New family of higher-level implementation data models
  - Closer to conceptual data models
- Physical data models
  - Describe how data is stored as files in the computer
  - Access path
    - Structure that makes the search for particular database records efficient
  - Index
    - Example of an access path
    - Allows direct access to data using an index term or a keyword

Schemas, Instances, and Database State (1/2)
- Database schema
  - Description of a database
- Schema diagram
  - Displays selected aspects of schema
- Schema construct
  - Each object in the schema
- Database state or snapshot
  - Data in database at a particular moment in time

Schemas, Instances, and Database State (2/2)
- Define a new database
  - Specify database schema to the DBMS
- Initial state
  - Populated or loaded with the initial data
- Valid state
  - Satisfies the structure and constraints specified in the schema
- Schema evolution
  - Changes applied to schema as application requirements change

Three-Schema Architecture and Data Independence
- Internal level
  - Describes physical storage structure of the database
- Conceptual level
  - Describes structure of the whole database for a community of users
- External or view level
  - Describes part of the database that a particular user group is interested in

Data Independence
- Capacity to change the schema at one level of a database system
  - Without having to change the schema at the next higher level
- Types:
  - Logical
  - Physical

DBMS Languages
- Data definition language (DDL)
  - Defines both schemas
- Storage definition language (SDL)
  - Specifies the internal schema
- View definition language (VDL)
  - Specifies user views/mappings to conceptual schema
- Data manipulation language (DML)
  - Allows retrieval, insertion, deletion, modification
  - High-level or nonprocedural DML
    - Can be used on its own to specify complex database operations concisely
    - Set-at-a-time or set-oriented
  - Low-level or procedural DML
    - Must be embedded in a general-purpose programming language
    - Record-at-a-time

DBMS Interfaces
- Menu-based interfaces for Web clients or browsing
- Forms-based interfaces
- Graphical user interfaces
- Natural language interfaces
- Speech input and output
- Interfaces for parametric users
- Interfaces for the DBA

The Database System Environment
- DBMS component modules
  - Buffer management
  - Stored data manager
  - DDL compiler
  - Interactive query interface
    - Query compiler
    - Query optimizer
  - Pre-compiler
  - Runtime database processor
  - System catalog
  - Concurrency control system
  - Backup and recovery system

Component Modules of a DBMS

Database System Utilities
- Loading
  - Load existing data files
- Backup
  - Creates a backup copy of the database
- Database storage reorganization
  - Reorganize a set of database files into different file organizations
- Performance monitoring
  - Monitors database usage and provides statistics to the DBA

Tools, Application Environments, and Communications Facilities
- CASE Tools
- Data dictionary (data repository) system
  - Stores design decisions, usage standards, application program descriptions, and user information
- Application development environments
- Communications software

Centralized and Client/Server Architectures for DBMSs
- Centralized DBMS architecture
  - All DBMS functionality, application program execution, and user interface processing carried out on one machine

Basic Client/Server Architectures (1/2)
- Servers with specific functionalities
  - File server
    - Maintains the files of the client machines
  - Printer server
    - Connected to various printers; all print requests by the clients are forwarded to this machine
  - Web servers or e-mail servers
- Client machines
  - Provide user with:
    - Appropriate interfaces to utilize these servers
    - Local processing power to run local applications

Basic Client/Server Architectures (2/2)
- Client
  - User machine that provides user interface capabilities and local processing
- Server
  - System containing both hardware and software
  - Provides services to the client machines
    - Such as file access, printing, archiving, or database access

Sample Two-Tier Client/Server Architecture

Two-Tier Client/Server Architectures for DBMSs
- Server handles
  - Query and transaction functionality related to SQL processing
- Client handles
  - User interface programs and application programs
- Open Database Connectivity (ODBC)
  - Provides application programming interface (API)
  - Allows client-side programs to call the DBMS
    - Both client and server machines must have the necessary software installed
- JDBC
  - Allows Java client programs to access one or more DBMSs through a standard interface

Application Server or Web Server Three-Tier and n-Tier Architectures for Web Applications
- Application server or Web server
  - Adds intermediate layer between client and the database server
  - Runs application programs and stores business rules
- N-tier
  - Divide the layers between the user and the stored data further into finer components

Sample Logical Three-Tier Client/Server

Classification of Database Management Systems
- Data model
  - Relational
  - Object
  - Hierarchical and network (legacy)
  - Native XML DBMS
- Number of users
  - Single-user
  - Multiuser
- Number of sites
  - Centralized
  - Distributed
    - Homogeneous
    - Heterogeneous
- Cost
  - Open source
  - Different types of licensing

Classification of Database Management Systems
- Types of access path options
- General or special-purpose

Section Summary
- Concepts used in database systems
- Main categories of data models
- Types of languages supported by DBMSs
- Interfaces provided by the DBMS
- DBMS classification criteria:
  - Data model, number of users, number of sites, access paths, cost

Course Assignments
- Individual Assignments
  - Reports based on case studies / class presentations
  - Textbook problem sets
- Project-Related Assignments
  - All assignments (other than the individual assessments) will correspond to milestones in the course project

Assignments & Readings
- Readings
  - Slides and Handouts posted on the course web site
  - Textbook: Chapters 1 & 2

Next Session: Relational Data Model and Relational Database Constraints

Any Questions?
