Summary

This document provides an introduction to database systems. It covers topics such as database applications, data models, and relational keys. The document is likely instructional material for university students at Xi'an Jiaotong-Liverpool University.

Full Transcript

Introduction to Database Welcome to CPT103! Introduction 1 Table of Contents Module information Introduction to database Relational Model Relational Keys Introduction 2 Module Information About the m...

Introduction to Database Welcome to CPT103! Introduction 1 Table of Contents Module information Introduction to database Relational Model Relational Keys Introduction 2 Module Information About the module instructor, teaching organisation Introduction 3 Teaching Organisation Module instructors: Jianjun Chen ([email protected]) SD541 Jun Qi ([email protected]) SD461 Yu Liu ([email protected]) SD465 Office hours: please check the module page online Textbook: “Database Systems: a practical approach to design, implementation, and management” by Connolly, Thomas M., Begg, Carolyn E. Introduction 4 Teaching Organisation Support: Discussions with classmates enable you to better understand concepts and terminologies. I will also send notifications to you about any updates. Please check emails frequently. Lab sessions are supported by TA and module instructors. Assessments: 2* 15% online quiz and 70 % closed-book exam. Quiz time will be announced in lectures and will be sent to you via notification. Exam time can be found from the university timetable (end of semester). Introduction 5 What Do You Need? Lectures: A piece of note that covers all important knowledge. Laptop or tablets: Some of the questions are on the LearningMall. A friend that you can discuss with during breaks. Labs: Someone to discuss with. Questions you want to ask to me. Assessments: Revise contents weekly. Do not check slides only at the last moment! Introduction 6 Introduction to Database Systems What is a database? What is a database management system? Introduction 7 What is Data? Example 1: Data could be a docx file storing your project status report. Can be opened using Word. Example 2: Data could be a spreadsheet containing patients’ information. Can be opened using Excel Introduction 8 What is Data? Data is only meaningful under its designed scenario. In example 1: Tracking and reporting project status. In example 2: Storing patient information for retreival later. Data must have the means to be created/modified. Data must have the means to be accessed. Introduction 9 What is Database? Database: “Organised collection of data. Structured, arranged for ease and speed of search and retrieval.” Database Management System (DBMS): Software that is designed to enable users and programs to store, retrieve and update data from database. Introduction 10 Database Applications Mobile Payment: account balance, transaction information are stored in databases. Hotel booking website: guest information, booking history, hotel information etc.. Introduction 11 Database Applications Online game data storage: account names, skins and equipment information of gamers are stored in databases. Healthcare: patient information is stored in databases. Introduction 12 Question Can you identify other information that could be stored in database? Introduction 13 Components of the DBMS Environment Hardware: the computer where the DBMS runs on. Can be a personal computer, a mainframe, or a network of computers. Software: comprises the DBMS itself, application programs (such as Alipay) and the operating system. Data: comprises both operational data and metadata. Operational data: e.g., game account information of a game company. Metadata: also know as “data about data”. Procedures: the instructions and rules that govern the design and use of the database. Such as “how to backup database” People: database designers, end users, application developers and database administrators. Introduction 14 DBMS Functions / Must Haves 1. Allow users to store, retrieve and update data. 2. A “system catalog” that holds hold data about the schemas, users, applications, and so on. (Also called metadata) 3. Ensure either that all the updates corresponding to a given action are made or that none of them is made (Atomicity). 4. Ensure that DB is updated correctly when multiple users are updating it concurrently. 5. Recover the DB in the event it is damaged in any way. 6. Ensure that only authorised users can access the DB. 7. DBMS should be accessible from remote computers over a network. 8. Be capable of integrating with other software. *all underlined points will be covered (more or less) in this module. Introduction 15 Data Models “Organised collection of data. Structured, arranged for ease and speed of search and retrieval.” The structure may follow different models. Relational Data Model Hierarchical Data Model Network Data Model Introduction 16 Data Models “Organised collection of data. Structured, arranged for ease and speed of search and retrieval.” The structure may follow different models. Relational Data Model Hierarchical Data Model Network Data Model Introduction 17 Data Models “Organised collection of data. Structured, arranged for ease and speed of search and retrieval.” The structure may follow different models. Relational Data Model Hierarchical Data Model Network Data Model Introduction 18 Other Data Models Other data models: Entity-Relationship model: No databases directly uses this model. It is now commonly used for teaching students the basics of database structure. Graph model: uses graph structures for semantic queries with nodes, edges Document model: stores documents in the form of JSON, XML etc.. Documents are fetched using keys. Object-Oriented: information is presented as objects as used in object-oriented programming. Introduction 19 Optional Reading For additional contents, you may read Chapters 1 ~ 3. Some contents can be hard to fully understand now. Introduction 20 Relational Model Terminologies, properties Introduction 21 The Relational Model The relational model is one approach to managing data. Originally Introduced by E.F. Codd in his paper “A Relational Model of Data for Large Shared Databanks”, 1970. The model uses a structure and language that is consistent with first-order predicate logic Provides a declarative method for specifying data and queries Details are covered in the Chapter 4~5 of the textbook. Introduction 22 Terminologies A relation is a mathematical concept. The physical form of a relation is a table with columns and rows. An attribute is a named column of a relation. A domain is the set of allowable values for attributes. Introduction 23 Terminologies Tuple: a tuple is a row of a relation. Mathematically, the order of tuples does not matter. The degree of a relation is the number of attributes it contains. Cardinality: the number of tuples in a relation. Introduction 24 Alternative Terminologies Formal Terms Alternative #1 Alternative #2 Relation Table File Tuple Row Record Attribute Column Field Introduction 25 Question Can you point out what the previous terminologies refer to in this table? Staff ID Name Salary Department M139 John Smith 18000 Marketing M140 Mary Jones 22000 Marketing A368 Jane Brown 22000 Accounts P222 Mark Brown 24000 Personnel A367 David Jones 20000 Accounts Introduction 26 Question Attributes : ID, Name, Salary & Department The degree of the relation is 4 Staff ID Name Salary Department M139 John Smith 18000 Marketing M140 Mary Jones 22000 Marketing A368 Jane Brown 22000 Accounts Tuples P222 Mark Brown 24000 Personnel A367 David Jones 20000 Accounts The cardinality of the relation is 5 Introduction 27 Additional Properties of Relations Relation’s name is unique in the relational database schema. Each cell contains exactly one atomic value. Each attribute of a relation must have a distinct name. The values of an attribute are from the same domain. The order of attributes has no significance. The order of tuples has no significance. No duplicate tuples. Introduction 28 Relation and Tuples One relation stands for one class/type of objects. Each tuple is an instance of that class/type. In the example below, each tuple of staff refers to a real staff. Each tuple has a value in ID, Name, Salary and Department. Staff ID Name Salary Department M139 John Smith 18000 Marketing M140 Mary Jones 22000 Marketing A368 Jane Brown 22000 Accounts P222 Mark Brown 24000 Personnel A367 David Jones 20000 Accounts Introduction 29 Relational Keys Super key, candidate key, primary key, foreign key. Introduction 30 Relation Keys No duplicate tuples are allowed within a relation. Thus, we need to be able to identify one or more attributes (called relational keys) that uniquely identifies each tuple in a relation. Examples: Staff ID Name Age 1 Jason 22 Office Building Number Room Room Size Has Printer 11 301 96 True Introduction 31 “Uniquely Identifies” Given two staff members in a company. Their names could be the same. Their ages could be the same. But their ID numbers are always different. The ID is unique to the person. Staff ID Name Age 1 Jason 22 2 John 31 3 Joson 31 Introduction 32 “Uniquely Identifies” Given two office rooms in a company. Their building number could be the same. Their room numbers could be the same. Rooms with the same room size are extremely common. But the combination of “Building number + Room” is unique. Office Building Number Room Room Size Has Printer 11 301 96 True 11 502 96 False 12 301 32 False 12 502 32 True Introduction 33 Questions Which of the following attribute combinations uniquely identifies offices? 1. (Building Number, Room Size) 2. (Room, Room Size) 3. (Building Number, Room, Has Printer) 4. (Building Number, Room, Room Size) 5. (Building Number, Room, Room Size, Has Printer) Office Building Number Room Room Size Has Printer 11 301 96 True 11 502 96 False 12 301 32 False 12 502 32 True Introduction 34 Relation Keys Superkey: An attribute, or set of attributes, that uniquely identifies a tuple within a relation. A superkey may contain additional attributes that are not necessary for unique identification. Candidate key: A superkey such that no proper subset is a superkey within the relation. There may be several candidate keys for a relation. Primary key: The candidate key that is selected to identify tuples uniquely within the relation. The candidate keys that are not selected to be the primary key are called alternate keys. A relation can only have one primary key. When a key consists of more than one attribute, we call it a composite key. Introduction 35 Relation Keys Super keys ` Candidate keys Primary key Introduction 36 Question Identify the type of relation key of the following items for table ‘teaching_staff’: superkey 1. (staff_id, email) 2. (staff_id) candidate key 3. (passport_id) 4. (passport_id, fname) primary key 5. (staff_id, passport_id) 6. (email, lname) alternate key 7. (fname, lname) composite key teaching_staff staff_id email fname lname passport_id 1 [email protected] Steven Guan P123456 2 [email protected] John Woodward U543121 3 [email protected] Nathan Tubb U998877 Introduction 37 Answer 1. (staff_id, email) superkey, composite key 2. (staff_id) superkey, candidate key, primary key 3. (passport_id) superkey, candidate key, alternate key 4. (passport_id, fname) superkey, composite key 5. (staff_id, passport_id) superkey, composite key 6. (email, lname) superkey, composite key 7. (fname, lname) ---none--- teaching_staff staff_id email fname lname passport_id 1 [email protected] Steven Guan P123456 2 [email protected] John Woodward U543121 3 [email protected] Nathan Tubb U998877 Introduction 38 Finding Candidate Key You can’t necessarily infer the candidate keys based solely on the data in your table. More often than not, an instance of a relation will only hold a small subset of all the possible values. E.g. Restaurants’ booking number might reset to 1 after a large number. Queue No. A31 Table size: up to 4 A1, A2, A3… A99, A999 -> A1 31 People are waiting ahead of you. Queue queue_no table_size phone_no ticket_time A1 4 123321 2021-9-10 12:00:03 A2 4 321123 2021-9-10 12:15:06 B1 8 789987 2021-9-10 12:16:09 Introduction 39 Finding Candidate Key What is a suitable candidate key for the Queue table then? Queue No. A31 Table size: up to 4 A1, A2, A3… A99, A999 -> A1 31 People are waiting ahead of you. Queue queue_no table_size phone_no ticket_time A1 4 123321 2021-9-10 12:00:03 A2 4 321123 2021-9-10 12:15:06 B1 8 789987 2021-9-10 12:16:09 Introduction 40 Relation Keys: Foreign Key It is also very common that tuples in one relation references data from another relation. As a result, a database should provide such mechanism to ensure correct references. This is enforced by something called foreign key Introduction 41 Foreign Key Foreign key: One or more attributes within one relation that must match the candidate key of some (possibly the same) relation. Example: We want the values of the ‘branchNo’ in relation staff to be one of the ‘branchNo’ in relation Branch. Introduction 42

Use Quizgecko on...
Browser
Browser