Advanced Database System Reviewer for IT208 PDF

Summary

This document is a review of advanced database systems. It covers fundamental concepts, definitions, and operations related to data, information, databases, and database management systems (DBMS).

Full Transcript

Information Management Database: A database is a collection of interrelated data files or tables Midterm Reviewer Lesson 2: File Sys...

Information Management Database: A database is a collection of interrelated data files or tables Midterm Reviewer Lesson 2: File System and Database System Unit 1: Fundamentals of Database Manual File Management and File System Lesson 1: Basic Concepts and Definitions The earliest form of organizing files is Data with a filing cabinet where vital records were usually kept and secured with a Data (plural for datum) is referred to as lock. any raw facts or concepts The file system, as depicted in the figure, Information stored data in the form of files. Once data that have been processed so DBMS Approach as the meaning would be consistent and As organization data processing be vividly understood by those who use requirements become more complex, it them, they are turned into information becomes evident that traditional file Database systems had several shortcomings and limitations collection of interrelated data and The DBMS approach emphasizes information. integrating and sharing data throughout Metadata the organization or at least across major segments of the organization. The data about data. Information that describes the characteristics or Characteristics of DBMS Approach properties of the database's data. DBMS is required: BMS is an DBMS (Database Management System) intermediary between database users and database. A database management system Self-describing: The DBMS system (DBMS) is a software package that contains the actual data and the allows users to design, create, access, metadata, which are descriptions of the maintain, manipulate, and retrieve data data structure and constraints. and information Program-data independence: Database Operations Program-data independence refers to the non-dependency of data descriptions Insert on application programs. Retrieve Support multiple data views: DBMS Modify system controls database access Delete through a view, a subset of the database. Hierarchy of Data Multiuser system: DBMS allows multiple users to have concurrent access Character: A character, equivalent to a to the database at the same time. byte, can be a letter, a number, special symbol, punctuation mark, or space. Advantages of DBMS Approach over File Field: A file is usually a combination of System one or more related characters identified Lower data redundancy: There is a high by a field name. possibility that several departments Record: A record is equivalent to one row maintain the same file, as users need of a file/table or only a related field group. them, such as in the case of the student Data File: A data file is a collection of master file held both in the MIS and interrelated records. In layman’s term, it Accounting departments in our file merely refers to a table of values. system example. Better data consistency and Internet Databases – Internet is a public reliability: Inconsistent data in the file network accessed by hundreds of system is possible because of data millions of people worldwide through a redundancy. Web browser interface. Improved data sharing: In the DBMS Types of Databases approach, the database is designed as a shared corporate resource. Single-user Database Allows data concurrency: DBMS Multiuser Database approach enables several users to Database Types According to Database access the same file, which is not Location permitted in file systems as only one user can access data at a time. Centralized databases – A centralized Enhanced data searching: The file database supports data located and system needs different application maintained at a single location, usually programs in every search operation called the central computer, mainframe performed. computer, or main computer. It is Higher data integrity: The file system generally accessed using an internet specifies constraints in the application connection or through LAN or WAN. program that will use the data. Distributed databases – A distributed Advanced data security: The DBMS database is a database that supports approach has specialized features and data distributed across several different more security mechanisms than the file sites such as in several computers in a system that usually uses password particular place or geographically spread mechanisms. in distant places but appears as a single Permit backup and recovery of data: database to users. DBMS approach has a specialized Database Types According to the Use mechanism that provides backup and recovery of data even if it is lost. Operational databases – day-to-day operations of the organization. Lesson 3: Database Application and Data Warehouse – usually hold big data Database Types and store information from various data Database Application sources Enrolment System Lesson 4: The Database Environment & Library System Users Telecom System The Database System Environment Banking System Payroll System group of system components that define Hotel Reservation System and control the collection, storage, Airlines Reservation System management, and data use Accounting System Hardware Range of Databases Application all tangible or physical devices Personal Databases – Personal Software databases support only one user at a time. intangible elements that help control and Multitier Client/Server Databases – manage the database and its contents+ Multitier databases support a large o DBMS number of users. o Operating System (OS) o Application Programs Enterprise Database – An enterprise database is a database whose scope is People the entire organization or the majority of its departments. all who control, manage, and use its contents. o System Administrators The E-R Model o Database Administrators The entity-relationship model is yielded a o Database Designers graphical representation of entities and o System Analysts and their relationships in a database Application Developers structure. o End Users Logical representation of a data for an Procedures organization or for a business area. Set of rules or instructions that govern Symbols Used in Entity-Relationship DBMS, design, and implementation of Diagram the database and the database system Strong (rectangle) Data Weak (double lined rectangle) The data refers to the entire information Associative (rectangle with round edges) resource, both the actual or operational Components in Entity-Relationship Diagram data and the metadata stored in its (ERD) database. Entity type The Database Users o Person people who are directly and indirectly o Place involved in database use such as IT o Events specialists who are involved in the o Objects design, use, and maintenance of a o Concept database or anyone who simply request Attributes – are characteristics of an information in the database. entity type that is of interest to the organization. Attribute names are singular nouns. Ovals or ellipses are Unit 2: Modeling Data in the Organization used to represent attribute Relationship – holds together various LESSON 1: Business Rules and the Entity components of the E-R diagram. Relationship Model Constructs Business Rules statement that defines or constraints some aspect of the business organization. Structural Assertion statement that something of importance to the business either exists as a concept of interest or exists in relationship to another thing of interest. Action Assertion statement that concerns some dynamic aspect of the business. Derivation Unit 3: The Enhanced Entity Relationship statement taken from other knowledge in Model the business. Lesson 1: The Supertype/Subtype Notation Remember! Enhanced Entity Relationship Model A business rule follows a certain grammar The enhanced entity-relationship model LESSON 2: METHODS IN DEVELOPING (EERD), otherwise known as the SUPERTYPE/SUBTYPE RELATIONSHIPS extended entity-relationship model. GENERALIZATION The supertype is a generic type that has Methods in Developing Supertype/Subtype at least two subtypes. A subtype is a Relationship subgrouping of the entities that are valuable to the organization. The The generalization and specialization subtypes share common attributes, but are the methods we can use to develop each subtype possesses attributes or the supertype/subtype relationships. relationships distinct from other Generalization subgroupings. The supertype/subtype notation depicts Generalization is a process of the parent-child relationship. In the identifying the common attributes from a context of entity-relationship diagraming, set of entities and creating a generalized the supertype contains the shared entity. Two or more entity types can be attributes, while the subtype includes generalized if there are attributes unique attributes. common to them; that is why the generalization process is considered a bottom-up approach. Basic Notation of Supertype / Subtypes We illustrate a supertype/subtype relationship by drawing a line leading from the supertype to the circle. Relationships and Subtypes Specialization Specialization is a process of dividing an entity type into two or more sub- entities due to two primary reasons. First, some attributes apply to some, but not to all instances of the entity type. Second, some relationships are participated in by some, but not to all instances of the entity type. Specialization thus is considered a top-down process. Partial Completeness Rule also known as partial specialization rule, specifies that the occurrence of an entity in the supertype is not required to be a member or part of the subtype identified under it. Differences between Generalization and Specialization Disjointness Constraints – whether an instance of a supertype is allowed to be a member of more than one subtype simultaneously Disjoint Rule – does not allow overlapping subtypes; thus, it specifies that an instance of the supertype may only exist as one of the subtypes. A small Lesson 3: Constraints in letter "d" is placed inside the circle to Supertype/Subtype Relationships denote that the constraint exists is Constraints Supertype/Subtype disjoint. Relationship The completeness and disjointness constraints are among the most vital constraint we will discuss in this section. Completeness Constraints – whether an instance of a supertype is required to be a member of at least one subtype. Total Completeness Rule – total specialization rule, specifies that each occurrence of an entity in the supertype Overlap Rule – allows the overlapping must be a member of a subtype or more. subtypes; thus, it specifies that an instance of the supertype may be part of Unit 4: The Relational Database Model more than one subtypes simultaneously. Lesson 1: Logical Database Design A small letter “o" is placed inside the circle to denote that the constraint exists Logical Database Design is overlap. a process of transforming the conceptual data model (E-R model) into a logical data model that emphasizes the relational data model. 1. Data Structure. Data are organized in form of tables with rows and columns. 2. Data Manipulation. Powerful operations are used to manipulate data stored in the relations. 3. Data Integrity. Data validation Lesson 4: Subtype Discriminators ensured that the data will be used is valid and determines the quality of the Subtype Discriminator for Disjoint Rule generated information a simple attribute with alternative values Relational Data Structure. A relation is to indicate the possible subtypes. named a two-dimensional table of data. Each relation consists of a set of named columns and an arbitrary number of unnamed rows Relational Keys Primary Key. It consists of one or more columns whose data contained within are used to uniquely identify each row in the table Foreign Key. It is a set of one or more columns in a table that refers to the primary key in another table Composite Key. The composite key is a primary key that consists of more than Subtype Discriminator for Overlapping Rule one column and that can be used to a composite attribute with component uniquely identify each row in the table. parts that indicate the possible subtypes Lesson 2: Integrity Constraints an instance of the supertype belong. Types of Integrity Constraints Domain Constraints – A domain is the set of values that should be associated with the column Entity Integrity – All primary key values are unique, and no part of the primary key may be null. Referential Integrity – A referential integrity constraint requires that a foreign key must have a matching primary key or it must be null. Lesson 3: Transforming ERD/EERD to relationship that is on the many-side of the Relations relationship (Hoffer, Ramesh, & Topi, 2016). Step 1: Map Regular Entity – For each regular entity type, create a relation that includes all the simple columns. Each simple column of the ER diagram becomes the column of the relation and the identifier becomes the primary key of the relation (Elmasri, & Navathe, 2016). Step 2: Map Weak Entity – For the weak entity Map Binary Many-to-Many type, create a new relation and include all of the Relationships – Binary many-to-many (M: N) simple columns of this relation. Then include the relationships can be viewed as a special case of primary key of the identifying owner as a foreign a one-to-many relationship. In mapping, such a key column in this relation. relationship to relations requires two steps, first create a relation for each of the two entity types participating in the relationship, and then create a third relation. Include as foreign key columns in the third relation the primary keys of the two participating relations. These will also be considered its composite key columns. Also, all columns connected with the M: N relationship is considered columns of the third relation(Hoffer, Ramesh, & Topi, 2016). Step 3: Map Binary Relationship – The procedure for representing relationships depends on both the degree of relationships and the cardinalities of the relationships. Binary relationships are those, which are established two entity types. Map Binary One-to-Many Relationships – For each binary 1:M relationship, first, create a relation for each of the two entity types participating in the relationship, use the procedure described in Step 1. Next, incorporate the primary key column (or columns) of the entity on the one side of the relationship as a foreign key in the Map Binary One-to-One Relationships – In mapping, such a binary one-to-one relationship into relations requires two steps. In the first step, two relations are created and secondly, the primary key of one of the relations is included as a foreign key in the other relation. In a 1:1 relationship, the association in one direction is nearly always optional, while the association in the other direction is a mandatory one. The rule is to include a foreign key on the optional side of the relationship the primary key of the mandatory participation in the 1:1 relationship. This approach will avoid the need for null values in the foreign key column. Any columns associated with the relationship itself have also included the foreign key in the same relation (Hoffer, Ramesh, & Topi, 2016). Examples are shown in figures 4.20 and 4.21. Identifier Assigned – If a surrogate identifier is assigned to the associative entity type, still an associative relation is created. However, the primary key for this relation is the surrogate identifier and the primary keys of the two participating entity types are then included as its foreign keys (Hoffer, Ramesh, & Topi, 2016). Example shows in figure 4.23. Step 4: Map Associative Entities – Many-to- many relationships may be chosen to be modeled as associative entities in the E-R diagram. Mapping the associative entity involves essentially the same procedure as what we do in mapping an M: N relationship. The first step is to create two relations one for each of the two participating entity types, then creates a third relation for the associative entity. Step 5: Map Unary Relationships – The unary The second step depends on whether or not an relationship is a relationship between the identifier is assigned to the associative entity on instances of a single entity type and it is also the E-R diagram. called recursive relationships. The two Identifier Not Assigned – If an identifier important cases of unary relationships are one- was not assigned, the steps followed are just the to-many and many-to-many. same as the steps followed in binary M: N Unary One-to-Many Relationships – relationship. As a review, the default primary key The primary key of the same entity type is used for the associative relation consists of the two as a foreign key in the same relation and with a primary key columns from the other two different name since the same column name relations. These columns are the foreign keys cannot be used in the same table (Hoffer, that reference from the other two relations Ramesh, & Topi, 2016). Examples are shown in (Hoffer, Ramesh, & Topi, 2016). Examples are figures 4.24 and 4.25. shown in figures 4.21 and 4.22. To map an associative entity type that connects the three regular entity types creates a new associative relation. The associative entity has foreign keys to each entity type. These columns act in the role of foreign keys that reference the individual primary keys of the participating entity types (Hoffer, Ramesh, & Topi, 2016). Examples are shown in figures 4.28 and 4.29. Unary Many-to-Many Relationship – With this type of relationship, two relations are created: one to represent the entity type in the relationship and the other one the associative relation to representing the M: N relationship itself. The primary key of the associative relation is composed of two columns. These columns both take their values from the primary key of the other relation which need not have the same name. Any non-key column is included in the associative relation (Hoffer, Ramesh, & Topi, 2016). Examples are shown in figures 4.26 and 4.27. Step 7: Map Supertype/Subtype Relationships – Strategies to represent these in the relational model: 1. Create a separate relation for the supertype and each of the subtypes. 2. Assign to the relation to the columns that are common to all members of the supertype, including the primary key 3. Assign to the relation for each subtype the primary key of the supertype, only those columns that are unique to that subtype. 4. Supertype columns include the identifier and Step 6: Map Ternary Relationships – The subtype discriminator go to in the supertype ternary relationship is a relationship type that relation. involves many to many relationships between three entity types. It is recommended that you convert a ternary relationship to an associative entity to represent participation constraints more accurately. 2. Simplify the enforcement of referential integrity constraints. 3. Make it easier to maintain the data (insert, update, and delete). 4. Provide a better design that is to improved representation of the real world and a stronger basis for future growth. Normalization works through a series of steps called normal forms, is used to eliminate or reduce redundancy in database tables. Normal Form – It is a state of relations that requires certain rules regarding relationships between columns (functional dependencies) are satisfied (Hoffer, Ramesh, & Topi, 2016). First Normal Form. 1NF Rule – Removes all Lesson 4: Normalization multi-valued columns: Multi-valued columns are Well Structured Relations – For a relation to also called repeating groups, which refer to a be well-structured, it should have minimum group of columns in a record that occurs several redundancy and as such should allow a user to times in each record. As per the rule of the first modify, insert, or delete a row without creating normal form, a column of a table cannot hold inconsistencies or errors. Well-structured multiple values. It should hold only a single relations are important in logical database value. design because they increase database integrity. It ensures there is no duplication or omission of data when creating a database (Luke, & Crowther, 2013). The errors or inconsistencies that may result when a user attempts to update a relation that contains redundant data are referred to as anomalies. Types of anomalies: 1. Insert / Insertion Anomaly 2. Delete/Deletion Anomaly 3. Update/Modification Anomaly NORMALIZATION – It is a process of evaluating and correcting relation structures to minimize data redundancies, thereby reducing the likelihood of data anomalies (Coronel, Morris, & Rob, 2011). Main goals of normalization: 1. Minimize data redundancy, avoiding anomalies, and conserving storage Second Normal Form. 2NF Rule – Removes space. all partial functional dependencies: The term functional dependencies refer to the relationship between the columns in a table. A relation that is in First Normal Form and every non-primary key column is fully dependent on the whole set of a primary key. Analyzing the result of 1NF, we will notice the SubjectDescription depends on the primary key SubjectCode (which is only part of the composite primary) in the Grade table. Thus, partial functional dependency exists. To remove the partial functional dependency, Third Normal Form. 3NF Rule – Removes all transitive dependencies: Transitive dependencies are the dependencies between non-key columns. If a transitive dependency exists, we remove the transitively dependent column from the relation by putting the column in a new relation.

Use Quizgecko on...
Browser
Browser