CQ1 6INFOMAN LECTURE NOTES REVIEWER.pdf
Document Details
Uploaded by RefinedLimerick
Harvard University
Full Transcript
Q1 6INFOMAN LECTURE NOTES REVIEWER DAYRIT, Jann Ericka D. | 1Y1S-C102 LC1: DATABASE SYSTEMS TYPES OF DATA DATA VS. INFORMATION DATABASE SYSTEMS DATA consists of raw facts facts that have not yet been processed to reveal their meaning...
Q1 6INFOMAN LECTURE NOTES REVIEWER DAYRIT, Jann Ericka D. | 1Y1S-C102 LC1: DATABASE SYSTEMS TYPES OF DATA DATA VS. INFORMATION DATABASE SYSTEMS DATA consists of raw facts facts that have not yet been processed to reveal their meaning constitutes the building blocks of information w/o it, there’s no information can stand alone INFORMATION result of processing raw data to reveal its meaning produced by processing data Information is used to reveal the meaning of data ★Accurate, relevant, and timely information is the key to good decision-making. ★Good decision making is the key to organizational survival in a global environment. A.QUALITATIVE DATA describes the data that fits into the categories data that are not numerical ➔Nominal Data ‐ helps to label the variables without providing the numerical value ‐ cannot be ordered and measured ➔Ordinal Data ‐ follows a natural order ‐ significant feature: the difference between the data values is not determined B.QUANTITATIVE DATA data which represents the numerical value (i.e., how much, how often, how many) ➔Discrete Data ‐ can take only discrete values ‐ contains only a finite number of possible values STATES OF DATA ➔Continuous Data A.DATA AT REST ‐ data that can be calculated ‐ has an infinite number of probable values that can be data in computer storage that is not currently being accessed or transferred selected within a given specific range a stock data ★Corporate files, backup data storage, USB drive data, cloud WHY DATABASES? storage, file archives Data is ubiquitous (abundant, global, everywhere) and pervasive (unescapable, prevalent, persistent) B.DATA IN MOTION Databases make data persistent and shareable in a secure data that is moving or being transferred between locations way within or between computer systems Data is not only ubiquitous and pervasive, it is essential for moving from one place to another organizations to survive and prosper ★Email attachments, FTP sites, Wifi and mobile networks, files being downloaded, synced or transferred INTRODUCING THE DATABASE C.DATA IN USE DATABASE data that is currently being updated, processed, accessed shared, integrated computer structure that stores a and read by a system collection of the ffg.: ★Files in Office applications (documents, PDFs, PowerPoints), ➔End-User Data database applications, CPU data, RAM data - raw facts of interest to the end user ➔Metadata - data about data, through which the end-user data is integrated and managed DAYRIT, J.E.D. 001 - describes the data characteristics and the set of crucial aspect of working with databases that most of this relationships that links the data found within the database book is dedicated to the development of good database DATABASE MANAGEMENT SYSTEM (DBMS) design techniques Even a good DBMS will perform poorly with a badly designed collection of programs that manages the database structure database and controls access to the data stored in the database DATABASE EVOLUTION OF FILE SYSTEM DATA PROCESSING MANUAL VS. COMPUTERIZED FILE SYSTEMS resembles a very well-organized electronic filing cabinet in which powerful software (the DBMS) helps manage the A.Manual File System paper-and-pencil systems cabinet’s contents B.Computerized File System ROLES AND ADVANTAGES OF DBMS computer-based systems Improved data sharing BASIC FILE TERMINOLOGIES Improved data security Better data integration A.Data raw facts Minimized data inconsistency Improved data access B.Field character or group of characters that has a special meaning Improved decision making Increased end-user productivity C.Record logically connected set of one or more fields that describes a TYPES OF DATABASES [USERS] person, place, or thing D.File A.SINGLE-USER DATABASE collection of related records desktop database B.WORKGROUP DATABASE PROBLEMS WITH FILE SYSTEM DATA PROCESSING ➔Workgroup Database PROBLEMS WITH FILE SYSTEMS - less than 50 users ➔Enterprise Database Lengthy development times Difficult of getting quick answers - more than 50 users Complex system administration TYPES OF DATABASES [LOCATION] Lack of security and limited data sharing A.Centralized Database Extensive programming located in a single site STRUCTURAL AND DATA DEPENDENCE B.Distributed Database A.Structural Dependence distributed across several sites access to a file is dependent on its structure TYPES OF DATABASES [TECHNOLOGY] B.Structural Independence A.Cloud Database changing the file structure without affecting the applications ability to access the data database that is created and maintained using cloud data services, such as Microsoft Azure or Amazon Web Services C.Data Dependence (AWS) all data access programs are subject to change when any of the file’s data storage characteristics change TYPES OF DATABASES [LOCATION] D.Data Independence A.Operational Database possible to make changes in the data storage characteristics used for running the business on a day-to-day basis without affecting the application program’s ability to access B.Analytical Database the data used for understanding and improving the business through research and analysis DATA REDUNDANCY exists when the same data is stored unnecessarily at different TYPES OF DATABASE places Social media refers to web and mobile technologies that ➔ Poor data security enable “anywhere, anytime, always on” human interactions ➔ Data inconsistency Websites such as Google, Facebook, Twitter, and LinkedIn ➔ Data-entry errors capture vast amounts of data about end users and consumers ➔ Data integrity problems This data grows exponentially and requires the use of specialized database systems DATA ANOMALIES develops when not all of the required changes in the WHY DATABASE DESIGN IS IMPORTANT? redundant data are made successfully DATABASE DESIGN cactivities that focus on the design of the database structure that will be used to store and manage end-user data DAYRIT, J.E.D. 002 DATABASE SYSTEMS STRUCTURE QUERY LANGUAGE (SQL) INTRODUCTION TO SQL DATABASE SYSTEM a database language that allows you to create database and refers to an organization of components that define and table structures, perform basic data management chores regulate the collection, storage, management, and use of (add, delete, and modify), and perform complex queries data within a database environment designed to transform the raw data into useful information. lets you access and manipulate databases became a standard of the American National Standards Institute (ANSI) in 1986, and of the International Organization for Standardization (ISO) in 1987 WHAT CAN SQL DO? execute queries against a database retrieve data from a database insert records in a database update records in a database delete records from a database create new databases create new tables in a database create stored procedures in a database create views in a database set permissions on tables, procedures, and views DBMS FUNCTIONS SQL IS A STANDARD — BUT… Data dictionary management although SQL is an ANSI/ISO standard, there are different Data storage management versions of the SQL language Data transformation and presentation however, to be compliant with the ANSI standard, they all Security management support at least the major commands such as SELECT, Multiuser access control UPDATE, DELETE, INSERT, WHERE in a similar manner Backup and recovery management Data integrity management RDBMS Database access languages and application programming stands for Relational Database Management System interface basis for SQL, and for all modern database systems such as Database communication interfaces MS SQL Server, IBM DB2, Oracle, MySQL, and Microsoft MANAGING THE DATABASE SYSTEM Access the data in the system is stored in database objects called Increased costs tables—a collection of related data entries and consists of Management complexity columns and rows Maintaining currency Vendor dependence SQL STATEMENTS Frequent upgrade/replacement cycles most of the actions you need to perform on a database are DATABASE CAREER OPPORTUNITIES done with SQL statements or queries consists of keywords that are easy to understand DATABASE TABLES database most often contains one or more tables; each table is identified by a name (e.g. “Customers” or “Orders”), and contain records (rows) with data Keep in Mind: ○ SQL keywords are NOT case sensitive: select = SELECT ○ Some database systems require a semicolon at the end of each SQL statement ○ Semicolon is the standard way to separate each SQL statement in database systems that allow more than one SQL statement to be executed in the same call to the server CATEGORIES OF SQL FUNCTIONS DATA DEFINITION LANGUAGE (DDL) ○ refers to SQL commands used to create, modify, and delete database structures such as tables, indexes, and views DATA MANIPULATION LANGUAGE (DML) LB1: INTRODUCTION TO SQL ○ refers to SQL commands used to insert, update, and delete data within a database DAYRIT, J.E.D. 003 OPENING XAMPP SHELL Type in: mysql -h localhost -u root –p DATA DEFINITION COMMANDS Double-ENTER CREATE SHOW DATABASES; ○ used to create objects in the database, such as tables, ○ To show the existing databases views, stored procedures, and more CREATE DATABASE [name_of_database]; ALTER ○ To create the database with the database name ○ used to modify the structure of an existing database object USE [name_of_database]; DROP ○ To show the current database ○ used to delete an entire object or part of an object from the SHOW TABLES; database ○ To show the tables in the current database TRUNCATE SQL CONSTRAINTS ○ used to delete all records from a table but does not delete SQL constraints are used to specify rules for the data in the table structure a table. RENAME If there is any violation between the constraint and the ○ used to rename an existing database object data action, the action is aborted by the constraint. DATA MANIPULATION COMMANDS Constraints can be specified when the table is created (inside the CREATE TABLE statement) or after the table INSERT is created (inside the ALTER TABLE statement). ○ used to add new records to a database table ○ NOT NULL - Indicates that a column cannot UPDATE store NULL value ○ used to modify existing records in a database table ○ UNIQUE - Ensures that each row for a column DELETE must have a unique value ○ used to delete existing records from a database table ○ PRIMARY KEY - A combination of a NOT NULL MERGE and UNIQUE. Ensures that a column (or ○ used to combine data from two or more tables into one combination of two or more columns) have a SELECT unique identity which helps to find a particular ○ used to retrieve data from one or more tables in a database record in a table more easily and quickly CALL ○ FOREIGN KEY - Ensure the referential integrity ○ used to call a stored procedure or function of the data in one table to match values in DATA TYPES CHECK ○ another table - Ensures that the value in a column The data type of a column defines what value the column can meets a specific condition hold: integer, character, money, date and time, binary, and so DEFAULT ○ - Specifies a default value for a on. column Each column in a database table is required to have a name SQL COMMANDS and a data type. TABLE CREATION An SQL developer must decide what type of data that will be ○ CREATE TABLE TABLENAME stored inside each column when creating a table. The data (FIRST_COLUMN DATATYPE(SIZE) type is a guideline for SQL to understand what type of data is CONSTRAINT, SECOND_COLUMN expected inside of each column, and it also identifies how DATATYPE(SIZE)CONSTRAINT, SQL will interact with the stored data. THIRD_COLUMN DATATYPE(SIZE) In MySQL there are three main data types: string, numeric, CONSTRAINT….); and date and time. INSERTING DATA/RECORD IN A TABLE: STRING ○ INSERT INTO Tablename VALUES (Data….); ○ used to add new records to a database table To display ALL data inserted ○ SELECT * FROM table_name; NUMERIC ○ used to modify existing records in a database To arrange the data to be displayed DATE AND TIME ○ SELECT * FROM table_name ORDER BY column_name [ASC|DESC]; ○ used to modify existing records in a database To add a column in a table, use the following syntax: XAMPP ○ ALTER TABLE table_name ADD column_name datatype; is an abbreviation where X stands for Cross-Platform, A To rename a column in a table, use the following stands for Apache, M stands for MYSQL, and the Ps stand for PHP and Perl, respectively. syntax: ○ ALTER TABLE table_name RENAME COLUMN the most popular PHP development environment old_name to new_name; is a completely free, easy to install Apache distribution To delete a column in a table, use the following syntax containing MariaDB, PHP, and Perl. The XAMPP open source (notice that some database systems don't allow deleting package has been set up to be incredibly easy to install and a column): to use. ○ ALTER TABLE table_name DROP COLUMN https://www.apachefriends.org/index.html column_name; DAYRIT, J.E.D. 004 To change the data type of a column in a table, use the research laboratory—that stores and uses data to following syntax: generate information. ○ ALTER TABLE table_name MODIFY COLUMN To be effective, business rules must be easy to column_name datatype; To update the data inserted in the database understand and widely disseminated, to ensure that every person in the organization shares a common ○ UPDATE tablename SET columnname = interpretation of the rules. Business rules describe, in expression WHERE conditionlist; To delete a record simple language, the main and distinguishing characteristics of the data as viewed by the company. ○ DELETE FROM tablename [WHERE conditionlist]; The process of identifying and documenting business To delete the table rules is essential to database design for several reasons: ○ DROP TABLE table_name; ○ They help to standardize the company’s view To delete the database of data. ○ DROP DATABASE database_name; MOST IMPORTANT SQL COMMANDS ○ They can be a communications tool between users and designers. SELECT, UPDATE, DELETE, INSERT INTO, CREATE ○ They allow the designer to understand the DATABASE, ALTER DATABASE, CREATE TABLE, ALTER TABLE, DROP TABLE, CREATE INDEX, DROP INDEX nature, role, and scope of the data. ○ They allow the designer to understand LC2: DATA MODELS LC2: DATA MODELS business processes. ○ They allow the designer to develop DATA MODELING AND DATA MODELS appropriate relationship participation rules Data Modeling - the first step in designing a database, and constraints and to create an accurate data refers to the process of creating a specific data model model. for a determined problem domain. Data Model - is a relatively simple representation, EVOLUTION OF DATA MODELS usually graphical, of more complex real-world data HIERARCHICAL MODEL structures. ○ Its basic logical structure is represented by an upside-down tree. The hierarchical structure IMPORTANCE OF DATA MODELS contains levels, or segments. A segment is the Data models can facilitate interaction among the equivalent of a file system’s record type. Within designer, the applications programmer, and the end the hierarchy, a higher layer is perceived as the user. A well-developed data model can even foster parent of the segment directly beneath it, improved understanding of the organization for which which is called the child. the database design is developed. In short, data models NETWORK MODEL ○ It was created to represent complex data are a communication tool. relationships more effectively than the DATA MODEL BASIC BUILDING BLOCKS hierarchical model, to improve database Entity - is a person, place, thing, or event about which performance, and to impose a database standard. data will be collected and stored. Attribute - is a characteristic of an entity ○ The network model allows a record to have Relationship - describes an association among entities more than one parent. While the network database model is generally not used today, the TYPES OF RELATIONSHIPS definitions of standard database concepts that emerged with the network model are still used One to One (1 : 1) by modern data models. One to Many (1 : M) ○ Schema - conceptual organization of the entire Many to Many (M : M) database CONSTRAINT ○ Subschema - portion of the database ○ Data Manipulation Language (DML) - is a restriction placed on the data. Constraints are environment in which data can be managed important because they help to ensure data integrity and used BUSINESS RULES ○ Data Definition Language (DDL) - enables the database administrator to define schema It is a brief, precise, and unambiguous description of a components policy, procedure, or principle within a specific organization. In a sense, business rules are misnamed: RELATIONAL MODELS they apply to any organization, large or small—a The relational model foundation is a mathematical business, a government unit, a religious group, or a concept known as a relation. To avoid the complexity of DAYRIT, J.E.D. 005 abstract mathematical theory, you can think of a Object-oriented data models are typically depicted relation (sometimes called a table) as a matrix using Unified Modeling Language (UML) class composed of intersecting rows and columns. Each row diagrams. in a relation is called a tuple. Each column represents an attribute. The relational model also describes a EMERGING DATA MODELS precise set of data manipulation constructs based on Big Data advanced mathematical concepts. ○ refers to a movement to find new and better The relational data model is implemented through a ways to manage large amounts of web and very sophisticated relational database management sensor-generated data and derive business system (RDBMS). The RDBMS performs the same basic insight from it, while simultaneously providing functions provided by the hierarchical and network high performance and scalability at a DBMS systems, in addition to a host of other functions reasonable cost. that make the relational data model easier to ○ In order to create value from their previously understand and implement. unused Big Data stores, companies are using ENTITY RELATIONSHIP MODEL new Big Data technologies. These emerging technologies allow organizations to process The conceptual simplicity of relational database massive data stores of multiple formats in technology triggered the demand for RDBMSs. In turn, cost-effective ways. Some of the most the rapidly increasing requirements for transaction and frequently used Big Data technologies are information created the need for more complex Hadoop, MapReduce, and NoSQL databases. database implementation structures, thus creating the NoSQL Databases need for more effective database design tools. ○ Every time you search for a product on Entity. An entity was defined as anything about which Amazon, send messages to friends on data is to be collected and stored. An entity is Facebook, watch a video on YouTube, or represented in the ERD by a rectangle, also known as an search for directions in Google Maps, you are entity box. The name of the entity, a noun, is written in using a NoSQL database. the center of the rectangle. The entity name is generally ○ NoSQL databases are not based on the written in capital letters and is written in the singular relational model. In fact, there is no standard form. NoSQL data model. Relationships. Relationships describe associations Degrees of Abstraction among data. Most relationships describe associations ○ External model is the end users’ view of the between two entities. When the basic data model data environment. The term end users refers to components were introduced, three types of people who use the application programs to relationships among data were illustrated: one-to-many manipulate the data and generate information. (1:M), many-to-many (M:N), and one-to-one (1:1). The End users usually operate in an environment in ER model uses the term connectivity to label the which an application has a specific business relationship types. unit focus. Companies are generally divided into several business units, such as sales, OBJECT-ORIENTED MODEL finance, and marketing. Each business unit is subject to specific constraints and Increasingly complex real-world problems requirements, and each one uses a subset of demonstrated a need for a data model that more closely the overall data in the organization. Therefore, represented the real world. In the object-oriented data end users within those business units view model (OODM), both data and its relationships are their data subsets as separate from or external contained in a single structure known as an object. In to other units within the organization. turn, the OODM is the basis for the object-oriented database management system (OODBMS). ○ Conceptual Model represents a global view of Object - is an abstraction of a real-world entity. the entire database by the entire organization. Attributes - describe the properties of an object. That is, the conceptual model integrates all external views (entities, relationships, Objects that share similar characteristics are grouped in classes. constraints, and processes) into a single global A class is a collection of similar objects with shared view of the data in the enterprise. Also known as a conceptual schema, it is the basis for the structure (attributes) and behavior (methods). identification and high-level description of the Classes are organized in a class hierarchy. main data objects (avoiding any database Inheritance is the ability of an object within the class model-specific details). hierarchy to inherit the attributes and methods of the classes above it. ○ Internal Model The internal model is the representation of the database as “seen” by the DAYRIT, J.E.D. 006 DBMS. In other words, the internal model Each row is called tuple that represents a single entity requires the designer to match the conceptual occurrence within the entity set. model’s characteristics and constraints to Each table column represents an attribute, and each those of the selected implementation model. column has a distinct name. An internal schema depicts a specific Each intersection of a row and column represents a representation of an internal model, using the single data value. database constructs supported by the chosen All values in a column must conform to the same data database. format. ○ Physical Model operates at the lowest level of Each column has a specific range of values known as abstraction, describing the way data is saved the attribute domain. on storage media such as magnetic, solid state, The order of the rows and columns is immaterial to or optical media. The physical model requires DBMS. the definition of both the physical storage Each table must have an attribute or combination of devices and the (physical) access methods attributes that uniquely identifies each row. required to reach the data within those storage devices, making it both software and hardware KEYS dependent. The storage structures used are In the relational model, keys are important because dependent on the software (the DBMS and the they are used to ensure that each row in a table is operating system) and on the type of storage uniquely identifiable. They are also used to establish devices the computer can handle. The relationships among tables and to ensure the integrity precision required in the physical model’s of the data. A key consists of one or more attributes that definition demands that database designers determine other attributes. have a detailed knowledge of the hardware A composite key is a key that is composed of more than and software used to implement the database one attribute. design. ey An attribute that is a part of a key is called a k attribute. A superkey is a key that can uniquely identify any row in the table. In other words, a superkey functionally determines every attribute in the row. LC3: THE RELATIONAL DATABASE MODEL A candidate key is a minimal superkey—that is, a superkey without any unnecessary attributes. A candidate key is based on a full functional dependency. A LOGICAL VIEW OF DATA Entity integrity is the condition in which each row (entity instance) in the table has its own unique The relational data model changed all of that by identity. To ensure entity integrity, the primary key has allowing the designer to focus on the logical two requirements: representation of the data and its relationships, rather ○ all of the values in the primary key must be than on the physical storage details. unique, and Relational model enables you to view data logically ○ no key attribute in the primary key can contain rather than physically. a null. TABLES AND THEIR CHARACTERISTICS A null is the absence of any data value, and it is never allowed in any part of the primary key and it could The logical view of the relational database is facilitated represent any of the following: by the creation of data relationships based on a logical ○ An unknown attribute value construct known as a relation. ○ A known, but missing, attribute value A table is perceived as a two-dimensional structure ○ A “not applicable” condition composed of rows and columns. A foreign key (FK) is the primary key of one table that You can think of a table as a persistent representation has been placed into another table to create a common of a logical relation—that is, a relation whose contents attribute. can be permanently saved for future use. As far as the Referential integrity refers to the relationship between table’s user is concerned, a table contains a group of tables. Because each table in a database must have a related entity occurrences—that is, an entity set. primary key, this primary key can appear in other tables CHARACTERISTICS OF A RELATIONAL TABLE because of its relationship to data within those tables. When a primary key from one table appears in another A table is perceived as a two-dimensional structure table, it is called a foreign key. composed of rows and columns Secondary Key DAYRIT, J.E.D. 007 ○ is defined as a key that is used strictly for data DATA DICTIONARY retrieval purposes. The data dictionary provides a detailed description of ○ is an additional key, or alternate key, which can all tables in the database created by the user and be used in addition to the primary key to locate designer. Thus, the data dictionary contains at least all specific data. of the attribute names and characteristics for each table To avoid nulls, some designers use special codes, known as flags, to indicate the absence of some value. in the system. In short, the data dictionary contains metadata—data about data. DEPENDENCIES DATA REDUNDANCY REVISITED Determination - is the state in which knowing the value The real test of redundancy is not how many copies of a of one attribute makes it possible to determine the given attribute are stored, but whether the elimination value of another. The relationship is calledfunctional dependence , of an attribute will eliminate information. Therefore, if you delete an attribute and the original information can which means that the value of one or more attributes still be generated through relational algebra, the determines the value of one or more other attributes. Determinant - the attribute whose value determines inclusion of that attribute would be redundant. Given that view of redundancy, proper foreign keys are clearly another Dependent - the attribute whose value is determined by not redundant in spite of their multiple occurrences in a table. However, even when you use this less the other attribute restrictive view of redundancy, keep in mind that RELATIONAL ALGEBRA controlled redundancies are often designed as part of the system to ensure transaction speed and/or Relational algebra defines the theoretical way of information requirements. manipulating table contents using relational operators. RELATIONAL SET OPERATORS SELECT is referred to as a unary operator because it INDEXES only uses one table as input. It yields values for all rows An index is an orderly arrangement used to logically found in the table that satisfy a given condition. access rows in a table. PROJECT yields all values for selected attributes. It is Indexes in the relational database environment work also a unary operator, accepting only one table as input. like the indexes described in the preceding paragraphs. PROJECT will return only the attributes requested, in From a conceptual point of view, an index is composed the order in which they are requested. In other words, of an index key and a set of pointers. PROJECT yields a vertical subset of a table. UNION combines all rows from two tables, excluding duplicate rows. To be used in the UNION, the tables must have the same attribute characteristics; in other words, the columns and domains must be compatible. When two or more tables share the same number of columns, and when their corresponding columns share the same or compatible domains, they are said to be union-compatible. INTERSECT yields only the rows that appear in both tables. As with UNION, the tables must be union-compatible to yield valid results. For example, you cannot use INTERSECT if one of the attributes is numeric and one is character-based. For the rows to be considered the same in both tables and appear in the result of the INTERSECT, the entire rows must be exact duplicates. DIFFERENCE yields all rows in one table that are not found in the other table; that is, it subtracts one table from the other. As with UNION, the tables must be union-compatible to yield valid results. PRODUCT yields all possible pairs of rows from two tables—also known as the Cartesian product. THE DATA DICTIONARY AND THE SYSTEM CATALOG DAYRIT, J.E.D. 008