IT307_BitaraZyronJacob_Reviewer_Unit1.docx

Full Transcript

**The Database Environment** - **Definition:** The environment where databases are created, managed, and utilized. - **Components:** - **Database Management System (DBMS):** Software used to manage and interact with databases (e.g., MySQL, Oracle). - **Database:** Org...

**The Database Environment** - **Definition:** The environment where databases are created, managed, and utilized. - **Components:** - **Database Management System (DBMS):** Software used to manage and interact with databases (e.g., MySQL, Oracle). - **Database:** Organized collection of data, usually structured in tables. - **Users:** Individuals or applications interacting with the database. - **Applications:** Software that uses the database to perform tasks (e.g., ERP systems). **Data Models: Operational Database** - **Definition:** A database designed to handle daily operations and transactions. - **Characteristics:** - **Real-time Data:** Stores current and accurate data for day-to-day activities. - **Normalization:** Data is often normalized to reduce redundancy and improve integrity. - **Examples:** Customer orders, inventory records, employee data. - **Use Cases:** - **Transactional Processing:** Handling sales transactions, inventory updates. - **Operational Efficiency:** Supports routine operations like payroll, billing, and customer service. **Data Models: Data Warehouse** - **Definition:** A database optimized for querying and reporting rather than transaction processing. - **Characteristics:** - **Historical Data:** Stores large amounts of historical data from various sources. - **Denormalization:** Data may be denormalized for faster querying and analysis. - **Data Integration:** Combines data from multiple operational databases and external sources. - **Use Cases:** - **Business Intelligence:** Analyzing trends, generating reports, and making strategic decisions. - **Data Mining:** Discovering patterns and insights from large datasets. **Operational Database vs Data Warehouse** - **Purpose:** - **Operational Database:** Focused on daily operations and transactional processing. - **Data Warehouse:** Focused on analysis, reporting, and historical data. - **Data Structure:** - **Operational Database:** Typically normalized for efficient transactions. - **Data Warehouse:** Often denormalized for efficient querying and reporting. - **Data Updates:** - **Operational Database:** Real-time updates and transactions. - **Data Warehouse:** Batch updates, often scheduled during off-peak hours. - **Performance:** - **Operational Database:** Optimized for read-write operations and transaction speed. - **Data Warehouse:** Optimized for read operations and complex queries. **Business Insights & Data Availability** - **Business Insights:** - **Definition:** Valuable information gained from analyzing data that can inform business decisions. - **Examples:** Sales trends, customer behavior, market opportunities. - **Sources:** Data from operational databases, data warehouses, and external data sources. - **Data Availability:** - **Importance:** Ensures that data is accessible when needed for decision-making. - **Challenges:** Data integration, data quality, and timely access. - **Solutions:** Data warehousing, cloud storage, data management strategies. **Overview of Big Data (including the V\'s)** - **Definition:** Large and complex datasets that traditional data processing tools cannot handle efficiently. - **Characteristics (V\'s):** - **Volume:** The amount of data generated and stored. - **Velocity:** The speed at which data is generated and processed. - **Variety:** The different types and sources of data (e.g., text, images, videos). - **Veracity:** The accuracy and trustworthiness of the data. - **Value:** The potential insights and benefits derived from analyzing the data. **Entity Relationship Diagram (ERD)** - **Definition:** A visual representation of the data structure in a database, showing how entities relate to one another. - **Purpose:** - **Design:** Helps in designing databases by illustrating the relationships and structure of data. - **Communication:** Facilitates communication between stakeholders and developers by providing a clear model of the data. - **Components:** - **Entities:** Represented as rectangles, they denote objects or concepts that store data (e.g., Employee, Product). - **Attributes:** Represented as ovals, they describe properties or characteristics of entities (e.g., Employee ID, Product Name). - **Relationships:** Represented as diamonds, they show how entities are related (e.g., works\_for, purchases). - **Lines:** Connect entities to relationships and attributes, indicating associations. **Entity Types** - **Definition:** Categories of objects or concepts that the database will track. - **Types:** - **Strong Entity:** Represents an independent entity with its own unique identifier (e.g., Customer). - **Weak Entity:** Dependent on another entity for its identification and existence (e.g., Order Detail, which depends on Order). - **Associative Entity:** Represents a relationship between two or more entities, often with its own attributes (e.g., Enrollment in a Course). **Attributes** - **Definition:** Properties or characteristics that describe an entity. - **Types:** - **Simple Attribute:** Cannot be divided further (e.g., First Name, Age). - **Composite Attribute:** Can be divided into smaller attributes (e.g., Address, which can be broken down into Street, City, State). - **Derived Attribute:** Can be calculated from other attributes (e.g., Age derived from Date of Birth). - **Multivalued Attribute:** Can hold multiple values (e.g., Phone Numbers). - **Keys:** - **Primary Key:** Unique identifier for each entity instance (e.g., Student ID). - **Foreign Key:** Attribute that creates a link between two entities (e.g., Customer ID in an Order). **Relationship** - **Definition:** Describes how two or more entities interact or are associated with each other. - **Types:** - **One-to-One (1:1):** Each entity in the relationship will have only one related entity (e.g., each person has one passport). - **One-to-Many (1):** One entity is associated with multiple instances of another entity (e.g., a department has many employees). - **Many-to-Many (M):** Multiple entities can be associated with multiple other entities (e.g., students enrolling in multiple courses and courses having multiple students). - **Cardinality:** Specifies the number of instances of one entity that can or must be associated with each instance of another entity (e.g., a customer may place multiple orders). **Relation and Its Structure** - **Definition:** A relation is a table in a relational database where data is organized into rows and columns. - **Components:** - **Table (Relation):** Represents the entity and consists of rows (tuples) and columns (attributes). - **Rows (Tuples):** Each row represents a unique instance of the entity (e.g., a specific employee). - **Columns (Attributes):** Each column represents a property or characteristic of the entity (e.g., Employee ID, Name). - **Domain:** The set of allowable values for an attribute (e.g., integer, string, date). **Relational Keys** - **Definition:** Keys are attributes or sets of attributes used to uniquely identify rows in a relation. - **Types:** - **Primary Key:** A unique identifier for each row in a table. It must be unique and not null (e.g., Employee ID). - **Candidate Key:** An attribute or set of attributes that could be used as a primary key. There may be multiple candidate keys (e.g., Social Security Number, Employee ID). - **Foreign Key:** An attribute in one table that refers to the primary key of another table. It establishes a relationship between two tables (e.g., Department ID in the Employee table refers to Department ID in the Department table). - **Composite Key:** A primary key composed of two or more attributes (e.g., a combination of Student ID and Course ID to uniquely identify a course enrollment). **Integrity Constraints** - **Definition:** Rules applied to ensure the accuracy and consistency of data within a database. - **Types:** - **Domain Integrity:** Ensures that attributes adhere to defined domains (e.g., Age must be a positive integer). - **Entity Integrity:** Ensures that each entity has a unique and non-null primary key (e.g., every row must have a unique Employee ID). - **Referential Integrity:** Ensures that foreign keys correctly reference primary keys in related tables, maintaining valid relationships (e.g., every Order must be associated with a valid Customer). - **Business Rules:** Specific constraints based on the business logic or requirements (e.g., an employee's salary must be within a certain range). **Mapping ERD/EERD** - **Definition:** The process of converting an Entity-Relationship Diagram (ERD) or Enhanced Entity-Relationship Diagram (EERD) into a relational schema. - **Steps:** - **Entities to Tables:** Convert entities into tables. Each entity becomes a table, with attributes becoming columns. - **Attributes to Columns:** Map attributes of entities to columns in the table. - **Relationships to Foreign Keys:** Convert relationships into foreign keys. Relationships between entities are represented by adding foreign keys in the relevant tables. - **Handling Multi-Valued Attributes:** Create separate tables for multi-valued attributes, with foreign keys to link back to the original table. - **Handling Weak Entities:** Convert weak entities into tables with foreign keys referencing the strong entity's primary key. **Normalization** - **Definition:** The process of organizing data to reduce redundancy and improve data integrity. - **Normal Forms:** - **First Normal Form (1NF):** Ensures that each column contains atomic (indivisible) values and each row is unique. - **Second Normal Form (2NF):** Ensures that all non-key attributes are fully functionally dependent on the primary key (removes partial dependency). - **Third Normal Form (3NF):** Ensures that all attributes are dependent only on the primary key and not on other non-key attributes (removes transitive dependency). - **Boyce-Codd Normal Form (BCNF):** A stronger version of 3NF that deals with anomalies related to functional dependencies. - **Fourth Normal Form (4NF):** Removes multi-valued dependencies, ensuring that no table contains two or more independent multi-valued facts about an entity. - **Fifth Normal Form (5NF):** Ensures that every join dependency is a consequence of the candidate keys. **Querying a Single Table** - **Definition:** Retrieving data from one table based on specified criteria. - **Basic Syntax:** To select data from a table, you use the SELECT statement followed by the column names and the FROM clause specifying the table name. You can include a WHERE clause to filter results based on a condition. - **Selecting All Columns:** Retrieves all columns from the specified table. - Example: SELECT \* FROM Employees; - **Selecting Specific Columns:** Retrieves only the specified columns from the table. - Example: SELECT FirstName, LastName FROM Employees; - **Filtering Data:** Retrieves rows that match specified conditions using the WHERE clause. - Example: SELECT \* FROM Employees WHERE Department = \'Sales\'; - **Sorting Data:** Orders the results using the ORDER BY clause, specifying the column and order (ascending or descending). - Example: SELECT \* FROM Employees ORDER BY LastName ASC; - **Limiting Results:** Restricts the number of rows returned using the LIMIT clause. - Example: SELECT \* FROM Employees LIMIT 10; **Common & Aggregate Functions** - **Common Functions:** Perform calculations or operations on data within a table. - **COUNT()**: Counts the number of rows that match a specified condition. - Example: SELECT COUNT(\*) FROM Employees; - **SUM()**: Calculates the sum of a numeric column. - Example: SELECT SUM(Salary) FROM Employees; - **AVG()**: Computes the average value of a numeric column. - Example: SELECT AVG(Salary) FROM Employees; - **MIN()**: Returns the smallest value in a column. - Example: SELECT MIN(Salary) FROM Employees; - **MAX()**: Returns the largest value in a column. - Example: SELECT MAX(Salary) FROM Employees; - **Grouping Data:** The GROUP BY clause groups rows that have the same values into summary rows. - Example: SELECT Department, AVG(Salary) FROM Employees GROUP BY Department; - **Filtering Groups:** The HAVING clause filters groups based on a specified condition, similar to WHERE but applied after aggregation. - Example: SELECT Department, AVG(Salary) FROM Employees GROUP BY Department HAVING AVG(Salary) \> 50000; **Working with Multiple Tables** - **Definition:** Combining data from two or more tables based on related columns using joins. - **Types of Joins:** - **Inner Join:** Returns rows where there is a match in both tables. - Example: SELECT Employees.FirstName, Departments.DepartmentName FROM Employees INNER JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID; - **Left (Outer) Join:** Returns all rows from the left table and matched rows from the right table. Unmatched rows from the right table will have NULL values. - Example: SELECT Employees.FirstName, Departments.DepartmentName FROM Employees LEFT JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID; - **Right (Outer) Join:** Returns all rows from the right table and matched rows from the left table. Unmatched rows from the left table will have NULL values. - Example: SELECT Employees.FirstName, Departments.DepartmentName FROM Employees RIGHT JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID; - **Full (Outer) Join:** Returns all rows when there is a match in one of the tables. Unmatched rows from both tables will have NULL values. - Example: SELECT Employees.FirstName, Departments.DepartmentName FROM Employees FULL OUTER JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID; - **Cross Join:** Returns the Cartesian product of the two tables, i.e., all possible combinations of rows. - Example: SELECT Employees.FirstName, Departments.DepartmentName FROM Employees CROSS JOIN Departments; **Subqueries** - **Definition:** A query nested inside another query, used to provide intermediate results or perform more complex filtering. - **Types of Subqueries:** - **Single-row Subquery:** Returns a single row and value, often used with operators like =. - Example: SELECT FirstName FROM Employees WHERE DepartmentID = (SELECT DepartmentID FROM Departments WHERE DepartmentName = \'Sales\'); - **Multiple-row Subquery:** Returns multiple rows, often used with operators like IN. - Example: SELECT FirstName FROM Employees WHERE DepartmentID IN (SELECT DepartmentID FROM Departments WHERE Location = \'New York\'); - **Correlated Subquery:** References columns from the outer query and returns results based on each row of the outer query. - Example: SELECT FirstName FROM Employees e WHERE EXISTS (SELECT \* FROM Departments d WHERE d.DepartmentID = e.DepartmentID AND d.Location = \'New York\'); - **Scalar Subquery:** Returns a single value (scalar) and can be used in expressions. - Example: SELECT FirstName, (SELECT AVG(Salary) FROM Employees) AS AverageSalary FROM Employees;

Use Quizgecko on...
Browser
Browser