Podcast
Questions and Answers
Which of the following scenarios would warrant specifying more access paths for a file?
Which of the following scenarios would warrant specifying more access paths for a file?
- When the file contains attributes that are neither primary nor candidate keys.
- When the file is small and rarely accessed.
- When the file is infrequently updated.
- When the file is updated frequently. (correct)
What is the primary purpose of specifying access paths on candidate key attributes?
What is the primary purpose of specifying access paths on candidate key attributes?
- To ensure uniqueness constraints are efficiently checked. (correct)
- To optimize join operations with other relations.
- To speed up range queries on the attribute.
- To reduce storage space used by the attribute.
In the context of physical database design, what is a critical trade-off to consider when deciding whether to index an attribute?
In the context of physical database design, what is a critical trade-off to consider when deciding whether to index an attribute?
- The trade-off between security and accessibility.
- The trade-off between data integrity and data redundancy.
- The trade-off between query performance and update overhead. (correct)
- The trade-off between storage space and memory usage.
Under which of the following conditions is creating a multiattribute index most beneficial?
Under which of the following conditions is creating a multiattribute index most beneficial?
Which statement accurately describes the implications of choosing an attribute for a clustered index?
Which statement accurately describes the implications of choosing an attribute for a clustered index?
For what type of queries do range queries benefit most from?
For what type of queries do range queries benefit most from?
In general, why are B+-trees the typical choice for indexing in RDBMSs compared to hash indexes?
In general, why are B+-trees the typical choice for indexing in RDBMSs compared to hash indexes?
When is dynamic hashing a particularly suitable choice for file organization?
When is dynamic hashing a particularly suitable choice for file organization?
What is denormalization, and why might a database designer choose to implement it?
What is denormalization, and why might a database designer choose to implement it?
What practical reasons typically drive the decision to denormalize a database?
What practical reasons typically drive the decision to denormalize a database?
What are the primary inputs to the database tuning process after the database is deployed?
What are the primary inputs to the database tuning process after the database is deployed?
What is the main goal of database tuning?
What is the main goal of database tuning?
Which of the following is a key indicator that indexes may need to be revised?
Which of the following is a key indicator that indexes may need to be revised?
What action does rebuilding a clustered index typically involve?
What action does rebuilding a clustered index typically involve?
Which of the following scenarios might justify dropping an existing index?
Which of the following scenarios might justify dropping an existing index?
Why is it important to avoid using DISTINCT unnecessarily in SQL queries?
Why is it important to avoid using DISTINCT unnecessarily in SQL queries?
After identifying a poorly performing query, which strategies can be applied?
After identifying a poorly performing query, which strategies can be applied?
What does query tuning involve?
What does query tuning involve?
Why might query optimizers not use indexes in the presence of arithmetic expressions or substring comparisons?
Why might query optimizers not use indexes in the presence of arithmetic expressions or substring comparisons?
What is a common strategy for improving the performance of a query with multiple selection conditions connected by OR
?
What is a common strategy for improving the performance of a query with multiple selection conditions connected by OR
?
How can using temporary tables assist in optimizing complex queries?
How can using temporary tables assist in optimizing complex queries?
In the context of transactions, what does atomicity ensure?
In the context of transactions, what does atomicity ensure?
What is an interleaved execution of processes in a multi-programming system?
What is an interleaved execution of processes in a multi-programming system?
Why is concurrency control essential in a multiuser database system?
Why is concurrency control essential in a multiuser database system?
What is the 'lost update' problem in concurrent transactions?
What is the 'lost update' problem in concurrent transactions?
What is a 'dirty read' in the context of database transactions?
What is a 'dirty read' in the context of database transactions?
What is the approach taken by a DBMS when a transaction fails after executing some operations but before completing all of them?
What is the approach taken by a DBMS when a transaction fails after executing some operations but before completing all of them?
Which type of failure is typically classified as a media failure?
Which type of failure is typically classified as a media failure?
What marks the beginning of a transaction's execution from the recovery manager's perspective?
What marks the beginning of a transaction's execution from the recovery manager's perspective?
What signal is sent at the end of a transaction if any changes executed by the transaction can be safely committed to the database and will not be undone?
What signal is sent at the end of a transaction if any changes executed by the transaction can be safely committed to the database and will not be undone?
What can be used when a system failure occurs to restore data?
What can be used when a system failure occurs to restore data?
What is force-writing in the context of database transactions?
What is force-writing in the context of database transactions?
A transaction that fails to complete leaves a database in what sort of inconsistent state?
A transaction that fails to complete leaves a database in what sort of inconsistent state?
Isolation is enforced by which part of the DBMS?
Isolation is enforced by which part of the DBMS?
What condition describes the circumstances relating to a recoverable schedule?
What condition describes the circumstances relating to a recoverable schedule?
What is the name of an event where an uncommitted transaction has to be rolled back because it read an item from a failed transaction?
What is the name of an event where an uncommitted transaction has to be rolled back because it read an item from a failed transaction?
What should occur to satisfy the criterion that prevents cascading rollbacks from occurring?
What should occur to satisfy the criterion that prevents cascading rollbacks from occurring?
What are schedules called in which transactions can neither read nor write an item X until the last transaction that wrote X has committed (or aborted)?
What are schedules called in which transactions can neither read nor write an item X until the last transaction that wrote X has committed (or aborted)?
What is a necessary, although not sufficient by itself, action that each transaction T must take part in?
What is a necessary, although not sufficient by itself, action that each transaction T must take part in?
If a schedule S of n transactions is a complete schedule, what must the operations in S include?
If a schedule S of n transactions is a complete schedule, what must the operations in S include?
When are two operations in a schedule said to conflict?
When are two operations in a schedule said to conflict?
Under what conditions is a schedule S of n transactions serializable?
Under what conditions is a schedule S of n transactions serializable?
Flashcards
Uniqueness Constraints on Attributes
Uniqueness Constraints on Attributes
Specifying access paths on candidate key attributes or sets of attributes that should be unique.
Design Decisions About Indexing
Design Decisions About Indexing
Attributes required in equality or range conditions and keys/join conditions require access paths for indexing.
Whether to Index an Attribute
Whether to Index an Attribute
The attribute must be a key, or used in a query for selection (equality/range) or join conditions.
What Attribute to Index On
What Attribute to Index On
Signup and view all the flashcards
Denormalization
Denormalization
Signup and view all the flashcards
Materialized Join
Materialized Join
Signup and view all the flashcards
Vertical Partitioning
Vertical Partitioning
Signup and view all the flashcards
When Disk Accesses are too High
When Disk Accesses are too High
Signup and view all the flashcards
Using DNO = DNUMBER
Using DNO = DNUMBER
Signup and view all the flashcards
Redundant Distinct Operations
Redundant Distinct Operations
Signup and view all the flashcards
Collapsing Multiple Queries
Collapsing Multiple Queries
Signup and view all the flashcards
Indexing
Indexing
Signup and view all the flashcards
Tables May Be Joined
Tables May Be Joined
Signup and view all the flashcards
BCNF-based Table Storage
BCNF-based Table Storage
Signup and view all the flashcards
BCNF Relation Storage
BCNF Relation Storage
Signup and view all the flashcards
Lost Update
Lost Update
Signup and view all the flashcards
Temporary Update Problem
Temporary Update Problem
Signup and view all the flashcards
Incorrect summary function
Incorrect summary function
Signup and view all the flashcards
Unrepeatable Reads
Unrepeatable Reads
Signup and view all the flashcards
Transaction Failures
Transaction Failures
Signup and view all the flashcards
Computer Failure
Computer Failure
Signup and view all the flashcards
Transaction/System Error
Transaction/System Error
Signup and view all the flashcards
Database Transaction
Database Transaction
Signup and view all the flashcards
BEGIN_TRANSACTION Operation
BEGIN_TRANSACTION Operation
Signup and view all the flashcards
ROLLBACK Operation
ROLLBACK Operation
Signup and view all the flashcards
COMMIT_TRANSACTION
COMMIT_TRANSACTION
Signup and view all the flashcards
System Log
System Log
Signup and view all the flashcards
Commit Point of Transaction
Commit Point of Transaction
Signup and view all the flashcards
The Log File
The Log File
Signup and view all the flashcards
Atomicity
Atomicity
Signup and view all the flashcards
Isolation
Isolation
Signup and view all the flashcards
Durability
Durability
Signup and view all the flashcards
The Durability Property
The Durability Property
Signup and view all the flashcards
Schedules (Histories)
Schedules (Histories)
Signup and view all the flashcards
Conflict of Schedules
Conflict of Schedules
Signup and view all the flashcards
Committed Projection C(S)
Committed Projection C(S)
Signup and view all the flashcards
Recoverable Schedules
Recoverable Schedules
Signup and view all the flashcards
Cascadeless Rollback, or Avoiding it.
Cascadeless Rollback, or Avoiding it.
Signup and view all the flashcards
Two phase locking
Two phase locking
Signup and view all the flashcards
Study Notes
Physical Database Design in Relational Databases
- Performance constraints on query invocation times, aiming for under 20 seconds, guide access path attribute priorities
- Time-constrained queries and transactions elevate the priority of selection attributes for primary access structures
Analyzing Update Operation Frequencies
- A minimum number of access paths should be set for frequently updated files
- Updating access paths can slow down update operations
Uniqueness Constraints on Attributes
- Access paths should be specified for candidate key attributes or sets of attributes to guarantee uniqueness
- Indexes can efficiently check uniqueness constraints, using only the index's leaf nodes
Physical Database Design Decisions
- Relational systems represent each base relation as a physical database file
- Access path options include specifying file types and defining attributes for indexes.
- At most one index can be primary or clustered per file
- Multiple secondary indexes are allowed
Indexing Design Decisions
- Attributes used in equality or range conditions and those used in join conditions or keys require access paths
- Query performance hinges on indexes or hashing schemes for selections and joins
Indexing Categories
- Whether to index an attribute is based on that attributes use such as it being a key, or query uses to satisfy selections and joins is present
- Indexes can be used without retrieving full records by scanning indexes
Attributes to Index
- Indexes can be made on one or many attributes.
- Multiple attributes within a relation in several queries warrant a multiattribute index, e.g., (garment_style, color)
- The sequencing of attributes has to correspond to the queries
Clustered Index Setup
- One primary/clustering index is allowed per table, physically ordering the file by that attribute
- CLUSTER keyword specifies this in most RDBMSs
- Primary indexes are created, for key attributes; clustering indexes, for non-key attributes
- The decision of the attribute best benefiting from ordering depends on if keeping the table ordered on the attribute
- Range queries get help from clustering
- Only index searches mean the corresponding index should not be clustered, because the main effect of clustering is achieved through record retrieval
Hash vs Tree Index
- RDBMSs generally use B+-trees for indexing, although ISAM and hash indexes are also available in some systems
- B+-trees support both equality/range queries
- Hash indexes excel in equality conditions, especially during joins to find matches
Dynamic Hashing
- Dynamic hashing is suitable for volatile files that grow and shrink continuously
- Dynamic hashing schemes are not offered by most commercial RDBMSs
Speeding Up Queries
- Denormalization separates logically related attributes into tables to reduce redundancy
- Denormalization can improve query execution speed
Tradeoffs for Execution Speed
- Faster execution is traded with normalization such that frequent queries and transactions can occur.
- Logical design is stored in a weaker normal form, e.g., 2NF or 1NF, instead of BCNF or 4NF
- Attributes needed for answering queries/reports are added to a table, avoiding joins, but reintroducing functional/transitive dependencies and redundancy issues
Denormalization
- Storing extra tables maintains original functional dependencies lost during the decomposition of relations
Database Tuning
- Post deployment, actual database use reveals problem areas unnoticed during initial design
- Revision of inputs from Section 16.1.1. involves the gathering of statistics about usage patterns, and the monitoring of internal DBMS processing, such as, query optimizations
- Bottlenecks involving data or device contention are revealed
- Volumes of activity and data sizes are estimated
- Constant monitoring and revision of the physical design is maintained
- Tuning attempts to reduce query/transaction response times and improve overall throughput
Factors used by the DBMS
- Statistics related to factors are an input to the tuning process, including:
- Sizes of individual tables
- Number of distinct values in a column
- Frequency of query/transaction submission in a time interval
- Times for different query/transaction processing phases
Utilization
- Other statistics create a profile of the data contents, use of the database and include include information by monitoring
- Storage statistics: Data about allocation of storage into tablespaces, indexspaces, and buffer ports
- Total read/write activity (paging) on disk extents and disk hot spots for I/O and device performance
- Query/transaction processing statistics: Execution times of queries and transactions, optimization times during query optimization
- Locking/logging related statistics: Rates of issuing different types of locks, transaction throughput rates, and log records activity
- Index statistics: Number of levels in an index, number of noncontiguous leaf pages, etc
Tuning
- Tuning transactions, concurrency control, and recovery involve:
- Avoiding excessive lock contention to increase concurrency
- Minimizing logging overhead and unnecessary data dumping
- Optimizing buffering and scheduling of processes
- Allocating resources for efficient utilization of disks, RAM, and processes
- Solutions are tailored to specific systems by the trained DBAs who tune physical DBMS parameters and configurations of devices and operating systems
Tuning Indexes
- Index choices can be revised if:
- Queries take too long because of missing indexes
- Indexes are not being utilized
- Indexes cause excessive overhead because of frequent attribute changes
- DBAs can analyze query execution plans using DBMS commands or trace facilities to understand the utilization of operations
Tuning Goals
- Tuning dynamically evaluates the requirements and attempts indexes reorganization
- Dropping and building indexes can have heavy cost but improvements can be huge
Indexing Costs
- Table updating generally stops while an index is dropped or created; the loss must be accounted for
- Besides dropping or creating indexes and changing from a nonclustered to a clustered index and vice versa, index rebuilding may improve performance.
- Many deletions on the index key can have pages containing wasted space, which may be claimed during a rebuild operation while insertions may cause too many overflows
Clustered Tuning
- Rebuilding a clustered index rearranges the entire table ordered on that key
Options
- Option availability of indexing and its reorganization vary
- Sparse indexes have one index pointer for each page (disk block) in the data file, while dense indexes have an index pointer for each record.
- Some indexes can be sparse indexes with clustering index with B+-trees while others indexes can be sparse with a clustering index using an ISAM file and dense indexes as B+-trees
- In some versions, setting up a clustering index is limited to a dense index, the DBA must work with this limitation
Database Designs
- A possible denormalization is departure from keeping every table a BCNF relation. If a specific design has bad results, it could result in a logical database needing adjustments and remapping of new tables and indexes
Data Requirement
- A database needs to be driven by processing requirements along with data need
- Dynamic processing requirements necessitate conceptual, logical, and physical schema adjustments
Physical design
- Several processes may occur depending on processing requirements:
- Due to the frequency of attributes form two or more tables are brought together, existing tables may be denormalized, reducing the normalization level from BCNF to 3NF, 2NF, or 1NF
- 3NF and 2NF address different problem dependencies that do not relate to each other, that is why normalization address these independently
- For table sets, there may be many possible design choices, which may have one replaced
Common place adjustments
- For attributes R(K,A,B,C,D,...), BCNF can be split into multiple tables also in BCNF
- Example is R1(K,A,B), R2(K,C,D), R3(K...), in each table for accessing.
- Tables with lots of columns may want to vertically partition attributes for access.
- Repeating of attributes in other tables even though potential anomalies may occur
- The main table guarantees updated data, is up-to-date
Data Control
- Slicing takes a table vertically like partitioning.
- Each table in the diagram contains the set of distinct products/tuples in which a specific query or set of transactions apply.
- These changes can be used to meet specific requirements
Bad Queries
- Inappropriate index selection causes bad query performance because commands produce
- Too many disk accesses occur such as queries scanning all
- Showing appropriate indexes are not being accessed
Optimizers
- Common issues seen in optimizing can be
- Optimizers won't use arithmetic
- Null comparisons aren't easy to optimize
- Unnecessary distinct operation lead to sorts
Correlated queries
- Bad practice can arise searching the inner table for each outer tuple.
- You can use temporary queries and tables to make things more efficient
- In join conditions use clustered indexes over string comparisons
Clause Order
- Some optimizers are greatly affected with clause order
- Switch this to look at the smaller scanner used by the larger one
- Perform bad in these conditions
- Uncorrelated/Correlated with aggregate/no aggregate subqueries However, the first one above doesn't seem to be a problem that query optimizer evaluate
Additional Query Tuning Guidelines
- Avoid using OR operator as this could prompt optimizer to use indexes; query be split into a union
SQL commands
- Following transformations help:
- Not transforms to positive
- Embedding can lead to join
Rewrites
- Rewrite WHERE to utilise indexes. The syntax to rewrite SQL statements is
SELECT REGION#, PROD_TYPE, MONTH, SALES FROM SALES_STATISTICS WHERE REGION# = 3 AND ((PRODUCT_TYPE BETWEEN 1 AND 3) OR (PRODUCT_ TYPE BETWEEN 8 AND 10));
Improvements
- Actions to improve the inefficient queries are by such as the use of temp tables, view and certain avoidance operations
- Techniques will depend upon optimzer
Introduction to Transaction Processing
- Techniques used involves by the concurrency control, and the recovery, The transaction processing includes in some basic concepts of these transaction
- Concurrent execution of transactions and recovery from transaction failures
- Single-user and multiuser database systems and demonstrates how concurrent execution of transactions can take place
- Concept of transaction, read and write operations on basic model
Multi users
- Multi users access computer same time involves in MULTI programming
Interleaving
- I/O operations keeps CPU busy , then prevent the long process
- More CPU involves in the process
Granularity
- Database models and functions which the size of data describes and has more concepts to cover
Studying That Suits You
Use AI to generate personalized quizzes and flashcards to suit your learning preferences.