19 Questions
What is the cost of linear search in the best case scenario?
(br / 2)
When is linear search generally used?
When selection is an equality comparison on the primary key attribute and the file is ordered on the primary key attribute.
What does 'br' represent in the context of binary search?
Number of blocks containing records from relation r
In evaluating an expression with multiple operations, why can it be difficult to solve if it contains more than one operation?
Because each operation needs to be evaluated one by one in a specific order.
Which algorithm is faster, binary search or linear search?
Binary search
In linear search, what needs to be added to the cost estimate if multiple blocks contain required records?
The cost of scanning such blocks
Which scenario would favor using binary search over linear search?
When selection is an equality comparison on a primary key attribute and file records are ordered on that key.
When evaluating expressions with multiple operations, which method requires processing operations from top to bottom?
Top-Down Execution
What increases the difficulty of solving expressions with multiple operations?
Operations involving different precedences.
What is the cost of linear search in the worst case scenario?
br
In linear search, when the selection condition is on a (primary) key attribute and the required record is found, what can the system do?
System can stop searching
What occurs if the selection in linear search is on a non-(primary) key attribute and multiple blocks contain required records?
Extra cost for scanning these blocks needs to be added
Which method for evaluating expressions with multiple operations involves passing results from one operation to another without storing intermediate results?
Pipelining
What is a common problem associated with materialization when evaluating expressions with multiple operations?
Creating lots of temporary relations
What is the procedure of converting a query written in high-level language into a correct and efficient execution plan expressed in low-level language?
Scanning, Parsing, and Validating
What is the predominant cost factor in query processing?
Disk accesses
Why is the cost to write a block greater than the cost to read a block in query processing?
To ensure successful write, data must be read back after writing
Which of the following contributes to the total time cost in executing a query?
Disk accesses, CPU time, and Network communication cost
What is the main role of the parser in query processing?
Checking query syntax against language rules
Study Notes
Query Processing and Optimization
- Query processing is a procedure of converting a query written in high-level language (e.g. SQL) into a correct and efficient execution plan expressed in low-level language, used for data manipulation.
- A query must be scanned, parsed, and validated to ensure it is formulated according to the syntax rules of the query language.
- Validation checks that all attribute and relation names are valid and semantically meaningful in the schema of the particular database being queried.
Steps in Query Processing
- Parser checks the syntax of the query and verifies attribute names and relation names.
- Query Translator translates the query into its internal form (relational algebra).
- The optimizer chooses the best execution plan.
- The query evaluation plan is executed, and the output is returned.
Measures of Query Cost
- Cost is generally measured as the total time required to execute a statement/query.
- Factors that contribute to time cost include disk accesses, CPU time, and network communication cost.
- Disk access is the predominant cost, as it is slow compared to in-memory operation.
- The cost of writing a block is greater than the cost of reading a block.
Selection Operator
- Symbol: σ (Sigma)
- Notation: σ condition (Relation)
- Operation: Selects tuples from a relation that satisfy a given condition.
- Operators: =, <, >, ≤, ≥, ∧ (AND), ∨ (OR)
Search Algorithms for Selection Operation
- Linear search (A1)
- Scans each block and tests all records to see whether they satisfy the selection condition.
- Cost of linear search (worst case) = br, where br is the number of blocks containing records from relation r.
- Cost of linear search (best case) = (br /2)
- Binary search (A2)
- Generally used if selection is an equality comparison on the primary key attribute and the file (relation) is ordered on the primary key attribute.
- Cost of binary search = [log2(br)]
Evaluation of Expressions
- Expressions may contain more than one operation, making it difficult to solve if it contains more than one operation.
- Two methods for evaluating an expression carrying multiple operations are:
- Materialization: evaluates the expression tree from the bottom and performs the innermost or leaf-level operations first.
- Pipelining: operations form a queue, and results are passed from one operation to another as they are calculated.
Materialization and Pipelining
- Materialization:
- Evaluates the expression tree from the bottom and performs the innermost or leaf-level operations first.
- The intermediate result of each operation is materialized (stored in a temporary relation) and becomes input for subsequent operations.
- The cost of materialization is the sum of the individual operations plus the cost of writing the intermediate results to disk.
- Pipelining:
- Operations form a queue, and results are passed from one operation to another as they are calculated.
- Combining operations into a pipeline eliminates the cost of reading and writing temporary relations.
- Pipelines can be executed in two ways: demand-driven and producer-driven.
Test your knowledge on query processing and optimization in Database Management Systems with this quiz covering steps in query processing, measures of query cost, selection operation, evaluation of expressions, query optimization, transformation of relational expressions, sorting and join operations.
Make Your Own Quizzes and Flashcards
Convert your notes into interactive study material.
Get started for free