Physical Database Design

Questions and Answers

Explain how decisions made during physical database design, such as merging relations, can impact the logical data model and application design.

Merging relations can alter the logical structure, potentially affecting how applications interact with the database and requiring adjustments to application code.

In the context of physical database design, what does 'physical data independence' mean, and why is it important?

Physical data independence means users can access data without needing to know how or where it's stored. It's important because it allows flexibility in changing the physical structure without affecting applications.

Outline the primary responsibility of the designer regarding the database management system (DBMS) and the operating system during physical database design.

The designer must provide physical design details to both the DBMS (specifying file organizations) and the operating system (specifying location and protection for each file).

What are the key considerations when deciding whether to store a derived attribute in a database, instead of calculating its value each time it is needed?

Consider the additional storage cost, the cost of keeping the derived data consistent with the attributes it is computed from, and the cost of calculating it each time it is needed. Choose the less expensive option, subject to performance constraints.

How can you design a general constraint that prevents a member of staff from managing more than 100 properties at the same time using SQL?

Use a CHECK constraint with a NOT EXISTS clause: select staff numbers from the PropertyForRent table, group them by staff number, and verify that no group has a count greater than 100.

If a query optimizer doesn't utilize a secondary index, even when available, what implications does this have for physical database design, and what should be considered?

Adding the secondary index will not improve query performance, so its maintenance and storage overhead would be unjustified. How the optimizer processes and optimizes queries should be examined before committing to the index.

How might a transaction/relation cross-reference matrix assist in the physical database design process?

It helps identify the relations that each transaction accesses and indicates, in diagrammatic form, which relations are potentially heavily used.

What qualitative and quantitative information about transactions should be gathered to carry out physical database design effectively?

Identify the transactions that run frequently and will have a significant impact on performance, the transactions that are critical to the operation of the business, and the times of day when there will be a high demand on the database.

Describe how the '80/20 rule' can be applied to the analysis of database transactions, and what is its significance?

It refers to the idea that the most active 20% of user queries account for 80% of the total data access, so transaction analysis can concentrate on that 20% rather than on every transaction.

What considerations are associated with using a secondary index, and how do they weigh against performance improvements?

An index record must be added to every secondary index whenever a tuple is inserted into the relation; a secondary index must be updated whenever the corresponding tuple is updated; extra disk space is needed to store each secondary index; and query optimization may degrade as the optimizer considers more indexes. These overheads must be weighed against the expected performance gains.

What is an "index-only plan" in the context of SQL query execution, and under what conditions might you consider creating indexes to enable such a plan?

This is a plan in which the data in the index alone are used to produce the required response, without accessing the data file. Consider creating such indexes when a frequent or critical query references only a small set of attributes that can all be held in one index.

Outline some guidelines for generating a 'wish-list' of index candidates.

Do not index small relations; index the primary key of a relation if it is not the key of the file organization; add a secondary index to any frequently accessed foreign key; and add a secondary index to any attribute heavily used as a secondary key.

What factors should be evaluated when deciding to apply an index to attributes with long character strings?

Avoid adding an index on such attributes: an index on a long character string is large and costly to maintain relative to the benefit it provides.

What are the advantages of defining UNIQUE integrity constraints instead of explicitly defining UNIQUE indexes on tables?

Oracle enforces UNIQUE integrity constraints by automatically defining a unique index on the unique key, so it recommends that UNIQUE indexes not be explicitly defined on tables; instead, UNIQUE integrity constraints should be defined on the desired columns.

While estimating disk space requirements, how would you determine the potential size of the database in the future?

Consider how the relation has grown over time and apply a growth factor to the current disk-space estimate to determine the potential future size of the database.

Describe how user views can provide security in the database.

In a multi-user DBMS, user views play a central role in both defining the structure of the database and enforcing security: each view exposes only the data a particular user or group is permitted to see.

Explain how the physical design process can affect the logical data model, and what should be done to address this?

If the physical design process affects the logical data model, the model should be updated accordingly, and all such changes should be fully documented.

What is meant by the term 'denormalization' in database design, and why is it sometimes considered?

Denormalization refers to a refinement of the relational schema such that the degree of normalization of a modified relation is less than the degree of at least one of the original relations. It is sometimes considered to improve the performance of frequent or critical queries.

Under what circumstances might denormalization be a consideration when tuning a database system?

If performance is unsatisfactory and a relation has a low update rate and a very high query rate, denormalization may be a viable option.

How can a transaction/relation cross-reference matrix assist in identifying potential candidates for denormalization?

The matrix summarizes, in a visual way, the access patterns of the transactions that will run on the database, highlighting relations that heavily used transactions access together.

What steps are involved when duplicating key attributes in relation to one-to-many relationships to reduce joins?

With the specific aim of reducing or removing joins from frequent or critical queries, consider the benefits of duplicating one or more non-key attributes of the parent relation in the child relation in a 1:* (one-to-many) relationship.

How do you minimize the risk of inconsistency when duplicating attributes from parent relations to child relations?

If duplicated data is changed in the parent relation, it must also be updated in the child relation; ideally this propagation is automated, because if the update cannot be automated, the potential for loss of integrity is considerable.

When planning to add a new foreign key to the PrivateOwner relation, what should be done following such an addition?

Introduce additional foreign key constraints so that referential integrity is preserved for the new relationship.

In what circumstances is it appropriate to incorporate repeating groups into a relation?

This denormalization should be considered only when the absolute number of items in the repeating group is known and that number is static (will not change over time).

What technique is used to produce extract tables?

The most common technique for producing extract tables is to create and populate the tables in an overnight batch run, when the system is lightly loaded.

What are the two main types of partitioning for a very large relation?

Horizontal partitioning and vertical partitioning.

Why use a partition when designing new tables?

Partitions are particularly useful in applications that store and analyze large amounts of data; their advantages include improved load balancing and improved performance.

Should alterations to the database schema and structures be tested before implementing?

If possible, test the changes on a test database, or alternatively when the system is not fully loaded (such as outside working hours).

Explain the difference between horizontal and vertical partitioning.

Horizontal partitioning distributes the tuples of a relation across smaller relations, whereas vertical partitioning distributes its attributes across smaller relations.

List the advantages of using partitioning in tuning a very large database.

Improved load balancing, improved performance through limiting the data examined, increased availability, and improved recovery.

Do improved response times provide a benefit to the organization?

Yes: improved response times can improve staff morale and increase customer satisfaction.

What is throughput, and how does it affect system efficiency?

Throughput is the number of transactions that can be processed in a given amount of time; it is directly related to system efficiency.

Name three ways response time can be shortened from the user perspective.

Reducing contention and wait times (particularly for disk I/O), reducing the time for which resources are held, and using faster components.

Describe how a physical database designer could monitor and tune the operational system.

Many DBMSs provide the DBA with utilities to monitor and tune the operation of the system; this is needed because the initial physical design should not be regarded as static.

What is one way a designer can use available resources better to reduce CPU bottlenecks and make it the system more effective?

One option is to ensure that no unnecessary jobs are run during peak load, scheduling such jobs in off-hours instead; another is to consider adding CPUs (multiple processors).

How does the physical organization of data on a disk affect performance, and what techniques can be used to address potential problems?

The physical organization of data on disk can have a major impact on overall performance; techniques to address problems include distributing I/O across multiple disks so that no single device becomes a bottleneck.

Mention some possible benefits from tuning the database, and the resources these may apply to.

Improved load balancing, improved performance, increased availability, and improved recovery; these apply to resources such as main memory, CPU, disk I/O, and the network.

How do OLE object fields affect performance? How can large object files be better handled?

OLE object fields are a primary performance concern due to their significant storage requirements. Large objects can be better handled by storing them outside the database (for example, as files referenced by path), keeping the tables compact while maintaining system integrity.

Provide some insight on a general rule of thumb regarding database indexing.

As a rule of thumb, avoid indexing an attribute or relation that is frequently updated, and avoid indexing an attribute if a typical query will retrieve a significant proportion of the tuples (for example, 25%), since a full scan is then cheaper.

When considering vertical and horizontal partitioning, what does the designer need to consider?

Potential inconsistency between partitions, the increased complexity (and cost) of managing them, and the cost of joining partitions back together when queries span them.

Flashcards

Physical Database Design

The process of implementing a database on secondary storage. It includes defining base relations, file organizations, indexes, integrity constraints, and security measures.

Step 3 in Physical Database design

To define base relations and general constraints using the target DBMS functionality.

Step 4 in Physical Database design

Choosing file organizations and indexes for base relations.

Step 5 in Physical Database design

Deciding how each user view should be implemented.

Step 6 in Physical Database design

Designing security measures to protect data.

Step 7 in Physical Database design

Considering relaxing normalization constraints and introducing redundancy.

Step 8 in Physical Database design

Monitoring and tuning the operational system.

Translate Logical Data Model

Converting logical data model relations into a form that can be implemented in the target relational DBMS.

Derived/Calculated Attributes

Attributes whose value can be found by examining the values of other attributes.

General Constraints

Constraints beyond required data, domain constraints, entity and referential integrity.

File Organizations and Indexes

Storing and accessing data in an efficient way.

Transaction Relation Cross-Reference

A matrix of relations that each transaction accesses.

Transaction Usage Map

Diagram that indicates which relations are potentially heavily used.

Denormalization

A process where normalization constraints are relaxed on the logical data model to improve performance.

Duplicating Attributes

Duplicating non-key attributes of a parent relation in a child relation to decrease the number of necessary joins.

Partitioning

Decomposing a large relation into smaller, more manageable pieces.

Horizontal Partitioning

Distributing tuples across smaller relations.

Vertical Partitioning

Distributing attributes across smaller relations.

Transaction Throughput

Number of transactions processed in a given time interval.

Response Time

Elapsed time for the completion of a single transaction.

Study Notes

  • Physical database design involves implementing a database on secondary storage, focusing on base relations, file organizations, indexes, integrity constraints, and security measures.
  • Physical database design is highly dependent on the target DBMS, as different systems offer varying functionalities.

Chapter Objectives

  • Physical database design aims to translate the logical data model into a physical database design.
  • It involves designing base relations for the target DBMS.
  • General constraints tailored for the target DBMS are designed.
  • Selecting suitable file organizations is achieved through the analysis of transactions.
  • The process includes deciding when to utilize secondary indexes for boosting performance.
  • Estimating the database size is a key part of the planning process.
  • Designing user views and security mechanisms to meet user requirements are part of this methodology.

Database Design Methodology

  • Involves conceptual, logical, and physical database design phases
  • Logical database design is implementation-independent, focusing on ER diagrams, relational schemas, and data dictionaries
  • Physical database design is concerned with knowing how the computer system and DBMS operate.
  • Physical database design requires close knowledge of the target DBMS and isn't isolated, providing feedback to logical and application design.

Overview of Physical Database Design Methodology

  • The physical database design methodology has six main steps, from 3 to 8, following conceptual and logical database design.
  • Translate the logical data model for target DBMS is step 3.
  • Design file organizations and indexes is step 4.
  • Design user views is step 5.
  • Design security mechanisms is step 6.
  • Consider the introduction of controlled redundancy is step 7.
  • Monitor and tune the operational system is step 8.

Translating the Logical Data Model

  • Produces a relational database schema implementable in the target DBMS.
  • Requires collating information from logical design, data dictionary, requirements collection, and systems specification.
  • The designer also needs to know how to create base relations, and whether the system supports primary, foreign, and alternate keys; required data (NOT NULL); and definitions of domains, integrity rules, and general constraints.
  • Three core activities: design base relations, decide how to represent derived data, and design general constraints.

Designing Base Relations

  • Decision-making on how to represent base relations in the target DBMS.
  • Involves collating information from logical database design, including data dictionary and DBDL (DataBase Description Language) definitions.
  • Extended form of DBDL is used to define domains, default values, and null indicators.
  • The process involves implementing base relations, dependent on the capabilities of the target DBMS.
  • All design choices should be fully documented.

Design Representation of Derived Data

  • Deciding how to represent derived data within the target DBMS.
  • Derived or calculated attributes have values computable from other attribute values.
  • It involves weighing the cost of storing derived data versus calculating it on demand.
  • Document all design choices and alternatives.

Design General Constraints

  • The step is focused on designing general constraints for the target DBMS.
  • Updates to relations may be constrained by integrity constraints governing the real-world transactions they represent.
  • Considers remaining general constraints beyond required data, domains, and integrity covered in Step 3.1.
  • Design depends on DBMS choice, with some systems offering more facilities.
  • The designer also needs to fully document the process.

File Organizations and Indexes

  • Determines the optimal file organizations and indexes for acceptable performance.
  • The aim is to determine an efficient way to store and access the relations and tuples that will be held on secondary storage.
  • Needs full understanding of storage structures and the system's query optimizer, which is dependent on the target DBMS.
  • Physical database design must align with the nature and intended use of data.
  • It relies on workload knowledge, and understanding performance requirements specified during analysis and the requirements stage.

Analyzing Transactions

  • The objective is to fully understand the functionality and important transactions.
  • It's important to identify: frequently run transactions that have a big impact on performance; transactions critical to business operations.
  • Peak load times.
  • The database designer must identify performance criteria.

Using a Transaction Relation Cross-Reference Matrix

  • There are processes that help identify transactions to investigate
  • A transaction relation cross-reference matrix can be utilized, showing relations accessed by each transaction.
  • A transaction usage map diagrammatically indicates which relations are potentially heavily used.
  • Three step process:
  1. map all transaction paths to relations.
  2. determine which relations are most frequently accessed by transactions.
  3. analyze the data usage of selected transactions that involve these relations.

Frequency Information

  • Frequency information needs to be determined for each transaction.
  • It's important to determine not only the average and maximum number of times a transaction runs per hour, but also the day and time at which it runs.

Data Usage

  • Analyze the data usage to determine: the relations and attributes accessed by the transaction; the type of access; the predicates that may be used; and whether any of these are candidates for access structures.

File Organizations

  • Efficiency is key: for example, transactions may need to retrieve tuples in alphabetical order of staff name, or by salary range.
  • Objective is to choose optimized approach that is relevant to the correct DBMS.
  • DBMSs may offer little or no choice for file organizations.
  • File types:
  • Heap
  • Hash
  • Indexed Sequential Access Method (ISAM)
  • B+-tree
  • Clusters
  • Choice of file orgs should be documented.

Types of Indexes

  • Even if tuples are unordered, a primary or clustering index can be specified; in that case, decide which attribute offers the best efficiency
  • If the chosen attribute is a key of the relation, the index is a primary index; if it is not a key, it is a clustering index
  • Each relation can have at most one primary or clustering index

Choosing Secondary Indexes

  • Secondary indexes provide additional keys for a base relation that can improve the efficiency of data retrieval
  • Example: PropertyForRent may hash on PropertyNo but have frequent access on rent attributes. Therefore can add rent as secondary index
  • There is overhead, which includes:
  • Adding index record
  • Updating secondary index
  • Increase in disk space
  • Performance degradation

Index Guidelines

  • Do not index small relations
  • Add an index on the primary key of a relation if it is not the key of the file organization; do not assume the primary key is indexed automatically
  • Add secondary index to a foreign key if use is frequent. Some DBMSs will automate index of the foreign keys
  • Add a secondary index to any attribute that is heavily used
  • Secondary index on attributes that contain selection criteria, are ordered by, are grouped by, or contain other sorting (union or distinct)
  • Attributes used in built-in aggregate functions

Wish List Guidelines

  • It can be inefficient to add an index for an attribute used only in a GROUP BY clause
  • An index can allow better performance through index-only plans, where queries are answered from the index data alone, without accessing the data file
  • If a query will retrieve a significant proportion of the tuples, there is no point in indexing, since a full scan is cheaper
  • Avoid indexes on long character strings
  • Consider impact on updates; are they going to slow things down too much?
  • Experiment to determine whether an index is worthwhile; if it slows updates too much, remove it from the wish-list
  • If there is little observed improvement with the addition of the index, further examination may be necessary
  • Tools exist to inspect the optimizer's strategy: Microsoft Office Access provides a Performance Analyzer, and other products expose query execution plans (QEPs), such as Oracle's EXPLAIN PLAN utility

Estimating Disk Space

  • The objective is to estimate the disk space required by the database; all such estimates should be documented
  • Physical implementation depends on the hardware, but the designer needs a size estimate to assess how much storage must be allocated for hardware procurement.
  • Estimate based on size of each tuple and number of tuples in relation
  • The latter estimate should be a maximum, allowing the designer to see how the relation will grow and the potential disk size required.

Designing User Views

  • The objective is to design the user views identified during the requirements collection and analysis stage of the database development lifecycle.
  • The first phase of the database design methodology produces a conceptual data model, which may represent a single user view or a combination of user views.
  • This step implements the views previously identified.
  • Views are versatile: in a standalone DBMS on a PC they standardize and simplify database requests; in a multi-user DBMS they play a central role in defining the structure of the database and enforcing security.
  • The view mechanism of the ISO SQL standard is relevant here.
  • All individual user views should be fully documented.

Security Mechanisms

  • The goal is to design security mechanisms that meet the security requirements gathered during requirements collection and analysis.
  • The database is an essential corporate resource, so it must be protected in line with the system security requirements.
  • The designer must understand both the data and the security facilities offered by the target DBMS.
  • Relational DBMSs generally provide two types of security.

Aspects Of Relational DBMS Security

  • System security: access to the DBMS itself, through user accounts and passwords.
  • Data security: access to database objects, through views and access privileges.

Document And Redundancy

  • Any redundancy introduced should be documented
  • The reasons for introducing it must be recorded
  • The logical data model should then be updated to reflect the changes
