Questions and Answers
Which of the following is the MOST crucial reason for performing backups in a computing environment?
- To enable the restoration of lost data due to various incidents. (correct)
- To reduce the overall cost of IT infrastructure.
- To improve system performance during peak hours.
- To ensure compliance with industry regulations.
What is the key difference between a full backup and an incremental backup?
- A full backup copies all data, while an incremental backup copies only the data that has changed since the last full backup. (correct)
- A full backup is performed daily, while an incremental backup is performed weekly.
- A full backup only copies system files, while an incremental backup copies user data.
- A full backup requires more storage space than an incremental backup.
In the context of data backup and recovery, what is the purpose of 'corporate guidelines'?
- To define terminology and dictate minimum requirements for data-recovery systems. (correct)
- To outline the budget allocated for data storage.
- To define the specific software to be used for backups.
- To establish the frequency of system hardware upgrades.
An organization experiences a disk failure resulting in total file system loss. Which type of restore is required in this scenario?
Which of the following is a typical characteristic of archival backups?
Why is it recommended to perform periodic 'fire drills' in the context of backup and restore procedures?
When engineering a backup and restore system, what is the role of a 'procedure'?
How does the use of snapshots on a SAN or NAS system benefit the process of restoring accidentally deleted files?
What factor primarily limits the speed of a backup?
What is the 'shoe-shining effect' in the context of tape backups, and why does it occur?
Which of the following is the MOST significant challenge when restoring an entire disk after a disk failure?
How is data consistency typically ensured when backing up a high-availability database?
What are the three aspects of the backup procedure that can be automated?
What is the purpose of defining different SLAs for each type of data within corporate guidelines?
Why are backups typically performed during off-peak times?
What is the role of the Service Level Agreement (SLA) in backup and restore operations?
How do disk drives used as a buffer alleviate performance problems during backups?
What is the primary factor affecting restore speed?
Which of the following best describes the difference in backup windows between an e-commerce site that operates globally and a traditional office environment?
The detailed schedule shows which disk will be backed up when; what does this schedule usually consist of?
Flashcards
Full Backup
A complete copy of all files on a partition.
Incremental Backup
Copying all files changed since the last full backup.
Corporate Guidelines
Defines terminology and dictates minimums for data recovery.
Data Recovery Policy
Procedure
Backup Schedule
Accidental File Deletion
Disk Failure Restore
Archival
Perform Fire Drills
Service Level Agreement (SLA)
Shoe-Shining Effect
Database Mirroring
Backup Automation
Study Notes
Backups vs. Restores
- Everyone dislikes backups because they are inconvenient and costly.
- While servers are being backed up, services run more slowly or may not run at all.
- Customers, on the other hand, value restores.
- SAs perform backups for the purpose of restores.
- Restoring lost data is a critical part of any environment.
- Data loss and equipment failure occur.
- Humans delete data both by mistake and on purpose.
- Judges impound all lawsuit-related electronic documents as needed.
- Shareholders need assurance that investments will not be worthless due to disasters.
- Data can be corrupted by mistake, on purpose, and by gamma rays from space.
- Backups are like insurance: you pay for them hoping never to need them, but in reality they are necessary.
Terminology
- A full backup is a complete backup of all files on a partition; Unix users call this a "level 0 backup."
- Incremental backups, often called "incrementals," copy all files that have changed since the previous full backup; Unix users call this a "level 1 backup."
- Incremental backups grow over time.
- If a full backup is performed on Sunday and an incremental backup each day of the week that follows, the amount of data being backed up should grow each day.
- Tuesday's incremental backup will include all the files from Monday's backup, as well as what changed since then.
- Friday's incremental backup includes all the files that were part of Monday's, Tuesday's, Wednesday's, and Thursday's backups, in addition to what changed since Thursday's backup.
- Some systems perform an incremental backup that collects all files changed since a particular incremental backup rather than the last full backup.
- A level 2 incremental backup contains files changed since the last level 1 backup, a level 3 backup contains files changed since the last level 2 backup, and so on.
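The level scheme above can be sketched in a few lines of Python. This is only an illustration, not how any particular backup product works: the names `BackupRecord`, `reference_time`, and `files_to_back_up` are hypothetical, and the sketch assumes the Unix dump convention that a level N backup captures everything changed since the most recent backup at any lower level, which matches the description above in the usual case.

```python
from dataclasses import dataclass


@dataclass
class BackupRecord:
    """One completed backup (hypothetical record type for this sketch)."""
    level: int          # 0 = full backup, 1 and above = incremental
    finished_at: float  # completion time as a POSIX timestamp


def reference_time(history, level):
    """Return the cutoff time for a backup at `level`, or None for a full backup.

    Assumption: a level N backup is taken relative to the most recent
    backup at any lower level.
    """
    if level == 0:
        return None
    lower = [b.finished_at for b in history if b.level < level]
    if not lower:
        raise ValueError("no lower-level backup exists; take a full backup first")
    return max(lower)


def files_to_back_up(mtimes, history, level):
    """Select the paths a backup at `level` would copy.

    `mtimes` maps each path to its last-modified timestamp.
    """
    cutoff = reference_time(history, level)
    if cutoff is None:
        return sorted(mtimes)  # full backup: every file
    return sorted(path for path, mtime in mtimes.items() if mtime > cutoff)
```

With a full backup at time 100 and a level 1 backup at time 200, a file modified at time 150 is picked up by the next level 1 backup but skipped by a level 2 backup, which is why level 1 incrementals keep growing until the next full backup.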
Engineering Your Backup and Restore System
- Determine the desired end result and work backward.
- Corporate guidelines shape the SLA for restores, based on a site's needs; the SLA becomes the backup policy, which in turn dictates the backup schedule.
- The "corporate guidelines" define terminology and dictate the minimums and requirements for data-recovery systems.
- The "SLA" defines the requirements for a particular site or application and is guided by the corporate guidelines.
- The "policy" documents the implementation of the SLA in general terms and is written in plain English.
- The "procedure" outlines how the policy is to be implemented.
- The detailed "schedule" shows which disk will be backed up when and may be static or dynamic; such a schedule usually consists of the policy translated from English into the backup software's configuration.
Reasons for Restores
- Restores are requested for accidental file deletion, disk failure, or for archival purposes.
- "Accidental File Deletion" happens when a customer has accidentally erased one or more files and needs to have them restored.
- A "Disk Failure" restore is needed when a hard drive fails and all of its data must be restored.
- "Archival" restores are done for business reasons: a snapshot of the entire "world" needs to be made on a regular basis for disaster-recovery, legal, or fiduciary reasons.
- Individual file restores assist customers who accidentally deleted the data; these customers are also the direct users of the data.
- Complete restores after a disk failure assist the SAs who committed to providing a particular SLA.
- Archival backups serve the needs of legal and financial departments that require the data, who are usually far detached from the data itself.
- Judges can't subpoena documents that aren't backed up.
Accidental File Deletion
- Customers want to quickly restore any file as it existed at any instant, though this is not always possible.
- If data is on a SAN or NAS with the snapshot feature, it is usually possible to perform a self-service restore with reasonable time granularity.
- Without SAN or NAS snapshots, a file can typically be restored as it existed at one-day granularity, and the restore takes three to five hours to complete.
- Snapshots reduce workload because the most common type of request becomes self-service and they increase customer productivity by reducing the amount of lost data that must be manually reconstructed.
Disk Failure
- Restoration is related to hard drive failure or any other hardware or software failure resulting in total file system loss.
- Disk failure causes loss of service and data.
- On critical systems, such as e-commerce and financial systems, RAID should be deployed so that disk failures do not affect service, with the possible exception of a loss in performance.
- In noncritical systems, such as common office environments, customers can typically expect the restore to be completed in a day.
- Restores often take a long time to complete.
- Restoration is slow because large volumes of data are being restored, and the entire volume of data is unavailable until the last byte is written.
- To make matters worse, a two-step process is involved: First the most recent full backup must be read, and then the most recent incremental(s) are read.
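The two-step read order can be made concrete with a small Python sketch. The names `TapeBackup` and `restore_chain` are hypothetical, and the sketch assumes the level convention described under Terminology: a later backup at the same or a lower level supersedes the incrementals that came before it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TapeBackup:
    """One backup available for a restore (hypothetical record type)."""
    label: str
    level: int     # 0 = full, 1 and above = incremental
    taken_at: int  # any sortable time value, such as a day number


def restore_chain(backups):
    """Return the backups to read, in order, to rebuild a failed volume."""
    ordered = sorted(backups, key=lambda b: b.taken_at)
    fulls = [b for b in ordered if b.level == 0]
    if not fulls:
        raise ValueError("no full backup available")
    chain = [fulls[-1]]  # step 1: the most recent full backup
    for backup in ordered:
        if backup.taken_at <= chain[0].taken_at:
            continue  # older than (or equal to) the chosen full backup
        # Step 2: a newer backup at the same or a lower level already
        # contains everything in the incrementals it supersedes.
        while len(chain) > 1 and chain[-1].level >= backup.level:
            chain.pop()
        chain.append(backup)
    return chain
```

For a Sunday full backup followed by daily level 1 incrementals, the chain is just Sunday's full backup plus the latest incremental. With levels 1, 2, and 3 on successive days, every one of them has to be read, which lengthens the restore.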
Archival Purposes
- Corporate policies may require you to be able to reproduce the entire environment with a granularity of a quarter, half, or full year in case of disasters or lawsuits.
- The work that needs to be done to create an archive is similar to the full backups required for other purposes.
- Archives are full backups. Even in environments that usually mix full and incremental backups on the same tapes, archive tapes should not be mixed in this way.
- Some sites require archive tapes to be separate from the other backups.
- Archives are usually stored off-site.
- Archive tapes age more than other tapes, and they may be written on media that will become obsolete and eventually unavailable.
- If the archives are part of a disaster-recovery plan, special policies or laws may apply.
Perform Fire Drills
- The only time you know the quality of your backup media is when you are doing a restore.
- Restore time is generally the worst time to learn that you have problems.
- Assess the backup system by doing an occasional fire drill involving selecting a random file and restoring it from tape to verify that your process is working.
- It can be useful to do an occasional fire drill that involves restoring an entire disk volume.
- The speed at which the entire disk volume can be restored is often unknown because it is so rarely requested.
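A single-file fire drill could be scripted roughly as follows. This is a sketch under stated assumptions: `restore_file` is a hypothetical callback that wraps whatever restore command your backup software provides and returns the path of the restored copy, and the comparison is only meaningful for files that have not changed since the backup was taken.

```python
import hashlib
import random
from pathlib import Path


def sha256_of(path):
    """Hash a file in 1 MiB chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fire_drill(live_root, restore_file, rng=random):
    """Pick a random file under live_root, restore it, and compare contents.

    restore_file(path) is a hypothetical hook: it must restore the given
    file from backup and return the path of the restored copy.
    Returns (chosen_path, contents_match).
    """
    candidates = sorted(p for p in Path(live_root).rglob("*") if p.is_file())
    if not candidates:
        raise ValueError("no files found to test")
    chosen = rng.choice(candidates)
    restored = restore_file(chosen)
    return chosen, sha256_of(chosen) == sha256_of(restored)
```

Timing the `restore_file` call as well would give a measured restore time to compare against the SLA.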
Corporate Guidelines
- Organizations need a corporate-wide document that defines terminology and dictates requirements for data-recovery systems.
- The guideline should begin by defining why backups are required, what constitutes a backup, and which kinds of data should be backed up.
- A set of retention guidelines should be clearly spelled out.
- There should be different SLAs for each type of data, such as finance, mission-critical, project, general home directory, email, and experimental data.
- Backups usually have a performance impact, so they should be done during off-peak times.
- E-commerce sites with a global customer base will have very different backup windows than offices with normal business schedules.
A Data-Recovery SLA and Policy
- Determine the service level that's right for your particular site.
- An SLA is a written document that specifies the kind of service and performance that service providers commit to providing.
- This policy should be written in cooperation with your customers.
- Once the SLA is determined, it can be turned into a policy specifying how the SLA will be achieved.
- For most SAs, a corporate standard already exists, with vague, high-level parameters that they must follow.
- Make sure that your customers are aware of these guidelines.
The Backup Schedule
- With an SLA and policy in place, the schedule can be set. The schedule is specific: it lists details down to which partitions of which hosts are backed up when.
- Although an SLA should change only rarely, the schedule changes often, tracking changes in the environment.
- Many SAs choose to specify the schedule by means of the backup software's configuration.
- The SA may have to decide how often full backups are performed.
- Backup software has become increasingly automated over the years.
- It is common to simply list all partitions that need to be backed up and to have the software generate a schedule based on the requirements.
- The backups are performed automatically, and email notification is generated when tapes must be changed.
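Generating a schedule from a plain list of partitions might look like the sketch below. It does not describe any particular backup product: `generate_schedule` is a hypothetical name, and the weekly cycle with one full backup per partition, staggered across the days of the week, is an assumed policy chosen only to keep the example short.

```python
DAYS = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]


def generate_schedule(partitions):
    """Map each day of the week to a list of (partition, level) pairs.

    Assumed policy: every partition gets one full (level 0) backup per
    week and a level 1 incremental on the other days. Full backups are
    assigned round-robin so they do not all land on the same night.
    """
    schedule = {day: [] for day in DAYS}
    for index, partition in enumerate(sorted(partitions)):
        full_day = index % len(DAYS)
        for day_number, day in enumerate(DAYS):
            level = 0 if day_number == full_day else 1
            schedule[day].append((partition, level))
    return schedule
```

Staggering the full backups spreads the heaviest load over the week, which helps each night's work fit inside the backup window.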
Time and Capacity Planning
- Restores and backups are constrained by time.
- Restores need to happen within the time permitted by the SLA of the service.
- Most systems slow down considerably when backups are being performed.
- Some services must be shut down entirely during backups.
- The speed of a backup is limited by the slowest of the following: the read performance of the disk, the write performance of the backup medium, and the bandwidth and latency of the network between the disk and the backup medium.
- Restore time is affected by the reverse of those factors.
- Tape units frequently write to the tape at a much slower speed than they read from it.
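The slowest-link rule lends itself to a quick capacity estimate. The Python sketch below uses hypothetical function names and made-up example rates. It models sustained throughput only, so it ignores network latency and the tape drive dropping out of streaming mode, and should be read as a best-case figure.

```python
def bottleneck(rates_mb_s):
    """Return the (stage, rate) pair with the lowest sustained rate in MB/s."""
    if not rates_mb_s:
        raise ValueError("at least one stage rate is required")
    return min(rates_mb_s.items(), key=lambda item: item[1])


def transfer_hours(data_gb, rates_mb_s):
    """Estimate hours needed to move data_gb at the bottleneck rate."""
    _, rate = bottleneck(rates_mb_s)
    return data_gb * 1024 / rate / 3600


def fits_window(data_gb, rates_mb_s, window_hours):
    """Check whether the transfer fits the time the SLA or window allows."""
    return transfer_hours(data_gb, rates_mb_s) <= window_hours


# Hypothetical rates for one backup path; measure your own environment.
example_rates = {"disk_read": 200.0, "network": 100.0, "tape_write": 160.0}
```

For a restore estimate, swap in the tape read rate and the file system write rate, since restore time depends on the reverse path.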
Backup Speed
- The slowest link in the chain will determine the speed at which the backup will happen.
- The backup process is also affected by mechanical issues: most tape drives write at a high speed if they are being fed data as quickly as they can write (streaming mode), but downshift to a considerably slower speed if they are not being fed data quickly enough to keep up with the tape write speed.
- If the drive has no data to write, the drive must stop, reverse position, and wait until it has enough data to start writing again. Drive manufacturers call this the "shoe-shining effect," because the read/write mechanism moves back and forth over the same spot on the tape.
- In addition to slowing tape performance, such repetition puts undue stress on the tape medium.
- Therefore, if the server cannot provide data quickly enough, backup speed is dramatically reduced.
- If network congestion is slowing the data's movement to the tape host, backups may be significantly slower than if congestion is not an issue.
- To alleviate performance problems during backups, it is common to use a disk drive as a buffer: servers back up their data to disks on the backup host, often as one file per disk volume per server, and the backup host then writes the completed backup files to tape at full tape speed.
Restore Speed
- Restores will also be as slow as the slowest link; the additional factors that come into play are:
- Finding a single file on a tape can take as long as a full disk restore itself.
- Restoring an entire disk is extremely slow, too.
- The main issue affecting the restore speed is not the drive read speed, but rather the file system write speed.
- Writes are much less efficient than reads on almost every file system, and reconstructing a file system often leads to worst-case performance.
- If the server is able to receive data quickly enough that the tape drive can stay in streaming mode, the restore can happen at maximum speed.
High-Availability Databases
- Databases have specific requirements for ensuring that a backup is successful.
- A database manages its own storage space and optimizes it for particular kinds of access to its complex set of data tables.
- Databases often need to be shut down so that no transactions can occur during the backup to ensure consistency.
- If the database has high-availability requirements, it is not acceptable to shut it down each night for backups.
- However, the risks associated with not doing a backup or performing the backup while the database is live are also unacceptable.
- Because it is safest to back up the data when the database is not running, high-availability sites usually mirror the database (using RAID 1+0, for example), so that the database needs to be stopped only long enough to disconnect a mirror.
- The disconnected mirror disks are in a consistent state and unaffected by database transactions and can be safely written to tape.
- Finally, when the backup is finished, the mirror disks can be reconnected to the live database, and the mirror will be brought back up-to-date automatically.
- In many environments a high-availability database is triply mirrored, with one set of disks actively mirroring the database and one set being detached and backed up.
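The ordering of the mirror-split steps can be sketched with a Python context manager. Everything here is hypothetical: `detached_mirror` and its four callbacks stand in for whatever commands your database and volume manager actually provide, and the sketch shows only the sequencing, not a real integration.

```python
from contextlib import contextmanager


@contextmanager
def detached_mirror(stop_db, start_db, detach, reattach):
    """Split a mirror for backup while keeping database downtime short.

    All four arguments are hypothetical callables supplied by the caller.
    The database is stopped only for as long as the detach takes.
    """
    stop_db()
    try:
        detach()    # the mirror is now in a consistent, quiescent state
    finally:
        start_db()  # restart the database even if the detach failed
    try:
        yield       # caller writes the detached mirror to tape here
    finally:
        reattach()  # the mirror resynchronizes with the live database
```

A caller would wrap the tape-writing step in `with detached_mirror(...):`, so the mirror is reattached even when the backup itself raises an error.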
Backup Automation
- Not automating backups is dangerous: the more you automate, the more you eliminate the chance of human error.
- Backups are boring, so if they aren't automated, they will not be done reliably, and it is embarrassing to have to face your CEO's question: "But why weren't there backups?"
- Three aspects of the backup procedure can be automated: the commands, the schedule, and tape management and inventory.