Lesson 13: Backup, Storage Redundancy, RAID
Uploaded by MatureHeptagon
Republic Polytechnic
Summary
This lesson discusses critical concepts related to data backup and storage redundancy. Topics covered include strategies like the 3-2-1 backup, different storage media, and RAID technology. The document emphasizes the importance of data protection for business continuity.
Full Transcript
Lesson 13
Saturday, 15 February 2025 8:38 pm

Backup (Availability Control)

A key element in any Business Continuity Plan is the availability of backups. This is true not only because of the possibility of a disaster but also because hardware and storage media will periodically fail, resulting in loss or corruption of critical data.

3-2-1 Backup Strategy
○ Create THREE copies of your data (1 primary copy and 2 backups)
○ Store your copies in at least TWO types of storage media (local drive, network share/NAS, tape drive, etc.)
○ Store ONE of these copies offsite (e.g., in the cloud)

Backup Media
○ HDDs / SSDs: Hard Disk Drives or Solid-State Drives, ranging from internal disks to external drives or Network Attached Storage (NAS)
○ Removable media: tape, thumb drives, portable hard disks, mobile phones, smart devices, etc.
○ Optical: CD-R/RW, DVD-R/RW, Blu-ray (BD-R/RW)
○ Cloud: Amazon S3, Dropbox, Box, Microsoft OneDrive, Google Drive

Backup Site
On-site
○ Backup is stored locally in the same location as the original data
Off-site (physical or remote)
○ Backup is stored in a different location (remote backup)
○ Via the network or physical delivery of the media
○ For example, Republic Poly off-site backup is in Nanyang Poly.

Cloud Backup
Many products are now on the market with varying features for business users, e.g.:
○ Data compression, encryption, offline backup (when the network is unavailable)
○ File retention policies and versioning, based on customer requirements
○ Redundancy: multiple copies at different locations
○ Centralised management console for the IT administrator
○ Cloud elasticity (automated storage space increase)
○ Help-desk support
○ Backup considerations such as media storage, rotation, backup frequency, etc. are outsourced to the vendor

Cloud Backup Service Providers
○ Acronis Backup Cloud
○ Backblaze Business Backup
○ Carbonite Safe Core Computer Backup
○ Dropbox Business
○ Google Drive Enterprise
○ Microsoft OneDrive for Business
○ … etc.

Backup Types

Full Backup
○ Complete and comprehensive backup of all files on a disk or server
○ Information and data are only current at the point of backup
○ Time-consuming process leading to a large “backup window” (backup time)
○ Fastest to restore

Incremental Backup
○ Partial backup that stores only the information that has been changed since the last full or partial backup.
○ E.g., if the full backup is on Monday night:
  ○ Tuesday night incremental backup will only contain information changed since Monday night.
  ○ Wednesday night incremental backup will only contain information changed since Tuesday night.
○ Usually the fastest backup to do
○ Slow to restore

Differential Backup
○ Similar to incremental backup, but backs up files that have been altered since the last full backup.
  ○ Mon night: full backup
  ○ Tue night: differential backup stores changes on Tue.
  ○ Wed night: differential backup stores changes on Tue and Wed.
○ The size (and time needed) of differential backups will increase as they are performed successively
○ Fast to restore compared to incremental

Summary of backup types
Full Backup: Backs up all files on a disk/server. Data is only current at the backup time. Time-consuming but fastest to restore.
Incremental Backup: Backs up only changes since the last backup. Fast to perform. Slow to restore.
Differential Backup: Backs up changes since the last full backup. Increases in size with each successive backup. Faster to restore than incremental.

○ Full backups - all files are backed up regardless of whether any changes have been made. This takes up the most space.
○ Incremental backups - only changed files are backed up, more space-efficient
○ Differential backups - files changed since the last full backup are backed up

Backup Restoration Example
Scenario: XYZ company does daily backups from Mondays to Sundays. Supposing the server crashes after the entire week’s backup has completed, the restoring of data for the different backup plans is shown in the table on the next slide.

Things to consider for Backup

How big is the data to be backed up?
○ All files and folders in the server, or selected only
○ Backup media Blu-ray: up to 100GB (BD-R XL)
○ Backup media tape: up to 18TB (LTO-9 cartridge, uncompressed), future tape generations to exceed 300-500TB (LTO-12 and beyond)

Is the chosen media fast enough for your need?
○ Tape is generally slower but has huge capacity
○ Remote backup
  ○ Via Internet - slow
  ○ Via high bandwidth private network - SAN transfer bandwidths using Fibre Channel can range from 16 to 32 Gbps

How fast is the Restore?
○ System may not be operable until the Restore is complete
○ If the Restore time is long, it may not be suitable for time-critical applications (think of the Singapore Stock Exchange)

Media rotation and backup horizon
○ How long should the backup data be available? 1 week, 1 month, 1 year?
○ [Refer to additional slides at the end…]

Cost
○ Fast and high-capacity media cost more money
○ SAN expensive; tape cheaper; DVD cheapest
○ Depending on budget and requirements

Security
○ Only authorised personnel should be allowed to perform backup operations and have access to backup data
○ Physical access to backup media has to be tightly controlled
○ Option of encrypting data before writing to storage media

Continuous Monitoring & Automation Strategies

Automation:
○ Scripting (e.g. scheduled backup script)
○ Automated Backup
○ Automated courses of action for different scenarios (e.g. power failure, system shutdown while backing up, or faulty backup)
○ Snapshots (scheduled or ad-hoc)
○ Cloud storage elasticity (automatic increase of storage space according to backup needs)

Continuous monitoring:
○ Monitoring of backup logs
○ Automated event handling or notifications triggered by alerts
○ Monitoring of remote backup
○ Monitoring of external storage like Network Attached Storage (NAS)
○ Monitoring of RAID storage system

Storage Redundancy

○ Hard Disk Drive (HDD) is a commonly used magnetic storage type
  ○ Large capacities - up to 16 / 18 / 20 TB (from Seagate & Western Digital), larger sizes are on the horizon with continuing research
○ Solid State Drive (SSD) uses non-volatile (NV) NAND flash memory chips, much faster than HDDs, but with lower capacity and limited write lifespan
○ SSDs are more often used when speed is critical - for operating system (OS) partitions and some cloud-based workloads - while HDDs are used more for backups

Storage drives can crash due to:
○ Wear and tear (hardware)
○ Manufacturing defect (hardware)
○ Software bugs or malware

Hardware solution
○ RAID (Redundant Array of Independent Drives)

What is RAID (Redundant Array of Independent Disks)?
○ A technology that provides increased storage functions and reliability through redundancy, achieved by combining multiple disk drive components into a logical unit, where data is distributed across the drives in one of several ways called "RAID levels"
○ In this lesson, we focus on RAID 0, RAID 1, RAID 5 and RAID 6
○ Watch video on What is RAID 0, 1, 2, 3, 4, 5, 6 and 10 (1+0)? (Note: focus on RAID 0, 1, 5 and 6 only)

RAID 0
○ No fault tolerance or redundancy (data is not mirrored)
○ Striped Disk Array
  ○ Takes a single chunk of data => spreads data across multiple drives
○ Advantage is in improved performance (twice the amount of data can be written in a given time frame to the two drives)
○ Minimum of 2 drives to implement.
○ If one disk fails, all the data on the array will be lost, as there is neither parity nor mirroring

Advantages
○ Great performance, both in read and write operations.
○ No overhead caused by parity controls.
○ All storage capacity can be used.
○ Technology is easy to implement.

Disadvantages
○ No fault tolerance (if one disk fails, all data are lost).
○ Should not be used on mission-critical systems.

Ideal use
○ Ideal for non-critical storage of data that have to be read/written at high speed.

RAID 0 (Striping, No Redundancy)
Splits data across multiple drives for higher performance. No fault tolerance - if one disk fails, all data is lost. Requires at least 2 drives.
Pros:
✔ Very fast read/write speeds.
✔ No storage wasted on redundancy.
✔ Simple and easy to implement.
Cons:
✖ No protection against disk failure.
✖ Not suitable for critical data.
Best for:
✅ High-speed, non-critical storage (e.g., gaming, temporary data).

RAID 1
RAID 1 - Mirror
○ Content of 1 hard disk is copied to the other hard disk (mirror) in real time
○ If one hard disk crashes, the other can be used and has all the up-to-date data

Advantages
○ Excellent read speed and a write speed that is comparable to that of a single disk.
○ In case a disk fails, data do not have to be rebuilt; they just have to be copied to the replacement disk.

Disadvantages
○ Effective storage capacity is only half of the total disk capacity because all data get written twice.
○ Software doesn’t always allow a hot swap of a failed disk (meaning it cannot be replaced while the server keeps running).

Ideal use
○ Ideal for mission-critical storage (e.g. accounting systems)
○ Suitable for small servers in which only two disks will be used.

If more than 2 disks are used in a RAID 1 configuration, the effective storage capacity may remain at 50% of the total disk capacity, or it may decrease depending on how the disks are configured. For example, if 2 disks are used, the effective storage efficiency is 50%. If 4 disks are used, the effective storage efficiency is 50% if 2 of the disks are used for data and the other 2 disks are configured for mirroring. The effective storage efficiency drops to 25% (or 1/4) if only 1 of the disks is used for data and all the other 3 disks are configured as mirrors.

RAID 1 (Mirroring)
Copies data from one disk to another in real time. If one disk fails, the other has all the data.
Pros:
✔ Fast read speed, decent write speed.
✔ No need to rebuild data, just copy to a new disk.
Cons:
✖ Uses 50% of total storage (data is duplicated).
✖ Some systems don’t support hot-swapping failed disks.
Best for:
✅ Mission-critical storage (e.g., accounting systems).
✅ Small servers with two disks.

RAID 5 & 6

RAID 5
○ Striped set with distributed parity
○ Minimum 3 disks
○ If one disk crashes, a new blank disk can be inserted without shutting down the server (this is known as hot swapping)
○ The new disk will have its content rebuilt based on the information in the other disks

RAID 5 (Striping with Parity)
Needs at least 3 disks. Data is striped across disks with distributed parity. If one disk fails, it can be rebuilt from the others. Hot-swappable (replace disk without shutting down). Storage efficiency improves with more disks (e.g., 4 disks = 75%, 5 disks = 80%).

RAID 6
○ Striped disks with dual parity
○ Similar to RAID 5 except that it can recover from the loss of TWO disks

Advantages
○ Read data transactions are very fast, while write data transactions are somewhat slower (due to the parity that has to be calculated).

Disadvantages
○ Disk failures have an effect on throughput, although this is still acceptable.
○ This is a complex technology.

Ideal use
○ A good all-round system that combines efficient storage with excellent security and decent performance.
○ Ideal for file and application servers.

The storage efficiency of RAID 5 and RAID 6 configurations increases with the number of disks used.
Example 1: When 4 disks are used in a RAID 5 configuration, the effective storage efficiency is (1 - 1/4) or 75%. When 5 disks are used instead, the effective storage efficiency increases to (1 - 1/5) or 80%.
Example 2: When 4 disks are used in a RAID 6 configuration, the effective storage efficiency is (1 - 2/4) or 50%. When 5 disks are used instead, the effective storage efficiency increases to (1 - 2/5) or 60%.

RAID 6 (Dual Parity)
Like RAID 5 but can handle two disk failures. Requires at least 4 disks. Storage efficiency improves with more disks (e.g., 4 disks = 50%, 5 disks = 60%).
Pros:
✔ Fast read speeds, reliable storage.
✔ More efficient as more disks are added.
Cons:
✖ Slower writes due to parity calculations.
✖ More complex setup.
Best for:
✅ File & application servers needing both storage efficiency and security

Problem with RAID

No protection against data corruption
○ E.g.: If a database file is corrupted, RAID will replicate that corruption to the other drive(s) - both will have corrupted data.

Less than maximum capacity
○ For mirror (RAID 1), capacity is only 50% because one disk is just a copy of the other disk.
○ For RAID 5, the parity information will approximately occupy the space equivalent to one disk.
○ RAID 5 can only afford one disk to fail
○ RAID 6 can afford up to two disks to fail - but consumes even more overhead space for parity information.
○ RAID 5 has performance overhead due to parity calculation.

Still a Single Point of Failure
○ RAID hardware, if damaged, may cause data loss too.

Proposed Improvement (Solution)
○ Backup using the 3-2-1 backup strategy.
○ Use a media rotation scheme: Differential combined with Full Backup: faster backup, faster restore
○ Off-site backup in case of a cataclysmic event affecting the office (fire, earthquake, etc.)
  ○ Off-site cloud storage
  ○ Remote backup
  ○ Server backup - get the remote backup server to run if the main server is down
○ Utilize a RAID-based system to avoid data loss due to hard disk crash
○ Incorporate automation and continuous monitoring
○ Come up with a company-wide policy on data storage and backup
○ Comply with the prevailing regulations and guidelines
  ○ Examples in Singapore: Monetary Authority of Singapore guideline on Internet Banking
  ○ Examples in USA: Sarbanes-Oxley Act, HIPAA

Data is an important asset and requires protection
○ Lawsuits happen because of
  ○ Loss and leakage of confidential information
  ○ Failure to comply with laws and regulations on data backup and storage
○ Use Backup and RAID to improve integrity and availability
○ Better confidentiality and integrity with storage encryption

○ Compare the different data backup types
  ○ Full
  ○ Differential
  ○ Incremental
○ Describe the different backup strategies
○ Identify the factors to consider for a backup scheme
○ Describe the use of Storage Media Redundancy - RAID
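The storage-efficiency figures quoted in this lesson for RAID 0, 1, 5 and 6 can be checked with a few lines of Python. This is only a sketch of the lesson's arithmetic: the function name `storage_efficiency` and the `data_disks` parameter are made up for illustration, and the RAID 1 default (half the disks hold data) is an assumption.

```python
def storage_efficiency(level, total_disks, data_disks=None):
    """Return the usable fraction of raw capacity for RAID 0, 1, 5 or 6.

    Hypothetical helper that mirrors the lesson's formulas:
    RAID 5 -> 1 - 1/n, RAID 6 -> 1 - 2/n.
    """
    minimum_disks = {0: 2, 1: 2, 5: 3, 6: 4}
    if level not in minimum_disks:
        raise ValueError("only RAID 0, 1, 5 and 6 are covered here")
    if total_disks < minimum_disks[level]:
        raise ValueError(f"RAID {level} needs at least {minimum_disks[level]} disks")

    if level == 0:
        # Striping only: nothing is spent on redundancy.
        return 1.0
    if level == 1:
        # data_disks = disks holding unique data; the rest are mirrors.
        # Assumption: if not given, half of the disks hold data.
        if data_disks is None:
            data_disks = total_disks // 2
        if not 1 <= data_disks < total_disks:
            raise ValueError("RAID 1 needs at least one data disk and one mirror")
        return data_disks / total_disks

    # RAID 5 spends one disk's worth of space on parity, RAID 6 two.
    parity_disks = 1 if level == 5 else 2
    return (total_disks - parity_disks) / total_disks


if __name__ == "__main__":
    for level, disks in [(5, 4), (5, 5), (6, 4), (6, 5)]:
        print(f"RAID {level} with {disks} disks: {storage_efficiency(level, disks):.0%}")
```

Running it reproduces the lesson's 75%, 80%, 50% and 60% examples; calling `storage_efficiency(1, 4, data_disks=1)` gives the 25% case of one data disk with three mirrors.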
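The lesson says that a replacement disk in RAID 5 has its content rebuilt from the other disks, but does not spell out how. A minimal sketch follows, assuming the usual XOR parity scheme; the block contents and the helper name `xor_blocks` are illustrative only, and real controllers work on much larger stripes with parity rotated across disks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe spread over three data disks (illustrative contents).
data_blocks = [b"ABCD", b"EFGH", b"IJKL"]

# The parity block is the XOR of all data blocks in the stripe.
parity = xor_blocks(data_blocks)

# Simulate losing the second disk: XOR the survivors with the parity
# block to recover exactly what the failed disk held.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)

assert rebuilt == data_blocks[1]
```

Because XOR is its own inverse, any single missing block can be recovered this way, which is why RAID 5 tolerates exactly one disk failure per array.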
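The restoration table referred to in the Backup Restoration Example is not reproduced in these notes. The sketch below derives which backup sets would be needed from the definitions of full, incremental and differential backups, assuming a full backup on Monday night for the incremental and differential plans. The plan names and the function `restore_chain` are hypothetical.

```python
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def restore_chain(plan, last_backup_day="Sun"):
    """List the backup sets needed to restore, in the order they are applied."""
    last = DAYS.index(last_backup_day)
    if plan == "full":
        # A full backup every night: only the latest one is needed.
        return [f"{DAYS[last]} full"]
    if plan == "incremental":
        # Monday's full backup plus every incremental taken since then.
        return ["Mon full"] + [f"{day} incremental" for day in DAYS[1:last + 1]]
    if plan == "differential":
        # Monday's full backup plus only the most recent differential.
        chain = ["Mon full"]
        if last > 0:
            chain.append(f"{DAYS[last]} differential")
        return chain
    raise ValueError(f"unknown plan: {plan}")


if __name__ == "__main__":
    for plan in ("full", "incremental", "differential"):
        print(plan, "->", restore_chain(plan))
```

Under these assumptions the incremental plan needs seven sets applied in order, while the differential plan needs only two, which is the reason incremental is described as slow to restore and differential as faster.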
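The 3-2-1 backup strategy from this lesson lends itself to a simple automated check, for example as part of a scheduled monitoring script. The following is a rough sketch; the `DataCopy` record, its field names and the sample inventory are assumptions made for illustration.

```python
from typing import NamedTuple


class DataCopy(NamedTuple):
    """One copy of the data (hypothetical record for this example)."""
    media: str      # e.g. "local drive", "NAS", "tape", "cloud"
    offsite: bool   # True if stored away from the primary location


def satisfies_3_2_1(copies):
    """Check the three conditions of the 3-2-1 rule."""
    enough_copies = len(copies) >= 3                        # THREE copies
    enough_media = len({c.media for c in copies}) >= 2      # TWO media types
    has_offsite = any(c.offsite for c in copies)            # ONE offsite
    return enough_copies and enough_media and has_offsite


inventory = [
    DataCopy(media="local drive", offsite=False),  # primary copy
    DataCopy(media="NAS", offsite=False),          # first backup
    DataCopy(media="cloud", offsite=True),         # second backup, offsite
]

assert satisfies_3_2_1(inventory)
```

Dropping the cloud copy from `inventory` makes the check fail on two counts: only two copies remain and none is offsite.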