Full Transcript

NTE316, AIE316, ISE316 Storage Techniques Solution

Course contents
- Introduction to Information Storage
- Data Center Environment
- Data Protection: RAID
- Intelligent Storage Systems
- Fiber Channel Storage Area Networks
- IP SAN and FCoE
- Network-Attached Storage

Reference Book
Information Storage and Management: Storing, Managing, and Protecting Digital Information in Classic, Virtualized, and Cloud Environments, 2nd Edition. Edited by Somasundaram Gnanasundaram and Alok Shrivastava.

Students evaluation
- 20% Class work for lecture/section
- 20% Mid-term exam
- 60% Final exam

Foreword
Data carries information during transmission over networks. What is the relationship between information and data? What is the function of data storage? This course describes the definition of information and data in the computer field, their relationship, as well as the concept, development history, and development trend of data storage.

Objective
On completion of this course, you will be able to understand:
- Definition of information and data
- Concept of data storage
- History of data storage
- Development trend of data storage products

Topics: Data and Information, Data Storage, Storage Architecture, Storage Technologies, Storage Media, Interface Protocols, Trend of Storage Technologies.

Contents
1. Data and Information
2. Data Storage
3. Development of Storage Technologies
4. Development Trend of Storage Products

What is Data
SNIA (Storage Networking Industry Association) defines data as the digital representation of anything in any form. Data as a general concept refers to digits, letters, and symbols that can be input into a computer and processed by a computer program.
Figure: the format in which data is stored. An email, for example, is stored as a sequence of binary digits (0101001010100010...).
Digital music, digital video, and e-books are likewise stored as binary digits.
DAMA defines data as the expression of facts in the form of texts, numbers, graphics, images, sounds, and videos.
SNIA is short for Storage Networking Industry Association. DAMA refers to the Global Data Management Community.

Data Types
Based on data storage and management modes, data is classified into structured, semi-structured, and unstructured data.
- Structured data: It can be represented and stored in a relational database, and is often represented as a two-dimensional table. Examples: SQL Server, MySQL, Oracle.
- Semi-structured data: It does not conform to the structure of relational databases or other data tables, but uses tags to separate semantic elements or to enforce hierarchies of records and fields. Examples: XML, HTML, JSON.
- Unstructured data: It is not organized in a regular or complete data structure, or does not have a predefined data model. Examples: texts, pictures, reports, images, audio, and videos.

Data Processing Cycle Steps
Data processing is the reorganization or reordering of data by humans or machines to increase its specific value. A data processing cycle includes three basic steps: input, processing, and output.
- Input: data is entered in a specific format, which depends on the processing mechanism. For example, when a computer is used, the input data can be recorded on several types of media, such as disks and tapes.
- Processing: actions are performed on the input data to obtain more value from it. For example, time card hours are used to calculate payroll, or sales orders are aggregated to generate sales reports.
- Output: the processing result is generated and output. The form of the output data depends on how the data will be used. For example, the output data can be an employee's salary.
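The input, processing, and output steps can be sketched with the time card example from the slide. This is only an illustrative sketch: the `TimeCard` record, the function names, and the sample hours and rates are hypothetical and do not come from the course material.

```python
from dataclasses import dataclass


@dataclass
class TimeCard:
    """One input record (hypothetical layout)."""
    employee: str
    hours_worked: float
    hourly_rate: float


def read_input(rows):
    """Input step: convert raw text rows into typed records."""
    return [TimeCard(name, float(hours), float(rate)) for name, hours, rate in rows]


def process(cards):
    """Processing step: derive gross pay from hours and rate."""
    return {c.employee: c.hours_worked * c.hourly_rate for c in cards}


def write_output(payroll):
    """Output step: format the result for its intended use (a pay report)."""
    return [f"{name}: {pay:.2f}" for name, pay in sorted(payroll.items())]


# Sample data, invented for illustration only.
raw_rows = [("alice", "40", "25.0"), ("bob", "32.5", "20.0")]
report = write_output(process(read_input(raw_rows)))
for line in report:
    print(line)
```

Keeping the three steps as separate functions mirrors the cycle: the same processing step could be fed from a different input medium, or its result sent to a different output form.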
What Is Information
Information is processed, structured, or rendered data in a given context to make it meaningful and useful. Information is processed data, including data with context, relevance, and purpose. It also involves the manipulation of raw data.
There are many definitions of information. Most of them in the computer field are based on the definitions proposed by Claude Elwood Shannon, known as "the father of information theory." His theory holds that the essence of information is the resolution of random uncertainty. This means that information is useful.

Data vs. Information
After being processed, data can be converted into information. In order to be stored and transmitted in IT systems, information needs to be processed as data.

Item         Data                                              Information
Feature      Raw and meaningless, with no specific purpose     Valuable and logical
Essence      Original materials                                Processed data
Dependence   Data never depends on information                 Information depends on data
Example      Meteorological data or satellite image data       Weather forecasts

- Data is raw, unorganized bits that need to be processed to become meaningful, whereas information is a set of data that has been processed, interpreted, and presented so that it is meaningful for a given requirement.
- Data does not have any specific purpose, whereas information carries a meaning that has been assigned by interpreting data.
- Data alone has no significance, while information is significant by itself.
- Data never depends on information, while information is dependent on data.
- Data is measured in bits and bytes, while information is measured in meaningful units like time and quantity.
- Data can be structured as tabular data, a graph, or a data tree, whereas information is language, ideas, and thoughts based on the given data.
Information Lifecycle Management
Information lifecycle management (ILM) refers to a set of management theories and methods covering the period from the stage when the information is generated and initially stored to the stage when the information is deleted.
Figure: data value across the lifecycle phases: creation, protection, access, migration, archiving, destruction.
- Data creation phase: Data is generated from terminals and saved to storage devices.
- Data protection phase: Different data protection technologies are used based on data and application system levels to ensure that various types of data and information are effectively protected in a timely manner. A storage system provides data protection functions, such as RAID, HA, disaster recovery (DR), and permission management.
- Data access phase: Information must be easy to access and shareable among the organizations and applications of an enterprise to maximize business value.
- Data migration phase: When using IT devices, you need to upgrade and replace devices, and migrate the data from the old devices to the new ones.
- Data archiving phase: The data archiving system supports business operations by providing record queries for transactions and decision-making. Deduplication and compression are often used in this phase.
- Data destruction phase: After a period of inactivity, data is no longer saved. In this phase, it is normal to destroy or reclaim data that does not need to be retained or stored, and to clear the data from storage systems and data warehouses.

Contents
1. Data and Information
2. Data Storage
3. Development of Storage Technologies
4. Development Trend of Storage Products

What is Storage?
Narrow definition: CD, DVD, ZIP, tape drives, hard disks, etc.
Wider definition:
- Storage Systems (disk arrays, controllers, disk enclosures, tape libraries)
- Storage Software (backup and management software, and other value-added software such as snapshot and cloning software)
- Storage Networks (HBA cards, FC switches, FC/SAS cables, etc.)
- Storage Solutions (centralized storage, restore, backup, disaster recovery, etc.)

- A narrow definition of storage refers to physical storage devices such as floppy disks, CD/DVD, hard drives, or even the tape drives that are still used in some enterprises.
- A wider definition of storage refers to the storage used in data centers: storage systems, storage software, storage networks, and storage solutions.
- In reality, multiple storage systems, hardware, and software work together to form a solution that satisfies the needs of businesses requiring high levels of data management, such as data integration, backup, and disaster recovery.
- Servers access the data on storage hardware via a storage network, and storage software manages the data for data integration, backup, and disaster recovery purposes.
- This module introduces the wider definition: complex storage systems used for storing and managing critical business data.

Positioning of Storage Systems
- Storage Solution: disaster recovery, backup, storage management (ISM)
- Storage Software: snapshot, mirroring, backup, multipathing
- Storage Hardware:
  - External storage systems: disk arrays, NAS, tape libraries, virtual tape libraries
  - Storage connection devices: FC HBA cards, FC switches, Ethernet switches, connection cables

- Current storage technologies are no longer standalone systems. In reality, a complete storage system consists of a series of components.
- Nowadays, storage systems are divided into three components: storage hardware, storage software, and storage solution. The storage hardware component is further divided into external storage systems, such as disk arrays and tape libraries, and storage connection devices, such as HBA cards and switches.
- Storage software greatly increases the usability of storage systems: features such as data mirroring, cloning, automatic backup, and other data operations can be managed and completed through storage software.
- A well-designed storage solution makes data storage operations easier. An excellently designed storage solution not only allows easier storage system deployment, it also lowers the total cost of ownership (TCO) for the customer and ensures that the customer gets the best value and protection for their investment.

Contents
1. Data and Information
2. Data Storage
3. Development History of Storage
4. Evolution of Storage Technologies
5. Latest Storage Technologies and Trends
6. Storage Products and Solutions

Development History of Storage
Figure: DAS, NAS, SAN, Distributed Storage, Cloud Storage.
- Direct Attached Storage (DAS)
- Network Attached Storage (NAS)
- Storage Area Network (SAN)

DAS Architecture
Figure: servers connected directly to disk arrays through SAS, FC, or SCSI controllers.
Direct Attached Storage (DAS) is a storage architecture in which storage devices are directly connected to the servers. DAS provides block-based storage for servers (not file-based storage).
Examples of DAS storage: internal hard disks in servers, tape libraries that are directly connected to servers, and external hard drives that can be directly connected to servers.
DAS can be divided into internal DAS and external DAS based on the location of the storage in relation to the servers.
Internal DAS: In the internal DAS architecture, storage devices are connected inside the server to the internal bus through a serial or parallel connection. Because of the physical limit on cable length, it only supports short-distance, high-speed data transfers. There is also a limit on the number of devices that can be connected to the internal bus, and storage devices placed inside the server take up a lot of space, making it difficult to maintain other parts of the server.
External DAS: In the architecture of External DAS, servers are directly connected to external storage devices.
In most situations, they communicate with each other through the FC or SCSI protocols. Compared with internal DAS, external DAS overcomes the limits on distance and device count faced by internal DAS. Additionally, external DAS can provide centralized management of the storage devices, making them easier to manage.

NAS Architecture
Figure: NAS architecture (clients using NFS and CIFS over a dedicated IP storage network; NAS device with its own operating system, file system, NFS/CIFS services, RAID, and storage devices).
Network Attached Storage (NAS) devices are IP-based, file-level storage devices connected to a local area network. NAS lets customers share files quickly, with lower storage management cost, through file-based data access and shared storage resources over a network. With NAS, there is no need to set up multiple file servers, which makes it one of the preferred file-sharing storage solutions. NAS also eliminates the performance bottleneck that customers face when accessing file servers.
NAS uses network and file-sharing protocols to implement data archiving and data storage. These include the TCP/IP protocols for data transmission and the CIFS and NFS protocols for remote file management. UNIX and Windows users can share data seamlessly through NAS, usually through sharing methods such as NAS shares and FTP. When NAS is shared, UNIX usually uses the NFS protocol for file management while Windows uses the CIFS protocol.
With the advancement of network technologies, NAS has expanded to fulfill enterprise needs for high-performance, high-reliability data access.
NAS devices are usually dedicated, high-performance, high-speed devices with the single purpose of providing file service and storage. NAS devices use their own operating system and integrated hardware and software to fulfill specific file service requirements.
NAS optimizes its underlying operating system and file I/O (input/output) processes, so it performs file I/O better than common file servers. NAS devices can connect to more clients than traditional file servers, which allows a small number of NAS devices to consolidate multiple traditional file servers.

SAN Architecture
SAN is divided into FC SAN and IP SAN.

FC SAN Architecture
Figure: servers connected through FC switches (FC network) to the FC SAN storage controller and disk arrays.
- A Storage Area Network (SAN) is a dedicated high-performance network implemented between the servers and storage resources. It has been optimized specifically for transmitting large amounts of raw data. FC SAN can therefore be seen as an extension of the SCSI protocol for long-distance applications. The default protocols used by FC SAN are SCSI and Fiber Channel.
- Fiber Channel is especially suitable for SAN implementation because it is capable of transmitting large blocks of data and of long-distance transmission.
- The market for FC SAN is mostly centered on high-end, enterprise-grade storage implementations, for example storage arrays and backup devices. These implementations have high requirements for performance, redundancy, and availability.

IP SAN Architecture
Figure: servers connected through Ethernet switches (IP network) to the iSCSI storage controller and disk arrays.
An IP SAN is a storage area network architecture based on the TCP/IP protocols for data transmission, using Ethernet as the medium. The default protocol for IP SAN implementation is iSCSI (Internet Small Computer System Interface), a protocol that encapsulates SCSI commands and transmits them over IP networks.
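The idea of encapsulating a SCSI command so that it can travel over an IP network can be sketched as follows. The sketch builds a standard 10-byte SCSI READ(10) command descriptor block and wraps it in a toy header. The toy header (`TOY_HEADER`, a 2-byte LUN plus a 4-byte length) is purely hypothetical and is not the real iSCSI PDU layout, which uses a 48-byte basic header segment and is carried over TCP.

```python
import struct


def build_read10_cdb(lba, num_blocks):
    """Build a SCSI READ(10) command descriptor block (10 bytes, big-endian).

    Layout: opcode 0x28, flags, 4-byte logical block address,
    group number, 2-byte transfer length, control byte.
    """
    return struct.pack(">BBIBHB", 0x28, 0, lba, 0, num_blocks, 0)


# Hypothetical framing for illustration only, NOT the real iSCSI header:
# 2-byte LUN followed by the 4-byte length of the encapsulated command.
TOY_HEADER = struct.Struct(">HI")


def encapsulate(lun, cdb):
    """Initiator side: prepend the toy header so the command can be sent as bytes."""
    return TOY_HEADER.pack(lun, len(cdb)) + cdb


def decapsulate(message):
    """Target side: strip the toy header and recover the original SCSI command."""
    lun, length = TOY_HEADER.unpack_from(message)
    start = TOY_HEADER.size
    return lun, message[start:start + length]


cdb = build_read10_cdb(lba=2048, num_blocks=8)
message = encapsulate(lun=0, cdb=cdb)
lun, recovered = decapsulate(message)
print(len(message), recovered == cdb)
```

The point of the sketch is the layering: the SCSI command itself is unchanged, and only an outer wrapper is added and removed at each end of the network path.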
NAS                              SAN
Single storage device            Multiple device network
File storage system              Block storage system
TCP/IP Ethernet network          Fibre Channel network
Limited users and speed          Multiple users and faster performance
Limited expansion                Highly expandable
Low cost with easy setup         High cost and complex setup

Storage Architecture Trends: Converged Storage
Figure: converged storage combines NAS (file) and SAN (block); convergence lowers cost, simplifies storage management, and increases utilization.
Converged storage combines both SAN and NAS storage, supports elastic business development, simplifies service deployment, increases storage resource utilization, and effectively reduces total cost of ownership (TCO).

Cloud Storage: Distributed Storage
Figure: front-end clients, front-end network, back-end network, back-end maintenance.
Distributed storage, in simpler terms, means storing data in a distributed way across multiple storage servers, in a large-scale storage resource pool built from the storage media of standard x86 servers, such as internal HDDs and SSDs.
Current distributed storage systems are mostly built upon Google's experience in building a distributed file system across multiple servers and implementing data storage services on top of that file system.

Cloud Storage: Software-Defined Storage
Figure legend: compute node, storage node, PCIe SSD, storage controller.
Software-defined storage uses distributed technology to build a large-scale storage resource pool from the storage media of x86 servers, such as HDDs and SSDs, and provides industry-standard SCSI and iSCSI ports to non-virtualized applications and virtual machines.

Contents 1. Data and Information 2. Data Storage 3. Evolution of Storage Technologies 4. Latest Storage Technologies and Trends 5.
Storage Products and Solutions

Key Technological Evolution of Low End Storage Systems
- 1995 and earlier, External Disk: growth of data; emergence of low-cost external storage; array controller card technologies; JBOD direct attached storage.
- 1998, Single Controller SAN: emergence of the single storage controller.
- 2003-2005, Dual Controller SAN: emergence of the dual storage controller.
- Present, Rich Software Features: higher CPU processing power; rich storage software features introduced to storage systems.

Low-end storage devices are simple, economical, and easy to use. They can run continuously, 24x7, and satisfy enterprise requirements for availability and data protection. They are considered an entry-level professional storage platform for enterprise data centers.
In 1995 and earlier, with the growth of data, local server storage could no longer fulfill enterprise requirements. The emergence of low-cost external storage devices led to servers being connected to multiple hard drives. Array controller card technologies were introduced, allowing storage arrays to connect and manage more disks. JBOD (Just a Bunch Of Disks) direct attached storage was also introduced in this period, raising the limit on the number of hard disks that could be connected to servers.
In 1998, single storage controllers were introduced into storage systems. The single controller in a SAN runs the RAID, cache, and utility software, which frees up the servers' processing power.
From 2003 to 2005, single-controller SAN was gradually replaced by dual-controller SAN, because costs were decreasing and the single-controller design had a single point of failure, low reliability, and poor scalability. A dual-controller SAN consists of two storage controllers, which eliminates the single point of failure and makes it more reliable and much more scalable than a single-controller SAN.
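The reliability gain from removing the single point of failure can be illustrated with a simple availability calculation. This is a rough sketch under simplifying assumptions that are not stated in the slides: the two controllers fail independently, either one alone can serve all I/O, and failover is instantaneous. The 99% per-controller availability is a made-up figure for illustration.

```python
HOURS_PER_YEAR = 24 * 365


def dual_controller_availability(a):
    """Availability of a controller pair when either controller can serve I/O.

    The pair is unavailable only if both controllers are down at the same time
    (assumes independent failures).
    """
    return 1 - (1 - a) ** 2


def yearly_downtime_hours(availability):
    """Expected hours per year during which the system is unavailable."""
    return (1 - availability) * HOURS_PER_YEAR


controller = 0.99  # hypothetical availability of one controller
single = controller
dual = dual_controller_availability(controller)

print(f"single controller: {single:.4%}, {yearly_downtime_hours(single):.1f} h/year down")
print(f"dual controller:   {dual:.4%}, {yearly_downtime_hours(dual):.2f} h/year down")
```

Under these assumptions the unavailability drops from 1% to 0.01%. Real systems fall short of this ideal because failures are not fully independent and failover takes time.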
At present, with the increase in CPU processing capability, low-end storage devices can provide more and more services and features. This has caused software features to move down from mid-range to low-end storage devices. Previously, only mid-tier storage systems had enough CPU processing power to run multiple storage features, but as low-end storage systems gain more CPU power, they are also capable of running these features to a certain extent.

Key Features
Figure labels: flexible scalability, reliability, high performance, energy saving, easy operation.
Flexible and Reliable:
- Flexible scalability based on needs: start small and scale up flexibly along with the growth of the enterprise and its storage needs.
- Fully redundant system design: full redundancy with a dual-controller architecture, ensuring the highest level of system stability and data protection.
- Easy operation: unified management interface using DeviceManager, 5-step basic configuration, remote management using an iPad, and one-click system upgrade for easier operation of your storage.
Energy Saving and High Performance:
- High performance and high standard: supports both FC and iSCSI, and satisfies complex storage requirements.
- Multiple energy-saving features that lower the total cost of ownership (TCO): automatically hibernates hard disks based on service load, lowering energy consumption by 40%; 16-speed smart fan speed control technology; and smart CPU frequency adjustment based on service workload.

Key Technological Evolution of Mid Tier Storage Systems
- 1997, Manual Configuration: provides a limited single FC port, with low flexibility; storage volume can only be expanded by cascading disk enclosures.
- 2000, Dual Controller Active/Active Configuration: migration from single-controller to dual-controller systems; the Active/Active controller configuration greatly increases the reliability and processing performance of storage systems.
- 2005, Flexible Configuration of Hardware Components: supports FE (Fast Ethernet) cards for better flexibility and expansion; port types and port numbers can be freely configured.
- Present, Software-defined Flexible Configuration: converged storage that supports both SAN and NAS quickly becomes popular; customers can flexibly configure multiple protocols and services based on storage needs.

Key Technological Evolution of High End Storage Systems
- 1990, Bus Architecture: scale-up; because of the bus-connected architecture, the only way to upgrade is to use better CPUs, interface cards, RAM, and other components.
- 2000, Hi-Star Architecture: centralized switches connect the front-end host ports, back-end storage ports, and cache using an FC back-end channel.
- 2003, Direct-Attached Architecture: front-end host ports and back-end storage ports connect directly to the cache, avoiding the delay of bus or switched connections.
- Present, Virtual Matrix Architecture: scale-out, multi-controller, fully switched, loosely coupled; x86 servers and protocols.
Current Features of High End Storage Systems
Core storage platform (centralized). Feature categories: System Architecture, Data Protection, Quick Response, Online Expansion.
Features listed: fully switched multi-controller architecture, global cache, limits on the number of disks in a loop, virtualization, local snapshot protection, second-level copying, end-to-end verification, cache partition, QoS, automatic data categorization, SSD optimization, linear performance expansion, online volume expansion.

Contents
1. Definition of Storage
2. Development History of Storage
3. Evolution of Storage Technologies
4. Latest Storage Technologies and Trends
5. Storage Products and Solutions

Server SAN is becoming the Mainstream Storage System in Enterprises
Figure: market growth chart (CAGR 22.7% and CAGR 38.4%).
Cloud storage market: growing quickly and expected to reach $29B by 2020. It benefits from the requirements for centralized storage resource pools and lower TTM (time to market) and TCO (total cost of ownership). Public cloud and private cloud are growing at CAGRs of 22.7% and 38.4% respectively.
- Public cloud: aims at the emerging trend of increasing application performance and reliability. Standard scenarios: Web, VDI, HPC, IoT, RDS, Web Disk.
- Private cloud: focuses on increasing efficiency for large-scale VM/container storage pools. Standard scenarios: VSI, VDI, Big Data.
- Key requirements: cloning/active-active, deduplication/compression, high bandwidth, and virtual machine compatibility.

New Business Generates New Resource Supply Models
Figure: funnel model versus cloud model. Funnel: database, video, and archive workloads, each with its own processing network and funnel-type storage, with multi-feature requirements and a complex architecture. Cloud: IoT, mobile Internet, and Big Data workloads with changing storage requirements, served on demand by software-defined, elastic Storage as a Service (block storage, file storage, object storage). "Software Defined" is the key value of the Cloud.
Traditional storage systems are designed for single-purpose or single-scenario use, and these funnels cannot satisfy elastic expansion requirements.
With cloudification, new businesses need new storage resources that support elastic, on-demand, and scalable expansion.

Software Defined Converged Cloud Storage
Figure: traditional applications and new-generation cloud applications access converged storage (block storage, file storage, object storage) on a unified hardware platform through OpenStack, iSCSI, SCSI, NFS, CIFS, FTP, HDFS, NDMP, and S3/Swift.
- Converged: the same system provides block, file, and object storage.
- Elastic.
- Open: software defined, with a unified hardware platform.
FusionStorage 6.0: suitable for finance development and testing, government, security, and large enterprises and carriers with cloud scenarios.

Quiz
1. Which of the following are networking methods of storage systems?
A. DAS
B. NAS
C. FC SAN
D. IP SAN
2. Which of the following are features of cloud storage?
A. Convergence
B. Open
C. Elastic
D. Vertical Expansion

Thank You
www.huawei.com

02 Storage Technologies for AI, Big Data and the Cloud
www.huawei.com
Copyright © 2018 Huawei Technologies Co., Ltd. All rights reserved.

Foreword
This module mainly introduces:
- Development trends of new ICT architectures.
- The concept of the Cloud, the key technologies used, and the application of storage in the Cloud.
- The concept of Big Data, the key technologies used, and the application of storage in Big Data.
- Converged technologies and applications between the Cloud, Big Data, and AI.

Objectives
Upon completion of this module, you will be able to:
- Understand the development trends of ICT.
- Understand what the Cloud, Big Data, and AI are.
- Learn about storage technologies and their application in the Cloud.
- Learn about storage technologies and their application in AI and Big Data.

Contents
1. ICT Technologies Development Trends.
2. Storage Technologies and Its Application in the Cloud.
3. Storage Technologies and Its Application in AI and Big Data.
ICT Becoming The Engine of Transformation for Traditional Industry
- 1st: Mechanization
- 2nd: Electrification
- 3rd: Automation
- 4th: Intelligent systems, with industrial applications based on Big Data and the Internet of Things

4 Biggest IT Trends That Are Rebuilding The World
- Mobile Internet: data generation and data consumption
- Internet of Things: data collection
- Big Data: data analysis
- Cloud: data computing

Big Data: Data is one of the most important assets of a business, and enterprises will compete with each other on data in the future. McKinsey (a worldwide consulting firm) has stated that enterprises unable to fully utilize the capabilities of Big Data will be phased out. Big data analysis has made many promises, but in the near term there are simply too many big data solutions looking for solvable problems. In the long run, the potential of big data will go beyond optimizing e-commerce and embrace all vertical sectors, including finance, manufacturing, transport, and power. However, all these vertical sectors require the Industrial Internet (also known as the Internet of Things) to interconnect large numbers of sensors that provide huge amounts of data, which can be used to improve product designs, predict faults accurately, and so on. GE (General Electric) and IBM are currently the pioneers and leaders of this field, but we are still at the very initial stage of the Big Data revolution. In a few years, the Industrial Internet will develop at a greater pace and Big Data will grow bigger, making the demand for Big Data solutions unstoppable.
Cloud: The Cloud has become the next generation of IT infrastructure. 56% of SMEs will purchase 4+ cloud services in the next 3 years.
Up to 2016, 75% of new IT investment is in the Cloud or Hybrid Cloud. 70% of CIOs also deploy a "Cloud Prioritization" strategy in 2016. 80% of new IT decisions will have business representatives involved, while 53% of IT decisions will be led by business representatives.

Cloud Data Center is Everywhere
660+ data centers have been built globally, of which 255+ are cloud data centers.

Contents
1. ICT Technologies Development Trends.
2. Storage Technologies and Its Application in the Cloud.
3. Storage Technologies and Its Application in AI and Big Data.

Positioning and Function of the Cloud in ICT Industry Layout
Huge Number of Connections, Huge Dataflow: 10G/Person, 1000G/Enterprise Large Scale Application Exponential Low Latency: 4G
