Cloud Computing
CH03: Parallel Processing and Distributed Computing
Majdi Maabreh, Ph.D., Program of Data Science and AI

Parallel vs. Distributed

Parallel computing refers to a model in which the computation is divided among several processors sharing the same memory. The architecture of a parallel computing system is often characterized by the homogeneity of its components: each processor is of the same type and has the same capability as the others. The shared memory has a single address space, which is accessible to all the processors. Parallel programs are broken down into several units of execution that can be allocated to different processors and that communicate with each other by means of the shared memory.

Parallel Computing: [figure]

Distributed Computing

The term distributed computing encompasses any architecture or system that allows the computation to be broken down into units and executed concurrently on different computing elements, whether these are processors on different nodes, processors on the same computer, or cores within the same processor. The term "distributed" often implies that the locations of the computing elements are not the same, and that such elements might be heterogeneous in terms of hardware and software features.

Distributed Computing: [figure]

Execution Times (example 1): [figure]
Execution Times (example 2): [figure]

Hardware architectures for parallel processing

The core elements of parallel processing are CPUs. Based on the number of instruction streams and data streams that can be processed simultaneously, computing systems are classified into the following four categories (Flynn's classification):
- Single-instruction, single-data (SISD) systems
- Single-instruction, multiple-data (SIMD) systems
- Multiple-instruction, single-data (MISD) systems
- Multiple-instruction, multiple-data (MIMD) systems

Single-instruction, single-data (SISD)

Machine instructions are processed sequentially. All the instructions and data to be processed have to be stored in primary memory. The speed of the processing element in a SISD system is limited by the rate at which the computer can transfer information internally.

Single-instruction, multiple-data (SIMD)

A SIMD computing system is a multiprocessor machine capable of executing the same instruction on all its CPUs while operating on different data streams. SIMD machines are well suited to scientific computing, since it involves many vector and matrix operations, e.g., Ci = Ai * Bi. Array and vector processors, such as those on a GPU, are examples.

Multiple-instruction, single-data (MISD)

A multiprocessor machine capable of executing different instructions on the same dataset, e.g., y = sin(x) + cos(x) + tan(x). Machines built using the MISD model are not useful in most applications; a few machines have been built, but none of them are available commercially.

Multiple-instruction, multiple-data (MIMD)

A multiprocessor machine capable of executing multiple instructions on multiple datasets, e.g., multitasking.

Distributed-memory MIMD machines

All PEs (processing elements) have a local memory. Systems based on this model are also called loosely coupled multiprocessor systems. The communication between PEs in this model takes place through the interconnection network (the inter-process communication channel, or IPC), by message exchange.

Shared-memory MIMD machines

Systems based on this model are also called tightly coupled multiprocessor systems. The communication between PEs in this model takes place through the shared memory; a modification of the data stored in the global memory by one PE is visible to all the other PEs.

Approaches to parallel programming

To make many processors collectively work on a single program, the program must be divided into smaller independent chunks so that each processor can work on a separate chunk of the problem. A wide variety of parallel programming approaches are available; the most prominent among them are:
- Data parallelism
- Process parallelism
- The farmer-and-worker model

Data parallelism

In data parallelism, the divide-and-conquer technique is used to split the data into multiple sets, and each data set is processed on a different PE using the same instruction. This approach is highly suitable for processing on machines based on the SIMD model.

Process parallelism

In process parallelism, a given operation has multiple (but distinct) activities that can be processed on multiple processors.

Farmer-and-worker model

A master-slave arrangement: one PE acts as the master (farmer) that hands out jobs, and the remaining PEs act as workers (slaves) that execute them and return their results. Short Python sketches of the Flynn examples, the two MIMD memory models, and these programming approaches follow below.
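To make the Flynn examples concrete, here is a minimal sketch in Python (assuming NumPy is installed; the arrays and the input value are invented for illustration). The elementwise product mirrors the SIMD example Ci = Ai * Bi, one instruction applied to many data elements at once, while the trigonometric sum mirrors the MISD example y = sin(x) + cos(x) + tan(x), different operations applied to the same datum.

    import numpy as np

    # SIMD-style: one instruction (multiply) applied elementwise to many
    # data items; NumPy dispatches this to vectorized hardware instructions
    # where available.
    A = np.array([1.0, 2.0, 3.0, 4.0])   # illustrative data
    B = np.array([5.0, 6.0, 7.0, 8.0])
    C = A * B                            # Ci = Ai * Bi, computed in one step
    print("SIMD-style product:", C)

    # MISD-style: different instructions applied to the same data item.
    # Conceptually, sin, cos, and tan could each run on a different
    # processor over the same input x.
    x = 0.5                              # illustrative input
    y = np.sin(x) + np.cos(x) + np.tan(x)
    print("MISD-style result:", y)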
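The two MIMD memory models can be sketched with the Python standard library (a minimal illustration; the worker functions and the counts are invented). Threads share one address space, like a tightly coupled machine, while processes keep separate local memories and must exchange messages, like a loosely coupled machine.

    import threading
    import multiprocessing as mp

    shared = {"total": 0}
    lock = threading.Lock()

    def add(n):
        # Shared-memory MIMD: update state that every thread can see.
        with lock:
            shared["total"] += n

    def worker(n, queue):
        # Distributed-memory MIMD: no shared state; send the result as a
        # message over the IPC channel.
        queue.put(n * n)

    if __name__ == "__main__":
        # Tightly coupled sketch: threads communicate through shared memory.
        threads = [threading.Thread(target=add, args=(i,)) for i in range(4)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        print("shared-memory total:", shared["total"])

        # Loosely coupled sketch: processes exchange messages over a queue,
        # standing in for the interconnection network.
        q = mp.Queue()
        procs = [mp.Process(target=worker, args=(i, q)) for i in range(4)]
        for p in procs:
            p.start()
        results = [q.get() for _ in procs]
        for p in procs:
            p.join()
        print("message-passing results:", results)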
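Data parallelism maps naturally onto a process pool (a minimal sketch; the square function and the sixteen-element data set are invented). The pool splits the data among the workers, divide-and-conquer style, and every chunk is processed with the same instruction. Process parallelism would instead submit distinct operations rather than distinct chunks of data.

    import multiprocessing as mp

    def square(x):
        # The same instruction, applied to every element of the data set.
        return x * x

    if __name__ == "__main__":
        data = list(range(16))            # illustrative data set
        with mp.Pool(processes=4) as pool:
            # map splits the data among the worker processes and gathers
            # the results in their original order.
            results = pool.map(square, data)
        print(results)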
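The farmer-and-worker model can be made explicit with task and result queues (again a minimal sketch; the cube job and the worker count are invented). The master enqueues jobs, the workers pull and execute them, and the master collects the results.

    import multiprocessing as mp

    def slave(tasks, results):
        # Each worker repeatedly takes a job handed out by the master.
        while True:
            job = tasks.get()
            if job is None:              # sentinel: no more work
                break
            results.put(job ** 3)        # illustrative unit of work

    if __name__ == "__main__":
        tasks, results = mp.Queue(), mp.Queue()
        workers = [mp.Process(target=slave, args=(tasks, results))
                   for _ in range(3)]
        for w in workers:
            w.start()
        for job in range(9):             # the master (farmer) hands out jobs
            tasks.put(job)
        for _ in workers:                # one sentinel per worker
            tasks.put(None)
        collected = [results.get() for _ in range(9)]
        for w in workers:
            w.join()
        print(sorted(collected))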
Fine vs. Coarse Granularity

                   Computation-to-communication ratio   Opportunity for performance enhancement   Communication overhead
    Fine-grain     Low                                   Less                                       Higher
    Coarse-grain   High                                  More                                       Lower

Two important guidelines to take into account:
- The speed of computation is proportional to the square root of system cost; it never increases linearly. Therefore, the faster a system becomes, the more expensive it is to increase its speed further.
- The speedup achieved by a parallel computer increases as the logarithm of the number of processors, i.e., y = k * log(N).

Amdahl's Law

Speedup = 1 / ((1 - P) + P / N), where P is the fraction of the program that can be parallelized and N is the number of processors. A worked example appears at the end of the chapter.

From supercomputers to distributed systems

Modern supercomputers derive their power from architecture and parallelism rather than from faster processors running at higher clock rates. The supercomputers of today consist of a very large number of processors and cores communicating through very fast and expensive custom interconnects.

A distributed system is a collection of computers connected through a network and a distribution software layer, called middleware, which enables the computers to coordinate their activities and to share the resources of the system.
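Returning to Amdahl's Law, here is a quick worked sketch (the 90% parallel fraction and the processor counts are invented for illustration). It also shows why speedup grows so slowly with processor count: the serial fraction caps the achievable speedup at 1 / (1 - P).

    def amdahl_speedup(p, n):
        # p: fraction of the program that can be parallelized (0..1)
        # n: number of processors
        return 1.0 / ((1.0 - p) + p / n)

    if __name__ == "__main__":
        p = 0.9                          # illustrative: 90% parallelizable
        for n in (1, 2, 4, 8, 16, 1024):
            print(f"N={n:5d}  speedup={amdahl_speedup(p, n):.2f}")
        # As n grows, the speedup approaches 1 / (1 - p) = 10, no matter
        # how many processors are added.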