csc25-chapter_08-4-14_part1.pdf

ITA

2024

Tags

computer architecture parallel processing high performance

Parallel Architectures

Overview

Multiple instruction streams, multiple data streams - MIMD

Multiprocessors: computers consisting of tightly coupled processors whose coordination and usage are generally controlled by a single OS and that share memory through a shared address space

1st semester, 2024 Loubach CSC-25 High Performance Architectures ITA 4/44

Memory Organization

Multiple processors options, sharing
1. cache, memory, and I/O system
2. memory and I/O system
3. I/O system
4. nothing, usually communicates through networks

Are all options feasible or interesting?
- remember it is important to avoid bottlenecks

Memory Organization

Multiprocessors by their memory organization
1. symmetric (shared-memory) multiprocessors - SMP
2. distributed shared memory - DSM

Memory Organization

Symmetric (shared-memory) multiprocessors - SMP (a.k.a. centralized shared-memory multiprocessors)
- ≈ 32 cores or fewer
- share a single centralized memory to which all processors have equal access, i.e., symmetric
- memory/bus may become a bottleneck
- use of large caches and many buses
- uniform memory access - UMA, i.e., uniform access time to all of the memory from all of the processors

[Figure: basic SMP structure]
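The remark that the memory/bus may become a bottleneck, and that large caches help, can be made concrete with a back-of-the-envelope model. This is only a sketch under stated assumptions: the function names and all numbers (bus bandwidth, access rate, miss rates, line size) are hypothetical and do not come from the slides. The model assumes that only cache misses generate traffic on the shared bus.

```python
def bus_demand_bytes_per_s(cores, accesses_per_s, misses_per_1000, line_bytes):
    """Traffic the shared bus must carry, assuming only cache misses reach memory."""
    misses_per_s = accesses_per_s * misses_per_1000 // 1000
    return cores * misses_per_s * line_bytes


def max_cores(bus_bytes_per_s, accesses_per_s, misses_per_1000, line_bytes):
    """Largest core count whose combined miss traffic still fits on the bus."""
    per_core = bus_demand_bytes_per_s(1, accesses_per_s, misses_per_1000, line_bytes)
    return bus_bytes_per_s // per_core


# Hypothetical machine: 25.6 GB/s shared bus, 1e9 memory accesses per second
# per core, 64-byte cache lines.
BUS = 25_600_000_000
ACCESSES = 1_000_000_000
LINE = 64

print(max_cores(BUS, ACCESSES, 50, LINE))  # 5% miss rate -> 8
print(max_cores(BUS, ACCESSES, 10, LINE))  # 1% miss rate -> 40
```

Under these assumed numbers, cutting the miss rate from 5% to 1% raises the number of cores the single bus can feed from 8 to 40, which is the intuition behind using large caches in an SMP.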
Memory Organization

Distributed shared memory - DSM
- larger processor counts, e.g., 16-64 processor cores
- distributed memory to
  - increase bandwidth
  - reduce access latency
- communicating data among processors becomes more complex
- requires more effort in the software to take advantage of the increased memory bandwidth
- I/O system is also distributed
- each node can be a small distributed system with centralized memory
- nonuniform memory access - NUMA, i.e., access time depends on the location of a data word in memory

[Figure: basic DSM structure]

Memory Architecture

SMP (e.g., following UMA)
- processors share a single memory and
- they have uniform access times to the memory

DSM (e.g., following NUMA)
- processors share the same address space
- not necessarily the same physical memory

Multicomputers
- processors with independent memories and address spaces
- communicate through interconnection networks
- may even be complete computers connected in network, i.e., clusters

Communication Models

SMP - central memory
- threads and fork-join model (can also be applied to DSM)
- open multi-processing - OpenMP
- implicit communication, i.e., memory access

DSM - distributed memory
- message passing model (can also be applied to SMP)
- message passing interface - MPI
- explicit communication, i.e., message passing
- synchronization problems
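The two communication models can be contrasted with a small sketch. It uses Python's standard `threading` and `queue` modules rather than OpenMP or MPI, so it only illustrates the style of each model; the function names `fork_join_sum` and `message_passing_sum` are hypothetical. In the first function, workers communicate implicitly by writing to a shared list. In the second, every piece of data moves through an explicit send or receive, here modeled with queues on a shared-memory machine.

```python
import queue
import threading


def fork_join_sum(data, workers=4):
    """Fork-join with implicit communication: threads write into shared memory."""
    partial = [0] * workers  # shared array, one slot per thread

    def work(i):
        # Each thread writes only its own slot, so no lock is needed here.
        partial[i] = sum(data[i::workers])

    threads = [threading.Thread(target=work, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()  # fork
    for t in threads:
        t.join()   # join: wait for every worker before reading the results
    return sum(partial)


def message_passing_sum(data, workers=4):
    """Explicit communication: chunks and results travel only as messages."""
    results = queue.Queue()
    inboxes = [queue.Queue() for _ in range(workers)]

    def work(inbox):
        chunk = inbox.get()      # receive the chunk
        results.put(sum(chunk))  # send the partial sum back

    threads = [threading.Thread(target=work, args=(box,)) for box in inboxes]
    for t in threads:
        t.start()
    for i, box in enumerate(inboxes):
        box.put(data[i::workers])  # send one chunk to each worker
    total = sum(results.get() for _ in range(workers))
    for t in threads:
        t.join()
    return total


numbers = list(range(1, 101))
print(fork_join_sum(numbers))        # 5050
print(message_passing_sum(numbers))  # 5050
```

The sketch is about the communication style, not speed: in CPython, threads doing pure-Python arithmetic do not run in parallel, and a real program would more likely use OpenMP or MPI as the slides indicate.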
Market Share

SMP
- bigger market share, both in $ and units
- multiprocessors in a chip

Multicomputers
- popularization of clusters for systems on the internet
- >100 processors, i.e., massively parallel processors - MPP

SMP

Large and efficient cache systems can greatly reduce the need for memory bandwidth

SMPs provide some cost benefits, as they need little extra hardware and are based on general purpose processors - GPP

Caches not only provide locality, but also replication. Is that a problem?

[Figure: basic SMP structure]
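One way to explore the replication question is with a toy model. This is a sketch under loud assumptions: `PrivateCache` is a hypothetical class, memory is a plain dictionary, writes go straight through to memory, and nothing ever invalidates or updates the copy held by another cache. It is not a description of any real protocol.

```python
class PrivateCache:
    """Toy per-processor cache: write-through, with no invalidation of other caches."""

    def __init__(self, memory):
        self.memory = memory  # shared main memory (address -> value)
        self.lines = {}       # this processor's private copies

    def read(self, addr):
        if addr not in self.lines:  # miss: fetch a copy from memory
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]     # hit: serve the local replica

    def write(self, addr, value):
        self.lines[addr] = value    # update the local copy
        self.memory[addr] = value   # write through to main memory


memory = {0x10: 1}
cache_a = PrivateCache(memory)
cache_b = PrivateCache(memory)

cache_a.read(0x10)        # both processors now hold a replica of the value 1
cache_b.read(0x10)
cache_a.write(0x10, 2)    # processor A updates the location

print(memory[0x10])        # 2
print(cache_b.read(0x10))  # 1: processor B still sees its stale replica
```

In this toy setting the replicas diverge as soon as one processor writes, so two processors can observe different values for the same address. Keeping replicated copies consistent is commonly called the cache coherence problem.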
