Parallel Computer Architectures Overview
48 Questions


Questions and Answers

What are the main components of a massively parallel computer?

  • Processor, Memory, Interconnection Network (correct)
  • Storage, Control Unit, Input Devices
  • Database, User Interface, Processing Units
  • Memory, Graphical Unit, Software

In Flynn's classification, which mode corresponds to Single Instruction stream Single Data stream?

  • MISD
  • SIMD
  • SISD (correct)
  • MIMD

What type of memory organization gives each processor its own separate address space?

  • Distributed Memory (correct)
  • Virtual Memory
  • Buffer Memory
  • Shared Memory

Which statement best describes the SIMD architecture?

  • Single Instruction stream, Multiple Data streams (correct)

What is the primary purpose of a pipeline in processing units?

  • To subdivide operations into elementary stages (correct)

Which process is NOT a stage in the pipeline of a floating point adder?

  • Data transfer (correct)

What characterizes MIMD architectures in Flynn's classification?

  • Multiple Instruction stream, Multiple Data stream (correct)

How does shared memory differ from distributed memory?

  • Access time is uniform in shared memory. (correct)

What is a significant disadvantage of MIMD with shared memory systems?

  • Lack of modularity (correct)

How does MIMD with distributed memory primarily achieve high memory performance?

  • Through local accesses (correct)

In a hypercube topology, where are processor/memory couples placed?

  • At the vertices of an n-dimensional hypercube (correct)

What is a primary characteristic of transputers in multiprocessing?

  • They integrate several components on a single chip (correct)

Which of the following is a disadvantage of MIMD with distributed memory?

  • Challenges with non-local memory access performance (correct)

Which attribute facilitates the integration of many processors on a chip in SIMD machines?

  • Specialized processors with simplified control logic (correct)

What does virtual shared memory provide to programmers?

  • A single address space managed by the system (correct)

What challenge is typically faced with communication management in MIMD machines?

  • Synchronization between multiple processors (correct)

What does the variable $T(n)$ represent in a pipelined system?

  • Total time to execute a sequence of operations (correct)

In the context of pipeline execution speed, what does the variable $V(n)$ represent?

  • The execution speed (correct)

Which of the following best describes the purpose of the inhibition bit in SIMD machines?

  • To prevent unnecessary operations during execution (correct)

What is the term that refers to $r_{∞}$ in a pipelined system?

  • Asymptotic speed (correct)

What does the relationship $n_{1/2} = α/τ$ indicate?

  • Length of the sequence needed to reach half the asymptotic performance (correct)

In a SIMD machine's operation on data, what is the significance of the independent banks of memory?

  • They allow for providing multiple words in parallel. (correct)

Which example best illustrates low-level parallelism in computing?

  • Performing multiple arithmetic operations at the same time (correct)

What is the function of the network of interconnections in SIMD machines?

  • To gather and realign data for correct processing (correct)

What is the general structure of a vector as defined in the content?

  • {A_i + (k - 1) R} (correct)

What does 'SIMD' stand for in the context of vector machines?

  • Single Instruction Multiple Data (correct)

In the definition provided, what is required for combining vectors?

  • Vectors must have the same length. (correct)

Which of the following best describes a bidimensional vector?

  • A sequence of addresses with two different increments. (correct)

Which of the following operates only on scalars as defined in the document?

  • Scalar unit operations (correct)

What characterizes the last two operations in component-by-component operations?

  • They are called reduction operations. (correct)

What is the significance of performing operations with at least one vector operand?

  • They extend classic arithmetic expressions. (correct)

What type of operations can vector machines execute?

  • Both arithmetic and logical operations on scalars and vectors. (correct)

What is the maximum amount of external memory that the Inmos T800 can support?

  • 8 MB (correct)

Which metric does NOT contribute to the complexity of parallel algorithms?

  • Time complexity (correct)

What is the theoretical upper limit for speedup in a parallel algorithm using p processors?

  • S_p(A) ≤ p (correct)

In terms of efficiency, what is the maximum value for Ep(A)?

  • 1 (correct)

Which of the following features allows for fast task execution in Inmos T800?

  • Concurrency (correct)

Based on the general result for a perfect parallel machine, which statement is accurate?

  • N_p/q ≤ T_q ≤ N_p (correct)

What does speedup Sp(A) represent in the context of parallel algorithms?

  • The time needed by the best sequential algorithm divided by the parallel execution time (correct)

Which limitation is associated with the Transputer's communication?

  • Point-to-point connections only (correct)

What does the operation A[1; N; 2] = B[3; N; 1] + d C[5; N; 3] become in loop form?

  • A(2i - 1) = B(i + 2) + d C(3i + 2) (correct)

In the context of the software view of vector operations, which statement is true?

  • All right operands are stored after operations are executed. (correct)

Regarding the use of a mask in vector operations, what happens when VM(i) = 0?

  • The ith component of the result vector remains unchanged. (correct)

How does the implementation of a mask vector influence the execution of vector operations?

  • Only operations corresponding to 1s in the mask are executed. (correct)

What effect does the use of the mask have on the computation cost in vector operations?

  • It does not affect the overall cost regardless of the mask. (correct)

Which statement accurately reflects the equivalence of vector instructions to conditional branches in loops?

  • Vector instructions can replicate a loop's conditional actions without loss of functionality. (correct)

What is the primary function of the variable c in the equation c = SUM(A[1; N; 2])?

  • It stores the total sum of all elements in A. (correct)

When executing vector operations, what is typically true regarding the index variable during iterations?

  • The index variable iterates through all values in the range determined by N. (correct)

Flashcards

SIMD Architecture

A parallel processing architecture where a single instruction is executed on multiple data streams simultaneously.

MIMD Architecture

A parallel processing architecture where multiple instructions are executed on multiple data streams simultaneously.

Shared Memory

A memory architecture where all processors share a single address space.

Distributed Memory

A memory architecture where each processor has its own local memory.

Processing Element (PE)

A fundamental processing unit in a parallel system.

Interconnection Network

The communication pathway between processors and memory in a parallel system.

Flynn's Taxonomy

A classification of computer architectures based on instruction streams and data streams.

Pipeline

A technique that breaks down a task into smaller stages, allowing multiple tasks to be processed concurrently.

Pipeline Technique

A method to speed up execution by breaking down a task into smaller stages that can be processed concurrently. It is applicable to floating point operations, memory access, and control unit operations.

Startup Time (α)

The time required to initialize the pipeline before the first operation can be processed.

Asymptotic Speed (r∞)

The maximum speed a pipeline can achieve, given an infinitely long sequence of operations.

Sequence Length for Half Performance (n1/2)

The length of a sequence of operations required to achieve half of the asymptotic speed of the pipeline.

Parallelism between Functional Units

Multiple units (e.g., adder, multiplier, memory unit) working simultaneously to process different parts of a task.

SIMD (Single Instruction Multiple Data)

A parallel architecture where a single instruction is executed on multiple data streams simultaneously.

Interconnection Network in SIMD

The communication pathway that rearranges data for processing in a SIMD machine. It synchronizes data streams for efficient operation.

Inhibition Bit

A special bit that controls the execution of operations in a loop. It prevents unnecessary calculations when conditions aren't met.

Hypercube

A multi-dimensional structure where processors and memory are placed at its vertices, connected by communication links that form the edges. Each dimension represents a connection to another processor.

MIMD with Shared Memory

A parallel architecture where multiple processors execute different instructions on different data streams, all sharing a single address space for memory access.

MIMD with Distributed Memory

A parallel architecture where each processor has its own memory, allowing for independent data access but requiring communication for data sharing.

Virtual Shared Memory

A system that presents a single address space illusion to the programmer, handling data distribution and message management behind the scenes, making it seem like all processors share one memory.

SIMD Processor

A specialized processor designed for parallel processing, where a single instruction is executed on multiple data streams simultaneously. These processors are optimized for specific tasks, like performing the same operation on many numbers at once.

Transputer

A specialized integrated circuit that acts as a building block for parallel systems. It combines a processor, memory, communication links, and a floating-point unit on a single chip.

Communication Management

The process of handling data transfer and communication between processors in a parallel system. This can involve routing messages, synchronizing operations, and managing data access.

Vector Coprocessor

An additional processor unit designed to accelerate vector operations, such as performing the same operation on a series of numbers.

Vector Definition

A vector is a sequence of memory addresses defined by a starting address, a step (increment or stride), and a length.

Vector Notation

A vector is denoted as A[i; N; R], where A is the starting address, i is the initial index, N is the length, and R is the step size.

Vector Machine

A machine equipped with instructions that can operate on both scalars and vectors.

Vector Operations

Operations involving at least one vector operand, extending basic arithmetic and logic.

Component-by-component Vector Operation

An operation applied to each individual element of two vectors, producing a resulting vector.

Reduction Operation

A vector operation that combines all the values in a vector into a single scalar value.

Bidimensional Vector

A vector representing a sequence of addresses in a 2D space, defined by starting address, row and column steps, and dimensions.

Fortran 8X Vector Types

Fortran 8X includes built-in vector types similar to those defined in this context.

Transputer: Communication

Transputers allow communication to happen in parallel with CPU activities, enabling efficient data exchange between processors.

Transputer: Connection Type

Transputers have a point-to-point connection, meaning data travels directly between two processors, without needing a central hub.

Parallel Algorithm Speedup

Speedup (Sp(A)) measures how much faster a parallel algorithm (A) runs with p processors compared to the best sequential algorithm.

Parallel Algorithm Efficiency

Efficiency (Ep(A)) indicates how well the processors are utilized in a parallel algorithm. It ranges from 0 to 1, with 1 being perfect utilization.

Perfect Parallel Machine Result

This proposition shows how a computation can be performed on different numbers of processors within a specific time range, assuming a 'perfect' machine with seamless parallelization.

Parallel Algorithm: Lower Bound Time

The minimum time to execute a computation on q processors is at least the number of operations divided by q.

Parallel Algorithm: Upper Bound Time

The maximum time to execute a computation on q processors is the sum of the time to execute the operations on each processor, plus the sequential time.

Proof for q=1

When there's only one processor (q=1), the time to execute the computation is equal to the number of operations.

Vector Operation in Loops

A loop where each iteration performs an operation on corresponding elements of vectors. Think of it as applying a single instruction to multiple data points.

Right Operand Assumption

In vector operations, the entire right operand is read before any computations are performed. It's as if the right operand is copied before the loop starts.

Mask Vector (VM)

A vector of bits that controls which elements of a result vector are modified. Imagine it as a switch that turns on or off operations.

Mask Vector Effect

If VM(i) is 1, the ith result element is modified based on the vector operation. If VM(i) is 0, the ith result element is unchanged.

Mask for Conditional Branches

Use a mask to vectorize conditional branches, replacing the loop with vector operations. It eliminates the branching and performs the operations in parallel.

Mask Vector Optimization

While a mask controls the output, all vector operations may still be performed, so the optimization is limited: the cost scales with the vector length, not with the number of non-zero mask elements.

Example Vector Operation

A simple vector operation like C = A + B is equivalent to individually adding elements of A and B and storing the result in C. This operation is done in parallel for all corresponding elements.

Loop to Vector Equation

Transforming loop code into vector operations is crucial. This allows for efficient parallel execution and takes advantage of vectorized hardware.

Study Notes

Architectures

  • Several types of architectures are mentioned, including SIMD and MIMD.
  • Massive parallel systems are discussed.
  • Basic components of a parallel computer include processors, an interconnection network, and memory.

General Structure of a Parallel Computer

  • Memory stores data and instructions.
  • An interconnection network connects processors and memory.
  • Processing elements (PEs) are the processors. Multiple PEs are represented in diagrams.

Plan

  • A plan for the study of parallel computer architectures is outlined.
  • Topics include introductory concepts, SIMD/MIMD architectures, fundamental processors, interconnection networks, memory organization, and examples of parallel computer architectures.

Bibliography

  • A list of books and articles relevant to parallel processing is provided.
  • Authors and titles of works are listed, including several references on specific architectures.

Classification (1)

  • Flynn's taxonomy categorizes computers based on instruction and data streams.
  • SISD (Single Instruction stream, Single Data stream)
  • SIMD (Single Instruction stream, Multiple Data stream)
  • MIMD (Multiple Instruction stream, Multiple Data stream)
  • Kuck further classifies systems.

Memory Organization

  • Shared memory: A single address space shared by all processors. Access time is less dependent on processor and memory location.
  • Distributed memory: Each processor has its own separate address space. Access time depends on both processor and memory location.

Pipeline

  • A pipeline breaks down operations into stages for efficient execution.
  • Example is a floating-point adder with four stages (exponent subtraction, mantissa alignment, mantissa addition, normalization).
  • It is applicable to both floating-point and memory operations.
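The four-stage split can be sketched in Python (the base-10 mantissa/exponent representation is an illustrative assumption, not the hardware format):

```python
# Toy sketch of the four floating-point adder stages from the notes,
# on base-10 (mantissa, exponent) pairs.

def fp_add(a, b):
    (ma, ea), (mb, eb) = a, b
    # Stage 1: exponent subtraction -- find the shift amount.
    shift = ea - eb
    # Stage 2: mantissa alignment -- shift the smaller operand.
    if shift >= 0:
        mb, e = mb / 10**shift, ea
    else:
        ma, e = ma / 10**(-shift), eb
    # Stage 3: mantissa addition.
    m = ma + mb
    # Stage 4: normalization -- restore 1 <= |m| < 10.
    while abs(m) >= 10:
        m, e = m / 10, e + 1
    while m != 0 and abs(m) < 1:
        m, e = m * 10, e - 1
    return m, e

# (3.0, 2) is 300 and (5.0, 1) is 50; their sum is 350 = (3.5, 2).
print(fp_add((3.0, 2), (5.0, 1)))  # (3.5, 2)
```

In a pipelined adder, each stage works on a different pair of operands at the same time, so a new result can emerge every stage time once the pipeline is full.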

Pipeline (continued)

  • Example code demonstrates parallel calculations.
  • Data dependencies and stage lengths for calculations are represented.

Pipeline (continued)

  • Mathematical formulas describe execution time.
  • Quantities such as T (max time for the stages), n (total operations), and α (fixed overheads) are present.
  • There are formulas to help describe theoretical peak speed.
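These quantities can be sketched with the common linear timing model T(n) = α + nτ (an assumption consistent with the notes; τ is the time per result once the pipeline is full):

```python
# Sketch of the pipeline performance model: alpha is the fixed startup
# overhead, tau the time per result in steady state.

def T(n, alpha, tau):
    return alpha + n * tau          # total time for n operations

def V(n, alpha, tau):
    return n / T(n, alpha, tau)     # execution speed (results per unit time)

alpha, tau = 8.0, 2.0               # illustrative values
r_inf = 1 / tau                     # asymptotic speed: lim V(n) = 1/tau
n_half = alpha / tau                # half-performance length n_1/2

# At n = n_1/2 the pipeline delivers exactly half its asymptotic speed.
print(V(n_half, alpha, tau), r_inf / 2)   # 0.25 0.25
```

The check at the end shows why n₁/₂ = α/τ: substituting n = α/τ into V(n) gives 1/(2τ) = r∞/2.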

Parallelism Between Functional Units

  • Multiple independent functional units can execute instructions concurrently.
  • Examples include adders, multipliers, and I/O units.
  • Pipeline parallelism can be used as well.

SIMD Machines

  • Fundamental principle: All processors execute the same instruction simultaneously on different data.
  • Instruction broadcast to all processors.
  • Memory is structured in banks.
  • Interconnection network used for re-sequencing.
  • Operating in blocks of P elements.
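The bank structure can be sketched as an interleaved layout (an illustrative assumption consistent with the notes), where P consecutive elements fall in P different banks and can therefore be fetched in parallel:

```python
# Sketch of interleaved memory banks for a SIMD machine: element i of
# the vector is stored in bank i mod P.

P = 4                                   # number of banks / processors
data = list(range(10))                  # a vector of 10 elements

banks = [[] for _ in range(P)]
for i, x in enumerate(data):
    banks[i % P].append(x)              # element i lives in bank i mod P

# One "block" of P consecutive elements touches each bank exactly once,
# so all P words can be provided in parallel:
block0 = [banks[b][0] for b in range(P)]
print(block0)   # [0, 1, 2, 3]
```

A stride that is a multiple of P would send every accessed element to the same bank, which is exactly the bank-conflict problem the notes discuss.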

SIMD Machines (continued)

  • Data layout in different memory banks and parallel execution (SIMD) is described using diagrams.
  • Potential issues, such as memory bank conflicts, are recognized and discussed.

SIMD Machines (continued)

  • Further explanation on data layout in memory banks.
  • Demonstrates potential conflicts if data is not properly allocated.

SIMD Machines (continued)

  • Explanation of how data should be structured for SIMD operations to work optimally.

SIMD Machines (continued)

  • When an instruction involves a conditional check, every processor evaluates the condition in parallel, and an inhibition bit disables the operation on processors where the condition does not hold.

SIMD Characteristics Summary

  • SIMD's strengths include simplicity, modularity, and high throughput for handling large amounts of data.
  • SIMD's weaknesses include limitations in handling various computation tasks, limited flexibility in program design, and specialization of architecture.

MIMD with Shared Memory

  • Completely independent processors, with a shared address space.
  • Uniform access time to the shared memory.
  • Communication relies on the interconnection network, similar to a telephone switchboard.
  • Data transfer between processors is not explicit.

MIMD with Distributed Memory

  • Fully independent and autonomous processors each with own address space.
  • Memory allocated to processors.
  • Processors communicate with each other using messages through interconnection network.
  • Explicit movement of data between processes.
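The explicit message exchange can be sketched as follows; threads and a queue stand in for processors and a network link purely for illustration:

```python
# Minimal sketch of explicit message passing between two "processors".
# On a real distributed-memory machine each worker owns its memory and
# exchanges data over the interconnection network.
import threading, queue

channel = queue.Queue()                 # stands in for a network link

def producer():
    local = [1, 2, 3]                   # data in the sender's local memory
    channel.put(sum(local))             # explicit send

def consumer(out):
    out.append(channel.get())           # explicit receive

result = []
t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer, args=(result,))
t1.start(); t2.start(); t1.join(); t2.join()
print(result)   # [6]
```

The key contrast with shared memory is that the receiver never reads the sender's data directly; everything crosses the channel as a message.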

Topologies for Distributed Memory

  • Different network topologies (2D mesh, 3D mesh, torus, and hypercube).
  • Diagrams illustrate processor and memory connections.

Topology and Hypercubes

  • Hypercubes are a common topology for distributed memory machines.
  • Processor and memory connections and communication links are elaborated.
  • Diagrams demonstrate details on connections and communication links.
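The connection rule can be sketched directly from the binary labels of the vertices: two processor/memory couples are linked exactly when their labels differ in one bit.

```python
# Sketch of hypercube addressing: an n-dimensional hypercube has 2**n
# vertices, and flipping one of the n bits of a vertex label yields one
# of its n neighbors.

def neighbors(v, n):
    """Vertices adjacent to v in an n-cube: flip each of the n bits."""
    return [v ^ (1 << d) for d in range(n)]

n = 3                                   # a 3-cube has 8 vertices
print(neighbors(0b000, n))              # [1, 2, 4]
print(neighbors(0b101, n))              # [4, 7, 1]
```

This also shows why each node in an n-cube needs exactly n communication links, and why any two nodes are at most n hops apart.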

Overview of MIMD Machines with Shared Memory

  • Advantages include simplicity of programming and no need for explicit data movement.
  • Disadvantages include potential performance bottlenecks from shared-memory contention or network congestion.

Overview of MIMD Machines with Distributed Memory

  • Advantages include scalability, potential for faster data access, and flexibility in design.
  • Disadvantages include the complexity of programming that may be required as well as message communication overhead.

Element Processors (SIMD Case)

  • Dedicated processors (1 bit, 4 bits, or those with specialized floating-point units) can reduce control logic complexity.
  • Integration of multiple processors onto a single chip is possible.

Element Processors (MIMD Case)

  • General-purpose processors are common.
  • Modern processors have tools for parallel execution.
  • Communication mechanisms (e.g., Direct Connect Routing Module) may be needed.
  • Coprocessor support for certain operations (vector operations) can be added or integrated.

Parallel Algorithm

  • Using parallelism significantly affects system performance.
  • Measuring algorithm efficiency requires considering data size, parallelism degree in an algorithm, and data transfer.

Parallel Algorithm (Definitions)

  • Key definitions for assessing parallel execution: Sequential execution time, time using a parallel approach, speedup, and efficiency.
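These definitions translate directly into the standard formulas S_p(A) = T_1/T_p and E_p(A) = S_p(A)/p, sketched here with illustrative timings:

```python
# Sketch of speedup and efficiency: t_seq is the time of the best
# sequential algorithm, t_par the parallel time on p processors.

def speedup(t_seq, t_par):
    return t_seq / t_par

def efficiency(t_seq, t_par, p):
    return speedup(t_seq, t_par) / p

t1, t4, p = 100.0, 30.0, 4              # illustrative timings
print(speedup(t1, t4))                  # ~ 3.33
print(efficiency(t1, t4, p))            # ~ 0.83
```

The bounds in the notes follow immediately: S_p(A) ≤ p, and hence E_p(A) ≤ 1, with equality only under perfect parallelization.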

Parallel Algorithm (Properties)

  • Important characteristics for parallel algorithms include upper bounds on speedup and efficiency.

Vector Operations

  • Operations on multiple values simultaneously
  • Examples demonstrate how single-value mathematical operations can be performed on vectors
  • Concepts such as vector lengths and incrementing steps are important
  • A variety of operations are shown
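The vector notation A[a; N; R] from the lesson (start address, length, stride) can be sketched as index selection over a list standing in for memory:

```python
# Sketch of the vector notation A[a; N; R]: a sequence of N addresses
# starting at a with stride R, realized over a toy "memory".

def vec(mem, a, N, R):
    """Elements of mem at addresses a, a+R, ..., a+(N-1)*R."""
    return [mem[a + k * R] for k in range(N)]

mem = list(range(100, 120))             # toy memory contents
print(vec(mem, 3, 4, 2))                # [103, 105, 107, 109]

# A component-by-component operation combines two vectors of equal
# length; a reduction collapses one vector to a scalar.
x, y = vec(mem, 0, 3, 1), vec(mem, 5, 3, 1)
print([a + b for a, b in zip(x, y)])    # [205, 207, 209]
print(sum(x))                           # 303
```

The same idea gives the loop form of a vector statement: the kth element of A[1; N; 2] has address 1 + (k − 1)·2 = 2k − 1.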

Vector Operations (Examples)

  • Examples of vector operations, including component-wise arithmetic, scalar operations with vectors and reduction operations.

Software View of Vector Operations

  • Software instructions sequence to perform vector operations.
  • Instructions for storing and loading vector operations steps.

Mask

  • Mask (VM) is a vector of bits used to control vector operations.
  • When VM is 1, specific element is computed.
  • Useful for controlling which elements are modified.
  • Useful to filter data in vector operations.
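A masked vector operation can be sketched as follows; where the mask bit is 1 the result element is recomputed, where it is 0 the old value is kept:

```python
# Sketch of a masked vector addition: the vector equivalent of a
# conditional branch inside a loop.

def masked_add(c, a, b, vm):
    return [ai + bi if m else ci
            for ci, ai, bi, m in zip(c, a, b, vm)]

a  = [1, 2, 3, 4]
b  = [10, 20, 30, 40]
c  = [0, 0, 0, 0]
vm = [1, 0, 1, 0]                       # mask vector
print(masked_add(c, a, b, vm))          # [11, 0, 33, 0]
```

Note that all four additions may still be evaluated by the hardware; the mask only controls which results are written back, which is why the cost depends on the vector length rather than on the number of 1s in the mask.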

Examples

  • Examples that use masks to describe steps in conditional statements operating on vectors.

Description

This quiz covers key concepts in parallel computer architectures, including SIMD and MIMD systems. It explores the general structure of parallel computers, the role of processing elements, and interconnection networks. Additionally, it provides a plan for studying various architectures as well as a bibliography of relevant literature.
