
20240205-DS642-Lecture-03.pdf


Full Transcript

DS 642: Applications of Parallel Computing
Lecture 3
02/05/2024
http://www.cs.njit.edu/~bader

Outline
Processors and registers
Memory hierarchies
Parallelism within single processors
  – Instruction Level Parallelism (ILP) and Pipelining
  – SIMD units
  – Special Instructions (FMA)
Case study: Matrix Multiplication
Optimization in practice

What is Pipelining?
Dave Patterson’s Laundry example
Latency: wash (30 min) + dry (40 min) + fold (20 min) = 90 min

[Figure: timeline from 6 PM to 9 PM showing task order for loads A, B, C, D; stage durations 30, 40, 40, 40, 40, 20 min]

In this example (4 loads of laundry):
- Sequential execution takes 4 * 90 min = 6 hours
- Pipelined execution takes 30 + 4*40 + 20 min = 3.5 hours

Bandwidth = loads/hour
= 4/6 l/h w/o pipelining
= 4/3.5 l/h w/ pipelining
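
The laundry arithmetic above can be checked with a short simulation. This is a minimal sketch, not part of the lecture: the function names (`sequential_minutes`, `pipelined_minutes`, `loads_per_hour`) are hypothetical, and it assumes each stage handles one load at a time and that a finished load can wait between stages until the next stage is free.

```python
def sequential_minutes(stages, loads):
    """Total time when each load finishes all stages before the next starts."""
    return loads * sum(stages)


def pipelined_minutes(stages, loads):
    """Total time when a stage starts its next load as soon as it is free.

    Assumption (not from the slides): a load may wait between stages.
    """
    # finish[s] is the time at which stage s completes its most recent load.
    finish = [0] * len(stages)
    for _ in range(loads):
        ready = 0  # time at which this load leaves the previous stage
        for s, duration in enumerate(stages):
            start = max(ready, finish[s])  # wait for the load and for the stage
            finish[s] = start + duration
            ready = finish[s]
    return finish[-1]


def loads_per_hour(loads, minutes):
    """Bandwidth (throughput) in loads per hour."""
    return loads / (minutes / 60)


if __name__ == "__main__":
    stages = [30, 40, 20]  # wash, dry, fold (minutes), as in the example
    loads = 4

    seq = sequential_minutes(stages, loads)
    pipe = pipelined_minutes(stages, loads)

    print(f"sequential: {seq} min, {loads_per_hour(loads, seq):.2f} loads/hour")
    print(f"pipelined:  {pipe} min, {loads_per_hour(loads, pipe):.2f} loads/hour")
    # sequential: 360 min, 0.67 loads/hour
    # pipelined:  210 min, 1.14 loads/hour
```

The simulated pipelined time matches the slide's 30 + 4*40 + 20 because the 40-minute dryer is the slowest stage and sets the rate at which loads complete. The latency of a single load stays at 90 minutes in both cases; only the bandwidth improves.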
