CSE220 - Computer Architecture Course Outline PDF
Document Details
University of California, Santa Cruz
Prof. Heiner Litz
Summary
This document contains lecture slides for a graduate computer architecture course. The course covers topics such as technology trends, performance, ISAs, pipelining, caches, and virtual memory (VM). The slides give a course outline, including details on required materials and assessments, and introduce performance metrics and Amdahl's Law.
Full Transcript
CSE220 - Computer Architecture
Introduction + Performance Metrics
Prof. Heiner Litz

Plan for Today
  Course Overview
  Performance Metrics
  Computer Architecture Laws
    Moore’s Law
    Amdahl’s Law
    Little’s Law
  Benchmarking
CMPE 110 Prof. Renau 2

New Golden Age for Computer Architecture
  What is the fastest CPU out there?
  What should you use for ML in the datacenter?
  This course will cover the fundamental principles of architecture
  With it, you can understand/evaluate/optimize the diversity of architectures that are currently emerging

When Will You Use Computer Architecture
  Hardware Design
  Systems Design
  High performance software development
  System Selection / Purchasing
  Performance Tuning
  Throughout Computer Science

What will you know at the end?
  Understand how current processors work
    Intel Skylake, AMD Zen, Apple Firestorm (in A14)...
  Become a better programmer
    Understand the performance implications
  Get basic knowledge to do research in computer architecture
  Fundamental computational concepts applicable beyond architecture, such as: parallelism, locality, speculation, performance evaluation
  How to turn transistors into performance

You will not learn
  I/O
  Advanced multiprocessor stuff
    Consistency protocol optimizations (CSE226)
  GPUs & accelerators (watch for special topics course)
  Compiler/microprocessor interaction
    but you could guess it
  Many research ideas not available in current processors
    Thread Level Speculation
    HW support for debugging
    Software techniques to improve ILP
  Some covered in CSE 221 (Advanced Microprocessor Design)

Staff
  Instructor - Prof. Heiner Litz
    Lectures: T/Th 1:30-3pm
    Office Hours: TBD
    https://people.ucsc.edu/~hlitz/
    Research interests: computer architecture, data centers, compilers, storage systems
  Teaching Assistants
    Surim Oh
    Yuanpeng Liao
    Office Hours: TBD

Lectures Logistics
  When: Tue/Thu 11:40am-1:15pm (zoom)
  Recordings available afterwards on Yuja (webcast)
  Strongly recommend attending live and participating
  Office Hours Thursday: TBD
  Final Exam: Wednesday, Dec. 11, 9:00–11:00 a.m.

Course Outline
  Unit 1 – Basics (Weeks 1 – 3)
    Topics – technology trends, performance, ISAs, pipelining, caches, VM
    Five week review of CSE120
  Unit 2 – Advanced Processors (Weeks 4 – 8)
    Topics – out-of-order, speculation, branch prediction, prefetching
  Unit 3 – Parallelism (Weeks 9 – 10)
    Topics – multithreading, vectors, VLIW, coherence, consistency
  Final Exam (Wed, June 14, 4-7 pm)

Recommended Textbooks
  No books required, but if slides are insufficient, consider consulting:
    Computer Architecture: A Quantitative Approach, 6th Ed. by John Hennessy and David Patterson
    (Modern Processor Design by John Shen and Mikko Lipasti)
    (The RISC-V Reader by David Patterson & Andrew Waterman)

Materials
  Everything will be at:
    Canvas (slides, recordings on Yuja)
      https://canvas.ucsc.edu/courses/75714
    Discussion (on Ed)
      You can run your own Discord but we will check only Ed
    Github for Simulator (Scarab) used in this class
  For the final exam, you will need:
    Canvas access
    One double-sided page of notes, closed book otherwise
    Calculator
  No midterm

Notice of Accommodation
  If you have an accommodation from the DRC, send me the form by email with the requirements
  Please send the form promptly

Course Requirements
  Prerequisites
    Taken an undergraduate computer architecture course
    Covered basic computer organization (e.g.
    ISA, pipelines, caches)
    Up to student to assess
    Capacity to “install/play” with Linux setups
    Will need to know C and be able to compile
    Capacity to program/script
  You should not take this course if any of these apply
    You do not meet the prerequisites
    Your schedule and/or lifestyle won’t fit a high-workload course
    You are a current UCSC undergrad and have not taken CSE 120
  Questions?
    Bonus Instructor office hours today following lecture on this call

Prerequisite Diagnostic (this week)
  A Prerequisite Diagnostic Exam will be given via Canvas (you can take it now, closes on Friday)
  Goal: Allow students to see how much they know coming into the course relative to instructor expectations
  Graded to incentivize students to take it, but only 2% of grade
  No grade required to stay in the course; however, a low score should be a warning to students that they may not be adequately prepared
    Consider auditing/taking CSE 120, I can provide access to Yuja
  Closed book, on Canvas, 30 min, opens in the morning

Tentative Grading
  Homework: 10%
  Exams: 35%
    Prerequisite Diagnostic: 2%
    Final: 33%
  Projects: 40%+15%
    Scarab Simulator, paper review
    End of year presentation
  Projects are in groups of 2
  Homework + Exam individual
  Exams are on Canvas, time constrained (know your stuff!)
  Late policy: 1min late = not submitted = zero (I’m not kidding)
    Recommend submitting partially completed homework early

Homework (10%)
  Homework problems are a great way to interact with the material
    Learning value is in the way you solved it more so than the actual solution
    Great way to prepare for the final, but exam questions may be formulated differently, so focus on process
  Typically have a week from posting to due date
    Can start as soon as posted, as problems usually follow lecture order
  Use Ed to see the assignment & to submit your work
  Use Ed discussion for homework questions
    Try to be specific instead of vague (e.g.
    “I’m stuck on #2”)
    More specific questions are not only easier to answer, but the process of narrowing down the point of confusion and concisely stating it may help you solve it
    Other students can help answer
    Be careful to not divulge partial solutions

Project Overview (45%)
  Environment: Linux System required
  Goal: Get hands-on experience with processor design and a better understanding of hardware-software interactions
  Will use Scarab architecture simulator
  There will be several labs and a final project
    We will provide some assignments to ramp up, with a larger project towards the end.

Project Timeline
  Part 1 – Setup Scarab Lab
  Part 2 – Run and Evaluate Spec2017
  Part 3 – Cache performance study, determine 3Cs
  Part 4 – Processor Improvement
    Implement a new Scarab feature (e.g. new prefetcher, branch predictor)
    We will provide some project ideas
    Students can propose own ideas to the instructor

Advice
  Focus on concepts and reasoning over mechanical details
    What problem is this architecture technique trying to solve?
    What is the key insight? What makes it provide benefit?
    Will allow you to solve problems even if formulated very differently
  When in trouble with the material
    Consult slides, webcasts, recommended textbooks
    Check for the question on Canvas discussion; if unasked, ask it
    Attend staff office hours
    Email staff: last resort, nearly all course material questions should be handled on Canvas
  I have a keen eye and no tolerance for cheating
    Exams are to be done INDIVIDUALLY
    Disciplinary hearings are no fun
    Check UCSC’s Code of Academic Integrity
      https://www.ucsc.edu/academics/academic-integrity/

Break

Plan for Today
  Performance Metrics
  Amdahl’s Law

Acknowledgements
  These slides contain course material initially created by Jose Renau for CMPE 202

Computer Architect
  Role of the computer architect
    To design the various levels of a computer system to maximize performance and programmability within technology and cost limits
  Architect must be aware of
    Measures of cost and performance
    Application characteristics and benchmarks
    Major performance factors: memory, technology
    How to design most of the processor and system blocks
    Security
  Caveat
    Processor design requires 2-5 years, so need to look at future technology

Performance Basics
  Two major metrics
    Latency: wall-clock time, elapsed time, ...
      How long does it take to do X?
    Throughput: bandwidth
      How much X work is done in a given time?

Important Performance Metrics
  Time/Latency (seconds)
  IPC (instructions per cycle)
  CPI (cycles per instruction)
  Bandwidth (X/second, where X = requests, MB, ...)
  Cache miss rate (misses/(misses+hits))
  Cache hit rate (hits/(misses+hits))
  Misses per kilo instructions (MPKI)
  MIPS: million instructions per second
  FLOPS: floating-point operations per second
  Clock frequency (Hertz, or 1/s)

Processor Execution Time
  IC: Instruction Count
  CPI: Cycles per Instruction
  Freq: Processor Frequency
  $$\text{Time} = \text{IC} \times \text{CPI} \times \frac{1}{\text{Freq}}$$
  Want to minimize Time
    Freq: increase clock frequency, but at a cost of power
    IC: reduce by optimizing algorithm or compiler
    CPI: can be reduced by using simple instructions, uarch improvements

Relative Performance, Speedup
  Performance: overall improved execution for a whole problem
  $$\text{Speedup} = \frac{\text{Time(old)}}{\text{Time(new)}}$$
  Prefer using speedup (times or x) instead of percent (%)
    2.1x speedup is clearer than 110% speedup
    Sometimes % is still used if the gain is small (e.g. 5% speedup is 1.05x)
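
A minimal sketch of the execution-time equation (Time = IC * CPI * 1/Freq) and the speedup ratio (Time(old) / Time(new)) from the Processor Execution Time and Relative Performance slides. The workload numbers (1 billion instructions, a 2 GHz clock, CPI of 1.5 versus 1.2) are made up for illustration and do not come from the course; the function names are hypothetical.

```python
def execution_time(instruction_count, cpi, freq_hz):
    """Time = IC * CPI * (1 / Freq), in seconds."""
    return instruction_count * cpi / freq_hz


def speedup(time_old, time_new):
    """Speedup = Time(old) / Time(new); above 1 means the new design is faster."""
    return time_old / time_new


# Hypothetical workload: 1 billion instructions on a 2 GHz core.
# Only the CPI changes between the two designs.
baseline = execution_time(1_000_000_000, cpi=1.5, freq_hz=2_000_000_000)
improved = execution_time(1_000_000_000, cpi=1.2, freq_hz=2_000_000_000)

print(f"baseline: {baseline:.3f} s")
print(f"improved: {improved:.3f} s")
print(f"speedup:  {speedup(baseline, improved):.2f}x")
```

With these assumed inputs the baseline takes 0.750 s, the improved design 0.600 s, and the speedup is 1.25x. Because IC and Freq are held constant, the speedup reduces to the ratio of the two CPIs.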

Cycles per Instruction (CPI)
  Two common usages:
    Instruction CPI
      Each instruction has a fixed CPI
      E.g.: load takes 7 cycles, add takes 5 cycles, ...
        CPI_load = 7
        CPI_add = 5
    Average CPI (most common meaning)
      Depends on the program or instruction mix
      Lower CPI means better performance
      $$\text{CPI} = \sum_{i=1}^{n} \frac{\text{IC}_i}{\text{IC}} \times \text{CPI}_i$$
  IPC (Instructions Per Cycle) is the inverse of CPI

Example
  Given two machines, M1 and M2

  Instruction   Percentage   CPI_M1   CPI_M2
  ALU           50%          1        2
  Branches      15%          2        1
  Loads         20%          2        1
  Stores        15%          1        1

  How much faster is machine M1 than M2?
  What about assuming M1 has 500MHz and M2 has 600MHz?

Amdahl’s Law
  Performance gain is limited by the fraction of improved execution
  E.g.: If you only speed up 50% of the code, your maximum speedup is 2
  $$\text{Speedup} = \frac{\text{Time(old)}}{\text{Time(new)}}$$
  $$\text{Max Speedup} = \frac{1}{1 - f(e)}$$
  $$\text{Speedup} = \frac{1}{(1 - f(e)) + \frac{f(e)}{\text{Speedup}(e)}}$$

Analyzing Performance
  Run representative applications (Spec?)
  Measure execution time
  Understanding performance
    Is it because of high IC or low IPC?
    What causes low IPC?
      Measure cache MPKI, branch mispredictions, memory accesses, ...
    As a computer architect we mainly control IPC, IC not so much
  How to measure performance properties?
    PMU counters: perf stat (cache misses, branch mispredictions, ...)
    Architectural simulator (also allows prototyping of new ideas)
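
The M1/M2 example and Amdahl's Law above can be worked through with a short sketch. It applies the average-CPI definition (each class's share of the instruction count times its CPI) to the table values, then evaluates Amdahl's formula for the 50% case. It assumes both machines execute the same instruction count, which the slides do not state explicitly; the function and variable names are hypothetical.

```python
# Instruction mix and per-class CPI taken from the M1/M2 example table.
MIX = {"ALU": 0.50, "Branches": 0.15, "Loads": 0.20, "Stores": 0.15}
CPI_M1 = {"ALU": 1, "Branches": 2, "Loads": 2, "Stores": 1}
CPI_M2 = {"ALU": 2, "Branches": 1, "Loads": 1, "Stores": 1}


def average_cpi(mix, cpi_by_class):
    """Average CPI = sum over classes of (IC_i / IC) * CPI_i."""
    return sum(fraction * cpi_by_class[name] for name, fraction in mix.items())


def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Speedup = 1 / ((1 - f(e)) + f(e) / Speedup(e))."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


cpi_m1 = average_cpi(MIX, CPI_M1)
cpi_m2 = average_cpi(MIX, CPI_M2)

# Assumption: same instruction count and same clock, so time scales with CPI.
same_clock = cpi_m2 / cpi_m1

# M1 at 500 MHz, M2 at 600 MHz: time per instruction = CPI / Freq.
time_m1 = cpi_m1 / 500e6
time_m2 = cpi_m2 / 600e6

print(f"CPI_M1 = {cpi_m1:.2f}, CPI_M2 = {cpi_m2:.2f}")
print(f"equal clocks: M1 is {same_clock:.2f}x faster than M2")
print(f"500 MHz vs 600 MHz: M2 is {time_m1 / time_m2:.2f}x faster than M1")

# Amdahl's Law: enhancing 50% of the execution approaches, but never reaches, 2x.
for s in (2, 10, 1000):
    print(f"f(e) = 0.5, Speedup(e) = {s}: overall {amdahl_speedup(0.5, s):.3f}x")
```

Under these assumptions the average CPIs are 1.35 (M1) and 1.50 (M2), so M1 is about 1.11x faster at equal clock frequency. At 500 MHz versus 600 MHz the order flips and M2 is about 1.08x faster. The Amdahl loop gives 1.333x, 1.818x, and 1.998x, converging toward the maximum speedup of 2 stated on the slide.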