Kangwon National University
Computer Abstractions and Technology: Performance III
471029: Introduction to Computer Architecture, 5th Lecture
Disclaimer: Slides are mainly based on COD 5th textbook and also developed in part by Profs. Dohyung Kim @ KNU and Computer architecture course @ KAIST and SKKU

Example of Performance Profiling
  Naïve (E-ruri)
    Files: Makefile, naïve.c, test_common.c, test_common.h
  Simple matrix multiplication workload
    Three large matrices: matrix_r = matrix_a x matrix_b

Run-time Variations
  Evaluation on a single-threaded workload

Run-time Variations (cont'd)
  Evaluation in a more realistic environment
    Multiple workloads + multi-threaded workload
  "History-Based Arbitration for Fairness in Processor-Interconnect of NUMA Servers", Wonjun Song et al., ASPLOS

Run-time Variations (cont'd)
  Evaluation in a more realistic environment (cont'd)
    Multiple workloads + multi-threaded workload
  "History-Based Arbitration for Fairness in Processor-Interconnect of NUMA Servers", Wonjun Song et al., ASPLOS

Run-time Variations (cont'd)
  Minimize the sources of noise.
    Disable DVFS (CPU frequency scaling) both in Linux and in the BIOS
      e.g., select the performance governor in Linux (available governors include performance, powersave, ondemand)
      /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
    Turn off SMT (a.k.a. HT)
    Change the Linux scheduling policy to "SCHED_FIFO" (hmm..)
    Add GRUB_CMDLINE_LINUX_DEFAULT="isolcpus=0,1" as a GRUB parameter
    Spawn a workload and pin it to a specific core (disabling migration)
    Evaluate a workload several times and use the mean

Run-time Variations (cont'd)
  "perf" can run the same test workload multiple times and report, for each count, the standard deviation from the mean.
    e.g., perf stat -r 10 …..
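
The "run it several times and use the mean" advice can be sketched in a few lines of Python. This is only an illustrative harness, not part of the lecture material: the function names time_runs and summarize and the placeholder command are assumptions, and it measures wall-clock time only, whereas perf stat -r also aggregates hardware counters.

```python
import statistics
import subprocess
import sys
import time


def time_runs(cmd, repeats=10):
    """Run cmd `repeats` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (mean, sample standard deviation, stddev as a percentage of the mean)."""
    mean = statistics.mean(durations)
    # A sample standard deviation needs at least two measurements.
    stdev = statistics.stdev(durations) if len(durations) > 1 else 0.0
    return mean, stdev, 100.0 * stdev / mean


if __name__ == "__main__":
    # Placeholder workload (an assumption): replace with the real binary, e.g. ["./naive"].
    cmd = [sys.executable, "-c", "sum(range(100000))"]
    mean, stdev, pct = summarize(time_runs(cmd, repeats=5))
    print(f"mean={mean:.4f}s stdev={stdev:.4f}s ({pct:.2f}% of mean)")
```

A large standard deviation relative to the mean suggests that noise sources such as frequency scaling or core migration are still active.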

Profiling
  The perf tool offers a rich set of commands to collect and analyze performance and trace data
    stat, record, report commands
  perf record
    Run a command and record its profile into the perf.data file
    $ perf record -h
  perf report
    Read perf.data (created by perf record) and display the profile
    $ perf report -h

Profiling - Example

Profiling - Example (cont'd)

[ Aside ] Visualization of Perf results
  Flame, Pprof, Hotspot
  Example on Flame
  Source: "Using Linux perf at Netflix", Brendan Gregg, 2017

Profiling - Example (cont'd)

Profiling - Example (cont'd)

Profiling - Example (cont'd) [ prior version ]

Profiling - Example (cont'd) [ prior version ]

Digital Systems and Logic: Review
471029: Introduction to Computer Architecture, 5th Lecture

Hardware Design Hierarchy
  [Figure: hierarchy diagram. Labels, top to bottom: system; datapath, control; code registers, multiplexer, comparator, state registers, combinational logic; register, logic; switching networks]

Switches
  The basic element of physical implementations
  Convention: if the input is "1", the switch is asserted
    Open the switch if A is "0" (unasserted) and turn OFF the light bulb (Z)
    Close the switch if A is "1" (asserted) and turn ON the light bulb (Z)
  In this example, Z ≡ A.

Switches (cont'd)
  Can compose switches into more complex ones (Boolean functions)
  Arrows show the action upon assertion (1 = close)
    AND: Z ≡ A and B
    OR: Z ≡ A or B

Block Diagrams
  In reality, chips are composed of just transistors and wires
  Small groups of transistors form useful building blocks
    [Figure: NAND gate built from transistors between VDD (1) and GND (0), inputs X and Y, output Z ≡ (XY)']
  Can combine them to build higher-level blocks
  You can build AND, OR, and NOT out of NAND!
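
The claim that AND, OR, and NOT can all be built from NAND can be checked exhaustively. The following is a minimal sketch on 0/1 integers; the helper names (nand, not_, and_, or_) are illustrative choices, not names from the slides.

```python
from itertools import product


def nand(x, y):
    """NAND on single bits: the output is 0 only when both inputs are 1."""
    return 1 - (x & y)


def not_(x):
    # Tying both NAND inputs together inverts the signal.
    return nand(x, x)


def and_(x, y):
    # AND is NAND followed by an inverter.
    return not_(nand(x, y))


def or_(x, y):
    # By De Morgan's law, x OR y = NOT(NOT x AND NOT y) = NAND(NOT x, NOT y).
    return nand(not_(x), not_(y))


if __name__ == "__main__":
    # Print a combined truth table for the NAND-built gates.
    print("x y | and or not(x)")
    for x, y in product((0, 1), repeat=2):
        print(f"{x} {y} |  {and_(x, y)}   {or_(x, y)}    {not_(x)}")
```

Counted in NAND gates, NOT costs one, AND costs two, and OR costs three in this construction.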

Today
  Combinational logic
  Sequential logic

Type of circuits
  Digital systems consist of two basic types of circuits
    Combinational Logic
      Output depends only on the current inputs
      e.g., a circuit that adds A and B

Type of circuits (cont'd)
  Digital systems consist of two basic types of circuits
    Combinational Logic
    Sequential Logic
      Output depends on the current inputs + the current state (stored values)
      Includes memory elements

ALU (Arithmetic Logic Unit)
  An ALU is a combinational digital electronic circuit that performs arithmetic and bitwise operations on integer binary numbers
  An ALU is a fundamental building block of many types of computing circuits, including the CPU of computers, FPUs, and GPUs
  A single CPU, FPU, or GPU may contain multiple ALUs

Representations of Combinational Logic
  Text Description
  Circuit Diagram
    Transistors and wires
    Logic gates
  Truth Table
  Boolean Expression
  All are equivalent

Truth Tables
  Table that relates the inputs of a CL circuit to its output
    Output depends only on the current inputs
    Uses the abstraction of 0/1 instead of high/low voltage
    Shows the output for every possible combination of inputs
  How big?
    0 or 1 for each of N inputs, so 2^N rows

Logic Gates
  Special names and symbols
  Circle means NOT!
  NOT
    a c
    0 1
    1 0
  AND
    a b c
    0 0 0
    0 1 0
    1 0 0
    1 1 1
  OR
    a b c
    0 0 0
    0 1 1
    1 0 1
    1 1 1

Logic Gates (cont'd)
  Special names and symbols
  NAND
    a b c
    0 0 1
    0 1 1
    1 0 1
    1 1 0
  NOR
    a b c
    0 0 1
    0 1 0
    1 0 0
    1 1 0
  XOR
    a b c
    0 0 0
    0 1 1
    1 0 1
    1 1 0

Truth Table to Logic Equation

Laws of Boolean Algebra

Boolean Algebraic Simplification Example

Boolean Algebraic Simplification Example

Boolean Algebraic Simplification Example

Boolean Algebraic Simplification Example

Circuit Simplification Example
  Simplify the following circuit
    [Figure: circuit with inputs A, B, C and output D]
  Options
    1) Test all combinations of the inputs and build the truth table
    2) Write out expressions for the signals based on the gates
       - Will show this method here

Circuit Simplification Example (cont'd)
  Simplify the following circuit
    [Figure: the same circuit annotated with intermediate signals AB, (AB)', B', B'C, A + B'C]
  Start from the left, propagate signals to the right
  Arrive at D = (AB)'(A + B'C)

Circuit Simplification Example (cont'd)
  Simplify the expression
    D = (AB)'(A + B'C)
      = (A' + B')(A + B'C)              DeMorgan's
      = A'A + A'B'C + B'A + B'B'C       Distribution
      = 0 + A'B'C + B'A + B'B'C         Complementarity
      = A'B'C + B'A + B'C               Idempotent Law
      = (A' + 1)B'C + AB'               Distribution
      = B'C + AB'                       Law of 1's
      = B'(A + C)                       Distribution

Circuit Simplification Example (cont'd)
  Draw out the final circuit
    D = B'C + AB' = B'(A + C)
  How many gates do we need for each?
    B'(A + C) needs 3 gates (NOT, OR, AND)
  Simplified circuit
    [Figure: simplified circuit with inputs A, B, C and output D]
  Reduction from 6 gates to 3!
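
Option 1 from the circuit example (test all input combinations) makes a handy cross-check on the algebra. The sketch below is an illustration rather than part of the slides: it enumerates all 2^3 = 8 rows and compares D = (AB)'(A + B'C) with B'(A + C); the function names original, simplified, and equivalent are assumptions.

```python
from itertools import product


def original(a, b, c):
    """D = (AB)'(A + B'C), read directly off the unsimplified circuit."""
    return (not (a and b)) and (a or ((not b) and c))


def simplified(a, b, c):
    """D = B'(A + C), the result of the algebraic simplification."""
    return (not b) and (a or c)


def equivalent(f, g, n):
    """Return True if f and g agree on all 2**n combinations of n Boolean inputs."""
    return all(
        bool(f(*bits)) == bool(g(*bits))
        for bits in product((False, True), repeat=n)
    )


if __name__ == "__main__":
    # Print the full truth table for both forms of D.
    print("A B C | original simplified")
    for a, b, c in product((False, True), repeat=3):
        row = (int(a), int(b), int(c), int(original(a, b, c)), int(simplified(a, b, c)))
        print("{} {} {} |    {}         {}".format(*row))
    print("equivalent:", equivalent(original, simplified, 3))
```

Exhaustive checking is practical here because the table has only 8 rows; since the row count doubles with each added input, the algebraic method scales better for wide circuits.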