Digital Design and Computer Architecture RISC-V Edition PDF

Table B.4 Register names and numbers 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 Register funct...

Table B.4 Register names and numbers 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 Register funct4 rd/rs1 rs2 op CR-Type Name Number Use funct3 imm rd/rs1 imm op CI-Type zero x0 Constant value 0 ra x1 Return address funct3 imm rs1' imm rs2' op CS-Type sp x2 Stack pointer funct6 rd'/rs1' funct2 rs2' op CS'-Type gp x3 Global pointer funct3 imm rs1' imm op CB-Type tp x4 Thread pointer funct3 imm funct rd'/rs1' imm op CB'-Type t0 –2 x5 –7 Temporary registers funct3 imm op CJ-Type s0/fp x8 Saved register / Frame pointer s1 x9 Saved register funct3 imm rs2 op CSS-Type a0 –1 x10 –11 Function arguments / Return values funct3 imm rd' op CIW-Type a2 –7 x12 –17 Function arguments funct3 imm rs1' imm rd' op CL-Type s2 –11 x18–27 Saved registers 3 bits 3 bits 3 bits 2 bits 3 bits 2 bits t3-6 x28–31 Temporary registers Figure B.2 RISC-V compressed (16-bit) instruction formats Table B.5 RVM: RISC-V multiply and divide instructions op funct3 funct7 Type Instruction Description Operation 0110011 (51) 000 0000001 R mul rd, rs1, rs2 multiply rd = (rs1 * rs2)31:0 0110011 (51) 001 0000001 R mulh rd, rs1, rs2 multiply high signed signed rd = (rs1 * rs2)63:32 0110011 (51) 010 0000001 R mulhsu rd, rs1, rs2 multiply high signed unsigned rd = (rs1 * rs2)63:32 0110011 (51) 011 0000001 R mulhu rd, rs1, rs2 multiply high unsigned unsigned rd = (rs1 * rs2)63:32 0110011 (51) 100 0000001 R div rd, rs1, rs2 divide (signed) rd = rs1 / rs2 0110011 (51) 101 0000001 R divu rd, rs1, rs2 divide unsigned rd = rs1 / rs2 0110011 (51) 110 0000001 R rem rd, rs1, rs2 remainder (signed) rd = rs1 % rs2 0110011 (51) 111 0000001 R remu rd, rs1, rs2 remainder unsigned rd = rs1 % rs2 Table B.6 RVC: RISC-V compressed (16-bit) instructions op instr15:10 funct2 Type RVC Instruction 32-Bit Equivalent 00 (0) 000– – – – CIW c.addi4spn rd', sp, imm addi rd', sp, ZeroExt(imm)*4 00 (0) 001– – – – CL c.fld fd', imm(rs1') fld fd', (ZeroExt(imm)*8)(rs1') 00 (0) 010– – – – CL c.lw rd', imm(rs1') lw rd', (ZeroExt(imm)*4)(rs1') 00 (0) 011– – – – CL c.flw fd', imm(rs1') flw fd', (ZeroExt(imm)*4)(rs1') 00 (0) 101– – – – CS c.fsd fs2', imm(rs1') fsd fs2', (ZeroExt(imm)*8)(rs1') 00 (0) 110– – – – CS c.sw rs2', imm(rs1') sw rs2', (ZeroExt(imm)*4)(rs1') 00 (0) 111– – – – CS c.fsw fs2', imm(rs1') fsw fs2', (ZeroExt(imm)*4)(rs1') 01 (1) 000000 – CI c.nop (rs1=0,imm=0) nop 01 (1) 000– – – – CI c.addi rd, imm addi rd, rd, SignExt(imm) 01 (1) 001– – – – CJ c.jal label jal ra, label 01 (1) 010– – – – CI c.li rd, imm addi rd, x0, SignExt(imm) 01 (1) 011– – – – CI c.lui rd, imm lui rd, {14{imm5}, imm} 01 (1) 011– – – – CI c.addi16sp sp, imm addi sp, sp, SignExt(imm)*16 01 (1) 100 – 00 – CB' c.srli rd', imm srli rd', rd', imm 01 (1) 100 – 01 – CB' c.srai rd', imm srai rd', rd', imm 01 (1) 100 – 10 – CB' c.andi rd', imm andi rd', rd', SignExt(imm) 01 (1) 100011 00 CS' c.sub rd', rs2' sub rd', rd', rs2' 01 (1) 100011 01 CS' c.xor rd', rs2' xor rd', rd', rs2' 01 (1) 100011 10 CS' c.or rd', rs2' or rd', rd', rs2' 01 (1) 100011 11 CS' c.and rd', rs2' and rd', rd', rs2' 01 (1) 101– – – – CJ c.j label jal x0, label 01 (1) 110– – – – CB c.beqz rs1', label beq rs1', x0, label 01 (1) 111– – – – CB c.bnez rs1', label bne rs1', x0, label 10 (2) 000– – – – CI c.slli rd, imm slli rd, rd, imm 10 (2) 001– – – – CI c.fldsp fd, imm fld fd, (ZeroExt(imm)*8)(sp) 10 (2) 010– – – – CI c.lwsp rd, imm lw rd, (ZeroExt(imm)*4)(sp) 10 (2) 011– – – – CI c.flwsp fd, imm flw fd, (ZeroExt(imm)*4)(sp) 10 (2) 1000– – – CR c.jr rs1 (rs1≠0,rs2=0) jalr x0, rs1, 0 10 (2) 1000– – – CR c.mv rd, rs2 (rd ≠0,rs2≠0) add rd, x0, rs2 10 (2) 1001– – – CR c.ebreak (rs1=0,rs2=0) ebreak 10 (2) 1001– – – CR c.jalr rs1 (rs1≠0,rs2=0) jalr ra, rs1, 0 10 (2) 1001– – – CR c.add rd, rs2 (rs1≠0,rs2≠0) add rd, rd, rs2 10 (2) 101– – – – CSS c.fsdsp fs2, imm fsd fs2, (ZeroExt(imm)*8)(sp) 10 (2) 110– – – – CSS c.swsp rs2, imm sw rs2, (ZeroExt(imm)*4)(sp) 10 (2) 111– – – – CSS c.fswsp fs2, imm fsw fs2, (ZeroExt(imm)*4)(sp) rs1', rs2', rd': 3-bit register designator for registers 8 –15: 0002 = x8 or f8, 0012 = x9 or f9, etc. Table B.7 RISC-V pseudoinstructions Pseudoinstruction RISC-V Instructions Description Operation nop addi x0, x0, 0 no operation li rd, imm11:0 addi rd, x0, imm11:0 load 12-bit immediate rd = SignExtend(imm11:0) * li rd, imm31:0 lui rd, imm31:12 load 32-bit immediate rd = imm31:0 addi rd, rd, imm11:0 mv rd, rs1 addi rd, rs1, 0 move (also called “register copy”) rd = rs1 not rd, rs1 xori rd, rs1, —1 one’s complement rd = ~rs1 neg rd, rs1 sub rd, x0, rs1 two’s complement rd = —rs1 seqz rd, rs1 sltiu rd, rs1, 1 set if = 0 rd = (rs1 == 0) snez rd, rs1 sltu rd, x0, rs1 set if ≠ 0 rd = (rs1 ≠ 0) sltz rd, rs1 slt rd, rs1, x0 set if < 0 rd = (rs1 < 0) sgtz rd, rs1 slt rd, x0, rs1 set if > 0 rd = (rs1 > 0) beqz rs1, label beq rs1, x0, label branch if = 0 if (rs1 == 0) PC = label bnez rs1, label bne rs1, x0, label branch if ≠ 0 if (rs1 ≠ 0) PC = label blez rs1, label bge x0, rs1, label branch if ≤ 0 if (rs1 ≤ 0) PC = label bgez rs1, label bge rs1, x0, label branch if ≥ 0 if (rs1 ≥ 0) PC = label bltz rs1, label blt rs1, x0, label branch if < 0 if (rs1 < 0) PC = label bgtz rs1, label blt x0, rs1, label branch if > 0 if (rs1 > 0) PC = label ble rs1, rs2, label bge rs2, rs1, label branch if ≤ if (rs1 ≤ rs2) PC = label bgt rs1, rs2, label blt rs2, rs1, label branch if > if (rs1 > rs2) PC = label bleu rs1, rs2, label bgeu rs2, rs1, label branch if ≤ (unsigned) if (rs1 ≤ rs2) PC = label bgtu rs1, rs2, label bltu rs2, rs1, offset branch if > (unsigned) if (rs1 > rs2) PC = label j label jal x0, label jump PC = label jal label jal ra, label jump and link PC = label, ra = PC + 4 jr rs1 jalr x0, rs1, 0 jump register PC = rs1 jalr rs1 jalr ra, rs1, 0 jump and link register PC = rs1, ra = PC + 4 ret jalr x0, ra, 0 return from function PC = ra call label jal ra, label call nearby function PC = label, ra = PC + 4 * call label auipc ra, offset31:12 call far away function PC = PC + offset, ra = PC + 4 jalr ra, ra, offset11:0 * la rd, symbol auipc rd, symbol31:12 load address of global variable rd = PC + symbol addi rd, rd, symbol11:0 * l{b|h|w} rd, symbol auipc rd, symbol31:12 load global variable rd = [PC + symbol] l{b|h|w} rd, symbol11:0(rd) * s{b|h|w} rs2, symbol, rs1 auipc rs1, symbol31:12 store global variable [PC + symbol] = rs2 s{b|h|w} rs2, symbol11:0(rs1) csrr rd, csr csrrs rd, csr, x0 read CSR rd = csr csrw csr, rs1 csrrw x0, csr, rs1 write CSR csr = rs1 * If bit 11 of the immediate / offset / symbol is 1, the upper immediate is incremented by 1. symbol and offset are the 32-bit PC-relative addresses of a label and a global variable, respectively. Table B.8 Privileged / CSR instructions op funct3 Type Instruction Description Operation 1110011 (115) 000 I ecall transfer control to OS (imm=0) 1110011 (115) 000 I ebreak transfer control to debugger (imm=1) 1110011 (115) 000 I uret return from user exception (rs1=0,rd=0,imm=2) PC = uepc 1110011 (115) 000 I sret return from supervisor exception (rs1=0,rd=0,imm=258) PC = sepc 1110011 (115) 000 I mret return from machine exception (rs1=0,rd=0,imm=770) PC = mepc 1110011 (115) 001 I csrrw rd,csr,rs1 CSR read/write (imm=CSR number) rd = csr,csr = rs1 1110011 (115) 010 I csrrs rd,csr,rs1 CSR read/set (imm=CSR number) rd = csr,csr = csr | rs1 1110011 (115) 011 I csrrc rd,csr,rs1 CSR read/clear (imm=CSR number) rd = csr,csr = csr & ~rs1 1110011 (115) 101 I csrrwi rd,csr,uimm CSR read/write immediate (imm=CSR number) rd = csr,csr = ZeroExt(uimm) 1110011 (115) 110 I csrrsi rd,csr,uimm CSR read/set immediate (imm=CSR number) rd = csr, csr = csr | ZeroExt(uimm) 1110011 (115) 111 I csrrci rd,csr,uimm CSR read/clear immediate (imm=CSR number) rd = csr, csr = csr & ~ZeroExt(uimm) For privileged / CSR instructions, the 5-bit unsigned immediate, uimm, is encoded in the rs1 field. In Praise of Digital Design and Computer Architecture RISC-V Edition Harris and Harris have shown how to design a RISC-V processor from the gates all the way up through the microarchitecture. Their clear explanations combined with their comprehensive approach give a full picture of both digital design and the RISC-V architecture. With the exciting opportunity that students have to run large digital designs on modern FPGAs, this approach is both informative and enlightening. David A. Patterson University of California, Berkeley What broad fields Harris and Harris have distilled into one book! As semiconductor manufacturing matures, the importance of digital design and computer architecture will only increase. Readers will find an approachable and comprehensive treatment of both topics and will walk away with a clear understanding of the RISC-V instruction set architecture. Andrew Waterman SiFive There are excellent texts for teaching digital design and there are excellent texts for teaching computer hardware organization - and this textbook does both! It is also unique in its ability to connect the dots. The writing makes the RISC-V architecture understandable by building on the basics, and I have found the exercises to be a great resource across multiple courses. Roy Kravitz Portland State University When I first read Harris and Harris’s MIPS textbook back in 2008, I thought that it was one of the best books I had ever read for teaching computer architecture. I started using it in my courses immediately. Thirteen years later, I have had the honor of reviewing this new RISC-V edition, and my opinion of their books has not changed: Digital Design and Computer Architecture: RISC-V Edition is an excellent book, very clear, thorough, with a high educational value, and in line with the courses that we teach in the areas of digital design and computer archi- tecture. I look forward to using this RISC-V textbook in my courses. Daniel Chaver Martinez University Complutense of Madrid Digital Design and Computer Architecture RISC-V Edition Digital Design and Computer Architecture RISC-V Edition Sarah L. Harris David Harris AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY TOKYO Morgan Kaufmann is an imprint of Elsevier Morgan Kaufmann is an imprint of Elsevier 50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States Copyright © 2022 Elsevier Inc. All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions. This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein). Notices Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary. Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility. To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein. British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library. Library of Congress Cataloging-in-Publication Data A catalog record for this book is available from the Library of Congress. ISBN: 978-0-12-820064-3 For Information on all Morgan Kaufmann publications visit our website at https://www.elsevier.com/books-and-journals Publisher: Katey Birtcher Senior Acquisitions Editor: Steve Merken Editorial Project Manager: Ruby Gammell/Andrae Akeh Senior Project Manager: Manikandan Chandrasekaran Cover Designer: Victoria Pearson Esser Typeset by MPS Limited, Chennai, India Printed in the United States of America Last digit is the print number: 9 8 7 6 5 4 3 2 1 Contents Preface................................................. xix Features............................................... xx Online Supplements.................................... xxi How to Use the Software Tools in a Course................. xxii Labs................................................ xxiii RVfpga.............................................. xxiii Bugs................................................ xxiv Acknowledgments..................................... xxiv About the Authors........................................ xxv Chapter 1 From Zero to One..................................... 1 1.1 The Game Plan.............................................. 1 1.2 The Art of Managing Complexity............................. 2 1.2.1 Abstraction 2 1.2.2 Discipline 3 1.2.3 The Three -Y’s 4 1.3 The Digital Abstraction 5 1.4 Number Systems 7 1.4.1 Decimal Numbers 7 1.4.2 Binary Numbers 7 1.4.3 Hexadecimal Numbers 9 1.4.4 Bytes, Nibbles, and All That Jazz 11 1.4.5 Binary Addition 12 1.4.6 Signed Binary Numbers 13 1.5 Logic Gates 17 1.5.1 NOT Gate 18 1.5.2 Buffer 18 1.5.3 AND Gate 18 1.5.4 OR Gate 19 1.5.5 Other Two-Input Gates 19 1.5.6 Multiple-Input Gates 19 1.6 Beneath the Digital Abstraction 20 1.6.1 Supply Voltage 20 1.6.2 Logic Levels 20 1.6.3 Noise Margins 21 ix x Contents 1.6.4 DC Transfer Characteristics 22 1.6.5 The Static Discipline 22 1.7 CMOS Transistors 24 1.7.1 Semiconductors 25 1.7.2 Diodes 25 1.7.3 Capacitors 26 1.7.4 nMOS and pMOS Transistors 26 1.7.5 CMOS NOT Gate 29 1.7.6 Other CMOS Logic Gates 29 1.7.7 Transmission Gates 31 1.7.8 Pseudo-nMOS Logic 31 1.8 Power Consumption 32 1.9 Summary and a Look Ahead 34 Exercises 36 Interview Questions 50 Chapter 2 Combinational Logic Design 53 2.1 Introduction 53 2.2 Boolean Equations 56 2.2.1 Terminology 56 2.2.2 Sum-of-Products Form 56 2.2.3 Product-of-Sums Form 58 2.3 Boolean Algebra 58 2.3.1 Axioms 59 2.3.2 Theorems of One Variable 59 2.3.3 Theorems of Several Variables 60 2.3.4 The Truth Behind It All 62 2.3.5 Simplifying Equations 63 2.4 From Logic to Gates 64 2.5 Multilevel Combinational Logic 67 2.5.1 Hardware Reduction 68 2.5.2 Bubble Pushing 69 2.6 X’s and Z’s, Oh My 71 2.6.1 Illegal Value: X 71 2.6.2 Floating Value: Z 72 2.7 Karnaugh Maps 73 2.7.1 Circular Thinking 74 2.7.2 Logic Minimization with K-Maps 75 2.7.3 Don’t Cares 79 2.7.4 The Big Picture 80 2.8 Combinational Building Blocks 81 2.8.1 Multiplexers 81 2.8.2 Decoders 84 Contents xi 2.9 Timing 86 2.9.1 Propagation and Contamination Delay 86 2.9.2 Glitches 90 2.10 Summary 93 Exercises 95 Interview Questions 104 Chapter 3 Sequential Logic Design.......................... 107 3.1 Introduction 107 3.2 Latches and Flip-Flops 107 3.2.1 SR Latch 109 3.2.2 D Latch 111 3.2.3 D FIip-Flop 112 3.2.4 Register 112 3.2.5 Enabled Flip-Flop 113 3.2.6 Resettable Flip-Flop 114 3.2.7 Transistor-Level Latch and Flip-Flop Designs 114 3.2.8 Putting It All Together 116 3.3 Synchronous Logic Design 117 3.3.1 Some Problematic Circuits 117 3.3.2 Synchronous Sequential Circuits 118 3.3.3 Synchronous and Asynchronous Circuits 120 3.4 Finite State Machines 121 3.4.1 FSM Design Example 121 3.4.2 State Encodings 127 3.4.3 Moore and Mealy Machines 130 3.4.4 Factoring State Machines 132 3.4.5 Deriving an FSM from a Schematic 135 3.4.6 FSM Review 138 3.5 Timing of Sequential Logic 139 3.5.1 The Dynamic Discipline 140 3.5.2 System Timing 140 3.5.3 Clock Skew 146 3.5.4 Metastability 149 3.5.5 Synchronizers 150 3.5.6 Derivation of Resolution Time 152 3.6 Parallelism 155 3.7 Summary 159 Exercises 160 Interview Questions 169 xii Contents Chapter 4 Hardware Description Languages.................. 171 4.1 Introduction 171 4.1.1 Modules 171 4.1.2 Language Origins 172 4.1.3 Simulation and Synthesis 173 4.2 Combinational Logic 175 4.2.1 Bitwise Operators 175 4.2.2 Comments and White Space 178 4.2.3 Reduction Operators 178 4.2.4 Conditional Assignment 179 4.2.5 Internal Variables 180 4.2.6 Precedence 182 4.2.7 Numbers 183 4.2.8 Z’s and X’s 184 4.2.9 Bit Swizzling 186 4.2.10 Delays 186 4.3 Structural Modeling 188 4.4 Sequential Logic 191 4.4.1 Registers 191 4.4.2 Resettable Registers 192 4.4.3 Enabled Registers 194 4.4.4 Multiple Registers 195 4.4.5 Latches 196 4.5 More Combinational Logic 196 4.5.1 Case Statements 199 4.5.2 If Statements 200 4.5.3 Truth Tables with Don’t Cares 203 4.5.4 Blocking and Nonblocking Assignments 203 4.6 Finite State Machines 207 4.7 Data Types 211 4.7.1 SystemVerilog 212 4.7.2 VHDL 213 4.8 Parameterized Modules 215 4.9 Testbenches 218 4.10 Summary 222 Exercises 224 Interview Questions 235 Chapter 5 Digital Building Blocks 237 5.1 Introduction 237 5.2 Arithmetic Circuits 237 5.2.1 Addition 237 5.2.2 Subtraction 244 Contents xiii 5.2.3 Comparators 245 5.2.4 ALU 247 5.2.5 Shifters and Rotators 251 5.2.6 Multiplication 253 5.2.7 Division 254 5.2.8 Further Reading 255 5.3 Number Systems 256 5.3.1 Fixed-Point Number Systems 256 5.3.2 Floating-Point Number Systems 257 5.4 Sequential Building Blocks 261 5.4.1 Counters 261 5.4.2 Shift Registers 262 5.5 Memory Arrays 265 5.5.1 Overview 265 5.5.2 Dynamic Random Access Memory (DRAM) 267 5.5.3 Static Random Access Memory (SRAM) 268 5.5.4 Area and Delay 268 5.5.5 Register Files 269 5.5.6 Read Only Memory (ROM) 269 5.5.7 Logic Using Memory Arrays 271 5.5.8 Memory HDL 272 5.6 Logic Arrays 272 5.6.1 Programmable Logic Array (PLA). 275 5.6.2 Field Programmable Gate Array (FPGA) 276 5.6.3 Array Implementations 282 5.7 Summary 283 Exercises 285 Interview Questions 297 Chapter 6 Architecture 299 6.1 Introduction 299 6.2 Assembly Language 300 6.2.1 Instructions 301 6.2.2 Operands: Registers, Memory, and Constants 302 6.3 Programming 308 6.3.1 Program Flow 308 6.3.2 Logical, Shift, and Multiply Instructions 308 6.3.3 Branching 311 6.3.4 Conditional Statements 313 6.3.5 Getting Loopy 315 6.3.6 Arrays 317 6.3.7 Function Calls 320 6.3.8 Pseudoinstructions 330 xiv Contents 6.4 Machine Language 332 6.4.1 R-Type Instructions 332 6.4.2 l-Type Instructions 334 6.4.3 S/B-Type Instructions 336 6.4.4 U/J-Type Instructions 338 6.4.5 Immediate Encodings 340 6.4.6 Addressing Modes 341 6.4.7 Interpreting Machine Language Code 342 6.4.8 The Power of the Stored Program 343 6.5 Lights, Camera, Action: Compiling, Assembling, and Loading 344 6.5.1 The Memory Map 344 6.5.2 Assembler Directives 346 6.5.3 Compiling 348 6.5.4 Assembling 350 6.5.5 Linking 353 6.5.6 Loading 355 6.6 Odds and Ends 355 6.6.1 Endianness 355 6.6.2 Exceptions 356 6.6.3 Signed and Unsigned Instructions 360 6.6.4 Floating-Point Instructions 361 6.6.5 Compressed Instructions 362 6.7 Evolution of the RISC-V Architecture 363 6.7.1 RISC-V Base Instruction Sets and Extensions 364 6.7.2 Comparison of RISC-V and MIPS Architectures 365 6.7.3 Comparison of RISC-V and ARM Architectures 365 6.8 Another Perspective: x86 Architecture 366 6.8.1 x86 Registers 366 6.8.2 x86 Operands 367 6.8.3 Status Flags 369 6.8.4 x86 Instructions 369 6.8.5 x86 Instruction Encoding 371 6.8.6 Other x86 Peculiarities 372 6.8.7 The Big Picture 373 6.9 Summary 374 Exercises 375 Interview Questions 390 Chapter 7 Microarchitecture 393 7.1 Introduction 393 7.1.1 Architectural State and Instruction Set 393 7.1.2 Design Process 394 7.1.3 Microarchitectures 396 Contents xv 7.2 Performance Analysis 397 7.3 Single-Cycle Processor 398 7.3.1 Sample Program 399 7.3.2 Single-Cycle Datapath 399 7.3.3 Single-Cycle Control 407 7.3.4 More Instructions 410 7.3.5 Performance Analysis 412 7.4 Multicycle Processor 415 7.4.1 Multicycle Datapath 416 7.4.2 Multicycle Control 422 7.4.3 More Instructions 432 7.4.4 Performance Analysis 435 7.5 Pipelined Processor 439 7.5.1 Pipelined Datapath 441 7.5.2 Pipelined Control 443 7.5.3 Hazards 443 7.5.4 Performance Analysis 454 7.6 HDL Representation 456 7.6.1 Single-Cycle Processor 457 7.6.2 Generic Building Blocks 461 7.6.3 Testbench 464 7.7 Advanced Microarchitecture 468 7.7.1 Deep Pipelines 468 7.7.2 Micro-Operations 469 7.7.3 Branch Prediction 470 7.7.4 Superscalar Processors 472 7.7.5 Out-of-Order Processor 473 7.7.6 Register Renaming 476 7.7.7 Multithreading 478 7.7.8 Multiprocessors 479 7.8 Real-World Perspective: Evolution of RISC-V Microarchitecture 482 7.9 Summary 486 Exercises 488 Interview Questions 497 Chapter 8 Memory Systems 499 8.1 Introduction 499 8.2 Memory System Performance Analysis 503 8.3 Caches 505 8.3.1 What Data is Held in the Cache? 505 8.3.2 How is Data Found? 506 8.3.3 What Data is Replaced? 514 8.3.4 Advanced Cache Design 515 xvi Contents 8.4 Virtual Memory 519 8.4.1 Address Translation 522 8.4.2 The Page Table 523 8.4.3 The Translation Lookaside Buffer 525 8.4.4 Memory Protection 526 8.4.5 Replacement Policies 527 8.4.6 Multilevel Page Tables 527 8.5 Summary 530 Epilogue 530 Exercises 532 Interview Questions 541 Chapter 9 Embedded I/O Systems 542 9.1 Introduction 542 Chapter 9 is available as an online supplement 542.e1 9.1 Introduction 542.e1 9.2 Memory-Mapped I/O 542.e1 9.3 Embedded I/O Systems 542.e3 9.3.1 RED-V Board 542.e3 9.3.2 FE310-G002 System-on-Chip 542.e5 9.3.3 General-Purpose Digital I/O 542.e5 9.3.4 Device Drivers 542.e10 9.3.5 Serial I/O 542.e14 9.3.6 Timers 542.e29 9.3.7 Analog I/O 542.e30 9.3.8 Interrupts 542.e39 9.4 Other Microcontroller Peripherals 542.e43 9.4.1 Character LCDs 542.e43 9.4.2 VGA Monitor 542.e45 9.4.3 Bluetooth Wireless Communication 542.e53 9.4.4 Motor Control 542.e54 9.5 Summary 542.e64 Appendix A Digital System Implementation 543 A.1 Introduction 543 Appendix A is available as an online supplement 543.e1 A.1 Introduction 543.e1 A.2 74xx Logic 543.e1 A.2.1 Logic Gates 543.e2 A.2.2 Other Functions 543.e2 Contents xvii A.3 Programmable Logic 543.e2 A.3.1 PROMs 543.e2 A.3.2 PLAs 543.e6 A.3.3 FPGAs 543.e7 A.4 Application-Specific Integrated Circuits 543.e9 A.5 Datasheets 543.e9 A.6 Logic Families 543.e15 A.7 Switches and Light-Emitting Diodes 543.e17 A.7.1 Switches 543.e17 A.7.2 LEDs 543.e18 A.8 Packaging and Assembly 543.e19 A.8.1 Packages 543.e19 A.8.2 Breadboards 543.e20 A.8.3 Printed Circuit Boards 543.e20 A.8.4 Putting It All Together 543.e23 A.9 Transmission Lines 543.e23 A.9.1 Matched Termination 543.e24 A.9.2 Open Termination 543.e26 A.9.3 Short Termination 543.e27 A.9.4 Mismatched Termination 543.e27 A.9.5 When to Use Transmission Line Models 543.e30 A.9.6 Proper Transmission Line Terminations 543.e30 A.9.7 Derivation of Z0 543.e32 A.9.8 Derivation of the Reflection Coefficient 543.e33 A.9.9 Putting It All Together 543.e34 A.10 Economics 543.e35 Appendix B RISC-V Instruction Set Summary 544 Appendix C C Programming 545 C.1 Introduction 545 Appendix C is available as an online supplement 545.e1 C.1 Introduction 545.e1 C.2 Welcome to C 545.e3 C.2.1 C Program Dissection 545.e3 C.2.2 Running a C Program 545.e4 C.3 Compilation 545.e5 C.3.1 Comments 545.e5 C.3.2 #define 545.e5 C.3.3 #include 545.e6 xviii Contents C.4 Variables 545.e7 C.4.1 Primitive Data Types 545.e8 C.4.2 Global and Local Variables 545.e9 C.4.3 Initializing Variables 545.e11 C.5 Operators 545.e11 C.6 Function Calls 545.e15 C.7 Control-Flow Statements 545.e16 C.7.1 Conditional Statements 545.e17 C.7.2 Loops 545.e19 C.8 More Data Types 545.e21 C.8.1 Pointers 545.e21 C.8.2 Arrays 545.e23 C.8.3 Characters 545.e27 C.8.4 Strings 545.e27 C.8.5 Structures 545.e29 C.8.6 typedef 545.e31 C.8.7 Dynamic Memory Allocation 545.e32 C.8.8 Linked Lists 545.e33 C.9 Standard Libraries 545.e35 C.9.1 stdio 545.e35 C.9.2 stdlib 545.e40 C.9.3 math 545.e42 C.9.4 string 545.e42 C.10 Compiler and Command Line Options 545.e43 C.10.1 Compiling Multiple C Source Files 545.e43 C.10.2 Compiler Options 545.e43 C.10.3 Command Line Arguments 545.e44 C.11 Common Mistakes 545.e44 Further Reading.......................................... 547 Index................................................... 549 Preface This book is unique in its treatment in that it presents digital logic design from the perspective of computer architecture, starting at the beginning with 1’s and 0’s and leading through to the design of a microprocessor. We believe that building a microprocessor is a special rite of passage for engineering and computer science students. The inner workings of a processor seem almost magical to the uninitiated yet prove to be straightforward when carefully explained. Digital design in and of itself is a powerful and exciting subject. Assembly language programming unveils the inner language spoken by the processor. Microarchitecture is the link that brings it all together. The first two versions of this increasingly popular text cover the MIPS and ARM architectures. As one of the original Reduced Instruction Set Computing architectures, MIPS is clean and excep- tionally easy to understand and build. MIPS remains an important architecture, as it has inspired many of the subsequent architectures, including RISC-V. The ARM architecture has exploded in popu- larity over the past several decades because of its efficiency and rich ecosystem. More than 50 billion ARM processors have been shipped, and more than 75% of humans on the planet use products with ARM processors. Over the past decade, RISC-V has emerged as an increasingly important architecture, both pedagogically and commercially. As the first widely used open-source computer architecture, RISC-V offers the simplicity of MIPS with the flexibility and features of modern processors. Pedagogically, the learning objectives of the MIPS, ARM, and RISC-V editions are identical. The RISC-V architecture has a number of features, including extendibility and compressed instructions, that con- tribute to its efficiency but add a small amount of complexity. The three microarchitectures are also similar, with MIPS and RISC-V architectures sharing many similarities. We expect to offer MIPS, ARM, and RISC-V editions as long as the market demands. xix xx Preface FEATURES Side-by-Side Coverage of SystemVerilog and VHDL Hardware description languages (HDLs) are at the center of modern digital design practices. Unfortunately, designers are evenly split between the two dominant languages, SystemVerilog and VHDL. This book introduces HDLs in Chapter 4 as soon as combinational and sequential logic design has been covered. HDLs are then used in Chapters 5 and 7 to design larger building blocks and entire processors. Nevertheless, Chapter 4 can be skipped and the later chapters are still accessible for courses that choose not to cover HDLs. This book is unique in its side-by-side presentation of SystemVerilog and VHDL, enabling the reader to learn the two languages. Chapter 4 describes principles that apply to both HDLs, and then provides language- specific syntax and examples in adjacent columns. This side-by-side treatment makes it easy for an instructor to choose either HDL and for the reader to transition from one to the other, either in a class or in professional practice. RISC-V Architecture and Microarchitecture Chapters 6 and 7 offer in-depth coverage of the RISC-V architecture and microarchitecture. RISC-V is an ideal architecture because it is a real architecture shipped in an increasing number of commercial products, yet it is streamlined and easy to learn. Moreover, because of its popularity in the commercial and hobbyist worlds, simulation and development tools exist for the RISC-V architecture. Real-World Perspectives In addition to the real-world perspective in discussing the RISC-V architecture, Chapter 6 illustrates the architecture of Intel x86 pro- cessors to offer another perspective. Chapter 9 (available as an online supplement) also describes peripherals in the context of SparkFun’s RED-V RedBoard, a popular development board that centers on SiFive’s Freedom E310 RISC-V processor. These real-world perspective chapters show how the concepts in the chapters relate to the chips found in many PCs and consumer electronics. Accessible Overview of Advanced Microarchitecture Chapter 7 includes an overview of modern high-performance micro architectural features, including branch prediction, superscalar, and out-of-order operation, multithreading, and multicore processors. The Preface xxi treatment is accessible to a student in a first course and shows how the microarchitectures in the book can be extended to modern processors. End-of-Chapter Exercises and Interview Questions The best way to learn digital design is to do it. Each chapter ends with numerous exercises to practice the material. The exercises are followed by a set of interview questions that our industrial colleagues have asked students who are applying for work in the field. These questions pro- vide a helpful glimpse into the types of problems that job applicants will typically encounter during the interview process. Exercise solutions are available via the book’s companion and instructor websites. ONLINE SUPPLEMENTS Supplementary materials are available online at ddcabook.com or the publisher’s website: https://www.elsevier.com/books-and-journals/ book-companion/9780128200643. These companion sites (accessible to all readers) include the following: ▸ Links to video lectures ▸ Solutions to odd-numbered exercises ▸ Figures from the text in PDF and PPTX formats ▸ Links to professional-strength computer-aided design (CAD) tools from Intel® ▸ Instructions on how to use PlatformIO (an extension of Visual Studio Code) to compile, assemble, and simulate C and assembly code for RISC-V processors ▸ Hardware description language (HDL) code for the RISC-V processor ▸ Intel’s Quartus helpful hints ▸ Lecture slides in PowerPoint (PPTX) format ▸ Sample course and laboratory materials ▸ List of errata The instructor site (accessible to instructors who register at https:// inspectioncopy.elsevier.com) includes the following: ▸ Solutions to all exercises ▸ Laboratory solutions EdX MOOC This book also has a companion Massive Open Online Course (MOOC) through EdX. The course includes video lectures, interactive practice xxii Preface problems, and interactive problem sets and labs. The MOOC is divided into two parts: Digital Design (ENGR 85A) and Computer Architecture (ENGR85B) offered by HarveyMuddX (on EdX, search for “Digital Design HarveyMuddX” and “Computer Architecture HarveyMuddX”). EdX does not charge for access to the videos but does charge for the inter- active exercises and certificate. EdX offers discounts for students with financial need. HOW TO USE THE SOFTWARE TOOLS IN A COURSE Intel’s Quartus Software The Quartus software, either Web or Lite Edition, is a free version of Intel’s professional-strength Quartus™ FPGA design tools. It allows students to enter their digital designs in schematic or using either the SystemVerilog or the VHDL hardware description language (HDL). After entering the design, students can simulate their circuits using the ModelSim™-Intel FPGA Edition or Starter Edition, which is available with Intel’s Quartus software. Quartus also includes a built-in logic synthesis tool that supports both SystemVerilog and VHDL. The difference between the Web or Lite Edition and the Pro Edition is that the Web or Lite Edition supports a subset of the most common Altera FPGAs. The free versions of ModelSim degrade performance for simulations with more than 10,000 lines of HDL, whereas the professional version of ModelSim does not. PlatformIO PlatformIO, which is an extension of Visual Studio Code, serves as a soft- ware development kit (SDK) for RISC-V. With the explosion of SDKs for each new platform, PlatformIO has streamlined the process of program- ming and using various processors by providing a unified interface for a large number of platforms and devices. It is available as a free download and can be used with SparkFun’s RED-V RedBoard, as described in the labs provided on the companion website. PlatformIO provides access to a commercial RISC-V compiler and allows students to write both C and assembly programs, compile them, and then run and debug them on SparkFun’s RED-V RedBoard (see Chapter 9 and the accompanying labs). Venus RISC-V Assembly Simulator The Venus Simulator (available at: https://www.kvakil.me/venus/) is a web-based RISC-V assembly simulator. Programs are written (or copy/ pasted) in the Editor tab and then simulated and run in the Simulator tab. Registers and memory contents can be viewed as the program runs. Preface xxiii LABS The companion site includes links to a series of labs that cover topics from digital design through computer architecture. The labs teach stu- dents how to use the Quartus tools to enter, simulate, synthesize, and implement their designs. The labs also include topics on C and assem- bly language programming using PlatformIO and SparkFun’s RED-V RedBoard. After synthesis, students can implement their designs using the Altera DE2, DE2-115, DE0, or other FPGA board. The labs are written to target the DE2 or DE-115 boards. These powerful and competitively priced boards are available from de2-115.terasic.com. The board con- tains an FPGA that can be programmed to implement student designs. We provide labs that describe how to implement a selection of designs on the DE2-115 board using the Quartus software. To run the labs, students will need to download and install Intel’s Quartus Web or Lite Edition and Visual Studio Code with the PlatformIO extension. Instructors may also choose to install the tools on lab machines. The labs include instructions on how to implement the projects on the DE2/DE2-115 board. The implementation step may be skipped, but we have found it of great value. We have tested the labs on Windows, but the tools are also available for Linux. RVfpga RISC-V FPGA, also referred to as RVfpga, is a free two-course sequence that can be completed after learning the material in this book. The first course shows how to target a commercial RISC-V core to an FPGA, pro- gram it using RISC-V assembly or C, add peripherals to it, and analyze and modify the core and memory system, including adding instructions to the core. This course uses the open-source SweRVolf system-on-chip (SoC) (https://github.com/chipsalliance/Cores-SweRVolf), which is based on Western Digital’s open-source commercial SweRV EH1 core (https:// www.westerndigital.com/company/innovations/risc-v). The course also shows how to use Verilator, an open-source HDL simulator, and Western Digital’s Whisper, an open-source RISC-V instruction set simulator (ISS). RVfpga-SoC, the second course, shows how to build an SoC based on SweRVolf using building blocks such as the SweRV EH1 core, inter- connect, and memories. The course then guides the user in loading and running the Zephyr operating system on the RISC-V SoC. All neces- sary software and system source code (Verilog/SystemVerilog files) are free, and the courses may be completed in simulation, so no hardware is required. RVfpga materials are freely available with registration from the Imagination Technologies University Programme: https://university. imgtec.com/rvfpga/. xxiv Preface BUGS As all experienced programmers know, any program of significant com- plexity undoubtedly contains bugs. So, too, do books. We have taken great care to find and squash the bugs in this book. However, some errors undoubtedly do remain. We will maintain a list of errata on the book’s webpage. Please send your bug reports to [email protected]. The first person to report a substantive bug with a fix that we use in a future printing will be rewarded with a $1 bounty! ACKNOWLEDGMENTS We appreciate the hard work of Steve Merken, Nate McFadden, Ruby Gammell, Andrae Akeh, Manikandan Chandrasekaran, and the rest of the team at Morgan Kaufmann who made this book happen. We love the art of Duane Bibby, whose cartoons enliven the chapters. We thank Matthew Watkins, who contributed the section on Heterogeneous Multiprocessors in Chapter 7, and Josh Brake, who contributed to Chapter 9 on Embedded I/O Systems. We greatly appre- ciate the work of Mateo Markovic and Geordie Ryder, who reviewed the book and contributed to the exercise solutions. Numerous other reviewers also substantially improved the book. They include Daniel Chaver Martinez, Roy Kravitz, Zvonimir Bandic, Giuseppe Di Luna, Steffen Paul, Ravi Mittal, Jennifer Winikus, Hesham Omran, Angel Solis, Reiner Dizon, and Olof Kindgren. We also appreciate the students in our courses at Harvey Mudd College and UNLV who have given us help- ful feedback on drafts of this textbook. And last, but not least, we both thank our families for their love and support. About the Authors Sarah L. Harris is an Associate Professor of Electrical and Computer Engineering at the University of Nevada, Las Vegas. She received her Ph.D. and M.S. in Electrical Engineering from Stanford University. Sarah has also worked with Hewlett-Packard, the San Diego Supercomputer Center, and NVIDIA. Sarah loves teaching, exploring and developing new technologies, traveling, playing the guitar and various other instru- ments, and spending time with her family. Her recent research includes designing bio-inspired prosthetics and implementing machine-learning algorithms in hardware. David Harris is the Harvey S. Mudd Professor of Engineering Design and Associate Department Chair at Harvey Mudd College. He received his Ph.D. in electrical engineering from Stanford University and his M.Eng. in electrical engineering and computer science from MIT. Before attending Stanford, he worked at Intel as a logic and circuit designer on the Itanium and Pentium II processors. Since then, he has consulted at Broadcom, Sun Microsystems, Hewlett-Packard, Evans & Sutherland, and other design companies. David’s passions include teaching, build- ing chips and airplanes, and exploring the outdoors. When he is not at work, he can usually be found hiking, flying, and spending time with his family. David holds about a dozen patents and is the author of three other textbooks on chip design, as well as seven guidebooks to the Southern California mountains. xxv From Zero to One 1 1. 1 The Game Plan 1. 2 The Art of Managing Complexity 1. 3 The Digital Abstraction 1. 4 Number Systems 1. 5 Logic Gates 1. 6 Beneath the Digital 1.1 THE GAME PLAN Abstraction 1. 7 CMOS Transistors* Microprocessors have revolutionized our world during the past three decades. A laptop computer today has far more capability than a 1. 8 Power Consumption* room-sized mainframe of yesteryear. A luxury automobile contains 1. 9 Summary and a Look Ahead about 100 microprocessors. Advances in microprocessors have made Exercises cell phones and the Internet possible, have vastly improved medicine, Interview Questions and have transformed how war is waged. Worldwide semiconductor industry sales have grown from US $21 billion in 1985 to $400 billion in 2020, and microprocessors are a major segment of these sales. We believe that microprocessors are not only technically, economi- Application >”hello cally, and socially important, but are also an intrinsically fascinating Software world!” human invention. By the time you finish reading this book, you will Operating know how to design and build your own microprocessor. The skills Systems you learn along the way will prepare you to design many other digital systems. Architecture We assume that you have a basic familiarity with electricity, some prior programming experience, and a genuine interest in understanding Micro- what goes on under the hood of a computer. This book focuses on the architecture design of digital systems, which operate on 1’s and 0’s. We begin with Logic + digital logic gates that accept 1’s and 0’s as inputs and produce 1’s and 0’s as outputs. We then explore how to combine logic gates into more Digital complicated modules, such as adders and memories. Then, we shift gears Circuits to programming in assembly language, the native tongue of the micro- processor. Finally, we put gates together to build a microprocessor that Analog + Circuits − runs these assembly language programs. A great advantage of digital systems is that the building blocks are quite simple: just 1’s and 0’s. They do not require grungy mathematics or Devices a profound knowledge of physics. Instead, the designer’s challenge is to combine these simple blocks into complicated systems. A microprocessor Physics may be the first system that you build that is too complex to fit in your Digital Design and Computer Architecture, RISC-V Edition. DOI: 10.1016/B978-0-12-820064-3.00001-5 1 Copyright © 2022 Elsevier Inc. All rights reserved. 2 CHAPTER ONE From Zero to One head all at once. One of the major themes woven through this book is how to manage complexity. 1.2 THE ART OF MANAGING COMPLEXITY One of the characteristics that separates an engineer or computer scien- tist from a layperson is a systematic approach to managing complexity. Modern digital systems are built from millions or billions of transistors. No human being could understand these systems by writing equations describing the movement of electrons in each transistor and solving all of the equations simultaneously. You will need to learn to manage com- plexity to understand how to build a microprocessor without getting mired in a morass of detail. 1. 2. 1 Abstraction The critical technique for managing complexity is abstraction: hid- ing details when they are not important. A system can be viewed from many different levels of abstraction. For example, American politicians abstract the world into cities, counties, states, and countries. A county contains multiple cities and a state contains many counties. When a politician is running for president, the politician is mostly interested in Application >”hello how the state as a whole will vote rather than how each county votes, so world!” Programs Software the state is the most useful level of abstraction. On the other hand, the Operating Device Census Bureau measures the population of every city, so the agency must Systems Drivers consider the details of a lower level of abstraction. Instructions Figure 1.1 illustrates levels of abstraction for an electronic computer Architecture Registers system, along with typical building blocks at each level. At the lowest Micro- Datapaths level of abstraction is the physics, the motion of electrons. The behav- architecture Controllers ior of electrons is described by quantum mechanics and Maxwell’s equations. Our system is constructed from electronic devices such as + Adders Logic Memories transistors (or vacuum tubes, once upon a time). These devices have well-defined connection points called terminals and can be modeled by Digital AND Gates Circuits NOT Gates the relationship between voltage and current as measured at each termi- nal. By abstracting to this device level, we can ignore the individual elec- Analog + Amplifiers Circuits − Filters trons. The next level of abstraction is analog circuits, in which devices are assembled to create components such as amplifiers. Analog circuits Transistors Devices Diodes input and output a continuous range of voltages. Digital circuits, such as logic gates, restrict the voltages to discrete ranges, which we will use to Physics Electrons indicate 0 and 1. In logic design, we build more complex structures, such as adders or memories, from digital circuits. Figure 1.1 Levels of abstraction Microarchitecture links the logic and architecture levels of abstrac- for an electronic computing tion. The architecture level of abstraction describes a computer from system the programmer’s perspective. For example, the Intel x86 architecture 1.2 The Art of Managing Complexity 3 used by microprocessors in most personal computers (PCs) is defined by a set of instructions and registers (memory for temporarily storing variables) that the programmer is allowed to use. Microarchitecture involves combining logic elements to execute the instructions defined by the architecture. A particular architecture can be implemented by one of many different microarchitectures with different price/perfor- mance/power trade-offs. For example, the Intel Core i7, the Intel 80486, and the AMD Athlon all implement the x86 architecture with different microarchitectures. Moving into the software realm, the operating system handles low-level details, such as accessing a hard drive or managing memory. Finally, the application software uses these facilities provided by the operating system to solve a problem for the user. Thanks to the power of abstraction, your grandmother can surf the Web without any regard for the quantum vibrations of electrons or the organization of the memory in her computer. This book focuses on the levels of abstraction from digital circuits Each chapter in this book through computer architecture. When you are working at one level of begins with an abstraction abstraction, it is good to know something about the levels of abstraction icon indicating the focus of immediately above and below where you are working. For example, a the chapter in deep blue, with computer scientist cannot fully optimize code without understanding the secondary topics shown in lighter shades of blue. architecture for which the program is being written. A device engineer cannot make wise trade-offs in transistor design without understanding the circuits in which the transistors will be used. We hope that by the time you finish reading this book, you can pick the level of abstraction appropriate to solving your problem and evaluate the impact of your Ford launched the Model T design choices on other levels of abstraction. with a bold manifesto: I will build a motor car for the great multitude. It will be 1. 2. 2 Discipline large enough for the family, Discipline is the act of intentionally restricting your design choices so but small enough for the individual to run and care that you can work more productively at a higher level of abstraction. for. It will be constructed of Using interchangeable parts is a familiar application of discipline. One the best materials, by the best of the famous early examples of interchangeable parts was in automo- men to be hired, after the bile manufacturing. Although the modern gas-powered car dates back to simplest designs that modern the German Benz Patent-Motorwagen of 1886, early cars were hand- engineering can devise. But it will be so low in price that crafted by skilled tradesmen, a time-consuming and expensive process. no man making a good salary Henry Ford made a key advance in 1908 by focusing on mass produc- will be unable to own one— tion with interchangeable parts and moving assembly lines. and enjoy with his family the The discipline of interchangeable parts revolutionized the indus- blessing of hours of pleasure try. By limiting the components to a standardized set with well-defined in God’s great open spaces. [My Life and Work, by tolerances, cars could be assembled and repaired much faster and with Samuel Crowther and Henry less skill. The car builder no longer concerned himself with lower lev- Ford, 1922] els of abstraction, such as fitting a door to a nonstandardized opening. 4 CHAPTER ONE From Zero to One The Model T Ford became the most-produced car of its era, selling over 15 million units. Another example of discipline was Ford’s famous say- ing: “Any customer can have a car painted any color that he wants so long as it is black.” In the context of this book, the digital discipline will be very import- ant. Digital circuits use discrete voltages, whereas analog circuits use continuous voltages. Therefore, digital circuits are a subset of analog cir- Model T Ford Photo from cuits and in some sense must be capable of less than the broader class https://commons.wikimedia. of analog circuits. However, digital circuits are much simpler to design. org/wiki/File: TModel_ launch_Geelong.jpg. By limiting ourselves to digital circuits, we can easily combine compo- nents into sophisticated systems that ultimately outperform those built from analog components in many applications. For example, digital tele- visions, compact disks (CDs), and cell phones are replacing their analog predecessors. 1. 2. 3 The Three -Y’s In addition to abstraction and discipline, designers use the three “-y’s” to manage complexity: hierarchy, modularity, and regularity. These princi- ples apply to both software and hardware systems. ▸ Hierarchy involves dividing a system into modules, then fur- ther subdividing each of these modules until the pieces are easy to understand. ▸ Modularity states that modules have well-defined functions and interfaces so that they connect easily without unanticipated side effects. ▸ Regularity seeks uniformity among modules. Common modules are reused many times, reducing the number of distinct modules that must be designed. To illustrate these “-y’s,” we return to the example of automobile manufacturing. A car was one of the most intricate objects in common use in the early 20th century. Using the principle of hierarchy, we can break the Model T Ford into components, such as the chassis, engine, and seats. The engine, in turn, contained four cylinders, a carburetor, a magneto, and a cooling system. The carburetor contained fuel and air intakes, a choke and throttle, a needle valve, and so forth, as shown in Figure 1.2. The fuel intake was made from a threaded elbow and a cou- pling nut. Thus, the complex system is recursively broken down into simple interchangeable components that can be mass produced. Modularity teaches that each component should have a well-defined function and interface. For example, the coupling nut has a function of holding the fuel feed line to the intake elbow in a way that does not 1.3 The Digital Abstraction 5 Figure 1.2 Cutaway view of Model T fuel system, showing fuel supply on left and carburetor on right. (https://en.wikipedia.org/ wiki/Ford_Model_T_engine#/ media/File:Pagé_1917_Model_T_ Ford_Car_Figure_14.png) leak yet can be easily removed when the feed line needs replacement. It is of a standardized diameter and thread pitch, tightened to a stan- dardized torque by a standardized wrench. A car maker can buy the nut from many different suppliers, as long as the correct size is specified. Modularity dictates that there should be no side effects: the coupling nut should not deform the elbow and should preferably be located where it can be tightened or removed without taking off other equipment in the engine. Regularity teaches that interchangeable parts are a good idea. With regularity, a leaking carburetor can be replaced by an identical part. The carburetors can be efficiently built on an assembly line instead of being painstakingly handcrafted. We will return to these principles of hierarchy, modularity, and regu- larity throughout the book. 1.3 THE DIGITAL ABSTRACTION Most physical variables are continuous. For example, the voltage on a wire, the frequency of an oscillation, or the position of a mass are all continuous quantities. Digital systems, on the other hand, represent Charles Babbage, 1791–1871 information with discrete-valued variables—that is, variables with a Attended Cambridge University finite number of distinct values. and married Georgiana An early digital system using variables with ten discrete values was Whitmore in 1814. Invented the Analytical Engine, the Charles Babbage’s Analytical Engine. Babbage labored from 1834 to world’s first mechanical 1871, designing and attempting to build this mechanical computer. The computer. Also invented the Analytical Engine used gears with ten positions labeled 0 through 9, cowcatcher and the universal much like a mechanical odometer in a car. Figure 1.3 shows a prototype postage rate. Interested in lock of the Analytical Engine, in which each row processes one digit. Babbage picking, but abhorred street musicians (image courtesy of chose 25 rows of gears, so the machine has 25-digit precision. Fourmilab Switzerland, www. Unlike Babbage’s machine, most electronic computers use a binary fourmilab.ch). (two-valued) representation in which a high voltage indicates a “1” and 6 CHAPTER ONE From Zero to One Figure 1.3 Babbage’s Analytical Engine, under construction at the time of his death in 1871 (image courtesy of Science Museum/ Science and Society Picture Library) a low voltage indicates a “0,” because it is easier to distinguish between two voltages than ten. The amount of information D in a discrete valued variable with N distinct states is measured in units of bits as D = log 2 N bits (1.1) A binary variable conveys log22 = 1 bit of information. Indeed, the word bit is short for binary digit. Each of Babbage’s gears carried log210 = 3.322 bits of information because it could be in one of 23.322 = 10 unique positions. A continuous signal theoretically contains an infinite amount of information because it can take on an infinite number of values. In practice, noise and measurement error limit the information to only 10 to 16 bits for most continuous signals. If the measurement must be made rapidly, the information content is lower (e.g., 8 bits). This book focuses on digital circuits using binary variables: 1’s and 0’s. George Boole developed a system of logic operating on binary vari- George Boole, 1815–1864 Born to working-class parents ables that is now known as Boolean logic. Each of Boole’s variables and unable to afford a formal could be TRUE or FALSE. Electronic computers commonly use a pos- education, Boole taught itive voltage to represent “1” and zero volts to represent “0.” In this himself mathematics and book, we will use the terms “1,” “TRUE,” and “HIGH” synonymously. joined the faculty of Queen’s Similarly, we will use “0”, “FALSE,” and “LOW” interchangeably. College in Ireland. He wrote An Investigation of the Laws The beauty of the digital abstraction is that digital designers can of Thought (1854), which focus on 1’s and 0’s, ignoring whether the Boolean variables are physically introduced binary variables represented with specific voltages, rotating gears, or even hydraulic fluid and the three fundamental logic levels. A computer programmer can work without needing to know the operations: AND, OR, and intimate details of the computer hardware. On the other hand, under- NOT (image courtesy of the American Institute of Physics). standing the details of the hardware allows the programmer to optimize the software better for that specific computer. 1.4 Number Systems 7 An individual bit doesn’t carry much information. In the next sec- tion, we examine how groups of bits can be used to represent numbers. In later chapters, we will also use groups of bits to represent letters and programs. 1.4 NUMBER SYSTEMS You are accustomed to working with decimal numbers. In digital sys- tems consisting of 1’s and 0’s, binary or hexadecimal numbers are often more convenient. This section introduces the various number systems that will be used throughout the rest of the book. 1. 4. 1 Decimal Numbers In elementary school, you learned to count and do arithmetic in decimal. Just as you (probably) have ten fingers, there are ten decimal digits: 0, 1, 2, …, 9. Decimal digits are joined together to form longer decimal numbers. Each column of a decimal number has ten times the weight of the previous column. From right to left, the column weights are 1, 10, 100, 1000, and so on. Decimal numbers are referred to as base 10. The base is indicated by a subscript after the number to pre- vent confusion when working in more than one base. For example, Figure 1.4 shows how the decimal number 974210 is written as the sum of each of its digits multiplied by the weight of the correspond- ing column. An N-digit decimal number represents one of 10N possibilities: 0, 1, 2, 3, …, 10N − 1. This is called the range of the number. For example, a three-digit decimal number represents one of 1000 possibilities in the range of 0 to 999. 1. 4. 2 Binary Numbers Bits represent one of two values, 0 or 1, and are joined together to form binary numbers. Each column of a binary number has twice the weight of the previous column, so binary numbers are base 2. In binary, the col- umn weights (again, from right to left) are 1, 2, 4, 8, 16, 32, 64, 128, 1000's column 100's column 10's column 1's column Figure 1.4 Representation of a decimal number 974210 = 9 × 10 + 7 × 10 + 4 × 10 + 2 × 10 3 2 1 0 nine seven four two thousands hundreds tens ones 8 CHAPTER ONE From Zero to One 256, 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536, and so on. If you work with binary numbers often, you’ll save time if you remember these powers of two up to 216. An N-bit binary number represents one of 2N possibilities: 0, 1, 2, 3, …, 2N − 1. Table 1.1 shows 1-, 2-, 3-, and 4-bit binary numbers and their decimal equivalents. Example 1.1 BINARY TO DECIMAL CONVERSION Convert the binary number 101102 to decimal. Solution Figure 1.5 shows the conversion. Table 1.1 Binary numbers and their decimal equivalent 1-Bit 2-Bit 3-Bit 4-Bit Binary Binary Binary Binary Decimal Numbers Numbers Numbers Numbers Equivalents 0 00 000 0000 0 1 01 001 0001 1 10 010 0010 2 11 011 0011 3 100 0100 4 101 0101 5 110 0110 6 111 0111 7 1000 8 1001 9 1010 10 1011 11 1100 12 1101 13 1110 14 1111 15 1.4 Number Systems 9 16's column 8's column 4's column 2's column 1's column Figure 1.5 Conversion of a binary number to decimal 10110 2 = 1 × 24 + 0 × 23 + 1 × 22 + 1 × 21+ 0 × 2 0 = 2210 one no one one no sixteen eight four two one Example 1.2 DECIMAL TO BINARY CONVERSION Convert the decimal number 8410 to binary. Solution Determine whether each column of the binary result has a 1 or a 0. We can do this starting at either the left or the right column. Working from the left, start with the largest power of 2 less than or equal to the number (in this case, 64). 84 ≥ 64, so there is a 1 in the 64’s column, leaving 84 − 64 = 20. 20 < 32, so there is a 0 in the 32’s column. 20 ≥ 16, so there is a 1 in the 16’s column, leaving 20 − 16 = 4. 4 < 8, so there is a 0 in the 8’s column. 4 ≥ 4, so there is a 1 in the 4’s column, leaving 4 − 4 = 0. Thus, there must be 0 s in the 2’s and 1’s column. Putting this all together, 8410 = 10101002. Working from the right, repeatedly divide the number by 2. The remainder goes in each column. 84/2 = 42, so 0 goes in the 1’s column. 42/2 = 21, so 0 goes in the 2’s column. 21/2 = 10 with the remainder of 1 going in the 4’s column. 10/2 = 5, so 0 goes in the 8’s column. 5/2 = 2 with the remainder of 1 going in the 16’s column. 2/2 = 1, so 0 goes in the 32’s column. Finally, 1/2 = 0 with the remain- der of 1 going in the 64’s column. Again, 8410 = 10101002. 1. 4. 3 Hexadecimal Numbers Hexadecimal, a term coined Writing long binary numbers becomes tedious and prone to error. A group by IBM in 1963, derives from of four bits represents one of 24 = 16 possibilities. Hence, it is sometimes the Greek hexi (six) and Latin decem (ten). A more proper more convenient to work in base 16, called hexadecimal. Hexadecimal term would use the Latin numbers use the digits 0 to 9 along with the letters A to F, as shown in sexa (six), but sexadecimal Table 1.2. Columns in base 16 have weights of 1, 16, 162 (or 256), 163 sounded too risqué. (or 4096), and so on. Example 1.3 HEXADECIMAL TO BINARY AND DECIMAL CONVERSION Convert the hexadecimal number 2ED16 to binary and to decimal. Solution Conversion between hexadecimal and binary is easy because each hexa- decimal digit directly corresponds to four binary digits. 216 = 00102, E16 = 11102 and D16 = 11012, so 2ED16 = 0010111011012. Conversion to decimal requires the arithmetic shown in Figure 1.6. 10 CHAPTER ONE From Zero to One Table 1.2 Hexadecimal number system Hexadecimal Digit Decimal Equivalent Binary Equivalent 0 0 0000 1 1 0001 2 2 0010 3 3 0011 4 4 0100 5 5 0101 6 6 0110 7 7 0111 8 8 1000 9 9 1001 A 10 1010 B 11 1011 C 12 1100 D 13 1101 E 14 1110 F 15 1111 256's column 16's column 1's column Figure 1.6 Conversion of a hexadecimal number to decimal 2ED 16 = 2 × 16 2 + E × 16 1 + D × 16 0 = 74910 two fourteen thirteen two hundred sixteens ones fifty six's Example 1.4 BINARY TO HEXADECIMAL CONVERSION Convert the binary number 11110102 to hexadecimal. Solution Again, conversion is easy. Start reading from the right. The four least signifi cant bits are 10102 = A16. The next bits are 1112 = 716. Hence, 11110102 = 7A16. 1.4 Number Systems 11 Example 1.5 DECIMAL TO HEXADECIMAL AND BINARY CONVERSION Convert the decimal number 33310 to hexadecimal and binary. Solution Like decimal to binary conversion, decimal to hexadecimal conversion can be done from the left or the right. Working from the left, start with the largest power of 16 less than or equal to the number (in this case, 256). 256 goes into 333 once, so there is a 1 in the 256’s column, leaving 333 − 256 = 77. 16 goes into 77 four times, so there is a 4 in the 16’s column, leaving 77 − 16 × 4 = 13. 1310 = D16, so there is a D in the 1’s column. In summary, 33310 = 14D16. Now it is easy to convert from hexadec- imal to binary, as in Example 1.3. 14D16 = 1010011012. Working from the right, repeatedly divide the number by 16. The remainder goes in each column. 333/16 = 20, with the remainder of 1310 = D16 going in the 1’s column. 20/16 = 1, with the remainder of 4 going in the 16’s column. 1/16 = 0, with the remainder of 1 going in the 256’s column. Again, the result is 14D16. 1. 4. 4 Bytes, Nibbles, and All That Jazz A group of eight bits is called a byte. It represents one of 28 = 256 pos- sibilities. The size of objects stored in computer memories is customarily measured in bytes rather than bits. A group of four bits, or half a byte, is called a nibble. It represents one of 24 = 16 possibilities. One hexadecimal digit stores one nibble and two hexadecimal digits store one full byte. Nibbles are no longer a com- monly used unit, but the term is cute. Microprocessors handle data in chunks called words. The size A microprocessor is a of a word depends on the architecture of the microprocessor. When processor built on a single this chapter was written in 2021, most computers had 64-bit proces- chip. Until the 1970’s, processors were too sors, indicating that they operate on 64-bit words. At the time, older complicated to fit on one computers handling 32-bit words were also widely available. Simpler chip, so mainframe processors microprocessors, especially those used in gadgets such as toasters, use were built from boards 8- or 16-bit words. containing many chips. Within a group of bits, the bit in the 1’s column is called the Intel introduced the first 4-bit microprocessor, called least significant bit (lsb), and the bit at the other end is called the the 4004, in 1971. Now, most significant bit (msb), as shown in Figure 1.7(a) for a 6-bit even the most sophisticated binary number. Similarly, within a word, the bytes are identified as supercomputers are built least significant byte (LSB) through most significant byte (MSB), as using microprocessors. shown in Figure 1.7(b) for a 4-byte number written with eight hexa- We will use the terms microprocessor and processor decimal digits. interchangeably throughout By handy coincidence, 210 = 1024 ≈ 103. Hence, the term kilo this book. (Greek for thousand) indicates 210. For example, 210 bytes is one 12

Digital Design and Computer Architecture RISC-V Edition PDF

Document Details

Tags

Related

Summary

Full Transcript