RISC-V Processor Design PDF

258 Chapter 4 The Processor

determines whether PC + 4 or the branch destination address is written into the PC, is set based on the Zero output of the ALU, which is used to perform the comparison of a beq instruction. The regularity and simplicity of the RISC-V instruction set mean that a simple decoding process can be used to determine how to set the control lines.

In the remainder of the chapter, we refine this view to fill in the details, which requires that we add further functional units, increase the number of connections between units, and, of course, enhance a control unit to control what actions are taken for different instruction classes. Sections 4.3 and 4.4 describe a simple implementation that uses a single long clock cycle for every instruction and follows the general form of Figures 4.1 and 4.2. In this first design, every instruction begins execution on one clock edge and completes execution on the next clock edge.

While easier to understand, this approach is not practical, since the clock cycle must be severely stretched to accommodate the longest instruction. After designing the control for this simple computer, we will look at faster implementations with all their complexities, including exceptions.

Check Yourself: How many of the five classic components of a computer (shown on page 253) do Figures 4.1 and 4.2 include?

4.2 Logic Design Conventions

To discuss the design of a computer, we must decide how the hardware logic implementing the computer will operate and how the computer is clocked. This section reviews a few key ideas in digital logic that we will use extensively in this chapter. If you have little or no background in digital logic, you will find it helpful to read Appendix A before continuing.

The datapath elements in the RISC-V implementation consist of two different types of logic elements: elements that operate on data values and elements that contain state. The elements that operate on data values are all combinational, which means that their outputs depend only on the current inputs. Given the same input, a combinational element always produces the same output. The ALU shown in Figure 4.1 and discussed in Appendix A is an example of a combinational element. Given a set of inputs, it always produces the same output because it has no internal storage.

combinational element: An operational element, such as an AND gate or an ALU.

Other elements in the design are not combinational, but instead contain state. An element contains state if it has some internal storage. We call these elements state elements because, if we pulled the power plug on the computer, we could restart it accurately by loading the state elements with the values they contained before we pulled the plug. Furthermore, if we saved and restored the state elements, it would be as if the computer had never lost power. Thus, these state elements completely characterize the computer. In Figure 4.1, the instruction and data memories, as well as the registers, are all examples of state elements.

state element: A memory element, such as a register or a memory.

A state element has at least two inputs and one output. The required inputs are the data value to be written into the element and the clock, which determines when the data value is written. The output from a state element provides the value that was written in an earlier clock cycle. For example, one of the logically simplest state elements is a D-type flip-flop (see Appendix A), which has exactly these two inputs (a value and a clock) and one output. In addition to flip-flops, our RISC-V implementation uses two other types of state elements: memories and registers, both of which appear in Figure 4.1. The clock is used to determine when the state element should be written; a state element can be read at any time.
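The difference between the two kinds of elements can be illustrated with a small software model. The following Python sketch is only an analogy, not a hardware description: the names alu_add and DFlipFlop are hypothetical, and the 32-bit mask is an assumption made for illustration. The function models a combinational element, whose output depends only on its current inputs, while the class models a D-type flip-flop with a value input, a clock, and one output.

```python
def alu_add(a, b):
    """Combinational element (hypothetical model): no internal storage,
    so the same inputs always give the same output."""
    return (a + b) & 0xFFFFFFFF  # assume 32-bit wraparound


class DFlipFlop:
    """State element (hypothetical model): a value input, a clock, one output."""

    def __init__(self, initial=0):
        self._q = initial  # internal storage
        self.d = initial   # data input, sampled only on the clock edge

    @property
    def q(self):
        # The output can be read at any time; it shows the value
        # that was written on an earlier clock edge.
        return self._q

    def clock_edge(self):
        # The clock determines when the data value is written.
        self._q = self.d


ff = DFlipFlop()
ff.d = alu_add(3, 4)  # drive the data input from combinational logic
before = ff.q         # still the old value: no clock edge has occurred
ff.clock_edge()
after = ff.q          # the value written on the clock edge
```

Reading q before the clock edge returns the previously stored value, which mirrors the statement that the output of a state element provides the value written in an earlier clock cycle.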
Logic components that contain state are also called sequential, because their outputs depend on both their inputs and the contents of the internal state. For example, the output from the functional unit representing the registers depends both on the register numbers supplied and on what was written into the registers previously. Appendix A discusses the operation of both the combinational and sequential elements and their construction in more detail.

Clocking Methodology

A clocking methodology defines when signals can be read and when they can be written. It is important to specify the timing of reads and writes, because if a signal is written at the same time that it is read, the value of the read could correspond to the old value, the newly written value, or even some mix of the two! Computer designs cannot tolerate such unpredictability. A clocking methodology is designed to make hardware predictable.

clocking methodology: The approach used to determine when data are valid and stable relative to the clock.

For simplicity, we will assume an edge-triggered clocking methodology. An edge-triggered clocking methodology means that any values stored in a sequential logic element are updated only on a clock edge, which is a quick transition from low to high or vice versa (see Figure 4.3). Because only state elements can store a data value, any collection of combinational logic must have its inputs come from a set of state elements and its outputs written into a set of state elements. The inputs are values that were written in a previous clock cycle, while the outputs are values that can be used in a following clock cycle.

edge-triggered clocking: A clocking scheme in which all state changes occur on a clock edge.

[Figure 4.3 diagram: State element 1, Combinational logic, State element 2, spanning one clock cycle]
FIGURE 4.3 Combinational logic, state elements, and the clock are closely related. In a synchronous digital system, the clock determines when elements with state will write values into internal storage. Any inputs to a state element must reach a stable value (that is, have reached a value from which they will not change until after the clock edge) before the active clock edge causes the state to be updated. All state elements in this chapter, including memory, are assumed positive edge-triggered; that is, they change on the rising clock edge.

Figure 4.3 shows the two state elements surrounding a block of combinational logic, which operates in a single clock cycle: all signals must propagate from state element 1, through the combinational logic, and to state element 2 in the time of one clock cycle. The time necessary for the signals to reach state element 2 defines the length of the clock cycle.

For simplicity, we do not show a write control signal when a state element is written on every active clock edge. In contrast, if a state element is not updated on every clock, then an explicit write control signal is required. Both the clock signal and the write control signal are inputs, and the state element is changed only when the write control signal is asserted and a clock edge occurs.

control signal: A signal used for multiplexor selection or for directing the operation of a functional unit; contrasts with a data signal, which contains information that is operated on by a functional unit.

We will use the word asserted to indicate a signal that is logically high and assert to specify that a signal should be driven logically high, and deassert or deasserted to represent logically low. We use the terms assert and deassert because when we implement hardware, at times 1 represents logically high and at times it can represent logically low.

asserted: The signal is logically high or true.
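The arrangement of Figure 4.3, together with the write control signal, can be sketched in software. The Python model below is a simplified illustration under stated assumptions: the names Register, increment, and clock_tick are hypothetical, values are assumed to be 32 bits wide, and propagation delay is not modeled at all. It shows that a state element changes only when its write control is asserted and a rising clock edge occurs.

```python
class Register:
    """Hypothetical model of a positive edge-triggered state element
    with an explicit write control signal."""

    def __init__(self, value=0):
        self.value = value    # output: the value written on an earlier edge
        self.data_in = value  # data input; must be stable before the edge
        self.write = False    # write control signal (True means asserted)

    def rising_edge(self):
        # State changes only if write is asserted AND a clock edge occurs.
        if self.write:
            self.value = self.data_in & 0xFFFFFFFF  # assume 32-bit values


def increment(x):
    """Stand-in for the block of combinational logic in Figure 4.3."""
    return (x + 1) & 0xFFFFFFFF


def clock_tick(registers):
    """Deliver one rising clock edge to every state element."""
    for reg in registers:
        reg.rising_edge()


elem1 = Register(5)  # state element 1
elem2 = Register(0)  # state element 2

# During the cycle: signals propagate from element 1 through the logic.
elem2.data_in = increment(elem1.value)
elem2.write = True   # asserted: element 2 is updated on the edge
elem1.data_in = 99
elem1.write = False  # deasserted: element 1 keeps its old value

clock_tick([elem1, elem2])
```

After the tick, elem2 holds the result computed from the value elem1 held during the cycle, while elem1 is unchanged because its write control was deasserted.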
An edge-triggered methodology allows us to read the contents of a register, send the value through some combinational logic, and write that register in the same clock cycle. Figure 4.4 gives a generic example. It doesn’t matter whether we assume that all writes take place on the rising clock edge (from low to high) or on the falling clock edge (from high to low), since the inputs to the combinational logic block cannot change except on the chosen clock edge. In this book, we use the rising clock edge. With an edge-triggered timing methodology, there is no feedback within a single clock cycle, and the logic in Figure 4.4 works correctly. In Appendix A, we briefly discuss additional timing constraints (such as setup and hold times) as well as other timing methodologies.

deasserted: The signal is logically low or false.

[Figure 4.4 diagram: State element, Combinational logic]
FIGURE 4.4 An edge-triggered methodology allows a state element to be read and written in the same clock cycle without creating a race that could lead to indeterminate data values. Of course, the clock cycle still must be long enough so that the input values are stable when the active clock edge occurs. Feedback cannot occur within one clock cycle because of the edge-triggered update of the state element. If feedback were possible, this design could not work properly. Our designs in this chapter and the next rely on the edge-triggered timing methodology and on structures like the one shown in this figure.

For the 32-bit RISC-V architecture, nearly all of these state and logic elements will have inputs and outputs that are 32 bits wide, since that is the width of most of the data handled by the processor. We will make it clear whenever a unit has an input or output that is other than 32 bits in width. The figures will indicate buses, which are signals wider than 1 bit, with thicker lines. At times, we will want to combine several buses to form a wider bus; for example, we may want to obtain a 32-bit bus by combining two 16-bit buses. In such cases, labels on the bus lines will make it clear that we are concatenating buses to form a wider bus. Arrows are also added to help clarify the direction of the flow of data between elements. Finally, color indicates a control signal, as opposed to a signal that carries data; this distinction will become clearer as we proceed through this chapter.

Check Yourself: True or false: Because the register file is both read and written on the same clock cycle, any RISC-V datapath using edge-triggered writes must have more than one copy of the register file.

Elaboration: There is also a 64-bit version of the RISC-V architecture, and, naturally enough, most paths in its implementation would be 64 bits wide.

4.3 Building a Datapath

A reasonable way to start a datapath design is to examine the major components required to execute each class of RISC-V instructions. Let’s start at the top by looking at which datapath elements each instruction needs, and then work our way down through the levels of abstraction. When we show the datapath elements, we will also show their control signals. We use abstraction in this explanation, starting from the bottom up.

Figure 4.5a shows the first element we need: a memory unit to store the instructions of a program and supply instructions given an address. Figure 4.5b also shows the program counter (PC), which, as we saw in Chapter 2, is a register that holds the address of the current instruction. Lastly, we will need an adder to increment the PC to the address of the next instruction. This adder, which is combinational, can be built from the ALU described in detail in Appendix A simply by wiring the control lines so that the control always specifies an add operation. We will draw such an ALU with the label Add, as in Figure 4.5c, to indicate that it has been permanently made an adder and cannot perform the other ALU functions.

datapath element: A unit used to operate on or hold data within a processor. In the RISC-V implementation, the datapath elements include the instruction and data memories, the register file, the ALU, and adders.

program counter (PC): The register containing the address of the instruction in the program being executed.

To execute any instruction, we must start by fetching the instruction from memory. To prepare for executing the next instruction, we must also increment the program counter so that it points at the next instruction, 4 bytes later. Figure 4.6 shows how to combine the three elements from Figure 4.5 to form the portion of a datapath that fetches instructions and increments the PC to obtain the address of the next sequential instruction.

Now let’s consider the R-format instructions (see Figure 2.19 on page 127). They all read two registers, perform an ALU operation on the contents of the registers, and write the result to a register. We call these instructions either R-type instructions or arithmetic-logical instructions (since they perform arithmetic or logical operations). This instruction class includes add, sub, and, and or, which
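The instruction fetch portion of the datapath described above (instruction memory, PC, and an adder fixed to compute PC + 4) can be modeled in a few lines. This Python sketch is only an illustration: the class name FetchUnit and its methods are hypothetical, the instruction memory is assumed to be a dictionary indexed by byte address, and branches are not modeled. It also shows the edge-triggered pattern of reading a register, passing the value through combinational logic, and writing the same register in one clock cycle.

```python
class FetchUnit:
    """Hypothetical model of the fetch datapath: instruction memory,
    program counter, and an adder hard-wired to add 4."""

    def __init__(self, instruction_memory, start_pc=0):
        # Assumed layout: byte address -> 32-bit instruction word.
        self.imem = instruction_memory
        self.pc = start_pc  # state element, written on every clock edge

    def read_instruction(self):
        # Instruction memory supplies the word at the address in the PC.
        return self.imem[self.pc]

    def next_pc(self):
        # Combinational adder: the next sequential instruction is 4 bytes later.
        return (self.pc + 4) & 0xFFFFFFFF

    def rising_edge(self):
        # The PC is read, sent through the adder, and written back
        # in the same clock cycle.
        self.pc = self.next_pc()


program = {
    0: 0x002081B3,  # add x3, x1, x2
    4: 0x40118233,  # sub x4, x3, x1
}

fetch = FetchUnit(program)
first = fetch.read_instruction()
fetch.rising_edge()
second = fetch.read_instruction()
```

Because the PC changes only on the clock edge, the instruction read during a cycle is always the one at the address the PC held at the start of that cycle.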
