M. Morris R. Mano, Charles R. Kime, Tom Martin - Logic and computer design fundamentals-Prentice Hall (2015) (2).pdf
Document Details
Uploaded by ThinnerMedusa
2015
Tags
Full Transcript
Problems 107 A B F A B E...
Problems 107 A B F A B E C D FIGURE 2-39 Circuit for Problem 2-29 0 1 2 3 4 5 6 7 Time (ns) FIGURE 2-40 Waveform for Problem 2-30 2-30. The waveform in Figure 2-40 is applied to an inverter. Find the output of the inverter, assuming that (a) It has no delay. (b) It has a transport delay of 0.06 ns. (c) It has an inertial delay of 0.06 ns with a rejection time of 0.04 ns. 2-31. Assume that tpd is the average of tPHL and tPLH. Find the delay from each input to the output in Figure 2-41 by (a) Finding tPHL and tPLH for each path, assuming tPHL = 0.20 ns and tPLH = 0.36 ns for each gate. From these values, find tpd for each path. (b) Using tpd = 0.28 ns for each gate. (c) Compare your answers from parts (a) and (b) and discuss any differences. 2-32. The rejection time for inertial delays is required to be less than or equal to the propagation delay. In terms of the discussion of the example in Figure 2-25, why is this condition necessary to determine the delayed output? C D B A F B C FIGURE 2-41 Circuit for Problem 2-31 Problems 109 2-34. *Find a logic diagram that corresponds to the VHDL structural description in Figure 2-42. Note that complemented inputs are not available. 2-35. Using Figure 2-28 as a framework, write a structural VHDL description of the circuit in Figure 2-43. Replace X, Y, and Z with X (2:0). Consult package func_prims in library lcdf_vhdl for information on the various gate components. Compile func_prims and your VHDL model, and simulate your VHDL model for all eight possible input combinations to verify your description’s correctness. X F Y Z FIGURE 2-43 Circuit for Problem 2-35, 2-38, 2-41, and 2-43 2-36. Using Figure 2-28 as a framework, write a structural VHDL description of the circuit in Figure 2-44. Consult package func_prims in library lcdf_vhdl for information on the various gate components. Compile func_prims and your VHDL model, and simulate your VHDL model for all 16 possible input combinations to verify your description’s correctness. 2-37. Find a logic diagram representing minimum two-level logic needed to implement the VHDL dataflow description in Figure 2-45. Note that complemented inputs are available. 2-38. *Write a dataflow VHDL description for the circuit in Figure 2-43 by using the Boolean equation for the output F. 2-39. *Find a logic diagram that corresponds to the Verilog structural description in Figure 2-46. Note that complemented inputs are not available. D X C B Y A FIGURE 2-44 Circuit for Problems 2–36 and 2-40 Problems 111 2-42. Find a logic diagram representing minimum 2-level logic needed to implement the Verilog dataflow description in Figure 2-47. Note that complemented inputs are available. 2-43. *Write a dataflow Verilog description for the circuit in Figure 2-43 by using the Boolean equation for the output F and using Figure 2-34 as a model. // Combinational Circuit 2: Dataflow Verilog Description module comb_ckt_1 (a, b, c, d, a_n, b_n, c_n, d_n, f, g); // a_n, b_n,... are complements of a, b,... , respectively. input a, b, c, d, a_n, b_n, c_n, d_n; output f, g; assign f = b & (a |(a_n & c)) | (b_n & c & d_n); assign g = b & (c | (a_n & c_n) | (c_n & d_n)); endmodule FIGURE 2-47 Verilog for Problem 2–42 This page intentionally left blank 4-8 / HDL Representation for Sequential Circuits—Verilog 263 throughout the description. From the diagram in Figure 4-18(d), the states are A, B, C, and D. In addition, the parameter statements give the state codes assigned to each of these states. The notation used to define the state codes is 2'b followed by the binary code. The 2 denotes that there are two bits in the code and the 'b denotes that the base of the code given is binary. The if-else (without using the else if) makes a two-way decision based on whether a condition is TRUE or FALSE. In contrast, the case statement can make a multiway decision based on which one of a number of statements is TRUE. A simplified form for the generic case statement is case expression {case expression : statements} endcase in which the braces { } represent one or more such entries. The case expression must have values that can be taken on by a signal of the type used in expression. Typically, there are sequences of multiple statements. In the example in Figure 4-34, the case statement for the next-state function makes a mul- tiway decision based on the current state of the circuit, A, B, C, or D. For each of the case expressions, conditional statements of various types are used to make a binary decision based on whether input X is 1 or 0. Blocking assignment statements are then used to assign the next state based on the eight possible combinations of state value and input value. For example, consider the expression B. If X equals 1, then the next state will be C; if X equals 0, then the next state will be A. This corresponds to the two transitions out of state B in Figure 4-18(d). With this brief introduction to the case statement, the overall sequence recog- nizer can now be understood. Each of the three processes has a distinct function, but the processes interact to provide the overall sequence recognizer. The first process describes the state register for storing the sequence-recognizer state. Note that the description resembles that of the positive-edge-triggered flip-flop. There are two dif- ferences, however. First, there are two bits in the state register. Second, the state that results from applying RESET is state A rather than state 0. The first process is the only one of the three processes that has storage (sequential logic) associated with it. Following the coding guidelines provided earlier in this section, this always block uses nonblocking assignments. The second process describes the next-state function as discussed earlier. The event control statement contains signals X and state. In general, for describing combinational logic, all inputs must appear in the event control statement, since, whenever an input changes, the process must be executed. Since the next state logic is combinational, this process uses blocking assignments. The final process describes the output function and uses the same case state- ment framework as in the next-state function process, again using blocking assignments because the process describes combinational logic. Instead of assigning state names, values 0 and 1 are assigned to Z. If the value assigned is the same for both values 0 and 1 on X, no conditional statement is needed, so a conditional statement appears only for state D. If there are multiple input variables, more complex if-else combinations, as illustrated earlier, can be used to represent the conditioning of the outputs on the inputs. 4-8 / HDL Representation for Sequential Circuits—Verilog 265 module declares the device under test, the wire and regs to be connected to it, and then instantiates it. But in contrast to earlier testbenches, this testbench uses more than one process to provide stimulus to the inputs of the sequence recog- nizer. The first process applies the reset and X inputs, while second process pro- vides a periodic clock signal. The first process uses the test sequence that was described in Example 4-8, which is stored in the reg array test_sequence. At the beginning of simulation, the process activates reset to put the state machine in a known state. After deactivating reset, the process applies the X input values stored in the test_sequence array using a for loop statement. The input val- ues are applied shortly after the positive edge of the clock to ensure that there is sufficient time before the next positive edge that the timing conditions for storage elements are met, which will be described later in this chapter. This testbench provides a template for verifying Verilog models of simple finite state machines: using multiple processes to generate a clock signal and to apply reset and other inputs. For more complex circuits, testbenches may read inputs from a file and compare the outputs of the device under test to known good outputs, automati- cally flagging erroneous outputs. The language constructs for supporting the file read/write and user input/output necessary for such behavior are beyond the scope of this introductory text, but interested readers will easily find them in one of the many fine books dedicated to the Verilog language. A common pitfall is present whenever an if-else or case statement is employed. During synthesis, unexpected storage elements in the form of latches or flip-flops appear. For the very simple if-else used in Figure 4-33, this pitfall is employed to give a specification that synthesizes to a flip-flop. In addition to the two input signals, RESET and CLK, events posedge CLK and posedge RESET are pro- duced, which are TRUE if the value of the respective signal changes from 0 to 1. Selected combinations of values for RESET and the two events are shown in Table 4-13. Whenever RESET has no positive edge, or RESET is 0 and CLK is fixed at 0 or 1 or has a negative edge, no action is specified. In Verilog, the assumption is that, for any TABLE 4-13 Illustration of Generation of Storage in Verilog Inputs Action posedge RESET and RESET = 1 posedge CLK FALSE FALSE Unspecified FALSE TRUE Q 6= D TRUE FALSE Q 6= 0 TRUE TRUE Q 6= 0 4-10 / Sequential Circuit Timing 267 twH ! twH,min C twL ! twL,min ts th S/R tp-,min tp-,max Q (a) Pulse-triggered (positive pulse) twH ! twH,min C twL ! twL,min ts th D tp-,min tp-,max Q (b) Edge-triggered (negative edge) FIGURE 4-36 Flip-Flop Timing Parameters minimum and maximum values. Since the changes of the flip-flop outputs are to be separated from the control by the flip-flop inputs, the minimum propagation delay time should be longer than the hold time for correct operation. These and other parameters are specified in manufacturers’ data books for specific integrated circuit products. Similar timing parameters can be defined for latches and direct inputs, with additional propagation delays needed to model the transparent behavior of latches. 4-10 SEQUENTIAL CIRCUIT TIMING In addition to analyzing the function of a circuit, it is also important to analyze its performance in terms of the maximum input-to-output delay and the maximum clock frequency, fmax, at which it can operate. First of all, the clock frequency is just the in- verse of the clock period tp shown in Figure 4-37. So, the maximum allowable clock frequency corresponds to the minimum allowable clock period tp. To determine how small we can make the clock period, we need to determine the longest delay from the triggering edge of the clock to the next triggering edge of the clock. These delays are measured on all such paths in the circuit down which changing signals propagate. 4-10 / Sequential Circuit Timing 269 Combinational logic PIN,OUT tpd, COMB PIN,FF tpd, COMB ti PFF,OUT tpd, COMB to PFF,FF tpd, COMB Flip-flops Q D tpd, FF Clock C ts FIGURE 4-38 Sequential Circuit Timing Paths EXAMPLE 4-16 Clock Period and Frequency Calculations Suppose that all flip-flops used are the same and have tpd = 0.2 ns (nanosecond = 10-9seconds) and ts = 0.1 ns. Then the longest path beginning and ending with a flip-flop will be the path with the largest tpd,COMB. Further, suppose that the largest tpd,COMB is 1.3 ns and that tp has been set to 1.5 ns. From the previous equation for tp, we can write 1.5 ns = tslack + 0.2 + 1.3 + 0.1 = tslack + 1.6 ns Solving, we have tslack = -0.1 ns, so this value of tp is too small. In order for tslack to be greater than or equal to zero for the longest path, tp Ú tp,min = 1.6 ns. The maximum frequency fmax = 1/1.6 ns = 625 MHz (megahertz = 106 cycles per second). We note that, if tp is too large to meet the circuit specifications, we must either employ faster logic cells or change the circuit design to reduce the problem- atic path delays through the circuit while still performing the desired function. It is interesting to note that the hold time for a flip-flop, th, does not appear in the clock-period equation. It relates to another timing-constraint equation dealing with one or both of two specific situations. In one case, output changes arrive at the inputs of one or more flip-flops too soon. In the other case, the clock signals reaching one or more flip-flops are somehow delayed, a condition referred to as clock skew. Clock skew also can affect the maximum clock frequency. 4-12 / Synchronization and Metastability 271 Synchronous Asynchronous circuit circuit (a) Synchronous to asynchronous Asynchronous Asynchronous signals Synchronous circuit circuit (b) Asynchronous to synchronous Synchronous Asynchronous signals Synchronous circuit circuit Clock X Clock Y (c) Synchronous circuits with unrelated clocks FIGURE 4-39 Examples of Synchronous/Asynchronous Interfaces clock Y. Both of these cases can cause circuit malfunction, so we offer the synchro- nizing of such signals as a solution. In line with the perverse nature of asynchronous behavior, this solution isn’t perfect, but suffers from a troublesome phenomenon referred to as metastability, a topic treated briefly here. Our final topic that affects the synchronous circuit designer, but not in an inter- face problem, is “I thought this was a synchronous circuit; after all, it does have a clock controlling state changes.” Here we illustrate how a circuit designer can easily fall into the pitfall of unknowingly producing an asynchronous design, bringing into play timing-dependent factors controlling correct or incorrect operation. 4-12 SYNCHRONIZATION AND METASTABILITY We now turn our attention to asynchronous signals driving synchronous circuits, the case shown in Figures 4-39(b) and (c). Initially, we look at the problem that occurs if an asynchronous signal is applied directly to the synchronous circuit without special treatment. Then we offer a solution but find that there is an additional problem with the solution, which we also attempt to remedy. The circuit in Figure 4-40 can illustrate erroneous behavior due to an input sig- nal not synchronized with the clock. The circuit is initialized by using the Reset sig- nal which sets the state of the circuit to S0 (y0, y1, y2 = 1, 0, 0). As long as RDY = 1, the circuit cycles through the states S0 (1, 0, 0) and S1 (0, 1, 0) and S2(0, 0, 1). If 4-12 / Synchronization and Metastability 273 Clock RDY y0 y1 y2 (a) Correct circuit response to RDY Clock RDY y0 y0 resets y1 y1 fails to set y2 (b) Incorrect circuit response to RDY: invalid state (0, 0, 0) results. Clock RDY y0 y0 fails to reset y1 y1 sets y2 (c) Incorrect circuit response to RDY: invalid state (1, 1, 0), (0, 1, 1) and (1, 0, 1) results. FIGURE 4-41 Behavior of Example Circuit are all invalid states and give an incorrect output sequence. Thus, the circuit has again failed. Whether or not these failures occur depends upon circuit delays, the setup and hold times, and the detailed behavior of the flip-flops. Since none of these can be tightly controlled, we need a solution to prevent these failures that is independent of these parameters. Such a solution is the use of a synchronizing flip-flop. Synchronizing flip-flop In Figure 4-42(a), a D flip-flop has been added to the example circuit. The asynchronous signal RDY enters the D flip-flop and RDY_S, its output, is synchronous with signal Clock in the sense that RDY_Schanges one flip-flop delay after the positive edge. Since the asynchronous signal RDY enters the circuit through this single synchronizing flip-flop, the behavior exhibited when RDY reached two flip-flops is avoided. RDY_S cannot cause such behavior, since it does not change during the setup-time, hold-time interval for the normal circuit flip-flops. 4-12 / Synchronization and Metastability 275 Clock RDY RDY_S y0 y1 y2 (a) Circuit response to RDY with sensing at the Clock edge where RDY changes Clock RDY RDY_S y0 y1 y2 (b) Circuit response to RDY with sensing at the next Clock edge where RDY changes FIGURE 4-43 Behavior of Example Circuit with Synchronizing Flip-flop on RDY M 0 1 FIGURE 4-44 Mechanical Analogy for Latch States. state to the metastable point where both gates have equal output values with volt- ages between 1 and 0. Like the mechanical system, the latch and hence the flip-flop containing it will eventually go to either 0 or 1 due to a tiny electronic “noise” distur- bance. The length of time it remains in the metastable state is nondeterministic. The interval during which a change in the input will cause metastable behavior is very narrow, of the order of a few tens of picoseconds. Thus the behavior is unlikely, but it can happen. When it does, it is unknown how long the metastable state will persist. 4-13 / Synchronous Circuit Pitfalls 277 inside the flip-flop in this experiment is converted to a longer delay by the output buffer of the flip-flop and so is not visible at the output. So what can be done about this problem? Many solutions have been proposed, some of them ineffective. A simple one is to use a series of synchronizing flip-flops, i.e., a small shift register. The likelihood of the second flip-flop in the series going metastable because the first one applies a metastable or delayed input to it is less than that of the first flip-flop going metastable, and so on. Some commercial designs have used as many as six flip-flops in series to deal with this problem. More common is the use of three or so flip-flops in series. The more flip-flops, the more the circuit response to a change is delayed and the less likely the circuit is to fail due to metasta- bility. But the probability never goes to zero. Some degree of uncertainty for incor- rect operation always remains, however small. For a much more detailed discussion of metastability, see Wakerly’s Digital Design: Principles and Practices, 4th ed., 2006. 4-13 SYNCHRONOUS CIRCUIT PITFALLS Just because there is a clock does not mean that a circuit is synchronous. For example, in a ripple counter, such as in Figure 6-12, the clock drives at most one flip-flop clock input directly. All other clock inputs driving the flip-flops are actually state variables. So the changes in the state variables that are the outputs of these flip-flops are not synchronous with the clock. For a 16-bit ripple counter, in the worst case where all flip-flops change state, the most significant bit changes 16 flip-flop delays after the clock edge on the first flip-flop. Also, consider the synchronous counter in Figure 4-46. The 4-bit synchronous binary counter counts up by 1 whenever a positive edge occurs on Clock. When the count reaches 1111, the count up results in 0000. The binary counter also has an Asynchronous reset with drives the four asynchronous reset inputs to the internal flip-flops. When the reset shown becomes 0, it clears all four flip-flops to 0 with only the inherent time delays, i.e., independent of the positive clock edge. Due to the attached NAND gate and its connections, when the count become 0110 (6) in response to a positive edge, the NAND produces a 0, causing the four flip-flops to be cleared, giving 0000 (0). So the counter is supposed to count 0, 1, 2, 3, 4, 5, 0,.... But suppose that A2 goes to 0 a bit earlier than A1. Then the output of the NAND can go Clock 1 Count Binary counter Asynchronous reset A3 A2 A1 A0 FIGURE 4-46 Example of an Asynchronous Circuit References 279 complexity of descriptions, maximize the flexibility of representation, permit the use of default conditions, and provide a model that facilitates modeling of pragmatic designs. In addition, this model builds toward the use of hardware description lan- guages to model sequential circuits. As an alternative to the use of logic diagrams, state diagrams, and state tables, sequential circuits can be defined in VHDL or Verilog descriptions. These descrip- tions provide a powerful, flexible approach to sequential circuit specification for both simulation and automatic circuit synthesis. These representations involve pro- cesses that provide added descriptive power beyond the concurrent assignment statements of VHDL and the continuous assignment statement of Verilog. The pro- cesses, which permit program-like coding and use if-then-else and case conditional statements, can also be used to efficiently describe combinational logic. Finally, the timing parameters associated with flip-flops were presented, and the relationship between path delay in sequential circuits and clock frequency was established. Following this description, the important topics of synchronization of asynchronous signals, and metastability in synchronizing circuits were covered. REFERENCES 1. Bhasker, J. A Verilog HDL Primer, 2nd ed. Allentown, PA: Star Galaxy Press, 1999. 2. Ciletti, M. Advanced Digital Design with Verilog HDL. Upper Saddle River, NJ: Pearson Prentice Hall, 2003. 3. Ciletti, M. Starter’s Guide to Verilog 2001. Upper Saddle River, NJ: Pearson Prentice Hall, 2004. 4. Clare, C. R. Designing Logic Systems Using State Machines. New York: McGraw-Hill Book Company, 1973. 5. High-Speed CMOS Logic Data Book. Dallas: Texas Instruments, 1989. 6. IEEE Standard VHDL Language Reference Manual (ANSI/IEEE Std 1076- 1993; revision of IEEE Std 1076-1987). New York: The Institute of Electrical and Electronics Engineers, 1994. 7. IEEE Standard Description Language Based on the Verilog Hardware Description Language (IEEE Std 1364-1995). New York: The Institute of Electrical and Electronics Engineers, 1995. 8. Katz, R. H. and G. Borriello. Contemporary Logic Design, 2nd ed. Upper Saddle River, NJ: Pearson Prentice Hall, 2005. 9. Mano, M. M. Digital Design, 3rd ed. Upper Saddle River, NJ: Pearson Prentice Hall, 2002. 10. Palnitkar, S. Verilog HDL: A Guide to Digital Design and Synthesis, 2nd ed. Upper Saddle River, NJ: Pearson Prentice Hall, 2003. 11. Pellerin, D. and D. Taylor. VHDL Made Easy! Upper Saddle River, NJ: Prentice Hall PTR, 1997. 12. Smith, D. J. HDL Chip Design. Madison, AL: Doone Publications, 1996. Problems 281 4-4. Clock and D waveforms, a D latch and an edge-triggered D flip-flop are shown in Figure 4-48. For both the latch and the flip-flop, carefully sketch the output waveform, Qi, obtained in response to the input waveforms. Assume that the propagation delay of the storage elements is negligible. Initially, all storage elements store 0. 4-5. A sequential circuit with two D flip-flops A and B, one input Y, and one output Z is specified by the following input equations: DA = BY + AY, DB = Y, Z = A B (a) Draw the logic diagram of the circuit. (b) Derive the state table. (c) Derive the state diagram. (d) Is this a Mealy or a Moore machine? 4-6. A sequential circuit with two D flip-flops A and B, two inputs X and Y, and one output Z is specified by the following input equations: DA = XA + X Y, DB = XB + XA, Z = XB (a) Draw the logic diagram of the circuit. (b) Derive the state table. (c) Derive the state diagram. (d) Is this a Mealy or a Moore machine? 4-7. *A sequential circuit has three D flip-flops A, B, and C, and one input X. The circuit is described by the following input equations: C D D Q1 C D with 1 Control D Q2 C Triggered D FIGURE 4-48 Waveforms and Storage Element for Problem 4-4 Problems 283 FIGURE 4-49 Circuit for Problem 4-11, Problem 4-40, Problem 4-41, Problem 4-49, Problem 4-50, and Problem 4-59 4-11. A sequential circuit has two D flip-flops, one input X, and one output Y. The logic diagram of the circuit is shown in Figure 4-49. Derive the state table and state diagram of the circuit. 4-12. A sequential circuit is given in Figure 4-13. (a) Add the necessary logic and/or connections to the circuit to provide an asynchronous reset to state A = 1, B = 0 for signal Reset = 0. (b) Add the necessary logic and/or connections to the circuit to provide a synchronous reset to state A = 0, B = 0 for signal Reset = 1. 4-13. *Design a sequential circuit with two D flip-flops A and B and one input X. When X = 0, the state of the circuit remains the same. When X = 1, the circuit goes through the state transitions from 00 to 10 to 11 to 01, back to 00, and then repeats. 4-14. The state diagram for a sequential circuit appears in Figure 4-50. X1X2/Z 00/0, 11/0 01/0, 10/1 A B 00/0, 01/0 Reset 00/1,01/0 01/1, 10/0 10/1, 11/1 00/1, 11/1 C D 10/1, 11/0 FIGURE 4-50 State Diagram for Problem 4-14 Problems 285 X Y3 D Y1 Y1 Y3 C Y1 R Y3 Y2 Y2 D Y2 Y3 C Y2 R D Z Clock C Y3 R Reset FIGURE 4-52 Circuit for Problem 4-16 4-17. A sequential circuit for a luggage lock has ten pushbuttons labeled 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9. Each pushbutton 0 through 9 produces a 1 on Xi, i = 0 through 9, respectively, with all other values on variable Xj, j ≠ i, equal to 0. Also, these ten pushbuttons produce a positive pulse on the clock C for clocking the flip-flops in the circuit. The circuitry that produces the Xi signals and the clock C has already been designed. The lock opens in response to a sequence of four Xi values, i = 0,... , 9, set by the user. The logic for connecting the four selected Xi values to variables Xa, Xb, Xc, and Xd has also been designed. The circuit is locked and reset to its initial state by pushing pushbutton Lock, which provides L, the asynchronous reset signal for the circuit. The lock is to unlock in response to the sequence Xa, Xb, Xc, Xd, regardless of all past inputs applied to it since it was reset. The circuit has a single Moore type output U which is 1 to unlock the lock, and 0 otherwise. Design the circuit with inputs Xa, Xb, Xc, and Xd, reset L, clock C, and output U. Use a one-hot code for the state assignment. Implement the circuit with D flip-flops and AND gates, OR gates, and inverters. 4-18. *A serial 2s complementer is to be designed. A binary integer of arbitrary length is presented to the serial 2s complementer, least significant bit first, on input X. When a given bit is presented on input X, the corresponding output bit is to appear during the same clock cycle on output Z. To indicate that a sequence is complete and that the circuit is to be initialized to receive another sequence, input Y becomes 1 for one clock cycle. Otherwise, Y is 0. Problems 287 Sequence on X without any stalls: 01111100111111100001011110101 Sequence on X with stalls: 0111111001111111100001011110101 Sequence on Z: 0111110001111101100001011110101 Sequence on S: 0000001000000010000000000000000 (a) Find the state diagram for the circuit. (b) Find the state table for the circuit and make a state assignment. (c) Find an implementation of the circuit using D flip-flops and logic gates. 4-23. In many communication and networking systems, the signal transmitted on the communication line uses a non-return-to-zero (NRZ) format. USB uses a specific version referred to as non-return-to-zero inverted (NRZI). A circuit that converts any message sequence of 0s and 1s to a sequence in the NRZI format is to be designed. The mapping for such a circuit is as follows: (a) If the message bit is a 0, then the NRZI format message contains an immediate change from 1 to 0 or from 0 to 1, depending on the current NRZI value. (b) If the message bit is a 1, then the NRZI format message remains fixed at 0 or 1, depending on the current NRZI value. This transformation is illustrated by the following example, which assumes that the initial value of the NRZI message is 1: Message: 10001110011010 NRZI Message: 10100001000110 (a) Find the Mealy model state diagram for the circuit. (b) Find the state table for the circuit and make a state assignment. (c) Find an implementation of the circuit using D flip-flops and logic gates. 4-24. +Repeat Problem 4-23, designing a sequential circuit that transforms an NRZI message into a normal message. The mapping for such a circuit is as follows: (a) If a change from 0 to 1 or from 1 to 0 occurs between adjacent bits in the NRZI message, then the message bit is a 0. (b) If no change occurs between adjacent bits in the NRZI message, then the message bit is a 1. 4-25. A pair of signals Request (R) and Acknowledge (A) is used to coordinate transactions between a CPU and its I/O system. The interaction of these signals is often referred to as a “handshake.” These signals are synchronous with the clock and, for a transaction, are to have their transitions always appear in the order shown in Figure 4-53. A handshake checker is to be designed that will verify the transition order. The checker has inputs, R and A, asynchronous reset signal, RESET, and output, Error (E). If the transitions in a handshake are in order, E = 0. If the transitions are out of order, then E becomes 1 and remains at 1 until the asynchronous reset signal (RESET = 1) is applied to the CPU. Problems 289 4-28. Repeat Problem 4-27 with D flip-flops using a Gray-code assignment. 4-29. +The state table for a 3-bit twisted ring counter is given in Table 4-15. This circuit has no inputs, and its outputs are the uncomplemented outputs of the flip-flops. Since it has no inputs, it simply goes from state to state whenever a clock pulse occurs. It has an asynchronous reset that initializes it to state 000. (a) Design the circuit using D flip-flops and assuming that the unspecified next states are don’t-care conditions. (b) Add the necessary logic to the circuit to initialize it to state 000 on power-up master reset. (c) In the subsection “Designing with Unused States” of Section 4-5, three techniques for dealing with situations in which a circuit accidentally enters an unused state are discussed. If the circuit you designed in parts (a) and (b) is used in a child’s toy, which of the three techniques given would you apply? Justify your decision. (d) Based on your decision in part (c), redesign the circuit if necessary. (e) Repeat part (c) for the case in which the circuit is used to control engines on a commercial airliner. Justify your decision. (f) Repeat part (d) based on your decision in part (e). 4-30. Do an automatic logic simulation-based verification of your design in Problem 4-14. The input sequence used in the simulation should include all transitions in Figure 4-50. The simulation output should include the input X and the state variables A, B, and output Z. 4-31. *Generate a verification sequence for the circuit described by the state table in Table 4-14. To reduce the length of the simulation sequence, assume that the simulator can handle X inputs and use X’s whenever possible. Assume that a Reset input is available to initialize the state to A = 0, B = 0 and that all transitions in the state diagram must be exercised. TABLE 4-15 State Table for Problem 4-29 Present State Next State ABC ABC 000 100 100 110 110 111 111 011 011 001 001 000 Problems 291 and find an implementation for the corresponding sequential circuit using D flop-flops, AND gates, OR gates, and inverters. 4-36. Design the sequential circuit for the state-machine diagram from Problem 4-35. Use a one-hot state assignment, D flip-flops and AND gates, OR gates, and inverters. 4-37. (a) Verify that the transitions in the state-machine diagram in Figure 4-27 obey the two transition conditions for state diagrams. (b) Repeat part (a) for the state-machine diagram in Figure 4-28. 4-38. *You are to find the state-machine diagram for the following electronic vending-machine specification. The vending machine sells jawbreaker candy, one jawbreaker for 25¢. The machine accepts N (nickels = 5¢), D (dimes = 10¢), and Q (quarters = 25¢). When the sum of the coins inserted in sequence is 25¢ or more, the machine dispenses one jawbreaker by making DJ equal to 1 and returns to its initial state. No change is returned DJ equals 0 for all other states. If anything less than 25¢ is inserted and the CR (Coin Return) pushbutton is pushed, then the coins deposited are returned through the coin return slot by making RC equal to 1, after which the machine returns to its initial state. RC equals 0 in all other states. Use Moore outputs for your design. 4-39. You are to find the state-machine diagram for the following electronic vending-machine specification. The vending machine sells soda for $1.50 per bottle. The machine accepts only D ($1 bills) and Q (quarters = 25¢). When the sum of money is greater than $1.50, i.e., two $1 bills, the machine returns change in the coin return (two quarters). When $1.50 has been paid, the machine lights an LED to indicate that a soda flavor may be selected. The choices by pushbutton are C (Cola), L (Lemon soda), O (Orange soda), and R (Root Beer). When one pushbutton is pushed, the selected soda is dispensed and the machine returns to its initial state. One other feature is that an LED comes on to warn the user that two quarters are not available for change, so if a second $1 bill is inserted, no change will be given. (a) Find the state-machine diagram for the soda vending machine as specified. (b) The specification as given is not very user friendly. Rewrite it to provide a remedy for every possible situation that the user might encounter in using the machine. All files referred to in the remaining problems are available in ASCII form for simulation and editing on the Companion Website for the text. A VHDL or Verilog compiler/simulator is necessary for the problems or portions of problems requesting simulation. Descriptions can still be written, however, for many of the problems without using compilation or simulation. 4-40. Write a gate-level structural VHDL description for the circuit from Problem 4-11. Use the VHDL model for a D flip-flop from Figure 4-29. Use the package func_prims in library lcdf_vhdl for the logic gate components. C H A P T E R 5 Digital Hardware Implementation o this point, we have studied the basics of design of combinational and sequential T circuits. This chapter covers selected topics that are very important to our understanding of contemporary design. It begins by characterizing logic gates and circuits with a particular focus on complementary metal oxide semiconductor (CMOS) technology. Then basic programmable logic device (PLD) technologies are covered. This coverage includes read-only memories (ROMs), programmable logic arrays (PLAs), programmable array logic (PAL®), and Field Programmable Gate Array (FPGA) devices. In the generic computer shown at the beginning of Chapter 1, CMOS technology forms the foundation for realization of most of the integrated circuits. Finally, programmable technologies are essential to eficient design in various parts of the computer and to the ability to upgrade systems in the ield. A good example of this latter feature is updating of the operating system (OS) stored in programmable ROM in smart phones and other embedded devices, and the BIOS (Basic Input Output System) in a laptop computer. 5-1 THE DESIGN SPACE For a given design, there is typically a target implementation technology that speci- fies the primitive elements available and their properties. The design space describes the target technologies and the parameters used to characterize them. Integrated Circuits Digital circuits are constructed with integrated circuits. An integrated circuit (abbre- viated IC) is a silicon semiconductor crystal, informally called a chip, containing the 295 296 CHAPTER 5 / DIGITAL HARDWARE IMPLEMENTATION electronic components for the digital gates and storage elements. The various com- ponents are interconnected on the chip. The chip is mounted in a ceramic or plastic container, and connections are welded from the chip to the external pins to form the integrated circuit. The number of pins may range from 14 on a small IC package to several hundred on a large package. Each IC has an alphanumeric designation printed on the surface of the package for identification. Each vendor publishes datasheets or a catalog containing the description and all the necessary information about the ICs that it manufactures. Typically, this information is available on vendor websites. levelS of integration As IC technology has improved, the number of gates present in a single silicon chip has increased considerably. Customary reference to a pack- age as being either a small-, medium-, large-, or very-large-scale integrated device is used to differentiate between chips with just a few internal gates and those with thousands to hundreds of millions of gates. Small-scale integrated (SSI) devices contain several independent primitive gates in a single package. The inputs and outputs of the gates are connected directly to the pins in the package. The number of gates is usually less than 10 and is limited by the number of pins available on the IC. Medium-scale integrated (MSI) devices have approximately 10 to 100 gates in a single package. They usually perform specific elementary digital functions, such as the addition of four bits. MSI digital functions are similar to the functional blocks described in Chapter 3. Large-scale integrated (LSI) devices contain between 100 and a few thousand gates in a single package. They include digital systems such as small processors, small memories, and programmable modules. Very-large-scale integrated (VLSI) devices contain several thousand to hun- dreds of millions of gates in a single package. Examples are complex microprocessor and digital signal-processing chips. Because of their small transistor dimensions, high density, and comparatively low cost, VLSI devices have revolutionized digital system and computer design. VLSI technology enables designers to create complex struc- tures that previously were not economical to manufacture. CMOS Circuit Technology Digital integrated circuits are classified not only by their function, but also by their specific implementation technology. Each technology has its own basic electronic device and circuit structures upon which more complex digital circuits and functions are developed. The specific electronic devices used in the construction of the basic circuits provide the name for the technology. Currently, silicon-based complementa- ry metal oxide semiconductor (CMOS) technology dominates due to its high circuit density, high performance, and low power consumption. Some manufacturers are now using SOI (silicon on insulator) technology, which is a variant of CMOS in which an insulating material (silicon dioxide) isolates the transistors from the base silicon wafer. Alternative technologies based on gallium arsenide (GaAs) and sili- con germanium (SiGe) are also used selectively for very high-speed circuits. 5-1 / The Design Space 297 So far we have dealt largely with implementing logic circuits in terms of gates. Here we diverge briefly into electronic circuits made of electronic devices called transistors that implement the gates. For very high-performance logic or logic with specialized properties, CMOS electronic-circuit-level design is important, since to achieve the very highest performance, it is sometimes necessary to design directly from the Boolean equations to the circuit level, bypassing the logic-gate level. Also, it is important to realize that there is a circuit design process that is critical to pro- duction of the logic gates used in design. cMoS tranSiStor The foundation for CMOS technology is the MOS (metal-oxide semiconductor) transistor. Transistors and the interconnections between them are fabricated as elements of an integrated circuit die, less formally referred to as a chip. Each rectangular die is cut from a very thin slice of crystalline silicon called a wafer. In the most modern fabrication facilities for making integrated circuits, wafers are typically 300 mm (about one foot) in diameter. A sketch of a transistor is shown in Figure 5-1(a). In this sketch, the transistor has been sliced on a vertical plane through the integrated circuit chip on which it lies. In addition, the fabrication steps that form the interconnections between transistors and the protective covering over the chip have not yet occurred, leaving the transis- tor exposed. The substrate is the basic wafer material. The fabrication process has modified the substrate to be highly conductive in the source and the drain regions of the transistor. The conductive polysilicon gate has been deposited on top of a very thin insulating layer of silicon dioxide. The resulting structure consists of two identi- cal conductive regions, the source and the drain, with a gap in between that lies under the gate. This gap is referred to as the channel. To give a sense of the size of the tran- sistor, the channel length in Intel’s most recent technology is 14 nanometers (14 * 10-9 meters), with 10 nanometer technology expected to be available in the near future. This ranges from approximately 1/1200 to 1/13000 of the diameter of a human hair, depending on the variability of the hair size. In the normal operation of an n-channel MOS transistor, the drain is by defini- tion at a higher voltage than the source. When the gate voltage is at least the thresh- old voltage of the transistor above the source voltage, and the drain voltage is sufficiently above the source voltage, a thin layer of the substrate just below the thin gate insulation becomes a conducting layer between the source and the drain. This permits a current to flow between the source and the drain. In this case, the transistor is said to be ON. If the gate-to-source voltage is less than the threshold voltage, the channel will be absent, blocking significant current flow. Under this condition, the transistor is said to be OFF. The use of ON and OFF refers to the present or absence of current flow between the source and the drain, respectively. Use of this terminol- ogy brings to mind the ON/OFF behavior of a switch. As a consequence, a switchlike behavior is a good first-order model for an MOS transistor. cMoS tranSiStor MoDelS The behavior of the MOS transistor model depends on the transistor type. CMOS technology employs two types of transistor: n-channel and p-channel. The behavior described in the preceding paragraph is that of an n-channel transistor. The two transistor types differ in the characteristics of the semiconductor 298 CHAPTER 5 / DIGITAL HARDWARE IMPLEMENTATION G (Gate) S (Source) D (Drain) Channel length Location of conducting layer Substrate (a) Transistor geometry D G X X: X: X S (b) (c) (d) Transistor symbols and models: n-channel S G X X: X: X D (e) (f) (g) Transistor symbols and models: p-channel FIGURE 5-1 MOS Transistor, Symbols, and Switch Models materials used in their implementation and in the mechanism governing the conduc- tion of a current through them. Most important to us, however, is their difference in behavior. We will model this behavior using switches controlled by voltages corre- sponding to logic 0 and logic 1. Such a model ignores the complexity of electronic devices and captures only logical behavior. The symbol for an n-channel transistor is shown in Figure 5-1(b). The transistor has three terminals: the gate (G), the source (S), and the drain (D), as shown. Here we make the usual assumption that a 1 represents the H voltage range and a 0 rep- resents the L voltage range. The notion of whether a path for current to flow exists is easily modeled by a switch, as shown in Figure 5-1(c). The switch consists of two fixed terminals corresponding to the S and D terminals of the transistor. In addition, there is a movable contact that, depending on its position, determines whether the switch is open or closed. The position of the contact is controlled by the voltage 5-1 / The Design Space 299 applied to the gate terminal G. Since we are looking at logic behavior, this control voltage is represented on the symbol by the input variable X on the gate terminal. For an n-channel transistor, the contact is open (no path exists) for the input variable X equal to 0 and closed (a path exists) for the input variable X equal to 1. Such a con- tact is traditionally referred to as being normally open, that is, open without a posi- tive voltage applied to activate or close it. Figure 5-1(d) shows a shorthand notation for the n-channel switch model with the variable X applied. This notation represents the fact that a path between S and D exists for X equal to 1 and does not exist for X equal to 0. The symbol for a p-channel transistor is shown in Figure 5-1(e). The positions of the source S and drain D are seen to be interchanged relative to their positions in the n-channel transistor. The voltage applied between the gate G and the source S determines whether a path exists between the drain and source. Note that the nega- tion indicator or bubble appears as a part of the symbol. This is because, in contrast to the behavior of an n-channel transistor, a path exists between S and D in the p-channel transistor for input variable X equal to 0 (at value L) and does not exist for input variable X equal to 1 (at value H). This behavior is represented by the model in Figure 5-1(f), which has a normally closed contact through which a path exists for X equal to 0. No path exists through the contact for X equal to 1. In addition, the short- hand notation of the p-channel switch model with variable X applied is given in Figure 5-1(g). Since a 0 on input X causes a path to exist through the switch and a 1 on X produces no path, the literal shown on the switch is X instead of X. circuitS of SwitcheS A circuit made up of switches that model transistors can be used to design CMOS logic. The circuit implements a function F if there is a path through the circuit for F equal to 1 and no path through the circuit for F equal to 0. A simple circuit of p-channel transistor switch models is shown in Figure 5-2(a). The function G1 implemented by this circuit can be determined by finding the input com- binations for which a path exists through the circuit. In order for the path to ex- ist through G1, both switches must be closed—that is, the path exists for X and Y both 1. This implies that X = 0 and Y = 0. Thus, the function G1 of the circuit is X # Y = X + Y —in other words, the NOR function. In Figure 5-2(b), for function G2, a path exists through the n-channel switch model circuit if either switch is closed— that is, for X = 1 or Y = 1. Thus, the function G2 is X + Y. G1 G2 X: X X: X Y: Y Y: Y (a) (b) FIGURE 5-2 Example of Switch Model Circuits 300 CHAPTER 5 / DIGITAL HARDWARE IMPLEMENTATION In general, switches in series give an AND function and switches in parallel an OR function. (The function for the preceding circuit that models p-channel transis- tors is a NOR function because of the complementation of the variables and the application of DeMorgan’s law.) By using these circuit functions to produce paths in a circuit that attach logic 1 (H) or logic 0 (L) to an output, we can implement a logic function on the output, as discussed next. fully coMpleMentary cMoS circuitS The subfamily of CMOS circuits that we will now consider has the general structure shown in Figure 5-3(a). Except during transi- tions, there is a path to the output of the circuit F either from the power supply +V (logic 1) or from ground (logic 0). Such a circuit is called static CMOS. In order to have a static circuit, the transistors must implement circuits of switches for both function F and function F. In other words, both the 0s and the 1s of the function F must be implemented with paths through circuits. The switch circuit implementing F is constructed using p-channel transistors and connects the circuit output to logic 1. We use p-channel transistors because they conduct logic-1 values better than logic- 0 values. The switch circuit implementing F is constructed using n-channel transis- tors and connects the circuit output to logic 0. Here n-channel transistors are used because they conduct logic-0 values better than logic-1 values. Note that the same input variables enter both the p-channel and n-channel switch circuits. To illustrate a fully complementary circuit, we use transistors corresponding to the circuits G1 and G2 from Figure 5-2(a) and (b) as the p-channel implementation of G and the n-channel implementation of G, respectively, in Figure 5-3(b). A path exists through G1 for X + Y = 1, which means that a path exists in Figure 5-3(b) from logic 1 to the circuit output, making G = 1 for X + Y = 1. This provides the 1s on the output for the function G. A path exists through G2 for X + Y = 1, which means that a path exists in Figure 5-3(b) from logic 0 to the output for X + Y = X + Y = 1. This path makes G = 0 for the complement of X + Y. Thus, the n-channel circuit implements G. This provides the 0s on the output for func- tion G. Since both the 1s and 0s are provided for G, we can say that the circuit output G = X + Y , which is a NOR gate. This is the standard static CMOS implementation for a NOR. Since the NAND is just the dual of the NOR, we can implement the CMOS NAND by simply replacing the + by · in the equations for G1 and G2. In terms of the switch circuit, the dual of switches in series is switches in parallel and vice versa. This duality applies to the transistors that are modeled as well, giving the NAND implementation in Figure 5-3(c). The final gate in Figure 5-3(d) is the implementa- tion of the NOT. Note that all of the circuits in Figure 5-3 implement inverting functions under DeMorgan’s laws. This inversion property is characteristic of CMOS gates. In fact, as we look at a general design procedure, we assume that functions are implemented using F = F. This avoids working directly with p-channel switches, which involve complementing variables. Thus, we will design the n-channel circuit for F and take the dual to get the p-channel circuit for F. For functions more complex than NAND, NOR, and NOT, the resulting circuits are called complex gates. logic 1 +V +V From G1 +V F using p-type +V transistors X !Y F X X G=X+Y X1 X F X2 using X n-type Y Xn transistors Y logic 0 From G2 (a) General structure (b) NOR (c) NAND (d) NOT FIGURE 5-3 Fully Complementary CMOS Gate Structure and Examples 5-1 / The Design Space 301 302 CHAPTER 5 / DIGITAL HARDWARE IMPLEMENTATION Design of complex gates, using a general design procedure, and transmission gates and their applications are covered in the supplement entitled More CMOS Circuit- Level Design appearing on the text Companion Website. Technology Parameters For each specific implementation technology, there are details that differ in their electronic circuit design and circuit parameters. The most important parameters used to characterize an implementation technology follow: Fan-in specifies the number of inputs available on a gate. Fan-out specifies the number of standard loads driven by a gate output. Maxi- mum fan-out for an output specifies the fan-out that the output can drive with- out impairing gate performance. Standard loads may be defined in a variety of ways depending upon the technology. Noise margin is the maximum external noise voltage superimposed on a nor- mal input value that will not cause an undesirable change in the circuit output. Cost for a gate specifies a measure of its contribution to the cost of the inte- grated circuit containing it. Propagation delay is the time required for a change in value of a signal to prop- agate from input to output. The operating speed of a circuit is inversely related to the longest propagation delays through the gates of the circuit. Power consumption (dissipation) is the power drawn from the power supply and consumed by the gate. The power consumed is dissipated as heat, so the power dissipation must be considered in relation to the operating temperature and cooling requirements of the chip. For battery-powered systems such as smart phones, the power consumption of the integrated circuits will determine the battery life of the system. Although all of these parameters are important to the designer, further details on only selected parameters are provided here. Because of their major importance to the design process, propagation delay and circuit timing have already been discussed in Chapters 2 and 4. fan-in For high-speed technologies, fan-in, the number of inputs to a gate, is of- ten restricted on gate primitives to no more than four or five. This is primarily due to electronic considerations related to gate speed. To build gates with large fan-in, interconnected gates with lower fan-in are used during technology mapping. A map- ping for a 7-input NAND gate illustrated in Figure 5-4 is made up of two 4-input NANDs and an inverter. FIGURE 5-4 Implementation of a 7-Input NAND Gate Using NAND Gates with Four or Fewer Inputs 5-1 / The Design Space 303 fan-out One approach to measuring fan-out is the use of a standard load. Each input on a driven gate provides a load on the output of the driving gate which is measured in standard load units. For example, the input to a specific inverter can have a load equal to 1.0 standard load. If a gate drives six such inverters, then the fan-out is equal to 6.0 standard loads. In addition, the output of a gate has a max- imum load that it can drive, called its maximum fan-out. The determination of the maximum fan-out is a function of the particular logic family. Our discussion will be restricted to CMOS, currently the most popular logic family. For CMOS gates, the loading of a gate output by the integrated circuit wiring and the inputs of other gates is modeled as a capacitance. This capacitive loading has no effect on the logic levels, as loading often does for other families. Instead, the load on the output of a gate determines the time required for the output of the gate to change from L to H and from H to L. If the load on the output is increased, then this time, called the transition time, increases. Thus, the maximum fan-out for a gate is the number of standard loads of capacitance that can be driven with the transition time no greater than its maximum allowable value. For example, a gate with a maximum fan-out of 8 standard loads could drive up to 8 inverters that present 1.0 standard load on each of their inputs. Both fan-in and fan-out must be dealt with in the technology-mapping step of the design process. Gates with fan-ins larger than those available for technology mapping can be implemented with multiple gates. Gates with fan-outs that either exceed their maximum allowable fan-out or produce too high a delay need to be replaced with multiple gates in parallel or have buffers added at their outputs. coSt For integrated circuits, the cost of a primitive gate is usually based on the area occupied by the layout cell for the circuit. The layout-cell area is proportional to the size of the transistors and the wiring in the gate layout. Ignoring the wiring area, the area of the gate is proportional to the number of transistors in the gate, which in turn is usually proportional to the gate-input cost. If the actual area of the layout is known, then a normalized value of this area provides a more accurate estimation of cost than gate-input cost. From a system standpoint, as important as the manufacturing cost per primi- tive logic gate is the overall cost to design, verify, and test the integrated circuit. Designing an integrated circuit with millions of transistors and bringing it to market requires a large team of engineers and considerable non-recurring engineering (NRE) costs, one-time costs that will be incurred no matter how many units of the product are manufactured. In contrast to NRE costs, production costs are those costs that are incurred for each unit of the product that is built, based upon the labor, materials, and energy required to manufacture the unit. The NRE costs are amortized over the product volume, which is the total number of units that are manufactured. For a low volume product, the NRE costs of designing the integrated circuit can be much larger than the per-unit production costs. As an alternative to a custom inte- grated circuit, low volume products are often based on programmable logic devices, described in Section 5-2. The NRE costs for the programmable devices are much lower than for custom integrated circuits, as is the time required to bring the device to market. The disadvantages of programmable devices relative to custom integrated circuits include larger propagation delays (lower performance) and higher per-unit 304 CHAPTER 5 / DIGITAL HARDWARE IMPLEMENTATION cost. Consequently, choosing between a fully custom integrated circuit and a pro- grammable device requires understanding the required performance and the esti- mated product volume in order to design a product that will be profitable. 5-2 PROGRAMMABLE IMPLEMENTATION TECHNOLOGIES Thus far, we have introduced implementation technologies that are fixed in the sense that they are fabricated as integrated circuits or by connecting together integrated circuits. In contrast, programmable logic devices (PLDs) are fabricated with struc- tures that implement logic functions and structures that are used to control connec- tions or to store information specifying the actual logic functions implemented. These latter structures require programming, a hardware procedure that determines which functions are implemented. The next four subsections deal with four types of basic programmable logic devices (PLDs): the read-only memory (ROM), the pro- grammable logic array (PLA), the programmable array logic (PAL®) device, and the field programmable gate array (FPGA). Before treating PLDs, we deal with the supporting programming technologies. The oldest of the programming technologies include fuses and anti-fuses. Fuses which are initially CLOSED are selectively “blown out” by a higher than normal voltage to established OPEN connections. The pattern of OPEN and CLOSED fuses establishes the connections defining the logic. Anti-fuses, the opposite of fuses, con- tain a material that is initially nonconducting (OPEN). Anti-fuses are selectively CLOSED by applying a higher-than-normal voltage to provide a pattern of OPEN and CLOSED anti-fuses to define the logic. A third programming technology for controlling connections is mask program- ming, which is done by the semiconductor manufacturer during the last steps of the chip fabrication process. Connections are made or not made in the metal layers serv- ing as conductors in the chip. Depending on the desired function for the chip, the structure of these layers is determined by the fabrication process. All three of the preceding connection technologies are permanent. The devices cannot be reprogrammed, because irreversible physical changes have occurred as a result of device programming. Thus, if the programming is incorrect or needs to be changed, the device must be discarded. The fourth programming technology which is very popular in large VLSI PLDs is a single-bit storage element driving the gate of an MOS n-channel transistor at the programming point. If the stored bit value is a 1, then the transistor is turned ON, and the connection between its source and drain forms a CLOSED path. For the stored bit value equal to 0, the transistor is OFF, and the connection between its source and drain is an OPEN path. Since storage element content can be changed electronically, the device can be easily reprogrammed. But in order to store values, the power supply must be available. Thus, the storage element technology is volatile— that is, the programmed logic is lost in the absence of the power-supply voltage. The fifth and final programming technology we are considering is control of transistor switching. This popular technology is based on storing charge on a floating gate. The latter is located below the regular gate within an MOS transistor and is com- pletely isolated by an insulating dielectric. Stored negative charge on the floating gate 5-2 / Programmable Implementation Technologies 305 makes the transistor impossible to turn ON. The absence of stored negative charge makes it possible for the transistor to turn ON if a HIGH is applied to its regular gate. Since it is possible to add or remove the stored charge, these technologies can permit erasure and reprogramming. Two approaches using control of transistor switching are called erasable and electrically erasable. Programming applies combinations of voltage higher-than- normal power-supply voltages to the transistor. Erasure uses exposure to a strong ultraviolet light source for a specified amount of time. Once this type of chip has been erased, it can be reprogrammed. An electrically erasable device can be erased by a process somewhat similar to the programming process, using voltages higher than the normal power-supply value. A third approach is flash technology, which is very widely used in flash memories. Flash technology is a form of electrically erasable technology that has a variety of erase options, including the erase of stored charge from individ- ual floating gates, all of the floating gates, or specific subsets of floating gates. Some, but not all, programmable-logic technologies have high fan-in gates. In order to show the internal logic diagram for such technologies in a concise form, we use a special gate symbology applicable to array logic. Figure 5-5 shows the conven- tional and array logic symbols for a multiple-input OR gate. Instead of having multiple input lines to the gate, we draw a single line to the gate. The input lines are drawn per- pendicular to this line and are selectively connected to the gate. If an x is present at the intersection of two lines, there is a connection (CLOSED). If an x is not present, then there is no connection (OPEN). In a similar fashion, we can draw the array logic for an AND gate. We next consider three distinct programmable device structures. We will describe each and indicate which of the technologies is typically used in its imple- mentation. These types of PLDs differ in the placement of programmable connec- tions in the AND-OR array. Figure 5-6 shows the locations of the connections for the three types. Programmable read-only memory (PROM) as well as flash memory has a fixed AND array constructed as a decoder and programmable connections for the output OR gates. This forms what appears to be a structure for implementing sum- of-minterm equations for the outputs. It also can be thought of as implementing a truth table (connections to OR gates for 1s and no connections to an OR gates for 0s). Also the ROM can be viewed as a memory in which the outputs provide words of binary data that are selected by the inputs applied to the decoder. The pro- grammable array logic (PAL®) device has a programmable connection AND array and a fixed OR array. The AND gates are programmed to provide the product terms for the Boolean functions, which are logically summed in each OR gate. The most flexible of the three types of PLD is the programmable logic array (PLA), which has programmable connections for both AND and OR arrays. The product terms in the (a) Conventional symbol (b) Array logic symbol FIGURE 5-5 Conventional and Array Logic Symbols for OR Gate 306 CHAPTER 5 / DIGITAL HARDWARE IMPLEMENTATION Fixed Programmable Inputs Programmable Outputs AND array Connections OR array (decoder) (a) Programmable read-only memory (PROM) Programmable Programmable Fixed Inputs Outputs Connections AND array OR array (b) Programmable array logic (PAL) device Programmable Programmable Programmable Programmable Inputs Outputs Connections AND array Connections OR array (c) Programmable logic array (PLA) device FIGURE 5-6 Basic Configuration of Three PLDs AND array may be shared by any OR gates to provide the required sum-of-products implementations. The names PLA and PAL® emerged for devices from different vendors during the development of PLDs. Read-Only Memory A read-only memory (ROM) is essentially a device in which “permanent” binary information is stored. The information must be specified by the designer and is then embedded into the ROM to form the required interconnection or electronic device pattern. Once the pattern is established, it stays within the ROM even when power is turned off and on again—that is, ROM is nonvolatile. A block diagram of a ROM device is shown in Figure 5-7(a). There are k inputs and n outputs. The inputs provide the address for the memory, and the outputs give the data bits of the stored word that is selected by the address. The number of words in a ROM device is determined from the fact that k address input lines can specify 2k words. Note that ROM does not have data inputs, because it does not have a write operation. Integrated circuit ROM chips have one or more enable inputs and come with three-state outputs to facilitate the construction of large arrays of ROM. Permanent and reprogrammable ROMs are also included in VLSI circuits including processors. Consider, for example, a 32 * 8 ROM. The unit consists of 32 words of 8 bits each. There are five input lines that form the binary numbers from 0 through 31 for the address. Figure 5-7(b) shows the internal logic construction of this ROM. The five inputs are decoded into 32 distinct outputs by means of a 5–to–32-line decoder. 5-2 / Programmable Implementation Technologies 307 k inputs (address) 2k x n ROM n outputs (data) (a) 0 1 x x x x I0 2 I1 3 5–to–32. I2. decoder. I3 28 I4 29 30 31 A7 A6 A5 A4 A3 A2 A1 A0 1 0 0 1 0 0 1 1 (b) FIGURE 5-7 Block Diagram and Internal Logic of a ROM Each output of the decoder represents a memory address. The 32 outputs are con- nected through programmable connections to each of the eight OR gates. The dia- gram uses the array logic convention used in complex circuits. (See Figure 5-5.) Each OR gate must be considered as having 32 inputs. Each output of the decoder is con- nected by a programming technology to one of the inputs of each OR gate. The ROM in Figure 5-7(b) is programmed with the word 10010011 in memory address 1. Since each OR gate has 32 internal programmable connections, and since there are eight OR gates, the ROM contains 32 * 8 = 256 programmable connections. In general, a 2k * n ROM will have an internal k–to–2k–line decoder and n OR gates. Each OR gate has 2k inputs, which are connected through programmable connec- tions to each of the outputs of the decoder. Depending on the programming technology and approaches, read-only memo- ries have different names: 1. ROM—mask programmed, 2. PROM—fuse or anti-fuse programmed, 3. EPROM—erasable floating gate programmed, 308 CHAPTER 5 / DIGITAL HARDWARE IMPLEMENTATION 4. EEPROM or E2 PROM—electrically erasable floating gate programmed, and 5. FLASH Memory—electrically erasable floating gate with multiple erasure and programming modes. The choice of programming technology depends on many factors, including the num- ber of identical ROMs to be produced, the desired permanence of the programming, the desire for reprogrammability, and the desired performance in terms of delay. ROM programming typically uses programming software that isolates the user from the details. A ROM stores computer programs, in which case the binary code produced by the usual programming tools such as compilers and assemblers is placed in the ROM. Otherwise, it can be programmed by tools that accept input, such as truth tables, Boolean equations, and hardware description languages. It can also, as in the case of FLASH memory, accept binary patterns representing, for example, photographs taken by a digital camera. In all of these cases, the input is transformed to a pattern of OPEN and CLOSED connections to the OR gates needed by the programming technology. Programmable Logic Array The programmable logic array (PLA) is similar in concept to the ROM, except that the PLA does not provide full decoding of the variables and does not generate all the minterms. The decoder is replaced by an array of AND gates that can be pro- grammed to generate product terms of the input variables. The product terms are then selectively connected to OR gates to provide the sum of products for the required Boolean functions. The internal logic of a PLA with three inputs and two outputs is shown in Figure 5-8. Such a circuit is too small to be cost effective but is presented here to demonstrate the typical logic configuration of a PLA. The diagram uses the array logic graphics symbols for complex circuits. Each input goes through a buffer and an inverter, represented in the diagram by a composite graphics symbol that has both the true and the complement outputs. Programmable connections run from each input and its complement to the inputs of each AND gate, as indicated by the inter- sections between the vertical and horizontal lines. The outputs of the AND gates have programmable connections to the inputs of each OR gate. The output of the OR gate goes to an XOR gate, where the other input can be programmed to receive a signal equal to either logic 1 or logic 0. The output is inverted when the XOR input is connected to 1 (since X ⊕ 1 = X ). The output does not change when the XOR input is connected to 0 (since X ⊕ 0 = X ). The particular Boolean functions imple- mented in the PLA of the figure are F1 = AB + AC + ABC F2 = AC + BC The product term is determined by the CLOSED connections from the input or their complements to the AND gates. The output of an OR gate gives the logic sum of the selected product terms as determined by the CLOSED connections 5-2 / Programmable Implementation Technologies 309 A B C X X 1 X AB X X 2 X X AC X Closed Open X X 3 X BC X X X 4 X ABC C C B B A A X 0 X 1 F1 F2 FIGURE 5-8 PLA with Three Inputs, Four Product Terms, and Two Outputs from the AND gate outputs to the OR gate inputs. The output may be comple- mented or left in its true form, depending on the programming of the connection associated with the XOR gate. Due to this structure, the PLA implements sum-of- products or complemented sum-of-products functions. Product terms can be shared between the functions, since the same AND gate can be connected to multiple OR gates. The size of a PLA is specified by the number of inputs, the number of product terms, and the number of outputs. For n inputs, k product terms, and m outputs, the internal logic of the PLA consists of n buffer-inverter gates, k AND gates, m OR gates, and m XOR gates. There are 2n * k programmable connections between the inputs and the AND array, k * m programmable connections between the AND and OR arrays, and m programmable connections associated with the XOR gates. The information needed to program a PLA are the CLOSED connections from true or complemented inputs, the CLOSED connections between AND gates and OR gates, and whether or not the sum of products form is inverted or not. As with the ROM, a variety of input forms may be acceptable to the tools that generate this information. Here we focus on implementing logic, so we consider only inverted or noninverted sum-of-products equations as the user input. 310 CHAPTER 5 / DIGITAL HARDWARE IMPLEMENTATION coMBinational circuit iMpleMentation uSing a pla A careful investigation must be undertaken in order to reduce the number of distinct product terms, so that the size of the PLA can be minimized. Fewer product terms can be achieved by simplifying the Boolean function to a minimum number of terms. The number of literals in a term is less important, since all the input variables are available to each term anyway. It is wise, however, to avoid extra literals, as these may cause problems in testing the circuit and may reduce the speed of the circuit. An important factor in obtaining a minimum number of product terms is the sharing of terms between the outputs. Both the true and complement forms of each function should be simplified to see which one can be expressed with fewer product terms and which one provides prod- uct terms that are common to other functions. So, in terms of preparing equations to be implemented in a PLA, multiple-output, two-level function optimization is the approach needed and often incorporated in PLA design software. While we have not covered this process formally, we can informally illustrate it by using K-maps. This process is illustrated in Example 5-1. EXAMPLE 5-1 Implementing a Combinational Circuit Using a PLA Implement the following two Boolean functions with a PLA: F1(A, B, C) = Σm(3, 5, 6, 7) F2(A, B, C) = Σm(1, 2, 3, 7) In Figure 5-9, using K-maps, two-level single-output optimization is applied to functions F1 and F2 with the resulting prime implicants used appearing in black. The resulting equations appear directly below the two maps. Product term BC can be shared between the two functions, so a total of five product terms are required. By considering the complement of F1 and the true form of F2, one discovers that there are two nonprime implicants, shown as blue squares, that can be used in both func- tions. The solutions that share these terms are given on the next line below the maps. Using the implicants in blue, the solution obtained is: F1 = A BC + ABC + B C F2 = A BC + ABC + BC F1 B F2 BC B BC A 00 01 11 10 A 00 01 11 10 0 0 0 1 0 0 0 1 1 1 A 1 0 1 1 1 A 1 0 0 1 0 C C F1 ! AB " AC " BC F2 ! AB " AC " BC F1 ! ABC " ABC " BC F2! ABC " ABC " BC FIGURE 5-9 K-Maps and Expressions for PLA Example 5-1 5-2 / Programmable Implementation Technologies 311 Because of the bar over all of F1, a 1 must be applied to the control input of the output XOR gate. This solution requires only four AND gates. It requires both the use of an output inversion and multiple-output, two-level optimization to achieve this minimum-cost solution. The two implicants that are shared would nor- mally result from the process of generating prime implicants for multiple-output optimization. Programmable Array Logic Devices The programmable array logic (PAL®) device is a PLD with a fixed OR array and a programmable AND array. Because only the AND gates are programmable and cannot be shared by multiple functions, design for the PAL device is easier, but is not as flexible as that for the PLA. Figure 5-10 presents the logic configuration of a typi- cal programmable array logic device. The particular device shown has four inputs and four outputs. Each input has a buffer-inverter gate, and each output is generated by a fixed OR gate. The device has four sections, each composed of a three-wide AND-OR array, meaning that there are three programmable AND gates in each section. Each AND gate has ten programmable input connections, indicated in the diagram by ten vertical lines intersecting each horizontal line. The horizontal line symbolizes the multiple-input configuration of an AND gate. One of the outputs shown is connected to a buffer-inverter gate and then fed back into the inputs of the AND gates through programmed connections. This is often done with all device out- puts. Since the number of AND terms is not large, these paths permit the output of a PAL AND-OR circuit to be used as inputs to other PAL AND-OR circuits. This pro- vides the capability to implement a limited variety of multiple-level circuits, which among other advantages increases the number of AND gates available for a given function. Commercial PAL devices contain more gates than the one shown in Figure 5-10. A small PAL integrated circuit may have up to eight inputs, eight outputs, and eight sections, each consisting of an eight-wide AND-OR array. Each PAL device output is driven by a three-state buffer and also serves as an input. These input/outputs can be programmed to be an input only, an output only, or bidirectional with a variable signal driving the three-state buffer enable signal. Flip-flops are often included in a PAL device between the array and the three- state buffer at the outputs. Since each output is fed back as an input through a buffer-inverter gate into the AND programmed array, a sequential circuit can be easily implemented. coMBinational circuit iMpleMentation with a pal Device In designing with a PAL device, because of the inability to share AND gates within a basic circuit, single- output, two-level optimization applies. But because of the connections from out- puts to inputs, multilevel functions are easy to implement, so limited multilevel optimization and the sharing of sum-of-products forms and the complement of sum- of-products forms applies also. Unlike the arrangement in the PLA, a product term cannot be shared among two or more OR gates. Manual execution of optimization for a PAL device is illustrated in Example 5-2. 312 CHAPTER 5 / DIGITAL HARDWARE IMPLEMENTATION AND gates inputs Product term A A B B C C D D WW 1 X X X 2 X XX X W 3 X A 4 X 5 X X X X 6 X All Closed B (always ! 0) 7 XX 8 X X Y 9 X X C 10 X 11 X X X Z 12 X X XX X Closed D " Open A A B B C C D D WW FIGURE 5-10 PAL Device Structure with Connection Map for PAL® Device for Example 5-2 EXAMPLE 5-2 Implementing a Combinational Circuit Using a PAL As an example of a PAL device incorporated into the design of a combinational cir- cuit, consider the following Boolean functions, given in sum-of-minterms form: W(A, B, C, D) = g m(2, 12, 13) X(A, B, C, D) = g m(7, 8, 9, 10, 11, 12, 13, 14, 15) Y(A, B, C, D) = g m(0, 2, 3, 4, 5, 6, 7, 8, 10, 11, 15) Z(A, B, C, D) = g m(1, 2, 8, 12, 13) Simplifying the four functions to a minimum number of terms results in the follow- ing Boolean functions: W = ABC + A BCD X = A + BCD 5-2 / Programmable Implementation Technologies 313 Y = AB + CD + B D Z = ABC + A BCD + A C D + A B CD = W + ACD + ABCD Note that the all four equations are the result of two-level optimization. But the function for Z has four product terms. The logical sum of two of these terms is equal to W. Thus, by using W, it is possible to reduce the number of terms for Z from four to three, so that the equations above can fit into the PAL device in Figure 5-10. Even if W were not present as an output, the PAL device structure would permit the factor W to be designed and used to implement Z. In this case, however, the output at W would not be useful for implementing any other function but W. Field Programmable Gate Array The most common form of programmable logic device currently available is the field programmable gate array (FPGA). While FPGA devices from different manufactur- ers have a wide variety of features, most FPGA devices have three programmable elements in common: programmable logic blocks, programmable interconnect, and programmable input/output pins, as illustrated in Figure 5-11. In addition to these three common elements, many FPGAs have specialized blocks of dedicated logic such as memories, arithmetic components, and even microprocessors. This section focuses on the basic features of FPGAs, which should provide sufficient background FIGURE 5-11 The Three Programmable Features of Most FPGA Devices: Logic Blocks, Interconnect, and Input/Output 314 CHAPTER 5 / DIGITAL HARDWARE IMPLEMENTATION for readers who are interested in a particular FPGA to understand the data manuals from the manufacturer. The advantage of the FPGA relative to other programmable logic families is the availability of configurable combinational logic and flip-flops, and ease of re-configuration. Most FPGA families are configured using static random access memory (SRAM), which will be more fully explained in Chapter 7. Other technolo- gies for configuring FPGAs include Flash memory and anti-fuses (similar to the PROM described earlier in this section). FPGAs that use SRAM for their configura- tion are volatile, meaning that they lose their configuration when power is removed and the configuration must be loaded whenever power is re-applied. Regardless of the configuration technology, each configuration bit in the FPGA controls the behav- ior of a programmable element. Configuring the FPGA requires setting all of the configuration bits for the programmable logic blocks, routing, and I/O. The first programmable feature common to many FPGAs that we will describe is the programmable logic block. A programmable logic block contains combina- tional and sequential logic that can be configured to implement many different func- tions. Many FPGA families have programmable logic blocks based upon a look-up table (LUT) to implement combinational functions. A look-up table is a 2k * 1 memory that implements the truth table for a function of k variables, referred to as a k-LUT. Figure 5-12(a) illustrates a 2-input LUT. Any of the sixteen possible Boolean functions of two variables can be implemented by setting the SRAM configuration bits in the figure to the truth table for the desired function of a and b, as described in Section 3-7. To implement functions of more than k variables, several k-LUTs can be connected together with multiplexers, as shown in Figure 5-12(b). Combining the smaller LUTs with a multiplexer uses Shannon’s expansion theorem, which states that any Boolean function f (x1, x2, x3, c , xk) can be expressed as f (x1, x2, x3, c , xk) = xk # f (x1, x2, x3, c , 1) + xk # f (x1, x2, x3, c , 0) SRAM 2-LUT bit f(a, b, 0) SRAM a f(a, b, c) bit f(a, b) b SRAM bit 2-LUT SRAM f(a, b, 1) bit a b c (a) (b) FIGURE 5-12 (a) A 2-Input Look-Up Table, (b) Implementing a 3-Input Function with Two 2-LUTs and a Multiplexer 5-2 / Programmable Implementation Technologies 315 In Figure 5-12(b), the Boolean functi