(The Morgan Kaufmann Series in Computer Architecture and Design) David A. Patterson, John L. Hennessy - Computer Organization and Design RISC-V Edition_ The Hardware Software Interface-Morgan Kaufmann-24-101-39-41.pdf

Full Transcript

40 Chapter 1 Computer Abstractions and Technology Elaboration: Although clock cycle time has traditionally been fixed, to save energy or temporarily boost performance, today’s processors can vary their clock rates, so we would...

40 Chapter 1 Computer Abstractions and Technology Elaboration: Although clock cycle time has traditionally been fixed, to save energy or temporarily boost performance, today’s processors can vary their clock rates, so we would need to use the average clock rate for a program. For example, the Intel Core i7 will temporarily increase clock rate by about 10% until the chip gets too warm. Intel calls this Turbo mode. Check A given application written in Java runs 15 seconds on a desktop processor. A new Yourself Java compiler is released that requires only 0.6 as many instructions as the old compiler. Unfortunately, it increases the CPI by 1.1. How fast can we expect the application to run using this new compiler? Pick the right answer from the three choices below: 15 0. 6 a. 8.2 sec 1. 1 b. 15 0. 6 1. 1 9.9 sec 1. 5 1. 1 c. 27.5 sec 0. 6 1.7 The Power Wall Figure 1.16 shows the increase in clock rate and power of nine generations of Intel microprocessors over 36 years. Both clock rate and power increased rapidly for decades and then flattened or dropped off recently. The reason they grew together is that they are correlated, and the reason for their recent slowing is that we have run into the practical power limit for cooling commodity microprocessors. 10,000 120 3600 2667 3300 3500 3500 3600 2000 100 1000 103 Clock Rate (MHz) 95 91 95 Clock Rate 200 Power (watts) 87 80 75.3 84 100 66 60 25 12.5 16 Power 40 10 10.1 29.1 20 3.3 4.1 4.9 1 0 Pentium Pro (1997) Pentium (1993) 80286 (1982) 80386 (1985) 80486 (1989) Pentium 4 Willamette (2001) Pentium 4 Prescott (2004) Core 2 Kentsfield (2007) Core i5 Coffee Lake (2018) Core i5 Clarkdale (2010) Core i5 Haswell (2013) Core i5 Skylake (2016) FIGURE 1.16 Clock rate and power for Intel x86 microprocessors over nine generations and 36 years. The Pentium 4 made a dramatic jump in clock rate and power but less so in performance. The Prescott thermal problems led to the abandonment of the Pentium 4 line. The Core 2 line reverts to a simpler pipeline with lower clock rates and multiple processors per chip. The Core i5 pipelines follow in its footsteps. 1.7 The Power Wall 41 Although power provides a limit to what we can cool, in the post-PC era the really valuable resource is energy. Battery life can trump performance in the personal mobile device, and the architects of warehouse scale computers try to reduce the costs of powering and cooling 50,000 servers as the costs are high at this scale. Just as measuring time in seconds is a safer evaluation of program performance than a rate like MIPS (see Section 1.10), the energy metric joules is a better measure than a power rate like watts, which is just joules/second. The dominant technology for integrated circuits is called CMOS (complementary metal oxide semiconductor). For CMOS, the primary source of energy consumption is so-called dynamic energy—that is, energy that is consumed when transistors switch states from 0 to 1 and vice versa. The dynamic energy depends on the capacitive loading of each transistor and the voltage applied: Energy ∝ Capacitive load Voltage2 This equation is the energy of a pulse during the logic transition of 0 → 1 → 0 or 1 → 0 → 1. The energy of a single transition is then Energy ∝ 1 / 2 Capacitive load Voltage 2 The power required per transistor is just the product of energy of a transition and the frequency of transitions: Power ∝ 1 / 2 Capacitive load Voltage2 Frequency switched Frequency switched is a function of the clock rate. The capacitive load per transistor is a function of both the number of transistors connected to an output (called the fanout) and the technology, which determines the capacitance of both wires and transistors. With regard to Figure 1.16, how could clock rates grow by a factor of 1000 while power increased by only a factor of 30? Energy and thus power can be reduced by lowering the voltage, which occurred with each new generation of technology, and power is a function of the voltage squared. Typically, the voltage was reduced about 15% per generation. In 20 years, voltages have gone from 5 V to 1 V, which is why the increase in power is only 30 times. Relative Power Suppose we developed a new, simpler processor that has 85% of the capacitive load of the more complex older processor. Further, assume that it can adjust EXAMPLE voltage so that it can reduce voltage 15% compared to processor B, which results in a 15% shrink in frequency. What is the impact on dynamic power? 42 Chapter 1 Computer Abstractions and Technology Powernew Capacitive load 0.85 Voltage 0.85 2 Frequency switched 0.85 ANSWER Powerold Capacitive load Voltage2 Frequency switched Thus the power ratio is 0.854  0.52 Hence, the new processor uses about half the power of the old processor. The modern problem is that further lowering of the voltage appears to make the transistors too leaky, like water faucets that cannot be completely shut off. Even today about 40% of the power consumption in server chips is due to leakage. If transistors started leaking more, the whole process could become unwieldy. To try to address the power problem, designers have already attached large devices to increase cooling, and they turn off parts of the chip that are not used in a given clock cycle. Although there are many more expensive ways to cool chips and thereby raise their power to, say, 300 watts, these techniques are generally too costly for personal computers and even servers, not to mention personal mobile devices. Since computer designers slammed into a power wall, they needed a new way forward. They chose a different path from the way they designed microprocessors for their first 30 years. Elaboration: Although dynamic energy is the primary source of energy consumption in CMOS, static energy consumption occurs because of leakage current that flows even when a transistor is off. In servers, leakage is typically responsible for 40% of the energy consumption. Thus, increasing the number of transistors increases power dissipation, even if the transistors are always off. A variety of design techniques and technology innovations are being deployed to control leakage, but it’s hard to lower voltage further. Elaboration: Power is a challenge for integrated circuits for two reasons. First, power must be brought in and distributed around the chip; modern microprocessors use hundreds of pins just for power and ground! Similarly, multiple levels of chip interconnection are used solely for power and ground distribution to portions of the chip. Second, power is dissipated as heat and must be removed. Server chips can burn more than 100 watts, and cooling the chip and the surrounding system is a major expense in warehouse scale computers (see Chapter 6).

Use Quizgecko on...
Browser
Browser