Summary

This document presents a lecture on computer architecture, focusing on processor technology, historical development, and key components. The document discusses topics like CPU architecture, technologies, and innovations in microprocessors.

Full Transcript

Computer Architecture Part II-D: Survey of Processor Architecture

CPU processors: What's the difference?

Areas of Development
The following technologies can be improved in CPU design:
- Internal and external clock frequency
- Clock doubling
- System bus
- Internal and external data width
- Internal cache
- Instruction set
- Material used for the die
- Voltage
End result: enhanced speed of the CPU and of the system in general

Areas of Development: Clock Frequency
- Internal clock frequency: speed of data processing inside the CPU
- External clock frequency: speed of data transfer to and from the CPU via the system bus

System Bus
- The conduit for moving data between the processor and other components such as main memory
- Currently, Intel Pentium 4 processors already run at 3.6 GHz, and the bus can be 400, 533, or 800 MHz
- AMD: the Athlon XP processor supports a 400 MHz bus for model 3200+

The GHz Race
- June 1999: API (Alpha Processor Inc.) and Samsung demonstrated a 1 GHz chip (target: mid 2000); the main consumer is Compaq
- March 2000: AMD released the 1 GHz Athlon; within days Intel released the 1 GHz PIII
- 2001:
  - AMD introduces the 1.4 GHz Athlon in June 2001
  - Intel introduces the 1.8 GHz P4 processor in July 2001
- 2002: Both AMD and Intel are using 0.13 micron process technology
  - AMD introduces the Athlon XP 2200+ in June 2002
  - Intel introduces the 2.53 GHz Pentium 4 in May 2002 and the 2 GHz mobile Pentium 4 in June 2002
- 2003:
  - Pentium 4: 3.2 GHz, 800 MHz bus
  - AMD 3200+: 2.2 GHz, 400 MHz bus
- 2004:
  - Pentium 4: 3.6 GHz, 800 MHz bus
  - AMD: same as 2003; now concentrating on 64-bit CPUs

Number of Transistors
[Chart: number of transistors (in thousands) by year, 1984 to 2001]
- 8086/8088: 22,000
- 286: 128,000
- 386DX/386SX: 250,000
- 486SX/486DX, 486DX2/486DX4: 1,200,000
- Pentium, Cyrix, AMD K5, MMX: 3,100,000
- AMD K6: 8,800,000
- Athlon 1.4 GHz: 37,000,000
- Pentium 4: 42,000,000

The Microprocessor War: Intel vs. AMD
Intel is constantly being challenged by its main competitor, AMD.
Areas of competition:
- Processor speed
- Innovation
- Price

Micron Technology
- A micron is one millionth of a meter
- The tiny elements that make up a transistor on a chip are measured in microns
- Smaller feature sizes mean smaller chips or, at the same chip size, more transistors and therefore more processing power
- Currently, both the P4 and the Athlon use 0.09 micron (90 nm) technology
- By comparison, a human hair is about 100 microns wide

Micron Technology: Examples
Processor              Year   Microns
8080                   1974   6
8086                   1978   3
486 (Intel)            1989   1
486 (AMD)              1990   0.8
Pentium Classic        1993   0.8
IDT WinChip            1997   0.35
Pentium MMX            1997   0.25
AMD K6-II              1997   0.25
PIII/Athlon/Itanium    2001   0.18
P4/Athlon XP           2002   0.13
(not listed)           2003   0.13/0.09
(not listed)           2004   0.09

Copper-based Microprocessors
- The use of aluminum limits how far chips can be shrunk and feature sizes reduced
- Copper is a good choice because it is a better conductor of electricity, consumes less energy, and takes up less space than aluminum
- Using copper allowed processors to boost speeds into the GHz range

Examples of Copper Processors
- IBM pioneered the use of copper instead of aluminum for microprocessors (September 1, 1998)
- IBM PowerPC 740/750
- Apple's iBook, launched July 22, 1999, uses an optimized version of the PowerPC 740/750
- The P4 and the Athlon both use copper technology

PC on a Chip
Another direction chip developers are taking is the PC-on-a-chip. The chip integrates a number of core electronic features, including the main processor, graphics, and audio.
Result: The chip will replace the dozen or so separate chips (memory, FPU, chips for graphics, video, etc.) currently found in a PC.

Industry Updates on PC-on-a-Chip
- National Semiconductor launched the Geode family of chips, which integrates most electronic functions for an information appliance such as a TV set-top box
- The first design, the Geode SC1400, is geared toward an Internet-centric TV set-top box equipped for digital video
- Intel StrongARM is designed for fast time-to-market development of handheld and palm-size devices and of wireless and Internet appliances
- AMD also produces chip systems

Impact of PC-on-a-Chip
- Desktops will be smaller and quieter
- Notebook batteries will last longer because of the chip's low power drain
- Proliferation of information appliances

Areas of Development: Voltage
- One of the newest CPU technologies is the use of continually thinner wires inside the chip
- Thinner wires allow the CPU to operate at a lower voltage
- Result: a smaller CPU that generates less heat and can operate at higher speeds

Casing
- Socket 7: The receptacle on the motherboard that holds a Pentium CPU chip. It is also used to hold Pentium clones such as the 5x86, 6x86, K5, and K6.
- ZIF: Zero Insertion Force socket. A type of socket designed for easy insertion of chips that have a high density of pins. The chip is dropped into the socket's holes, and a lever is then pulled down to lock the pins in place.

Casing
- Slot 1 cartridge: A receptacle on the motherboard that holds an Intel Single Edge Contact (SEC) cartridge. The cartridge contains up to two CPUs and an L2 cache, which runs at half the speed of the CPU. It plugs into the slot with a 242-pin edge connector.
- SEC (Single Edge Contact): Starting with the Pentium II, an Intel hardware module that contains the CPU and an external L2 cache. It plugs into a receptacle on the motherboard (Slot 1, Slot 2, etc.) that is more similar to a bus slot than to an individual chip socket.
[Figure: A Pentium II mounted on a Slot 1]

Casing: Slot 2 Cartridge
- An enhanced Slot 1 that uses a 330-pin Single Edge Contact (SEC) cartridge holding up to four processors
- The L2 cache runs at full processor speed
- Intel's Pentium II Xeon chips were the first to use Slot 2

[Figure: AMD on a Slot A]

Slot A
- A receptacle on the motherboard for a K7 CPU chip from AMD
- Physically similar to Slot 1, but with different electrical requirements

FC-PPGA (Flip-Chip)
[Figure: traditional wiring compared with Flip-Chip (IBM)]

Advantages of FC-PPGA (Flip-Chip)
- Greater number of I/O pins available
- Smaller die, so more dies per wafer
- Shorter electrical connections
- Better manufacturing efficiency

LGA/BGA
[Figures: bottom view of an LGA-based CPU; LGA Socket 775]

Advantages of LGA/BGA
- Lower voltage used (less distance traveled, reduced signal loss)
- Less heat dissipation

Chip Sets
- A set of intelligent controller chips on the motherboard that control the buses around the CPU
- Chip sets enable higher speeds on one or more buses and the use of new facilities (RAM types, buses, improved EIDE, etc.)
- Suppliers include Intel, SIS, Opti, Via, and ALi

Areas of Development: Clock Doubling
- Problem with processors at high speeds (400 MHz and up): the other electronic components need to keep up with the pace
- Solution: split the clock frequency into two
  - High internal clock: governs the pace of the CPU and all internal processing
  - Low external clock: governs the pace of the system bus (i.e., all data transfer to and from the CPU via the system bus)
- The 486DX2 25/50 was the first to employ clock doubling

Clock Doubling: What Happens?
If the motherboard crystal works at 25 MHz, the CPU receives a signal every 40 ns. Internally, this frequency is doubled to 50 MHz, so the clock ticks every 20 ns inside the CPU.
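
The timing arithmetic in the clock doubling example can be checked with a short sketch. The function names `period_ns` and `internal_clock_mhz` are illustrative assumptions and do not come from the lecture; the only values taken from the slides are the 25 MHz crystal and the doubling factor of 2.

```python
def period_ns(frequency_mhz: float) -> float:
    """Return the clock period in nanoseconds for a frequency given in MHz."""
    # 1 MHz is 1e6 cycles per second, so one cycle lasts 1000 / f(MHz) nanoseconds.
    return 1000.0 / frequency_mhz


def internal_clock_mhz(external_mhz: float, multiplier: float) -> float:
    """Return the internal CPU clock for an external clock and a multiplier."""
    return external_mhz * multiplier


external = 25.0                               # motherboard crystal in MHz (from the slide)
internal = internal_clock_mhz(external, 2)    # clock doubling, as in the 486DX2 25/50

print(f"external: {external} MHz, one tick every {period_ns(external)} ns")
print(f"internal: {internal} MHz, one tick every {period_ns(internal)} ns")
```

With a multiplier of 2 this gives the 40 ns and 20 ns periods quoted on the slide; other multipliers can be passed in the same way.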
The doubled internal frequency governs all internal transactions.

Overclocking
- Going beyond the recommended clock frequency settings
- Three methods of overclocking:
  - Change the system bus frequency
  - Change the CPU frequency multiplier
  - Change both the system bus frequency and the CPU frequency multiplier
- Some CPUs have locked frequencies (e.g., Intel's 1 GHz CPU)

Overclocking: How to...
- To overclock a CPU, jumpers located on the motherboard have to be set
- On some motherboards, such as the ASUS TX97 (pictured on the slide), the instructions for setting the jumpers are visible on the board
- Newer motherboards support jumperless settings

Overclocking Issues
- Heat: can the CPU dissipate it?
- The L2 cache RAM of Pentium II cartridges: how fast can it work?
- RAM speed: can it keep up with the system bus?
- Will the software still work?

Cooling
- The more a CPU is overclocked, the hotter it gets
- Cooling fans and other mechanisms were developed to keep the CPU from overheating
- The bigger the fan and heat sink, the better: the CPU will operate more reliably, will have a longer life span, and can possibly be overclocked

Areas of Development: Data Width
- Internal data width: how many data bits can the CPU process simultaneously?
- External data width: how many data bits can the CPU receive simultaneously for processing?

Areas of Development: Cache
- Works as a buffer between the CPU and memory
- Two types:
  - Internal
  - External

Areas of Development: Cache Examples
- Internal
  - Level 1
  - Level 2
- External
  - Level 3

L2 Cache Out of Chip
Intel used to separate the L2 cache from the CPU because combining them in one chip was too costly.

Why L2 Cache Was Expensive
- The CPU and the L2 cache are separate units inside the package
- Result: a larger chip that requires a larger socket

Areas of Development: Instruction Set
- Can the instruction set be simplified to speed up program processing?
- Can it be improved?
- Can instructions be added to increase the power of the CPU?
About Multimedia
- Multimedia applications require geometric transformation
  - Re-computation of the location and size of an image to determine its new position
  - Deals with floating-point numbers
- The FPU (in the processor) handles all floating-point computations
- Drawing landscapes (e.g., in Quake) involves a lot of computation, and the CPU may not handle it as fast as the player expects

About the FP Registers
Pentium-class processors have 8 FP registers, each 80 bits long. So there is room for 8 numbers of 80-bit length, or 16 numbers of 32-bit length each.

How Multimedia Is Handled
- Speed up the CPU, which results in faster FPU performance
- Improve the CPU's FPU by adding more pipelines
- Add new instructions for more effective 3D performance
- Use 3D accelerated graphics cards

Multimedia Innovations in CPUs
- MMX
- 3DNow!
- SSE

MMX
- Introduced in 1997 in the Pentium MMX processor
- Consists of 57 new instructions for multimedia processing
- Also introduced SIMD (Single Instruction Multiple Data) instructions: a technique in which more than one integer can be processed simultaneously
- Problems:
  - Only works with integers
  - The processor can work with either MMX or the FPU, not both simultaneously, because they share the same registers

3DNow!
- Introduced by AMD in the summer of 1998 in the K6-2
- Characteristics:
  - 21 new instructions
  - SIMD instructions, which enable handling of more data with just one instruction
  - Improved handling of numbers
- Successful:
  - Integrated in Windows, games, and hardware drivers
  - It does not use the same registers
  - Registers are 80 bits wide and can hold two 32-bit numbers simultaneously

SSE
Introduced in the Pentium III (Katmai) 500 MHz as Intel's response to 3DNow!
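
The SIMD idea behind MMX, and carried forward by 3DNow! and SSE, can be illustrated in software. The sketch below is only an emulation under stated assumptions: it packs four 16-bit unsigned integers into one 64-bit value and adds them lane by lane with wraparound, which is what a single packed-add instruction would do in hardware. The names `pack`, `unpack`, and `packed_add` and the choice of 16-bit lanes are illustrative, not taken from the lecture.

```python
LANE_BITS = 16                      # assumed lane width for this illustration
LANES = 4                           # 4 lanes x 16 bits = one 64-bit packed value
LANE_MASK = (1 << LANE_BITS) - 1


def pack(values):
    """Pack four 16-bit unsigned integers into one 64-bit integer (lane 0 is lowest)."""
    word = 0
    for i, value in enumerate(values):
        word |= (value & LANE_MASK) << (i * LANE_BITS)
    return word


def unpack(word):
    """Split a 64-bit packed value back into its four 16-bit lanes."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def packed_add(a, b):
    """Add corresponding lanes independently; each lane wraps and no carry crosses lanes."""
    result = 0
    for i in range(LANES):
        shift = i * LANE_BITS
        lane_sum = ((a >> shift) & LANE_MASK) + ((b >> shift) & LANE_MASK)
        result |= (lane_sum & LANE_MASK) << shift
    return result


a = pack([1, 2, 3, 65535])
b = pack([10, 20, 30, 1])
print(unpack(packed_add(a, b)))     # the last lane wraps around to 0
```

The Python loop visits the lanes one at a time; the point of SIMD hardware is that all lanes are processed by one instruction, which is why packing more numbers into a register can speed up multimedia code.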
Characteristics of SSE
- Has 8 new 128-bit registers, each of which can hold four 32-bit numbers
- Streaming SIMD Extensions (SSE)
- 50 new instructions that enable simultaneous, advanced calculations on several floating-point numbers with a single instruction
- New media instructions designed for coding and decoding MPEGs
- Improved interaction between L2 and RAM

Problems with SSE
- The pipelines can only handle two 32-bit numbers at a time
- To take advantage of the 128-bit registers, the FPU pipeline should have been doubled, but this was not done (it would have pushed back the release date of Katmai)
- Potentially, though, it can speed up 3D graphics, since the registers can now hold four 32-bit numbers at a time

SSE Enhancements
- SSE2
  - Started in the Pentium 4
  - Has 144 new instructions (since SSE)
  - Data width is now 64 bits (double-precision floating point)
- SSE3
  - Includes 13 additional SIMD instructions over SSE2
  - The new instructions are primarily designed to improve thread synchronization and specific application areas such as media and gaming

Future Trends
- In December 1997, the Semiconductor Industry Association (SIA) provided details about future requirements of microprocessors
- The roadmap reinforces Moore's Law

1999 SIA Roadmap for Microprocessors
                                   1999    2000    2001    2002    2005    2008
MPU gate length (microns)          0.14    0.12    0.10    0.09    0.065   0.045
Transistors per sq. cm (millions)  6.6     9.4     13      18      44      109
Die size (sq. mm)                  340     340     340     340     408     468
MHz                                1250    1486    1767    2100    3500    6000
Packaging (pins/balls)             740     821     912     1012    1384    1893
Wafer size (mm)                    200     200     300     300     300     300
Source: International Technology Roadmap for Semiconductors

Intel Corporation
- Had (and still has) the biggest impact on microprocessor technology
- Currently leads the way in microprocessors
- Main line of business is CPUs, but it also has other hardware products (e.g., motherboards)

Short History of Intel
- 1968: Birth of Intel
  - Started in the memory business
  - First product was a 64-bit memory
- 1970s: Increase in market share
- Early 1980s: Japanese manufacturers started taking a large share of the memory business by developing 16 to 256 KB memory chips
- 1984: Business slowing down
  - "Get us out of memory!"
- 1986: Exited from memory. Why? The 386 was a success and there was no turning back

Intel Processor Time Line
- 1971: 4004
  - Intel's first microprocessor (108 KHz, 4-bit bus width)
- 1978: 8086
  - First 16-bit CPU from Intel
- 1979: 8088
  - Reengineered CPU to fit existing 8-bit hardware
- 1982: 286
  - 16-bit processor
  - Optimized instruction handling
- 1985: 386
  - First 32-bit CPU (32-bit system bus)
- 1988: 386SX
  - Cheaper version of the 386DX
- 1989: 486
  - Built-in math co-processor
  - L1 cache on-chip

Intel Processor Time Line
- 486SX
  - Discount chip
  - No math co-processor
- 486DX4
  - Triple the clock speed
  - From 25 MHz to 75 MHz
  - 33 MHz to 100 MHz
- 1993: Pentium Classic
  - Superscalar (5x 486DX-33 MHz)
  - Width of system bus: 64 bits
  - Speed of system bus: 60 to 66 MHz
  - Initially produced a lot of heat
- Nov 1, 1995: Pentium Pro
  - RISC processor
  - 32-bit processing
  - L2 cache is built in
- Jan 8, 1997: Pentium MMX
  - New set of instructions for multimedia
  - 32 KB L1 cache
- May 7, 1997: Pentium II (Klamath)
  - 512 KB L2
  - L1 cache of 32 KB

The Fundamental Problem to Solve
