CH02_Computer_Evolution_and_Performance.pptx

Full Transcript

+ William Stallings Computer Organization and Architecture 9th Edition
+ Chapter 2 Computer Evolution and Performance
+ History of Computers First Generation: Vacuum Tubes
ENIAC (Electronic Numerical Integrator And Computer)
  Designed and constructed at the University of Pennsylvania
  Started in 1943, completed in 1946
  By John Mauchly and John Eckert
  World’s first general purpose electronic digital computer
  Army’s Ballistics Research Laboratory (BRL) needed a way to supply trajectory tables for new weapons accurately and within a reasonable time frame
  Was not finished in time to be used in the war effort
  Its first task was to perform a series of calculations that were used to help determine the feasibility of the hydrogen bomb
  Continued to operate under BRL management until 1955 when it was disassembled
+ ENIAC
  Weighed 30 tons
  Occupied 1500 square feet of floor space
  Contained more than 18,000 vacuum tubes
  140 kW power consumption
  Capable of 5000 additions per second
  Decimal rather than binary machine
  Memory consisted of 20 accumulators, each capable of holding a 10 digit number
  Major drawback was the need for manual programming by setting switches and plugging/unplugging cables
+ John von Neumann
EDVAC (Electronic Discrete Variable Computer)
  First publication of the idea was in 1945
  Stored program concept
    Attributed to ENIAC designers, most notably the mathematician John von Neumann
    Program represented in a form suitable for storing in memory alongside the data
IAS computer
  Princeton Institute for Advanced Studies
  Prototype of all subsequent general-purpose computers
  Completed in 1952
+ Structure of von Neumann Machine
+ IAS Memory Formats
  The memory of the IAS consists of 1000 storage locations (called words) of 40 bits each
  Both data and instructions are stored there
  Numbers are represented in binary form and each instruction is a binary code
+ Structure of IAS Computer
+ Registers
  Memory buffer register (MBR): Contains a word to be stored in memory or sent to the I/O unit, or is used to receive a word from memory or from the I/O unit
  Memory address register (MAR): Specifies the address in memory of the word to be written from or read into the MBR
  Instruction register (IR): Contains the 8-bit opcode instruction being executed
  Instruction buffer register (IBR): Employed to temporarily hold the right-hand instruction from a word in memory
  Program counter (PC): Contains the address of the next instruction pair to be fetched from memory
  Accumulator (AC) and multiplier quotient (MQ): Employed to temporarily hold operands and results of ALU operations
+ IAS Operations
+ Table 2.1 The IAS Instruction Set
+ History of Computers Second Generation: Transistors
  Smaller
  Cheaper
  Dissipates less heat than a vacuum tube
  Is a solid state device made from silicon
  Was invented at Bell Labs in 1947
  It was not until the late 1950s that fully transistorized computers were commercially available
+ Computer Generations
  Table 2.2 Computer Generations
+ Second Generation Computers
Introduced:
  More complex arithmetic and logic units and control units
  The use of high-level programming languages
  Provision of system software which provided the ability to:
    load programs
    move data to peripherals and libraries
    perform common computations
Appearance of the Digital Equipment Corporation (DEC) in 1957
  PDP-1 was DEC’s first computer
  This began the mini-computer phenomenon that would become so prominent in the third generation
+ Table 2.3 Example Members of the IBM 700/7000 Series
+ History of Computers Third Generation: Integrated Circuits
  1958: the invention of the integrated circuit
  Discrete component
    Single, self-contained transistor
    Manufactured separately, packaged in their own containers, and soldered or wired together onto masonite-like circuit boards
    Manufacturing process was expensive and cumbersome
  The two most important
  members of the third generation were the IBM System/360 and the DEC PDP-8
+ Microelectronics
+ Integrated Circuits
A computer consists of gates, memory cells, and interconnections among these elements
  The gates and memory cells are constructed of simple digital electronic components
  Data storage: provided by memory cells
  Data processing: provided by gates
  Data movement: the paths among components are used to move data from memory to memory and from memory through gates to memory
  Control: the paths among components can carry control signals
Exploits the fact that such components as transistors, resistors, and conductors can be fabricated from a semiconductor such as silicon
  Many transistors can be produced at the same time on a single wafer of silicon
  Transistors can be connected with a process of metallization to form circuits
+ Wafer, Chip, and Gate Relationship
  Intel: The Making of a Chip with 22nm/3D Transisto
+ Chip Growth
+ Moore’s Law
  1965; Gordon Moore, co-founder of Intel
  Observed number of transistors that could be put on a single chip was doubling every year
  The pace slowed to a doubling every 18 months in the 1970s but has sustained that rate ever since
Consequences of Moore’s law:
  The cost of computer logic and memory circuitry has fallen at a dramatic rate
  The electrical path length is shortened, increasing operating speed
  Computer becomes smaller and is more convenient to use in a variety of environments
  Reduction in power and cooling requirements
  Fewer interchip connections
+ DEC - PDP-8 Bus Structure
+ Later Generations
  LSI: Large Scale Integration
  VLSI: Very Large Scale Integration
  ULSI: Ultra Large Scale Integration
  Semiconductor Memory
  Microprocessors
+ Semiconductor Memory
In 1970 Fairchild produced the first relatively capacious semiconductor memory
  Chip was about the size of a single core
  Could hold 256 bits of memory
  Non-destructive
  Much faster than core
In 1974 the price per bit of semiconductor memory dropped below the price per bit of core memory
  There has been a continuing and rapid decline in memory cost accompanied by a corresponding increase in physical memory density
  Developments in memory and processor technologies changed the nature of computers in less than a decade
Since 1970 semiconductor memory has been through 13 generations
  Each generation has provided four times the storage density of the previous generation, accompanied by declining cost per bit and declining access time
+ Microprocessors
The density of elements on processor chips continued to rise
  More and more elements were placed on each chip so that fewer and fewer chips were needed to construct a single computer processor
1971 Intel developed 4004
  First chip to contain all of the components of a CPU on a single chip
  Birth of microprocessor
1972 Intel developed 8008
  First 8-bit microprocessor
1974 Intel developed 8080
  First general purpose microprocessor
  Faster, has a richer instruction set, has a large addressing capability
+ Evolution of Intel Microprocessors
  a. 1970s Processors
  b. 1980s Processors
+ Evolution of Intel Microprocessors
  c. 1990s Processors
  d. Recent Processors
+ Microprocessor Speed
Techniques built into contemporary processors include:
  Pipelining: Processor moves data or instructions into a conceptual pipe with all stages of the pipe processing simultaneously
  Branch prediction: Processor looks ahead in the instruction code fetched from memory and predicts which branches, or groups of instructions, are likely to be processed next
  Data flow analysis: Processor analyzes which instructions are dependent on each other’s results, or data, to create an optimized schedule of instructions
  Speculative execution: Using branch prediction and data flow analysis, some processors speculatively execute instructions ahead of their actual appearance in the program execution, holding the results in temporary locations, keeping execution engines as busy as possible
+ Performance Balance
Adjust the organization and architecture to compensate for the mismatch among the capabilities of the various components
Architectural examples include:
  Increase the number of bits that are retrieved at one time by making DRAMs “wider” rather than “deeper” and by using wide bus data paths
  Reduce the frequency of memory access by incorporating increasingly complex and efficient cache structures between the processor and main memory
  Change the DRAM interface to make it more efficient by including a cache or other buffering scheme on the DRAM chip
  Increase the interconnect bandwidth between processors and memory by using higher speed buses and a hierarchy of buses to buffer and structure data flow
+ Typical I/O Device Data Rates
+ Improvements in Chip Organization and Architecture
  Increase hardware speed of processor
    Fundamentally due to shrinking logic gate size
    More gates, packed more tightly, increasing clock rate
    Propagation time for signals reduced
  Increase size and speed of caches
    Dedicating part of processor chip
    Cache access times drop significantly
  Change processor organization and architecture
    Increase effective speed of instruction
    execution
    Parallelism
+ Problems with Clock Speed and Logic Density
  Power
    Power density increases with density of logic and clock speed
    Dissipating heat
  RC delay
    Speed at which electrons flow limited by resistance and capacitance of metal wires connecting them
    Delay increases as RC product increases
    Wire interconnects thinner, increasing resistance
    Wires closer together, increasing capacitance
  Memory latency
    Memory speeds lag processor speeds
+ Processor Trends
+ Multicore
  The use of multiple processors on the same chip provides the potential to increase performance without increasing the clock rate
  Strategy is to use two simpler processors on the chip rather than one more complex processor
  With two processors larger caches are justified
  As caches became larger it made performance sense to create two and then three levels of cache on a chip
+ Many Integrated Core (MIC) Graphics Processing Unit (GPU)
MIC
  Leap in performance as well as the challenges in developing software to exploit such a large number of cores
  The multicore and MIC strategy involves a homogeneous collection of general purpose processors on a single chip
GPU
  Core designed to perform parallel operations on graphics data
  Traditionally found on a plug-in graphics card, it is used to encode and render 2D and 3D graphics as well as process video
  Used as vector processors for a variety of applications that require repetitive computations
+ Embedded Systems
General definition:
  “A combination of computer hardware and software, and perhaps additional mechanical or other parts, designed to perform a dedicated function.
  In many cases, embedded systems are part of a larger system or product, as in the case of an antilock braking system in a car.”
+ Table 2.7 Examples of Embedded Systems and Their Markets
+ System Clock
+ Instruction Execution Rate
+ Amdahl’s Law
  Gene Amdahl [AMDA67]
  Deals with the potential speedup of a program using multiple processors compared to a single processor
  Illustrates the problems facing industry in the development of multi-core machines
    Software must be adapted to a highly parallel execution environment to exploit the power of parallel processing
  Can be generalized to evaluate and design technical improvement in a computer system
+ Amdahl’s Law
+ Little’s Law
  Fundamental and simple relation with broad applications
  Can be applied to almost any system that is statistically in steady state, and in which there is no leakage
  Queuing system
    If server is idle an item is served immediately, otherwise an arriving item joins a queue
    There can be a single queue for a single server or for multiple servers, or multiple queues with one being for each of multiple servers
  Average number of items in a queuing system equals the average rate at which items arrive multiplied by the average time that an item spends in the system
    Relationship requires very few assumptions
    Because of its simplicity and generality it is extremely useful
+ Summary
Chapter 2 Computer Evolution and Performance
  First generation computers
    Vacuum tubes
  Second generation computers
    Transistors
  Third generation computers
    Integrated circuits
  Performance designs
    Microprocessor speed
    Performance balance
    Chip organization and architecture
  Multi-core
  MICs
  GPGPUs
  Evolution of the Intel x86
  Embedded systems
  ARM evolution
  Performance assessment
    Clock speed and instructions per second
    Benchmarks
    Amdahl’s Law
    Little’s Law
+ Key Terms
  Amdahl’s law
  benchmark
  chip
  clock cycle
  control unit
  cycle time
  embedded system
  execute cycle
  fetch cycle
  instruction cycle
  instruction set
  main memory
  MIPS rate
  microprocessor
  multicore
  multiplexor
  opcode
  stored-program concept
  upward compatible
  von Neumann machine
  wafer
  word
  accumulator (AC)
  arithmetic and logic unit (ALU)
  graphics processing unit (GPU)
  input-output (I/O)
  instruction buffer register (IBR)
  instruction register (IR)
  integrated circuit (IC)
  many integrated core (MIC)
  memory address register (MAR)
  memory buffer register (MBR)
+ Homework
  2.2, 2.5, 2.9, 2.10, 2.13, 2.15, 2.16, 2.17, 2.18
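
The Amdahl’s Law and Little’s Law slides give both relations in words only, so a small numeric sketch may help when working the performance problems. It assumes the usual textbook form of Amdahl’s Law, speedup = 1 / ((1 - f) + f / N), where f is the fraction of execution time that can be parallelized and N is the number of processors, and it writes Little’s Law as items = arrival rate × average time in system. The function and parameter names are illustrative and do not come from the slides.

```python
def amdahl_speedup(f: float, n: int) -> float:
    """Speedup of a program on n processors versus one processor.

    f is the fraction of single-processor execution time that can be
    parallelized (assumed notation); the remaining 1 - f stays serial.
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / ((1.0 - f) + f / n)


def littles_law_items(arrival_rate: float, avg_time_in_system: float) -> float:
    """Average number of items in a steady-state queuing system.

    arrival_rate and avg_time_in_system must use the same time unit,
    for example items per second and seconds.
    """
    return arrival_rate * avg_time_in_system


if __name__ == "__main__":
    # With 90% of the work parallelizable, adding processors helps less
    # and less: the speedup can never exceed 1 / (1 - f) = 10.
    for n in (2, 8, 1024):
        print(f"N={n:5d}  speedup={amdahl_speedup(0.9, n):.2f}")

    # 20 items arrive per second and each spends 0.5 s in the system.
    print("items in system:", littles_law_items(20.0, 0.5))
```

The loop illustrates the point made on the Amdahl’s Law slide: unless software is adapted so that f is close to 1, the serial portion caps the benefit of a multicore machine no matter how many cores are added.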
