L1 - Introduction to Computer Architecture.ppt
TMF1214 Computer Architecture (Semester 2 2023/2024)
Introduction to Computer Architecture
Reference: Chapter 1, William Stallings, Computer Organization and Architecture, 8th Edition

Computer System: User's View
Image: http://www.coolnerds.com

Computer System Components: High Level View
- Input: keyboard, mouse, microphone
- Computer: system unit
- Output: monitor, speaker

Architecture & Organization
- Architecture is those attributes visible to the programmer
  - Instruction set, number of bits used for data representation, I/O mechanisms, addressing techniques
  - e.g. Is there a multiply instruction?
- Organization is how features are implemented
  - Control signals, interfaces, memory technology
  - e.g. Is there a hardware multiply unit or is it done by repeated addition?

Architecture & Organization (Cont...)
- All members of the Intel x86 family share the same basic architecture
- All members of the IBM System/370 family share the same basic architecture
- This gives code compatibility
  - At least backwards
- Organization differs between different versions

Structure & Function
- Structure is the way in which components relate to each other
- Function is the operation of individual components as part of the structure

Function
All computer functions are:
- Data processing
- Data storage: needed even if the computer processes data on the fly (e.g. data come in, get processed, and the results go out immediately)
- Data movement: I/O vs data communication
- Control: controls all of the functions, from outside the computer by the individual(s) operating it and from within by the control unit (CU)

Functional view
[Figure; Johari Abdullah, FCSIT]

Operations (1) Data movement
- Transferring data from one peripheral or communication line to another

Operations (2) Storage
- Transferring data from the external environment to computer storage (read) and vice versa (write)

Operations (3) Processing from/to storage
- Data processing involving data in storage

Operations (4) Processing from storage to I/O
- Data processing involving data en route between storage and the external environment

Structure - Top Level (The Computer)
[Diagram labels: Computer; Central Processing Unit (CPU); Main Memory; Input/Output; Systems Interconnection/Bus]

Computer Structure
- CPU: controls the operation of the computer and performs its data processing functions; often simply referred to as the processor
- Main Memory: stores data
- I/O: moves data between the computer and its external environment
- System Interconnection: mechanism that provides for communication among the CPU, main memory and I/O. Example: the system bus (a number of conducting wires to which all the other components attach)

Structure - The CPU
[Diagram labels: Computer: I/O, System Bus, Memory, CPU; CPU: Registers, Arithmetic and Logic Unit (ALU), Internal CPU Interconnection, Control Unit]

The CPU Structure
- Control Unit: controls the operation of the CPU and hence the computer
- ALU: performs the computer's data processing functions
- Registers: provide internal storage to the CPU
- CPU interconnection/Internal bus: a mechanism that provides for communication among the control unit, ALU and registers

Structure - The Control Unit
[Diagram labels: CPU: ALU, Internal Bus, Registers, Control Unit; Control Unit: Sequencing Logic, Control Unit Registers and Decoders, Control Memory]

Recall: Computer System: User's View
Image: http://www.coolnerds.com

Recall: Computer System Components: High Level View
- Input: keyboard, mouse, microphone
- Computer: system unit
- Output: monitor, speaker

CPU, Motherboard

Computer Components: Interconnection
[Diagram labels: I/O, CPU, MEMORY]

CPU

CPU Organization
- Registers
- ALU
- CU

Memory
[Diagram labels: I/O, CPU, MEMORY]
address      content
0000000000   01010101010010101
0000000001   01110101010010101
...          ...
1111111110   01010101011110101
1111111111   11010111010010101

Input/Output
[Diagram labels: CPU, I/O Module, I/O Devices]

Computer Systems Hierarchy
- A digital computer solves problems by carrying out instructions
- [Diagram: Instructions -> Computer -> Results]
- A program: a sequence of instructions describing how to perform a certain task
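
The address/content table on the Memory slide can be modelled in a few lines. This is only an illustrative sketch: the `Memory` class and its method names are hypothetical, and the widths (10-bit addresses, 17-bit contents) are simply read off the example rows on the slide. It shows that 10 address bits give 2^10 = 1024 locations.

```python
ADDRESS_BITS = 10   # assumption: the slide's example addresses are 10 bits wide
WORD_BITS = 17      # assumption: the slide's example contents are 17 bits wide


class Memory:
    """Hypothetical model of main memory: one word of content per address."""

    def __init__(self, address_bits, word_bits):
        self.address_bits = address_bits
        self.word_bits = word_bits
        # n address bits can name 2**n distinct locations.
        self.cells = [0] * (2 ** address_bits)

    def _check(self, address):
        if not 0 <= address < len(self.cells):
            raise IndexError("address out of range")

    def write(self, address, value):
        self._check(address)
        if not 0 <= value < 2 ** self.word_bits:
            raise ValueError("value does not fit in one word")
        self.cells[address] = value

    def read(self, address):
        self._check(address)
        return self.cells[address]

    def dump(self, address):
        """Return 'address content' as zero-padded binary strings."""
        self._check(address)
        return (f"{address:0{self.address_bits}b} "
                f"{self.cells[address]:0{self.word_bits}b}")


memory = Memory(ADDRESS_BITS, WORD_BITS)
memory.write(0b0000000000, 0b01010101010010101)
memory.write(0b1111111111, 0b11010111010010101)

print(len(memory.cells))   # 1024 locations
print(memory.dump(0))      # 0000000000 01010101010010101
print(memory.dump(1023))   # 1111111111 11010111010010101
```

The CPU only ever sees memory through `read` and `write` on an address, which is why the width of the address determines how much memory can be reached.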
Computer Systems Hierarchy
- Human Language: difficult to implement
- Interpretation/Translation
- Machine Language
- Computer

Computer Systems Hierarchy
- Human Language
- Machine-like/Human-like language
- Interpretation/Translation
- Machine Language
- Computer

Computer Systems Hierarchy
Levels, from top to bottom:
- High-level language: C++, Java, VB
- Assembly language
- OS: UNIX, Windows NT
- Instruction sets: Pentium, PowerPC
- Microprograms
- Hardware
Labels beside the levels: Programmers, Systems programmers

TMC1214/TMC1213 Computer Architecture (Semester 2 2023/2024)
Computer Evolution and Performance
Reference: Chapter 2, William Stallings, Computer Organization and Architecture, 8th Edition

A (Very) Brief History of Computers
The First Generation - Vacuum Tubes (1945-1955)
- ENIAC (1943-1946)
  - Intended for calculating range tables for aiming artillery
  - Contained more than 18,000 vacuum tubes, occupied 1,500 square feet of floor space, weighed 30 tons and consumed 140 kW
  - Decimal machine: each digit was represented by a ring of 10 vacuum tubes
  - Designed for artillery range tables, but also used for complex calculations to help determine the feasibility of the hydrogen bomb, which showed it to be a general-purpose computer
  - Programmed with multi-position switches and jumper cables
- John von Neumann (1945-1952), more later …
  - Originally a member of the ENIAC development team
  - First to use binary arithmetic
  - Architecture consists of: Memory, ALU, Program control, Input, Output
  - Stored-program concept: main memory stores both data and instructions

A (Very) Brief History of Computers (Cont…)
[Images: vacuum tubes, ENIAC]

A (Very) Brief History of Computers (Cont…)
The Second Generation - Transistors (1955-1965)
- The transistor was invented at Bell Labs in 1947 (announced in 1948) by John Bardeen, Walter Brattain and William Shockley
- TX-0 (Transistorised eXperimental computer 0), the first transistor computer, built at MIT Lincoln Labs
- DEC PDP-1, the first affordable minicomputer ($120,000), with half the performance of the IBM 7090 (the fastest computer in the world at that time, which cost millions)
- PDP-8, cheap ($16,000), the first to use a single bus

A (Very) Brief History of Computers (Cont…)
The Third Generation - Integrated Circuits (1965-1980)
- IBM System/360
  - Family of machines with the same assembly language
  - Designed for both scientific and commercial computing
  - Among the first commercial families to use microprogramming
  - Very popular with universities

A (Very) Brief History of Computers (Cont…)
The Fourth Generation – VLSI (1980-?)
- Very Large Scale Integration (VLSI) is the process of creating integrated circuits by combining thousands of transistors into a single chip
- Led to PC revolution
- High performance, low cost

Generations of Computer (Technology)
- Vacuum tube: 1946-1957
- Transistor: 1958-1964
- Small scale integration: 1965 on
  - Up to 100 devices on a chip
- Medium scale integration: to 1971
  - 100-3,000 devices on a chip
- Large scale integration: 1971-1977
  - 3,000-100,000 devices on a chip
- Very large scale integration: 1978-1991
  - 100,000-100,000,000 devices on a chip
- Ultra large scale integration: 1991 on
  - Over 100,000,000 devices on a chip

Moore's Law
- Computers double in power roughly every two years, but cost only half as much

Moore's Law
- Increased density of components on chip
- Gordon Moore, cofounder of Intel
- Number of transistors on a chip will double every year
- Since 1970's development has slowed a little
  - Number of transistors doubles every 18 months
- Cost of a chip has remained almost unchanged
- Higher packing density means shorter electrical paths, giving higher performance
- Smaller size gives increased flexibility
- Reduced power and cooling requirements
- Fewer interconnections increases reliability

Growth in CPU Transistor Count (figure)

The IAS (von Neumann) Machine
- Stored Program concept
- Main memory storing programs and data
- ALU operating on binary data
- Control unit interpreting instructions from memory and executing
- Input and output equipment operated by control unit
- 1946 ~ 1952, John von Neumann, Princeton Institute for Advanced Studies
- Almost all of today's computers have the same general structure as the IAS - referred to as von Neumann machines
[Diagram: The Structure of IAS Computer. Labels: Main Memory, Arithmetic and Logic Unit, Program Control Unit, Input/Output Equipment]

The IAS Machine: Control Unit
- The control unit operates the machine by fetching instructions from memory and executing them ONE at a time
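
The fetch-then-execute behaviour of the control unit can be sketched as a small loop. This is a deliberately simplified illustration, not the real IAS instruction set: the opcode names (`LOAD`, `ADD`, `STOR`, `HALT`) and the `run` function are hypothetical, each memory word holds a single instruction (the IAS packs two per word), and only the PC, IR, MAR and accumulator registers are modelled. What it does show is the stored-program idea: instructions and data sit in the same memory, and the control unit handles one instruction at a time.

```python
def run(memory):
    """Fetch and execute instructions one at a time until HALT (hypothetical opcodes)."""
    pc = 0   # program counter: address of the next instruction
    ac = 0   # accumulator: holds the result of ALU operations
    while True:
        # Fetch: the opcode goes to the IR, the address part to the MAR.
        ir, mar = memory[pc]
        pc += 1
        # Execute: the control circuitry interprets the opcode.
        if ir == "LOAD":
            ac = memory[mar]            # AC <- memory[MAR]
        elif ir == "ADD":
            ac = ac + memory[mar]       # AC <- AC + memory[MAR]
        elif ir == "STOR":
            memory[mar] = ac            # memory[MAR] <- AC
        elif ir == "HALT":
            return memory
        else:
            raise ValueError(f"unknown opcode {ir!r}")


# One memory holds both the program (addresses 0-3) and its data (addresses 4-6).
program = [
    ("LOAD", 4),
    ("ADD", 5),
    ("STOR", 6),
    ("HALT", 0),
    2,   # address 4: first operand
    3,   # address 5: second operand
    0,   # address 6: result is stored here
]

result = run(list(program))
print(result[6])   # 5
```

Because the program lives in ordinary memory, changing what the machine does means writing different words into memory rather than rewiring it, which is the contrast with ENIAC's switches and jumper cables.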
[Diagram labels: Central Processing Unit; Arithmetic and Logic Unit: Accumulator, MQ, Arithmetic & Logic Circuits, MBR; Program Control Unit: IBR, PC, MAR, IR, Control Circuits; Main Memory; Input/Output Equipment; Instructions & Data; Address]

The IAS Machine: Instruction Cycle
- The IAS operates by repetitively performing an instruction cycle, which has two sub-cycles:
  - During the fetch cycle, the opcode of the NEXT instruction is loaded into the IR and the address portion is loaded into the MAR
  - Once the opcode is in the IR, the execute cycle is performed. Control circuitry interprets the opcode and executes the instruction by sending out appropriate control signals to cause data to be moved or an operation to be performed by the ALU

IAS - details
- 1000 x 40 bit words
  - Binary number
  - 2 x 20 bit instructions
- Set of registers (storage in CPU)
  - Memory Buffer Register (MBR)
  - Memory Address Register (MAR)
  - Instruction Register (IR)
  - Instruction Buffer Register (IBR)
  - Program Counter (PC)
  - Accumulator (AC)
  - Multiplier Quotient (MQ)

Structure of IAS – detail (figure)

Evolution of Intel Microprocessor
Source: http://www.intel.com/intel/museum/25anniv/hof/tspecs.htm
1970s Processors
                    4004          8008          8080          8086          8088
Introduced          11/15/71      4/1/72        4/1/74        6/8/78        6/1/79
Clock speeds        108 kHz       200 kHz       2 MHz         5, 8, 10 MHz  5, 8 MHz
Bus width           4 bits        8 bits        8 bits        16 bits       8 bits
Transistors         2,300         3,500         6,000         29,000        29,000
Process             10 microns    10 microns    6 microns     3 microns     3 microns
Addressable memory  640 bytes     16 KBytes     64 KBytes     1 MB          1 MB
Virtual memory      --            --            --            --            --
Brief description:
- 4004: First microcomputer chip, arithmetic manipulation
- 8008: Data/character manipulation
- 8080: 10X the performance of the 8008
- 8086: 10X the performance of the 8080
- 8088: Identical to 8086 except for its 8-bit external bus

Evolution of Intel Microprocessor
Source: http://www.intel.com/intel/museum/25anniv/hof/tspecs.htm
1980s Processors
                    80286               Intel386™ DX        Intel386™ SX        Intel486™ DX CPU
Introduced          2/1/82              10/17/85            6/16/88             4/10/89
Clock speeds        6, 8, 10, 12.5 MHz  16, 20, 25, 33 MHz  16, 20, 25, 33 MHz  25, 33, 50 MHz
Bus width           16 bits             32 bits             16 bits             32 bits
Transistors         134,000             275,000             275,000             1.2 million
Process             1.5 microns         1 micron            1 micron            1 micron (0.8 micron at 50 MHz)
Addressable memory  16 MB               4 GB                16 MB               4 GB
Virtual memory      1 GB                64 TB               64 TB               64 TB
Brief description:
- 80286: 3-6X the performance of the 8086
- Intel386™ DX Microprocessor: First X86 chip to handle 32-bit data sets
- Intel386™ SX Microprocessor: 16-bit address bus enabled low-cost 32-bit processing
- Intel486™ DX CPU Microprocessor: Level 1 cache on chip

Evolution of Intel Microprocessor
Source: http://www.intel.com/intel/museum/25anniv/hof/tspecs.htm
1990s Processors
                    Intel486™ SX        Pentium®            Pentium® Pro        Pentium® II
Introduced          4/22/91             3/22/93             11/01/95            5/07/97
Clock speeds (MHz)  16, 20, 25, 33      60, 66              150, 166, 180, 200  200, 233, 266, 300
Bus width           32 bits             64 bits             64 bits             64 bits
Transistors         1.185 million       3.1 million         5.5 million         7.5 million
Process             1 micron            0.8 micron          0.35 micron         0.35 micron
Addressable memory  4 GB                4 GB                64 GB               64 GB
Virtual memory      64 TB               64 TB               64 TB               64 TB
Brief description:
- Intel486™ SX Microprocessor: Identical in design to Intel486™ DX but without math coprocessor
- Pentium® Processor: Superscalar architecture brought 5X the performance of the 33-MHz Intel486™ DX processor
- Pentium® Pro Processor: Dynamic execution architecture drives high-performing processor
- Pentium® II Processor: Dual independent bus, dynamic execution, Intel MMX™ technology

Pentium Evolution (1)
- 8080
  - first general purpose microprocessor
  - 8 bit data path
  - Used in first personal computer, the Altair
- 8086
  - much more powerful
  - 16 bit
  - instruction cache, prefetches a few instructions
  - 8088 (8 bit external bus) used in first IBM PC
- 80286
  - 16 MByte memory addressable
  - up from 1 MByte
- 80386
  - 32 bit
  - Support for multitasking

Pentium Evolution (3)
- Pentium II
  - MMX technology
  - graphics, video & audio processing
- Pentium III
  - Additional floating point instructions for 3D graphics
- Pentium 4
  - Note Arabic rather than Roman numerals
  - Further floating point and multimedia enhancements
- Itanium
  - 64 bit
  - see chapter 15
- See Intel web pages for detailed information on processors

Speeding it up
- Pipelining
- On board cache
- On board L1 & L2 cache
- Branch prediction
- Data flow analysis
- Speculative execution

Performance Mismatch
- Processor speed increased
- Memory capacity increased
- Memory speed lags behind processor speed

Logic and Memory Performance Gap (figure)

Solutions
- Increase number of bits retrieved at one time
  - Make DRAM "wider" rather than "deeper"
- Change DRAM interface
  - Cache
- Reduce frequency of memory access
  - More complex cache and cache on chip
- Increase interconnection bandwidth
  - High speed buses
  - Hierarchy of buses

I/O Devices
- Peripherals with intensive I/O demands
- Large data throughput demands
- Processors can handle this
- Problem moving data
- Solutions:
  - Caching
  - Buffering
  - Higher-speed interconnection buses
  - More elaborate bus structures
  - Multiple-processor configurations

Typical I/O Device Data Rates (figure)

Key is Balance
- Processor components
- Main memory
- I/O devices
- Interconnection structures

Improvements in Chip Organization and Architecture
- Increase hardware speed of processor
  - Fundamentally due to shrinking logic gate size
    - More gates, packed more tightly, increasing clock rate
    - Propagation time for signals reduced
- Increase size and speed of caches
  - Dedicating part of processor chip
    - Cache access times drop significantly
- Change processor organization and architecture
  - Increase effective speed of execution
  - Parallelism

Problems with Clock Speed and Logic Density
- Power
  - Power density increases with density of logic and clock speed
  - Dissipating heat
- RC delay
  - Speed at which electrons flow limited by resistance and capacitance of metal wires connecting them
  - Delay increases as RC product increases
  - Wire interconnects thinner, increasing
    resistance
  - Wires closer together, increasing capacitance
- Memory latency
  - Memory speeds lag processor speeds
- Solution:
  - More emphasis on organizational and architectural approaches

Intel Microprocessor Performance (figure)

Increased Cache Capacity
- Typically two or three levels of cache between processor and main memory
- Chip density increased
  - More cache memory on chip
    - Faster cache access
- Pentium chip devoted about 10% of chip area to cache
- Pentium 4 devotes about 50%

More Complex Execution Logic
- Enable parallel execution of instructions
- Pipeline works like assembly line
  - Different stages of execution of different instructions at same time along pipeline
- Superscalar allows multiple pipelines within single processor
  - Instructions that do not depend on one another can be executed in parallel

Diminishing Returns
- Internal organization of processors complex
  - Can get a great deal of parallelism
  - Further significant increases likely to be relatively modest
- Benefits from cache are reaching limit
- Increasing clock rate runs into power dissipation problem
  - Some fundamental physical limits are being reached

New Approach – Multiple Cores
- Multiple processors on single chip
  - Large shared cache
- Within a processor, increase in performance proportional to square root of increase in complexity
- If software can use multiple processors, doubling number of processors almost doubles performance
- So, use two simpler processors on the chip rather than one more complex processor
- With two processors, larger caches are justified
  - Power consumption of memory logic less than processing logic

Internet Resources
- http://www.intel.com/
  - Search for the Intel Museum
- http://www.ibm.com
- http://www.dec.com
- Charles Babbage Institute
- PowerPC
- Intel Developer Home