Computer Evolution PDF

Chapter 1 Basic Concepts and Computer Evolution © 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.
[Figure 1.2 Simplified View of Major Elements of a Multicore Computer: a motherboard holds main memory chips, I/O chips, and a processor chip; the processor chip holds multiple cores and L3 cache; each core holds instruction logic, an arithmetic and logic unit (ALU), load/store logic, an L1 I-cache, an L1 data cache, an L2 instruction cache, and an L2 data cache]

Structure
[Figure 1.1 A Top-Down View of a Computer: Computer (I/O, main memory, system bus, CPU); CPU (registers, ALU, internal bus, control unit); Control unit (sequencing logic, control unit registers and decoders, control memory)]

There are four main structural components of the computer:
– CPU: controls the operation of the computer and performs its data processing functions
– Main memory: stores data
– I/O: moves data between the computer and its external environment
– System interconnection: some mechanism that provides for communication among CPU, main memory, and I/O

CPU
Major structural components:
– Control unit: controls the operation of the CPU and hence the computer
– Arithmetic and logic unit (ALU): performs the computer’s data processing function
– Registers: provide storage internal to the CPU
– CPU interconnection: some mechanism that provides for communication among the control unit, ALU, and registers

[Figure 1.10 Fundamental Computer Elements: (a) Gate: a Boolean logic function with input, output, and an activate signal; (b) Memory cell: a binary storage cell with input, output, and read and write signals]
Integrated Circuits
A computer consists of gates, memory cells, and interconnections among these elements. The gates and memory cells are constructed of simple digital electronic components.
– Data storage: provided by memory cells
– Data processing: provided by gates
– Data movement: the paths among components are used to move data from memory to memory and from memory through gates to memory
– Control: the paths among components can carry control signals

The integrated circuit exploits the fact that such components as transistors, resistors, and conductors can be fabricated from a semiconductor such as silicon.
– Many transistors can be produced at the same time on a single wafer of silicon
– Transistors can be connected with a process of metallization to form circuits

[Figure 1.11 Relationship Among Wafer, Chip, and Gate: wafer, chip, gate, packaged chip]

Figure 1.11 depicts the key concepts in an integrated circuit. A thin wafer of silicon is divided into a matrix of small areas, each a few millimeters square. The identical circuit pattern is fabricated in each area, and the wafer is broken up into chips. Each chip consists of many gates and/or memory cells plus a number of input and output attachment points. Each chip is then packaged, and a number of these packages can be interconnected on a printed circuit board to produce larger and more complex circuits. Initially, only a few gates or memory cells could be reliably manufactured and packaged together. These early integrated circuits are referred to as small-scale integration (SSI).
[Figure 1.12 Growth in Transistor Count on Integrated Circuits (DRAM memory): transistor count per chip, from 1 to 100 bn on a logarithmic scale, plotted against year, 1947 to 2011; annotations mark the first working transistor, the invention of the integrated circuit, and the promulgation of Moore’s law]

As time went on, it became possible to pack more and more components on the same chip. This growth in density is illustrated in Figure 1.12. This figure reflects the famous Moore’s law, which was propounded by Gordon Moore, cofounder of Intel.

Moore’s Law
– 1965; Gordon Moore, co-founder of Intel
– Observed that the number of transistors that could be put on a single chip was doubling every year
– The pace slowed to a doubling every 18 months in the 1970s but has sustained that rate ever since

Consequences of Moore’s law:
– The cost of computer logic and memory circuitry has fallen at a dramatic rate
– The electrical path length is shortened, increasing operating speed
– The computer becomes smaller and is more convenient to use in a variety of environments
– Reduction in power and cooling requirements
– Fewer interchip connections

Later Generations
– LSI: Large Scale Integration
– VLSI: Very Large Scale Integration
– ULSI: Ultra Large Scale Integration
– Semiconductor memory
– Microprocessors

With the introduction of large-scale integration (LSI), more than 1,000 components can be placed on a single integrated circuit chip. Very-large-scale integration (VLSI) achieved more than 10,000 components per chip, while current ultra-large-scale integration (ULSI) chips can contain more than one billion components.
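The two doubling rates quoted for Moore's law (every year, then every 18 months) can be compared with a small calculation. This is only a sketch: the function name `projected_transistors` and the starting count of 1,000 transistors are illustrative assumptions, not figures from the slides.

```python
def projected_transistors(initial_count: float, years: float,
                          doubling_period_years: float) -> float:
    """Project a count that doubles once every `doubling_period_years`."""
    return initial_count * 2 ** (years / doubling_period_years)


# Hypothetical starting point: 1,000 transistors on a chip.
START = 1_000

for period in (1.0, 1.5):  # doubling every year vs. every 18 months
    count = projected_transistors(START, years=6, doubling_period_years=period)
    print(f"doubling every {period} years: {count:,.0f} transistors after 6 years")
```

Over six years the yearly rate gives six doublings (a factor of 64), while the 18-month rate gives four doublings (a factor of 16).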
Microprocessors
– The density of elements on processor chips continued to rise
– More and more elements were placed on each chip so that fewer and fewer chips were needed to construct a single computer processor
– 1971: Intel developed the 4004, the first chip to contain all of the components of a CPU on a single chip; birth of the microprocessor
– 1972: Intel developed the 8008, the first 8-bit microprocessor
– 1974: Intel developed the 8080, the first general-purpose microprocessor; faster, with a richer instruction set and a large addressing capability

The Evolution of the Intel x86 Architecture
– Two processor families are the Intel x86 and the ARM architectures
– Current x86 offerings represent the results of decades of design effort on complex instruction set computers (CISCs)
– An alternative approach to processor design is the reduced instruction set computer (RISC)
– ARM architecture is used in a wide variety of embedded systems and is one of the most powerful and best-designed RISC-based systems on the market

Highlights of the Evolution of the Intel Product Line:
8080
– World’s first general-purpose microprocessor
– 8-bit machine, 8-bit data path to memory
– Was used in the first personal computer (Altair)
8086
– A more powerful 16-bit machine
– Has an instruction cache, or queue, that prefetches a few instructions before they are executed
– The first appearance of the x86 architecture
– The 8088 was a variant of this processor and used in IBM’s first personal computer (securing the success of Intel)
80286
– Extension of the 8086 enabling addressing a 16-MB memory instead of just 1 MB
80386
– Intel’s first 32-bit machine
– First Intel processor to support multitasking
80486
– Introduced the use of much more sophisticated and powerful cache technology and sophisticated instruction pipelining
– Also offered a built-in math coprocessor
Basic Operational Concepts
– The basic mechanism through which an instruction gets executed is illustrated here.
– Recall that the processor contains a set of registers, some general-purpose and some special-purpose.
– We first briefly review the functions of the special-purpose registers before looking at some examples.

For Interfacing with the Primary Memory
Two special-purpose registers are used:
– Memory Address Register (MAR): holds the address of the memory location to be accessed.
– Memory Data Register (MDR): holds the data that is being written into memory, or receives the data being read out from memory.
Memory is considered as a linear array of storage locations (bytes or words), each with a unique address.

To Read Data from or Write Data to Memory

For Keeping Track of Program / Instructions
Two special-purpose registers are used:
– Program Counter (PC): holds the memory address of the next instruction to be executed. It is automatically incremented to point to the next instruction while an instruction is being executed.
– Instruction Register (IR): temporarily holds an instruction that has been fetched from memory. The instruction must be decoded to find out its type. It also contains information about the location of the data.

Architecture of the Example Processor
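The roles of the four special-purpose registers described above (MAR, MDR, PC, IR) can be sketched with a toy model. Everything here is an illustrative assumption rather than the example processor from the slides: the class name `ToyProcessor`, the dictionary used as memory, and the 4-byte instruction size.

```python
from dataclasses import dataclass, field
from typing import Dict

WORD_BYTES = 4  # assumption: each instruction occupies one 32-bit word


@dataclass
class ToyProcessor:
    memory: Dict[int, int] = field(default_factory=dict)  # address -> word
    pc: int = 0   # Program Counter: address of the next instruction
    ir: int = 0   # Instruction Register: instruction just fetched
    mar: int = 0  # Memory Address Register: address being accessed
    mdr: int = 0  # Memory Data Register: data read from or written to memory

    def read_memory(self, address: int) -> int:
        """Place the address in MAR, then receive the word in MDR."""
        self.mar = address
        self.mdr = self.memory[self.mar]
        return self.mdr

    def write_memory(self, address: int, value: int) -> None:
        """Place the address in MAR and the data in MDR, then store it."""
        self.mar = address
        self.mdr = value
        self.memory[self.mar] = self.mdr

    def fetch(self) -> int:
        """Fetch the instruction PC points to into IR, then advance PC."""
        self.ir = self.read_memory(self.pc)
        self.pc += WORD_BYTES
        return self.ir


cpu = ToyProcessor(memory={0: 0x1111, 4: 0x2222})
cpu.fetch()
print(hex(cpu.ir), cpu.pc)  # instruction at address 0, PC now at 4
```

Decoding and executing the contents of IR is left out on purpose; the sketch only shows how every memory access passes through MAR and MDR.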
How Are the Functional Units Connected?
– For a computer to achieve its operation, the functional units need to communicate with each other.
– In order to communicate, they need to be connected.
[Figure: Input, Output, Memory, and Processor units attached to a common Bus]
– Functional units may be connected by a group of parallel wires.
– The group of parallel wires is called a bus.
– Each wire in a bus can transfer one bit of information.
– The number of parallel wires in a bus is equal to the word length of a computer.

Computer Components: Top-Level View

Bus Interconnection
– A communication pathway connecting two or more devices. Its key characteristic is that it is a shared transmission medium.
– Signals transmitted by any one device are available for reception by all other devices attached to the bus. If two devices transmit during the same time period, their signals will overlap and become garbled.
– Typically consists of multiple communication lines. Each line is capable of transmitting signals representing binary 1 and binary 0.
– Computer systems contain a number of different buses that provide pathways between components at various levels of the computer system hierarchy.
– System bus: a bus that connects major computer components (processor, memory, I/O).
– The most common computer interconnection structures are based on the use of one or more system buses.
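Since each wire in a bus carries one bit at a time, the number of wires limits how much data moves in a single transfer. A minimal sketch of that arithmetic follows; the function name `transfers_needed` and the example sizes are illustrative assumptions.

```python
def transfers_needed(total_bits: int, bus_width_bits: int) -> int:
    """Number of bus transfers required to move `total_bits` of data
    over a bus with `bus_width_bits` parallel wires (rounding up)."""
    if bus_width_bits <= 0:
        raise ValueError("bus width must be positive")
    return -(-total_bits // bus_width_bits)  # ceiling division


# Moving one 64-bit value over buses of different widths.
for width in (8, 32, 64):
    print(f"{width}-wire bus: {transfers_needed(64, width)} transfer(s)")
```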
Data Bus
– Data lines that provide a path for moving data among system modules
– May consist of 32, 64, 128, or more separate lines
– The number of lines is referred to as the width of the data bus
– The number of lines determines how many bits can be transferred at a time
– The width of the data bus is a key factor in determining overall system performance

Address Bus
– Used to designate the source or destination of the data on the data bus
– If the processor wishes to read a word of data from memory, it puts the address of the desired word on the address lines
– Width determines the maximum possible memory capacity of the system
– Also used to address I/O ports
– The higher order bits are used to select a particular module on the bus and the lower order bits select a memory location or I/O port within the module

Control Bus
– Used to control the access and the use of the data and address lines
– Because the data and address lines are shared by all components, there must be a means of controlling their use
– Control signals transmit both command and timing information among system modules
– Timing signals indicate the validity of data and address information
– Command signals specify operations to be performed

[Figure 3.16 Bus Interconnection Scheme: CPU, memory modules, and I/O modules attached to a bus made up of control lines, address lines, and data lines]

Single-Bus Architecture Inside the Processor
– There is a single bus inside the processor.
– The ALU and the registers are all connected via the single bus.
– This bus is internal to the processor and should not be confused with the external bus that connects the processor to the memory and I/O devices.
– A typical single-bus processor architecture is shown on the next slide.
– Two temporary registers Y and Z are also included.
– Register Y temporarily holds one of the operands of the ALU.
– Register Z temporarily holds the result of the ALU operation.
– The multiplexer selects a constant operand 4 during execution of the micro-operation PC ← PC + 4.

Multi-Bus Architectures
– Modern processors have multiple buses that connect the registers and other functional units.
– This allows multiple data transfer micro-operations to be executed in the same clock cycle.
– It results in overall faster instruction execution.
– It is also advantageous to have multiple shorter buses rather than a single long bus: smaller parasitic capacitance, and hence smaller delay.

Memory Addressing

Memory Location, Addresses, and Operation
– It is impractical to assign distinct addresses to individual bit locations in the memory.
– The most practical assignment is to have successive addresses refer to successive byte locations in the memory: byte-addressable memory.
– Byte locations have addresses 0, 1, 2, …
– If the word length is 32 bits, then successive words are located at addresses 0, 4, 8, …
– Memory consists of many millions of storage cells, each of which can store 1 bit.
– Data is usually accessed in n-bit groups; n is called the word length.
[Figure 2.5. Memory words: a column of n-bit words, from the first word and second word through the i-th word to the last word]
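In a byte-addressable memory a 32-bit word spans four consecutive byte addresses, so the order in which the word's bytes are assigned to those addresses has to be fixed by convention. The two common conventions are big-endian (most significant byte at the lowest address) and little-endian (least significant byte at the lowest address). The sketch below uses Python's built-in `int.to_bytes`; the helper names `word_address` and `byte_layout` and the value `0x12345678` are illustrative assumptions.

```python
WORD_BITS = 32
BYTES_PER_WORD = WORD_BITS // 8  # 4 bytes per 32-bit word


def word_address(index: int) -> int:
    """Byte address of the index-th word: 0, 4, 8, ... for 32-bit words."""
    return index * BYTES_PER_WORD


def byte_layout(word: int, base_address: int, byteorder: str) -> dict:
    """Map each byte address of a word to the byte value stored there.

    byteorder is "big" or "little", as accepted by int.to_bytes.
    """
    raw = word.to_bytes(BYTES_PER_WORD, byteorder)
    return {base_address + offset: value for offset, value in enumerate(raw)}


print([word_address(i) for i in range(4)])  # [0, 4, 8, 12]

value = 0x12345678  # hypothetical word stored at word index 1 (address 4)
for order in ("big", "little"):
    layout = byte_layout(value, word_address(1), order)
    print(order, {addr: f"0x{b:02X}" for addr, b in layout.items()})
```

With big-endian ordering the byte `0x12` lands at address 4; with little-endian ordering address 4 holds `0x78` instead.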
Memory Location, Addresses, and Operation
32-bit word length example
[Figure: (a) A signed integer: 32 bits, b31 b30 … b1 b0, where b31 is the sign bit (b31 = 0 for positive numbers, b31 = 1 for negative numbers); (b) Four characters: four 8-bit ASCII characters]

Big-Endian and Little-Endian Assignments: Example
– What bit pattern is contained in the byte at the big end of this 32-bit word: 0xFF00AA11?
– Find out: what bit pattern is contained in the byte at the big end of this 32-bit word: 0x01234567?

Memory Operation
– Load (or Read or Fetch): copy the content. The memory content doesn’t change.
  – A Load is specified by an address.
  – Registers can be used.
– Store (or Write): overwrite the content in memory.
  – A Store is specified by an address and data.
  – Registers can be used.

Designing for Performance
– The cost of computer systems continues to drop dramatically, while the performance and capacity of those systems continue to rise equally dramatically.
– Desktop applications that require the great power of today’s microprocessor-based systems include: image processing, three-dimensional rendering, speech recognition, videoconferencing, multimedia authoring, voice and video annotation of files, and simulation modeling.
– Businesses are relying on increasingly powerful servers to handle transaction and database processing and to support massive client/server networks that have replaced the huge mainframe computer centers of yesteryear.
– Cloud service providers use massive high-performance banks of servers to satisfy high-volume, high-transaction-rate applications for a broad spectrum of clients.

Performance
The speed with which a computer executes programs is affected by the design of its hardware and its machine language instructions.

Processor Speed
– Clock speed: measured in gigahertz (GHz), it indicates the number of clock cycles the processor completes per second. Higher clock speeds generally mean faster processing.
– Instruction Set Architecture (ISA): the design of the processor's instruction set, including the complexity and efficiency of the instructions, affects performance.
– Pipelining: a technique in which the phases of multiple instructions are overlapped. It allows the processor to complete more instructions in a given time.

Memory Hierarchy
– Cache memory: small, fast memory located close to the CPU. It stores frequently accessed data and instructions to reduce the time needed to fetch them from main memory.
  – L1 cache: closest to the CPU, smallest and fastest.
  – L2 cache: larger than L1, slightly slower.
  – L3 cache: even larger and slower, shared among multiple cores.
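The ordering of the cache levels above (L1 checked first, then L2, then L3, then main memory) can be sketched as a simple lookup. The cycle counts below are hypothetical placeholders, and adding up the latency of every level searched is a simplifying assumption; real processors differ.

```python
# Hypothetical access costs in clock cycles, smallest/fastest level first.
LEVELS = [("L1", 4), ("L2", 12), ("L3", 40)]
MAIN_MEMORY_CYCLES = 200  # hypothetical cost of going to main memory


def access_cost(address: int, contents: dict) -> tuple:
    """Return (where the data was found, total cycles spent looking).

    `contents` maps a level name to the set of addresses it currently holds.
    Each level is searched in order, and its latency is paid even on a miss.
    """
    cycles = 0
    for name, latency in LEVELS:
        cycles += latency
        if address in contents[name]:
            return name, cycles
    return "main memory", cycles + MAIN_MEMORY_CYCLES


caches = {"L1": {0}, "L2": {0, 64}, "L3": {0, 64, 128}}
for addr in (0, 64, 128, 256):
    print(addr, access_cost(addr, caches))
```

The further down the hierarchy a request has to travel, the more cycles it costs, which is why keeping frequently used data in the upper levels matters.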
Pipeline in Executing Instructions
Instruction execution is typically divided into 5 stages:
– Instruction Fetch (IF)
– Instruction Decode (ID)
– ALU operation (EX)
– Memory Access (MEM)
– Write Back result to register file (WB)
These five stages can be executed in an overlapped fashion in a pipeline architecture.
– This results in significant speedup by overlapping instruction execution.

Performance
– Main memory (RAM): larger and slower than cache memory. The speed and size of the RAM can significantly affect performance.
– Virtual memory: allows the use of disk storage to extend physical memory. While it provides more memory, it is much slower than RAM.

Storage Speed
– Solid-State Drives (SSD): faster than traditional Hard Disk Drives (HDD). They significantly improve data access times.
– Hard Disk Drives (HDD): slower than SSDs but offer more storage capacity at a lower cost.

Parallelism
– Multicore processors: CPUs with multiple cores can execute multiple instructions simultaneously, improving performance for multi-threaded applications.
– Hyper-Threading: technology that allows a single CPU core to execute multiple threads concurrently.
– Graphics Processing Unit (GPU): specialized processor for handling graphics and parallel computing tasks. GPUs excel in tasks that require parallel processing, such as scientific simulations and machine learning.
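The speedup from the five-stage pipeline described earlier in this part can be estimated by counting clock cycles. The sketch below assumes an ideal pipeline: every stage takes one cycle and there are no stalls or hazards, which real pipelines do not guarantee. The function names are illustrative.

```python
def unpipelined_cycles(n_instructions: int, stages: int = 5) -> int:
    """Each instruction passes through all stages before the next starts."""
    return n_instructions * stages


def pipelined_cycles(n_instructions: int, stages: int = 5) -> int:
    """Ideal pipeline: the first instruction needs `stages` cycles,
    and each later instruction completes one cycle after the previous one."""
    if n_instructions == 0:
        return 0
    return stages + (n_instructions - 1)


def speedup(n_instructions: int, stages: int = 5) -> float:
    return (unpipelined_cycles(n_instructions, stages)
            / pipelined_cycles(n_instructions, stages))


for n in (1, 10, 100, 10_000):
    print(n, unpipelined_cycles(n), pipelined_cycles(n), round(speedup(n), 2))
```

For long instruction sequences the ideal speedup approaches the number of stages, here 5.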
Performance Contd…
– Processor circuits are controlled by a timing signal called a clock.
– The clock defines regular time intervals, called clock cycles.
– To execute a machine instruction, the processor divides the action to be performed into a sequence of basic steps, such that each step can be completed in one clock cycle.
– Let P be the length of one clock cycle; its inverse is the clock rate, $R = 1/P$.
– Basic performance equation: $T = \frac{N \times S}{R}$, where T is the processor time required to execute a program, N is the number of instruction executions, S is the average number of basic steps needed to execute one machine instruction, and R is the clock rate.

Basic Performance Equation
– T: processor time required to execute a program that has been prepared in a high-level language
– N: number of actual machine language instructions needed to complete the execution (note: an instruction inside a loop is counted every time it executes)
– S: average number of basic steps needed to execute one machine instruction; each step completes in one clock cycle
– R: clock rate

$T = \frac{N \times S}{R}$

Note: these parameters are not independent of each other.

How to improve T?
– Increasing clock rate: higher clock rates reduce the execution time, since the denominator in the equation increases.
– Reducing CPI (cycles per instruction, the S term): lower CPI values, through architectural improvements or better instruction set design, reduce the execution time.
– Optimizing instruction count (the N term): reducing the total number of instructions through efficient coding and compiler optimizations also reduces the execution time.

Performance Improvement
Pipelining and superscalar operation
– Pipelining: by overlapping the execution of successive instructions
– Superscalar: different instructions are concurrently executed with multiple instruction pipelines.
– Superscalar operation means that multiple functional units are needed.

Clock rate improvement
– Improving the integrated-circuit technology makes logic circuits faster, which reduces the time needed to complete a basic step.
– Reducing the amount of processing done in one basic step also makes it possible to reduce the clock period, P. However, if the actions that have to be performed by an instruction remain the same, the number of basic steps needed may increase.

Reduce the number of basic steps to execute
– Reduced instruction set computers (RISC) and complex instruction set computers (CISC)

Performance Measurement
– T is difficult to compute.
– Measure computer performance using benchmark programs.
– The Standard Performance Evaluation Corporation (SPEC) selects and publishes representative application programs for different application domains, together with test results for many commercially available computers.
– Compile and run (no simulation)
– Reference computer

$\text{SPEC rating} = \frac{\text{Running time on the reference computer}}{\text{Running time on the computer under test}}$

The overall rating is the geometric mean of the ratings for the n benchmark programs:

$\text{SPEC rating} = \left( \prod_{i=1}^{n} \text{SPEC}_i \right)^{1/n}$

Memory Locations, Addresses, and Operations

Instruction and Instruction Sequencing

Data Representation
Data types:
– Digital computers store binary data in memory or processor registers
– These may hold data or control information, as a bit or a group of bits
– Data: numbers or binary-coded information
Common types of data found:
– Numbers
– Letters of the alphabet
– Other discrete symbols used for specific purposes
All types of data except binary numbers are represented in computer registers in binary-coded form.

– Registers are made up of flip-flops: two-state devices that can store 1s and 0s
– Binary is the most natural number system for such devices
– It is also convenient to use other systems, such as decimal, for users
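Returning to the performance material earlier in this part, the basic performance equation and the SPEC rating can both be evaluated numerically. The sketch below is only illustrative: the function names and all input figures are assumptions, not measurements from any real machine.

```python
import math


def execution_time(n_instructions: float, steps_per_instruction: float,
                   clock_rate_hz: float) -> float:
    """Basic performance equation: T = (N x S) / R, in seconds."""
    return n_instructions * steps_per_instruction / clock_rate_hz


def spec_rating(reference_seconds: float, measured_seconds: float) -> float:
    """Running time on the reference computer divided by running time
    on the computer under test, for one benchmark program."""
    return reference_seconds / measured_seconds


def overall_spec_rating(ratings: list) -> float:
    """Geometric mean of the per-program ratings."""
    return math.prod(ratings) ** (1 / len(ratings))


# Hypothetical program: 1 million instructions, 4 steps each, 2 GHz clock.
print(execution_time(1_000_000, 4, 2e9), "seconds")

# Hypothetical benchmark results: (reference time, measured time) in seconds.
results = [(100.0, 50.0), (240.0, 30.0)]
ratings = [spec_rating(ref, test) for ref, test in results]
print(ratings, overall_spec_rating(ratings))
```

A geometric mean is used for the overall rating so that no single program with a very large ratio dominates the result.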
Number Systems
– A number system of base, or radix, r is a system that uses r distinct symbols for its digits.
– To determine the quantity a number represents, multiply each digit by an integer power of r and sum all the weighted digits.
– Example: the decimal system is the radix-10 system and uses the ten digits 0 through 9.

Binary Subtraction Using 1's Complement
The 1's complement of a binary number is obtained by changing every 0 to 1 and every 1 to 0. For example, the 1's complement of the binary number 110 is 001.
– Step 1: Find the 1's complement of the subtrahend (the second number in the subtraction).
– Step 2: Add it to the minuend (the first number).
– Step 3: If there is a carry out of the leftmost bit, add it to the result obtained in step 2 (end-around carry). The result is the positive difference.
– Step 4: If there is no carry, the difference is negative and the result obtained in step 2 is its 1's complement representation; take the 1's complement of that result to obtain the magnitude.

Difference Between Multiprocessor and Multicomputer
A multiprocessor is a system in which multiple processors share the same memory and are connected to function together. In a multicomputer, multiple computers, each with its own processor and its own memory, are connected and exchange data to work together.
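The four steps of binary subtraction using 1's complement, described earlier in this part, can be sketched as follows. Operands are given as bit strings and treated as unsigned magnitudes of equal width; the function names and the sign-plus-magnitude return format are illustrative assumptions.

```python
from typing import Tuple


def ones_complement(bits: str) -> str:
    """Flip every bit: 0 becomes 1 and 1 becomes 0."""
    return "".join("1" if b == "0" else "0" for b in bits)


def subtract_ones_complement(minuend: str, subtrahend: str) -> Tuple[str, str]:
    """Return (sign, magnitude bits) of minuend - subtrahend."""
    width = max(len(minuend), len(subtrahend))
    a = int(minuend.zfill(width), 2)
    # Step 1: 1's complement of the subtrahend.
    b = int(ones_complement(subtrahend.zfill(width)), 2)
    # Step 2: add it to the minuend.
    total = a + b
    carry, low = divmod(total, 1 << width)
    if carry:
        # Step 3: end-around carry; the result is positive.
        return "+", format(low + carry, f"0{width}b")
    # Step 4: no carry; the sum is the 1's complement of the magnitude.
    return "-", ones_complement(format(low, f"0{width}b"))


print(subtract_ones_complement("110", "011"))  # 6 - 3
print(subtract_ones_complement("011", "110"))  # 3 - 6
```

When the two operands are equal, this procedure yields a minus sign with magnitude 000, the "negative zero" that is characteristic of 1's complement arithmetic.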
