الشابتر_الثامن_والرابع_٢٠٢٣١٢٠٨١٢٣٣_٠٧٦٢١.pdf
Computer Architecture 503323-3
Dr. Rania Mohammed

Hardware Parallelism

Computing: executing instructions that operate on data.
[Figure: a computer operating on instructions and data.]
A machine can have one or many processors that operate on one or many data streams.
Many attempts have been made to categorize computer architectures.

Flynn’s Taxonomy

Flynn’s taxonomy (Michael Flynn, 1966) classifies computer architectures by the number of instruction streams that can execute concurrently and the number of data streams they operate on.
Flynn’s taxonomy has been the most enduring classification method, despite having some limitations.
It takes into consideration the number of processors and the number of data paths incorporated into an architecture.

CPUs are at the core of parallel processing. Based on the number of instruction and data streams that can be processed simultaneously, computing systems are classified into four major categories:

Data streams | One instruction stream | Many instruction streams
One | SISD: traditional von Neumann single-CPU computer | MISD: may be pipelined computers
Many | SIMD: vector processors, fine-grained data-parallel computers | MIMD: multicomputers, multiprocessors

The four combinations of multiple processors and multiple data paths are described by Flynn as:
• SISD: Single Instruction, Single Data. These are classic uniprocessor systems.
• SIMD: Single Instruction, Multiple Data. Execute the same instruction on multiple data values, as in vector processors.
• MIMD: Multiple Instruction, Multiple Data. These are today’s parallel architectures.
• MISD: Multiple Instruction, Single Data.
Single Instruction, Single Data (SISD)
• A serial (non-parallel) computer.
• Single instruction: only one instruction stream is acted on by the CPU during any one clock cycle.
• Single data: only one data stream is used as input during any one clock cycle.
• This is the oldest and, until recently, the most prevalent form of computer.
• Examples: most PCs, single-CPU workstations, and mainframes.

Single Instruction, Multiple Data (SIMD)
• A type of parallel computer.
• Single instruction: all processing units execute the same instruction at any given clock cycle.
• Multiple data: each processing unit can operate on a different data element.
Two varieties of SIMD:
• Processor arrays, e.g. Connection Machine CM-2, MasPar MP-1, MP-2
• Vector pipelines, e.g. IBM 9000, NEC SX-2, Hitachi S820

Multiple Instruction, Single Data (MISD)
• A single data stream is fed into multiple processing units.
• Each processing unit operates on the data independently via independent instruction streams.
• Few actual examples of this class of parallel computer have ever existed.

Multiple Instruction, Multiple Data (MIMD)
• Currently the most common type of parallel computer. Most modern computers fall into this category.
• Multiple instruction: every processor may be executing a different instruction stream.
• Multiple data: every processor may be working with a different data stream.
[Figures: MIMD with distributed memory; MIMD with shared memory.]
• Execution can be synchronous or asynchronous, deterministic or non-deterministic.
• Examples: most current supercomputers, networked parallel computer "grids", and multi-processor SMP computers, including some types of PCs.

System Topologies
A system may also be classified by its topology. A topology is the pattern of connections between processors.
The cost-performance trade-off determines which topology to use for a multiprocessor system.

Topology Classification
A topology is characterized by its diameter, total bandwidth, and bisection bandwidth.
• Diameter: the maximum distance between two processors in the computer system.
• Total bandwidth: the capacity of a communications link multiplied by the number of such links in the system.
• Bisection bandwidth: the maximum data transfer that could occur at the bottleneck in the topology.

Shared Bus Topology: Processors communicate with each other via a single bus that can handle only one data transmission at a time. In most shared buses, processors communicate directly with their own local memory.
[Figure: processors (P), each with local memory (M), attached to a shared bus with global memory.]

Ring Topology: Uses direct connections between processors instead of a shared bus. Allows all communication links to be active simultaneously, but data may have to travel through several processors to reach its destination.
[Figure: processors (P) connected in a ring.]

Tree Topology: Uses direct connections between processors, each having three connections.
There is exactly one path between any pair of processors.
[Figure: processors (P) connected as a tree.]

Mesh Topology: Every processor connects to the processors above and below it, and to its right and left.
[Figure: 3 x 3 mesh of processors (P).]

Cube Topology: A multiple mesh topology. Each processor connects to all other processors whose binary values differ by one bit. For example, processor 0 (000) connects to 1 (001), 2 (010), and 4 (100).
[Figure: cube with processors labeled 000 to 111.]

Hypercube Topology: A multiple mesh topology. Each processor connects to all other processors whose binary values differ by one bit. For example, processor 0 (0000) connects to 1 (0001), 2 (0010), 4 (0100), and 8 (1000).
[Figure: 16-processor hypercube; nodes 0000, 0001, 0010, 0100, and 1000 are labeled.]

Completely Connected Topology: Every processor has n-1 connections, one to each of the other processors. Complexity increases as the system grows, but this topology offers maximum communication capabilities.
[Figure: completely connected processors (P).]

CLO, Topics and Objectives
CLO 1.3: Understand modern architectures and their features.
Topics:
1. General Register Organization
2. Stack Organization
3. Instruction Formats
4. Addressing Modes
5. Data Transfer and Manipulation
6. Program Control
By the end of this chapter, the student will be able to:
• Describe the central processing unit (CPU).
• Describe the operation of a memory stack.
• Describe the various instruction formats together with a variety of addressing modes.
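Looking back at the System Topologies slides, the cube and hypercube connection rule (two processors are linked when their binary labels differ in exactly one bit) can be sketched in a few lines of Python. The function names below are illustrative and not part of the course material, and the diameter is computed by brute force purely as a check of the definition given under Topology Classification.

```python
def hypercube_neighbors(node: int, dimensions: int) -> list[int]:
    """Return the processors whose labels differ from `node` in exactly one bit."""
    return [node ^ (1 << bit) for bit in range(dimensions)]


def hop_distance(a: int, b: int) -> int:
    """Fewest links between two processors: the number of differing label bits."""
    return bin(a ^ b).count("1")


def diameter(dimensions: int) -> int:
    """Maximum hop distance over all pairs of processors (brute force)."""
    nodes = range(2 ** dimensions)
    return max(hop_distance(a, b) for a in nodes for b in nodes)


print(hypercube_neighbors(0, 3))  # [1, 2, 4]
print(hypercube_neighbors(0, 4))  # [1, 2, 4, 8]
print(diameter(3))                # 3
```

The two neighbor lists match the examples given for processor 0 in the cube and hypercube descriptions.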
MAJOR COMPONENTS OF CPU
• Storage components: registers, flags
• Execution (processing) components: Arithmetic Logic Unit (ALU), which performs arithmetic calculations, logical computations, and shifts/rotates
• Transfer components: bus
• Control components: control unit
[Figure: register file, ALU, and control unit.]

REGISTERS
• In the Basic Computer, there is only one general-purpose register, the Accumulator (AC).
• In modern CPUs, there are many general-purpose registers.
• It is advantageous to have many registers:
  • Transfers between registers within the processor are relatively fast.
  • Going "off the processor" to access memory is much slower.
• How many registers is best?

GENERAL REGISTER ORGANIZATION
[Figure: registers R1 to R7 with a common clock and an external input; a 3x8 decoder (SELD) drives the 7 load lines; two 8x1 multiplexers (SELA, SELB) drive the A bus and B bus into the ALU (OPR), which produces the output.]

OPERATION OF CONTROL UNIT
The control unit directs the information flow through the ALU by:
• Selecting various components in the system
• Selecting the function of the ALU
Example: R1 ← R2 + R3
[1] MUX A selector (SELA): BUS A ← R2
[2] MUX B selector (SELB): BUS B ← R3
[3] ALU operation selector (OPR): ALU to ADD
[4] Decoder destination selector (SELD): R1 ← Out Bus

Control Word

Field | SELA | SELB | SELD | OPR
Bits | 3 | 3 | 3 | 5

Encoding of register selection fields

Binary code | SELA | SELB | SELD
000 | Input | Input | None
001 | R1 | R1 | R1
010 | R2 | R2 | R2
011 | R3 | R3 | R3
100 | R4 | R4 | R4
101 | R5 | R5 | R5
110 | R6 | R6 | R6
111 | R7 | R7 | R7
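As a rough illustration of how the 14-bit control word is assembled from its four fields, here is a minimal Python sketch. The dictionary and function names are hypothetical. The register codes come from the encoding table above, and the three OPR codes (TSFA, ADD, SUB) are copied from this chapter's ALU operation encoding table.

```python
# Register-select codes from the encoding table. Code 000 means "Input" for
# SELA/SELB and "None" (no destination register) for SELD.
SELECT_CODE = {"Input": 0b000, "None": 0b000}
SELECT_CODE.update({f"R{i}": i for i in range(1, 8)})

# A few OPR codes taken from the chapter's ALU encoding table.
OPR_CODE = {"TSFA": 0b00000, "ADD": 0b00010, "SUB": 0b00101}


def control_word(sela: str, selb: str, seld: str, opr: str) -> int:
    """Pack SELA (3 bits), SELB (3), SELD (3) and OPR (5) into one 14-bit word."""
    return (
        (SELECT_CODE[sela] << 11)
        | (SELECT_CODE[selb] << 8)
        | (SELECT_CODE[seld] << 5)
        | OPR_CODE[opr]
    )


def show_fields(word: int) -> str:
    """Format a control word as 'SELA SELB SELD OPR' bit groups."""
    bits = f"{word:014b}"
    return " ".join([bits[0:3], bits[3:6], bits[6:9], bits[9:14]])


# R1 <- R2 + R3, the example from the slide
print(show_fields(control_word("R2", "R3", "R1", "ADD")))  # 010 011 001 00010
```

Swapping "ADD" for "SUB" gives 010 011 001 00101, the control word for R1 ← R2 - R3.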
ALU CONTROL
Encoding of ALU operations

OPR select | Operation | Symbol
00000 | Transfer A | TSFA
00001 | Increment A | INCA
00010 | ADD A + B | ADD
00101 | Subtract A - B | SUB
00110 | Decrement A | DECA
01000 | AND A and B | AND
01010 | OR A and B | OR
01100 | XOR A and B | XOR
01110 | Complement A | COMA
10000 | Shift right A | SHRA
11000 | Shift left A | SHLA

Examples of ALU Microoperations

Microoperation | SELA | SELB | SELD | OPR | Control word
R1 ← R2 - R3 | R2 | R3 | R1 | SUB | 010 011 001 00101
R4 ← R4 ∨ R5 | R4 | R5 | R4 | OR | 100 101 100 01010
R6 ← R6 + 1 | R6 | - | R6 | INCA | 110 000 110 00001
R7 ← R1 | R1 | - | R7 | TSFA | 001 000 111 00000
Output ← R2 | R2 | - | None | TSFA | 010 000 000 00000
Output ← Input | Input | - | None | TSFA | 000 000 000 00000
R4 ← shl R4 | R4 | - | R4 | SHLA | 100 000 100 11000
R5 ← 0 | R5 | R5 | R5 | XOR | 101 101 101 01100

REGISTER STACK ORGANIZATION
Stack
• Very useful feature for nested subroutines and nested interrupt services
• Also efficient for arithmetic expression evaluation
• Storage that is accessed in last-in, first-out (LIFO) order
• Pointer: SP
• Only PUSH and POP operations are applicable

Push, Pop operations
Initially: SP = 0, EMPTY = 1, FULL = 0
PUSH:
  SP ← SP + 1
  M[SP] ← DR
  If (SP = MAX) then (FULL ← 1)
  EMPTY ← 0
POP:
  DR ← M[SP]
  SP ← SP - 1
  If (SP = 0) then (EMPTY ← 1)
  FULL ← 0
[Figure: register stack with addresses 0 to 63, a 6-bit stack pointer SP, FULL and EMPTY flags, and data register DR; items A, B, C, D are shown at addresses 1 to 4.]
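The PUSH and POP register transfers above can be mirrored almost line for line in Python. This is only a sketch under stated assumptions: the class name is made up, MAX is taken to be the highest stack address (63 for the 64-word stack in the figure), and the overflow and underflow checks that raise exceptions are an addition, since the slide's microoperations do not include them.

```python
class RegisterStack:
    """Sketch of a register stack with SP, FULL and EMPTY, following the slide."""

    def __init__(self, size: int = 64):
        self.memory = [None] * size
        self.max_address = size - 1  # assumed meaning of MAX
        self.sp = 0
        self.empty = True
        self.full = False

    def push(self, dr):
        if self.full:  # guard added for the sketch, not part of the slide
            raise OverflowError("stack is full")
        self.sp += 1                       # SP <- SP + 1
        self.memory[self.sp] = dr          # M[SP] <- DR
        if self.sp == self.max_address:    # If (SP = MAX) then (FULL <- 1)
            self.full = True
        self.empty = False                 # EMPTY <- 0

    def pop(self):
        if self.empty:  # guard added for the sketch, not part of the slide
            raise IndexError("stack is empty")
        dr = self.memory[self.sp]          # DR <- M[SP]
        self.sp -= 1                       # SP <- SP - 1
        if self.sp == 0:                   # If (SP = 0) then (EMPTY <- 1)
            self.empty = True
        self.full = False                  # FULL <- 0
        return dr


stack = RegisterStack()
for item in ["A", "B", "C"]:
    stack.push(item)
print(stack.pop(), stack.sp)  # C 2
```

With these rules address 0 is never written, so a 64-word stack holds at most 63 items before FULL is set.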
MEMORY STACK ORGANIZATION
Memory with Program, Data, and Stack Segments
• A portion of memory is used as a stack, with a processor register as the stack pointer.
• PUSH: SP ← SP - 1, then M[SP] ← DR
• POP: DR ← M[SP], then SP ← SP + 1
• Most computers do not provide hardware to check for stack overflow (full stack) or underflow (empty stack), so the check must be done in software.
[Figure: memory map with a program (instructions) segment addressed by PC, a data (operands) segment addressed by AR, and a stack segment addressed by SP; the labeled addresses are 1000, 2000, 3000, and 3997 to 4000.]

REVERSE POLISH NOTATION
Arithmetic expressions:
• A + B: infix notation
• + A B: prefix or Polish notation
• A B +: postfix or reverse Polish notation
The reverse Polish notation is very suitable for stack manipulation.

Evaluation of Arithmetic Expressions
Any arithmetic expression can be expressed in parenthesis-free Polish notation, including reverse Polish notation:
(3 * 4) + (5 * 6) becomes 3 4 * 5 6 * +
Stack contents after each token (bottom to top, SP points at the last entry):

Token | 3 | 4 | * | 5 | 6 | * | +
Stack | 3 | 3 4 | 12 | 12 5 | 12 5 6 | 12 30 | 42

PROCESSOR ORGANIZATION
In general, most processors are organized in one of three ways:
• Single register (accumulator) organization
  ◦ The Basic Computer is a good example
  ◦ The accumulator is the only general-purpose register
• General register organization
  ◦ Used by most modern computer processors
  ◦ Any of the registers can be used as the source or destination for computer operations
• Stack organization
  ◦ All operations are done using the hardware stack
  ◦ For example, an OR instruction pops the two top elements from the stack, performs a logical OR on them, and pushes the result onto the stack
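The stack evaluation of a reverse Polish expression, as in the (3 * 4) + (5 * 6) example, can be sketched as follows. The function name and the space-separated token format are assumptions made for this illustration. Operands are pushed, and each operator pops its two operands and pushes the result, which is how a stack-organized processor would handle it.

```python
import operator

# Binary operators supported by this sketch
OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def evaluate_rpn(expression: str) -> float:
    """Evaluate a space-separated reverse Polish expression using a stack."""
    stack = []
    for token in expression.split():
        if token in OPERATORS:
            right = stack.pop()  # top of stack is the right-hand operand
            left = stack.pop()
            stack.append(OPERATORS[token](left, right))
        else:
            stack.append(float(token))  # operand: push onto the stack
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


print(evaluate_rpn("3 4 * 5 6 * +"))  # 42.0
```

Popping the right-hand operand first matters for the non-commutative operators: "8 2 -" evaluates to 6, not -6.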
COMMON BUS SYSTEM
Instruction Fields
• OP-code field: specifies the operation to be performed
• Address field: designates memory address(es) or processor register(s)
• Mode field: determines how the address field is to be interpreted (to get the effective address or the operand)
The number of address fields in the instruction format depends on the internal organization of the CPU.

The three most common CPU organizations:
Single accumulator organization:
  ADD X            /* AC ← AC + M[X] */
General register organization:
  ADD R1, R2, R3   /* R1 ← R2 + R3 */
  ADD R1, R2       /* R1 ← R1 + R2 */
  MOV R1, R2       /* R1 ← R2 */
  ADD R1, X        /* R1 ← R1 + M[X] */
Stack organization:
  PUSH X           /* TOS ← M[X] */
  PUSH Y           /* TOS ← M[Y] */
  ADD              /* TOS ← M[X] + M[Y] */

THREE- AND TWO-ADDRESS INSTRUCTIONS
• Three-Address Instructions (format: OP AD1 AD2 AD3)
Program to evaluate X = (A + B) * (C + D):
  ADD R1, A, B     /* R1 ← M[A] + M[B] */
  ADD R2, C, D     /* R2 ← M[C] + M[D] */
  MUL X, R1, R2    /* M[X] ← R1 * R2 */
- Results in short programs
- Instruction becomes long (many bits)
• Two-Address Instructions (format: OP AD1 AD2)
Program to evaluate X = (A + B) * (C + D):
  MOV R1, A        /* R1 ← M[A] */
  ADD R1, B        /* R1 ← R1 + M[B] */
  MOV R2, C        /* R2 ← M[C] */
  ADD R2, D        /* R2 ← R2 + M[D] */
  MUL R1, R2       /* R1 ← R1 * R2 */
  MOV X, R1        /* M[X] ← R1 */

ONE-ADDRESS INSTRUCTIONS
• One-Address Instructions (format: OP AD)
- Use an implied AC register for all data manipulation
- Program to evaluate X = (A + B) * (C + D):
  LOAD A           /* AC ← M[A] */
  ADD B            /* AC ← AC + M[B] */
  STORE T          /* M[T] ← AC */
  LOAD C           /* AC ← M[C] */
  ADD D            /* AC ← AC + M[D] */
  MUL T            /* AC ← AC * M[T] */
  STORE X          /* M[X] ← AC */
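To see the one-address program for X = (A + B) * (C + D) actually run, here is a minimal interpreter sketch for an accumulator machine. The interpreter, the dictionary-as-memory representation, and the sample values for A, B, C, and D are all assumptions for illustration. Only the four opcodes used by the program are implemented.

```python
def run_accumulator_program(program, memory):
    """Execute one-address instructions using a single implied accumulator (AC)."""
    ac = 0
    for opcode, address in program:
        if opcode == "LOAD":
            ac = memory[address]        # AC <- M[address]
        elif opcode == "ADD":
            ac = ac + memory[address]   # AC <- AC + M[address]
        elif opcode == "MUL":
            ac = ac * memory[address]   # AC <- AC * M[address]
        elif opcode == "STORE":
            memory[address] = ac        # M[address] <- AC
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory


# The one-address program for X = (A + B) * (C + D)
program = [
    ("LOAD", "A"), ("ADD", "B"), ("STORE", "T"),
    ("LOAD", "C"), ("ADD", "D"), ("MUL", "T"), ("STORE", "X"),
]
memory = {"A": 2, "B": 3, "C": 4, "D": 5, "T": 0, "X": 0}  # sample values
run_accumulator_program(program, memory)
print(memory["X"])  # 45
```

The temporary location T is needed because the single accumulator cannot hold A + B while C + D is being computed.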
ZERO-ADDRESS INSTRUCTIONS
• Zero-Address Instructions (format: OP)
- Can be found in a stack-organized computer
- Program to evaluate X = (A + B) * (C + D), written in reverse Polish notation as A B + C D + *:
  PUSH A           /* TOS ← A */
  PUSH B           /* TOS ← B */
  ADD              /* TOS ← (A + B) */
  PUSH C           /* TOS ← C */
  PUSH D           /* TOS ← D */
  ADD              /* TOS ← (C + D) */
  MUL              /* TOS ← (C + D) * (A + B) */
  POP X            /* M[X] ← TOS */

ADDRESSING MODES
• An addressing mode specifies a rule for interpreting or modifying the address field of the instruction before the operand is referenced.
• A variety of addressing modes gives the user programming flexibility and uses the bits in the address field of the instruction efficiently.

TYPES OF ADDRESSING MODES
Implied Mode
The addresses of the operands are specified implicitly in the definition of the instruction.
• No need to specify an address in the instruction
• EA = AC, or EA = Stack[SP]
• Examples from the Basic Computer: CLA, CME, INP
Immediate Mode
Instead of specifying the address of the operand, the operand itself is specified.
• No need to specify an address in the instruction
• However, the operand itself needs to be specified
• Sometimes requires more bits than the address
• Fast to acquire an operand
Register Mode
The address specified in the instruction is a register address.
• The designated operand needs to be in a register
• Shorter address than a memory address
• Saves address-field bits in the instruction
• Faster to acquire an operand than with memory addressing
• EA = IR(R) (IR(R): register field of IR)
Register Indirect Mode
The instruction specifies a register that contains the memory address of the operand.
• Saves instruction bits, since a register address is shorter than a memory address
• Slower to acquire an operand than either register addressing or memory addressing
• EA = [IR(R)] ([x]: content of x)
Autoincrement or Autodecrement Mode
• When the address in the register is used to access memory, the value in the register is automatically incremented or decremented by 1.

Direct Address Mode
The instruction specifies the memory address, which can be used directly to access memory.
• Faster than the other memory addressing modes
• Too many bits are needed to specify the address for a large physical memory space
• EA = IR(addr) (IR(addr): address field of IR)
Indirect Addressing Mode
The address field of the instruction specifies the address of a memory location that contains the address of the operand.
• When an abbreviated address is used, a large physical memory can be addressed with a relatively small number of bits
• Slow to acquire an operand because of the additional memory access
• EA = M[IR(address)]
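The effective-address rules can be checked with a short Python sketch that uses the register and memory values from this chapter's numerical example (R1 = 400, IX = 100, address field 500), including the relative and indexed modes defined on the slide that follows. The function and mode names are hypothetical. PC is taken as 202, the address of the next instruction after the two-word instruction at 200 and 201 has been fetched, which is what the example's relative-mode result implies. The register update performed by autoincrement is noted in a comment but not modeled.

```python
# Memory contents and registers from the chapter's numerical example
MEMORY = {399: 450, 400: 700, 500: 800, 600: 900, 702: 325, 800: 300}
ADDRESS_FIELD = 500
PC = 202   # assumed: PC already points at the next instruction
R1 = 400
IX = 100


def load(mode: str):
    """Return (effective address, value loaded into AC) for a 'Load to AC'.

    The effective address is None when the operand does not come from memory.
    """
    if mode == "immediate":
        return None, ADDRESS_FIELD          # operand is the address field itself
    if mode == "register":
        return None, R1                     # operand is the register content
    if mode == "direct":
        ea = ADDRESS_FIELD
    elif mode == "indirect":
        ea = MEMORY[ADDRESS_FIELD]          # address field points at the address
    elif mode == "relative":
        ea = PC + ADDRESS_FIELD
    elif mode == "indexed":
        ea = IX + ADDRESS_FIELD
    elif mode in ("register_indirect", "autoincrement"):
        ea = R1                             # autoincrement would then set R1 to 401
    elif mode == "autodecrement":
        ea = R1 - 1                         # R1 is decremented before use
    else:
        raise ValueError(f"unknown mode: {mode}")
    return ea, MEMORY[ea]


for mode in ["direct", "immediate", "indirect", "relative", "indexed",
             "register", "register_indirect", "autoincrement", "autodecrement"]:
    print(mode, load(mode))
```

For instance, load("indirect") gives (800, 300): the word at address 500 holds 800, and the word at address 800 holds 300.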
Relative Addressing Modes
• The address field of the instruction specifies part of the address (an abbreviated address), which is combined with a designated register to calculate the address of the operand
• The address field of the instruction is short
• A large physical memory can be accessed with a small number of address bits
• EA = f(IR(address), R), where R is sometimes implied
Three different relative addressing modes, depending on R:
• PC Relative Addressing Mode (R = PC)
  - EA = PC + IR(address)
• Indexed Addressing Mode (R = IX, where IX: Index Register)
  - EA = IX + IR(address)
• Base Register Addressing Mode (R = BAR, where BAR: Base Address Register)
  - EA = BAR + IR(address)

ADDRESSING MODES - EXAMPLES
PC = 200, R1 = 400, IX = 100. The instruction "Load to AC" occupies addresses 200 and 201: address 200 holds the opcode and mode, and address 201 holds the address field, 500.

Address | Memory contents
200 | Load to AC, Mode
201 | Address = 500
202 | Next instruction
399 | 450
400 | 700
500 | 800
600 | 900
702 | 325
800 | 300

Addressing mode | Effective address | Content of AC | Comments
Direct address | 500 | 800 | /* AC ← (500) */
Immediate operand | - | 500 | /* AC ← 500 */
Indirect address | 800 | 300 | /* AC ← ((500)) */
Relative address | 702 | 325 | /* AC ← (PC+500) */
Indexed address | 600 | 900 | /* AC ← (IX+500) */
Register | - | 400 | /* AC ← R1 */
Register indirect | 400 | 700 | /* AC ← (R1) */
Autoincrement | 400 | 700 | /* AC ← (R1)+ */
Autodecrement | 399 | 450 | /* AC ← -(R1) */

In the relative case, PC has already advanced to 202 (the next instruction) when the address is computed, so EA = 202 + 500 = 702.

DATA TRANSFER INSTRUCTIONS
Typical Data Transfer Instructions

Name | Mnemonic
Load | LD
Store | ST
Move | MOV
Exchange | XCH
Input | IN
Output | OUT
Push | PUSH
Pop | POP

Data Transfer Instructions with Different Addressing Modes

Mode | Assembly convention | Register transfer
Direct address | LD ADR | AC ← M[ADR]
Indirect address | LD @ADR | AC ← M[M[ADR]]
Relative address | LD $ADR | AC ← M[PC + ADR]
Immediate operand | LD #NBR | AC ← NBR
Index addressing | LD ADR(X) | AC ← M[ADR + XR]
Register | LD R1 | AC ← R1
Register indirect | LD (R1) | AC ← M[R1]
Autoincrement | LD (R1)+ | AC ← M[R1], R1 ← R1 + 1
Autodecrement | LD -(R1) | R1 ← R1 - 1, AC ← M[R1]

DATA MANIPULATION INSTRUCTIONS
Three basic types:
• Arithmetic instructions
• Logical and bit manipulation instructions
• Shift instructions

Arithmetic Instructions

Name | Mnemonic
Increment | INC
Decrement | DEC
Add | ADD
Subtract | SUB
Multiply | MUL
Divide | DIV
Add with carry | ADDC
Subtract with borrow | SUBB
Negate (2’s complement) | NEG

Logical and Bit Manipulation Instructions

Name | Mnemonic
Clear | CLR
Complement | COM
AND | AND
OR | OR
Exclusive-OR | XOR
Clear carry | CLRC
Set carry | SETC
Complement carry | COMC
Enable interrupt | EI
Disable interrupt | DI

Shift Instructions

Name | Mnemonic
Logical shift right | SHR
Logical shift left | SHL
Arithmetic shift right | SHRA
Arithmetic shift left | SHLA
Rotate right | ROR
Rotate left | ROL
Rotate right through carry | RORC
Rotate left through carry | ROLC

FLAG, PROCESSOR STATUS WORD
• In the Basic Computer, the processor had several status flags, each a 1-bit value indicating some aspect of the processor’s state: E, FGI, FGO, I, IEN, R.
• In some processors, flags like these are combined into a single register, the processor status register (PSR), sometimes called the processor status word (PSW).
• Common flags in the PSW:
  • C (Carry): set to 1 if the carry out of the ALU is 1
  • S (Sign): the most significant bit (MSB) of the ALU’s output
  • Z (Zero): set to 1 if the ALU’s output is all 0’s
  • V (Overflow): set to 1 if there is an overflow

Status Flag Circuit
[Figure: an 8-bit ALU with 8-bit inputs A and B, carries c7 and c8, output F (F7 to F0), a check for zero output, and the flag bits V, Z, S, C.]
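As an illustration of how the four status flags could be derived for an 8-bit addition, consider the following Python sketch. The function name is hypothetical, and operands are assumed to be in the range 0 to 255. Overflow is computed with the usual rule for two's-complement addition, the exclusive-OR of the carry into the sign bit (c7) and the carry out of it (c8), which is consistent with the c7 and c8 labels in the status flag circuit.

```python
def add_with_flags(a: int, b: int) -> dict:
    """Add two 8-bit values and return the result F with the C, S, Z, V flags."""
    total = a + b
    f = total & 0xFF                       # 8-bit ALU output F7..F0
    c8 = total >> 8                        # carry out of bit 7
    c7 = ((a & 0x7F) + (b & 0x7F)) >> 7    # carry into bit 7
    return {
        "F": f,
        "C": c8,            # Carry: carry out of the ALU
        "S": f >> 7,        # Sign: most significant bit of the output
        "Z": int(f == 0),   # Zero: output is all 0's
        "V": c7 ^ c8,       # Overflow: carries into and out of the sign bit differ
    }


print(add_with_flags(0x7F, 0x01))  # {'F': 128, 'C': 0, 'S': 1, 'Z': 0, 'V': 1}
print(add_with_flags(0xFF, 0x01))  # {'F': 0, 'C': 1, 'S': 0, 'Z': 1, 'V': 0}
```

The first call shows signed overflow without a carry (127 + 1 wraps to -128), and the second shows a carry without signed overflow (-1 + 1 = 0).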