Computer Architecture 2 Chapter 2 PDF
Document Details
Uploaded by Deleted User
USTHB
2024
Tags
Summary
This document provides lecture notes on computer architecture, specifically covering Chapter 2: Main Components of a Computer. The topics include Arithmetic Logic Units (ALUs), buses, registers, internal memories, cache memory, and memory hierarchies. The material is presented in a slide format.
Full Transcript
Computer Architecture 2 L2ING Chapter 2 Main Components of Computer 2024-2025 PLAN 1. Arithmetic Logic Unit (ALU - UAL) 2. Buses 3. Registers 4. Internal Memories: RAM (SRAM & DRAM), ROM 5. Cache memory: principle, usefulness, and management strategies...
Computer Architecture 2 L2ING Chapter 2 Main Components of Computer 2024-2025 PLAN 1. Arithmetic Logic Unit (ALU - UAL) 2. Buses 3. Registers 4. Internal Memories: RAM (SRAM & DRAM), ROM 5. Cache memory: principle, usefulness, and management strategies 6. Memory hierarchy Arithmetic Logic Unit (ALU - UAL) The ALU is the only component in the computer responsible for performing all instructions. INSTRUCTION Arithmetic Logic (+, -, %, *…) (AND, OR, XOR, …) Operands & Operator Examples Instruction1: A + B, (A,B): Operands, +: Binary Operator Instruction2: NOT A, A: Operand, NOT: Unary Operator Arithmetic Logic Unit (ALU) 1. UAL is composed of several combinational logic circuits 2. It has 2 inputs (on N bits) as operands, and one output as result 3. It also has a command input allowing to choose the operation to be done (Multiplexer) 4. A table of codes of operations for the command input associated with the UAL 5. Has Flags (flags) PSW = Program Status Word Arithmetic Logic Unit (ALU) C (Command Input) A (Op1) UAL S (Res) B (Op2) PSW (Flags) Arithmetic Logic Unit (ALU) C.C.ADD A C.C.SOUS C.C.AND B C.C.CMP C.C.XOR Arithmetic Logic Unit (ALU) Half-Adder It is a circuit that adds two binary numbers on one bit each and returns two outputs. S: result of the addition R: carry out of the addition 𝑺𝟎 = 𝑨𝟎 ⊕ 𝑩𝟎 𝐴0 𝐵0 𝑆0 𝑅0 𝑹𝟎 = 𝑨𝟎 𝑩𝟎 0 0 0 0 0 1 1 0 1 0 1 0 1 1 0 1 Arithmetic Logic Unit (ALU) Full Adder It is a circuit that adds two numbers on N bits and returns two outputs. S: result of the addition R: carry out of the addition 𝑺𝒏 = 𝑨𝒏 ⊕ 𝑩𝒏 ⊕ 𝑹𝒏−𝟏 N bits 𝑹𝒏 = 𝑹𝒏−𝟏 (𝑨𝒏 ⊕ 𝑩𝒏 ) + 𝑨𝒏 𝑩𝒏 𝐴𝑛 𝐵𝑛 𝑅𝑛−1 𝑆𝑛 𝑅𝑛 0 0 0 0 0 0 0 1 1 0 0 1 0 1 0 0 1 1 0 1 1 0 0 1 0 1 0 1 0 1 1 1 0 0 1 1 1 1 1 1 Arithmetic Logic Unit (ALU) Full Adder – 4 bits Arithmetic Logic Unit (ALU) Full Adder – 4 bits Processor 32 bits → 32 A.C Processor 64 bits → 64 A.C Arithmetic Logic Unit (ALU) C C.C.ADD A C.C.SOUS C.C.AND B MUX S C.C.CMP C.C.XOR Arithmetic Logic Unit (ALU) Table of codes & line commande C 𝑪𝟐 𝑪𝟏 𝑪𝟎 S C=011 0 0 0 Add 0 0 1 Sous 0 1 0 Multip MUX S=Div 0 1 1 Div 1 0 0 Shift Right 1 0 1 Shift Left 1 1 0 (A or B) xor A 1 1 1 Not Arithmetic Logic Unit (ALU) Multiplexer (4 vers 1) 𝑪𝟏 𝑪𝟎 𝑪𝟏 𝑪𝟎 S 𝑬𝟎 𝑬𝟏 0 0 𝑬𝟎 MUX S 𝑬𝟐 4 vers 1 0 1 𝑬𝟏 𝑬𝟑 1 0 𝑬𝟐 S=𝑬𝟎 𝑪𝟎 𝑪𝟏 + 𝑬𝟏 𝑪𝟎 𝑪𝟏 + 𝑬𝟐 𝑪𝟎 𝑪𝟏 + 𝑬𝟑 𝑪𝟎 𝑪𝟏 1 1 𝑬𝟑 Arithmetic Logic Unit (ALU) Multiplexer (4 vers 1) S=𝑬𝟎 𝑪𝟎 𝑪𝟏 + 𝑬𝟏 𝑪𝟎 𝑪𝟏 + 𝑬𝟐 𝑪𝟎 𝑪𝟏 + 𝑬𝟑 𝑪𝟎 𝑪𝟏 Arithmetic Logic Unit (ALU) C C.C.ADD A C.C.SOUS C.C.AND B MUX S C.C.CMP C.C.XOR PSW (Flags) ZF SF CF OF … Arithmetic Logic Unit (ALU) – CF (Carry-out): set to 1 in case of overflow on unsigned operation – OF (Overflow): set to 1 in case of overflow on signed operation. – ZF (Zero): flag set to 1 if the result of the operation is 0. – PF (Parity): 1 if the low-order of result contains even number of 1, 0 otherwise. – AF (Auxiliary): 1 if there is a carry on the 3rd bit, 0 otherwise – SF (Sign): 1 if the high-order bit of the result is = 1, 0 otherwise Arithmetic Logic Unit (ALU) C 𝟐𝒏 𝐛𝐢𝐭𝐬 = Nb_operations N-bits C.C.OP1 A C.C.OP2 N-bits N-bits C.C.OP3 B MUX S C.C.OP4 C.C.OPi … Z S C O Arithmetic Logic Unit (ALU) The Buses A communication bus is a set of wires Ensures the transmission of signals and the exchange of information between the different units of the computer The Buses Each wire carries a signal or not, it is present or absent: 1: signal present, 0: signal absent 1 0 1 1 10110100 0 1 8 fils → bus 8-bits 0 0 The Buses Depending on the nature of the information to be transported, there are three types of information bus ✓ Data bus ✓ Address bus ✓ Command (or control) bus The Buses The Buses ✓ Address bus (addressing or memory bus): this is a unidirectional bus, it allows to transport the addresses generated by the CPU for addressing the Main Memory. ✓ Size of B.A = n-bits → Main memory size = 2𝑛 bytes / byte = octet = 8 bits NB: Size of data stored in main memory being the byte, or a multiple of bytes Taille de bus d’adresses Address space M.C 4 bits 16 octets 8 bits 256 octets 10 bits 1 k octets 16 bits 64 k octets (65536) 32 bits 4G octets The buses ✓ Data bus: this is a bidirectional bus, it ensures the transfer of data between the microprocessor and its environment, and vice versa. The size of this bus specifies the possible values of the data that can be carried by this bus If the size of B.D = n-bits, so : ✓ Les valeurs possibles des données non-signées pouvant être véhiculées par ce bus :[0, 𝟐𝒏 -1] ✓ Possible values of signed data that can be carried by this bus: [-𝟐𝒏−𝟏 , +𝟐𝒏−𝟏 -1] Taille B.D Non-signées possibles Signées possibles 4 bits 0 , 15 (0H – FH) -8 , +7 (8H – 7H) 8 bits 0 , 255 (00H – FFH) -128 , +127 (80H – 7FH) 16 bits 0 , 65535 (0000H-FFFFH) -32768 , +32767 (8000H – 7FFFH) 32 bits ??? ??? 64 bits ??? ??? The buses ✓ Control bus: This is a bidirectional bus, used to transport control signals indicating the type of operation. ✓ M/IO = 1: memory access, =0: peripheral access M/IO ✓ R/W = 1: read operation, = 0: write operation CPU R/W ✓ DEN = 0: valid data, 1 = invalid data DEN Program R/W M/IO DEN Mov BL, X 1 1 0 IN AL, port_clavier 1 0 0 OUT port_ecran, BL 0 0 0 Mov X, AL 0 1 0 The registers When the processor executes the instructions in progress, the data is temporarily stored in small fast memories (8, 16, 32 or 64 bits) called REGISTERS. Depending on the type of processor, the total number of registers can vary from a dozen to several hundred (Ex: Intel-32bits contains 16 registers). They can be of the address type (they then contain an address of the memory word) or data (they then contain the content of a memory word). They can be specific and have a very precise function (for example the “ordinal counter” register) or general and serve mainly for intermediate calculations (for example the “AX” accumulator register) The registers 15 0 AX Accumulator Register BX Base Register CX Counter Register DX Data Register SP Stack Pointer BP Base Pointer DI Destination Index SI Source Index General register Specific register The registers 7 07 0 AH AL BH BL H : High CH CL L : Low DH DL SP BP DI SI The registers 15 0 X X X X OF DF IF TF SF ZF X AF X PF X CF PSW Condition bits (red) Bits to read, configured by the processor and interpreted by the programmer – CF (Carry-out): set to 1 in case of overflow on unsigned operation – OF (Overflow): set to 1 in case of overflow on signed operation. – ZF (Zero): flag set to 1 if the result of the operation is 0. – PF (Parity): 1 if the low-order bit of the result is an even number of 1, 0 otherwise.– AF (Auxiliary): 1 if carried on the 3rd bit, 0 otherwise – SF (Sign): 1 if the high-order bit of the result is = 1, 0 otherwise The registers – CF (Carry-out): set to 1 in case of overflow on unsigned operation – OF (Overflow): set to 1 in case of overflow on signed operation. – ZF (Zero): flag set to 1 if the result of the operation is 0. – PF (Parity): 1 if the low-order bit of the result is an even number of 1, 0 otherwise. – AF (Auxiliary): 1 if carried on the 3rd bit, 0 otherwise – SF (Sign): 1 if the high-order bit of the result is = 1, 0 otherwise 1111 1111 0111 1111 1000 0000 + + + 0000 0001 0000 0001 1111 1111 1 0000 0000 1000 0000 1 0111 1111 CF=1 CF=0 OF=1 (Somme 2 nbrs pos CF=1 OF=0 donne result neg !) OF=1 ZF=1 ZF=0 ZF=0 The registers 15 0 X X X X OF DF IF TF SF ZF X AF X PF X CF PSW Control Bits (bleu) These are writeable bits, configured by the programmer. DF=0: left to right (CLD instruction: DF=0) DF=1: right to left (STD instruction: DF=1) IF=0 : ignored (CLI instruction: IF=0) IF=1 : allowed (CLI instruction: IF=1) TF=0: step-by-step execution TF=1: block execution Pour modifier le bit TF: PUSHF (empiler le registre PSW) POP AX (AX=PSW) OR AX, 0000000100000000b (Pour TF=1) ou AND AX, 1111111011111111b (pour TF=0) PUSH AX POPF (Dépiler sommet de pile vers PSW) Les registres 4 Segment Registers DS (Data Segment) : contains DATA segment address, points to the data memory area SS (Stack Segment) : contains stack segment address CS (Code Segment) : contains code segment address and instructions of the executable program. ES (Extra Segment) : address for auxiliary or additional data Internal Memories We call "memory" any device capable of acquiring, recording, storing information (data + programs), and restoring it on demand. Internal Memories Registers UCT Very (CPU) fast access Secondary Main Memory Memory Primary memory Auxiliary Internal External Mass memory Fast RAM Slow Hard disc access ROM Flash disque access DRAM CD/DVD … etc … etc Internal Memories Main memory The central RAM ROM memory is RAM- Volatil dominated Non volatil Read/Write Read Only Temporarily stores information Stores immutable information in use on the computer (Ex:Bios) Internal Memories RAM (Random Access Memory) ✓ RAM is a structure composed of several memory cells of the same size ✓ It is formally defined as an array with cells indexed as addresses ✓ It is characterized by its Size (capacity), and the Memory Word. Internal Memories Central Memory RAM Cell Address 0 0 0 0 1 1 1 1 0 Cell Cell Word memory = Number of bits (8 bits).... Size = Nb_cells × Word memory (en. octets Cell Address N-3 Nb_cell = 16, Word_Mem = 8 bits Size_Mem = 16 × 8 bits (1 octet) Cell Address N-2 = 16 octets Cell Address N-1 Internal Memories Communication between CPU & RAM Address bus : Adresse 2 Cell Adresse N-1 Cell Cell. CPU... 10100110. 10100110 Adresse 2 Cell Adresse 1 Control bus (to read) Cell Adresse 0 Internal Memories Communication between CPU & RAM Address bus : Adresse 1 Cell Adresse N-1 Cell Cell. CPU Data bus : 00001111... 00001111. Cell Adresse 2 00001111 Adresse 1 Control bus (to write) Cell Adresse 0 Internal Memories INTUITIF COMPUTE ! Cell 0000 Cell Cell If the address is on n-bits, then it is. possible to reference 𝟐𝒏 memory cells. (In the example it is 16 memory cells. or words).. Cell 1101 Cell 1110 Cell 1111 Internal Memories Memory Segmentation Segmentation is the subdivision (logical, not physical) of RAM @Segment into multiple equal-sized areas. 0 0 0 0 0 0 0 1 0 0 1 0 0 0 1 1 It is done by cutting the bits of 0 1 0 0 the address bus 0 1 0 1 0 1 1 0 0 1 1 1 1 0 0 0 1 0 0 1 1 0 1 0 1 0 1 1 1 1 0 0 1 1 0 1 1 1 1 0 1 1 1 1 Internal Memories Memory Segmentation @Segment @Offset 0 0 0 0 Ecriture Segment 11 0 0 0 1 Segment:Offset 0 0 1 0 0 0 1 1 0 1 0 0 00:01 Segment 10 0 1 0 1 0 1 1 0 0 1 1 1 01:01 1 0 0 0 Segment 01 1 0 0 1 1 0 1 0 11:10 1 0 1 1 1 1 0 0 1 1 0 1 Segment 00 1 1 1 0 1 1 1 1 Internal Memories The Stack ✓ The stack is a structure of central memory ✓ Uses the principle of LIFO (Last In First Out) 0000 0001 ✓ SP is a register that points to the top 0010 of the stack 0011 ✓ Manipulated by PUSH and POP 0100 instructions 0101 0110 ✓ Saves return addresses when calling 0111 procedures 1000 SS=10 1001 1010 SP xxxxx 1011 1100 1101 1110 1111 Internal Memories The stack 10: offset1: Ins 1 10: offset2: Ins 2 10: offset3: CALL PROC 10: offset4: Ins 4 PROC 0000 10:offset: Inst 1 0001 10:offset: Inst 2 0010 RET 0011 0100 CS=10 0101 0110 0111 1000 SS=01 1001 1010 SP xxxxx 1011 Procedure Near 1100 1. CALL : Empiler @_Retour (Sp=Sp-2) 1101 2. RET: Sp=Sp+2 1110 1111 Dépiler sommet vers Reg Ip Internal Memories 10: offset1: Ins 1 The Stack 10: offset2: Ins 2 10: offset3: CALL PROC 10: offset4: Ins 4 0000 PROC 0001 00: offset1: Ins 1 0010 00: offset2: Ins 2 0011 RET 0100 CS=10 0101 0110 Procedure Far 0111 1. CALL : Empiler CS_Retour 1000 2. Sp=Sp-2 SS=01 1001 1010 SP 3. Empiler Ip_Retour xxxxx 1011 4. Sp=Sp-2 1100 5. RET: CS=00 1101 1110 Dépiler sommet vers Ip 1111 Dépiler CS Internal Memories RAM SRAM DRAM It does not need to it needs to be updated be updated Random Access (55ms) Read/Write Larger size Small size Fast access time Long access time Used in cache Used in central memories memory Consumes less Consumes more energy energy lot of money. Abordable Internal Memories Internal Memories Technology Access $ par GB time SRAM 1 – 10 ns >> 1000 Central Memory DRAM 100 ns 10 SSD 100 𝜇𝑠 1 Secondary Memory HHD 10 ms 0.1 Cache Memory 1 heure OUTPUT INPUT UNITS CPU UNITS Cache Memory Rapide + de (SRAM) petite taille CENTRAL SECONDARY 1 jour MEMORY MEMORY Cache Memory INPUT UNITS OUTPUT UNITS CPU 2 1 1 4 Succès de cache Cache Memory Echec de cache (Hit) (SRAM) (Miss) 2 3 CENTRAL SECONDARY MEMORY MEMORY Cache Memory Principle of locality The locality principle states that the information that the processor will access has a high probability of being located in a spatial window and a temporal window. Temporal Locality Spatial Locality A program that has A program that has manipulated information manipulated information in in the recent past has a the recent past has a very very high chance of high chance of manipulating manipulating it again information close to this shortly after. information shortly after. Cache Memory Temporal Locality Spatial Locality For loop Table T Index Contenu - Instruction 1; (x) T Val 1 - Instruction 2; (x,y) Memory T Val 2 - Instruction 3; - ….; Block T Val 3 ….; - - ….; Block = k - ….; mots - Instruction N; mémoires T[i] Val i Cache Memory N° line Tag Cache memory Central memory 0 Block 1 @0 Mot mémoire 1 Block 2 @1 2 Block 3 @2 Block 1 (K mots) J Block M Size line = K words Block M (K mots) @ 𝟐𝒏 − 𝟏 Size of cell (Ex : 1 octets) Cache Memory Mapping function The cache size is much smaller than the memory size. A strategy for copying data blocks into the cache must be defined. This method is called mapping. Cache Memory Mapping functions ✓ Correspondance directe (direct mapped cache) : the main memory block m can only be found in the line j = m % M of the cache memory, knowing that M is the number of lines of the cache memory ✓ Correspondance totalement associatif (fully associative cache) : each memory block can be placed in any block of the cache ✓ Correspondance associative par ensemble (set associative cache) : separation of the cache memory into groups of blocks and complete associativity within a group, i.e. block m of the main memory can be found in any block of the group g = m % v of the cache memory, knowing that v is the total number of groups of blocks in the cache memory Cache Memory Mapping functions ✓ Correspondance directe (direct mapped cache) : the main memory block m can only be found in the line j = m % M of the cache memory, knowing that M is the number of lines of the cache memory j=m%M Mem Centrale j : N° line m : N° block Tag Mem Cache Block 0 M : Nombre lines 0 Line 0 Block 1 j = (m % 4) 1 Line 1 j(0) = (0 % 4)= 0 Block 2 j(1) = (1 % 4)= 1 2 Line 2 j(2) = (2 % 4)= 2 3 Line 3 Block 3 j(3) = (3 % 4)= 3 j(4) = (4 % 4)= 0 Block 4 j(5) = (5 % 4)= 1 Block 5 j(6) = (6 % 4)= 2 j(7) = (7 % 4)= 3 j(8) = (8 % 4)= 0 Cache Memory Mapping function Cache memory access (direct cache case) s+w Tag Line Mot s-r r w ✓ Taille adresse mémoire = s+w ✓ Partie Mot = w ✓ Partie Line = r ✓ Partie Tag = s-r Cache Memory Mapping function Cache memory access (direct cache case) s+w Tag Line Mot s-r r w ✓ Nombre mots mémoire (Taille mémoire centrale ‘’MC’’) = 2(𝑠+𝑤) ✓ Nombre mots par block (Taille block, Taille line)= 2𝑤 2𝑠+𝑤 ✓ Nombre blocks dans MC = 𝑤 = 2𝑠 2 ✓ Nombre de lines dans cache = 2𝑟 ✓ Taille mémoire cache = 2𝑟+𝑤 ✓ Taille tag = 𝑠 − 𝑟 𝑏𝑖𝑡𝑠 Cache Memory Mapping function Cache memory access (direct cache case) Mémoire Centrale Adresse Tag Line W W1 Mémoire Cache W2 Tag W1 W3 B1 W2 W4 L1 W3 W4 CMP × Tag W6 W7 L3 W10 W8 × W9 W11 Hit B5 W12 W13 Miss Instant t (Etat Init: L0 contient B2) (Inconvénient direct cache) Instant t+1 accès W1 accès W7 C W1 W2 W3 B0 C W1 W2 W3 B0 L0 B2 L0 B0 P L1 W4 W5 W6 B1 P W4 W5 W6 B1 L1 U W7 W8 W9 B2 U W7 W8 W9 B2 Instant t+2 Instant t+3 accès W3 accès W7 C W1 W2 W3 B0 C W1 W2 W3 B0 L0 B2 L0 B0 P L1 W4 W5 W6 B1 P W4 W5 W6 B1 L1 U W7 W8 W9 B2 U W7 W8 W9 B2 Instant t+4 Instant t+5 accès W2 accès W9 C W1 W2 W3 B0 C W1 W2 W3 B0 L0 B2 L0 B0 P L1 W4 W5 W6 B1 P L1 W4 W5 W6 B1 U W7 W8 W9 B2 U W7 W8 W9 B2 Cache Memory Mapping functions ✓ Correspondance totalement associatif (fully associative cache) : each memory block can be placed in any line of the cache Mem Centrale Mem Cache Block 0 0 Line 0 Block 1 1 Line 1 Block 2 2 Line 2 3 Line 3 Block 3 Block 4 Block 5 Cache Memory Mapping functions L’accès à la mémoire cache (Cas cache totalement associatif ) s+w Tag W s w ✓ Nombre de mots mémoire (On dit mots mémoire ou taille MC) = 2(𝑠+𝑤) ✓ Nombre mots par block (Taille block, Taille line)= 2𝑤 (car taille bloc=taille line) 2𝑠+𝑤 ✓ Nombre blocks dans MC = = 2𝑠 (NB_Bloc= taille totale MC/taille d’un bloc) 2𝑤 ✓ Nombre de lines dans cache = Indéterminé ✓ Taille tag = 𝑠 Cache Memory Mapping function L’accès à la mémoire cache (Cas cache totalement associatif ) Mémoire Centrale Adresse Tag W W1 Mémoire Cache W2 Tag W3 B1 L1 W4 CMP × Tag L3 W10 W11 Hit × B5 W12 W13 Miss Cache Memory Mapping functions ✓ Correspondance associative par ensemble (set associative cache) : separation of the cache memory into groups of blocks and complete associativity within a group, i.e. block m of the main memory can be found in any block of the group g = m % v of the cache memory, knowing that v is the total number of groups of blocks in the cache memory ✓ Combination between Direct and Associative functions ✓ Divide the cache into V sets, and each set g contains a number of lines k ✓ A memory block can be stored in any line of a specific (unique) set If a group contains K-lines, the system is called a K-way set associative cache (Most computers today use 2-way or 4-way cache) Cache Memory Mapping functions ✓ Divide the cache into V sets, and each set g contains a number of lines k ✓ A memory block can be stored in any line of a specific set (g = m % V) Tag Cache mem Central mem g0, k lines Block 0 g1, k lines Block 1 Block 2 g2, k lines g4, k lines Block 4 Block M Cache Memory Mapping function L’accès à la mémoire cache (Cas cache associatif par ensemble) s+w Tag Ensemble Mot s-d d w ✓ Nombre de mots mémoire = 2 (𝑠+𝑤) ✓ Nombre mots par block (Taille block, Taille line)= 2𝑤 2𝑠+𝑤 ✓ Nombre blocks dans MC : M = = 2𝑠 2𝑤 ✓ Nombre de lines dans ensemble = k (k-way: selon les archis actuelles k=2 ou 4) ✓ Nombre d’ensembles = V = 2𝑑 ✓ Nombre de lines dans cache = k * V = k * 2𝑑 (nb lignes par groupes * nb total groupe) ✓ Taille cache = k* 2(𝑑+𝑤) mots (nb lines total * taille une line) ✓ Taille Tag = s – d Cache Memory Mapping function L’accès à la mémoire cache (Cas cache associatif par ensemble) Mémoire Centrale Adresse Tag v W W1 Mémoire Cache W2 Tag W3 B1 Tag W4 g1 Tag Tag CMP × Tag Tag g3 W10 Tag Tag W11 Hit × B5 W12 W13 Miss Cache Memory Cache Memories Levels Cache Cache Cache Central Secondary CPU L3 Memory L1 L2 Memory 1000 bytes 300 ps 64 KB 256 KB 2-4 MB 1 ns 3-10 ns 10-20 ns Size Access Time 4-16 GB 50-100 ns 1 TB 5-10 ms Cache Memory Cache Memories Levels Mémoire cache Memory hierarchy Registers (CPU) Le cache L1 Response (on chip) Cost Time Les caches L2 & L3 (off chip) Central Memory (DRAM) Secondary Memory Size