Computer Organisation & Architecture Lecture 2: Computer Function and Cache Memory
Summary
This lecture details computer function and cache memory, including the instruction fetch and execute cycles. It explores computer components such as the CPU and memory, and explains how these parts interact.
Full Transcript
Singapore CSC1104 - COMPUTER ORGANISATION & ARCHITECTURE
LECTURE 2: COMPUTER FUNCTION AND CACHE MEMORY
Assoc. Prof. Cao Qi, [email protected], School of Computing Science

Acknowledgement
Main contents of CSC1104 - Computer Organisation and Architecture are derived from:
Computer Organization and Architecture: Designing for Performance. Author: William Stallings. Publisher: Pearson.
Computer Organization and Design: The Hardware/Software Interface. Authors: D. Patterson and J. Hennessy. Publisher: Morgan Kaufmann.
Acknowledgement to the authors and publishers.

Lecture Contents
Computer Function: instruction fetch and execute; interrupts.
Cache Memory: characteristics of memory systems; memory hierarchy; cache memory principles.

Computer Function

Von Neumann Architecture (Recap)
The von Neumann architecture, by John von Neumann:
Data and instructions are stored in the same memory.
Memory contents are accessed by address.
Instructions are executed sequentially.

CPU Components: Top-Level View
Memory address register (MAR): specifies the address in memory for the next read or write.
Memory buffer register (MBR): contains the data read from, or to be written to, memory.
Program counter (PC): holds the address of the next instruction to be fetched.
Instruction register (IR): holds the instruction currently being executed.
I/O modules transfer data between external I/O devices and the computer.

Computer Functions
The computer executes a program: a set of instructions stored in memory.
An instruction cycle is the set of tasks required to process a single instruction:
Fetch cycle: the processor reads instructions from memory one at a time.
Execute cycle: the current instruction is executed.

Instruction Fetch and PC
(Figure: memory addresses such as 0x0300, 0x0301, 0x0302 holding instructions such as 0x1234, alongside the accumulator.)
The processor fetches an instruction from memory into the instruction register (IR).
The program counter (PC) holds the address of the next instruction to be fetched.
The PC is incremented after each instruction fetch, pointing to the next instruction in sequence, unless a special instruction changes it.
The accumulator (AC) is a data register in the CPU used for temporary storage.

Instruction Register (IR)
The fetched instruction is loaded into the IR. The instruction specifies the action to be taken, in one of four categories:
Processor-memory: data transferred between processor and memory.
Processor-I/O: data transferred between processor and I/O peripherals.
Data processing: the processor performs some operation on data.
Control: alters the sequence of instruction execution. E.g., the CPU fetches from address 0x145 a control instruction specifying that the next instruction comes from address 0x182. The CPU then sets the program counter (PC) to 0x182, so on the next fetch cycle the instruction is fetched from address 0x182 rather than 0x146 (i.e., 0x145 + 1).

Operators, Operands, Operations
Operators specify an operation to perform (e.g., add, subtract, multiply, store).
Operands are the values that the operators operate upon.
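To make the fetch and execute cycles concrete, here is a minimal Python sketch (not part of the lecture) of the hypothetical machine used in Example 2.1 below: 16-bit words split into a 4-bit opcode and a 12-bit address, with the lecture's opcodes 0x1 (Load AC), 0x2 (Store AC), and 0x5 (Add). The halt behaviour and the sample program values are assumptions for illustration.

```python
# Minimal fetch-execute loop for a hypothetical accumulator machine
# (16-bit words: 4-bit opcode, 12-bit address). Opcode values follow
# the lecture; halt-on-unknown-opcode and memory size are assumed.

MEM_SIZE = 4096                      # 2^12 addressable words

def run(memory, pc=0x300):
    ac = 0                           # accumulator (AC)
    while True:
        ir = memory[pc]              # fetch cycle: read instruction into IR
        pc += 1                      # PC now points to next instruction in sequence
        opcode = (ir >> 12) & 0xF    # upper 4 bits
        addr = ir & 0xFFF            # lower 12 bits
        if opcode == 0x1:            # Load AC from memory
            ac = memory[addr]
        elif opcode == 0x2:          # Store AC to memory
            memory[addr] = ac
        elif opcode == 0x5:          # Add data from memory into AC
            ac = (ac + memory[addr]) & 0xFFFF
        else:                        # assumed: any other opcode halts
            return ac

# Assumed sample program: add the contents of 0x940 and 0x941,
# then store the result back into 0x941.
mem = [0] * MEM_SIZE
mem[0x300] = 0x1940                  # Load AC from 0x940
mem[0x301] = 0x5941                  # Add contents of 0x941 into AC
mem[0x302] = 0x2941                  # Store AC to 0x941
mem[0x940], mem[0x941] = 0x0003, 0x0002
run(mem)
assert mem[0x941] == 0x0005          # 3 + 2 = 5
```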
Example 2.1 - A Partial Program Execution
Instruction format: 4-bit opcode + 12-bit address = 16 bits.
2^4 = 16 different opcodes; 2^12 = 4,096 memory addresses.
Both instructions and data are 16 bits long.
Example opcodes:
0x1 = Load AC from memory
0x2 = Store AC to memory
0x5 = Add data from memory into AC
In each fetch cycle, the instruction (opcode + address) at the PC is read and the PC is incremented by 1; the execute cycle then carries out the operation.

Interrupts
Almost all computers provide a mechanism by which other modules (I/O, memory, etc.) can interrupt the normal processing of the CPU.
Interrupts are a way to improve processing efficiency, since most external devices are much slower than the CPU.

Classes of Interrupts
Program: the result of arithmetic overflow, division by zero, an attempt to execute an illegal machine instruction, or a reference outside a user's allowed memory space.
Timer: generated by a timer within the CPU; it allows the operating system to perform certain functions on a regular basis.
I/O: generated by an I/O controller to signal normal completion of an operation, to request service from the CPU, or to signal a variety of error conditions.
Hardware failure: generated by a failure such as a power failure or a memory parity error.

Program Flow without and with Interrupts
(Figure: program flow without and with interrupts; X marks where an interrupt occurs during execution of the user program.)

Interrupt Cycle
On receiving an interrupt request, the CPU checks which interrupt occurred. For a pending interrupt, the interrupt handler is performed:
The CPU suspends execution of the current program and saves its context (program counter and data relevant to the current activity).
It sets the program counter (PC) to the starting address of the interrupt handler.
The CPU resumes the original execution after the interrupt is serviced.

Sequential Interrupt Approach
Approach 1: disable interrupts while an interrupt is being processed.
❑ When an interrupt occurs, CPU interrupts are disabled immediately, and new interrupt requests are not responded to.
❑ Interrupts are handled in sequential order by the interrupt service routine (ISR).
❑ After the ISR completes, CPU interrupts are re-enabled. Before resuming the user program, the CPU checks whether any interrupts occurred that have not yet been responded to.

Nested Interrupt Approach
Approach 2: define priorities for interrupts, allowing a higher-priority interrupt to be serviced first; it can interrupt a lower-priority interrupt service routine (ISR).
❑ Devices are assigned different interrupt-handler priorities; the higher the number, the higher the priority.
❑ Interrupts from higher-priority devices are responded to first by the CPU (e.g., an ISR with priority = 2 can preempt an ISR with priority = 1).

Example 2.2 - ISR of Multiple Interrupts
❑ A system with three I/O devices: a printer (priority = 2), a disk (priority = 4), and a communications line (priority = 5).

Memory Hierarchy and Cache Memory

Important Characteristics of Memory
1. Capacity of memory (bytes or words): 1 word = 2 bytes; 1 byte = 8 bits.
2. Performance of memory:
Access time (latency): the time from when an address is presented to the memory until the data is stored or read out.
Memory cycle time: access time plus any additional time required before the next access can commence.
Transfer rate: the rate at which data can be transferred into or out of a memory unit.
❑ For random-access memory: transfer rate = 1/(cycle time).
❑ For non-random-access memory: Tn = TA + n/R, where Tn = average time to read or write n bits, TA = average access time, n = number of bits to read/write, and R = transfer rate, in bits per second (bps).
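These two formulas can be exercised with a few lines of Python; this is a sketch only, and the clock, device, and block-size parameters are assumed values rather than figures from the lecture.

```python
# Transfer-rate and average-access-time arithmetic from the
# definitions above. All parameter values are assumed.

# Random-access memory: transfer rate = 1 / (cycle time)
cycle_time = 10e-9                     # 10 ns memory cycle time (assumed)
transfer_rate = 1 / cycle_time         # transfers per second
print(f"Random access: {transfer_rate:.2e} transfers/s")

# Non-random-access memory: Tn = TA + n/R
TA = 5e-3                              # 5 ms average access time (assumed)
R = 100e6                              # 100 Mbit/s transfer rate (assumed)
n = 8 * 4096                           # reading a 4 KiB block = 32,768 bits
Tn = TA + n / R
print(f"Non-random access: Tn = {Tn * 1e3:.3f} ms to read {n} bits")
```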
Physical Characteristics
▪ Volatile memory: data decays naturally or is lost when electrical power is switched off.
▪ Nonvolatile memory: no electrical power is needed to retain data once it is recorded.
▪ Nonerasable memory: cannot be altered, except by destroying the storage unit, e.g., read-only memory (ROM).
▪ Erasable memory: can be altered and erased.
▪ Semiconductor memory: either volatile or nonvolatile.
▪ Magnetic-surface memories: nonvolatile.

Memory Hierarchy
Design constraints on a computer's memory: how large? how fast? how expensive?
Trade-off among capacity, access time, and cost:
Access time ↓, cost per bit ↑.
Capacity ↑, cost per bit ↓.
Capacity ↑, access time ↑.
Solution: rely not on a single memory component, but on a memory hierarchy.

Relative Cost, Size and Speed Characteristics
Smaller, more expensive, faster memories are supplemented by larger, cheaper, slower memories.
Going down the hierarchy: cost per bit ↓, capacity ↑, access time ↑, frequency of access by the CPU ↓.
Primary memory: volatile, semiconductor.
Secondary memory: non-volatile, external.

Performance of Accesses Across Two Levels of Memory
In a two-level memory, M1 is smaller, faster, and more expensive than M2.
If data is in level M1, the CPU can access it directly. If it is in level M2, the data is first transferred to level M1 and then accessed by the CPU.
M2 contains all program instructions and data; the most recently accessed instructions and data are in M1. Clustered data in M1 needs to be swapped back to M2 regularly.
Hit ratio H: the probability that data is found in M1.
With T1 = access time to M1 and T2 = access time to M2:
Total access time per word = H × T1 + (1 – H) × (T1 + T2)

Example 2.3 - Accessing Two Levels of Memory
❖ For a two-level memory system, level M1 contains 1,000 bytes and has an access time of 0.01 μs; level M2 contains 100,000 bytes with an access time of 0.1 μs. The hit ratio is 95%. Calculate the average time to access a word by the CPU. (A numeric check appears in the sketch at the end of this transcript.)
Solution:
a) T1 = 0.01 μs, T2 = 0.1 μs, H = 0.95.
b) Time = H × T1 + (1 – H) × (T1 + T2) = 0.95 × 0.01 μs + 0.05 × (0.01 μs + 0.1 μs) = 0.0095 μs + 0.0055 μs = 0.015 μs.

Cache Memory Principles
When the CPU attempts to read a word from a memory address, it checks whether the word is in the cache. If so, the word is delivered to the CPU directly. If not, a block of memory is read into the cache and the word is delivered to the CPU.
Locality of reference: when a block of data is fetched into the cache from memory, it is likely that there will be future references to data in the same block.

Cache/Main Memory Structure
Main memory consists of 2^n addressable words, viewed as M = 2^n / K blocks of K words each.
The cache consists of C lines; each line holds K words plus a tag (and control bits), and C is far smaller than M.
Size of one cache line's data = size of one memory block = K words.
Number of blocks in memory: M = 2^n / K.
Access time of a larger cache ↑. Chip and circuit-board area limit cache capacity.
Cache performance is very sensitive to the nature of the workload, so it is impossible to choose a single "optimum" cache capacity.

Mapping Function and Cache Access Methods
No. of cache lines
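As referenced after Example 2.3 above, here is a small Python sketch (not from the lecture) that reproduces the example's arithmetic and the block-count formula M = 2^n / K; the cache configuration values (n, K, C) are assumed for illustration.

```python
# Verify Example 2.3: average access time across two memory levels.
def avg_access_time(hit_ratio, t1, t2):
    # Hit: word found in M1 (time T1). Miss: block transferred from
    # M2 to M1, then accessed from M1 (time T1 + T2).
    return hit_ratio * t1 + (1 - hit_ratio) * (t1 + t2)

t = avg_access_time(hit_ratio=0.95, t1=0.01, t2=0.1)   # times in microseconds
print(f"Average access time = {t:.3f} us")             # -> 0.015 us

# Cache/main-memory geometry: M = 2^n / K blocks. The values of
# n, K, and C below are assumed, not from the lecture.
n, K = 24, 4                 # 2^24 addressable words, K = 4 words per block
M = 2**n // K                # number of blocks in main memory
C = 16_384                   # number of cache lines (assumed), C << M
print(f"M = {M:,} blocks, C = {C:,} lines")
```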