University of Glasgow CSC1104 Computer Architecture Lecture 1 PDF
University of Glasgow
Cao Qi
Summary
This document is a lecture from the University of Glasgow's CSC1104 Computer Organization and Architecture course. It covers fundamental concepts of computer evolution, performance, and different architectures like CISC and RISC.
Full Transcript
Singapore
CSC1104 - COMPUTER ORGANISATION & ARCHITECTURE
LECTURE 1: COMPUTER EVOLUTION AND PERFORMANCE
Associate Professor Cao Qi
[email protected]
School of Computing Science

Information
Module-Lead: Dr Cao Qi
Email: [email protected]
Webpage: https://www.gla.ac.uk/schools/computing/staff/qicao/
Technical Officer: Mr. Vincent Ng Chiew Guan
Email: [email protected]

Acknowledgement
Main contents of CSC1104 - Computer Organisation and Architecture are derived from:
▪ Computer Organization and Architecture: Designing for Performance. Author: William Stallings. Publisher: Pearson. Acknowledgement to: Author and Publisher.
▪ Computer Organization and Design: The Hardware/Software Interface. Authors: D. Patterson and J. Hennessy. Publisher: Morgan Kaufmann. Acknowledgement to: Authors and Publisher.

Course Objectives
▪ To present the nature and characteristics of modern-day computer systems, about their structure and function.
▪ To provide a thorough discussion of the fundamentals of computer organization and architecture.
▪ To learn programming on IoT devices.

Course Venues
Lectures: Monday, 11:00 am – 1:00 pm, by Zoom
Tutorials: Tuesday, 9:00 am – 10:00 am, E2-02-14-Lectorial 10
Group Labs, E2-04-02-SR232
▪ Group P1: Tuesday, 10:00 am – 11:00 am
▪ Group P2: Tuesday, 11:00 am – 12:00 pm
▪ Group P3: Tuesday, 12:00 pm – 1:00 pm

Assessments
▪ Quiz 1: worth 30% of your overall marks
▪ Project Assignment: worth 30% of your overall marks
▪ Weekly group class discussions: worth 5% of your overall marks
▪ Final exam: worth 35% of your overall marks

Course Schedule
Week   Topics                                             Assessment
1      Computer Evolution and Performance
2      Computer Function, Cache Memory
3      Internal Memory and External Memory
4      Input/Output, Operating System
5      Number Systems and Computer Arithmetic
6      Assembly Language
7      Break
8      Instruction Set: Characteristics and Data Types    Quiz 1 (Weeks 1-5), 30%
9      Instruction Set: Types of Operations
10     Transfer of Control and Addressing Modes
11     Instruction Pipelining; CISC and RISC
12     Parallelism; Superscalar; Multicore                Project assignment due, 30%
13     Revisions                                          Weekly group class discussions, 5%

Books and References
▪ Computer Organization and Architecture: Designing for Performance, William Stallings, Pearson, 10th edition, 2016.
▪ Computer Organization and Design: The Hardware/Software Interface, David A. Patterson and John L. Hennessy, Morgan Kaufmann, 5th edition, 2014.

Lecture Contents
▪ Brief History of Computers: Classes of Computers; Four Generations of Computers
▪ Evolutions of Processors, CISC and RISC: CISC Processors and RISC Processors
▪ Computer Architecture and Organization
▪ Processors Performance: Measures of Performance

Classes of Computers and Their Applications
Servers (supercomputers): Many processors, large memory, high cost.
Personal computers (PCs): For single users at low cost.
Embedded computers: Largest class of computers.
Smart Mobile Devices: As powerful as PCs.

History - Generations of Computers
[Figure: timeline of computer generations, including Integrated Circuits and Later Generations]
Changes across generations: processing power, memory capacity, dimensions, complexity.
Control units.
System software: load programs, move data from/to peripherals, perform common computations.

First Generation: Vacuum Tubes
Used vacuum tubes for digital logic elements and memory.
First computer: COLOSSUS, in 1943–44.
First general-purpose computer: ENIAC, in 1943–46.
[Figures: Computer COLOSSUS; Computer ENIAC tube change]

Von Neumann Architecture
Von Neumann architecture, by John von Neumann in 1945.
Stored-program computer: instructions and data are stored in the same memory.
▪ Memory unit.
▪ Arithmetic logic unit (ALU).
▪ Control unit.
▪ Input–output (I/O).
[Figure: John von Neumann with the stored-program IAS computer]

Second Generation: Transistors
Transistor: solid-state silicon device.
Invented at Bell Labs in 1947.
Fully transistorized computers became available in the late 1950s.

Third Generation: Integrated Circuits (IC)
Microelectronics era: invention of the IC in 1958.
[Figure: A 12-inch (300 mm) wafer of Intel Core i7 (Courtesy Intel)]

Later Generation: Large-Scale Integration (LSI)
▪ LSI: 1,000 – 10,000 components per chip
▪ VLSI: 10,000 – 1 million components per chip
▪ ULSI: > 1 million components per chip
Construction of processors (control unit, arithmetic and logic unit).
Construction of memory chips (higher storage density, lower cost per bit, shorter access time).

Decimal and Binary Notations for Size Terms
The ambiguity between $2^X$ and $10^Y$ bytes was resolved by adding a binary notation.
Binary: 1 Ki = $2^{10}$ (1,024), 1 Mi = $2^{20}$ (1,048,576), 1 Gi = $2^{30}$ (1,073,741,824).
Decimal: 1 K = $10^{3}$ (1,000), 1 M = $10^{6}$ (1,000,000), 1 G = $10^{9}$ (1,000,000,000).

CISC vs. RISC
CISC and RISC differ in their instruction sets.
Complex Instruction Set Computer (CISC) processor architecture: completes a task using a small number of assembly lines. Example processor: Intel x86 architecture.
Reduced Instruction Set Computer (RISC) processor architecture: completes a task using a small and highly optimized set of instructions. Example processor: ARM architecture.

Example - Difference of CISC and RISC
Multiplying data (two numbers) stored in memory.
The execution unit can only operate on data that has been loaded into registers.
Task: 2 numbers, one stored at memory location (2:4), the other stored at (3:2). Calculate their product, then store the product back to (2:4).
[Figure: memory grid with numbered rows and columns, registers A to F, and an execution unit (+, -, x, /)]
CISC: A specific instruction is used ("MULT"). The entire task of multiplying 2 numbers can be completed with one instruction:
    MULT 2:4, 3:2
RISC: Uses simple instructions, each executed within 1 clock cycle. "MULT" is divided into 3 commands: "LOAD", "MUL", "STORE". Four lines of code are needed:
    LOAD A, 2:4
    LOAD B, 3:2
    MUL A, B
    STORE 2:4, A

Intel x86 Architecture – CISC Processors
An instruction set architecture (ISA) for microprocessor-based computing.
Programs written for an older version can execute on newer versions.
All changes have involved additions to the instruction set.
Over 500 instructions in the instruction set.
x86 represents the design effort on CISC.

ARM Architecture – RISC Processors
ARM: RISC-based microprocessors by ARM Holdings.
High-performance, low-power-consumption, small-size, and low-cost processors for embedded systems.
Embedded system: a dedicated function embedded as a part of a complete device or system.
Billions of embedded computer systems are produced each year.
Millions of computers are sold each year.
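
Returning to the CISC/RISC multiplication example from this lecture, the two instruction sequences can be mimicked in a few lines of Python. This is only a toy sketch, not a model of any real processor: the function names `run_risc` and `run_cisc`, the dictionary-based memory, and the operand values 6 and 7 are all assumptions made for illustration. Only the memory locations (2:4), (3:2) and the instruction names come from the slide.

```python
# Toy model of the slide's multiplication task.
# Assumptions: memory is a dict keyed by the slide's location labels,
# and the operand values (6 and 7) are made up for illustration.

def run_risc(memory):
    """RISC style: four simple instructions, one small step each."""
    mem = dict(memory)              # copy, so the caller's memory is untouched
    reg = {}
    reg["A"] = mem["2:4"]           # LOAD A, 2:4
    reg["B"] = mem["3:2"]           # LOAD B, 3:2
    reg["A"] = reg["A"] * reg["B"]  # MUL A, B
    mem["2:4"] = reg["A"]           # STORE 2:4, A
    return mem


def run_cisc(memory):
    """CISC style: one complex instruction loads, multiplies and stores."""
    mem = dict(memory)
    mem["2:4"] = mem["2:4"] * mem["3:2"]  # MULT 2:4, 3:2
    return mem


if __name__ == "__main__":
    start = {"2:4": 6, "3:2": 7}
    print("RISC result:", run_risc(start))
    print("CISC result:", run_cisc(start))
```

Both functions leave the same product at location (2:4). The difference the slide highlights is in how the work is expressed: one complex instruction versus four simple ones.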

Computer Architecture and Computer Organisation
Computer Architecture
▪ Attributes of a system visible to programmers.
▪ These attributes have a direct impact on the logical execution of a program.
▪ Examples of architectural attributes: instruction set, number of bits used to represent various data types, I/O mechanisms, memory addressing modes.
Computer Organisation
▪ Operational units and their interconnections that realize the architectural specifications.
▪ Examples of organizational attributes: hardware details transparent to the programmer, such as control signals, interfaces between computer and peripherals, and the memory technology used.

Distinction between Architecture and Organization
Many manufacturers offer a family of computer models with the same architecture but different organizations.
A particular computer architecture may span many years.
Computer organizations change with technology, price, performance characteristics, computer models, etc.
For microcomputers, changes in technology influence not only organization but also architecture.
Generally, there is less of a requirement for generation-to-generation compatibility for these smaller machines.

Function Operations
4 basic functions performed by computers:
Data Processing
❑ Processing a wide variety of forms of data.
Data Storage
❑ Short-term storage.
❑ Long-term storage.
Data Movement
❑ Input–output (I/O): when data are received from or delivered to a peripheral device.
❑ Communications: when data are moved over longer distances, to/from a remote device.
Control
❑ Managing the computer's resources and the performance of its functional parts.
[Figure: functional view of a computer: operating environment (source and destination of data), data movement apparatus, control mechanism, data storage facility, data processing facility]

Examples of Computer Function Operations
[Figure: four panels, each showing data movement, control, data storage and data processing]

Single Core Processor Structure
Main structural components of a computer:
▪ Central processing unit (CPU): Controls operation and performs data processing functions.
▪ Main memory: Stores data.
▪ I/O: Moves data from/to the external environment.
▪ System interconnection: Communication paths among CPU, main memory, and I/O.
Major structural components of the CPU:
▪ Control unit: Controls operation of the CPU.
▪ Arithmetic and logic unit (ALU): Performs data processing functions.
▪ Registers: Provide internal storage for the CPU.
▪ CPU interconnection: Communication paths among control unit, ALU, registers, etc.
[Figure: computer (main memory, I/O, system bus, CPU); CPU (ALU, registers, internal bus, control unit); control unit (sequencing logic, control unit registers and decoders, control memory)]

Clock Cycle Time or Clock Rate
Clock cycle time (period of the clock signal): period T.
Clock rate (clock speed): frequency f.
Clock rate $f = 1/T$; e.g., a 1-MHz processor receives 1 million clock pulses per second.
The higher the clock rate, the less time it takes to process each operation, so more data can be processed within a fixed time under the same conditions.

Processors Performance
The performance formula relates the number of clock cycles and the clock cycle time (period) to CPU time:
$\text{Execution time for a program} = \text{No. of CPU clock cycles needed} \times \text{Clock cycle time}$
Clock rate (frequency) is the inverse of clock cycle time:
$\text{Execution time} = \dfrac{\text{No. of CPU clock cycles needed}}{\text{Clock rate}}$
Hence, ways to improve CPU performance:
▪ Reduce the number of clock cycles needed for a program.
▪ Reduce the clock cycle time (or increase the clock rate).
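
The two relations above, $f = 1/T$ and execution time = clock cycles / clock rate, can be checked with a short Python sketch. The function names and the sample numbers (8 × 10^9 cycles on a 4 GHz clock) are illustrative assumptions, not values from the lecture.

```python
# Minimal sketch of the clock and execution-time relations.
# All names and sample numbers below are illustrative assumptions.

def clock_cycle_time(clock_rate_hz):
    """Period T in seconds for a clock rate f in Hz (T = 1 / f)."""
    return 1.0 / clock_rate_hz


def execution_time(clock_cycles, clock_rate_hz):
    """Execution time in seconds = clock cycles needed / clock rate."""
    return clock_cycles / clock_rate_hz


if __name__ == "__main__":
    cycles = 8 * 10**9   # clock cycles needed by a hypothetical program
    rate = 4 * 10**9     # 4 GHz clock

    print("Cycle time (s):", clock_cycle_time(rate))
    print("Execution time (s):", execution_time(cycles, rate))
    # Halving the cycle count or doubling the clock rate both halve the time.
    print("Half the cycles (s):", execution_time(cycles // 2, rate))
    print("Double the rate (s):", execution_time(cycles, 2 * rate))
```

The last two calls illustrate the two ways of improving performance listed on the slide: fewer clock cycles, or a shorter clock cycle time.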

Example 1.1 – CPU Performance
❖ A program runs in 10 seconds on computer A, which has a 2 GHz clock rate. The designer plans to build a computer B that runs the same program in 6 seconds. A substantial increase in clock rate is possible, but it will affect the rest of the CPU design, causing computer B to require 1.2 times as many clock cycles as computer A. What clock rate is needed for computer B?
Solution:
❑ Number of clock cycles needed for the program on computer A:
▪ No. of CPU clock cycles on A = Execution time on A × Clock rate on A = $10\ \text{s} \times 2 \times 10^{9}\ \text{cycles/s} = 20 \times 10^{9}$ clock cycles.
▪ No. of CPU clock cycles on B = 1.2 × No. of CPU clock cycles on A = $1.2 \times 20 \times 10^{9} = 24 \times 10^{9}$ cycles.
▪ Clock rate on B = $\dfrac{\text{No. of CPU clock cycles on B}}{\text{Execution time on B}} = \dfrac{24 \times 10^{9}\ \text{cycles}}{6\ \text{seconds}} = 4 \times 10^{9}$ cycles/second = 4 GHz.

Instruction Performance
Execution time also depends on the number of CPU instructions in a program and the clock cycles per instruction.
Clock cycles per instruction (CPI): average number of clock cycles per instruction for a program.
Different instructions may need different numbers of clock cycles; CPI is the average over all instructions executed in a program:
$\text{CPI} = \dfrac{\sum_{i=1}^{n} (\text{CPI}_i \times \text{Instruction Count}_i)}{\text{Total Instruction Count}}$
where $\text{CPI}_i$ and $\text{Instruction Count}_i$ are for each instruction class $i$.
The performance formula relates CPI to CPU time as:
No. of CPU clock cycles needed = Total No. of instructions × Clock cycles per instruction (CPI).

Example 1.2 - Instruction Performance
❖ Two computers have the same instruction set architecture. Computer A has a clock cycle time of 250 ps and a CPI of 2.0 for a program. Computer B has a clock cycle time of 500 ps and a CPI of 1.2 for the same program. Which computer is faster for this program, and by how much? (Hint: 1 ps = $10^{-12}$ s.)
Solution:
❑ Each computer executes the same number of instructions for the program. First calculate the No. of CPU clock cycles needed:
▪ CPU clock cycles on A = No. of instructions × CPI on A = No. of instructions × 2.0.
▪ CPU clock cycles on B = No. of instructions × CPI on B = No. of instructions × 1.2.
❑ Then compute the execution time for each computer:
▪ Execution time on A = CPU clock cycles on A × Clock cycle time on A = No. of instructions × 2.0 × 250 ps = No. of instructions × 500 ps.
▪ Execution time on B = CPU clock cycles on B × Clock cycle time on B = No. of instructions × 1.2 × 500 ps = No. of instructions × 600 ps.
❑ Hence, computer A is faster for this program. Its CPU execution time relative to computer B is:
$\dfrac{\text{No. of instructions} \times 500\ \text{ps}}{\text{No. of instructions} \times 600\ \text{ps}} \times 100\% = \dfrac{5}{6} \times 100\% = 83.33\%$
Equivalently, computer A is 600/500 = 1.2 times as fast as computer B.

CPU Performance Equations
CPU performance is determined by 3 key factors:
▪ Instruction count (No. of instructions executed by a program),
▪ CPI (clock cycles per instruction),
▪ Clock cycle time (or clock rate).
Important equations!
CPU performance equation:
$\text{Execution time} = \text{Instruction count} \times \text{CPI} \times \text{Clock cycle time}$
Clock rate is the inverse of clock cycle time:
$\text{Execution time} = \dfrac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}}$
Millions of instructions per second (MIPS), a common measure of performance:
$\text{MIPS rate} = \dfrac{\text{Instruction count}}{\text{Execution time} \times 10^{6}} = \dfrac{1}{\text{CPI} \times \text{Clock cycle time} \times 10^{6}} = \dfrac{\text{Clock rate}}{\text{CPI} \times 10^{6}}$

Units of Measurement for CPU Performance
The reliable measure of computer performance is time.
Execution time (unit: seconds/program):
$\dfrac{\text{seconds}}{\text{program}} = \dfrac{\text{Instruction count}}{\text{program}} \times \dfrac{\text{Clock cycles}}{\text{Instruction}} \times \dfrac{\text{seconds}}{\text{Clock cycle}}$
$\text{Execution time} = \text{Instruction count} \times \text{CPI} \times \text{Clock cycle time}$

Example 1.3 - CPU Performance
A program needs the execution of 2 million instructions on a 400 MHz CPU. The program consists of 4 major types of instructions. The instruction mix and CPI for each type are below, based on a program trace experiment. Calculate the MIPS rate.
CPI    Instruction mix
1      60%
2      18%
4      12%
8      10%
Solution:
❑ According to the table above, the average CPI of all instructions is:
▪ CPI = 1 × 60% + 2 × 18% + 4 × 12% + 8 × 10% = 2.24.
❑ Clock rate is 400 MHz.
Based on the MIPS rate equation:
▪ $\text{MIPS rate} = \dfrac{\text{Instruction count}}{\text{Execution time} \times 10^{6}} = \dfrac{1}{\text{CPI} \times \text{Clock cycle time} \times 10^{6}} = \dfrac{\text{Clock rate}}{\text{CPI} \times 10^{6}} = \dfrac{400 \times 10^{6}}{2.24 \times 10^{6}} \approx 179$.

Example 1.4 – Compare Code Performance
A program consists of 3 major classes of instructions. Hardware designers supplied the following facts:
Instruction class    A    B    C
CPI                  1    2    3
For a high-level language program, the compiler designer plans two code sequences requiring the following instruction counts. Which code sequence executes more instructions? Which is faster? What is the CPI for each sequence?
Code sequence    Class A    Class B    Class C
X                2          1          2
Y                4          1          1
Solution:
❑ Total No. of instructions executed by Code Sequence X: 2 + 1 + 2 = 5.
❑ Total No. of instructions executed by Code Sequence Y: 4 + 1 + 1 = 6.
❑ CPU clock cycles = $\sum_{i} (\text{Instruction Count}_i \times \text{CPI}_i)$ over the instruction classes.
❑ Total No. of CPU clock cycles by Code Sequence X: 2×1 + 1×2 + 2×3 = 10.
❑ Total No. of CPU clock cycles by Code Sequence Y: 4×1 + 1×2 + 1×3 = 9.
❑ Sequence Y executes more instructions but needs fewer clock cycles, so it is faster.
❑ CPI of Sequence X: $\text{CPI}_X = \dfrac{\sum_{i=1}^{n} (\text{CPI}_i \times \text{Instruction Count}_i)}{\text{Total Instruction Count}} = \dfrac{10}{5} = 2$
❑ CPI of Sequence Y: $\text{CPI}_Y = \dfrac{\sum_{i=1}^{n} (\text{CPI}_i \times \text{Instruction Count}_i)}{\text{Total Instruction Count}} = \dfrac{9}{6} = 1.5$

Understanding Program Performance
Component: Algorithm
Affects: Instruction count, CPI
How: Determines the No. of instructions executed. May affect CPI by favoring slower or faster instructions. E.g., if an algorithm uses more divisions, it tends to have a higher CPI.

Component: Programming language
Affects: Instruction count, CPI
How: Affects instruction count, as statements in the language are translated to machine instructions. May affect CPI due to its features; e.g., a language with heavy data abstraction (e.g., Java) requires indirect calls, which use higher-CPI instructions.

Component: Compiler
Affects: Instruction count, CPI
How: Compiler efficiency affects both instruction count and CPI, as the compiler translates source language instructions into machine instructions.

Component: Instruction set architecture
Affects: Instruction count, clock rate, CPI
How: Affects all 3 aspects of CPU performance: the instructions needed for a program, the clock cycles of each instruction, and the overall clock rate of the CPU.

Next Lecture
Lecture 2: Computer Function and Cache Memory
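
As a recap of the performance equations in this lecture, the short Python sketch below recomputes the numbers from Example 1.3 (average CPI and MIPS rate) and Example 1.4 (clock cycles and CPI of the two code sequences). The function names and the list-of-pairs data layout are assumptions chosen for illustration; the input figures are the ones given in the examples.

```python
# Sketch of the CPI and MIPS equations, applied to Examples 1.3 and 1.4.
# Function names and data layout are illustrative assumptions.

def average_cpi(classes):
    """Weighted-average CPI.

    classes: list of (cpi, weight) pairs, where weight is either an
    instruction count or a fraction of the instruction mix.
    """
    total_weight = sum(weight for _, weight in classes)
    return sum(cpi * weight for cpi, weight in classes) / total_weight


def total_cycles(classes):
    """Total clock cycles for a list of (cpi, instruction_count) pairs."""
    return sum(cpi * count for cpi, count in classes)


def mips_rate(clock_rate_hz, cpi):
    """MIPS rate = clock rate / (CPI x 10^6)."""
    return clock_rate_hz / (cpi * 10**6)


if __name__ == "__main__":
    # Example 1.3: instruction mix as (CPI, fraction of instructions).
    mix = [(1, 0.60), (2, 0.18), (4, 0.12), (8, 0.10)]
    cpi = average_cpi(mix)
    print("Example 1.3 CPI:", round(cpi, 2))
    print("Example 1.3 MIPS rate:", round(mips_rate(400 * 10**6, cpi)))

    # Example 1.4: (CPI, instruction count) for classes A, B and C.
    seq_x = [(1, 2), (2, 1), (3, 2)]
    seq_y = [(1, 4), (2, 1), (3, 1)]
    for name, seq in (("X", seq_x), ("Y", seq_y)):
        print("Sequence", name, "cycles:", total_cycles(seq),
              "CPI:", average_cpi(seq))
```

Passing fractions or raw counts to `average_cpi` gives the same result because the function divides by the total weight, which mirrors the CPI formula used in both examples.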