CPE323 Microprocessor Fetch-Decode-Execute Cycle PDF

CPE323 Microprocessor
Module No. 2
Topic: Fetch-Decode-Execute Cycle
Period: Week no. 4 and 5
Date: February 10-22, 2025

FETCH-DECODE-EXECUTE CYCLE

Every modern computer, when you get right down to the bare metal, is doing basically the same sort of thing. Put simply, computers are overgrown calculators that we use on a daily basis. But the question is, "How does it go from a simple calculator to playing video games, sending stuff over the internet, or even decompressing and displaying the millions of pixels in a video?" In short, what is your computer actually doing?

This module introduces the instruction cycle and discusses some important fundamental concepts, such as what program instructions are and how a computer's microprocessor executes them. It also elaborates on the Fetch-Decode-Execute cycle.

Objective/Intended Learning Outcomes

o Students should be able to describe the instruction cycle and its important fundamental concepts in a computer.
o Students should be able to understand the process of the Fetch-Decode-Execute cycle.
o Students should be able to explain how a computer's microprocessor executes program instructions.

Discussion/Content

2. FETCH-DECODE-EXECUTE CYCLE

INTRODUCTION TO INSTRUCTION CYCLE

The instruction cycle is an important topic in computer organization and architecture. All computer software is built up of sets of instructions, and instructions are encoded in binary, or machine code. The instruction cycle is the basic operation of the CPU and consists of three steps: fetch, decode, and execute. The term also refers to the time required by the CPU to execute one single program instruction.

The computer system's main function is to execute computer programs. A computer program consists of a set of machine instructions, and the CPU is responsible for executing these program instructions.
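The three steps can be sketched as a loop. The following is a minimal sketch of a hypothetical accumulator machine, not the architecture developed later in this module: the behavior given to LOAD, ADD, and STORE, the extra HALT instruction, and the function name run are all assumptions made for illustration. Each instruction is modeled as an (operation, address) pair rather than as binary.

```python
# Toy accumulator machine (hypothetical; names and layout are illustrative only).
def run(program, data):
    """Run (operation, address) pairs until HALT; return the data memory."""
    pc = 0    # program counter: index of the next instruction
    acc = 0   # accumulator: the value currently being worked on
    while True:
        # Fetch: read the instruction the PC points to, then advance the PC.
        instruction = program[pc]
        pc += 1
        # Decode: split the instruction into its operation and operand address.
        operation, address = instruction
        # Execute: carry out the decoded operation.
        if operation == "LOAD":
            acc = data[address]
        elif operation == "ADD":
            acc += data[address]
        elif operation == "STORE":
            data[address] = acc
        elif operation == "HALT":
            return data
        else:
            raise ValueError(f"unknown operation: {operation}")


# z = x + y, with x at address 0, y at address 1, and z at address 2.
program = [("LOAD", 0), ("ADD", 1), ("STORE", 2), ("HALT", None)]
memory = run(program, [7, 5, 0])
print(memory[2])  # 12
```

Keeping fetch, decode, and execute as three separate commented stages inside one loop mirrors the cycle described above; a real CPU performs the same stages in hardware.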
Created by: CPE323 SECOND SEMESTER USTP INSTRUCTORS

Program Instructions

The program instructions are stored in the main memory. The computer memory is organized into a number of cells, and each cell (location) has a unique memory address. The processor initiates program execution by fetching the machine instructions one by one from the main memory (RAM).

The computer system needs a set of instructions that directs the computer to perform the desired operations. This set of instructions, which the computer interprets and executes, is called a computer program.

Instruction Format

A computer program consists of a number of instructions that direct the CPU to perform specific operations. However, the CPU needs to know which operation is to be performed, on which data, and where that data is located. This information is provided by the instruction format. The instruction format defines the layout and structure of a program instruction so that the CPU can decode it and then perform the desired operation on the data.

The opcode is the part of the machine instruction that specifies which operation the CPU performs when executing the instruction. The opcode directs the control unit of the CPU to operate on the data (operand), as supported by the instruction set of the processor chip.

The operand is the part of the machine instruction that specifies either the data itself or a reference to the data, such as the memory address that contains the actual data. Put simply, the operand is the data on which the CPU performs the desired operation.
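To make the opcode/operand split concrete, here is a minimal sketch of packing and unpacking one instruction word. It assumes an 8-bit opcode in the most significant bits and a 24-bit address operand in the least significant bits, which are the widths this module adopts later for the IR. The opcode value 0x54 and the function names encode and decode are hypothetical.

```python
OPCODE_BITS = 8       # assumed opcode width
ADDRESS_BITS = 24     # assumed operand (address) width
ADDRESS_MASK = (1 << ADDRESS_BITS) - 1


def encode(opcode, address):
    """Pack an opcode and an address operand into one 32-bit instruction word."""
    if not 0 <= opcode < (1 << OPCODE_BITS):
        raise ValueError("opcode does not fit in 8 bits")
    if not 0 <= address <= ADDRESS_MASK:
        raise ValueError("address does not fit in 24 bits")
    return (opcode << ADDRESS_BITS) | address


def decode(word):
    """Split an instruction word back into (opcode, address)."""
    return word >> ADDRESS_BITS, word & ADDRESS_MASK


word = encode(0x54, 0x0001FF)   # 0x54 is a made-up opcode value
print(f"{word:08X}")            # 540001FF
print(decode(word))             # (84, 511)
```

Decoding is just a shift and a mask, which is why a fixed layout like this is easy for hardware to take apart.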
Example of Instruction

High-level language source code (instruction readable by humans):
z = x + y

Assembly language:
LOAD
ADD
STORE

Machine code (instruction readable by computers):
01010100
00010101
00110110

A BOG STANDARD ARCHITECTURE

The CPU contains:
o a number of registers, some on the address side and others on the data side;
o an arithmetic logic unit;
o the control section, or control unit;
o connections to the memory (a large unit of storage) by two buses, the unidirectional address bus and the bidirectional data bus; and
o internal buses, or data pathways, which allow the output of one register to connect to the input of another.

Figure 1: A Bog Standard Architecture

CENTRAL PROCESSING UNIT (CPU)

CPU Structural Components/Units:

Control unit (CU) – Controls the operation of the CPU and hence the computer.
o The Control Unit is responsible for the timing and execution of the various register transfers required to fulfill an instruction held in the IR.
o It has a number of control lines coming out of it, which transmit CSL levels and CSP pulses to the various registers and logic units.

Arithmetic logic unit (ALU) – Performs the computer's data processing.
o The Arithmetic Logic Unit is responsible for bit operations on data held in the AC and MBR, and for storing the results.
o It contains arithmetic adders, logical AND-ers and OR-ers, and so on.
o A special requirement in our architecture is a "null operation" or "no op", which simply allows the output of the AC to appear at the output of the ALU.

Status Register – Also known as the Condition Control Word or Status Word, it is closely associated with the ALU. It is not quite the same as the other registers, in that it is just a collection of 1-bit flags that indicate the outcome of operations that the ALU has just carried out. These are the flags (which you met in P2): the carry flag C, the overflow flag V, the negative flag N, and the zero flag Z. They are monitored by the CU.
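The four flags can be illustrated with a short sketch of a 16-bit addition. This assumes a 16-bit ALU (the width this module adopts later) and two's complement signed numbers; the function name add_with_flags and the dictionary layout are hypothetical, and a real status register would hold these as individual bits.

```python
WIDTH = 16                    # assumed ALU width in bits
MASK = (1 << WIDTH) - 1       # 0xFFFF
SIGN = 1 << (WIDTH - 1)       # 0x8000, the sign bit


def add_with_flags(a, b):
    """Add two 16-bit values; return (result, flags) like an ALU plus status register."""
    total = a + b
    result = total & MASK
    flags = {
        "C": total > MASK,                                 # carry out of the top bit
        "V": bool(~(a ^ b) & (a ^ result) & SIGN),         # signed overflow
        "N": bool(result & SIGN),                          # result is negative
        "Z": result == 0,                                  # result is zero
    }
    return result, flags


print(add_with_flags(0x7FFF, 0x0001))
# (32768, {'C': False, 'V': True, 'N': True, 'Z': False})
print(add_with_flags(0xFFFF, 0x0001))
# (0, {'C': True, 'V': False, 'N': False, 'Z': True})
```

The first call overflows the signed range (V set) without a carry, while the second wraps around to zero with a carry (C and Z set), which shows that C and V report different things.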
CPU REGISTERS

o Program Counter (PC) – contains the memory address of the next instruction to be fetched. Connected to the internal address bus, the Program Counter holds the address in memory of the next program instruction. Notice that it does not connect directly to the memory, but must go via the MAR. The PC is special in that it is both a register and a counter.
o Memory Address Register (MAR) – contains the address of the memory location being accessed, such as the instruction currently being fetched. The Memory Address Register is used to store the address used to access memory.
o Memory Data Register (MDR/MBR) – contains the instruction or data after it is fetched from main memory. Others call it the Memory Buffer Register; it stores information that is being sent to, or received from, the memory along the bidirectional data bus.
o Accumulator (ACC/AC) – contains the result of any calculations carried out in the ALU. The Accumulator is used to store data that is being worked on by the ALU, and is the key register in the data section of the CPU. Notice that the memory cannot access the AC directly; the MBR/MDR is an intermediary.
o Current Instruction Register (CIR/IR) – also known as the Instruction Register (IR), contains the instruction to be decoded. When memory is read, the data first goes to the MBR. If the data is an instruction, it gets moved to the Instruction Register. The IR has two parts:
  IR(opcode) – The most significant bits of the instruction make up the opcode. This is the genuine instruction part of the instruction, which tells the CPU what to do. The instruction in IR(opcode) gets decoded and executed by the control unit (CU).
  IR(address) – The least significant bits of the instruction are actually data. They get moved to IR(address). As the name suggests, they usually form all or part of an address for later use in the MAR. (In immediate addressing they are sent to the AC.)
o Stack Pointer (SP) – is connected to the internal address bus and is used to hold the address of a special chunk of main memory used for temporary storage during program execution.
o All registers – these are edge-triggered D types; we will use falling-edge-triggered devices. For all their fancy names, the registers comprise nothing more than a row of D-type latches that share a common clock input, providing temporary storage on the CPU. In our design they are falling edge triggered (hence the circle on the clock input). Because these registers output onto buses, they have tri-state buffers, which are connected to a single input OE, for "Output Enable", as shown in Figure 2.

Figure 2: An 8-bit register with 3-state output enable.

BUSES, REGISTERS, AND THEIR WIDTHS

The buses carry words of information which are many bits wide. On diagrams, a bus is indicated either by a wide line or by a single line with a dash through it, often accompanied by the bus width in bits.

Data – Microcontrollers have data bus widths of 4, 8, 16, and 32 bits, while the most advanced PCs use 64 bits. In these lectures we will assume that the "memory width" is 16 bits, or 2 Bytes. This means that each location can store 2 Bytes. We will also assume that the data bus is 16 bits wide, and the MBR and AC registers on the data side of the CPU are therefore also 16 bits wide. The ALU is also 16 bits wide.

Figure 3: The data side is 2 Bytes or 16 bits wide.

Address – The address bus does not have to be the same width as the data bus. The width on CPUs has increased over time in step with contemporary memory technology, from the Intel 8086 (from 1978), with n = 20 address lines, to current processors with n = 36 to 40. Having n address lines means that there are 2^n addresses, or locations, in the address space.
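The relationship between address lines and locations can be checked numerically. This is a small sketch; the function names address_space and human are hypothetical helpers, and K, M, and G are taken as powers of two (2^10, 2^20, 2^30).

```python
def address_space(lines):
    """Number of distinct locations reachable with the given number of address lines."""
    return 2 ** lines


def human(count):
    """Format a location count using binary K, M, G when it divides evenly."""
    for unit, size in (("G", 2 ** 30), ("M", 2 ** 20), ("K", 2 ** 10)):
        if count >= size and count % size == 0:
            return f"{count // size}{unit}"
    return str(count)


for n in (10, 20, 36):
    print(n, "lines ->", human(address_space(n)), "locations")
# 10 lines -> 1K locations
# 20 lines -> 1M locations
# 36 lines -> 64G locations
```

Each extra address line doubles the number of locations, so ten extra lines multiply the address space by 1024.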
A convenient method of figuring out 2^n is to remember that 2^10 = 1024, so n = 10 lines address 1K locations, n = 20 lines address 1M locations, and n = 30 lines address 1G locations. Of course, microcontrollers tend to have a smaller amount of memory, because they are not designed to multitask (i.e., run multiple programs), and 256K locations is the largest number spotted (in 2010).

However, for lecturing purposes it is useful (i) to have different numbers on the address and data sides, and (ii) to keep things in multiples of 8, so here we will assume a 24-bit address bus, able to access 16M locations. (Note that this is not necessarily 16 MByte of memory. Why not?) The PC, SP, and MAR in our CPU will therefore be 24 bits wide.

Figure 4: The address side is 24 bits or 3 Bytes wide. The address space has 2^24 locations.

The IR is special. The IR(opcode) part should be wide enough to take the largest opcode. We assume the opcode is a fixed 8 bits wide, allowing 256 different instructions, which is plenty. The IR(address) part must have the same width as the address bus, 24 bits. So the whole IR is 32 bits wide. It is, however, fed from the internal data bus, which is only 16 bits wide in our architecture.

Figure 5: The IR must be 8 + 24 = 32 bits wide.

INTRODUCTION TO THE MAIN MEMORY

The memory will comprise mostly random-access memory (RAM), with some additional read-only memory (ROM) to help the machine start up. The main memory does not reside in the CPU chip but sits on the motherboard and is connected to the CPU via a bus. In fact, there are two (or three) buses: the data bus and the address bus (and also a control bus, which carries timing pulses and the level that indicates writing or reading).

The address bus has been chosen to be 24 bits wide, so the address space runs from 0x000000 to 2^24 − 1, or 0xFFFFFF in hex. The data bus is 16 bits wide, and so the contents width is 16 bits. In Fig.
6, for example, the contents of address 4 are 0x01FF. The largest (unsigned) integer number that can be held is 2^16 − 1, or 0xFFFF.

Figure 6: 24-bit address and contents

For now, it is enough to know how to describe reading from and writing to the memory. The main memory is effectively just a large stack of registers, each with its own address. To read from or write to memory, the register transfers are written as

MBR ← ⟨MAR⟩    read from memory
⟨MAR⟩ ← MBR    write to memory

where ⟨MAR⟩ means the memory register whose address is given by the MAR. The MAR is said to point to the memory location.

FETCH-DECODE-EXECUTE CYCLE

The Fetch-Decode-Execute cycle is the process by which the computer reads instructions from memory, decodes them, and executes them. The cycle repeats continuously until the computer is turned off or there are no more instructions to process.

With the structure of registers, units, memory, and buses laid out, let us be clear that the overall operational aim is very simple. We want our CPU repeatedly to:

o Fetch – the next instruction from memory into the instruction register
o Decode – the instruction (that is, work out which it is)
o Execute – the instruction

Fetching and executing an instruction simply require the CPU's Control Section to issue Levels and Pulses, which set up pathways and fire register transfers so that:

o Data is moved from memory to registers, and between registers,
o Data is passed through the ALU, and
o Data is stuffed back into the memory.

If you need an analogy, we are doing little more than "playing trains" with data. The Control Section uses Levels to "set the points" and create the route between A and B, and uses a Pulse to send the train from A to B.

The fetch-decode-execute cycle can be broken down into five steps:

1. Instruction Fetch (IF)
2. Instruction Decode (ID)
3. Data Fetch (DF) / Operand Fetch (OF)
4. Instruction Execution (EX)
5. Result Return (RR) / Store (ST)

ADD 4000, 2000, 2080

Figure 7: The processor before executing the instruction in memory location 800

ADD the values found in memory locations 428 and 884 and store the result in location 800.

1. Instruction Fetch (IF)

o Execution begins by moving the instruction at the address given by the PC (PC 800) from memory to the control unit.
o The bits of the instruction are placed into the decoder circuit of the CU.
o Once the instruction is fetched, the PC can be readied for fetching the next instruction.

Figure 8: Instruction Fetch. The instruction addressed by the PC is moved from memory to the control unit.

2. Instruction Decode (ID)

o The ALU is set up for the operation.
o The decoder finds the memory addresses of the instruction's data (source operands). Most instructions operate on two data values stored in memory (like ADD), so most instructions have addresses for two source operands. These addresses are passed to the circuit that fetches the values from memory during the next step.
o The decoder finds the destination address for the Result Return step and places the address in the RR circuit.
o The decoder determines what operation the ALU will perform (ADD), and sets up the ALU.

Figure 9: Decode. The instruction is analyzed, and the processor is configured for later steps: the data addresses are sent to the memory, the operation (+) is set in the ALU, and the result return address is set.

3. Data Fetch (DF)

o The data values to be operated on are retrieved from memory.
o The bits at the specified memory locations are copied into locations in the ALU circuitry.
o The data values remain in memory (they are not destroyed).

Figure 10: Data Fetch. The values for the two operands are fetched from memory.

4. Instruction Execution (EX)

o For this ADD instruction, the addition circuit adds the two source operands together to produce their sum.
o The sum is held in the ALU circuitry.
o This is the actual computation.

Figure 11: Instruction Execute. The addition operation is performed.

5. Result Return (RR)

o RR returns the result of EX to the memory location specified by the destination address.
o Once the result is stored, the cycle begins again.

Figure 12: Result Return. The answer is returned to the memory, and the program counter's updated value is sent to the memory in preparation for the next fetch.

ONE CYCLE PER CLOCK TICK

The processor is driven by an internal clock. With every tick of the clock, or heartbeat, our CPU goes through a step in what is called the "Fetch-Execute" cycle, or "Fetch-Decode-Execute" cycle. On each clock tick, the CPU will do one of three things:

o It will fetch an instruction from a memory address.
o It will decode that instruction.
o It will execute the instruction.

The clock sends out a regular electrical pulse which synchronizes (keeps in time) all the components. The frequency of the pulses is known as the clock speed, and it is measured in hertz (Hz). The greater the clock speed, the more instructions can be performed in a given amount of time.

For instance, a computer with a 1 GHz clock has one billionth of a second (one nanosecond) between clock ticks to run the Fetch/Execute Cycle. In that amount of time, light travels about one foot (~30 cm).

A simple processor might use five ticks to complete one instruction (one tick for each of the five steps). Modern computers try to start a new instruction on each clock tick. This is done using a pipeline. A pipeline is like an automobile assembly line: the CPU fetches an instruction and passes it along the pipeline, and it is then free to fetch the next instruction on the next tick. This is, however, more complicated than car assembly.
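The benefit of the assembly-line idea can be estimated with a short sketch. It assumes an idealized five-stage pipeline in which every stage takes exactly one tick and no instruction ever has to wait; the function names are hypothetical. Real processors fall short of this ideal.

```python
STAGES = 5  # IF, ID, DF, EX, RR: one tick each (idealized assumption)


def ticks_without_pipeline(instructions, stages=STAGES):
    """Each instruction finishes all of its stages before the next one starts."""
    return instructions * stages


def ticks_with_ideal_pipeline(instructions, stages=STAGES):
    """A new instruction starts every tick; the first one needs `stages` ticks to finish."""
    if instructions == 0:
        return 0
    return stages + (instructions - 1)


print(ticks_without_pipeline(1000))     # 5000
print(ticks_with_ideal_pipeline(1000))  # 1004
```

Even in this best case, 1,000 instructions need slightly more than 1,000 ticks, because the pipeline takes a few ticks to fill.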
Some instructions must wait for results from previous instructions, and there are various other complications. Therefore, it is not quite true that 1,000 instructions are executed in 1,000 ticks.

Detailed Video: https://www.futurelearn.com/info/courses/how-computers-work/0/steps/49284
Summarized Video: https://www.youtube.com/watch?v=Z5JC9Ve1sfI

Schematic Fetch/Execute Cycle

Figure 13: Schematic Fetch Cycle

Many, Many Simple Operations

Computers "know" very few instructions. The decoder hardware in the controller recognizes, and the ALU performs, only about 100 different instructions (with a lot of duplication). There are only about 20 different kinds of operations. Everything that computers do must be reduced to some combination of these primitive, hardwired instructions.

Cycling the Fetch/Execute Cycle

ADD is representative of the complexity of computer instructions: some are slightly simpler, some slightly more complex. Computers achieve success at what they do through speed. They show their impressive capabilities by executing many simple instructions per second.

Translation

A programmer writes source code, such as: this.Opacity += 0.02. However, the bits the processor needs are known as object code, binary code, or just binary. Source code is translated into assembly code, and then into binary.

Assembly Language

Assembly language is a primitive programming language that uses words instead of 0s and 1s. For example:

ADD Opacity, TwoCths, Opacity

To convert source code into assembly, the source code must be compiled by a compiler. A compiler is a computer program that translates another computer program into assembly language. Each language requires its own compiler. The assembly language is converted to machine language by yet another program, called an assembler.

Integrated Circuits (ICs)

Miniaturization

Computer clocks can run at GHz rates because their processor chips are so tiny. Electrical signals can travel about one foot (30 cm) in a nanosecond.
Early computers (the size of whole rooms) could never have run as fast, because their components were farther apart than one foot. Making everything smaller has made computers faster.

Integration

Early computers were made from separate parts (discrete components) wired together by hand. There were three wires coming out of each transistor, two wires from each resistor, two wires from each capacitor, and so on. Each had to be connected to the wires of another transistor, resistor, or capacitor.

In an integrated circuit, the active components and the wires that connect them are manufactured together from similar materials by a single (multi-step) process. IC technology places transistors side by side in the silicon, along with the wires connecting them. The result is small and reliable.

Implementing ALU Operations

Logical operations such as AND and OR can be built from transistors. More complicated combinations can be made from these "simple" operations. Arithmetic, memory, and control units can be made by combining these parts. Eventually, you have an ALU, and then a full processor.

Combining the Ideas

o Start with an information-processing task.
o The task is performed by an application implemented as a large program.
o The program's commands are compiled into many simple assembly language instructions.
o The assembly instructions are then translated into a more primitive binary form.
o The Fetch/Execute Cycle executes the instructions.
o The processor uses logic built of transistors to perform the execution.

References

The Intel Microprocessors, 8th Edition, by Barry B. Brey
Microprocessors and Interfacing, 1st Edition, by Godse
Introduction to Microprocessors and Microcontroller, 2nd Edition, by John Crisp
https://www.youtube.com/watch?v=Z5JC9Ve1sfI
https://www.futurelearn.com/info/courses/how-computers-work/0/steps/49284
https://www.bbc.co.uk/bitesize/guides/zhppfcw/revision/4
https://www.learncomputerscienceonline.com/instruction-cycle/
A Microcontroller System: An Introduction to Computer Architecture. (n.d.). Retrieved February 26, 2022, from https://www.robots.ox.ac.uk/~dwm/Courses/2CO_2014/2CO-N1.pdf
Chapter 9 Principles of Computer Operations. learning objectives explain what a software stack represents and how it is used describe how the fetch/execute - [PPT powerpoint]. Cupdf. (n.d.). Retrieved February 26, 2022, from https://cupdf.com/document/chapter-9-principles-of-computer-operations-learning-objectives-explain-what.html?page=1
The journey INSIDESM: Microprocessors student... - intel.in. (n.d.). Retrieved February 26, 2022, from https://www.intel.in/content/dam/www/program/education/us/en/documents/the-journery-inside/microprocessor/tji-microprocessors-handout8.pdf
