Concepts on Computer Instructions PDF

Document Details


Uploaded by ReformedWetland4952

Université de Mohamed Chérif Messaadia Souk Ahras

2024

Ahmed AHMIM

Tags

computer instructions, assembly language, programming levels, computer science

Summary

This document presents an overview of computer instructions and explores various aspects of programming, from high-level languages to machine-level instructions. It also summarizes the concepts of assemblers, compilers, and interpreters, which translate source code into executable machine code. The document targets students learning about computer architecture and low-level programming.

Full Transcript


Chapter 3: Concepts on Computer Instructions
Ahmed AHMIM, University of Mohamed Cherif Messadia Souk Ahras, [email protected]
An overview of Computer Instructions.
Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 1 / 112

Presentation Overview
1 Assembler

Programming Levels
To write a program, the user often has a choice between several languages (Java, Fortran, Pascal, C, C++, Ada, assembler, etc.). The computer, however, only understands its own language: the machine language, with its set of machine-level instructions.

Source Code and Machine Code
In programming, the term "language" refers to a set of instructions and syntactic rules that allow writing what is called source code. However, the machine can only execute programs written in machine code (object code). It is therefore necessary to translate the source code into object code before submitting it to the machine.

Translation vs Interpretation
This translation work is done automatically by translator programs, such as assemblers and compilers. If no object code is explicitly produced, we speak instead of interpreting the source code.

Interpretation
Interpreting a language means executing its instructions directly from the source code. This task is carried out by an interpreter program (e.g., the PHP or Python interpreter).

Assemblers: translate low-level assembly code into machine code.
Compilers: convert high-level language code into object code.
Interpreters: execute the source code instructions directly, without generating object code.
Programming Language Levels
Hierarchy of Programming Languages
Several levels of programming languages can be distinguished: higher levels are closer to the user's language, while lower levels are better suited to the machine's characteristics. The current situation is summarized in the following figure.
Figure: Programming levels

High-Level Languages
A programming language includes:
- An alphabet: the set of elementary symbols available.
- Names or identifiers: groups of alphabet symbols (created by the programmer, subject to certain constraints such as the number of characters, the type of the first character, etc.).
- Sentences or instructions: sequences of names and punctuation symbols that adhere to the syntactic and semantic rules of the language.

Structured and Unambiguous
Natural languages are often ambiguous. Computer languages are structured and strictly unambiguous; they can be formally defined. Assemblers were the first programming languages, and they depend directly on the machine architecture.

High-Level Languages (HLL)
Introduction to HLL
High-Level Languages (HLL) appeared later and allow communication with a computer without considering its architecture. They are therefore independent of particular computers and enable better portability from one type of machine to another. In computer science, several categories of languages can be distinguished: programming languages and command languages are the ones we use the most. There are also analysis languages and specification languages, which help during the early phases of software development.
High-Level Languages (HLL)
Program Writing
Writing a program involves using the symbols of a programming language to form sentences or instructions. These sentences must adhere to the syntax of the language, which governs the positioning of symbols relative to one another.

Writing Programs in Machine Language
Challenges of Machine Language
Writing programs in machine language is a difficult task, and the number of errors grows rapidly with the size of the problem. High-level languages make this task less tedious by being closer to natural language while still meeting the requirements of rigor and unambiguity.

Concepts of High-Level Languages
The first concept underlying high-level languages was independence from computers, allowing for more advanced instructions with a higher semantic level than assembler instructions.

Three major categories of high-level languages have developed:
- The first, based on the concepts of algorithms and scientific data processing, gave rise to Fortran and Algol.
- The second focused on data processing, particularly large volumes of management data, leading to the development of Cobol.
- The third distanced itself from the other two, also moving away from efficiency constraints, resulting in Lisp, which is based on list processing.

Assembler
Importance of Assembler
Although largely replaced by high-level languages and relegated to a secondary role, assembler remains an important language. It is still used today in certain cases by specialists, generally for optimization purposes.
For example, when one wants to take full advantage of the machine's architecture, it may be necessary to code parts of a program in assembler, even though the increasing complexity of processors makes this task more delicate.

Error Diagnosis
Another characteristic example is error diagnosis (both hardware and software). To detect certain subtle errors, it is necessary to examine the contents of memory, which involves reviewing millions of bits to reconstruct the evolution of the program step by step and its effects on the contents of registers and memory. For this, one must have a precise understanding of the machine's architecture and its operation at the level of elementary operations.

Machine Language
Composition of Machine Language
Machine language consists of the binary instructions as they exist in memory at the time of program execution. Assembler is merely a symbolic variant of machine language; it therefore has the same instruction set. Consequently, assembler is specific to each type of machine. It facilitates the programmer's work by allowing the use of:
- Mnemonic operation codes,
- Labels (symbolic addresses),
- Literals (numeric constants),
- Directives (memory space reservation, macro declaration).

Advantages and Disadvantages of Assembler
Access to Machine Resources
Unlike high-level languages, assembler gives access to all machine resources (registers) and processing facilities (detailed operations such as shifting). It hides nothing from the programmer, which is an advantage in some cases but can become tedious in others.

Assembler Instruction Structure
Structure of an Assembler Instruction
An assembler instruction is divided into fields.
Here is the typical structure:

Label   Mnemonic Operation Code   Operands

The fields of an instruction are generally separated by one or more spaces. The number of operands in the third field varies from one machine to another, ranging from 0 to 3. It is advisable to add comments after this field.

Example of Instructions

donnees1: DS 1
adresse1: MOVE A1, donnees1
          ADD A1, A2
          JUMP adresse2

Explanation of Instructions
The instructions perform the following actions:
- The first instruction (an assembler directive) defines the variable donnees1 and reserves 1 memory word for it.
- The second instruction, labeled adresse1, transfers the content of register A1 into the variable donnees1.
- The third instruction adds the contents of two registers and stores the result in the second register.
- The last instruction performs an unconditional jump to the address adresse2, located elsewhere in the program.

Mnemonic Operation Codes
Ease of Use
It is obviously easier to remember terms like ADD or SUB than to recall the corresponding binary codes (e.g., 101101, 001001). However, modern machines allow several variations of the same operation, so multiple types of addition can exist depending on the nature of the operands. The operation code is thus linked to the type of the operands that follow it. This can make the instruction set complex (CISC architecture). Programmers often have to consult the mnemonic code table to write an assembler program.
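The field structure described above (optional label, mnemonic, operands) can be illustrated with a small parser. This is a minimal Python sketch, not part of the course; it assumes the conventions used in the examples, namely that a label ends with ':' and operands are separated by commas.

```python
def parse_instruction(line):
    """Split an assembler source line into (label, mnemonic, operands).

    Assumes an optional label ending with ':', then a mnemonic,
    then comma-separated operands, as in the course examples.
    """
    parts = line.split()
    label = None
    if parts and parts[0].endswith(":"):
        label = parts[0][:-1]
        parts = parts[1:]
    mnemonic = parts[0] if parts else None
    operands = " ".join(parts[1:]).split(",") if len(parts) > 1 else []
    return label, mnemonic, [op.strip() for op in operands]

print(parse_instruction("adresse1: MOVE A1, donnees1"))
# ('adresse1', 'MOVE', ['A1', 'donnees1'])
```

A real assembler does essentially this field separation as the first step of each line's translation, before looking the mnemonic up in its opcode table.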
Operands and Labels
Ease of Programming
Unlike machine language, assembly language allows the use of alphanumeric names for variables and labels (addresses of instructions), which greatly facilitates programming. For example, suppose we want to perform a branch. In machine language, we must provide the exact binary memory position of the instruction we want to branch to. In assembly language, it is sufficient to precede the target instruction with a symbolic label; we can then use that label as the operand of the branch operation.

Operands and Labels (continued)
Ease of Referencing
Similarly, for operands there is no longer any need to provide an exact binary address: operands have a name by which they can be referenced. Furthermore, each register has a name recognized by the assembler.

Tab:    DS 1
Dix:    DC 10
Boucle: MOVE Dix, A1
        MOVE A2, Tab
        JUMP Boucle

Explanations
- Definition of a variable Tab of 1 word, where DS = Define Storage.
- Definition of a constant Dix with the value 10, where DC = Define Constant.
- Transfer of the value 10, stored at address Dix, into register A1.
- Transfer of the value of register A2 into the variable Tab.
- Unconditional jump to the address defined by the label Boucle.

Literals
Definition
In machine language, every constant must be encoded in binary. Assembly language allows integer or real values to be written in various bases (2, 8, 10, or 16), as well as character strings; the assembler is responsible for their conversion.
- A binary value is preceded by %.
- A hexadecimal value is preceded by $.
- If there is no special character, the value is considered decimal.
Character strings are enclosed in single quotes (' ').

Examples of Literals

String  Decimal  Hexadecimal  Binary
'A'     65       $41          %01000001
'01'    12337    $3031        %0011000000110001

Directives
Definition
Directives, or pseudo-instructions, are non-executable instructions that have no machine-code equivalent. They are commands given to the assembler that guide the translation of the program. Like executable instructions, directives are referenced by their mnemonic code. There are different kinds of directives. Variables are data that can be modified, while constants are data that cannot be modified. Definition directives assign a value to a constant or reserve memory space for a variable.

Examples of Directives

TTL 'Titre du programme'
Vecteur: DS 50    Definition of the variable Vecteur and reservation of 50 words
Zéro:    DC 0     Definition of the constant Zéro with the value 0
END               End of the program

Arithmetic Expressions
Limitations in Assembly Language
Unlike high-level languages, arithmetic expressions used to compute the value of a variable, such as the assignment A = B + C / D, are not allowed in assemblers. Such calculations must be programmed with several instructions. Assemblers allow only simple arithmetic expressions as operand addresses or labels.
Example: to traverse memory locations in a reserved block, use expressions like:
TABLEAU + 3
TABLEAU + I (where I can vary)

Referencing the Current Instruction Address
Symbol Usage
The symbol * is widely used in assemblers to reference the address of the current instruction. * + 3 corresponds to the address of the third instruction following the current one.
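The literal notations above can be checked mechanically. Here is a minimal Python sketch, not part of the course, that converts each notation to an integer following the % / $ / quote conventions just described; it confirms that each row of the examples denotes the same value.

```python
def literal_value(text):
    """Convert an assembler literal to an integer, following the
    conventions above: % = binary, $ = hexadecimal, '...' = character
    string, otherwise decimal."""
    if text.startswith("%"):
        return int(text[1:], 2)
    if text.startswith("$"):
        return int(text[1:], 16)
    if text.startswith("'") and text.endswith("'"):
        # pack the characters into one word, high byte first
        value = 0
        for ch in text[1:-1]:
            value = value * 256 + ord(ch)
        return value
    return int(text, 10)

# Each row of the table of examples denotes the same value:
assert literal_value("'A'") == literal_value("65") == \
       literal_value("$41") == literal_value("%01000001") == 65
assert literal_value("'01'") == literal_value("12337") == \
       literal_value("$3031") == literal_value("%0011000000110001")
```

This is exactly the conversion work the assembler performs so that the programmer never has to write constants in raw binary.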
This is especially useful for handling memory locations dynamically within assembly code.

Macros and Subroutines
Some assemblers allow structuring programs by grouping a sequence of instructions into a subroutine or a macro-instruction. These structures modularize the program and avoid the repeated writing of frequently used instruction sequences.
Advantages of Using Macros:
- Extends the machine's instruction set, since each macro can be used like any other instruction.
- Makes source programs shorter, more structured, and easier to understand and modify.
- Enhances program quality, since a correctly functioning macro will always work as expected.
Although macros save programming time, they do not reduce the memory space occupied by the machine-code program.

Macros
The idea of a macro-instruction, or simply macro, is to isolate a sequence of instructions that one wants to avoid repeating and to give it a symbolic name for easy reference. When this symbolic name is referenced in the program, the assembler replaces it with the corresponding instruction sequence. Assemblers that support macros are known as macro-assemblers, and all modern assemblers include this functionality.

Macro Definition and Delimitation
The instructions used to define and delimit a macro (e.g., MACRO and ENDM) are typical examples of directives. During assembly, each call to a macro is replaced by its body, and these pseudo-instructions are eliminated.
Calculating the Cube of a Number
Macro for Cube Calculation

MACRO CUBE (value, valuecube)
    MOVE value, D1
    MOVE value, D2
    MUL D1, D2        ; (D2 ← D1 × D2)
    MUL D1, D2
    MOVE D2, valuecube
ENDM

Subroutines
Subroutines are defined similarly to macros: they also aim to avoid repeating instruction sequences that are used multiple times.
Key Difference from Macros: the instructions of a subroutine form an entity separate from the main program. This separation remains after translation, meaning:
- The subroutine appears only once in memory.
- At runtime, any reference to the subroutine causes a branch to that subroutine (see the next figure).

Advantages and Challenges of Subroutines
This approach offers the same benefits as macros, with the added advantage of minimizing the size of the executable code, which is not the case with macros.
Advantages: reduces executable code size by storing the subroutine only once in memory.
Challenges: requires knowing the addresses of subroutines, and the return address must be saved during subroutine execution.
Return Address
The return address is the address of the instruction following the subroutine call. Additionally, in some languages such as Fortran, subroutines can be translated separately and used by different programs.

Subroutine VS Macro
Figure: Subroutine VS Macro

Parameter Passing
A program can exchange data with its macros or subprograms using parameters. A parameter is a variable whose name is known, but whose content is specified only at runtime.
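The macro-expansion mechanism described above, in which each call is replaced by the macro body with formal parameters substituted, can be sketched in Python. This is a minimal illustration, not part of the course; the CUBE macro and its parameter names follow the cube example.

```python
import re

# Macro table: for each macro, its formal parameters and its body,
# following the CUBE example from the course.
MACROS = {
    "CUBE": {
        "params": ["value", "valuecube"],
        "body": [
            "MOVE value, D1",
            "MOVE value, D2",
            "MUL D1, D2",
            "MUL D1, D2",
            "MOVE D2, valuecube",
        ],
    }
}

def expand(name, effective_params):
    """Replace each formal parameter in the macro body with the
    corresponding effective parameter (done at assembly time)."""
    macro = MACROS[name]
    mapping = dict(zip(macro["params"], effective_params))
    pattern = r"\b(" + "|".join(map(re.escape, mapping)) + r")\b"
    return [re.sub(pattern, lambda m: mapping[m.group(1)], line)
            for line in macro["body"]]

for line in expand("CUBE", ["X", "Xcube"]):
    print(line)
```

Each call site gets its own copy of these five instructions, which is why macros shorten the source program but not the machine-code program.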
Macros and subprograms are written using formal parameters, which are replaced by effective parameters corresponding to the data actually processed.
Substitution Timing
This substitution occurs:
- at compile time for macros,
- at runtime for subprograms.

Parameter Passing in Macros
In the case of macros, parameter passing is relatively straightforward, for the following reasons:
- Macro expansion replaces each call with the corresponding instructions during compilation.
- The addresses of the effective parameters are known at this stage.
A macro contains formal parameters (for example, value and valuecube in the cube macro). Every call to the macro supplies effective parameters: the real data being processed.
Role of the Assembler
The assembler handles the substitution of parameters during compilation.

Parameter Passing in Subprograms
For subprograms, there are two main techniques for passing parameters:
Passing by Value
- The value is copied into an area known to the subprogram (memory area or register).
- Not suitable for complex data structures, since the entire structure must be copied.
- Modifications occur only within the subprogram; the original parameter retains its value after returning.
Passing by Reference
- The addresses of the parameters are transmitted to the subprogram.
- The subprogram works directly on the data of the calling program.
Any modifications affect the parameter's value upon returning.

Structure of Machine-Level Instructions
Computers can perform a range of simple operations, such as:
- adding two numbers,
- testing the sign of a numeric value,
- copying the contents of one register into another,
- storing the result of an operation in memory.
Essential Information
A machine instruction must provide the CPU with all the information necessary to trigger a basic operation. It includes:
- an operation code specifying the type of action,
- one or more addresses (e.g., operand addresses, result storage location, next instruction location).
The format of a machine instruction thus includes an operation-code field and up to four address fields, depending on the instruction type.

Address-Based Instruction Formats
Instructions are classified by their number of addresses: 0, 1, 2, 3, or 4, depending on the common instruction format of the machine. The instruction set and instruction structure are fundamental architectural choices.
Historical Context
Until the 1980s, processors favored multi-address formats for efficiency. However, as memory capacity increased, larger address fields were needed, potentially increasing instruction size and CPU register size (e.g., the Instruction Register). Consequently, some general-purpose processor architectures adopted two-address, one-address, or even zero-address instruction formats.

Address Optimizations
- The address of the next instruction can be omitted by using an incremented program counter, allowing sequential instruction execution.
- The result address may be unnecessary if the result is stored in place of an operand.
- Operand addresses can be reduced from two to one by introducing a special register, the accumulator.
Accumulator Usage
The second operand resides in the accumulator (loaded by a previous instruction), and the result is also stored in this accumulator.

Programming a Single-Address Machine
The sequence of machine/assembler instructions corresponding to the calculation A = B × (C + D × E − F/G):

1 LOAD F     (LOAD = load into the accumulator)
2 DIV G      (DIV = divide the content of the accumulator)
3 STA Ti     (STA = store the content of the accumulator)
4 LOAD D
5 MPY E      (MPY = multiply the content of the accumulator)
6 ADD C      (ADD = add to the content of the accumulator)
7 SUB Ti     (SUB = subtract from the content of the accumulator)
8 MPY B
9 STA A

Zero-Address Machine
A zero-address machine can also be envisioned, using a special structure such as a stack (LIFO memory: Last In, First Out), where the operands occupy the two topmost positions and the result is placed at the top of the same stack. In such a machine, most instructions have no address fields; however, two one-address instructions are needed to manage the stack:
LOAD X (places X at the top of the stack)
STORE Y (stores the value at the top of the stack into Y)

Programming a Zero-Address Machine
Additionally, there is a set of zero-address instructions, such as ADD, SUB, MPY, DIV.
The ADD instruction sums the two values located in the two topmost levels of the stack and places the result at the top of what remains (the two operands are removed and replaced by their sum).
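The single-address sequence above can be checked by simulating an accumulator machine. This is a minimal Python sketch, not part of the course; the variable values are arbitrary test data.

```python
def run_accumulator(program, memory):
    """Execute single-address instructions against a memory dict.
    The accumulator is the implicit second operand and the result."""
    acc = 0
    for op, addr in program:
        if op == "LOAD":
            acc = memory[addr]
        elif op == "STA":
            memory[addr] = acc
        elif op == "ADD":
            acc += memory[addr]
        elif op == "SUB":
            acc -= memory[addr]
        elif op == "MPY":
            acc *= memory[addr]
        elif op == "DIV":
            acc /= memory[addr]
    return memory

mem = {"B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "G": 3}
program = [
    ("LOAD", "F"), ("DIV", "G"), ("STA", "Ti"),  # Ti = F / G
    ("LOAD", "D"), ("MPY", "E"), ("ADD", "C"),   # acc = C + D * E
    ("SUB", "Ti"), ("MPY", "B"), ("STA", "A"),   # A = B * (C + D*E - F/G)
]
run_accumulator(program, mem)
print(mem["A"])  # 2 * (3 + 4*5 - 6/3) = 42.0
```

Note the temporary variable Ti: with only one address per instruction, the partial result F/G must be stored in memory before the rest of the expression can be evaluated in the accumulator.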
- The SUB instruction replaces the two operands at the top of the stack with their difference (subtracting the value at the top from the one in second position).
- The MPY instruction replaces the two values at the top of the stack with their product.
- The DIV instruction replaces them with their quotient, the value at the top being the divisor.

Instruction Sequence for a Zero-Address Machine
We can write the sequence of instructions corresponding to the evaluation of the expression from the previous example, A = B × (C + D × E − F/G):

1  LOAD B    PILE = {B}
2  LOAD C    PILE = {B; C}
3  LOAD D    PILE = {B; C; D}
4  LOAD E    PILE = {B; C; D; E}
5  MPY       PILE = {B; C; D × E}
6  ADD       PILE = {B; C + D × E}
7  LOAD F    PILE = {B; C + D × E; F}
8  LOAD G    PILE = {B; C + D × E; F; G}
9  DIV       PILE = {B; C + D × E; F/G}
10 SUB       PILE = {B; C + D × E − F/G}
11 MPY       PILE = {B × (C + D × E − F/G)}
12 STORE A   PILE = {}

Instruction Sequences and Memory Optimization
Instruction sequences like the one above can be written fairly easily and systematically: variables are pushed onto the stack in their order of appearance in the algebraic expression (for example B, C, D, E, F, G), and an arithmetic operator is introduced as soon as its two operands occupy the two top levels of the stack.
Memory Access Optimization
With the growing disparity between CPU and memory speeds, memory accesses significantly penalize the processor, which may wait many cycles for data from RAM. Processor designers have therefore sought to minimize memory references by favoring access to the CPU's internal registers.
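The zero-address sequence can likewise be simulated with a Python list as the stack. This is a minimal sketch, not part of the course, using the same arbitrary test values as before; SUB and DIV treat the top of the stack as the subtrahend and the divisor, as specified above.

```python
def run_stack(program, memory):
    """Execute zero-address instructions using a list as the stack."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "LOAD":
            stack.append(memory[instr[1]])
        elif op == "STORE":
            memory[instr[1]] = stack.pop()
        else:
            right = stack.pop()   # top of stack: subtrahend / divisor
            left = stack.pop()
            stack.append({"ADD": left + right, "SUB": left - right,
                          "MPY": left * right, "DIV": left / right}[op])
    return memory

mem = {"B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "G": 3}
program = [("LOAD", "B"), ("LOAD", "C"), ("LOAD", "D"), ("LOAD", "E"),
           ("MPY",), ("ADD",), ("LOAD", "F"), ("LOAD", "G"),
           ("DIV",), ("SUB",), ("MPY",), ("STORE", "A")]
run_stack(program, mem)
print(mem["A"])  # 2 * (3 + 4*5 - 6/3) = 42.0
```

Unlike the single-address version, no temporary variable Ti is needed: the stack itself holds the partial result F/G until SUB consumes it.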
This approach has allowed a return to multi-address instructions, since the addresses now correspond to register numbers, which are much smaller than memory addresses.

Instruction Set Overview
Each machine has its own basic instruction set. It is often the result of a compromise between conflicting requirements, such as:
- simplicity,
- universality,
- efficiency,
- programmer convenience.
The number of instructions typically ranges between 50 and 250.

Historical Development of Instruction Sets
Historically, instruction sets became progressively more complex:
- variable-length instructions,
- composite operations,
- multiple operand types (e.g., memory addresses, registers, register compositions).
These features complicated CPU construction and led to the CISC (Complex Instruction Set Computer) architecture, often requiring a microprogrammed sequencer to execute a large variety of instructions.

The RISC Architecture
In response, research around 1981 led to the RISC (Reduced Instruction Set Computer) architecture:
- RISC restricts the instruction set to a small number of elementary, fixed-format instructions.
- These instructions are simple to implement and execute quickly (often one per machine cycle).
Characteristics of RISC
This approach relies on a hardwired sequencer and an optimizing compiler that maximizes the machine's performance, focusing on register use and limiting memory accesses.
Combining CISC and RISC Approaches
Technological advances have allowed CISC and RISC to converge:
- CISC processors (e.g., Intel x86) retain a complex instruction set for compatibility but execute it internally as simpler, microprogram-like instructions.
- RISC processors (e.g., ARM CPUs) have gradually expanded their instruction sets with capabilities such as floating-point and vector calculations and differentiated memory accesses.

Instruction Set Categories
Instructions commonly found in all machines can be grouped into six categories:
- Data Transfer: Load, Move, Store; register-to-register, memory-to-register, etc.
- Arithmetic Operations: basic arithmetic on integers or floating-point numbers, in single or double precision.
- Logical Operations: AND, OR, NOT, XOR, etc.
- Sequence Control: conditional and unconditional branching, loops, procedure calls.
- Input/Output: Read, Write, Print, etc.
- Miscellaneous Manipulations: shifts, format conversions, register increments, etc.

Operand Addressing
The address field of an instruction does not always contain the actual address of an operand. When it does, this is called direct addressing.
Programming Convenience
To facilitate programming, CPU manufacturers offer a range of methods for addressing operands. The instruction format includes a field whose bits indicate the chosen mode.

Main Addressing Modes
Among the various addressing modes, the most important are:
- Direct: the address field contains the actual address of the operand.
- Indirect: the address field contains the address of the effective address; multiple levels of indirection are sometimes possible.
- Immediate: the address field contains the operand itself.
- Implicit: the operation code indicates where the operand is; typical of a zero-address machine.
- Indexed: the effective address is the sum of the address field and the contents of an index register.
- Based: the effective address is the sum of the address field and the contents of a base register.
- Relative: similar to based addressing, but uses the Program Counter (PC) as the base address.

Combining Addressing Modes
Addressing modes can sometimes be combined, such as indexed addressing with indirect addressing.
Application Order
The order in which the addressing methods are applied is crucial, as it affects the result.
Example Notation: symbols like a, b, c, α denote arbitrary data, with abbreviations such as:
- IMM: immediate addressing
- I: indirect addressing
- XR1: index register 1
- B1: base register 1

Example Scenario
The effect on an accumulator (ACC) of a load operation under different addressing conditions is illustrated below.
Example Data
The memory contents and some register values are summarized in the following table.

Examples of Operations on the Accumulator
The examples of operations on the accumulator are given in the following table; the addressing conditions are indicated after the address field and separated by commas.

Case 1 and Case 2
Let ADR be the number of bits in the address field of an instruction, and let 2^n be the size of central memory.
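The addressing modes listed above can be made concrete with a small effective-address calculator. This is a minimal Python sketch, not part of the course; the memory contents and register values are invented for the illustration.

```python
def load(mode, address_field, memory, regs):
    """Return the value loaded into the accumulator under a few of the
    addressing modes above. `memory` maps addresses to word contents."""
    if mode == "IMM":                         # immediate: the field IS the operand
        return address_field
    if mode == "DIRECT":                      # field holds the operand's address
        return memory[address_field]
    if mode == "I":                           # indirect: field holds the address
        return memory[memory[address_field]]  # of the effective address
    if mode == "XR1":                         # indexed: field + index register 1
        return memory[address_field + regs["XR1"]]
    if mode == "B1":                          # based: field + base register 1
        return memory[address_field + regs["B1"]]
    raise ValueError(mode)

# Invented example data: memory words and register contents.
memory = {100: 7, 101: 200, 110: 4, 200: 9}
regs = {"XR1": 10, "B1": 100}

print(load("IMM", 100, memory, regs))     # 100 (the operand itself)
print(load("DIRECT", 100, memory, regs))  # 7
print(load("I", 101, memory, regs))       # memory[200] = 9
print(load("XR1", 100, memory, regs))     # memory[100 + 10] = 4
print(load("B1", 10, memory, regs))       # memory[10 + 100] = 4
```

The same address field (100) yields different operands under different modes, which is why the mode bits in the instruction format matter as much as the field itself.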
We can distinguish three cases:
Case 1: ADR = n
- The entire physical memory is accessible.
- All addressing methods offered by the manufacturer can be used.
Case 2: ADR < n
- ADR is not enough to address the entire memory.
- Based addressing can be used if the base register is large enough (i.e., n bits).
- Memory is divided into blocks that ADR can fully address.

Case 3: ADR > n
ADR can address memory locations that do not exist in physical memory. This allows mechanisms that overflow into auxiliary storage, such as disk units (the notion of virtual memory).
Virtual Memory
Virtual memory can be implemented by dividing the program stored on disk into pages. Real memory is divided into page frames, each of which holds exactly one program page (the pagination concept).

Control Unit
The control unit is the set of devices coordinating the operation of the computer in order to execute the sequence of operations specified in the program instructions. The main devices involved during the memory fetch and instruction decode (fetch cycle) are:

Key Components
- Program Counter (PC): a register containing the memory address where the instruction to be fetched is stored.
- Instruction Register (IR): receives the instruction to be executed.
- Operation Code Decoder: determines which operation must be performed among all possible operations.
- Sequencer: generates the control signals.
- Clock: emits regular electronic pulses, synchronizing all actions of the central unit.

Information Flow During a Fetch Cycle
The flow of information during a fetch cycle is illustrated in the following figure.
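The relation between ADR and n is just powers-of-two arithmetic. A short Python sketch, not part of the course, with invented bit widths illustrating Case 2:

```python
ADR = 12   # bits in the address field (invented for the example)
n = 16     # central memory holds 2^n words (invented)

addressable = 2 ** ADR   # locations reachable by the address field alone
memory_size = 2 ** n     # total physical memory

# Case 2 (ADR < n): memory splits into blocks that the address field
# can fully cover; an n-bit base register selects the current block.
blocks = memory_size // addressable
print(addressable, memory_size, blocks)  # 4096 65536 16
```

With these widths, the 12-bit field covers one 4096-word block at a time, and based addressing supplies the remaining 4 bits of the address.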
Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 61 / 112 Steps of the Fetch Cycle (See previous Figure) The different steps of the fetch cycle (corresponding to the circled numbers in the previous Figure) can be summarized as follows: 1 Transfer of the address of the new instruction from the Program Counter (PC) to the Memory Address Register (MAR). 2 A read pulse, generated by the control unit, triggers the transfer of the fetched instruction to the Memory Buffer Register (MBR), which acts as a buffer for all data read from or written to memory. 3 Transfer of the instruction to the Instruction Register (IR), containing the operation code and operand address. 4 While the operand address is sent to the MAR, the operation code is transmitted to the decoder, which identifies the requested operation and signals the sequencer. 5 The Program Counter (PC) is incremented in preparation for the next fetch cycle. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 62 / 112 Execution Cycle (See the next Figure) The fetch cycle is followed by the execution cycle, during which the operation specified in the instruction is performed by the ALU. The exact sequence of actions, coordinated by the sequencer, depends on the operation. Typically, during an execution cycle, information flows as shown in the next Figure. The execution cycle generally includes the following steps: 1 The sequencer sends command signals to memory to read the operand at the address stored in the MAR, placing it in the MBR. 2 Transfer of the MBR contents to the ALU, specifically to the accumulator or another designated register. In some cases, such as storing a result, the accumulator’s contents are transferred to the MBR. For branch instructions, the address field of the instruction is transferred to the PC. 3 The operation is performed under the sequencer’s control. 
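The two cycles just described can be made concrete with a toy simulation. This is a minimal sketch, not any real machine: the one-address instruction format and the tiny instruction set are invented for illustration, while the register names (PC, MAR, MBR, IR, ACC) follow the slides.

```python
# Toy von Neumann machine: a minimal sketch of the fetch and
# execution cycles described above (hypothetical instruction encoding).
MEM = [0] * 16
# Program: LOAD 10; ADD 11; STORE 12; HALT, as (opcode, address) pairs
MEM[0:4] = [("LOAD", 10), ("ADD", 11), ("STORE", 12), ("HALT", 0)]
MEM[10], MEM[11] = 7, 5          # data operands

PC, ACC = 0, 0                   # program counter, accumulator
while True:
    MAR = PC                     # 1. PC -> MAR
    MBR = MEM[MAR]               # 2. memory read into MBR
    IR = MBR                     # 3. MBR -> IR
    opcode, addr = IR            # 4. decode: operation code + operand address
    PC += 1                      # 5. increment PC for the next fetch cycle
    # --- execution cycle, coordinated by the "sequencer" ---
    if opcode == "LOAD":
        ACC = MEM[addr]
    elif opcode == "ADD":
        ACC += MEM[addr]
    elif opcode == "STORE":
        MEM[addr] = ACC
    elif opcode == "HALT":
        break

print(MEM[12])                   # -> 12  (7 + 5)
```

A branch instruction would simply overwrite the PC during the execution phase, which is exactly the "address field of the instruction is transferred to the PC" step described above.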
Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 63 / 112 Information Flow During an Execution Cycle The flow of information during an execution cycle is illustrated in the following figure. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 64 / 112 Completion of the Execution Cycle and Transition to the Next Fetch Cycle Cycle Transition Once the execution cycle is complete, the control unit immediately proceeds to the next fetch cycle, fetching the new instruction indicated by the address in the Program Counter (PC). The fetch and execution cycles described here are typical of classical architectures, known as the von Neumann architecture. To enhance the power and speed of modern supercomputers and microcomputers, the architecture has evolved significantly. Consequently, fetch and execution cycles have become more complex in contemporary systems. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 65 / 112 Synchronization of Operations The circuits that synchronize and control all computer operations are located in the control unit. Periodic signals generated by the clock define the clock cycle, the fundamental time unit governing the machine’s operation. Instruction Cycle We have already seen the fetch cycle and the execution cycle, which together compose the instruction cycle. The execution time of an instruction depends on the type of operation to be performed; an instruction cycle can extend over multiple clock cycles, as shown in the following Figure. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 66 / 112 CPU Cycle and Memory Cycle Definitions CPU Cycle: Refers to the execution time of the shortest instruction or the duration of a basic action that causes a change in state. Memory Cycle: Significantly longer than the CPU cycle and is a limiting factor in the computer’s performance. 
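A rough back-of-the-envelope model shows why the memory cycle, rather than the clock, often limits performance; all numbers below are illustrative assumptions, not figures from the slides.

```python
# Rough model: each instruction needs some clock cycles of CPU work
# plus one or more memory cycles; slow memory dominates the total time.
clock_ns = 1.0            # CPU clock cycle (illustrative value)
mem_cycle_ns = 10.0       # memory cycle, much longer than the CPU cycle
cpu_cycles = 4            # CPU work per instruction (illustrative)
mem_accesses = 2          # e.g. instruction fetch + one operand read

t_instr = cpu_cycles * clock_ns + mem_accesses * mem_cycle_ns
print(t_instr)            # -> 24.0  (memory accounts for 20 of the 24 ns)
```

Under these assumptions, doubling the clock frequency would shave only 2 ns off each instruction, which is why the interleaving and cache techniques discussed next attack the memory side instead.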
Improving Memory Performance To address the relative slowness of central memory: Interleaved Memory: Allows overlapping of cycles for efficiency. Cache Memory (also known as pre-memory): Anticipates data and instruction transfers to the CPU. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 67 / 112 Performance Considerations The operating speed of a computer depends not only on its clock frequency but also on: Memory cycle and structure Cache access time Other architectural factors Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 68 / 112 Pipelines and Pipelining The technique of pipelining was first implemented in supercomputers, then in Intel’s 80386 in 1985, and is now incorporated into all microprocessors. It is a straightforward concept inspired by assembly line organization, such as car manufacturing lines. Definition Pipelining is a technique for performing more work per unit of time when a given operation must be repeated on a large number of operands. It involves segmenting a complex operation into a sequence of simpler actions. Each simple action is carried out by a specific unit, allowing multiple complex instructions to be processed simultaneously at different stages of execution. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 69 / 112 Objective of Pipelining The goal of pipelining is not to speed up the execution of a single instruction but to increase performance when applied repeatedly on a series of data. Process of Pipelining Instead of designing a unit S capable of performing operation P in time T, the work is divided into segments P1, P2, P3, P4, ... executed by sub-units S1, S2, S3, S4, ... in times T1, T2, T3, T4, ..., which are fractions of time T. Once the pipeline is full, i.e., when all sub-units are occupied, results begin to output at a much higher rate than in an unsegmented unit. 
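This throughput gain can be checked with a small calculation; the sketch below assumes an ideal pipeline (k stages of equal duration T/k, no stalls), using the notation T from above.

```python
# Time to process n operands: unsegmented unit vs. an ideal k-stage
# pipeline whose stages each take T/k (no stalls assumed).
def unsegmented_time(n, T):
    return n * T                      # one result every T

def pipelined_time(n, T, k):
    # k stages of T/k to fill the pipeline, then one result every T/k
    return (k + (n - 1)) * (T / k)

T, k, n = 100.0, 5, 1000              # arbitrary illustrative values
speedup = unsegmented_time(n, T) / pipelined_time(n, T, k)
print(round(speedup, 2))              # -> 4.98, close to k = 5
```

For large n the speedup approaches the number of stages k, which is the "approximately N times higher" result throughput mentioned for the segmented unit.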
As shown in the following Figure (where A, B, C, D, and E are operations to execute and RA , RB , RC , RD , RE are their results), the result throughput is approximately N times higher in a unit divided into N parts than in a non-segmented unit. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 70 / 112 Segmentation of an operational unit Figure: Segmentation of an operational unit Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 71 / 112 Objective of Pipelining Pipelining does not increase the speed of executing a single operation (it always takes time T to process) but allows for more results per second by continuously feeding operands into the pipeline. This is a method to enhance megaflop capacity. Floating-Point Operations To achieve this, floating-point operations are segmented, and the hardware is restructured similarly to stations on an assembly line. Example: An addition operation could be divided into five parts, each handled by a dedicated unit: Compare exponents Swap mantissas, if necessary Shift mantissas Add mantissas Normalize the result Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 72 / 112 Pipelining in the Control Unit Pipelining can also apply to devices beyond the ALU, such as the Control Unit for instruction processing. Instruction Cycle Segmentation The instruction cycle can be divided into sequential actions: Fetch the next instruction from memory [Fetch Instruction] Decode the operation code [Decode] Fetch the operand [Fetch Data] Execute the operation [Execute] Write the result [Write Result] These actions can follow each other and overlap in time, thus increasing throughput. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 73 / 112 Clock Speed and Pipeline Efficiency Advantages of Pipeline Segmentation Simplifying each elementary action in the pipeline can increase clock speed and provide additional performance gains. Early pipelines had a few stages. 
The Prescott architecture in Pentium IV holds the record with a 31-stage pipeline. Typical pipelines generally have around 15 stages. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 74 / 112 Pipeline Stalls (Bubbles) Pipelines are not 100% efficient; certain constraints cause delays known as bubbles (pipeline stalls) because they introduce wait times during execution. Resource Conflict: Two pipeline stages may require the same resource (e.g., two memory accesses, one for retrieving an instruction and another for data). Data Dependency: If one stage’s execution requires data computed by a previous instruction, it may not yet be available if the instruction is still in the pipeline. Instruction Dependency: In the case of a jump instruction, the next instruction cannot be fetched until the address is known, often requiring the decoding and execution of the jump. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 75 / 112 Pipeline Design Trade-Offs Balancing Pipeline Length and Efficiency The delays mentioned are increasingly significant as the pipeline lengthens, explaining the compromise most CPUs reach between the efficiency of long pipelines and the penalties due to their complexity. Pipeline stalls (bubbles) add latency, particularly in long pipelines. Effective pipeline design seeks to maximize throughput while minimizing stall-induced delays. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 76 / 112 Clock Today, computers are synchronous, meaning that the outputs of all circuits change simultaneously according to the rhythm of a global periodic signal called the clock. Definition of a Clock A clock is a periodic signal with a frequency equal to the inverse of its period (or cycle time). The clock period consists of two parts: A high period A low period An example of a typical clock signal is illustrated in the following Figure. 
Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 77 / 112 Types of Synchronization There are two types of synchronization: Level-based synchronization: All state changes occur while the clock is at a fixed level (high or low). Edge-triggered synchronization: All state changes happen on a fixed clock pulse edge (rising or falling). Key Advantage Edge-triggered synchronization allows reading and writing of a state element within a single clock cycle. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 78 / 112 Edge-Triggered Synchronization Requires all signals to be written to be stable when the active pulse edge occurs. The clock period must be long enough to allow signals to stabilize within combinational circuit blocks, as shown in the following figure. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 79 / 112 Sequencer Definition The sequencer is an automaton that generates the control signals necessary to activate and control the units involved in executing a given instruction. These signals are distributed to various control points of the relevant components according to a timing diagram, taking into account the response times of the circuits involved. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 80 / 112 Types of Sequencers This automaton can be implemented in two ways: Wired sequencer Microprogrammed sequencer These alternatives are illustrated in the following Figure. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 81 / 112 Wired Sequencer A wired sequencer is a complex sequential circuit where each executable instruction has a dedicated sub-circuit capable of controlling its execution. The appropriate sub-circuit is activated by a signal from the decoder. 
Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 82 / 112 Microprogrammed Sequencer (Wilkes’ Model, 1951) Concept by Maurice Wilkes The same result as a wired sequencer can be achieved using a sequence of microinstructions stored in a microprogram memory. The term ”micro” indicates a lower, more detailed programming level than machine level. The operation code of the instruction to be executed serves as the address for the first microinstruction in the microprogram. This microprogram can generate a sequence of control signals equivalent to those produced by a wired sequencer. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 83 / 112 Microprogram Storage and Execution Store the microprograms (firmware) for different operation codes in a high-speed memory. Add a mechanism for sequential execution of microinstructions, allowing for conditional branching. The memory, used in read-only mode, is non-volatile, typically ROM or flash memory. Diagram Reference The following Figure illustrates the structure of a microsequencer. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 84 / 112 Structure of a microSequencer Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 85 / 112 Advantages and Trade-offs of Microprogrammed Sequencers Advantages The main advantage of a microprogrammed sequencer lies in its flexibility and simplicity. The trade-off is a slightly lower speed compared to wired logic. Microprogramming allows for a more flexible technique to control electronic commands compared to wired logic. Logical Equivalence It is always possible to replace a logic circuit with a microprogram. An example of logical equivalence between circuits and microprogramming is shown in the following figure. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 86 / 112 Formats of Microinstructions The format of microinstructions varies across machines. 
Horizontal Microprogramming: Wilkes advocated for long, detailed microinstructions where each bit corresponds to a command. Vertical Microprogramming: Very compact microinstructions that yield a single command, requiring decoding before generating a control signal. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 87 / 112 Example of Equivalence between Logic Circuits and Microprogram Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 88 / 112 Program Development The development of a program, from problem analysis to debugging, requires numerous software tools that make up a programming environment. To operate, these tools use operating system services, particularly the file manager. Programming Environment Figure 3.4 shows the diagram of a programming environment containing a minimum set of tools. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 89 / 112 Typical Tools in a Programming Environment The classic tools of an environment include: Text editor Translator (compiler, assembler) Linker Loader Debugger Although related to programming, project development involves more than coding; it also requires designing the program based on the problem. A modern environment must therefore also provide tools suited to the design phase (in the broadest sense). Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 90 / 112 Programming Environment Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 91 / 112 Compilation Structure of a Compiler A compiler translates a source program written in a high-level language into an object program written in a low-level language. The compiler itself is an important and substantial program. Its size depends on the nature of the language being processed, its level of sophistication, and the degree of code optimization. After being compiled, a program must go through the linker and loader before it can be executed. 
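The compile, link, and load steps just mentioned can be sketched with a toy object-code format. Everything here (the instruction tuples, the module names, the convention that an unresolved external reference is written as a string) is invented for this illustration and matches no real linker.

```python
# Toy link-and-load sketch: object modules hold instructions whose
# addresses are relative to origin 0, plus unresolved external names.
# Each module: list of (opcode, operand); an operand is either an int
# (module-relative address) or a str (external reference).
main_obj = [("CALL", "sub"), ("LOAD", 1)]       # references module 'sub'
sub_obj  = [("RET", 0)]
modules  = {"main": main_obj, "sub": sub_obj}

def link(modules, order):
    """Concatenate modules and resolve external names to addresses."""
    base, bases, program = 0, {}, []
    for name in order:                          # assign each module an origin
        bases[name] = base
        base += len(modules[name])
    for name in order:
        for op, operand in modules[name]:
            if isinstance(operand, str):        # external reference
                operand = bases[operand]        # resolve via module origins
            else:                               # module-relative address
                operand += bases[name]
            program.append((op, operand))
    return program

def load(program, origin):
    """Relocatable loader: shift every address by the load origin."""
    return [(op, operand + origin) for op, operand in program]

linked = link(modules, ["main", "sub"])
print(linked)                 # -> [('CALL', 2), ('LOAD', 1), ('RET', 2)]
print(load(linked, 0x400))    # -> [('CALL', 1026), ('LOAD', 1025), ('RET', 1026)]
```

The final `load` call is the relocatable-loader idea described later: the linked program is still numbered from origin 0, and the loader simply adds the chosen load address to every address field.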
Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 92 / 112 Phases of a Compiler The work of the compiler is divided into several phases: Lexical analysis Syntax analysis Semantic analysis Intermediate code generation Code optimization Object code generation These phases are logically grouped into two main parts: an analysis part and a synthesis part. During the analysis phases, the compiler must account for error detection and handling. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 93 / 112 Structure and operation of a compiler Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 94 / 112 Lexical Analysis Definition Lexical analysis is the first phase of compilation. Its primary role is to read the sequence of characters that constitutes the source program and produce a sequence of (syntactic) elements of the language, which are then processed by the syntax analyzer. Language elements can include numbers, identifiers, operators, reserved words, separators, etc. When we read text, we also perform lexical analysis. Text is simply a sequence of characters. We start by searching for and identifying small sequences of characters that form words. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 95 / 112 Symbol Table and Error Detection Identifiers, such as variable or procedure names, along with their attributes, are stored in a symbol table. Information irrelevant to compilation, such as comments, is removed. The lexical analyzer helps detect errors such as: Identifiers that are too long Characters not belonging to the language vocabulary Illegal numbers Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 96 / 112 Syntax Analysis Definition The syntax analyzer receives a list of syntactic elements (reserved words, variable identifier names, arithmetic operators, punctuation marks, etc.) from the lexical analyzer. 
It checks whether this list is correct according to the syntax of the language (e.g., in Pascal, a semicolon is required between two statements). From these elements, the syntax analyzer generates the syntax tree of the program. The syntax tree is produced from the language phrase to be analyzed (the program) and the syntax rules. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 97 / 112 Building the Syntax Tree Two approaches are possible for establishing this tree: Bottom-Up Approach: Starts from the elements constituting the phrase and finds all the rules that allow moving up to the root. Top-Down Approach: Starts from the root and applies the rules that allow descending to the desired phrase. The following Figure shows the syntax tree obtained using the top-down approach on a small piece of Pascal code. When reading text, we also perform syntax analysis to understand the structure of sentences, typically in the form of subject-verb-complement. During compilation, the syntax analyzer detects errors such as: Unmatched parentheses Poorly constructed block structures or statements Missing delimiters These errors account for the majority of errors in a program. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 98 / 112 Example of a syntax tree Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 99 / 112 Semantic Analysis Definition Semantic analysis focuses on analyzing the meaning and significance of the phrases in the language. It uses the syntax tree to identify the operators and operands of the instructions. Its main task is to verify type consistency, ensuring that each operator works with operands that are allowed by the language. To perform these checks, the semantic analyzer uses information stored in the symbol table. 
Errors detected during this phase include: Use of undeclared identifiers or identifiers declared multiple times Type incompatibility between operators and operands Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 100 / 112 Intermediate Code Generation Definition After the various analyses, the next step is to generate the program in object code. A common method is to divide this task into two stages: intermediate code generation and object code generation. Intermediate code can be defined as the code for an abstract machine. It must possess two properties: It should be easy to produce from the syntax tree. It should be easy to translate into object code. Thus, from the syntax tree of a program, the compiler generates a stream of simple instructions resembling macros. Unlike assembly language, these instructions do not make explicit references to the registers of the target machine. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 101 / 112 Code Optimization Definition Code optimization involves improving the generated code to make it faster to execute and/or less memory-intensive. The main technique used is to remove redundancies to minimize the final number of instructions. Efforts are also made to evaluate expressions that can be evaluated, meaning expressions that use constants. The following example illustrates these techniques. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 102 / 112 Code Examples Generated Code from Syntax Tree tmp1 := 3.1415926 × 100.0 (tmp: temporary variable) tmp2 := RealToInteger(tmp1) tmp3 := id1 + tmp2 (id: identifier) tmp4 := tmp3 × id2 id3 := tmp4 Optimized Code tmp1 := id1 + 314 id3 := tmp1 × id2 Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 103 / 112 Code Optimization Definition Optimization is a delicate phase, often requiring a lot of time during compilation. Effective optimization must consider the characteristics of the target machine. 
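The constant-evaluation technique from the example above (folding 3.1415926 × 100.0 into a single constant) can be sketched as a small pass over three-address code. The tuple instruction format is a simplified assumption invented here, and the RealToInteger conversion step is omitted for brevity.

```python
# Minimal constant-folding pass over three-address code.
# Each instruction: (dest, left, op, right); literals are numbers,
# names are strings. This format is made up for the sketch.
def fold_constants(code):
    consts, out = {}, []
    for dest, left, op, right in code:
        l = consts.get(left, left)            # substitute known constants
        r = consts.get(right, right)
        if isinstance(l, (int, float)) and isinstance(r, (int, float)):
            consts[dest] = l * r if op == "*" else l + r   # fold: emit nothing
        else:
            out.append((dest, l, op, r))      # keep, constants substituted
    return out

# tmp1 := 3.1415926 * 100.0 ; tmp2 := id1 + tmp1 ; id3 := tmp2 * id2
code = [("tmp1", 3.1415926, "*", 100.0),
        ("tmp2", "id1", "+", "tmp1"),
        ("id3", "tmp2", "*", "id2")]
for instr in fold_constants(code):
    print(instr)   # only the two instructions involving variables remain
```

As on the slide, the first instruction disappears entirely: its result is computed at compile time and substituted into the instruction that used tmp1.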
For example, in the case of a vector computer, optimization plays an important role. Since optimization significantly increases compilation time, it is advisable to avoid this phase during the development and debugging of programs. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 104 / 112 Importance of Optimization Significance Code optimization plays a crucial role for machines using recent processors. In these machines, complexity is present at the compiler level, which must take advantage of a large number of registers while minimizing accesses to main memory. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 105 / 112 Object Code Generation Definition Object code generation is the final phase of compilation: it generates relocatable object code, meaning relative to origin 0. It translates each instruction from intermediate code into a sequence of target machine instructions. It assigns memory positions for both data and program instructions. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 106 / 112 Complexity of Generation Key Considerations A critical aspect is the allocation of CPU registers. This generation phase is one of the most complex and costly in the compiler. The produced code must be correct and of high quality, as it needs to optimize resource management (e.g., registers) of the target machine. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 107 / 112 Symbol Table Definition During compilation, it is essential to have access to the description of identifiers and their characteristics. The symbol table groups all this information and makes it available to different compiler elements. Contains names of variables, constants, and procedures. Each entry is associated with a record that includes: The name of the object. Its characteristics (type, numeric address, number and type of parameters, etc.). All compilation phases can utilize this table, requiring very fast access. 
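Such a symbol table can be sketched as a hash map from names to attribute records, which gives the fast (average constant-time) access noted above. The field names and error messages below are illustrative assumptions, not taken from any particular compiler.

```python
# Minimal symbol table: name -> record of attributes.
# Field names ("kind", "type", "address") are illustrative only.
symtab = {}

def declare(name, kind, typ, address=None):
    if name in symtab:                                     # semantic error
        raise ValueError(f"'{name}' declared multiple times")
    symtab[name] = {"kind": kind, "type": typ, "address": address}

def lookup(name):
    if name not in symtab:                                 # semantic error
        raise KeyError(f"'{name}' used but not declared")
    return symtab[name]

declare("id1", "variable", "integer", address=0x0100)
declare("Sqrt", "procedure", "real")
print(lookup("id1")["type"])       # -> integer
```

The two error branches correspond directly to the semantic errors listed earlier: multiple declarations of the same identifier, and use of an undeclared identifier.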
Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 108 / 112 Linker Definition A linker (linkage editor) is software that allows for the combination of multiple object programs into a single executable program. To develop large programs, they are structured into modules that are compiled independently. A program can consist of multiple files, each containing one or more subprograms. One of the files must contain the main program. All files are compiled separately but can reference subprograms in other files, resulting in external references. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 109 / 112 Loader Definition The object program obtained after linking must still be loaded into main memory for execution. The loader handles this task and is typically coupled with the linker. In most modern multiprogrammed systems, the address to load the program is decided at the last moment. In earlier systems, computers only had one program in memory, so addresses were fixed in advance. This type of loader is called an absolute loader. Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 110 / 112 Relocatable Loader Definition Current loaders also relocate the program in memory; these are known as relocatable loaders. As the program starts at address 0, all instructions are numbered relative to this origin. It suffices to add the offset from address 0 to all addresses (instructions and data). Ahmed AHMIM (UC) Concepts on Computer Instructions November 18, 2024 111 / 112 The End Questions? Comments?