Lecture 02 Background on Intel's ISA PDF
Julius-Maximilians-Universität Würzburg
2024
Prof. Dr. Alexandra Dmitrienko
Summary
This document is lecture notes on Intel's Instruction Set Architecture (ISA). It covers different processor instruction set architectures, focusing on x86-64 and its differences with x86. The lecture notes include background on processor internals and software exploitation techniques.
Full Transcript
Security of Software Systems
Lecture 02: Background on Intel’s Instruction Set Architecture (ISA)
Prof. Dr. Alexandra Dmitrienko
SS 2024, Julius-Maximilians-Universität Würzburg

Goal of the Lecture
To gain knowledge about Intel’s processor internals
Why learn about processors?
▪ Attacks that exploit software vulnerabilities are processor-specific
▪ It is necessary to know how processors work in order to understand how vulnerabilities can be exploited
Why consider Intel processors?
▪ Intel processors dominate the current market of PCs, notebooks and servers
  ▪ E.g., 63.8% market share in Q1 2024 (Source: https://www.cpubenchmark.net/market_share.html)
▪ In the mobile and embedded markets, ARM architectures are dominant, and they are slowly coming to PCs and servers

Prof. A. Dmitrienko, Security of Software Systems Lecture, SS 2024

Variety of Instruction Set Architectures
▪ Different versions of processor instruction set architectures exist
  ▪ x86 (i386, IA-32) for 32-bit processors (older and low-end systems)
  ▪ x86-64 (x64, AMD64, Intel 64), the 64-bit extension (modern PCs, notebooks, servers with Intel and AMD CPUs)
  ▪ ARM, MIPS, RISC-V, etc. (mostly mobile and embedded systems)
▪ In this lecture, we will first concentrate on Intel x86-64 and then learn about the differences to Intel x86

Content of the Lecture
▪ Program compilation process
▪ x86-64 registers, data types and basic assembler instructions
▪ Memory segmentation and stack operations
▪ Function calling and system calls
▪ x86 differences

Disclaimer
▪ Note that this is not a complete tutorial on the x86-64 instruction set architecture
▪ We will only study selected features and instructions that are necessary for understanding the software exploitation techniques

Recommended Literature
▪ Bruce Dang, Alexandre Gazet, Elias Bachaalany, Sébastien Josse: Practical Reverse Engineering: x86, x64, ARM, Windows Kernel, Reversing Tools, and Obfuscation
  Available online: https://repo.zenk-security.com/Reversing%20.%20cracking/Practical%20Reverse%20Engineering.pdf

Background on Intel’s Instruction Set Architecture (ISA)
PROGRAM COMPILATION PROCESS

Example Program
My simple addition program (C source):
    #include <stdio.h>
    int main (void) {
        int c;
        c=2+6;
        printf("c=%d\n",c);
    }
Compiled CODE:
    0000000000400524 <main>:
    0…24 push rbp
    0…25 mov rbp,rsp
    0…28 sub rsp,0x10
    0…2c mov DWORD PTR [rbp-0x4],0x8
    0…33 mov eax,0x40063c
    0…38 mov edx,DWORD PTR [rbp-0x4]
    0…3a mov esi,edx
    0…3f mov rdi,rax
    0…41 mov eax,0x0
    0…44 call 400400
    0…49 mov eax,0x0
    0…4e leave
    0…4f ret
    …
DATA (the format string):
    0x40063c: 99 'c'
    0x40063d: 61 '='
    0x40063e: 37 '%'
    0x40063f: 100 'd'
    0x400640: 10 '\n'
    0x400641: 0 '\000'

Compilation Process in C/C++
  Source Code (.c, .cpp, .h)
    → Preprocessing using preprocessor (cpp)
  Source code including headers and macros (.i, .ii)
    → Compilation using compiler (gcc, g++)
  Assembly Code (.s)
    → Assembling using assembler (as)
  Machine Code (.o, .obj)
    → Linking using linker (ld), together with Static Libraries (.lib, .a)
  Executable Machine Code

Vote using Pingo: https://pingo.coactum.de/685902
Assembly code is written using…?
▪ Machine code instructions
▪ Mnemonics
▪ Opcodes
▪ Bytecode

Terminology
▪ MACHINE CODE is code that is directly executable by the computer’s physical processor without further translation
▪ BYTECODE is code that can be executed by a virtual machine. Unlike machine code, bytecode is portable across platforms
▪ OPCODE is a number interpreted by a machine (virtual or silicon) that represents the operation to perform
▪ MNEMONICS are instructions (statements) in assembly language. Each instruction typically consists of an operation or opcode plus zero or more operands

Mnemonics vs. Machine Code vs. Opcode
[Figure: an instruction listing annotated with the labels machine code, mnemonics, opcodes and operands]
▪ No bytecode here since this code is machine-dependent

Background on Intel’s Instruction Set Architecture (ISA)
X86 REGISTERS, DATA TYPES AND BASIC ASSEMBLER INSTRUCTIONS

CPU Registers
▪ Tiny, fast storage available to the processor
  ▪ Addressed differently than main memory
  ▪ Much faster to access
▪ Different sets of registers are available
  ▪ General-purpose registers: used to store immediate values and memory addresses
  ▪ Segment registers: identify a segment in memory
  ▪ Program status and control register: stores status flags
  ▪ Instruction pointer register: holds the memory address of the instruction to be executed next

General-Purpose Registers on x86-64
▪ Sixteen 64-bit wide general-purpose registers (bits 63 to 0)
  rax  Accumulator Register: accumulator for intermediate arithmetic and logic results
  rbx  Base Register: base pointer for memory access
  rcx  Counter Register: counter for loop/string operations
  rdx  Data Register: I/O pointer (I/O port access, interrupt calls, etc.)
  rsp  Stack Pointer
  rbp  Base Pointer (pointer to data on stack)
  rsi  Source index pointer for string operations
  rdi  Destination index pointer for string operations
  r8   Caller-saved, e.g., function arguments
  ⋮
  r15  Callee-saved register for arbitrary use

8-, 16-, 32-bit General-Purpose Registers
                        64-bit   32-bit   16-bit   8-bit high   8-bit low
                        (63..0)  (31..0)  (15..0)  (15..8)      (7..0)
  Accumulator Register  rax      eax      ax       ah           al         (similar for rbx, rcx, rdx)
  Stack Pointer         rsp      esp      sp       -            spl        (similar for rbp, rsi, rdi)
  Register 8            r8       r8d      r8w      -            r8b        (similar for r9-r15)

Instruction Pointer Register
▪ The instruction pointer (rip, bits 63 to 0) points to the instruction that should be executed next
▪ rip is not a general-purpose register
  ▪ i.e., it cannot be used as an ordinary operand (for example as the destination of mov); it is changed by branch instructions such as call, jmp, ret
  ▪ x86-64 can additionally read it through rip-relative memory operands

Program Status and Control Register
▪ The rflags register (bits 63 to 0) stores the status of arithmetic (e.g., add, sub) and bit-wise instructions (e.g., and, or)
▪ Common flags in the rflags register:
  ▪ ZF (zero flag): result of the last arithmetic operation is zero
  ▪ SF (sign flag): result of the last arithmetic operation has a 1 in the most significant bit (i.e., the number is negative)
  ▪ CF (carry flag): indicates that the result requires a carry for unsigned numbers
  ▪ OF (overflow flag): indicates that the result overflows the maximum size for signed numbers

Data Types
▪ Byte: 8 bits (bits 7..0)
▪ Word: 16 bits (bits 15..0)
▪ Double Word (DWORD): 32 bits (bits 31..0)
▪ Quad Word (QWORD): 64 bits (bits 63..0)
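
The flag definitions from the rflags slide can be modelled in a few lines of C. This is only a sketch: the struct flags type and the add64_flags function are hypothetical names introduced here, and the code imitates what a 64-bit add would report rather than reading the real rflags register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the four status flags described in the lecture. */
struct flags {
    bool zf; /* result is zero */
    bool sf; /* most significant bit of the result is 1 */
    bool cf; /* unsigned carry out of bit 63 */
    bool of; /* signed overflow */
};

/* Model of "add dst,src" on 64-bit operands: stores the wrapped
 * result in *result and returns the flags the addition would set. */
struct flags add64_flags(uint64_t dst, uint64_t src, uint64_t *result)
{
    uint64_t r = dst + src; /* unsigned arithmetic wraps modulo 2^64 */
    struct flags f;

    f.zf = (r == 0);
    f.sf = (r >> 63) != 0;
    f.cf = (r < dst); /* the sum wrapped around, so a carry occurred */
    /* Signed overflow: both inputs share a sign that the result lacks. */
    f.of = (((dst ^ r) & (src ^ r)) >> 63) != 0;

    *result = r;
    return f;
}
```

Adding 1 to the largest unsigned value sets ZF and CF but not OF, whereas adding 1 to the largest positive signed value sets OF and SF but not CF, which is the distinction between the carry and overflow flags.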

Little Endianness on Intel CPUs
▪ “Endianness refers to the sequential order used to numerically interpret a range of bytes in computer memory” (Wikipedia)
▪ Example: Store double word 0x33DE01FF
      Big Endian            Little Endian
      Address   Value       Address   Value
      0x8000    0x33        0x8000    0xFF
      0x8001    0xDE        0x8001    0x01
      0x8002    0x01        0x8002    0xDE
      0x8003    0xFF        0x8003    0x33
▪ Intel CPUs use Little Endian format

Data Movement
▪ The mov instruction performs a data movement operation from source (src) to destination (dst)
    mov dst,src          dst := src
▪ Different types of data movement are available
  1. Immediate to register
  2. Register to register
  3. Immediate to memory
  4. Register to memory / memory to register

Immediate to Register
▪ Set register (reg) to immediate value (imm):
    mov reg,imm          reg := imm
▪ Examples:
    mov rax,0x12F2C001   rax := 0x12F2C001
    mov edx,0x25F0       edx := 0x25F0

Register to Register
▪ General syntax:
    mov reg1,reg2        reg1 := reg2
▪ Examples:
    mov rax,rbx          rax := rbx
    mov al,bl            al := bl

Immediate to Memory
    mov [reg],imm        MEM[reg] := imm
▪ Square brackets indicate dereferencing: the content of the register reg is used as a pointer to a location in MEM
▪ The immediate imm is stored at that memory location
▪ Examples:
    mov [rcx],0x2        MEM[rcx] := 0x2
    mov [rcx-0x4],0x1    MEM[rcx-0x4] := 0x1
    mov [rcx+0x8],0x4    MEM[rcx+0x8] := 0x4
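
The little-endian layout of the double word 0x33DE01FF from the endianness example can be reproduced in C. The sketch below writes the bytes explicitly, so it produces the same layout on any host; store_le32 and load_le32 are hypothetical helper names chosen for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value at mem[0..3] in little-endian order:
 * the least significant byte goes to the lowest address. */
void store_le32(uint8_t *mem, uint32_t value)
{
    for (size_t i = 0; i < 4; i++) {
        mem[i] = (uint8_t)(value >> (8 * i));
    }
}

/* Reassemble the 32-bit value from its little-endian bytes. */
uint32_t load_le32(const uint8_t *mem)
{
    uint32_t value = 0;
    for (size_t i = 0; i < 4; i++) {
        value |= (uint32_t)mem[i] << (8 * i);
    }
    return value;
}
```

Calling store_le32(mem, 0x33DE01FF) leaves 0xFF, 0x01, 0xDE, 0x33 in mem[0] to mem[3], matching the Little Endian column of the table.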

Register to Memory
    mov [reg1],reg2      MEM[reg1] := reg2
▪ Store the content of register reg2 at the memory location (MEM) pointed to by register reg1
▪ Examples:
    mov [rcx],rbx        MEM[rcx] := rbx
    mov [rax+rbx],rdx    MEM[rax+rbx] := rdx
    mov [rcx+0x8],edi    MEM[rcx+0x8] := edi

Memory to Register
    mov reg1,[reg2]      reg1 := MEM[reg2]
▪ Load data from the memory location pointed to by register reg2 into register reg1
▪ Examples:
    mov rax,[rbx]        rax := MEM[rbx]
    mov rcx,[rax+rbx]    rcx := MEM[rax+rbx]
    mov rdi,[rcx+0x8]    rdi := MEM[rcx+0x8]

Size Directive for Various Granularity
▪ Different granularity in memory dereferencing instructions is possible
  ▪ e.g., write one byte, word, dword, qword at a memory location
    mov byte ptr [rax+3],0xFF                  MEM[rax+3] := 0xFF (1 Byte)
    mov word ptr [rdi+2],0xFFFF                MEM[rdi+2] := 0xFFFF (2 Bytes)
    mov dword ptr [rbx],0xA000FFFF             MEM[rbx] := 0xA000FFFF (4 Bytes)
    mov qword ptr [r12],0xC000B000A000FFFF     MEM[r12] := 0xC000B000A000FFFF (8 Bytes)
  ▪ Note: the qword example is conceptual. A mov to memory encodes at most a 32-bit immediate, which is sign-extended to 64 bits; a full 64-bit constant is first loaded into a register and then stored

Some Subtle mov Instructions
▪ A single instruction can perform a memory read, an arithmetic operation, and a memory write at once
    inc qword ptr [rax]      MEM[rax] := MEM[rax]+1
▪ Complex memory dereference instructions are possible
    mov r12,[rax+rbx*4]      r12 := MEM[rax+rbx*4]

Increment/Decrement Instructions
▪ Increment and decrement operations are implemented with inc and dec
▪ Examples:
    inc rsi              rsi := rsi+1
    dec rdi              rdi := rdi-1
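
The complex dereference mov r12,[rax+rbx*4] follows the pattern base + index*scale + displacement, which is also how a C compiler typically addresses array elements. The following sketch models that address computation over a byte array standing in for MEM; effective_address and load_dword are hypothetical helpers, not part of any library.

```c
#include <stdint.h>
#include <string.h>

/* Effective address of the operand [base + index*scale + disp].
 * The arithmetic wraps modulo 2^64, like 64-bit address arithmetic. */
uint64_t effective_address(uint64_t base, uint64_t index,
                           uint64_t scale, int64_t disp)
{
    return base + index * scale + (uint64_t)disp;
}

/* Model of a 4-byte load from "memory": mem is a byte array and
 * address is an offset into it. memcpy avoids alignment problems. */
uint32_t load_dword(const uint8_t *mem, uint64_t address)
{
    uint32_t value;
    memcpy(&value, mem + address, sizeof value);
    return value;
}
```

With a uint32_t table[4], the element table[3] lives at effective_address(0, 3, 4, 0) bytes from the start of the table, which is exactly the [rax+rbx*4] form with rax as the table address and rbx as the index.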

The lea Instruction
▪ Loads Effective Address
▪ Special instruction that does not dereference memory although it uses square brackets []
    lea reg1,[reg2+offset]   reg1 := reg2+offset
▪ Example:
    lea rax,[rbx-0x4]        rax := rbx-0x4

Arithmetic Instructions
▪ Common instructions for addition, subtraction, multiplication, and division are add, sub, mul, and div, respectively
▪ mul and div take a single explicit operand and implicitly use rax and rdx
▪ Examples:
    add rax,rbx              rax := rax+rbx
    sub rax,rcx              rax := rax-rcx
    mul rbx                  rdx:rax := rax*rbx
    div qword ptr [rsp]      rax := rdx:rax / MEM[rsp], rdx := remainder

Bit-Level Operations
▪ Bit-level instructions perform bit-wise operations on one or two data values
▪ Examples:
    and rax,rax              rax := rax & rax (equivalent to rax := rax)
    or rax,rbx               rax := rax | rbx
    xor rdx,rdx              rdx := rdx ^ rdx (equivalent to rdx := 0)
    not rsi                  rsi := ~rsi

Unconditional Jump Instructions
▪ Unconditional jump instructions change the value of the instruction pointer rip to a specified address
▪ Direct jump instructions use a fixed target address
    jmp address              rip := address
    jmp function             rip := function
▪ Indirect jump instructions take the target address from either a general-purpose register or a memory operand
    jmp rax                  rip := rax
    jmp [rbx]                rip := MEM[rbx]
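
In C, the closest everyday analogue of an indirect control transfer such as jmp [rbx] is a call through a function pointer stored in memory. The sketch below is an illustration under that assumption; the names handler_fn, double_it, negate_it and dispatch are made up for this example, and whether a compiler emits an indirect call or an indirect jmp for it depends on the compiler and optimization level.

```c
#include <stddef.h>

/* Type of the code addresses stored in the table. */
typedef int (*handler_fn)(int);

int double_it(int x) { return 2 * x; }
int negate_it(int x) { return -x; }

/* A table of code addresses in memory, comparable to MEM[rbx]. */
static const handler_fn handlers[] = { double_it, negate_it };

/* Transfer control to whichever address is stored in handlers[slot].
 * The target is only known at run time, so the transfer is indirect.
 * No bounds check: slot must be 0 or 1 in this sketch. */
int dispatch(size_t slot, int x)
{
    return handlers[slot](x);
}
```

Because the target is read from memory at run time, an attacker who can overwrite an entry of such a table controls where execution continues, which is why indirect transfers matter for the exploitation techniques this lecture prepares for.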

Conditional Jump Instructions
▪ Conditional jump instructions are executed if a certain condition holds
▪ Conditions are managed via the rflags register
▪ Most of the time a conditional jump instruction is preceded by the compare instruction cmp
▪ There are several conditional jump instructions, e.g.,
  ▪ jle: jump if the destination operand of the preceding cmp instruction is less than or equal to the source operand
  ▪ jz: jump if the zero flag in the rflags register is set
  ▪ for more examples, check https://en.wikibooks.org/wiki/X86_Assembly/Control_Flow

Intel vs. AT&T Syntax (Extra material)
▪ We use Intel syntax in this lecture
    Intel:                   AT&T:
    mov rax,1                movq $1,%rax
    mov rbx,0xff             movq $0xff,%rbx
▪ In AT&T syntax, registers are prefixed with % and immediate values are prefixed with $
▪ The direction of operands in AT&T is opposite to that of Intel
▪ AT&T syntax mnemonics have a suffix: quad (64 bits), long (32 bits), word (16 bits), byte (8 bits)

Intel vs. AT&T Syntax (contd.) (Extra material)
▪ In Intel syntax the base register is enclosed in '[ ]' whereas in AT&T syntax it is enclosed in '( )':
    Intel:                       AT&T:
    mov [rcx],rbx                movq %rbx,(%rcx)
▪ Complex operations in AT&T are more obscure:
    mov rax,[rbx+0x20]           movq 0x20(%rbx),%rax
    add rax,[rbx+rcx*2]          addq (%rbx,%rcx,0x2),%rax
    lea rax,[rbx+rcx]            leaq (%rbx,%rcx),%rax
    sub rax,[rbx+rcx*4-0x20]     subq -0x20(%rbx,%rcx,0x4),%rax

Background on Intel’s Instruction Set Architecture (ISA)
MEMORY SEGMENTATION AND STACK OPERATIONS

Memory Segmentation (1)
▪ Exploitation of security bugs involves overwriting or overflowing one portion of memory into another → understanding memory management is crucial
▪ When a program is executed, the following steps take place:
  1. The OS creates an address space in which the program will run
  2. This address space includes the actual program instructions as well as any required data
  3. The stack and the heap are initialized

Memory Segmentation (2)
▪ Five segments: text, data, bss, stack, and heap
▪ text holds the program instructions
▪ data and bss segments are for global variables
  ▪ data contains static initialized data
  ▪ bss contains uninitialized data
▪ text is read-only, whereas data and bss are writeable
▪ stack is a data structure (LIFO) that grows down the address space
▪ heap is a dynamically allocated memory region (not a FIFO structure) that grows up the address space
    (Higher Address)
      Stack
      Heap
      bss   (Uninitialized Global Data)
      data  (Initialized Global Data)
      text
    (Lower Address)

Vote using Pingo: https://pingo.coactum.de/685902
In which memory segment will the following variable be stored?
    static float v = 0;
▪ Stack
▪ Heap
▪ Data
▪ Text

In Which Memory Segment are Variables Stored?
[Figure: the memory layout from Memory Segmentation (2), from Stack at the higher addresses down to text at the lower addresses]

Stack
▪ The stack is a last in, first out (LIFO) memory area where the Stack Pointer (rsp) points to the last stored element on the stack
▪ Typically, the stack grows downwards
▪ The stack can be accessed by two basic operations
  1. push elements onto the stack (rsp is decremented)
  2. pop elements off the stack (rsp is incremented)
EXAMPLE CODE
    push 0x8000AAFF      rsp is decremented and then points to the stored value 0x8000AAFF
    pop rax              rax := 0x8000AAFF, rsp is incremented
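
The push and pop behaviour described on the Stack slide can be imitated with a small C model. Everything here is an assumption made for illustration: struct machine, machine_init, push64 and pop64 are hypothetical names, the stack is a 64-byte array, and rsp is an offset into that array rather than a real address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64

/* Toy machine: a byte array as stack memory and an offset as rsp. */
struct machine {
    uint8_t  mem[STACK_BYTES];
    uint64_t rsp; /* offset of the last stored element */
};

void machine_init(struct machine *m)
{
    memset(m->mem, 0, sizeof m->mem);
    m->rsp = STACK_BYTES; /* empty stack: rsp at the highest address */
}

/* push: decrement rsp by 8, then store the value at the new rsp. */
void push64(struct machine *m, uint64_t value)
{
    assert(m->rsp >= 8); /* otherwise the toy stack would overflow */
    m->rsp -= 8;
    memcpy(&m->mem[m->rsp], &value, 8);
}

/* pop: load the value at rsp, then increment rsp by 8. */
uint64_t pop64(struct machine *m)
{
    uint64_t value;
    assert(m->rsp + 8 <= STACK_BYTES); /* the stack must not be empty */
    memcpy(&value, &m->mem[m->rsp], 8);
    m->rsp += 8;
    return value;
}
```

After push64(&m, 0x8000AAFF) the offset rsp is 8 lower than before, and a following pop64(&m) returns 0x8000AAFF and restores rsp, mirroring the push/pop pair in the example code.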

Stack Frame
▪ Each function is associated with one stack frame on the stack
▪ The rbp register is used to reference function arguments and local variables
▪ The rsp register holds the stack pointer and always points to the last element on the stack
▪ The stack grows downwards
    High Addresses
      Function Arguments
      Return Address
      Saved Base Pointer    <- Base Pointer (rbp)
      Local Variables       <- Stack Pointer (rsp)
    Low Addresses

Stack Frame (Text summary)
▪ The stack is divided into individual stack frames
▪ Each function call sets up a new stack frame on top of the stack
  1. Function arguments
     ▪ Arguments provided by the caller of the function
  2. Return address
     ▪ Upon function return (i.e., a return instruction is issued), control transfers to the code pointed to by the return address (i.e., control transfers back to the caller of the function)
  3. Saved Base Pointer
     ▪ Base pointer of the calling function
  4. Local variables
     ▪ Variables that the called function uses internally

Background on Intel’s Instruction Set Architecture (ISA)
FUNCTION CALLING

Calling Convention
[Animation: a stack with its stack pointer (labelled esp on the slide) next to the code of the caller, which contains call Function_A between other instructions, and the code of Function_A, which ends with ret]

Calling Convention
▪ Function calls are performed using the call instruction
  ▪ e.g., call Function_A
  ▪ The call instruction pushes the return address onto the stack
  ▪ The return address simply points to the next instruction after the call instruction
▪ Function returns are performed using the ret instruction
  ▪ The ret instruction pops the return address off the stack and loads it into the instruction pointer (rip)
  ▪ Hence, the execution will continue in the main function
[Animation, three stack states: before the call the stack pointer (labelled esp on the slide) points to the top element; after call Function_A it points to the newly pushed Return Address; after ret it points to the element above the Return Address again]
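
The effect of call and ret on rip and the stack can be sketched as a toy model in C. The names struct cpu, do_call and do_ret are hypothetical, the stack is an array of 64-bit slots indexed by rsp, and the length of the call instruction is passed in as a parameter because the model does not decode instructions.

```c
#include <stdint.h>

#define SLOTS 16

/* Toy CPU state: instruction pointer, stack pointer, and a small stack. */
struct cpu {
    uint64_t rip;
    uint64_t rsp;          /* index of the last stored slot; SLOTS when empty */
    uint64_t stack[SLOTS];
};

/* call target: push the address of the instruction after the call,
 * then continue at target. No bounds checks in this sketch. */
void do_call(struct cpu *c, uint64_t target, uint64_t call_length)
{
    uint64_t return_address = c->rip + call_length;
    c->stack[--c->rsp] = return_address;
    c->rip = target;
}

/* ret: pop the return address off the stack into rip. */
void do_ret(struct cpu *c)
{
    c->rip = c->stack[c->rsp++];
}
```

Starting with rip at a 5-byte call instruction, do_call leaves the address of the following instruction on the stack and do_ret brings rip back to it. If the stored slot is overwritten in between, do_ret continues at the overwritten value instead, which is the core idea behind return address attacks.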

Function Prologue and Epilogue by Example
[Animation, reproduced step by step. The code of a function consists of a Function Prologue, the function body (Instruction, …) and a Function Epilogue]
Stack on function entry:
      Function Arguments
      Return Address        <- rsp
Function Prologue, step 1: store the Base Pointer (rbp) of the caller on the stack (field: Saved Base Pointer)
    push rbp
Stack afterwards:
      Function Arguments
      Return Address
      Saved Base Pointer    <- rsp
Function Prologue, step 2: initialize the new Base Pointer
    mov rbp,rsp
Stack afterwards:
      Function Arguments
      Return Address
      Saved Base Pointer    <- rbp, rsp
Function Prologue, step 3: reserve space for local variables (16 bytes)
    sub rsp,16
Stack after the prologue:
      Function Arguments
      Return Address
      Saved Base Pointer    <- rbp
      Local Variables       <- rsp
Function body:
    Instruction, …
Function Epilogue, step 1: set the Stack Pointer (rsp) to the location where the Saved Base Pointer is stored
    leave
Stack after the first half of leave:
      Function Arguments
      Return Address
      Saved Base Pointer    <- rbp, rsp
      Local Variables
Function Epilogue, step 2: load the Saved Base Pointer into the Base Pointer Register (second half of leave)
Stack after leave has completed (rbp holds the caller's Base Pointer again):
      Function Arguments
      Return Address        <- rsp
      Saved Base Pointer
      Local Variables
Function Epilogue, step 3: issue the return to the caller
    ret
Stack after ret:
      Function Arguments    <- rsp
      Return Address
      Saved Base Pointer
      Local Variables
Complete code of the example:
    push rbp             ; Function Prologue
    mov rbp,rsp
    sub rsp,16
    Instruction, …
    leave                ; Function Epilogue
    ret

Reading Simple Addition Program
C source:
    #include <stdio.h>
    void add (int a, int b)
    {
        printf("%d\n", a+b);
    }
    int main(void)
    {
        int a, b;
        a=2;
        b=6;
        add(a, b);
        return 0;
    }
Compiled CODE:
    …
    0000000000400524 <add>:
    400524 push rbp                        ; Function prologue
    400525 mov rbp,rsp
    400528 sub rsp,0x10
    40052c mov DWORD PTR [rbp-0x4],edi
    40052f mov DWORD PTR [rbp-0x8],esi
    400532 mov eax,DWORD PTR [rbp-0x8]     ; a+b
    400535 mov edx,DWORD PTR [rbp-0x4]
    400538 add edx,eax
    40053a mov eax,0x40066c                ; Prepare arguments for printf
    40053f mov esi,edx                     ; esi := a+b
    400541 mov rdi,rax                     ; rdi := "%d\n"
    400544 mov eax,0x0
    400549 call 400418                     ; Call
    40054e leave                           ; Function epilogue
    40112c ret
    …
DATA:
    40066c db `%d\n`

Function Prologue (Text Summary)
▪ The purpose of the function prologue is
  ▪ to back up selected registers
  ▪ to reserve space for local variables
▪ Note: When the function is entered, the stack pointer points to the return address of the function
▪ Examples:
  ▪ Store the Base Pointer on the stack: push rbp
  ▪ Initialize the Base Pointer of the called function with the current value of the Stack Pointer: mov rbp,rsp
  ▪ Reserve space for local variables (e.g., 16 Bytes): sub rsp,16

Function Epilogue (Text Summary)
▪ The purpose of the function epilogue is
  ▪ to set the Stack Pointer back to its original state
  ▪ to restore selected registers
  ▪ to issue the return
▪ Examples:
  ▪ Reset the Stack Pointer (by overwriting it with rbp): mov rsp,rbp
  ▪ Restore the caller’s Base Pointer: pop rbp
  ▪ The leave instruction performs both of these steps at once
  ▪ Pop the return address from the stack and return to the caller: ret

Different Calling Conventions (Extra material)
▪ Calling conventions are compiler- and OS-specific
  ▪ Linux and macOS use the System V AMD64 ABI convention
  ▪ Windows and UEFI (firmware) follow a different convention
                            System V AMD64                      Microsoft x64
    Argument passing        first 6 in registers                first 4 in registers
                            (rdi, rsi, rdx, rcx, r8, r9),       (rcx, rdx, r8, r9),
                            rest on the stack, right to left    rest on the stack, right to left
    Callee-saved registers  rbx, rsp, rbp, r12-r15              rbx, rsp, rbp, rsi, rdi, r12-r15
    Return value            rax                                 rax
    Reserved memory         128 bytes Red Zone below stack      32 bytes Shadow Space per function on stack
Interrupt Handling
▪ Interrupts indicate that the CPU has to halt (interrupt) the execution of a program
▪ Intel has a dedicated interrupt instruction: int num
▪ Prominent int flavor is int3
  ▪ Utilized as a software breakpoint in debuggers
▪ Another example: int 0x80
  ▪ 32-bit Linux and Unix programs use this to perform a system call

Performing a System Call with syscall
▪ Requires a priori loading of the system call number in rax and the arguments (Arg) in general-purpose registers as follows:

System Call Number   Arg 1   Arg 2   Arg 3   Arg 4   Arg 5   Arg 6   Return Value
rax                  rdi     rsi     rdx     r10     r8      r9      rax

▪ Example:
    ssize_t write(           /* return value: rax, system call number: rax */
        unsigned int fd,     /* rdi */
        const void *buf,     /* rsi */
        size_t count         /* rdx */
    );

DIFFERENCES ON 32-BIT X86 SYSTEMS (Extra material)

32-bit x86 Registers
▪ 32-bit x86 processors do not have 64-bit registers
▪ Hence, register names begin with e
  ▪ e stands for extended (compared to 16-bit registers of 80286)
▪ edx:eax is used to process and store a 64-bit value

    Bit   63 … 32    31 … 0
          edx        eax

▪ Registers r8-r15 are not available

32-bit x86 System Calls
▪ Instead of using syscall, which was introduced with x86-64, 32-bit Linux uses interrupts (int 0x80)
▪ Register usage changes as follows

System Call Number   Arg 1   Arg 2   Arg 3   Arg 4   Arg 5   Arg 6   Return Value
eax                  ebx     ecx     edx     esi     edi     ebp     eax
32-bit x86 Calling Conventions
▪ Arguments are usually not passed in registers
▪ Many more different calling conventions exist (syscall, optlink, fastcall, register, TopSpeed, safecall, …)

                    pascal                    stdcall                   cdecl
                    16-bit Windows            32-bit Windows            Linux/Unix
Argument passing    from left to right;       from right to left;       from right to left;
                    args passed on the stack  args passed on the stack  args passed on the stack
Stack               callee responsible for    callee responsible for    caller responsible for
                    cleaning up the stack     cleaning up the stack     cleaning up the stack
Return value        (e)ax or dx:ax            eax                       (edx:)eax

Summary of the lecture
▪ Background information on the x86 Instruction Set Architecture:
  ▪ Program compilation process
  ▪ Processor registers, data types and basic assembler instructions
  ▪ Memory segmentation and stack operations
  ▪ Function calling conventions and system calls
  ▪ Differences from the instruction set for 32-bit processors
▪ With this knowledge, you are equipped to dig into the internals of software exploitation techniques
▪ Apply this knowledge in practice in the exercise sessions!

End of Class