Principles of Software Security IFN657 Lecture 3 PDF
Summary
This document is lecture notes for a postgraduate course on principles of software security. It covers topics like x86 architecture, assembly language basics, and different levels of abstraction in computer science.
Full Transcript
Principles of Software Security IFN657 Lecture 3

Key Points from Last Lecture
- The C language is efficient and error-prone
  - Close to the machine model, flexible memory management
- C vs C#
  - C is type-unsafe, with no out-of-bounds checks, no string length checks, and manual memory management
- Computer memory is broken into sections
  - The stack grows as function calls are made
  - The heap grows as memory is dynamically allocated

Lecture Outline
- x86 architecture
- Assembly basics
- Machine vs. Assembly vs. C

Levels of Abstraction
- Hardware consists of electrical circuits that implement complex combinations of logical operators such as XOR, AND, OR, and NOT gates
- The microcode level is also known as firmware
- The machine code level consists of opcodes, hexadecimal digits that tell the processor what you want it to do
- Low-level languages are human-readable versions of a computer architecture's instruction set
- High-level languages (e.g. C/C++) are turned into machine code by compilation
- Interpreted languages are not directly compiled into machine code
  - They are translated into bytecode and executed within an interpreter

Assembly
- Assembly is the highest-level language that can be reliably and consistently recovered from machine code when high-level language source code is not available
- Vulnerable code and malware are typically stored in binary form at the machine code level
- A disassembler takes the binary as input and generates assembly language code as output
- Assembly language is actually a class of languages; we will concentrate on x86

Assemblers and Linkers

Assembler vs. Compiler
Compiler:
- Translates high-level programming language code to machine-level code
- Input: source code in a high-level programming language
- Checks and converts the complete code at one time
- Phases: lexical analyzer, syntax analyzer, semantic analyzer, code optimizer, code generator, and error handler
- Output: mnemonic version of machine code
- Examples: C, C++, Java compilers
Assembler:
- Converts assembly language code to machine-level code
- Input: assembly-level code
- Generally does not convert the complete code at one time
- Works in two passes
- Output: binary version of machine code
- Example: GAS (the GNU Assembler)

AT&T vs. Intel (NASM)
- Two main forms of assembly syntax
  - NASM format (destination first, then source): mov eax, 10
  - AT&T format (source first, then destination): mov $10, %eax
- AT&T format reverses the order of operands, and uses a % before registers and a $ before literal values

Fundamental Data Types

Memory

Data in Memory
- Little-endian format: the low-order byte is stored at the lower address

The x86 Architecture
- The internals of most modern computer architectures follow the Von Neumann architecture, illustrated in Figure 4-2, which has three hardware components:
  - The central processing unit (CPU) executes code
  - The main memory of the system (RAM) stores all data and code
  - An input/output system (I/O) interfaces with devices such as hard drives, keyboards, and monitors
- Figure 4-2: Von Neumann architecture (diagram not reproduced; it shows a CPU containing registers, a control unit and an ALU, connected to main memory (RAM) and to input/output devices)

CPU Registers
- A small amount of data storage available to the CPU
- Can be accessed more quickly than any other storage
- General registers are used by the CPU during execution
- Segment registers track sections of memory
- Status flags are used to make decisions
- Instruction pointers keep track of the next instruction to execute

x86 Registers
- General registers: EAX (AX, AH, AL), EBX (BX, BH, BL), ECX (CX, CH, CL), EDX (DX, DH, DL), EBP (BP), ESP (SP), ESI (SI), EDI (DI)
- Segment registers: CS, SS, DS, ES, FS, GS
- Status register: EFLAGS
- Instruction pointer: EIP

General Registers
- Store data or memory addresses

x64 Registers

Data Registers
- EAX: the primary accumulator, used in input/output and most arithmetic instructions
- EBX: the base register, which can be used in indexed addressing
- ECX: the count register, which stores the loop count in iterative operations
- EDX: the data register, also used in input/output operations, sometimes along with EAX (for example as the EDX:EAX pair in mul and div)

Index Registers
- ESI: used as the source index for string operations
- EDI: used as the destination index for string operations

Segment Registers
- CS: points to the code segment, containing all the instructions to be executed
- DS: points to the data segment, containing global data, constants and work areas
- SS: points to the stack segment, containing local data and return addresses of procedures or subroutines
- Other segment registers, ES, FS and GS, provide additional segments for storing data
- All memory locations within a segment are relative to the starting address of the segment

Status Register
- The EFLAGS register holds the status flags; each flag is a single bit
- ZF: the zero flag is set when the result of an operation is equal to zero
- CF: the carry flag is set when the result of an operation is too large or too small for the destination operand
- SF: the sign flag is set when the result of an operation is negative
- TF: the trap flag is used for debugging
  - The x86 processor will execute only one instruction at a time if this flag is set
  - Used by debuggers such as gdb for single-stepping

Instruction Pointer - EIP
- Also known as the program counter
- The register contains the offset address of the next instruction to be executed
- EIP's only purpose is to tell the processor what to do next
- Complete address of that instruction in the code segment: cs:eip
- When you control EIP, you control what is executed by the CPU
- Attackers typically attempt to gain control of EIP through exploitation

Other Pointer Registers
- ESP: the stack pointer register provides the offset value within the program stack
  - ss:esp refers to the top of the stack (the current position of data or address within the program stack)
- EBP: the base pointer register mainly helps in referencing the parameter variables passed to a subroutine (e.g. a function call)
  - ss:ebp refers to the stack frame of the current call

Lecture Outline
- x86 architecture
- Assembly basics

Simple Instructions: mov
Instruction              Description
mov eax, ebx             Copies the contents of EBX into the EAX register
mov eax, 0x42            Copies the value 0x42 into the EAX register
mov eax, [0x4037C4]      Copies the 4 bytes at the memory location 0x4037C4 into the EAX register
mov eax, [ebx]           Copies the 4 bytes at the memory location specified by the EBX register into the EAX register
mov eax, [ebx+esi*4]     Copies the 4 bytes at the memory location specified by the result of the equation ebx+esi*4 into the EAX register

Simple Instructions: mov vs. lea
- The lea (load effective address) instruction puts a memory address into the destination
  - e.g. lea eax, [ebx+8] puts EBX+8 into EAX
  - e.g. mov eax, [ebx+8] loads the data at the memory address EBX+8
- Why do we need lea? Because mov eax, ebx+8 is invalid: mov cannot compute an arithmetic expression outside the square brackets
- Example (register and memory diagram not reproduced; the values imply EBX = 0xB30040 and that the 4 bytes at 0xB30048 hold 0x20):
  - mov eax, [ebx+8] places the value 0x20 into EAX
  - lea eax, [ebx+8] places the value 0xB30048 into EAX

Simple Instructions: Arithmetic
Instruction       Description
sub eax, 0x10     Subtracts 0x10 from EAX
add eax, ebx      Adds EBX to EAX and stores the result in EAX
inc edx           Increments EDX by 1
dec ecx           Decrements ECX by 1
mul 0x50          Multiplies EAX by 0x50 and stores the result in EDX:EAX
div 0x75          Divides EDX:EAX by 0x75 and stores the result in EAX and the remainder in EDX
Note: the immediate operands shown for mul and div are shorthand. On real x86 hardware, mul and div take a register or memory operand, so the constant would first be loaded into a register.

Simple Instructions: Logic
Instruction                      Description
xor eax, eax                     Clears the EAX register
or eax, 0x7575                   Performs the logical or operation on EAX with 0x7575
mov eax, 0xA / shl eax, 2        Shifts the EAX register to the left 2 bits; these two instructions result in EAX = 0x28, because 1010 (0xA in binary) shifted 2 bits left is 101000 (0x28)
mov bl, 0xA / ror bl, 2          Rotates the BL register to the right 2 bits; these two instructions result in BL = 10000010, because 00001010 rotated 2 bits right is 10000010

Simple Instructions: nop
- The nop instruction does nothing
  - Execution simply proceeds to the next instruction
- nop is actually a pseudonym for xchg eax, eax, but since exchanging EAX with itself does nothing, it is popularly referred to as NOP (no operation)
- The opcode for this instruction is 0x90
- Commonly used to provide execution padding for buffer overflow attacks
  - Useful when attackers don't have perfect control of their exploitation

The Stack
- Short-term memory storage for functions, local variables, and flow control
- The stack is a last in, first out (LIFO) structure
- ESP is the stack pointer, which points to the top of the stack
- EBP is the base pointer, which keeps track of the location of local variables and parameters
- The stack instructions include push, pop, call, leave, enter, and ret

Function Calls
1. Arguments are placed on the stack using push instructions
2. A function is called using call. The return address (the contents of EIP, i.e. the address of the instruction after the call) is pushed onto the stack, and EIP is set to the start address of the called function
3. EBP is pushed onto the stack (saving it for the calling function) and space is allocated on the stack for local variables
4. The function performs its work
5. The stack is restored: the local variables are freed and the calling function's EBP is restored
6. EIP is restored from the saved return address (ret) so that the calling program continues to execute
7. Arguments are removed from the stack

Stack Layout

Conditionals: test and cmp
- The test instruction is identical to and, but the operands are not modified
  - Only the flags are set; typically ZF (the zero flag) is the one of interest
- The cmp instruction is identical to sub, but the operands are not modified
  - Only the flags are set; ZF and CF (the carry flag) are the ones of interest here
- Result of cmp dst, src (operands compared as unsigned values):
  Condition      ZF   CF
  dst = src      1    0
  dst < src      0    1
  dst > src      0    0

Branching: Jump instructions
- A branch is a sequence of code that is conditionally executed depending on the flow of the program
- The term branching describes the control flow through the branches of a program
- An unconditional jump always transfers to the target location
  - jmp
- Conditional jumps use the flags to determine whether to jump or to continue with the next instruction
  - More than 30 different types of conditional jumps!
  - e.g. jz

Examples of Conditional Jumps
Instruction   Description
jz loc        Jump to specified location if ZF = 1.
jnz loc       Jump to specified location if ZF = 0.
je loc        Same as jz, but commonly used after a cmp instruction. Jump will occur if the destination operand equals the source operand.
jne loc       Same as jnz, but commonly used after a cmp. Jump will occur if the destination operand is not equal to the source operand.
jg loc        Performs signed comparison jump after a cmp if the destination operand is greater than the source operand.
jge loc       Performs signed comparison jump after a cmp if the destination operand is greater than or equal to the source operand.
ja loc        Same as jg, but an unsigned comparison is performed.
jae loc       Same as jge, but an unsigned comparison is performed.
jl loc        Performs signed comparison jump after a cmp if the destination operand is less than the source operand.
jle loc       Performs signed comparison jump after a cmp if the destination operand is less than or equal to the source operand.
jb loc        Same as jl, but an unsigned comparison is performed.
jbe loc       Same as jle, but an unsigned comparison is performed.
jo loc        Jump if the previous instruction set the overflow flag (OF = 1).
js loc        Jump if the sign flag is set (SF = 1).
jecxz loc     Jump to location if ECX = 0.
C Main Method and Offsets
- Standard two-argument main method: int main(int argc, char ** argv)
- The parameters argc and argv are given at runtime:
    filetestprogram.exe -r filename.txt
- Results of argc and argv when the program is run:
    argc = 3
    argv[0] = filetestprogram.exe
    argv[1] = -r
    argv[2] = filename.txt

A Simple C Program
    #include <string.h>
    #include <windows.h>

    int main(int argc, char* argv[])
    {
        /* Expect exactly three arguments: program name, option, file name */
        if (argc != 3) { return 0; }
        /* If the option is "-r", delete the named file */
        if (strncmp(argv[1], "-r", 2) == 0) {
            DeleteFileA(argv[2]);
        }
        return 0;
    }

A Simple C Program: In compiled form
    004113CE  cmp   [ebp+argc], 3        ; LOCATION 1
    004113D2  jz    short loc_4113D8
    004113D4  xor   eax, eax
    004113D6  jmp   short loc_411414
    004113D8  mov   esi, esp
    004113DA  push  2                    ; MaxCount
    004113DC  push  offset Str2          ; "-r"
    004113E1  mov   eax, [ebp+argv]
    004113E4  mov   ecx, [eax+4]
    004113E7  push  ecx                  ; Str1
    004113E8  call  strncmp              ; LOCATION 2
    004113F8  test  eax, eax
    004113FA  jnz   short loc_411412
    004113FC  mov   esi, esp             ; LOCATION 3
    004113FE  mov   eax, [ebp+argv]
    00411401  mov   ecx, [eax+8]
    00411404  push  ecx                  ; lpFileName
    00411405  call  DeleteFileA

Home Readings
- Learn NASM Assembly: https://www.tutorialspoint.com/assembly_programming/
- This can be used as a reference for expanding your assembly knowledge