Chapter3_4-6-v4.pdf
3.4 ARM Addressing Modes

Addressing Mode

Discussed previously:
– Register-to-register: e.g., ADD r1, r2, r3
– Literal or immediate: the actual value is part of the instruction (e.g., #12). For example, ADD r1, r2, #5 performs [r1] ← [r2] + 5

New: Register indirect

Register indirect addressing

Register indirect: a register contains the address of the operand.
e.g., LDR r1, [r0] copies to r1 the content of the memory location whose address is stored in r0.

Useful for accessing tables and arrays:
– e.g., ADD r0, r0, #4 followed by LDR r1, [r0] moves the next element of an array to r1.
– ARM has a facility for doing this in one instruction.

Register indirect addressing with an offset

(a) With a literal offset

Effective address of operand = base address contained in a register + literal offset

The literal offset is an unsigned 12-bit value (i.e., 0-4095 bytes).

Example: LDR r1, [r0, #4]

[Figure: memory at addresses 1000 and 1004]

The unsigned offset can also be subtracted from the base address: e.g., LDR r1, [r0, #-4]

(b) With the offset in a second register

LDR r2, [r0, r1]
– [r2] ← [[r0] + [r1]]
– load r2 with the content of the location pointed at by r0 plus r1

LDR r2, [r0, r1, LSL #2]
– [r2] ← [[r0] + 4 × [r1]]
– Register r1 is scaled by 4. This allows you to use a scaled offset when dealing with arrays.

The register offset can also be subtracted from the base address:
    LDR r2, [r0, -r1]
    LDR r2, [r0, -r1, LSL #2]

Pre-indexing and post-indexing

Elements in an array or similar data structure are frequently accessed sequentially. Auto-indexing addressing modes allow pointers to be automatically adjusted to point at the next element.

Two auto-indexing modes:
– Pre-indexed mode: pointer updated before the memory access (increment the address, then access memory)
– Post-indexed mode: pointer updated after the memory access (access memory, then increment the address)

Before operation: [r1] = 0x200, [0x200] = 0x05, [0x20C] = 0x77

After operation:
– LDR r0, [r1, #12]: [r1] = 0x200, [r0] = 0x77
– LDR r0, [r1, #12]! (pre-indexing): [r1] = 0x20C, [r0] = 0x77
– LDR r0, [r1], #12 (post-indexing): [r1] = 0x20C, [r0] = 0x05

ARM's load and store encoding

Knowing about the encoding provides an understanding of what options are available.

[Figure: bit layout of the load/store instruction word for ldr/str and ldrb/strb. W is applicable only to pre-indexed addressing and is always 0 for post-indexed addressing.]

W: applicable only to pre-indexed addressing. In the case of post-indexed addressing, the write-back bit is redundant and must be set to zero.

e.g., ldr r0, [r1], #4
The non-zero offset of 4 indicates that the programmer wants the incremented address written back to the base register. Otherwise, the programmer would have written ldr r0, [r1] (i.e., without the offset). Therefore, post-indexed data transfers always write back the modified base.

Example 1

e.g., strpl r4, [r2, -r6, LSL #2]!

opcode = 0101 0111 0010 0010 0100 0001 0000 0110 = 0x57224106

Bits   Description                            Code
31-28  Condition PL                           0101
27-26  Defines a load/store instruction       01
25     Use shifted register                   1
24     P: Pre-index                           1
23     U: Decrement pointer (the - in -r6)    0
22     B: Word                                0
21     W: Write back adjusted pointer         1
20     L: Store                               0
19-16  rbase = r2                             0010 (i.e., 2 in hex)
15-12  rtransfer = r4                         0100 (i.e., 4 in hex)
11-7   Shift = 2 in LSL #2                    00010 (i.e., 2 in hex)
6-5    LSL                                    00
4      Fixed                                  0
3-0    r6                                     0110 (i.e., 6 in hex)

Example 2

e.g., ldr r1, [r0], #4

opcode = 1110 0100 1001 0000 0001 0000 0000 0100 = 0xE4901004

Bits   Description                            Code
31-28  Condition AL                           1110
27-26  Defines a load/store instruction       01
25     Use 12-bit immediate value             0
24     P: Post-index                          0
23     U: Increment pointer                   1
22     B: Word                                0
21     W: Always 0 for post-indexing          0
20     L: Load                                1
19-16  rbase = r0                             0000 (i.e., 0 in hex)
15-12  rtransfer = r1                         0001 (i.e., 1 in hex)
11-0   Immediate value of 4                   0000 0000 0100 (i.e., 4 in hex)

Summary of ldr/str options

The address accessed by LDR/STR is specified by a base register plus an offset.

The offset can be:
– An unsigned 12-bit immediate value (i.e., 0-4095 bytes):
    LDR r0, [r1, #8]
– A register, optionally shifted by an immediate value:
    LDR r0, [r1, r2]
    LDR r0, [r1, r2, LSL #2]

The offset can be either added to or subtracted from the base register:
    LDR r0, [r1, #-8]
    LDR r0, [r1, -r2]
    LDR r0, [r1, -r2, LSL #2]

Choice of pre-indexed or post-indexed addressing.

ldr pseudoinstruction

Recall about literal encoding:
– A data processing instruction (including mov) has 12 bits available for the literal, consisting of an 8-bit immediate value and an encoded number of right rotations (ror).
– An error occurs when loading a literal using mov if the literal cannot be represented by an 8-bit value rotated right by an even number of bits.

How does ARM address this error so that all 32-bit values can be loaded?

LDR rd, =const

This will either:
– Produce a mov or mvn instruction to generate the value (if possible),
or
– Generate an ldr instruction with a PC-relative address to read the constant from a literal pool (a constant data area embedded in the code).

For example:
– LDR r0, =0xFF          =>  MOV r0, #0xFF
– LDR r0, =0x55555555    =>  LDR r0, [PC, #Imm12] … … 0x55555555 (the constant is stored in the literal pool)

This is the recommended way of loading constants into a register. It is called a pseudoinstruction because it is not part of the processor's instruction set.

Example use in Lab 2

[Figure: Lab 2 source code alongside the assembled code]

Result is the starting address of the word we try to allocate using .space. From the assembled code, you can see that Result represents the address 0x1020, so ldr r3, =Result is actually ldr r3, =0x1020.

The assembler generates the literal 0x1020 and puts it at address 0x1018 (i.e., at the end of the code section). This is an item in the literal pool.

The instruction ldr r3, [pc, #4] retrieves the content of address 0x1018 and puts it in r3. The address 0x1018 is 12 bytes after the ldr instruction (at 0x100C). For our example:
– pc = 0x100C + 8 = 0x1014 (pipelining)
– pc + 4 = 0x1018

ldr/str byte/halfword

We can load or store a byte (8 bits) or a halfword (16 bits).

We need to understand two ways of organizing data:
– Little Endian: most significant byte (MSB) stored at the highest address
– Big Endian: MSB stored at the lowest address

Example: the word 12345678 is stored, lowest address first, as
– Big Endian: 12 34 56 78
– Little Endian: 78 56 34 12

Little Endian

We use little endian throughout the course. For example,
    ldr r0, =0x40000200
    str r7, [r0]
would give:

[Figure: memory contents after the store, showing the bytes C5 A2 28 15]

Byte load

ldrb Rd, [Rx] loads one byte from the memory location pointed to by Rx into the least significant byte of Rd.

Byte store

strb Rd, [Rx] stores the least significant byte of Rd to the memory location pointed to by Rx.

Half-word load

ldrh Rd, [Rx] loads two bytes (a halfword) from the memory location pointed to by Rx into the least significant halfword (lower 16 bits) of Rd.
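As a rough illustration of the little-endian layout and the byte and half-word accesses described above, the following Python sketch models memory as a dictionary of bytes. The base address 0x1000 and all helper names are assumptions made for this sketch; they do not come from the course material, and the model ignores alignment rules.

```python
# Toy byte-addressable memory: address -> byte value (hypothetical model).
memory = {}

def store_word(addr, value):
    """Model of str: write 32 bits, least significant byte at the lowest address."""
    for i in range(4):
        memory[addr + i] = (value >> (8 * i)) & 0xFF

def store_byte(addr, rd):
    """Model of strb: write only the least significant byte of Rd."""
    memory[addr] = rd & 0xFF

def load_byte(addr):
    """Model of ldrb: read one byte, zero-extended to 32 bits."""
    return memory[addr]

def load_halfword(addr):
    """Model of ldrh: read two bytes (little endian), zero-extended to 32 bits."""
    return memory[addr] | (memory[addr + 1] << 8)

# Store the word 0x12345678 at the assumed address 0x1000.
store_word(0x1000, 0x12345678)
print([hex(memory[0x1000 + i]) for i in range(4)])  # bytes from lowest address up
print(hex(load_byte(0x1000)), hex(load_halfword(0x1000)))
```

In this model the lowest address holds 0x78, so a byte load from the base address returns 0x78 and a half-word load returns 0x5678, consistent with the little-endian ordering in the example above.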
Half-word store

strh Rd, [Rx] stores the lower 16 bits of Rd to the memory location pointed to by Rx.

Half-word/byte load with sign extension

Sign extension of a signed number:
– Positive number: e.g., extending the 8-bit number 0x56 to a 32-bit number results in 0x00000056.
– Negative number: e.g., extending the 8-bit number 0x82 to a 32-bit number results in 0xFFFFFF82. Here 0x82 is -0x7E. With 32 bits, this number is represented as the 2's complement of 0x0000007E, which is 0xFFFFFF82.
– Thus, sign extension involves copying the sign bit (D7) to the upper 24 bits of the 32-bit register. The half-word case is similar.

Byte load with sign extension

ldrsb r0, [r1]

Assume [r1] = 0x80000 and the content of memory location 0x80000 is 0x60; then [r0] = 0x00000060.

If instead the content of memory location 0x80000 is 0xFE (= -2), then [r0] = 0xFFFFFFFE.

Half-word load with sign extension

ldrsh r0, [r1]

Assume [r1] = 0x80000 and the content of memory location 0x80000 is 0x0104 (little endian: 0x04 at 0x80000, 0x01 at 0x80001); then [r0] = 0x00000104.

If instead the content of memory location 0x80000 is 0x8002 (= -32766 in decimal; 0x02 at 0x80000, 0x80 at 0x80001), then [r0] = 0xFFFF8002.

Half-word/byte load/store encoding

[Figure: encodings for ldrsb, ldrh/strh, and ldrsh]

For halfword and signed halfword/byte transfers, the offset can be:
– An unsigned 8-bit immediate value (i.e., 0-255 bytes).
– A register (unshifted).
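The sign-extension rule above (copy the sign bit into the upper bits) can be checked with a short Python sketch. The function names are hypothetical and chosen for this illustration only; the inputs are the values used in the ldrsb and ldrsh examples.

```python
def sign_extend(value, bits):
    """Extend a `bits`-wide two's-complement value to a 32-bit register image."""
    sign_bit = 1 << (bits - 1)
    value &= (1 << bits) - 1                  # keep only the low `bits` bits
    extended = (value ^ sign_bit) - sign_bit  # negative if the sign bit was set
    return extended & 0xFFFFFFFF              # 32-bit register contents

def to_signed32(value):
    """Interpret a 32-bit register image as a signed integer."""
    return value - (1 << 32) if value & 0x80000000 else value

# Values taken from the byte and half-word examples above.
for raw, bits in [(0x60, 8), (0xFE, 8), (0x0104, 16), (0x8002, 16)]:
    r0 = sign_extend(raw, bits)
    print(f"{raw:#x} -> {r0:08X} ({to_signed32(r0)})")
```

The XOR-and-subtract step is one common way to sign-extend in software; the hardware simply replicates the sign bit, and the result is the same.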
Regular load vs half-word/byte load

Regular ldr/str vs half-word/byte ldr/str

Legal:
    ldr  r0, [r1, #4000]
    ldrb r0, [r1, #4000]
    ldr  r0, [r1, r2]
    ldrh r0, [r1, r2]
    ldr  r0, [r1, r2, LSL #30]

Illegal:
    ldrh r0, [r1, #4000]          (offset is limited to 0 to 255)
    ldrh r0, [r1, r2, LSL #30]    (no shifting is allowed for ldrh)
    ldr  r0, [r1, r2, LSL r3]
    ldrh r0, [r1, r2, LSL r3]     (no dynamic shift offset is allowed for any ldr/str operation)
    ldr  r0, [r1, r2, LSL #40]    (shift is limited to 0 to 31)

Example: Lookup Table

This example converts a hexadecimal digit to its corresponding ASCII code according to the following lookup table:

Hexadecimal  ASCII
0            0x30
1            0x31
2            0x32
3            0x33
4            0x34
5            0x35
6            0x36
7            0x37
8            0x38
9            0x39
A            0x41
B            0x42
C            0x43
D            0x44
E            0x45
F            0x46

– .text: Identifies the code section of the program.
– .data: Defines a section that contains initialized data and variables rather than executable code.
– The list begins at the address labeled Data.
– The list values are comma-separated and are byte-sized (as defined by .byte).

Example: Convert the hex digit B to its corresponding ASCII code.
– Set [r2] = 0xB.
– Set the base address to be the address of the label Data, and store it in r1.
– The desired ASCII code is located at address [r1] + [r2].
– Use ldrb r0, [r1, r2] to retrieve the desired ASCII code.

3.5 Looping and Branching

Looping in ARM

Loop: repeating a sequence of instructions a certain number of times.

e.g., To add 9 to r0 five times, I could do:
    mov r0, #0
    mov r1, #9
    add r0, r0, r1
    add r0, r0, r1
    add r0, r0, r1
    add r0, r0, r1
    add r0, r0, r1

Disadvantage: too much code space is needed if the operation is repeated 100 times. Use looping instead.

Three steps in writing a for loop

Step 1: Initialize variables
– Count: specifies how many times you want to repeat the loop. (e.g., [r2] = 5 in previous example)
– Other variables depending on application.

Step 2: Identify statement(s) to repeat
– In the previous example, it would be: add r0, r0, r1

Step 3: Determine whether to stop looping
– Use subs r2, r2, #1 to decrement [r2] after one iteration (repetition) of the loop, and check whether [r2] is 0. Two possibilities:
  If [r2] = 0 (i.e., finished the desired number of iterations), exit the loop.
  If [r2] ≠ 0, branch back to Step 2.

Example 1

e.g., Write a program to (a) clear r0, (b) add 9 to r0 a thousand times.

         mov  r0, #0
         ldr  r2, =1000      // Step 1
AddMore: add  r0, r0, #9     // Step 2
         subs r2, r2, #1     // Step 3
         bne  AddMore        // branch taken while Z = 0

General format of a for loop

      MOV  r0, #10           // Step 1
Loop: code...                // Step 2
      SUBS r0, r0, #1        // Step 3
      BNE  Loop
      Post loop...           // fall through on zero count

Example 2

Write a program to place the value 0x55 into 100 consecutive bytes starting from address 0x1500.

Step 1: Initialization
– [r1] = 0x55555555 (each store writes 4 bytes)
– [r0] = 25
– [r2] = 0x1500

Step 2: Statements to repeat
– str r1, [r2], #4

Step 3: Termination conditions
– [r0] = [r0] – 1
– Stop repetition if [r0] = 0

       mov  r0, #25
       ldr  r1, =0x55555555
       ldr  r2, =0x1500
Again: str  r1, [r2], #4
       subs r0, r0, #1
       bne  Again
Here:  b    Here

While loop

Pseudocode:
    while expression
        statement(s)
    end

If expression is true, run statement(s) once, then test the expression again.

Assembly code:
Loop:      cmp r0, #0
           beq WhileExit
           ... statement(s)
           b   Loop
WhileExit: Post loop...      // Exit

Example 1 using while loop

e.g., Write a program to (a) clear r0, (b) add 9 to r0 a thousand times.

           mov r0, #0
           ldr r2, =1000     // Step 1
AddMore:   cmp r2, #0
           beq WhileExit
           add r0, r0, #9    // Step 2
           sub r2, r2, #1    // Step 3
           b   AddMore
WhileExit: Post loop         // Exit

Example on while loop from Lab 4

This example traverses a list and operates on each value of the list.
Let's have a good understanding of the list first:
– The list begins at the address labeled Data.
– The list values are comma-separated and are word-sized (as defined by .word).
– The address of the end of the list (i.e., immediately after the last item) is labeled _Data. This is important, as otherwise we have no way to know where the list ends.

Data items are stored correctly as displayed above because the list Data starts on a word boundary (an address divisible by 4). However, this is not guaranteed. For example, adding a single byte with a value of 1 in front of the Data list would leave the list unaligned.

The resulting memory looks like:

[Figure: memory contents with the list shifted by one byte]

The original list values are still there, but because they are now shifted by a byte, they no longer make sense in decimal. We can correct this with the .align directive:
– .align: fills addresses 1031 to 1033 with 0x00
– Writing of the first word starts at address 1034

This code traverses the list:

[Figure: code listing for the list traversal]

The load step of the loop does the following:
– Loads the value at the address stored in r2 into r0. This is initially the first element of the Data list.
– The value of r2 is then incremented by 4. r2 now points to the second element of the Data list.
– This continues until some end condition is reached.

The end condition in this case is to compare the current address in r2 against the address immediately following the end of the list, i.e., the address labeled _Data, held in r3:
– If [r3] > [r2], the end has not been reached, Z = 0, and the branch bne is taken.
– If [r3] = [r2], the end is reached, Z = 1, and bne is not taken.

Suppose we want to process a list of values and count how many positive, negative, and zero values are in the list. To do it:
1. We would use three registers to store the number of positive, negative, and zero values.
2. For each value we traverse using the above program, we test whether it is greater than, equal to, or less than 0 and add 1 to the appropriate register.

3.6 Subroutine call and return

Subroutine

A subroutine is a sequence of instructions that can be called from different places in a program.

Two reasons for creating subroutines:
– The problem is too big: it is easier to divide the problem into smaller sub-problems.
– There are several places in a program that need to perform the same operation.

Calling subroutine: Is it just branching?

Calling a subroutine involves the "jumping" part, which is the same as branching. But we need to get back to the main program after the subroutine returns. Thus, we need a location in which the return address can be stored.

Image courtesy of S. Katzen, The essential PIC18 Microcontroller, Springer

ARM support for subroutine

Use BL (branch with link):
– Performs the branch, and
– Stores the return address (i.e., the address of the instruction immediately below the bl instruction) into the r14 register (also called lr).

At the end of the subroutine, use BX lr to return [lr] to [pc].

Example

Suppose that you want to evaluate
    if x > 0 then x = 16x + 1 else x = 32x
several times in a program. Assuming that x is in r0, we can write:

Func1: cmp   r0, #0            // test for x > 0
       movgt r0, r0, lsl #4    // if x > 0 then x = 16x
       addgt r0, r0, #1        // if x > 0 then x = 16x + 1
       movle r0, r0, lsl #5    // else (x <= 0) x = 32x
       bx    lr                // [pc]
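To make the intent of Func1 concrete, here is a small Python sketch of the same computation on a 32-bit register value. It is only a behavioral model under the assumption that r0 holds x in two's complement; the names func1 and MASK32 are hypothetical, and the call and return via bl and bx lr are represented simply by an ordinary Python function call.

```python
MASK32 = 0xFFFFFFFF

def func1(r0):
    """Model of Func1: returns the new 32-bit contents of r0."""
    # Signed view of r0, which is what the gt/le conditions after cmp r0, #0 test.
    x = r0 - (1 << 32) if r0 & 0x80000000 else r0
    if x > 0:
        r0 = ((r0 << 4) + 1) & MASK32   # movgt + addgt: x = 16x + 1
    else:
        r0 = (r0 << 5) & MASK32         # movle: x = 32x
    return r0

print(func1(3))                  # x = 3 takes the "greater than" path
print(hex(func1(0xFFFFFFFF)))    # x = -1 takes the "less than or equal" path
```

Because the conditional instructions do not update the flags, all three of them act on the result of the single cmp, which is why the model needs only one if/else.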