Functions and Function Invocation - Practical Reverse Engineering

Unlike x86/x64, which has only one instruction for function invocation (CALL) and one for branching (JMP), ARM offers several, depending on how the destination is encoded. When you call a function, the processor needs to know where to resume execution after the function returns; this location is typically referred to as the return address. In x86, the CALL instruction implicitly pushes the return address on the stack before jumping to the target function; when it is done executing, the target function resumes execution at the return address by popping it off the stack into EIP.

The mechanism on ARM is essentially the same, with a few minor differences. First, the return address can be stored on the stack or in the link register (LR); to resume execution after the call, the return address is either explicitly popped off the stack into PC or reached with an unconditional branch to LR. Second, a branch can switch between ARM and Thumb state, depending on the destination address’s LSB. Third, ARM defines a standard calling convention: the first four 32-bit parameters are passed via registers (R0-R3) and the rest are passed on the stack. The return value is stored in R0.
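The calling convention above can be sketched in a few lines of Python. This is only an illustration: the helper name place_arguments is made up for this example, and the model covers plain 32-bit arguments only (64-bit and floating-point arguments follow additional rules that are not shown).

```python
def place_arguments(args):
    """Model where 32-bit arguments land under the ARM calling convention.

    Hypothetical helper for illustration: the first four arguments go
    to R0-R3, and the remainder are passed on the stack in order.
    """
    registers = {f"R{i}": value for i, value in enumerate(args[:4])}
    stack = list(args[4:])
    return registers, stack


regs, stack = place_arguments([10, 20, 30, 40, 50, 60])
print(regs)   # {'R0': 10, 'R1': 20, 'R2': 30, 'R3': 40}
print(stack)  # [50, 60]
```

When reading disassembly, this is why values moved into R0-R3 just before a BL or BLX are usually the callee's arguments.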

The instructions used for function invocations are B, BX, BL, and BLX.

Although it is rare to see B used in the context of function invocation, it can be used for transfer of control. It is simply an unconditional branch and is identical to the JMP instruction in x86. It is normally used inside loops and conditionals to go back to the beginning or break out; it can also be used to call a function that never returns. B can only use label offsets as its destination; it cannot use registers. In this context, the syntax of B is as follows: B imm, where imm is an offset relative to the current instruction. (This does not take into consideration the conditional execution flags, which are discussed in the “Branching and Conditional Execution” section.) One important fact to note is that because ARM and Thumb instructions are 4- and 2-byte aligned, the target offset needs to be an even number. Here is a snippet showing the usage of B:

01: 0001C788 B loc_1C7A8
02: 0001C78A
03: 0001C78A loc_1C78A
04: 0001C78A LDRB R7, [R6,R2]
05: ...
06: 0001C7A4 STRB.W R7, [R3,#-1]
07: 0001C7A8
08: 0001C7A8 loc_1C7A8
09: 0001C7A8 MOV R7, R3
10: 0001C7AA ADDS R3, #2
11: 0001C7AC CMP R2, R4
12: 0001C7AE BLT loc_1C78A

In line 1, you see B being used as an unconditional jump to start off a loop. You can ignore the other instructions for now.

BX is Branch and Exchange. It is similar to B in that it transfers control to a target, but it has the ability to switch between ARM/Thumb state, and the target address is stored in a register. Branching instructions that end with X are capable of switching between states. If the LSB of the target address is 1, then the processor automatically switches to Thumb state; otherwise, it executes in ARM state. The instruction format is BX <register>, where register holds the destination address. The two most common uses of this instruction are returning from a function by branching to LR (i.e., BX LR) and transferring control to code in a different state (i.e., going from ARM to Thumb or vice versa). In compiled code, you will almost always see BX LR at the end of functions; it is basically the same as RET in x86.
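The way BX interprets its register operand can be modeled as follows. This is a minimal sketch: the function name bx_target and the sample value 0x0001B225 are assumptions made for illustration, and the model looks only at bit 0 of the operand.

```python
def bx_target(value):
    """Split a BX operand into (state, address).

    Bit 0 selects the instruction set; it is not part of the address
    that ends up in PC.
    """
    state = "Thumb" if value & 1 else "ARM"
    address = value & ~1
    return state, address


# Hypothetical register value: an even code address with bit 0 set.
state, addr = bx_target(0x0001B225)
print(state, hex(addr))  # Thumb 0x1b224
```

This is also why pointers to Thumb functions usually appear as odd numbers in data, even though the code itself sits at an even address.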

BL is Branch with Link. It is similar to B except that it also stores the return address in LR before transferring control to the target offset. This is probably the closest equivalent of the CALL instruction in x86, and it is commonly used to invoke functions. The instruction format is the same as B (that is, it takes only offsets). Here is a short snippet demonstrating function invocation and returning:

01: 00014350 BL foo ; LR = 0x00014354
02: 00014354 MOVS R4, #0x15
03: ...
04: 0001B224 foo
05: 0001B224 PUSH {R1-R3}
06: 0001B226 MOV R3, 0x61240
07: ...
08: 0001B24C BX LR ; return to 0x00014354

Line 1 calls the function foo using BL; before transferring control to the destination, BL stores the return address (0x00014354) in LR. foo does some work and returns to the caller (BX LR).
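The call and return in this listing can be replayed with a toy model that tracks only PC and LR. The class TinyCpu and its methods are hypothetical names for this sketch. For simplicity it stores the plain return address in LR, as the listing's comment does, and ignores the state bit that real hardware keeps in bit 0 of LR when the caller runs in Thumb state.

```python
class TinyCpu:
    """Hypothetical model with only PC and LR; no memory, no flags."""

    def __init__(self, pc):
        self.pc = pc
        self.lr = 0

    def bl(self, target, size=4):
        # Save the address of the instruction after BL, then jump.
        self.lr = self.pc + size
        self.pc = target

    def bx_lr(self):
        # Return by branching to whatever LR currently holds.
        self.pc = self.lr


cpu = TinyCpu(pc=0x00014350)   # address of the BL in line 1
cpu.bl(0x0001B224)             # call foo
print(hex(cpu.lr))             # 0x14354
cpu.bx_lr()                    # foo returns
print(hex(cpu.pc))             # 0x14354
```

Because LR is a single register, a function that itself calls another function must save LR first, which is what the PUSH {..., LR} prologue pattern is for.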

BLX is Branch with Link and Exchange. It is like BL with the option to switch state. The major difference is that BLX can take either a register or an offset as its branch destination; in the case where BLX uses an offset, the processor always swaps state (ARM to Thumb and vice versa). Because it shares the same characteristics as BL, you can also think of it as the equivalent of the CALL instruction in x86. In practice, both BL and BLX are used to call functions. BL is typically used if the function is within a 32MB range, and BLX is used whenever the target range is undetermined (like a function pointer). When operating in Thumb state, BLX is usually used to call library routines; in ARM state, BL is used instead.

Having explored all instructions related to unconditional branching and direct function invocation, and how to return from a function (BX LR), you can consolidate your knowledge by looking at a full routine:

01: 0100C388 ; void *__cdecl mystery(int)
02: 0100C388 mystery
03: 0100C388 2D E9 30 48 PUSH.W {R4,R5,R11,LR}
04: 0100C38C 0D F2 08 0B ADDW R11, SP, #8
05: 0100C390 0C 4B LDR R3, =__imp_malloc
06: 0100C392 C5 1D ADDS R5, R0, #7
07: 0100C394 6F F3 02 05 BFC.W R5, #0, #3
08: 0100C398 1B 68 LDR R3, [R3]
09: 0100C39A 15 F1 08 00 ADDS.W R0, R5, #8
10: 0100C39E 98 47 BLX R3
11: 0100C3A0 04 46 MOV R4, R0
12: 0100C3A2 24 B1 CBZ R4, loc_100C3AE
13: 0100C3A4 EB 17 ASRS R3, R5, #0x1F
14: 0100C3A6 63 60 STR R3, [R4,#4]
15: 0100C3A8 25 60 STR R5, [R4]
16: 0100C3AA 08 34 ADDS R4, #8
17: 0100C3AC 04 E0 B loc_100C3B8
18: 0100C3AE loc_100C3AE
19: 0100C3AE 04 49 LDR R1, =aFailed ; "failed..."
20: 0100C3B0 2A 46 MOV R2, R5
22: 0100C3B4 01 F0 14 FC BL foo
23: 0100C3B8
24: 0100C3B8 loc_100C3B8
25: 0100C3B8 20 46 MOV R0, R4
26: 0100C3BA BD E8 30 88 POP.W {R4,R5,R11,PC}
27: 0100C3BA ; End of function mystery

This function covers several of the ideas discussed earlier (ignore the other instructions for now):

■ Line 3 is the prologue, using the PUSH {..., LR} sequence; line 26 is the epilogue.

■ Line 10 calls malloc via BLX.

■ Line 22 calls foo via BL.

■ Line 26 returns, using the POP {..., PC} sequence.
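The unconditional B in line 17 of the listing (bytes 04 E0 at 0100C3AC) can be decoded by hand to confirm its target. The sketch below assumes the 16-bit Thumb encoding of unconditional B: the top five bits are 11100, and the low eleven bits are a signed offset counted in halfwords from the instruction address plus 4. The function name decode_thumb_b is made up for this example.

```python
def decode_thumb_b(address, halfword):
    """Decode a 16-bit unconditional Thumb B and return its target.

    Encoding assumed here: top five bits 11100, low eleven bits are a
    signed offset in halfwords, relative to the instruction address + 4.
    """
    if halfword >> 11 != 0b11100:
        raise ValueError("not a 16-bit unconditional B")
    imm11 = halfword & 0x7FF
    if imm11 & 0x400:          # sign-extend the 11-bit field
        imm11 -= 0x800
    return (address + 4 + (imm11 << 1)) & 0xFFFFFFFF


raw = bytes.fromhex("04E0")                 # bytes as shown in the listing
halfword = int.from_bytes(raw, "little")    # 0xE004
print(hex(decode_thumb_b(0x0100C3AC, halfword)))  # 0x100c3b8
```

The result matches loc_100C3B8, the label the disassembler printed for line 17. The offset is shifted left by one bit, so the target is always even.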

Arithmetic Operations

After loading a value from memory into a register, the code can move it around and perform operations on it. The simplest operation is to move it to another register with the MOV instruction. The source can be a constant, a register, or something processed by the barrel shifter. Here are examples of its usage:

01: 4F F0 0A 00 MOV.W R0, #0xA ; r0 = 0xa
02: 38 46 MOV R0, R7 ; r0 = r7
03: A4 4A A0 E1 MOV R4, R4, LSR #21 ; r4 = (r4>>21)

Line 3 shows the source operand being processed by the barrel shifter before being moved to the destination. The barrel shifter’s operations include left shift (LSL), right shift (LSR, ASR), and rotate (ROR, RRX). The barrel shifter is useful because it allows the instruction to work on constants that cannot normally be encoded in immediate form. ARM and Thumb instructions can be either 16 or 32 bits wide, so they cannot directly have 32-bit constants as a parameter; with the barrel shifter, an immediate can be transformed into a larger value and moved to another register. Another way to move a 32-bit constant into a register is to split the constant into two 16-bit halves and move them one at a time; this is normally done with the MOVW and MOVT instructions. MOVT sets the top 16 bits of a register, and MOVW sets the bottom 16 bits.
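The shift operations and the MOVW/MOVT pair can be modeled on 32-bit values as follows. This is a sketch with made-up helper names. RRX is omitted because it depends on the carry flag. The model assumes MOVW zero-extends its immediate (clearing the top half), which is why it is issued before MOVT.

```python
MASK32 = 0xFFFFFFFF


def lsl(value, amount):
    """Logical shift left, truncated to 32 bits."""
    return (value << amount) & MASK32


def lsr(value, amount):
    """Logical shift right: zeros are shifted in from the top."""
    return (value & MASK32) >> amount


def asr(value, amount):
    """Arithmetic shift right: the sign bit is replicated."""
    value &= MASK32
    if value & 0x80000000:
        value -= 1 << 32
    return (value >> amount) & MASK32


def ror(value, amount):
    """Rotate right within 32 bits."""
    amount %= 32
    value &= MASK32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def movw(imm16):
    """MOVW: write the low 16 bits; the top half becomes zero."""
    return imm16 & 0xFFFF


def movt(reg, imm16):
    """MOVT: write the top 16 bits and keep the low half of reg."""
    return ((imm16 & 0xFFFF) << 16) | (reg & 0xFFFF)


print(hex(lsr(0xFFE00000, 21)))  # 0x7ff
print(hex(asr(0xFFE00000, 21)))  # 0xffffffff
r3 = movt(movw(0x1240), 0x0006)  # build 0x61240 from two halves
print(hex(r3))                   # 0x61240
```

The last example builds 0x61240, the constant that appears in the earlier foo listing, from two 16-bit halves.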

The basic arithmetic and logical operations are ADD, SUB, MUL, AND, ORR, and EOR. Here are examples of their usage:

01: 4B 44 ADD R3, R9 ; r3 = r3+r9
02: 0D F2 08 0B ADDW R11, SP, #8 ; r11 = sp+8
03: 04 EB 80 00 ADD.W R0, R4, R0,LSL#2 ; r0 = r4 + (r0<<2)
05: 03 FB 05 F2 MUL.W R2, R3, R5 ; r2 = r3*r5 (32bit result)
06: 14 F0 07 02 ANDS.W R2, R4, #7 ; r2 = r4 & 7 (flag)
07: 83 EA C1 03 EOR.W R3, R3, R1,LSL#3 ; r3 = r3 ^ (r1<<3)
08: 53 40 EORS R3, R2 ; r3 = r3 ^ r2 (flag)
09: 43 EA 02 23 ORR.W R3, R3, R2,LSL#8 ; r3 = r3 | (r2<<8)
10: 53 F0 02 03 ORRS.W R3, R3, #2 ; r3 = r3 | 2 (flag)
11: 13 43 ORRS R3, R2 ; r3 = r3 | r2 (flag)

Note the “S” after some of these instructions. Unlike x86, ARM arithmetic instructions do not set the condition flags by default. The “S” suffix indicates that the instruction should set the arithmetic condition flags (zero, negative, etc.) depending on its result. Note that the MUL instruction truncates the result such that only the bottom 32 bits are stored in the destination register; for full 64-bit multiplication, use the SMULL and UMULL instructions (see ARM TRM for the details).
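The effect of the “S” suffix and of MUL truncation can be illustrated with a small model. The function names are made up for this sketch, and the flag computation shown is for addition only (the N, Z, C, and V flags).

```python
MASK32 = 0xFFFFFFFF


def adds(a, b):
    """Return (result, flags) for a 32-bit add that updates the flags."""
    a &= MASK32
    b &= MASK32
    full = a + b
    result = full & MASK32
    flags = {
        "N": result >> 31,                 # negative: top bit of result
        "Z": int(result == 0),             # zero
        "C": int(full > MASK32),           # carry out of bit 31
        # signed overflow: operands share a sign that the result lacks
        "V": ((a ^ result) & (b ^ result)) >> 31,
    }
    return result, flags


def mul(a, b):
    """MUL: only the bottom 32 bits of the product are kept."""
    return (a * b) & MASK32


def umull(a, b):
    """UMULL: full 64-bit unsigned product as (low, high) words."""
    product = (a & MASK32) * (b & MASK32)
    return product & MASK32, product >> 32


print(adds(0xFFFFFFFF, 1))      # (0, {'N': 0, 'Z': 1, 'C': 1, 'V': 0})
print(mul(0x10000, 0x10000))    # 0
print(umull(0x10000, 0x10000))  # (0, 1)
```

An ADD without the suffix would produce the same result value but leave the flags untouched, so a following conditional branch would test whatever an earlier instruction set.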

Where is the divide instruction? ARM does not have a native divide instruction. (ARMv7-R and ARMv7-M cores have SDIV and UDIV, but they are not discussed here.) In practice, the runtime will have a software implementation for division, and code simply calls into it when needed. Here is an example with the Windows C runtime:

01: 41 46 MOV R1, R8
02: 30 46 MOV R0, R6
03: 35 F0 9E FF BL __rt_udiv ; software implementation of udiv
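To give a sense of what a software divide routine has to do, here is a classic shift-and-subtract unsigned division. This is not the actual implementation of __rt_udiv, whose internals and argument order are not shown here. The name udiv32 and the choice to raise an exception on division by zero are assumptions made for this sketch.

```python
def udiv32(dividend, divisor):
    """Unsigned 32-bit shift-and-subtract division.

    Returns (quotient, remainder). Illustrative only; a real runtime
    helper is typically hand-optimized assembly.
    """
    dividend &= 0xFFFFFFFF
    divisor &= 0xFFFFFFFF
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    quotient = 0
    remainder = 0
    for bit in range(31, -1, -1):
        # Bring down the next dividend bit, most significant first.
        remainder = (remainder << 1) | ((dividend >> bit) & 1)
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1 << bit
    return quotient, remainder


print(udiv32(100, 7))  # (14, 2)
```

The loop uses only shifts, compares, and subtractions, all of which exist as native ARM instructions.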
