Basic Features - Practical Reverse Engineering

and OMAP, respectively), but their core architecture is licensed from ARM.

They all implement the base instruction set and memory model defined in the ARM architecture reference manual. Additional extensions can be added to the processor; for example, the Jazelle extension enables Java bytecode to be executed natively on the processor. The Thumb extension adds instructions that can be 16 or 32 bits wide, thus allowing higher code density (native ARM instructions are always 32 bits wide). The Debug extension allows engineers to analyze the physical processor using special debugging hardware. Each extension is typically represented by a letter (J, T, D, etc.). Depending on their requirements, manufacturers can decide whether they need to license these additional extensions. This is why ARMv6 and earlier processors have letters after them (e.g., ARM1156T2 means ARMv6 with the Thumb-2 extension). These conventions are no longer used in ARMv7, which instead uses three profiles (Application, Real-time, and Microcontroller) and a model name (Cortex) with different features. For example, the ARMv7 Cortex-A series are processors with the application profile, whereas Cortex-M processors are meant for microcontrollers and support only Thumb execution.

This chapter covers the ARMv7 architecture as defined in the ARM Architecture Reference Manual: ARMv7-A and ARMv7-R Edition (ARM DDI 0406B).

highest privilege and ring 3 having the lowest. In ARM, privileges are defined by eight different modes:

■ User (USR)

■ Fast interrupt request (FIQ)

■ Interrupt request (IRQ)

■ Supervisor (SVC)

■ Monitor (MON)

■ Abort (ABT)

■ Undefined (UND)

■ System (SYS)

Code running in a given mode has access to certain privileges and registers that others may not; for example, code running in USR mode is not allowed to modify system registers (which are typically modified only in SVC mode).

USR is the least privileged mode. While there are many technical differences, for the sake of simplicity you can make the analogy that USR is like ring 3 and SVC is like ring 0. Most operating systems implement kernel mode in SVC and user mode in USR. Both Windows and Linux do this.
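As a rough illustration of how a tool might recover the current mode, the sketch below decodes the mode field of a 32-bit status register value. It assumes the encoding given in the ARMv7-A/R reference manual, where the mode occupies the low five bits of the current program status register (CPSR, which this chapter introduces shortly). The helper names `mode_name` and `is_privileged` are made up for this example.

```python
# Mode encodings for CPSR bits [4:0], per the ARMv7-A/R reference manual.
MODE_BITS = {
    0b10000: "USR",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "SVC",
    0b10110: "MON",
    0b10111: "ABT",
    0b11011: "UND",
    0b11111: "SYS",
}


def mode_name(cpsr):
    """Return the mode abbreviation for a CPSR value (hypothetical helper)."""
    return MODE_BITS.get(cpsr & 0x1F, "reserved")


def is_privileged(cpsr):
    """Every defined mode other than USR is treated as privileged."""
    return mode_name(cpsr) not in ("USR", "reserved")


if __name__ == "__main__":
    for value in (0x00000010, 0x600001D3):
        print(hex(value), mode_name(value), is_privileged(value))
```

In a debugger, the same masking (`cpsr & 0x1F`) is usually the quickest way to tell whether a crash dump was taken in user or kernel context.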

If you recall from Chapter 1, x64 processors can execute in 32-bit, 64-bit, or both interchangeably. ARM processors are similar in that they can also operate in two states: ARM and Thumb. ARM/Thumb state determines only the instruction set, not the privilege level. For example, code running in SVC mode can be either ARM or Thumb. In ARM state, instructions are always 32 bits wide; in Thumb state, instructions can be either 16 bits or 32 bits wide. Which state the processor executes in depends on two conditions:

■ When branching with the BX and BLX instructions, if the destination register’s least significant bit is 1, then the processor switches to Thumb state. (Although instructions are either 2- or 4-byte aligned, the processor ignores the least significant bit, so there won’t be alignment issues.)

■ If the T bit in the current program status register (CPSR) is set, then the processor is in Thumb state. The semantics of the CPSR are explained in the following section, but for now you can think of it as an extended EFLAGS register in x86.
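The two conditions above can be modeled together in a few lines. This is a simplified sketch, not a description of the hardware: it assumes the T bit sits at bit 5 of the CPSR (as in the ARMv7-A/R reference manual) and that a register branch with bit 0 clear selects ARM state. The function names `bx` and `state` are illustrative only.

```python
T_BIT = 1 << 5  # position of the T (Thumb) bit in the CPSR


def bx(target, cpsr):
    """Model the state change of a BX/BLX register branch (hypothetical helper).

    Returns (new_pc, new_cpsr). Bit 0 of the target selects the state and is
    not part of the address that is actually fetched from. This model does
    not check that an ARM-state target is 4-byte aligned.
    """
    target &= 0xFFFFFFFF
    if target & 1:
        return target & 0xFFFFFFFE, cpsr | T_BIT   # switch to / stay in Thumb
    return target, cpsr & ~T_BIT                   # switch to / stay in ARM


def state(cpsr):
    """Report the instruction set state implied by the T bit."""
    return "Thumb" if cpsr & T_BIT else "ARM"


if __name__ == "__main__":
    pc, cpsr = bx(0x00008001, 0x00000010)  # odd target: Thumb code at 0x8000
    print(hex(pc), state(cpsr))
    pc, cpsr = bx(0x00008000, cpsr)        # even target: back to ARM
    print(hex(pc), state(cpsr))
```

This is also why function pointers to Thumb code appear as odd addresses in disassembly listings: the low bit carries the state, not a location.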

When an ARM core boots up, most of the time it enters ARM state and remains that way until there is an explicit or implicit change to Thumb. In practice, much recent operating system code is mainly Thumb code because higher code density is wanted (a mixture of 16- and 32-bit instructions may be smaller than all 32-bit ones); applications can operate in whatever state they want.

While most Thumb and ARM instructions have the same mnemonic, 32-bit Thumb instructions have a .W suffix.

NOTE It is a common misconception to think that Thumb is like real mode and ARM is like protected mode on x86/x64. Do not think of it this way. Most operating systems on the x86/x64 platform run in protected mode and rarely, if ever, switch back to real mode. Operating systems and applications on the ARM platform can execute both in ARM and Thumb state interchangeably. Note also that these states are completely different from the privilege modes explained in the previous paragraph (USR, SVC, etc.).

There are two versions of Thumb: Thumb-1 and Thumb-2. Thumb-1 was used in ARMv6 and earlier architectures, and its instructions are always 16 bits in width. Thumb-2 extends that by adding more instructions and allowing them to be either 16 or 32 bits in width. ARMv7 requires Thumb-2, so whenever we talk about Thumb, we are referring to Thumb-2.
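Because Thumb-2 mixes 16- and 32-bit instructions, a disassembler has to decide the width of each instruction before it can find the next one. The sketch below assumes the rule from the ARMv7 reference manual: a halfword whose top five bits are 0b11101, 0b11110, or 0b11111 is the first half of a 32-bit instruction, and anything else is a complete 16-bit instruction. The function names and the sample bytes are chosen for illustration.

```python
import struct


def thumb_instruction_size(first_halfword):
    """Return 2 or 4: the size in bytes of the Thumb instruction that
    starts with this halfword (hypothetical helper)."""
    top5 = (first_halfword >> 11) & 0x1F
    return 4 if top5 in (0b11101, 0b11110, 0b11111) else 2


def instruction_sizes(code):
    """Walk little-endian Thumb code and list the size of each instruction."""
    sizes = []
    offset = 0
    while offset + 2 <= len(code):
        (halfword,) = struct.unpack_from("<H", code, offset)
        size = thumb_instruction_size(halfword)
        sizes.append(size)
        offset += size
    return sizes


if __name__ == "__main__":
    # Halfwords 0xB510, 0xF000 0xF800, 0xBD10 stored little-endian.
    sample = bytes.fromhex("10b5" "00f000f8" "10bd")
    print(instruction_sizes(sample))
```

The walk only works if it starts on a real instruction boundary; starting in the middle of a 32-bit instruction produces a plausible but wrong listing, which is a common source of confusion when disassembling Thumb code by hand.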

There are several other differences between ARM and Thumb states but we cannot cover them all here. For example, some instructions are available in ARM state but not Thumb state, and vice versa. You can consult the official ARM documentation for more details.

In addition to having different states of execution, ARM also supports conditional execution. This means that an instruction encodes certain arithmetic conditions that must be met in order for it to be executed. For example, an instruction can specify that it will only be executed if the result of the previous instruction is zero. Contrast this with x86, for which almost every single instruction is executed unconditionally. (Intel has a couple of instructions directly supporting conditional execution: CMOV and SETNE.) Conditional execution is useful because it cuts down on branch instructions (which are very expensive) and reduces the number of instructions to be executed (which leads to higher code density). All instructions in ARM state support conditional execution, but by default they execute unconditionally. In Thumb state, a special instruction IT is required to enable conditional execution.
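To make the idea of "arithmetic conditions" concrete, here is a small model of how a condition suffix is checked against the N, Z, C, and V flags. It is only a sketch: the condition table follows the standard ARM condition codes, while `cmp_flags` and `condition_passed` are hypothetical helpers that stand in for a CMP instruction and the processor's condition check.

```python
MASK = 0xFFFFFFFF

# Standard ARM condition codes as predicates over the N, Z, C, V flags.
CONDITIONS = {
    "EQ": lambda n, z, c, v: z,
    "NE": lambda n, z, c, v: not z,
    "CS": lambda n, z, c, v: c,
    "CC": lambda n, z, c, v: not c,
    "MI": lambda n, z, c, v: n,
    "PL": lambda n, z, c, v: not n,
    "VS": lambda n, z, c, v: v,
    "VC": lambda n, z, c, v: not v,
    "HI": lambda n, z, c, v: c and not z,
    "LS": lambda n, z, c, v: (not c) or z,
    "GE": lambda n, z, c, v: n == v,
    "LT": lambda n, z, c, v: n != v,
    "GT": lambda n, z, c, v: (not z) and n == v,
    "LE": lambda n, z, c, v: z or n != v,
    "AL": lambda n, z, c, v: True,
}


def cmp_flags(a, b):
    """Return (N, Z, C, V) as set by comparing two 32-bit values (a - b)."""
    a &= MASK
    b &= MASK
    result = (a - b) & MASK
    n = bool(result >> 31)
    z = result == 0
    c = a >= b                                    # carry set means no borrow
    v = bool(((a ^ b) & (a ^ result)) >> 31)      # signed overflow
    return n, z, c, v


def condition_passed(cond, flags):
    """Decide whether an instruction with this condition suffix executes."""
    return bool(CONDITIONS[cond](*flags))


if __name__ == "__main__":
    flags = cmp_flags(3, 5)                 # like CMP R0, #5 with R0 = 3
    print(condition_passed("LT", flags))    # would a MOVLT execute?
    print(condition_passed("EQ", flags))    # would a MOVEQ execute?
```

An instruction with no suffix behaves like `AL`, which matches the statement above that ARM instructions execute unconditionally by default.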

Another unique ARM feature is the barrel shifter. Certain instructions can “contain” another arithmetic instruction that shifts or rotates a register. This is useful because it can shrink multiple instructions into one. For example, suppose you want to multiply a register by 2 and then store the result in another register. Normally, this would require two instructions (a multiply followed by a move), but with the barrel shifter you can include the multiply (shift left by 1) inside the MOV instruction. The instruction would be something like the following:

MOV R1, R0, LSL #1 ; R1 = R0 * 2
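The arithmetic performed by the shifted operand can be checked with a small model of the four basic shift types on 32-bit values. This is a sketch under simplifying assumptions: shift amounts are limited to 0 through 31, carry-out is ignored, and the function names simply mirror the LSL, LSR, ASR, and ROR mnemonics.

```python
MASK = 0xFFFFFFFF


def lsl(value, amount):
    """Logical shift left; bits shifted past bit 31 are discarded."""
    return (value << amount) & MASK


def lsr(value, amount):
    """Logical shift right; zeros are shifted in from the left."""
    return (value & MASK) >> amount


def asr(value, amount):
    """Arithmetic shift right; the sign bit is replicated."""
    value &= MASK
    if value & 0x80000000:
        value -= 1 << 32          # reinterpret as a negative integer
    return (value >> amount) & MASK


def ror(value, amount):
    """Rotate right; bits leaving bit 0 re-enter at bit 31."""
    value &= MASK
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & MASK


if __name__ == "__main__":
    r0 = 21
    r1 = lsl(r0, 1)               # models MOV R1, R0, LSL #1
    print(r1)                     # R1 = R0 * 2
    r2 = (r0 + lsl(r0, 2)) & MASK  # models an add with a shifted operand: R0 * 5
    print(r2)
```

When reading compiler output, shifted operands like these often stand in for multiplications or divisions by small constants, so it helps to translate the shift back into arithmetic as the comment in the listing above does.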
