highly detailed information regarding the program and the data types it deals with, which makes it possible to produce a reasonably accurate high-level language representation of the program through decompilation. Because of this level of transparency, developers often obfuscate their code to make it more difficult to comprehend. The process of reversing .NET programs and the effects of the various obfuscation tools are discussed in Chapter 12.
So, a low-level representation of our little Multiply function would usually have to take care of the following tasks (a C-level sketch of the function follows the list):
1. Store machine state prior to executing function code
2. Allocate memory for z
3. Load parameters x and y from memory into internal processor memory (registers)
4. Multiply x by y and store the result in a register
5. Optionally copy the multiplication result back into the memory area previously allocated for z
6. Restore machine state stored earlier
7. Return to caller and send back z as the return value
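For reference, here is a plausible C-level version of the Multiply function these steps describe; the original listing appears earlier in the chapter, so treat this as a reconstruction rather than the author's exact code:

int Multiply(int x, int y)
{
    int z;         /* step 2: storage is set aside for z (a register or the stack) */
    z = x * y;     /* steps 3-5: load x and y, multiply them, store the result     */
    return z;      /* step 7: z is sent back to the caller as the return value     */
}
/* Steps 1 and 6 (saving and restoring machine state) have no counterpart in
 * the source code; the compiler adds them automatically. */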
You can easily see that much of the added complexity is the result of low-level data management considerations. The following sections introduce the most common low-level data management constructs such as registers, stacks, and heaps, and how they relate to higher-level concepts such as variables and parameters.
HIGH-LEVEL VERSUS LOW-LEVEL DATA MANAGEMENT
One question that pops to mind when we start learning about low-level software is why are things presented in such a radically different way down there? The fundamental problem here is execution speed in microprocessors.
In modern computers, the CPU is attached to the system memory using a high-speed connection (a bus). Because of the high operation speed of the CPU, the RAM isn’t readily available to the CPU. This means that the CPU can’t just submit a read request to the RAM and expect an immediate reply, and likewise it can’t make a write request and expect it to be completed immediately. There are several reasons for this, but it is caused primarily by the combined latency that the involved components introduce. Simply put, when the CPU requests that a certain memory address be written to or read from, the time it takes for that command to arrive at the memory chip and be processed, and for a response to be sent back, is much longer than a single CPU clock cycle. This means that the processor might waste precious clock cycles simply waiting for the RAM.
This is the reason why instructions that operate directly on memory-based operands are slower and are avoided whenever possible. The relatively lengthy period of time each memory access takes to complete means that having a single instruction read data from memory, operate on that data, and then write the result back into memory might be unreasonable compared to the processor’s own performance capabilities.
Registers
In order to avoid having to access the RAM for every single instruction, microprocessors use internal memory that can be accessed with little or no performance penalty. There are several different elements of internal memory inside the average microprocessor, but the one of interest at the moment is the register. Registers are small chunks of internal memory that reside within the processor and can be accessed very easily, typically with no performance penalty whatsoever.
The downside with registers is that there are usually very few of them. For instance, current implementations of IA-32 processors only have eight 32-bit registers that are truly generic. There are quite a few others, but they’re mostly there for specific purposes and can’t always be used. Assembly language code revolves around registers because they are the easiest way for the processor to manage and access immediate data. Of course, registers are rarely used for long-term storage, which is where external RAM enters into the picture. The bottom line of all of this is that CPUs don’t manage these issues automatically—they are taken care of in assembly language code. Unfortunately, managing registers and loading and storing data from RAM to registers and back certainly adds a bit of complexity to assembly language code.
So, if we go back to our little code sample, most of the complexities revolve around data management. x and y can’t be directly multiplied from memory; the code must first read one of them into a register and then multiply that register by the other value that’s still in RAM. Another approach would be to copy both values into registers and then multiply them from registers, but that might be unnecessary.
These are the types of complexities added by the use of registers, but registers are also used for more long-term storage of values. Because registers are so easily accessible, compilers use registers for caching frequently used values inside the scope of a function, and for storing local variables defined in the program’s source code.
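As a small source-level illustration (mine, not the book's), C even has a keyword for hinting at this behavior; modern compilers ignore the hint and make their own allocation decisions, but the intent is visible:

int SumArray(const int *values, int count)
{
    register int sum = 0;          /* a frequently used value that the compiler
                                      will almost certainly keep in a register */
    for (register int i = 0; i < count; i++)
        sum += values[i];
    return sum;
}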
While reversing, it is important to try and detect the nature of the values loaded into each register. Detecting the case where a register is used simply to allow instructions access to specific values is very easy, because the register is used only for transferring a value from memory to the instruction or the other way around. In other cases, you will see the same register being repeatedly used and updated throughout a single function. This is often a strong indication that the register is being used for storing a local variable that was defined in the source code. I will get back to the process of identifying the nature of values stored inside registers in Part II, where I will be demonstrating several real-world reversing sessions.
The Stack
Let’s go back to our earlier Multiply example and examine what happens in Step 2 when the program allocates storage space for variable “z”. The specific actions taken at this stage will depend on some seriously complex logic that takes place inside the compiler. The general idea is that the value is placed either in a register or on the stack. Placing the value in a register simply means that in Step 4 the CPU would be instructed to place the result in the allocated register. Register usage is not managed by the processor, and in order to start using one you simply load a value into it. In many cases, there are no available registers or there is a specific reason why a variable must reside in RAM and not in a register. In such cases, the variable is placed on the stack.
A stack is an area in program memory that is used for short-term storage of information by the CPU and the program. It can be thought of as a secondary storage area for short-term information. Registers are used for storing the most immediate data, and the stack is used for storing slightly longer-term data.
Physically, the stack is just an area in RAM that has been allocated for this purpose. Stacks reside in RAM just like any other data—the distinction is entirely logical. It should be noted that modern operating systems manage multiple stacks at any given moment—each stack represents a currently active program or thread. I will be discussing threads and how stacks are allocated and managed in Chapter 3.
Internally, stacks are managed as simple LIFO (last in, first out) data structures, where items are “pushed” onto and “popped” off the stack. Memory for stacks is typically allocated from the top down, meaning that the highest addresses are allocated and used first and that the stack grows “backward,” toward the lower addresses. Figure 2.1 demonstrates what the stack looks like after pushing several values onto it, and Figure 2.2 shows what it looks like after they’re popped back out.
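To observe the downward growth from a high-level vantage point, here is a small C experiment (mine, not the book's): on mainstream systems the nested call's local variable usually ends up at a lower address than the caller's, because its stack frame is allocated below it.

#include <stdio.h>

static void inner(void)
{
    int inner_local = 2;
    printf("inner_local at %p\n", (void *)&inner_local);
}

int main(void)
{
    int outer_local = 1;
    printf("outer_local at %p\n", (void *)&outer_local);
    inner();   /* its frame, and therefore inner_local, typically sits at a
                  lower address than outer_local */
    return 0;
}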
A good example of stack usage can be seen in Steps 1 and 6. The machine state that is being stored is usually the values of the registers that will be used in the function. In these cases, register values always go to the stack and are later loaded back from the stack into the corresponding registers.
Figure 2.1 A view of the stack after three values are pushed in. (Figure shows 32-bit stack slots; code executed: PUSH Value 1, PUSH Value 2, PUSH Value 3. The three values sit below a previously stored value, and ESP points at the most recently pushed value, toward the lower memory addresses.)
Figure 2.2 A view of the stack after the three values are popped out. (Code executed: POP EAX, POP EBX, POP ECX. ESP moves back toward the higher memory addresses, and the slots that held the three values again contain unknown, unused data.)
If you try to translate stack usage to a high-level perspective, you will see that the stack can be used for a number of different things:
■■ Temporarily saved register values: The stack is frequently used for temporarily saving the value of a register and then restoring the saved value to that register. This can be used in a variety of situations, for example when a procedure that has been called needs to make use of certain registers. In such cases, the procedure might need to preserve the values of those registers to ensure that it doesn’t corrupt any registers used by its callers.
■■ Local variables: It is a common practice to use the stack for storing local variables that don’t fit into the processor’s registers, or for variables that must be stored in RAM (there is a variety of reasons why that is needed, such as when we want to call a function and have it write a value into a local variable defined in the current function; see the sketch after this list). It should be noted that when dealing with local variables, data is not pushed onto and popped off the stack; instead, the stack is accessed using offsets, like a data structure. Again, this will all be demonstrated once you enter the real reversing sessions, in the second part of this book.
■■ Function parameters and return addresses: The stack is used for implementing function calls. In a function call, the caller almost always passes parameters to the callee and is responsible for storing the current instruction pointer so that execution can proceed from its current position once the callee completes. The stack is used for storing both parameters and the instruction pointer for each procedure call.
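Here is a minimal C sketch of the local-variable case mentioned in the list above (GetValue is a hypothetical helper, not from the book): because the function writes through a pointer, result must have a real memory address, so the compiler keeps it on the stack and reaches it through an offset from the stack or frame pointer rather than pushing and popping it.

#include <stdio.h>

/* Hypothetical helper, used only for illustration: it writes its output
 * through a pointer, which forces the caller's local into memory. */
static void GetValue(int *out)
{
    *out = 42;
}

int main(void)
{
    int result;         /* stack-resident: its address is taken below */
    GetValue(&result);  /* the address passed here is an offset into the
                           current stack frame */
    printf("result = %d\n", result);
    return 0;
}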
Heaps
A heap is a managed memory region that allows for the dynamic allocation of variable-sized blocks of memory at runtime. A program simply requests a block of a certain size and receives a pointer to the newly allocated block (assuming that enough memory is available). Heaps are managed either by software libraries that are shipped alongside programs or by the operating system.
Heaps are typically used for variable-sized objects that are used by the program or for objects that are too big to be placed on the stack. For reversers, locating heaps in memory and properly identifying heap allocation and freeing routines can be helpful, because it contributes to the overall understanding of the program’s data layout. For instance, if you see a call to what you know is a heap allocation routine, you can follow the flow of the procedure’s return value throughout the program and see what is done with the allocated block, and so on. Also, having accurate size information on heap-allocated objects (block size is always passed as a parameter to the heap allocation routine) is another small hint towards program comprehension.
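As a simple C-level sketch (mine, not the book's), the standard malloc and free calls show the pattern a reverser looks for: the block size is visible right at the allocation call site, and the returned pointer is what flows through the rest of the program.

#include <stdlib.h>
#include <string.h>

/* Hypothetical record type, used only for illustration. */
typedef struct {
    char name[32];
    int  id;
} Record;

int main(void)
{
    Record *r = malloc(sizeof(Record)); /* the block size is passed to the allocator */
    if (r == NULL)
        return 1;                       /* allocation can fail                       */
    strcpy(r->name, "example");         /* the returned pointer is then used ...     */
    r->id = 7;                          /* ... wherever the allocated block flows    */
    free(r);                            /* matching call to the freeing routine      */
    return 0;
}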
Executable Data Sections
Another area in program memory that is frequently used for storing application data is the executable data section. In high-level languages, this area typically contains either global variables or preinitialized data. Preinitialized data is any kind of constant, hard-coded information included with the program.
Some preinitialized data is embedded right into the code (such as constant integer values, and so on), but when there is too much data, the compiler stores it inside a special area in the program executable and generates code that references it by address. An excellent example of preinitialized data is any kind of hard-coded string inside a program. The following is an example of this kind of string.
char szWelcome[] = "This string will be stored in the executable's preinitialized data section";
This definition, written in C, will cause the compiler to store the string in the executable’s preinitialized data section, regardless of where in the code szWelcome is declared. Even if szWelcome is a local variable declared inside a function, the string will still be stored in the preinitialized data section. To access this string, the compiler will emit a hard-coded address that points to the string. This is easily identified while reversing a program, because hard-coded memory addresses are rarely used for anything other than pointing to the executable’s data section.
The other common case in which data is stored inside an executable’s data section is when the program defines a global variable. Global variables provide long-term storage (their value is retained throughout the life of the program) that is accessible from anywhere in the program, hence the term global. In most languages, a global variable is defined by simply declaring it outside of the scope of any function. As with preinitialized data, the compiler must use hard-coded memory addresses in order to access global variables, which is why they are easily recognized when reversing a program.
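For completeness, here is a tiny C sketch of the global-variable case (the names are made up for illustration):

/* Defined outside of any function, so it lives in the executable's data
 * section for the lifetime of the program. */
int g_callCount = 0;

void RecordCall(void)
{
    g_callCount++;   /* compiled into a read-modify-write of a hard-coded address */
}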
Control Flow
Control flow is one of those areas where the source-code representation really makes the code look user-friendly. Of course, most processors and low-level languages just don’t know the meaning of the words if or while. Looking at the low-level implementation of a simple control flow statement is often confusing, because the control flow constructs used in the low-level realm are quite primitive. The challenge is in converting these primitive constructs back into user-friendly high-level concepts.
One of the problems is that most high-level conditional statements are just too lengthy for low-level languages such as assembly language, so they are broken down into sequences of operations. The key to understanding these sequences, the correlation between them, and the high-level statements from which they originated, is to understand the low-level control flow constructs and how they can be used for representing high-level control flow statements.
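As a rough C-level analogy (mine, not the book's), the same conditional can be rewritten using only the primitive "test and branch" building blocks that low-level code actually provides; the two functions below behave identically.

int Max(int x, int y)
{
    int z;
    if (x > y)          /* the high-level form */
        z = x;
    else
        z = y;
    return z;
}

int MaxLowered(int x, int y)
{
    int z;
    if (!(x > y))       /* primitive form: evaluate the condition ... */
        goto else_part; /* ... and branch around the "taken" block    */
    z = x;
    goto done;          /* unconditional jump past the else block     */
else_part:
    z = y;
done:
    return z;
}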
The details of these low-level constructs are platform- and language-specific; we will be discussing control flow statements in IA-32 assembly language in the following section on assembly language.