Advanced Techniques - Mobile Application Security Testing Guide (MASTG)

Debugging and Tracing

In the traditional sense, debugging is the process of identifying and isolating problems in a program as part of the software development life cycle. The same tools used for debugging are valuable to reverse engineers even when identifying bugs is not the primary goal. Debuggers enable program suspension at any point during runtime, inspection of the process's internal state, and even register and memory modification. These abilities simplify program inspection.

Debugging usually means interactive debugging sessions in which a debugger is attached to the running process. In contrast, tracing refers to passive logging of information about the app's execution (such as API calls). Tracing can be done in several ways, including debugging APIs, function hooks, and kernel tracing facilities. Again, we'll cover many of these techniques in the OS-specific "Reverse Engineering and Tampering" chapters.
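The idea of tracing through a function hook can be illustrated with a minimal Python sketch. It is not tied to any mobile tooling: `check_pin`, `hook`, and `trace_log` are hypothetical names chosen for this illustration, and the wrapper only stands in for what a real hooking framework would do inside the target process.

```python
import functools

# Passive record of calls; the traced function is never paused or modified.
trace_log = []

def hook(func):
    """Wrap func so that every call is logged and then passed through."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        trace_log.append((func.__name__, args, result))
        return result
    return wrapper

def check_pin(pin):
    # Hypothetical function standing in for an API the app calls.
    return pin == "1234"

# Install the hook by replacing the original function with the wrapper.
check_pin = hook(check_pin)

check_pin("0000")
check_pin("1234")

for name, args, result in trace_log:
    print(f"{name}{args} -> {result}")
```

Unlike an interactive debugging session, nothing here suspends execution: the hook observes arguments and return values and leaves the program's behaviour unchanged.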

Emulators are available for Android, but for iOS there are practically no viable emulators. iOS only has a simulator, shipped with Xcode.

The difference between a simulator and an emulator often causes confusion, and the two terms are frequently used interchangeably. In reality they are different, especially for the iOS use case. An emulator mimics both the software and hardware environment of a targeted platform. A simulator, on the other hand, only mimics the software environment.

QEMU-based emulators for Android take hardware components such as RAM, CPU, and battery performance into consideration while running an application, but an iOS simulator does not take this hardware behaviour into consideration at all. The iOS simulator even lacks an implementation of the iOS kernel; as a result, an application that uses syscalls cannot be executed in the simulator.

In simple words, an emulator is a much closer imitation of the targeted platform, while a simulator mimics only a part of it.

Running an app in the emulator gives you powerful ways to monitor and manipulate its environment. For some reverse engineering tasks, especially those that require low-level instruction tracing, emulation is the best (or only) choice. Unfortunately, this type of analysis is only viable for Android, because no free or open source emulator exists for iOS (the iOS simulator is not an emulator, and apps compiled for an iOS device don't run on it). The only iOS emulator available is Corellium, a commercial SaaS solution. We'll provide an overview of popular emulation-based analysis frameworks for Android in the "Tampering and Reverse Engineering on Android" chapter.

Custom Tooling with Reverse Engineering Frameworks

Even though most professional GUI-based disassemblers feature scripting facilities and extensibility, they are simply not well-suited to solving particular problems. Reverse engineering frameworks allow you to perform and automate any kind of reversing task without depending on a heavy-weight GUI. Notably, most reversing frameworks are open source and/or available for free.

Popular frameworks with support for mobile architectures include radare2 and Angr.

Example: Program Analysis with Symbolic/Concolic Execution

In the late 2000s, testing based on symbolic execution became a popular way to identify security vulnerabilities. Symbolic "execution" actually refers to the process of representing possible paths through a program as formulas in first-order logic. Satisfiability Modulo Theories (SMT) solvers are used to check the satisfiability of these formulas and provide solutions, including concrete values of the variables needed to reach a certain point of execution on the path corresponding to the solved formula.

In simple words, symbolic execution is mathematically analyzing a program without executing it.

During analysis, each unknown input is represented as a mathematical variable (a symbolic value), and all the operations performed on these variables are recorded as a tree of operations (an abstract syntax tree, or AST, in compiler theory terms). These ASTs can be translated into so-called constraints that will be interpreted by an SMT solver. At the end of this analysis, a final mathematical equation is obtained, in which the variables are the inputs whose values are not known. SMT solvers are special programs which solve these equations to give possible values for the input variables given a final state.
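How operations on symbolic values end up recorded as a tree can be sketched in a few lines of Python. This is a toy illustration, not how any particular framework implements it: the `Sym` class and the `lift` and `var` helpers are hypothetical names, and only `*`, `+`, and `>` are supported.

```python
class Sym:
    """A node in the operation tree: a variable, a constant, or an operation."""

    def __init__(self, op, *args):
        self.op, self.args = op, args

    # Each operator returns a new node instead of computing a value,
    # so the whole computation is recorded as a tree.
    def __mul__(self, other):
        return Sym("*", self, lift(other))

    def __add__(self, other):
        return Sym("+", self, lift(other))

    def __gt__(self, other):
        return Sym(">", self, lift(other))

    def __repr__(self):
        if self.op in ("var", "const"):
            return str(self.args[0])
        left, right = self.args
        return f"({left!r} {self.op} {right!r})"

def lift(value):
    """Turn a plain Python number into a constant node."""
    return value if isinstance(value, Sym) else Sym("const", value)

def var(name):
    """Create a symbolic input with the given name."""
    return Sym("var", name)

x, y = var("x"), var("y")
constraint = (x * y + 1) > 10   # nothing is evaluated; a tree is built
print(constraint)               # prints (((x * y) + 1) > 10)
```

The resulting tree is the kind of structure that would then be translated into a constraint for a solver.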

To illustrate this, imagine a function which takes one input (x) and multiplies it by the value of a second input (y). Finally, there is an if condition which checks if the value calculated is greater than the value of an external variable (z), and returns "success" if true, else returns "fail". The equation for this operation will be (x * y) > z.

If we want the function to always return "success" (final state), we can tell the SMT solver to calculate the values for x and y (input variables) which satisfy the corresponding equation. Because z is an external variable, its value can, as with any global variable, be changed from outside this function, which may lead to different outputs whenever this function is executed. This adds complexity to determining the correct solution.
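The example can be made concrete with a short Python sketch. As a loudly stated assumption, an exhaustive search over a small integer domain stands in for the SMT solver here; a real solver reasons about the formula rather than enumerating candidates. The names `check` and `solve` and the domain 0 to 15 are chosen only for this illustration.

```python
from itertools import product

def check(x, y, z):
    """The function from the example: "success" if x * y exceeds z."""
    return "success" if x * y > z else "fail"

def solve(z, domain=range(0, 16)):
    """Return the first (x, y) in the domain with x * y > z, or None.

    Brute force is only a stand-in for an SMT solver in this sketch.
    """
    for x, y in product(domain, repeat=2):
        if x * y > z:
            return x, y
    return None

# z is fixed here; if it changed between runs, the solution could change too.
solution = solve(z=20)
print(solution)
if solution is not None:
    print(check(*solution, 20))
```

Calling `solve` with a different `z` may return a different pair or no pair at all, which is the extra complexity that an externally controlled variable introduces.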

Internally, SMT solvers use various equation-solving techniques to generate solutions for such equations. Some of these techniques are very advanced, and their discussion is beyond the scope of this book.

In a real-world situation, functions are much more complex than the above example. This increased complexity can pose significant challenges for classical symbolic execution. Some of the challenges are summarised below:

• Loops and recursion in a program may lead to an infinite execution tree.

• Multiple conditional branches or nested conditions may lead to path explosion.

• Complex equations generated by symbolic execution may not be solvable by SMT solvers because of the solvers' limitations.

• The program may use system calls, library calls, or network events which cannot be handled by symbolic execution.
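Path explosion, one of the challenges listed above, can be illustrated with a small Python sketch. It assumes the simplest case of independent, sequential if statements, where every branch doubles the number of paths; `enumerate_paths` is a hypothetical helper written for this illustration.

```python
from itertools import product

def enumerate_paths(num_branches):
    """Return an iterator over every combination of taken / not-taken
    decisions for num_branches independent, sequential if statements."""
    return product((True, False), repeat=num_branches)

for n in (1, 2, 10, 20):
    total = sum(1 for _ in enumerate_paths(n))
    print(f"{n} branches -> {total} paths")
```

With n such branches there are 2**n paths, so even 20 branches already produce more than a million paths for a symbolic executor to consider.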

To overcome these challenges, symbolic execution is typically combined with other techniques such as dynamic execution (also called concrete execution) to mitigate the path explosion problem specific to classical symbolic execution. This combination of concrete (actual) and symbolic execution is referred to as concolic execution (the name concolic stems from concrete and symbolic), sometimes also called dynamic symbolic execution.

To visualize this, in the above example, we can obtain the value of the external variable by performing further reverse engineering or by dynamically executing the program, and then feed this information into our symbolic execution analysis. This extra information will reduce the complexity of our equations and may produce more accurate analysis results. Together with improved SMT solvers and current hardware speeds, concolic execution makes it possible to explore paths in medium-size software modules (i.e., on the order of 10 KLOC).
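One common way to combine concrete and symbolic execution can be sketched as follows: run the program on a concrete input, record the branch conditions it passes through, then negate one condition and solve for a new input that takes the other side. This is a simplified sketch under stated assumptions: `program`, `branch`, and `concolic` are hypothetical names, the branch conditions are plain Python predicates rather than solver formulas, and a brute-force search over 0 to 255 stands in for an SMT solver.

```python
def branch(trace, predicate, x):
    """Evaluate a branch condition concretely and record it in the trace."""
    taken = predicate(x)
    trace.append((predicate, taken))
    return taken

def program(x, trace):
    # Hypothetical program under test with two nested conditions.
    if branch(trace, lambda v: v > 10, x):
        if branch(trace, lambda v: v % 7 == 3, x):
            return "success"
    return "fail"

def find_input(prefix, predicate, wanted, domain):
    """Stand-in for an SMT solver: find an input that follows the recorded
    prefix and makes the final condition evaluate to `wanted`."""
    for c in domain:
        if all(p(c) == t for p, t in prefix) and predicate(c) == wanted:
            return c
    return None

def concolic(program, seed, domain=range(256), max_runs=20):
    worklist, seen_paths = [seed], set()
    for _ in range(max_runs):
        if not worklist:
            break
        x = worklist.pop()
        trace = []
        result = program(x, trace)          # concrete execution
        path = tuple(taken for _, taken in trace)
        if path in seen_paths:
            continue
        seen_paths.add(path)
        if result == "success":
            return x
        # Negate each recorded condition in turn to reach unexplored paths.
        for i, (predicate, taken) in enumerate(trace):
            candidate = find_input(trace[:i], predicate, not taken, domain)
            if candidate is not None:
                worklist.append(candidate)
    return None

found = concolic(program, seed=0)
print(found)
```

Each concrete run only contributes the conditions along the path it actually took, which keeps the constraints handed to the solver small compared with exploring every path symbolically.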

In addition, symbolic execution also comes in handy for supporting de-obfuscation tasks, such as simplifying control flow graphs. For example, Jonathan Salwan and Romain Thomas have shown how to reverse engineer VM-based software protections using Dynamic Symbolic Execution [#salwan] (i.e., using a mix of actual execution traces, simulation, and symbolic execution).

In the Android section, you'll find a walkthrough for cracking a simple license check in an Android application using symbolic execution.

References

• [#vadla] Ole André Vadla Ravnås, Anatomy of a code tracer - https://medium.com/@oleavr/anatomy-of-a-code-tracer-b081aadb0df8

• [#salwan] Jonathan Salwan and Romain Thomas, How Triton can help to reverse virtual machine based software protections - https://drive.google.com/file/d/1EzuddBA61jEMy8XbjQKFF3jyoKwW7tLq/view?usp=sharing
