Program Validation by Symbolic and Reverse Execution

This bounded lazier initialization helps symbolic execution terminate while giving the user relatively high confidence in a given program. Symbolic execution and reverse execution have been used to help programmers check and debug programs, respectively.

Floyd-Hoare Logic

An interactive theorem prover aids a proof procedure by recursively suggesting subgoals that should be proved, under the user's guidance, to show that the main verification goal (i.e., the postcondition) is provable. Compared to an automated theorem prover, an interactive theorem prover requires a relatively high degree of human intervention, despite machine assistance such as decision procedures and proof strategies [OSR92].

Figure 2.1: The conventional procedure of mechanical program verification

Separation Logic

Also, if a theorem prover cannot resolve a verification condition within a predetermined time, the verification condition is assumed to be invalid and a warning is issued, which results in incompleteness. This high degree of human intervention is the price paid for using a highly expressive logic, such as higher-order logic, which allows a wider range of properties to be verified.

Model Checking

Bounded model checking is in many cases more efficient (in both time and space) than symbolic model checking. On the other hand, the biggest obstacle in model checking is the state explosion problem.

Figure 2.2: Counterexample-Guided Abstraction Refinement (CEGAR)

Abstract Interpretation

The kind of information that can be gathered by abstract interpretation can be explained by a collecting semantics [Nie82] (also known as static semantics [CC77]). In the field of program verification, abstract interpretation is combined with either data flow analysis or model checking.

Specification-Based Testing

A criterion is reliable if the result of testing is the same regardless of which set of test cases is chosen among the possible sets that the criterion can generate. One rationale for this is that stronger test hypotheses, in general, reduce the number of test cases more aggressively.

Reverse Execution

However, it is not always possible to obtain the ideal inverse code, and it may be necessary to save changed values from time to time. In most of the literature, only self-defined assignments (i.e., assignments where the same variable is used on both sides, e.g., x := x + 1) are used to generate the inverse code [Flo67b, BM99, CPF99].
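The distinction between inverse code and saved values can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the statement encoding and the names `forward` and `undo` are hypothetical. A self-defined assignment such as x := x + 1 is undone by its inverse x := x - 1, while a destructive assignment is undone from a saved value.

```python
# Hypothetical sketch: reverse execution of a tiny assignment language.
# ("add", var, c) models the self-defined assignment var := var + c.
# ("set", var, c) models the destructive assignment var := c.

def forward(state, stmt, log):
    """Execute stmt on state and record what is needed to undo it."""
    kind, var, const = stmt
    if kind == "add":
        state[var] += const
        log.append(("add", var, -const))      # inverse code, nothing saved
    elif kind == "set":
        log.append(("set", var, state[var]))  # save the overwritten value
        state[var] = const
    else:
        raise ValueError(f"unknown statement kind: {kind}")


def undo(state, log):
    """Undo the most recently executed statement."""
    kind, var, payload = log.pop()
    if kind == "add":
        state[var] += payload   # run the inverse assignment
    else:
        state[var] = payload    # restore the saved value


state = {"x": 5}
log = []
forward(state, ("add", "x", 1), log)  # x := x + 1
forward(state, ("set", "x", 0), log)  # x := 0
undo(state, log)
undo(state, log)
```

After the two `undo` calls the state is back to its initial value; only the destructive assignment required a saved value.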

Program Slicing

However, program slicing can remove even critical program segments related to the source of the incorrect output. The difference between two sets of static program slices, one for a correct output variable and one for an incorrect output variable, is calculated.

Mechanical Debugging

A leaf of a cut tree is the proof statement at the end of a given procedure, one of the cuts, or the end of an infeasible symbolic execution path. A symbolic execution tree is naturally explored using a program model checker such as JPF [VHB+03], XRT [GTS06], or Bogor [RDH03].

Symbolic Execution over Pointer-Data Structures

This term-based symbolic execution is implemented on top of the program model checker XRT as an extension called XRTS. Unless an enumeration object for one of the above three reasons, the ini...

Heap-Structurally Bounded Symbolic Execution

Note that, because of the information lost through abstraction, state σ cannot be reached unfairly if any of the predecessor states of σ is subsumed by an earlier state during symbolic execution. We introduce heap-structurally bounded symbolic execution, where a bound is set on a heap structure (i.e., a pointer data structure) instead of a control structure.

Soundness and Completeness

That is, if an error is found in a symbolic execution that uses BLI, this error can occur in a concrete execution. Conversely, an error that can occur in a concrete execution may not occur in a symbolic execution that uses BLI.

Implementation

Marked specifications are also used for sub-procedure calls, as in ordinary symbolic execution. That is, the precondition of a sub-procedure must be asserted on entry, the body of the sub-procedure is not executed, and the postcondition of the sub-procedure is assumed to hold before returning to the caller.
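The call rule described here (assert the precondition, skip the body, assume the postcondition) can be sketched in Python as follows. This is only an illustration under strong simplifying assumptions: constraints are plain strings, entailment is approximated by list membership rather than a decision procedure, and the name `call_by_contract` is hypothetical.

```python
import itertools

_fresh = itertools.count()  # source of fresh symbol names


def call_by_contract(path_condition, pre, post):
    """Summarize a call: assert pre, skip the body, assume post.

    path_condition: list of constraint strings known on the current path.
    pre:  constraint string that must hold on entry.
    post: function mapping a fresh result symbol to a constraint string.
    Entailment is approximated by list membership, a stand-in for a real
    decision procedure.
    """
    if pre not in path_condition:
        raise AssertionError(f"precondition not established: {pre}")
    result = f"r{next(_fresh)}"  # fresh symbol for the return value
    # The body is never executed; only the postcondition is assumed.
    return result, path_condition + [post(result)]


pc = ["n >= 0"]
result, pc_after = call_by_contract(
    pc, "n >= 0", lambda r: f"{r} * {r} <= n"
)
```

The caller continues with `pc_after`, which extends the original path condition with the assumed postcondition over the fresh result symbol.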

Future Work

In our experimental program, less memory is consumed when reverse code is used than when Bogor's original backtracking module is used. Storing the reverse code in a table, as in our experiment, leads to increased memory cost.

Reverse-Code Generation in Non-Deterministic Programs

For example, in Figure 5.1, the variable lc has been inserted so that the correct reverse code at loc0 of foo can be determined (see Table 5.1). Above all, the policy of providing the reverse code before running the program seems restrictive.

Figure 5.3: A non-deterministic program and its reverse code. The predicates on edges represent the guards of their targets.

Dynamic Reverse-Code Generation

It is also possible to use the redefinition technique instead of the extraction-from-use technique. Figure 5.6 shows the generation of the reverse code based on the redefinition technique. However, in the case of runtime statement history (b), using the redefinition technique to recover the in conly variable fails to produce reverse code.

The Provision of Inverse Functions

Soundness

Pointer Operations

The dynamic reverse-code generation method presented in this chapter analyzes the history of runtime states to generate reverse code. Dynamic reverse-code generation on symbolic execution paths (possibly with loop invariants provided to finalize a symbolic execution tree) can lead to program reversal.

Figure 5.11 illustrates the use of reverse code in this extended programming language that can manipulate pointer-data structures.

Research Area – Main Themes

Executing the reverse code helps the model checker undo a transition with the least possible amount of memory. Our thesis is that it is sufficient to store only the identifier of the active thread (we need this information to determine which execution block to undo) if reverse code is available.
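A small Python sketch can make this claim concrete. It is a toy depth-first explorer, not Bogor's backtracking module: the two threads, their hand-written reverse steps, and the name `explore` are all assumptions for illustration. The backtracking trail holds thread identifiers only; each transition is undone by running the reverse code of the thread that made it.

```python
# Hypothetical sketch: backtracking that stores thread ids only.

def inc(s):
    s["x"] += 1


def inc_rev(s):
    s["x"] -= 1


def dbl(s):
    s["y"] *= 2


def dbl_rev(s):
    s["y"] //= 2  # exact, because y was just doubled


# thread id -> (forward step, reverse step)
THREADS = {"t1": (inc, inc_rev), "t2": (dbl, dbl_rev)}


def explore(state, depth, trail, seen):
    """Visit all interleavings up to the given depth."""
    seen.add((state["x"], state["y"]))
    if depth == 0:
        return
    for tid, (step, _) in THREADS.items():
        step(state)
        trail.append(tid)          # only the thread id is stored
        explore(state, depth - 1, trail, seen)
        undone = trail.pop()
        THREADS[undone][1](state)  # run that thread's reverse code


state = {"x": 0, "y": 1}
seen = set()
explore(state, 2, [], seen)
```

No copy of the state is ever saved; after the exploration the single mutable state is back at its initial value.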

Figure 6.1: The expected effect of reverse-code generation

Related Work

Directions of the Work

Results

Recently, Akgul and Mooney proposed a way to generate reverse code through static analysis (control/data dependency). We aim to generate reverse code in the same spirit as [AM02], but, unlike [AM02], we also want to be able to handle multi-threaded programs.

Input Language

After presenting our input language and a motivating example in the next two sections, we demonstrate our method of generating the reverse code in detail (Section 7.4). In the following two sections (Sections 7.5 and 7.6), we also explain the auxiliary techniques needed to generate the reverse code.

Motivating Example

In the center of Figure 7.2(b) is the state flow when two threads, t(1) and t(-1), are running simultaneously, where each state is specified as a 4-tuple of a thread, a line number for the current location, and the values of x and y. Reverse statements are generated from the previous assignments and store commands executed before the current location.

Figure 7.2: In (b), a state is expressed as a tuple of thread id, transition id, variable x’s value, and variable y’s value.

Reverse Code Generation

Inferring a Reverse Point
Inferring a Reverse Statement
Restore Function
Analysis

Until now, we have assumed that the right-hand side of an assignment contains at most one variable. Note that several syntactic forms of assignment are possible when more than one variable is used on the right-hand side of an assignment.

Figure 7.3: A part of an abstract syntax tree of a simplified BIR program that contains a thread Foo

Selective Store

The two functions, isRRA and existRRAof, used in Figure 7.4 are shown separately in Figure 7.5 and Figure 7.6. Meanwhile, if the RRA is to be one of the previously performed assignments, an assignment variable must be used in the RRA (see line 9 in Figure 7.6), and the RRA must be reversible.

Figure 7.5: isRRA function used in Figure 7.4, returning true if a given assignment is a self-defined RRA, and false otherwise.

Derivation of Inverse Functions

assignment tracing that an assignment whose assignment variable is the same as...

Figure 7.8: Expansion rules for deriving inverse functions. The symbol ◦ represents +, −, ∗ or /, and the function body s is constructed according to the base rules in Figure 7.7.

Related Work

Discussion

In a functional programming language, the programming text provides several hints about updating the heap via e.g. pattern matching.

Conclusions

In the case of non-deterministic programs such as this bounded buffer program, our dynamic reverse-code generation can outperform the existing backtracking methods in terms of memory efficiency. It has been said that the ultimate solution to backtracking is to use reverse code [Flo67b, Gri81]: executing the reverse code restores the previous states of a program.

Backtracking Methods

State Saving
Checkpointing
Static Reverse-Code Generation
Dynamic Reverse-Code Generation

Recently, Akgul and Mooney proposed a way to generate reverse code beyond self-defined assignments [AM04]. Currently, Akgul and Mooney consider only deterministic programs as the target of reverse-code generation.

The Case of a Bounded Buffer

Basic State Saving
Incremental State Saving
Checkpointing
Static Reverse-Code Generation
Dynamic Reverse-Code Generation

Since the loop body of each thread iterates N times, the static reverse code generation costs 4I×N×2 units of memory, which is less than the memory consumption of checkpointing. Since the loop body of each thread iterates N times, dynamic reverse code generation costs a total of I×N×2 memory units, which is less than the memory consumption of static reverse code generation.
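The two cost expressions above can be compared directly. In the following worked step, the labels C_static and C_dynamic are introduced here only for illustration, and I and N are assumed to be positive:

```latex
\[
\frac{C_{\text{static}}}{C_{\text{dynamic}}}
  = \frac{4I \times N \times 2}{I \times N \times 2} = 4,
\qquad
C_{\text{static}} - C_{\text{dynamic}} = 8IN - 2IN = 6IN .
\]
```

Under these expressions, dynamic reverse-code generation uses a quarter of the memory units of static reverse-code generation in this example, independently of N.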

Figure 8.2: Two possible scenarios reaching Line 22 of Figure 8.1.

Discussion

In the discussion above, we deliberately did not consider the memory needed to generate reverse code. Therefore, even when considering the memory consumption required to generate reverse code, dynamic reverse-code generation costs the fewest memory units in our running example.

Conclusions

Kiasan is able to check strong heap properties, is fully automatic, and is flexible in terms of its cost and the guarantees it offers. It is imperative that an analysis tool be able to reason about these objects, their data, and their relationships (e.g., [RRDH06]).

Motivating Example

In contrast to compare, an implementation of the addLast method is easily understood: it only modifies the next field of the last node of the receiver list object by assigning its parameter. It is actually easier to use the actual implementation than to specify it, so one can focus on checking the join first by using an implementation of addLast. Unlike techniques primarily concerned with heap shapes [LAS00], these kinds of properties make it difficult to automatically use heap abstraction techniques that summarize objects, because the elements (whose number can be unbounded) ...

Background
Issues in Symbolic Execution
k-bounding
Lazier Initialization
Formalization

For example, symbolic execution of a non-terminating loop that adds concrete objects to a linked list does not terminate. Arithmetic operations and branch instructions are performed in the same way as in typical symbolic execution [Kin76].

Figure 9.2: Lazy and Lazier Initializations

Contract-Based Symbolic Execution

Heap Region Versioning

Additionally, we associate a version number with each region, which is incremented when any object in the region is updated. This allows us to detect that subsequent method calls, such as compare, whose context is in the same regions and versions, return the same result value.
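The versioning idea can be sketched in Python as follows. This is a simplified, concrete-valued illustration rather than Kiasan's symbolic implementation: the classes `Heap` and `VersionedCache`, the region names, and the cache layout are all hypothetical. A call result is reused only when every region in the call's context still has the version it had when the result was computed.

```python
class Heap:
    """Hypothetical heap split into named regions, each with a version."""

    def __init__(self, regions):
        self.objects = {r: {} for r in regions}  # region -> {name: value}
        self.version = {r: 0 for r in regions}

    def update(self, region, name, value):
        self.objects[region][name] = value
        self.version[region] += 1  # any update in the region bumps it


class VersionedCache:
    """Reuse a method result while its context regions are unchanged."""

    def __init__(self):
        self.results = {}
        self.hits = 0

    def call(self, heap, method, fn, regions):
        key = (method, tuple((r, heap.version[r]) for r in regions))
        if key in self.results:
            self.hits += 1
            return self.results[key]
        result = fn(heap)
        self.results[key] = result
        return result


def compare(h):
    return h.objects["rho1"]["a"] < h.objects["rho2"]["b"]


heap = Heap(["rho1", "rho2"])
heap.update("rho1", "a", 3)
heap.update("rho2", "b", 5)
cache = VersionedCache()
first = cache.call(heap, "compare", compare, ["rho1", "rho2"])
second = cache.call(heap, "compare", compare, ["rho1", "rho2"])  # reused
heap.update("rho1", "a", 9)  # version of rho1 changes
third = cache.call(heap, "compare", compare, ["rho1", "rho2"])   # recomputed
```

The second call hits the cache because both region versions are unchanged; the third call is recomputed because the update to rho1 incremented its version.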

Context Versioning

Refined by this region specification, the analysis begins with two fresh symbolic references pointed to by the latter, with ρ1 and ρ2 as their region descriptors. Freshly created concrete objects are marked with a special region ρ, and the lazier initialization of a field fρ, which can point to objects from region ρ, can only choose among null, a fresh symbolic object, or existing symbolic objects in region ρ.
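As a rough illustration of how a region restricts the choices made when a field is initialized, the Python sketch below enumerates the candidates for a field that may only point into one region. The function name `field_choices`, the object names, and the use of `None` for a null reference are assumptions for this sketch, following the usual lazy-initialization scheme rather than the thesis's exact rules.

```python
def field_choices(region, symbolic_objects, fresh_name):
    """Enumerate candidate values for an uninitialized field f_rho.

    region:           the region the field is allowed to point into.
    symbolic_objects: dict mapping existing symbolic object names to
                      the region each object belongs to.
    fresh_name:       name to use for a newly created symbolic object.
    """
    choices = [None, fresh_name]  # None stands for null (an assumption)
    # Existing symbolic objects qualify only if they live in the region.
    choices += sorted(
        name for name, r in symbolic_objects.items() if r == region
    )
    return choices


existing = {"s1": "rho1", "s2": "rho2", "s3": "rho1"}
choices = field_choices("rho1", existing, "s4")
```

Here `s2` is excluded because it belongs to a different region, which is how a region annotation prunes the branching of the symbolic execution.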

Discussion

The first half shows the total time taken by Bogor/Kiasan (including calls to CVC Lite), and the second half shows the time taken by CVC Lite. This happens because the condition we used for the Comparable version requires that all elements of the array be non-null, so the condition expands the array elements up to the bound.

Related Work

Conclusion and Future Work

In Proceedings of the 11th International SPIN Workshop on Model Checking of Software, LNCS volume 2989, pages 164–181. In Proceedings of the Second International Symposium on Static Analysis (SAS'95), LNCS volume 983, pages 1–18.