
Programming Languages and Systems

Nguyễn Gia Hào

Academic year: 2023


This means that the product loop keeps iterating as long as at least one execution of the original program would still iterate. In addition to the (duplicated) arguments of the original call, the current activation variables are passed to the called procedure.
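The loop construction can be illustrated with a minimal Python sketch. It assumes a toy countdown loop; the names `countdown`, `countdown_product`, `p1`, `p2`, `a1`, and `a2` are hypothetical and do not come from the paper. Each execution has its own copy of the state and its own boolean activation variable, while the control flow is shared.

```python
def countdown(n):
    """Original program: count loop iterations until n reaches 0."""
    steps = 0
    while n > 0:
        n -= 1
        steps += 1
    return steps


def countdown_product(p1, p2, n1, n2):
    """Hypothetical 2-product of `countdown`.

    p1 and p2 are activation variables: execution k only acts if pk is true.
    """
    steps1 = steps2 = 0
    # One shared loop: it runs while at least one active execution
    # would still iterate in the original program.
    while (p1 and n1 > 0) or (p2 and n2 > 0):
        # Fresh activation variables for the loop body.
        a1 = p1 and n1 > 0
        a2 = p2 and n2 > 0
        if a1:
            n1 -= 1
            steps1 += 1
        if a2:
            n2 -= 1
            steps2 += 1
    return steps1, steps2
```

With both executions active, `countdown_product(True, True, 3, 5)` behaves like running `countdown(3)` and `countdown(5)` side by side; an inactive execution leaves its state untouched.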

Fig. 1. Example program. The parameter people contains a sequence of integers that each encode attributes of a person; the main procedure counts the number of females in this sequence.

Transformation of Assertions

Heap-Manipulating Programs

Verifying modular products of heap-manipulating programs does not depend on any specific way of achieving framing. Since heap handling is largely orthogonal to our main technique, we do not go into further detail here, but our implementation does support heap-manipulating programs.

5 Soundness and Completeness

Soundness with Unary Specifications

Our implementation is based on implicit dynamic frames [25], but other approaches are also feasible, provided procedures can be specified in such a way that the caller knows that the heap remains unmodified for all executions whose activation variables are false.

Soundness for Relational Specifications

Finally, we prove that $\langle \llbracket s \rrbracket_{2}^{\mathring{p}}, \sigma \rangle \rightarrow^{*} \langle \mathtt{skip}, \sigma' \rangle$ by showing that non-termination of the product implies the non-termination of at least one of the two original program runs. If the condition of a loop in the product remains true forever, the loop condition of at least one encoded execution must be true after each iteration.

Completeness

We show that (1) this is not due to an interaction between multiple executions, since the condition for each execution remains false once it has become false, and (2) since the encoded states of active executions progress as they do in the original program, the condition for a single execution in the product remains true forever only if it does so in the original program.

6 Modular Verification of Secure Information Flow

  • Non-interference
  • Information Flow Specifications
  • Secure Information Flow with Arbitrary Security Lattices The definition of secure information flow used in Definition 2 is a special case
  • Declassification
  • Preventing Termination Channels
  • Preventing Timing Channels

Thus, instead of the specification low(e), information flow assertions can have the form levelBelow(e, l), which means that the security level of expression e is at most l. We prove this by showing that if the termination condition is true, we can prove the termination of the loop using the provided ranking function.

Fig. 5. Translation of information flow specifications.

7 Implementation and Evaluation

Implementation in Viper

Using modular product programs, we can verify the absence of timing side channels by adding ghost state to the program that tracks the time elapsed since the program started; this can be achieved, for example, through a simple step-counting mechanism, or by tracking the sequence of previously executed bytecode statements. We can then assert anywhere in the program that the elapsed time does not depend on high data, in the same way as for program variables.
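A small Python sketch of the step-counting idea follows. It is a testing-style illustration under stated assumptions, not the verification technique itself: the programs `check_pin` and `leaky_check_pin`, the counter `t`, and the helper `time_is_low` are all hypothetical. The relational property is that two runs with the same low input (`guess`) but different high inputs (`secret`) end with the same counter value.

```python
def check_pin(secret, guess):
    """Toy program instrumented with a ghost step counter t."""
    t = 0
    ok = True
    for s, g in zip(secret, guess):
        t += 1  # one step per loop iteration
        if s != g:
            ok = False  # no early exit, so t does not depend on secret
    return ok, t


def leaky_check_pin(secret, guess):
    """Variant with an early exit: the elapsed time depends on secret."""
    t = 0
    for s, g in zip(secret, guess):
        t += 1
        if s != g:
            return False, t
    return True, t


def time_is_low(program, guess, secrets):
    """Check on a finite sample that the ghost counter is independent of the high input."""
    times = {program(secret, guess)[1] for secret in secrets}
    return len(times) == 1
```

A verifier would establish the same property for all inputs by asserting that the counter is low at the end of the product program; the sketch only samples a few secrets.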

Qualitative Evaluation

For languages with opaque object references, secure information flow may require pointers to be low, i.e., equal up to a consistent renaming of addresses. Therefore, our approach to duplicating the heap state space in the implementation differs from that described in Section 4.3: instead of duplicating objects, our implementation creates a single new statement for every new in the original program, but duplicates the fields that each object has.

Performance

We show the language features used, lines of code including specifications, total lines used for specifications (Ann), unary specifications for safety (SF), relational specifications for non-interference (NI), specifications for termination (TM), and functional specifications required for non-interference (F).

8 Related Work

A popular approach is to use type systems [26]; while these are modular and perform well, they over-approximate possible program behaviors and are therefore less precise than approaches based on logics. In addition, type systems typically struggle to prevent information leaks through side channels such as termination or program aborts.

9 Conclusion and Future Work

They make a trade-off that differs from our solution, which requires specifications but allows for precise, modular reasoning. The authors of [19] verify determinism up to equivalence using self-composition, which suffers from the shortcomings explained above.

The images or other third-party material in this chapter are included in the chapter's Creative Commons license, unless otherwise indicated in a credit line for the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by law or exceeds the permitted use, you must obtain permission directly from the copyright holder.

Asymptotic Complexity Claims via Deductive Program Verification

1 Introduction

Using CFML, the second and third authors verify the correctness and time complexity of the OCaml implementation of the Union-Find data structure [11]. We propose a standard specification writing style in the setting of the CFML program verification framework by integrating asymptotic time complexity claims (Section 4).

Fig. 1. A flawed binary search. This code is provably correct and terminating, yet exhibits linear (instead of logarithmic) time complexity for some input parameters.

2 Challenges in Reasoning with the O Notation

Instead, it seems desirable to delay the production of the witness and to gradually construct a cost expression as the proof progresses. This flawed proof exploits the dubious idea that "the asymptotic cost of a loop is the sum of the asymptotic costs of its iterations".

3 Formalizing the O Notation

  • Domination
  • Filters
  • Examples of Filters
  • Properties of Domination
  • Tactics

The order filter associated with the ordered type (Z, ≤) is the most natural filter on the type Z. An alternative approach is to prove that g(n, m) has complexity O(nm + n) with respect to a stronger filter, namely the product of the standard filter on Z and the universal filter on Z.

4 Specifications with Asymptotic Complexity Claims

  • CFML with Time Credits for Cost Analysis
  • A Modularity Challenge
  • A Record for Specifications
  • Why Cost Functions Must Be Nonnegative
  • Why Cost Functions Must Be Monotonic

Square brackets denote a pure Separation Logic assertion, and |l| denotes the length of the Coq list l. The above specification informally means that length has time complexity O(n), where the parameter n represents |l|, that is, the length of the list.
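To make the shape of such a claim concrete, here is a minimal Python sketch that counts "ticks" for a list-length function and compares them against a concrete cost function. The tick accounting and the names `length_with_ticks` and `cost` are assumptions made for illustration; in CFML the bound is proved once and for all with time credits rather than tested.

```python
def length_with_ticks(l):
    """Return (length of l, number of ticks spent), one tick per step."""
    ticks = 1  # one tick for entering the function
    n = 0
    for _ in l:
        ticks += 1  # one tick per list cell
        n += 1
    return n, ticks


def cost(n):
    """Hypothetical concrete cost function.

    A specification in the style discussed above would hide this definition
    and only expose that cost is nonnegative, monotonic, and in O(n).
    """
    return n + 1
```

Here `cost(n) <= 2 * n` for every `n >= 1`, which is the kind of witness (a constant and a threshold) that a domination proof for O(n) provides.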

5 Interactive Proofs of Asymptotic Complexity Claims

Synthesizing Cost Expressions for Straight-Line Code

This metavariable is instantiated when the goal is proved by applying one of the reasoning rules. It says that the cost of a sequence is the sum of the costs of its subexpressions.

Synthesizing and Solving Recurrence Equations

During the proof of the first subgoal, C is a metavariable and can be instantiated at will (possibly in several steps), allowing us to collect a conjunction of constraints on a and b. In practice, instead of explicitly building and applying tautologies as above, we use the first author's procrastination library [16], which provides facilities for introducing new parameters, gradually collecting constraints on these parameters, and finally checking that these constraints are satisfiable.

6 Examples

The side conditions of the Summation Lemma (Lemma 8) are fulfilled: in particular, the function λi. 1 + costs b(i) is monotonic. Using the fact that a graph without duplicate edges must satisfy $m \le n^2$, we prove that the complexity of the algorithm, viewed as a function of n, is $O(n^3)$.

7 Related Work

The meaning of the O notation in the multivariate case is not spelled out; in particular, it is not specified which filter is intended. The authors of [4] use Coq to check the correctness of a C program that implements a numerical scheme for solving the one-dimensional acoustic wave equation.

From Algorithmic Game Theory to Distributed Systems with Mechanized Complexity Guarantees

Contributions

Some of the ideas we present in this paper were previously presented in summary form in a short 3-page announcement at PODC 2017 [4]. 1 Expected cost (per step) of the algorithm minus that of the fixed best action.

Organization

2 Background

Games

The CCE condition states that there is no si that can lower player i's expected cost. CCEs are essentially a relaxation of MNEs that do not require σ to be a product distribution (i.e., the players' strategies can be correlated).

Algorithmic Game Theory

The second projection of the record, dist_ax, asserts that pmf represents a valid distribution: pmf is positive and. The precise definition of the smoothness condition is less relevant here than its consequences: if a cost minimization game is (λ, μ)-smooth, then its POA is at most λ/(1−μ).

3 Cage by Example

  • Overview
  • Smooth Games DSL
  • Example: Distributed Routing
  • Example: Load Balancing

The delay across an edge, modeled by the affine cost function $c_e(x) = a_e x + b_e$, varies with the amount of traffic across that edge. The cost of allocating a stream to a server is modeled with an affine cost function that measures the total load (number of streams) on that server.
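As a minimal sketch of the routing cost model, the following Python code computes a player's cost as the sum of affine edge delays along the chosen path. The representation (paths as lists of edge names, coefficients as `(a_e, b_e)` pairs) and the function names are assumptions for illustration only.

```python
from collections import Counter


def edge_loads(paths):
    """Count how many players route traffic over each edge."""
    return Counter(edge for path in paths for edge in path)


def player_cost(i, paths, coeffs):
    """Cost to player i: sum of c_e(x) = a_e * x + b_e over the edges on its path.

    paths[i] is the list of edges chosen by player i; coeffs[e] is (a_e, b_e).
    """
    loads = edge_loads(paths)
    return sum(coeffs[e][0] * loads[e] + coeffs[e][1] for e in paths[i])
```

For example, with `coeffs = {"e1": (1, 0), "e2": (2, 1)}` and `paths = [["e1", "e2"], ["e2"]]`, edge `e2` carries two players, so player 0 pays 1 + 5 = 6 and player 1 pays 5. The load-balancing example has the same shape, with servers in place of edges.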

Fig. 1. System architecture

4 Smooth Games

Combinators

Smoothness of a singleton game follows by case analysis on the results of BA(·) in the states s and s* of the smoothness inequality. Since the smoothness bound of the underlying game holds for all states in A, the same bound holds for the restricted domain of states a ∈ A drawn from P.

5 Multiplicative Weights (MW)

The Algorithm

The agent maintains a weight distribution w over the action space, initialized to give each action equal weight. After receiving the cost vector $c_t$ from the environment, the agent updates its weights to $w_{t+1}$, penalizing high-cost actions at a rate determined by the learning constant $\eta \in (0, 1/2]$.
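The update step can be sketched in a few lines of Python. This assumes one common form of the multiplicative weights rule, $w_{t+1}(a) = w_t(a)\,(1 - \eta\, c_t(a))$, with costs in [0, 1]; the exact rule and cost range used in the verified development may differ, and the function names are hypothetical.

```python
def mw_init(actions):
    """Start with equal weight on every action."""
    return {a: 1.0 for a in actions}


def mw_distribution(w):
    """Normalize the weights into a probability distribution over actions."""
    total = sum(w.values())
    return {a: x / total for a, x in w.items()}


def mw_update(w, cost, eta):
    """One update: w_{t+1}(a) = w_t(a) * (1 - eta * c_t(a)).

    Assumes every cost[a] lies in [0, 1] and 0 < eta <= 1/2,
    so all weights stay positive.
    """
    if not 0 < eta <= 0.5:
        raise ValueError("eta must lie in (0, 1/2]")
    return {a: x * (1 - eta * cost[a]) for a, x in w.items()}
```

After one round with `eta = 0.5` in which action `"a"` costs 1 and action `"b"` costs 0, the agent plays `"a"` with probability 1/3 and `"b"` with probability 2/3.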

MW Is No Regret

A high η, close to 1/2, leads to higher penalties and thus relatively less exploration of the action space. The proof of Theorem 1 uses a potential-function argument, in which the potential is the sum of the weights, Γt.

MW Is No Regret)

  • MW Architecture
  • MW DSL
  • Interpreter
  • Proof

The type cstate defines the state of the interpreter after each step, and generally corresponds quite closely to the states σ used in the MW DSL operational semantics. The oracle state type T is provided by the implementation of the oracle, as in the operational semantics.

Fig. 7. MW DSL syntax and operational semantics, parameterized by an environment oracle defining the type T of environment states and the functions oracle recv and oracle send for interacting with the environment

6 Coordinated MW

Machine Semantics

The work done by the server is modeled by the auxiliary relation server_sent_cost_vector i f m m', which constructs and sends to client i the cost vector derived from the set of client distributions f. In the distributed MW setting, the cost to player i of a given action a : A is defined as the expected value, over all N-player strategy vectors p in which player i chose action a (p_i = a), of the cost to player i of p, with the expectation taken over the product distribution of size N−1 induced by the players j ≠ i.
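The expected-cost definition can be written out directly. The Python sketch below is illustrative only: `dists` (one action distribution per player), the callback `cost(i, p)`, and the function name `expected_cost` are assumed names, and the enumeration is exponential in the number of players.

```python
from itertools import product


def expected_cost(i, a, dists, cost):
    """Expected cost to player i of playing action a.

    dists[j] maps each action of player j to its probability; cost(i, p) is
    the cost to player i of the strategy vector p. The expectation ranges
    over the product distribution induced by the other players j != i.
    """
    others = [j for j in range(len(dists)) if j != i]
    total = 0.0
    for choice in product(*(dists[j].items() for j in others)):
        p = [None] * len(dists)
        p[i] = a  # player i is fixed to action a
        prob = 1.0
        for j, (action_j, prob_j) in zip(others, choice):
            p[j] = action_j
            prob *= prob_j
        total += prob * cost(i, tuple(p))
    return total
```

As a usage example, take two players choosing between servers 0 and 1, with the cost of a player being the load on its server. If player 1 picks server 0 with probability 0.25, the expected cost to player 0 of server 0 is 0.25 * 2 + 0.75 * 1 = 1.25.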

Fig. 8. Semantics of the distributed machine

Convergence and Optimality

Consider an execution of the Fig. 8 semantics, $m \Longrightarrow^{+} m'$, that satisfies the condition that all clients have bounded regret. Application-level security properties of the system can be proven with respect to a simple, idealized network semantics.

8 Conclusion

In: Proceedings of the thirty-seventh annual ACM Symposium on Theory of Computing, pp.

2 Overview and Basic Notions

Intuitive Hoare Logic Proof

However, even with a simple imperative programming language like the one we have here, it is necessary either to add Hoare logic rules to Fig. 2 or to change our code segment. Next, using the assignment and sequencing Hoare logic rules in Fig. 2, as well as basic arithmetic via the (HL-consequence) rule, we derive.

Fig. 2. IMP program logic.

Intuitive Coinduction Proof

The key part of the proof above was to show that the reachability claim about the loop (S2) was stable under the language semantics. By allowing desirable program properties to be uniformly specified as reachability claims about the (executable) language semantics itself, our approach requires no auxiliary formalization of the language for verification purposes, and hence no soundness or equivalence proofs, and no transformations of the original program to make it fit the restrictions of the auxiliary semantics.

Fig. 3. Three different operational semantics of IMP, generating the same execution step relation R (or → R ).

Defining Execution Step Relations

3 Coinduction as Partial Correctness

Definitions and Main Theorem

Recall from Sect. 2.1 that c ⇒R P holds if the initial state c can either reach a state in P or take an infinite number of steps (with →R). We can also use Theorem 1 with other definitions of validity that can be expressed as a greatest fixpoint, e.g., all-path validity.
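On a finite transition system, this notion of validity can be computed as a greatest fixpoint by starting from all states and repeatedly discarding those that are neither in P nor able to step to a remaining state. The Python sketch below assumes the one-path reading described above; the names `valid`, `step`, and `target` are hypothetical.

```python
def valid(states, step, target):
    """Greatest fixpoint of X -> target ∪ {c | some successor of c is in X}.

    step maps a state to the list of its successors. A state survives if it
    can reach target or can keep taking steps forever.
    """
    x = set(states)
    while True:
        nxt = {c for c in x
               if c in target or any(d in x for d in step.get(c, ()))}
        if nxt == x:
            return x
        x = nxt
```

For example, with transitions 0 → 1 → 2, a self-loop on 3, a stuck state 4, and target {2}, the result is {0, 1, 2, 3}: state 3 is valid because it diverges, and state 4 is not because it is stuck outside the target.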

Example Proof: Sum

The definition of stepR thus allows the left-hand side to continue taking execution steps, as long as we keep unfolding the fixpoint. So this set of claims is included in S2 by instantiating the universally quantified variable in the definition of S2 with n−1.

Example Proof: Reverse

Sequential composition and the trans rule correspond to a transitivity rule used to chain separate proofs together. The fixpoint in the closure definition corresponds to iterative application of these proof rules or to referring back to claims in the original specification.

4 Experiments

  • Languages
  • Specifying Data Structures
  • Specifying Reachability Claims
  • Proofs and Automation
  • Other Data Structures
  • Schorr-Waite
  • Divergence
  • Summary of Experiments

Each of these operations begins, as in the case of proofs, with certain manipulations of definitions and fixpoints in a language-independent kernel. But first, let us check a simpler property of the algorithm, which shows that the given code correctly marks a tree, in the absence of sharing or cycles.

Fig. 5. Syntax of HIMP, Stack, and Lambda

5 Subsuming Reachability Logic

Advantages of Coinduction

The total time to run the Bedrock test script was 93 s and 31 s to recheck the proof certificate, significantly slower than our times in Table 2. To match the Bedrock examples more closely, we modified our programs to represent list nodes with fields at sequential addresses instead of using HIMP records, but this only improved performance, down to 20 s to run the proof scripts and 4 s to check the certificates.

Reachability Logic Proof System

Reachability Logic is Coinduction

In this sense, this coinduction framework is much more general than the reachability logic proof system presented in [34]. This lemma suggests what to do: take any reachability logic proof of $A \vdash \varphi \Rightarrow \varphi'$ and any transition relation R such that $R \models^{+} A$, and produce a coinductive proof of $S_{\varphi \Rightarrow \varphi'} \subseteq \mathit{valid}_R$.

6 Other Related Work

Current Verification Tools

For example, Frama-C and Krakatoa attempt to verify C and Java, respectively, by translating through Why. A language-independent, sound, and (relatively) complete coinductive proof method then allows us to verify properties of programs by using the operational semantics directly.

Operational Semantics Based Approaches

The languages developed use state predicates shallowly embedded in Coq, and the inference rules are derived directly from the operational semantics. Iris [36] is a language-independent concurrent separation logic, with operational semantics formalized in Coq.

Other Coinduction Schemata

Moreover, the verification in the paper relies on Hoare-style reasoning, whereas in our approach we do not assume any such verification style, as we work directly with mathematical specifications. Finally, the monoids used are not generated and are specific to the programming language used.

7 Conclusion and Future Work

Felleisen, M.: The calculi of lambda-v-CS conversion: a syntactic theory of control and state in imperative higher-order programming languages. Chlipala, A.: The Bedrock structured programming system: combining generative metaprogramming and Hoare logic in an extensible program verifier.

Protocols Powered by Coq

Furthermore, Lamport, Shostak, and Pease wrote of such programs: "We know of no area of computer science or mathematics where informal reasoning is more likely to lead to error than in the study of this type of algorithm." [54]. Although we use a different model (Castro used I/O automata, see Sect. 7.1, while we use a logic-of-events model, see Sect. 3), our mechanical proof is built on top of his pen-and-paper proof.

2 PBFT Recap

Overview of the Protocol

View change. The view change procedure ensures progress by allowing replicas to change the leader so that they do not wait indefinitely for a faulty primary. We have proved a critical safety property of PBFT, including its garbage collection and view change procedures, which are essential in practical protocols.

Fig. 1. PBFT normal-case (left) and view-change (right) operations

Properties

Here n is the sequence number of the last executed request and d is the digest of the state. 1, V, O, Nσp, where V is the set of 2f + 1 valid view-change messages received by p; O is the set of messages prepared since the latest checkpoint reported in V; and N contains only a special null request whose execution is a no-op.

Differences with Castro’s Implementation

3 Velisarios Model

  • The Logic of Events
  • Messages
  • Authentication
  • Event Orderings
  • Computational Model
  • Assumptions

This identity is used to select the corresponding receiver key to check the authenticity of the data using verify. Otherwise, they return some s where the state of the machine is updated according to the events.

Fig. 2. Outline of formalization

4 Methodology

  • Automated Inductive Reasoning
  • Quorums
  • Certificates
  • Knowledge Theory

One must provide a lak_data2info function to extract the information embedded in some piece of data. Using this predicate, we can then combine the quorum and knowledge theories to prove the following lemma, which captures the fact that if there are two quorums for information nfo1 (known at e1) and nfo2 (known at e2), and the intersection of the two quorums is guaranteed to contain a correct node, then there must be a correct node that knows both nfo1 and nfo2. This lemma follows from the knowledge-propagation lemma and from the overlapping of quorums.
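The counting argument behind overlapping quorums can be checked with a short Python sketch. It assumes the standard PBFT parameters, namely n = 3f + 1 nodes and quorums of size 2f + 1; the function names are hypothetical and are not part of Velisarios.

```python
from itertools import combinations


def min_quorum_overlap(n, q):
    """Smallest possible intersection of two size-q subsets of n nodes."""
    return max(0, 2 * q - n)


def overlap_has_correct_node(f):
    """With n = 3f + 1 and quorums of size 2f + 1 (assumed PBFT parameters),
    two quorums share at least f + 1 nodes, so at most f of the shared
    nodes can be faulty and at least one is correct."""
    n, q = 3 * f + 1, 2 * f + 1
    return min_quorum_overlap(n, q) > f


def brute_force_min_overlap(n, q):
    """Exhaustively compute the minimum overlap (only feasible for small n)."""
    quorums = [set(c) for c in combinations(range(n), q)]
    return min(len(a & b) for a in quorums for b in quorums)
```

For f = 1 there are 4 nodes and quorums of size 3, and any two quorums share at least 2 nodes, one more than the number of faulty nodes.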

5 Verification of PBFT

To prove this lemma, we proved most of the invariants stated by Castro in [14, Appendix A]. Similarly, Castro proved most of his invariants by induction on the length of executions.

6 Extraction and Evaluation

Logics and Models

The HO model was implemented in Isabelle/HOL [22] and used, for example, to verify the Byzantine agreement algorithm EIGbyz [7] for synchronous systems with reliable links. To our knowledge, there is no tool that allows the generation of code from algorithms specified using the HO model.

Tools

Similar to the Verdi framework, PSync uses a notion of global state and supports reasoning based on the multi-sorted first-order Consensus verification logic (CL) [27]. PVS has been widely used for the verification of synchronous systems that tolerate malicious faults, as in [74], to the extent that its design was influenced by these verification efforts [68].

8 Conclusions and Future Work

Disel [75,84] is a verification framework that implements a separation-style program logic and enables compositional verification of distributed systems. Wilcox, J.R., Woos, D., Panchekha, P., Tatlock, Z., Wang, X., Ernst, M.D., Anderson, T.E.: Verdi: a framework for implementing and formally verifying distributed systems.

Static Analysis for Java

Other works have considered how to integrate a numeric domain with analysis of the heap, but they model method calls unsoundly [25] and/or focus on very precise properties that do not scale beyond small programs [23,24]. We found that using summary objects causes significant slowdowns; e.g., the vast majority of the analysis runs that timed out used summary objects.

2 Numeric Static Analysis

In summary, our empirical study provides a large, comprehensive evaluation of the effects of key numeric static analysis design choices on performance, precision, and their trade-offs; it is the first of its kind. At merge points (e.g., after the completion of a conditional), the abstract states of the possible prior states are joined to yield properties that hold regardless of the branch taken.
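The join at a merge point can be sketched for the interval domain in Python. This is a minimal illustration that assumes abstract states are dictionaries from variable names to `(lo, hi)` pairs; the analyses studied here use richer numeric domains as well, and the function names are hypothetical.

```python
def join_interval(a, b):
    """Smallest interval containing both a and b."""
    return (min(a[0], b[0]), max(a[1], b[1]))


def join_state(s1, s2):
    """Pointwise join of two abstract states.

    Only variables constrained in both branches stay constrained
    after the merge point.
    """
    return {v: join_interval(s1[v], s2[v]) for v in s1.keys() & s2.keys()}
```

If the then-branch ends with `x` in [1, 1] and the else-branch with `x` in [3, 3], the state after the conditional has `x` in [1, 3], a property that holds whichever branch was taken.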

3 The Heap

Summary Objects (SO)

Then, after a subsequent assignment of 7 to the field f, the analysis would weakly update o_f with 7, producing the constraints 5 ≤ o_f ≤ 7 in the abstract state. A strong update to x is implemented by forgetting x in the abstract state and then re-adding it constrained to equal the given value.
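The difference between the two kinds of update can be sketched for intervals in Python. The dictionary representation and the names `strong_update` and `weak_update` are assumptions for illustration; `o_f` stands for the variable modeling field f of a summary object o.

```python
def strong_update(state, var, value):
    """Forget var, then re-add it constrained to equal value."""
    new = dict(state)
    new.pop(var, None)  # forget the old constraints on var
    new[var] = (value, value)
    return new


def weak_update(state, var, value):
    """Keep the old values of var possible and add value as a further possibility."""
    lo, hi = state[var]
    new = dict(state)
    new[var] = (min(lo, value), max(hi, value))
    return new
```

Starting from `o_f` in [5, 5], a weak update with 7 yields [5, 7], matching the constraints above, whereas a strong update would yield [7, 7]. A summary object stands for several concrete objects, so only the weak update is safe for it.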

Access Paths (AP)

In this approach, we add a variable o_f to the abstract state, modeling field f of object o, and we add the constraint o_f = n. Reading from a summary object requires expanding the abstract state with a copy of the summary variable and its constraints, creating a constraint on the copy, and then forgetting the copy.

Abstract Object Representation (OR)

Two key advantages of AP over SO are that (1) AP supports strong updates to paths x.f, which are more precise and cheaper than weak updates, and (2) AP may require fewer variables to keep track of, since in our design access paths are mostly local to a method, while points-to sets are computed for the whole program. A read from x.f, if x.f has not previously been assigned, is just a normal read performed after x.f is first strongly updated to the join of the summary reads of o_f for each o ∈ Pt(x).

4 Method Calls

Interprocedural Analysis Order (AO)

The result is that updated summary objects from Cm replace those that were in the original C. Then, on a call, we instantiate each placeholder with the constraints in the caller involving the placeholder's overview location.

Context Sensitivity (CS)

Using (only) accessors does not greatly affect the usual TD/BU trade-off: TD can provide higher precision by adding caller constraints to the analysis of the callee, while BU's lower precision comes with the benefit of analyzing method bodies less often. In the TD analysis, using abstract objects adds a relatively stable overhead to all methods, as they are included in the abstract state of each method.

5 Implementation

6 Evaluation

  • Experimental Setup
  • RQ1: Performance
  • RQ2: Precision
  • RQ3: Tradeoffs

The percentage of configurations that timed out while analyzing a program ranged from 0% (xalan) to 90% (graph). The fastest configurations are all of the form BU-AP-CI-*-INT, varying only in the abstract object representation.

Table 2. Benchmarks and overall results.

8 Conclusion and Future Work

We derive a program semantics that precisely captures data usage by abstracting the program's operational trace semantics, and we express it in a constructive fixpoint form. Finally, we demonstrate the value of expressing such analyses as abstract interpretations by combining them with an existing abstraction of compound data structures such as arrays and lists to detect unused chunks of data.

Fig. 1. Overview of the program semantics presented in the paper. The dependency semantics, derived by abstraction of the trace semantics, is sound and complete for data usage.

2 Trace Semantics

In particular, since input data usage is not a trace property or a subset-closed property [11] (Section 4), we show that a formulation of the semantics using sets of trace sets is necessary for a sound validation of input data usage via fixed point approximation [28]. In the limit we get all infinite traces and all finite traces ending in a final state in Ω.

3 Input Data Usage

The programmer has made two errors, at line 7 and at line 9, which cause the input data stored in the variables english and science to be unused. Based on the input variables english, math, and science (cf. lines 1–3), the program is supposed to check whether a student has passed all three school subjects considered and store the result in the output variable pass (cf. line 11).
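The notion of an unused input can be illustrated with a brute-force check in Python. Both functions below are hypothetical: `passed_buggy` is a stand-in with two bugs of the same flavor and is not the program from the figure, and `unused_inputs` only works for terminating programs over small finite domains, whereas the analysis in the paper reasons about all traces, including non-terminating ones.

```python
from itertools import product


def passed_buggy(english, math, science):
    """Hypothetical buggy check that is meant to test all three subjects."""
    passing = True
    if not english:
        english = False  # bug: should set passing = False
    if not math:
        passing = False
    if not math:  # bug: tests math instead of science
        passing = False
    return passing


def unused_inputs(program, names):
    """Names of boolean inputs whose value never influences the output."""
    unused = []
    for k, name in enumerate(names):
        used = False
        for args in product([False, True], repeat=len(names)):
            flipped = list(args)
            flipped[k] = not flipped[k]
            # The input is used if flipping it alone can change the result.
            if program(*args) != program(*flipped):
                used = True
                break
        if not used:
            unused.append(name)
    return unused
```

Running `unused_inputs(passed_buggy, ["english", "math", "science"])` reports `english` and `science` as unused, since the result depends on `math` alone.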

Fig. 3. Simple program to check if a student has passed three school subjects. The programmer has made two mistakes at line 7 and at line 9, which cause the input data stored in the variables english and science to be unused.

4 Sound Input Data Usage Validation

In the next section, we discuss the challenges to applying the standard abstract interpretation framework that arise from the fact that input data usage cannot be expressed as a trace property. More specifically, in the next section we define a program semantics P that defines exactly which subset of the input variables is not used by a program P.

5 Outcome Semantics

The outcome semantics contains the set of all infinite traces and all sets of finite traces that agree on the value of the output variables in their outcomes. At the limit, we obtain a partition containing the set of all infinite traces and all sets of finite traces that agree with the value v of the output variable in their outcomes.

Fig. 4. First iterates of the outcome semantics Λ• for a single output variable o.

6 Dependency Semantics

Finally, we show that the outcome semantics Λ• is sound and complete to prove that a program does not use (a subset of) its input variables. Now we can use Lemmas 2 and 3 to express the dependency semantics Λ in a constructive fixpoint form (as the union of Λ+ and Λω). Proof (sketch). The proof follows directly from Lemmas 2 and 3.

7 Input Data Usage Abstractions

In particular, since the intermediate state computations are irrelevant for deciding the input data usage property, all sets of traces in γ(α(α•([[P]]science))) are over-approximations of exactly one set in α•([[P]]science) with the same set of initial states and outcome. Thus, in this case we can observe that all trace sets in γ(α(α•([[P]]science))) belong to N{science} and correctly conclude that P does not use the variable science.

8 Secure Information Flow Abstractions

Note that in this case non-interference does not hold since the output of the program depends on some of the input variables. This result shows that the non-interference analysis ΛF is an abstraction of the dependency semantics Λ presented earlier.

9 Strongly Live Variable Abstraction

Thus, strongly live variable analysis can conclude that the input variable science is unused. We now show that strongly live variable analysis is sound for proving that a program does not use (a subset of) its input variables.

10 Syntactic Dependency Abstractions

Finally, the maps at the top of the stack are joined, and the result maps math, bonus, and pass to U, and all other variables to N (cf. Eq. 22). Since the program is about to terminate, we have that Λ ⊆ γQ(ΛQ), by definition of the concretization function γQ (cf. Eq. 23).

11 Piecewise Abstractions

Figures

Fig. 2. Modular 2-product of the program in Fig.1 (slightly simplified). Parameters and local variables have been duplicated, but control flow statements have not
Fig. 4. Construction rules for statement products.
Fig. 8. Program instrumentation for termination leak prevention. We abbreviate while (e) terminates(e c , e r ) do {s} as w.
