The algorithm to calculate weighted quantitative simulation can be used as a similarity measure for service automata or OGs, but has two drawbacks: Firstly, it is not an edit distance. It calculates a value that expresses the similarity between the service automata, but gives no information about the modification actions needed to achieve simulation. Secondly, it does not take formulae of the OG into account. Therefore, a high similarity between a service automaton and an OG would not guarantee deadlock freedom as the example of Fig. 3 demonstrates: The service automaton of the customer is perfectly simulated by the OG but the overall choreography deadlocks.
4.1 Simulation-Based Edit Distance
Before we consider the OG’s formulae, we show how the similarity result of the algorithm of [18] can be transformed into an edit distance. Given two states q1 and q2, Def. 1 determines the best simulation between the transitions of q1 and q2. In addition, one service automaton can stutter (i.e., remain in the same state). The weighted quantitative simulation function calculates the best label matching to maximize the similarity between the root nodes of the service automata. From the transition pairs belonging to the maximum, we can derive the corresponding edit actions (cf. Table 1).
Table 1. Deriving edit actions from transition pairs of Def. 1

transition of S1   transition of S2   resulting edit action      similarity
a                  a                  keep transition a          L(a, a)
a                  b                  modify transition a to b   L(a, b)
a                  ε (stutter)        delete transition a        L(a, ε)
ε (stutter)        a                  insert transition a        L(ε, a)
These are basic edit actions whose similarity is determined by the edge similarity function L. To simplify the representation of a large number of edit actions, the basic edit actions may be grouped into macros that express more complex operations such as swapping or moving edges and nodes, duplicating subgraphs, or partially unfolding loops.
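The mapping of Table 1 can be sketched in a few lines of Python. The function name `edit_action`, the empty string as encoding of ε, and the textual form of the returned actions are illustrative assumptions and not part of Def. 1.

```python
EPS = ""  # assumption: the empty string encodes the stutter label ε


def edit_action(a: str, b: str) -> str:
    """Derive the basic edit action for a transition pair (a, b) as in Table 1."""
    if a == EPS and b == EPS:
        # Table 1 lists no edit action for a pair in which both sides stutter.
        raise ValueError("no edit action defined for (ε, ε)")
    if a == EPS:
        return f"insert transition {b}"
    if b == EPS:
        return f"delete transition {a}"
    if a == b:
        return f"keep transition {a}"
    return f"modify transition {a} to {b}"


# One call per row of Table 1.
actions = [edit_action(a, b) for a, b in [("a", "a"), ("a", "b"), ("a", EPS), (EPS, "a")]]
```

Each returned action would be weighted by the corresponding value L(a, b) of the edge similarity function.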
The simulation-based edit distance does not respect the OG’s formulae. One way to achieve matching would be to first calculate the most similar simulating service using the edit distance for Def. 1 and then, in a second step, to simply add and remove all nodes and edges necessary. However, with the weighted quantitative simulation function of Def. 1, the resulting edit actions (cf. Table 1) only insert or remove edges to existing nodes rather than to new nodes. In general, this approach does not achieve matching with an OG; see Fig. 6 for a counterexample. Inserting nodes as well would still not determine the most similar partner service, because it may yield sub-optimal solutions, as Fig. 7 illustrates.
4.2 Combining Formula-Checking and Graph Similarity
Because a-posteriori formula satisfaction by node insertion yields suboptimal results, we need to modify the algorithm of [18] so that it does not statically take all outgoing transitions of an OG’s state into account, but instead checks every formula-fulfilling subset of outgoing transitions. We therefore need some additional definitions on which to base formula satisfaction and to cover the dynamic presence of OG transitions.
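Definition 2 (Satisfying label set, label permutation). Let $S = [Q_S, \delta_S, F_S, q_{0_S}, I]$ be a service automaton and $O = [Q_O, \delta_O, F_O, q_{0_O}, I]$ an OG, and let $q_1 \in Q_S$ and $q_2 \in Q_O$.
– Define $Sat(\varphi(q_2)) \subseteq \mathcal{P}(I \cap \{b \mid \exists q_2' \in Q_O : q_2 \xrightarrow{b} q_2'\})$ to be the set of all sets of labels of transitions leaving $q_2$ that satisfy formula $\varphi$ of state $q_2$.
– For $\beta \in Sat(\varphi(q_2))$, define $perm(q_1, q_2, \beta) \subseteq (I \cup \{\varepsilon\}) \times (I \cup \{\varepsilon\})$ to be a label permutation of $q_1$, $q_2$ and $\beta$ such that:
  (a) if $q_1 \xrightarrow{a} q_1'$, then $(a, c) \in perm(q_1, q_2, \beta)$ for a label $c \in \beta \cup \{\varepsilon\}$,
  (b) if $q_2 \xrightarrow{b} q_2'$ and $b \in \beta$, then $(d, b) \in perm(q_1, q_2, \beta)$ for a label $d \in I \cup \{\varepsilon\}$,
  (c) $(\varepsilon, \varepsilon) \notin perm(q_1, q_2, \beta)$, and
  (d) if $(a, b) \in perm(q_1, q_2, \beta)$, then $(a, c), (d, b) \notin perm(q_1, q_2, \beta)$ for all labels $c \in \beta \cup \{\varepsilon\}$ and all labels $d \in I \cup \{\varepsilon\}$.
– Define $Perms(q_1, q_2, \beta)$ to be the set of all label permutations of $q_1$, $q_2$ and $\beta$.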
Fig. 6. Matching cannot be achieved solely by transition insertion. The service automaton (a) does not match with the OG (b) because of a missing ?b-branch. In service automaton (c), a loop edge was inserted. However, the state reached by ?b in the OG requires a ?c-branch to be present. After inserting this edge (d), the resulting service automaton is not simulated by the OG (b).
Fig. 7. Adding states to a simulating service automaton may yield sub-optimal results. The service automaton (a) does not match with the OG (b), because the formula (?c ∧ ?d ∧ ?e) is not satisfied. The OG, however, perfectly simulates the service automaton (a), and adding two edges achieves matching (c). However, changing the edge label of (a) from !a to !b also achieves matching, but only requires a single edit action (d).
The set Sat consists of all sets of labels that fulfill a state’s formula. For example, consider the OG in Fig. 3(b): For state q2 of the OG Oagency⊕airline, we have Sat(ϕ(q2)) = {{?confirmation, ?refusal}}. Likewise, Sat(ϕ(q3)) = {{?offer}, {!payment}, {?offer, !payment}}.
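A brute-force way to enumerate Sat is to test every subset of the outgoing labels against the formula. In the sketch below, the function name `satisfying_label_sets` and the encoding of a formula as a Python predicate over label sets are assumptions. The two formulae are also assumed: they are chosen only so that they reproduce the Sat sets quoted above.

```python
from itertools import combinations


def satisfying_label_sets(out_labels, formula):
    """Return all subsets of the outgoing labels that satisfy the formula.

    `formula` is a hypothetical stand-in for ϕ(q2): a predicate on a set of labels.
    """
    labels = sorted(out_labels)
    return {
        frozenset(subset)
        for size in range(len(labels) + 1)
        for subset in combinations(labels, size)
        if formula(frozenset(subset))
    }


# Assumed formula for q2: ?confirmation ∧ ?refusal
sat_q2 = satisfying_label_sets(
    {"?confirmation", "?refusal"},
    lambda s: "?confirmation" in s and "?refusal" in s,
)

# Assumed formula for q3: ?offer ∨ !payment
sat_q3 = satisfying_label_sets(
    {"?offer", "!payment"},
    lambda s: "?offer" in s or "!payment" in s,
)
```

The enumeration is exponential in the number of outgoing transitions of a state, which is acceptable for a sketch but not necessarily for large OGs.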
The set Perms consists of all permutations of outgoing edges of two states. In a permutation, each outgoing edge of a state of the service automaton has to be present as the first element of a pair (a), and each outgoing edge of a state of the OG that is part of the label set β has to be present as the second element of a pair (b). As the number of outgoing edges of the two states may differ, ε-labels can occur in the pairs, but no pair (ε, ε) is allowed (c). Finally, each edge is only allowed to occur once in a pair (d).
For β = {?confirmation, ?refusal} and state q1 of the service automaton S1 in Fig. 3(a), {(?confirmation, ?confirmation), (ε, ?refusal)} is one of the permutations in Perms(q1, q2, β). Another permutation is {(?confirmation, ?refusal), (ε, ?confirmation)}. The permutations can be interpreted like the label pairs of the simulation edit distance: (?confirmation, ?confirmation) describes keeping ?confirmation, (?confirmation, ?refusal) describes changing ?confirmation to ?refusal, and (ε, ?refusal) describes the insertion of a ?refusal transition. Insertion and deletion have to be adapted to avoid incorrect or sub-optimal results (see Fig. 6–7).
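The permutations of this example can be enumerated with a small recursive sketch. The function name `label_permutations` and the empty string as encoding of ε are assumptions. The sketch further assumes deterministic states (each label identifies one edge) and reads condition (d) as allowing ε to occur in several pairs.

```python
EPS = ""  # assumption: the empty string encodes ε


def label_permutations(out_labels, beta):
    """Enumerate label permutations in the sense of Def. 2.

    out_labels: labels of the transitions leaving q1
    beta:       one satisfying label set of q2
    """
    labels = sorted(out_labels)
    result = []

    def extend(i, unused, pairs):
        if i == len(labels):
            # Every label of beta that is still unmatched is inserted: (ε, b).
            result.append(frozenset(pairs) | {(EPS, b) for b in unused})
            return
        a = labels[i]
        extend(i + 1, unused, pairs + [(a, EPS)])          # delete a
        for b in sorted(unused):
            extend(i + 1, unused - {b}, pairs + [(a, b)])  # keep or modify a

    extend(0, frozenset(beta), [])
    return result


perms = label_permutations({"?confirmation"}, {"?confirmation", "?refusal"})
```

Under these assumptions, `perms` contains the two permutations discussed above and a third one, {(?confirmation, ε), (ε, ?confirmation), (ε, ?refusal)}, which deletes the existing transition and inserts both labels of β.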
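Definition 3 (Subgraph insertion, subgraph deletion). Let $S = [Q_S, \delta_S, F_S, q_{0_S}, I]$ be a service automaton and $O = [Q_O, \delta_O, F_O, q_{0_O}, I]$ an OG. Define

\[
\mathit{ins}(q_2) =
\begin{cases}
1, & \text{if } q_2 \in F_O, \\
(1-p) + \max\limits_{\beta \in Sat(\varphi(q_2))} \dfrac{p}{|\beta|} \cdot \sum\limits_{b \in \beta} L(\varepsilon, b) \cdot \mathit{ins}(\delta_O(q_2, b)), & \text{otherwise,}
\end{cases}
\]

\[
\mathit{del}(q_1) =
\begin{cases}
1, & \text{if } q_1 \in F_S, \\
(1-p) + \dfrac{p}{n} \cdot \sum\limits_{q_1 \xrightarrow{a} q_1'} L(a, \varepsilon) \cdot \mathit{del}(q_1'), & \text{otherwise,}
\end{cases}
\]

where $n$ is the number of outgoing edges of $q_1$.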
Functionins(q2) calculates the insertion cost of the optimal subgraph of the OG O beginning at q2 which fulfills the formulae. Likewise,del(q1) calculates the cost of deletion of the whole subgraph of the service automatonS from stateq1. Both functions only depend on one of the graphs; that is, ins and del can be calculated independently from the service automaton and the OG, respectively.
Definition 3 actually does not insert or delete nodes, but only calculates the similarity value of the resulting subgraphs. Only this similarity is needed to find the most similar partner service and the actual edit actions can be easily derived from the state from which nodes are inserted or deleted (cf. Table 1).
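The two functions of Def. 3 translate almost directly into recursive Python. The sketch below rests on several assumptions that are not part of the definition: graphs are encoded as dictionaries with the keys `final`, `delta` and `sat`; the graphs are acyclic, since the naive recursion would not terminate on cycles; an empty label set and a non-final state without outgoing edges contribute 0 to the sum; and the values chosen for L and p are purely illustrative. The deletion function is named `dele` because `del` is a Python keyword.

```python
EPS = ""  # assumption: the empty string encodes ε


def ins(q2, og, L, p):
    """Insertion value of the best formula-fulfilling subgraph of the OG from q2."""
    if q2 in og["final"]:
        return 1.0
    best = 0.0
    for beta in og["sat"][q2]:
        if not beta:
            continue  # assumption: an empty label set contributes nothing
        total = sum(L(EPS, b) * ins(og["delta"][q2][b], og, L, p) for b in beta)
        best = max(best, p / len(beta) * total)
    return (1 - p) + best


def dele(q1, sa, L, p):
    """Deletion value of the whole subgraph of the service automaton from q1."""
    if q1 in sa["final"]:
        return 1.0
    out = sa["delta"].get(q1, {})
    if not out:
        return 1 - p  # assumption: non-final state without outgoing edges
    total = sum(L(a, EPS) * dele(succ, sa, L, p) for a, succ in out.items())
    return (1 - p) + p / len(out) * total


# Hypothetical toy graphs and parameters.
og = {
    "final": {"f"},
    "delta": {"s": {"?x": "f", "?y": "f"}},
    "sat": {"s": [frozenset({"?x"}), frozenset({"?x", "?y"})]},
}
sa = {
    "final": {"e"},
    "delta": {"t": {"!a": "u"}, "u": {"?b": "e", "?c": "e"}},
}
L = lambda a, b: 1.0 if a == b else 0.5  # illustrative edge similarity
p = 0.5                                  # illustrative value of the parameter p

ins_s = ins("s", og, L, p)
del_t = dele("t", sa, L, p)
```

Because `ins` only reads the OG and `dele` only reads the service automaton, both could be computed once per state and cached before any matching is calculated.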
With Def. 2 describing means to respect the OG’s formulae and Def. 3 coping with insertion and deletion, we can finally define the weighted quantitative matching function:
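Definition 4 (Weighted quantitative matching). Let $S = [Q_S, \delta_S, F_S, q_{0_S}, I]$ be a service automaton and $O = [Q_O, \delta_O, F_O, q_{0_O}, I]$ an OG. A weighted quantitative matching is a function $M : Q_S \times Q_O \to [0, 1]$, such that:

\[
M(q_1, q_2) =
\begin{cases}
1, & \text{if } (q_1 \in F_S \wedge q_2 \in F_O), \\
(1-p) + W_1(q_1, q_2), & \text{otherwise,}
\end{cases}
\]

\[
W_1(q_1, q_2) = \max_{\beta \in Sat(\varphi(q_2))} \; \max_{P \in Perms(q_1, q_2, \beta)} \; \frac{p}{|P|} \cdot \sum_{(a,b) \in P} W_2(q_1, q_2, a, b),
\]

\[
W_2(q_1, q_2, a, b) =
\begin{cases}
L(a, b) \cdot M(\delta_S(q_1, a), \delta_O(q_2, b)), & \text{if } (a \neq \varepsilon \wedge b \neq \varepsilon), \\
L(\varepsilon, b) \cdot \mathit{ins}(\delta_O(q_2, b)), & \text{if } a = \varepsilon, \\
L(a, \varepsilon) \cdot \mathit{del}(\delta_S(q_1, a)), & \text{otherwise.}
\end{cases}
\]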
The weighted quantitative matching function is similar to the weighted quantitative simulation function (Def. 1). It recursively compares the states of the service automaton and the OG, but instead of statically taking the OG’s edges into consideration, it uses the formulae and checks all satisfying subsets (W1). Additionally, W2 selects the successor states determined by the labels a and b, or handles the insertion or deletion.
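The recursion of Def. 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the helpers `sat`, `perms`, `ins` and `dele` are passed in as callables and are assumed to implement Def. 2 and Def. 3; the dictionary encoding of the graphs, the values of L and p, and the toy graphs are hypothetical; and the naive recursion only terminates on acyclic graphs.

```python
EPS = ""  # assumption: the empty string encodes ε


def matching(q1, q2, S, O, L, p, sat, perms, ins, dele):
    """M(q1, q2) of Def. 4; sat, perms, ins and dele are assumed helpers."""
    if q1 in S["final"] and q2 in O["final"]:
        return 1.0
    best = 0.0  # W1: maximum over all satisfying label sets and permutations
    for beta in sat(q2):
        for P in perms(q1, q2, beta):
            if not P:
                continue  # assumption: an empty permutation contributes nothing
            total = 0.0
            for a, b in P:  # W2 for each label pair
                if a != EPS and b != EPS:
                    total += L(a, b) * matching(
                        S["delta"][q1][a], O["delta"][q2][b],
                        S, O, L, p, sat, perms, ins, dele)
                elif a == EPS:
                    total += L(EPS, b) * ins(O["delta"][q2][b])
                else:
                    total += L(a, EPS) * dele(S["delta"][q1][a])
            best = max(best, p / len(P) * total)
    return (1 - p) + best


# Hypothetical toy instance: q1 has one ?confirmation edge, q2 requires both labels.
S = {"final": {"s_end"}, "delta": {"q1": {"?confirmation": "s_end"}}}
O = {"final": {"o_end", "q4"},
     "delta": {"q2": {"?confirmation": "o_end", "?refusal": "q4"}}}
C, R = "?confirmation", "?refusal"

value = matching(
    "q1", "q2", S, O,
    L=lambda a, b: 1.0 if a == b else 0.5,  # illustrative edge similarity
    p=0.5,                                  # illustrative value of p
    sat=lambda q2: [frozenset({C, R})],
    perms=lambda q1, q2, beta: [            # stub: permutations for (q1, q2) only
        {(C, C), (EPS, R)},
        {(C, R), (EPS, C)},
        {(C, EPS), (EPS, C), (EPS, R)},
    ],
    ins=lambda q: 1.0,   # stub: all successors in this toy OG are final
    dele=lambda q: 1.0,  # stub: all successors in this toy automaton are final
)
```

With these illustrative values, the maximum is attained by the permutation that keeps ?confirmation and inserts ?refusal, so `value` is 0.5 + (0.5 / 2) · (1 + 0.5) = 0.875.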
4.3 Matching-Based Edit Distance
Again, we can straightforwardly extend the weighted quantitative matching function towards an edit distance, because the permutations indicate how to modify the graph. Keeping and modifying transitions is handled as in Table 1, whereas the insertion and deletion of nodes can be derived from Def. 3.
In fact, the weighted quantitative matching function is not a classical distance.
It expresses the similarity between a service automaton and an OG (i. e., a characterization of many service automata) and is hence not symmetric. We still use the term “edit distance” to express the concept of a similarity measure from which edit actions can be derived.
Consider the example from Fig. 3. During the calculation of M(q1, q2), the permutation {(?confirmation, ?confirmation), (ε, ?refusal)} is considered. The first label pair denotes that the ?confirmation transition is kept unmodified. The second label pair denotes an insertion of a ?refusal transition. The value of this insertion is defined by

\[
L(\varepsilon, \text{?refusal}) \cdot \mathit{ins}(\delta_{O_{agency \oplus airline}}(q_2, \text{?refusal})) = L(\varepsilon, \text{?refusal}) \cdot \mathit{ins}(q_4) = L(\varepsilon, \text{?refusal})
\]

and only depends on the similarity function L.
Fig. 8. Matching-based edit distance applied to the customer’s service. The figure shows the states q5, q6, q7, q1, q8 and the new state q9, with the following edit-action annotations:
keep transition "?offer" to state q6
keep transition "!booking" to state q7
keep transition "!rejection" to state q8
keep transition "!payment" to state q1
keep transition "?confirmation" to state q8
insert transition "?refusal" to new state q9
Figure 8 shows the result of the application of the matching-based edit distance to the service automaton of Fig. 3(a). The states are annotated with edit actions. The service automaton was automatically generated from a BPEL process, and the state in which a modification has to be made can be mapped back to the original BPEL activity. In the example, a receive activity has to be replaced by a pick activity with an additional onMessage branch to receive the refusal message.