SPL Profile
5. Difference Analysis
5.5. Multi Copy Difference Analysis
Comparing more than two copies at the same time is challenging. In particular, the similarity of more than two elements cannot be adequately captured by a single binary decision. For example, with three elements e1, e2, and e3, there are five possible similarity results, as shown in Table 5.2.
Similarity is a transitive relationship; thus, either all elements differ, all are similar, or exactly one pair is similar. A binary decision about all elements would return “true” only if all elements are similar (i.e., column 5 in Table 5.2). However, the three cases of partial similarity are not covered by a binary decision about all elements’ similarity (i.e., columns 2–4 in Table 5.2).
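The count of five results can be checked by brute force. The following is a minimal sketch, not part of the SPLEVO implementation: it enumerates all similar/different assignments for the three pairs and keeps those consistent with transitivity. The names `PAIRS`, `is_consistent`, and `possible_outcomes` are chosen for illustration only.

```python
from itertools import product

PAIRS = [("e1", "e2"), ("e2", "e3"), ("e1", "e3")]

def is_consistent(similar):
    # With three elements, any two pairs share an element, so two similar
    # pairs would force the third pair to be similar as well (transitivity).
    # Exactly two similar pairs is therefore the only inconsistent count.
    return sum(similar.values()) != 2

def possible_outcomes():
    outcomes = []
    for flags in product([False, True], repeat=len(PAIRS)):
        similar = dict(zip(PAIRS, flags))
        if is_consistent(similar):
            outcomes.append(similar)
    return outcomes

outcomes = possible_outcomes()
# Outcomes with exactly one similar pair are the partial-similarity cases.
partial = [o for o in outcomes if sum(o.values()) == 1]
print(len(outcomes), len(partial))
```

Of the eight raw assignments, three are ruled out by transitivity, which leaves the five columns of Table 5.2, three of them partial.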
Iterative element matching. To cope with this challenge, the SPLEVO approach retains the pairwise element matching of each Integration Copy with the Leading Copy and iteratively builds up the match model in several steps, as illustrated in Figure 5.7. With each step, the match model
is extended with further Regular Matches or Single Side Matches, so that the final match model combines the SoftwareModel structures of all analyzed copies rather than of only two. If a SoftwareElement cannot be matched with the Leading Copy, it is compared to SoftwareElements of the previously compared Integration Copies. Here, only those SoftwareElements are compared that exist at the same location and also did not match the Leading Copy. This allows identifying similarities between Integration Copies even without a match to the Leading Copy.
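The iterative build-up can be sketched as follows. This is a simplified illustration under strong assumptions, not the SPLEVO implementation: each copy is a flat dictionary from a location label to an element, similarity defaults to plain equality, and the names `Match`, `build_match_model`, and `similar` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class Match:
    location: str
    leading: Optional[str] = None
    # Integration copy name -> matched element (cardinality 0..*).
    integration: Dict[str, str] = field(default_factory=dict)

def build_match_model(leading, integration_copies, similar=lambda a, b: a == b):
    # Start with one match per element of the Leading Copy.
    matches = [Match(loc, elem) for loc, elem in leading.items()]
    for copy_name, copy in integration_copies.items():
        for loc, elem in copy.items():
            same_loc = [m for m in matches if m.location == loc]
            # Step 1: try to match against the Leading Copy.
            target = next((m for m in same_loc
                           if m.leading is not None and similar(m.leading, elem)), None)
            # Step 2: otherwise compare with elements of earlier Integration
            # Copies at the same location that did not match the Leading Copy.
            if target is None:
                target = next((m for m in same_loc if m.leading is None and
                               any(similar(e, elem) for e in m.integration.values())), None)
            # Step 3: no partner found, so create a new Single Side Match.
            if target is None:
                target = Match(loc)
                matches.append(target)
            target.integration[copy_name] = elem
    return matches

leading = {"A.foo": "v1", "A.bar": "v1"}
copies = {
    "copy1": {"A.foo": "v1", "A.baz": "x"},
    "copy2": {"A.foo": "v1", "A.baz": "x", "A.bar": "v2"},
}
model = build_match_model(leading, copies)
summary = [(m.location, m.leading, sorted(m.integration)) for m in model]
```

In this example, `A.baz` exists in both Integration Copies but not in the Leading Copy, so the second copy's element joins the Single Side Match created for the first copy instead of producing a separate match.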
Difference derivation. Afterwards, the difference derivation still needs to check for Single Side Matches. In the context of multi-copy difference analysis, the definitions of Regular and Single Side Matches remain valid, but a Match element must now be able to reference more than one integration SoftwareElement (i.e., the cardinality of the reference Match.integration must be changed from 0..1 to 0..∗). Thus, a Single Side Match now indicates that a SoftwareElement exists either in the Leading Copy only or in one or more Integration Copies only. Accordingly, if a Single Side Match refers to more than one Integration Copy, a separate Difference element of type ADD must be created for each referenced integration element. Similarly, for Regular Matches with a SoftwareElement that exists in at least one but not all Integration Copies, a Difference element of type DELETE is created for each Integration Copy that does not contain the element.
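The two rules for ADD and DELETE can be sketched as below. This is an illustrative sketch only: `Match`, `Difference`, and `derive_differences` are hypothetical names, and elements are plain strings. The text does not spell out how a leading-only Single Side Match is handled with several copies; the sketch assumes one DELETE per Integration Copy, mirroring the rule for partially matched Regular Matches.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class Match:
    leading: Optional[str] = None
    # Integration copy name -> element; the 0..* reference described above.
    integration: Dict[str, str] = field(default_factory=dict)

@dataclass(frozen=True)
class Difference:
    kind: str     # "ADD" or "DELETE"
    element: str  # the added or deleted element
    copy: str     # Integration Copy the difference refers to

def derive_differences(match: Match, copy_names: List[str]) -> List[Difference]:
    if match.leading is None:
        # Element exists in Integration Copies only: one ADD per referenced element.
        return [Difference("ADD", elem, name)
                for name, elem in match.integration.items()]
    # Element exists in the Leading Copy: one DELETE per Integration Copy
    # lacking it (assumption: this also covers the leading-only case).
    return [Difference("DELETE", match.leading, name)
            for name in copy_names if name not in match.integration]

copy_names = ["c1", "c2", "c3"]
added = derive_differences(Match(None, {"c1": "x", "c2": "x"}), copy_names)
deleted = derive_differences(Match("a", {"c1": "a"}), copy_names)
unchanged = derive_differences(Match("a", {"c1": "a", "c2": "a", "c3": "a"}), copy_names)
```

A Regular Match that covers all Integration Copies yields no Difference elements, while partial coverage yields one DELETE per missing copy.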
Model adaptations. To realize this approach, the metamodel of the DifferenceModel requires an adaptation, as already mentioned above: Match elements must be able to reference more than one integration SoftwareElement, as the same element might be detected in more than one Integration Copy.
Conclusion. The strategy presented above allows for comparing more than two product copies without requiring a full pairwise comparison between all copies. At the same time, it retains binary similarity decisions between pairs of elements, which reduces the complexity of each similarity decision and enables more precise decisions, as exemplified in Table 5.2.
[Figure 5.7: diagram not reproduced. Labels recoverable from the diagram: Software Model 1 (leading), Software Model 2 (integration), Software Model 3 (integration), Match Model, Diff Model, Variation Point Model (VPM); steps Initialization, Matching (Pair 1), Matching (Pair 2), Diffing, Post-Processing. Legend: M = Match element, D = Difference element, VP = Variation Point, V = Variant, VPG = Variation Point Group.]
Figure 5.7.: Multi-copy difference analysis concept (software model extraction left out for the sake of brevity)
Algorithm 1: Software Model Matching
input : SoftwareModel: sm_l // of Leading Copy l
        SoftwareModel: sm_i // of Integration Copy i
output: Set<Match>: rootMatches ← ∅
ScopeFilter(sm_l) // Filter resources and elements out of scope
ScopeFilter(sm_i)
ElementTypeFilter(sm_l) // Filter behavior irrelevant elements
ElementTypeFilter(sm_i)
Set<Resource>: matchingCandidates ← sm_i.resources
foreach Resource: r_l ∈ sm_l.resources do
    Resource: r_i ← BestMatchResource(r_l, sm_l.resources, matchingCandidates)
    if r_i != null then
        SoftwareElement: se_l ← r_l.root
        SoftwareElement: se_i ← r_i.root
        Match: m ← Match(se_l, se_i) // Regular Match
        m.submatches ← SubMatchTraversing(se_l, se_i) // Recursion
        rootMatches ← rootMatches ∪ m
        matchingCandidates ← matchingCandidates \ r_i
    else
        rootMatches ← rootMatches ∪ Match(r_l.root, null) // Single Side Match
    end
end
foreach r_i ∈ matchingCandidates do // remaining candidates
    rootMatches ← rootMatches ∪ Match(null, r_i.root) // Single Side Match
end
Algorithm 2: Sub Match Traversing
input : SoftwareElement: se_l // of Leading Copy l
        SoftwareElement: se_i // of Integration Copy i
output: Set<Match>: submatches ← ∅
Set<SoftwareElement>: matchCandidates ← se_i.childElements
foreach SoftwareElement: ce_l in se_l.childElements do
    foreach SoftwareElement: ce_i in matchCandidates do
        if SimilarityCheck(ce_l, ce_i) == true then
            Match: m ← Match(ce_l, ce_i) // Regular Match
            m.submatches ← SubMatchTraversing(ce_l, ce_i) // Recursion
            submatches ← submatches ∪ m
            matchCandidates ← matchCandidates \ ce_i
            continue with next leading ce_l
        end
    end
    submatches ← submatches ∪ Match(ce_l, null) // Create Single Side Match
end
foreach ce_i in matchCandidates do // remaining candidates
    submatches ← submatches ∪ Match(null, ce_i) // Create Single Side Match
end
return submatches
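Algorithm 2 translates almost line by line into executable code. The following is a minimal sketch, assuming elements are simple named tree nodes and that two elements are similar when their names are equal; the real SimilarityCheck is technology-specific, and the class and function names here are illustrative.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass(eq=False)  # identity-based equality, so list.remove targets one node
class Element:
    name: str
    children: List["Element"] = field(default_factory=list)

@dataclass(eq=False)
class Match:
    leading: Optional[Element]
    integration: Optional[Element]
    submatches: List["Match"] = field(default_factory=list)

def similarity_check(a: Element, b: Element) -> bool:
    # Assumption for this sketch: equal names mean similar elements.
    return a.name == b.name

def sub_match_traversing(se_l: Element, se_i: Element) -> List[Match]:
    submatches = []
    candidates = list(se_i.children)
    for ce_l in se_l.children:
        for ce_i in candidates:
            if similarity_check(ce_l, ce_i):
                m = Match(ce_l, ce_i)  # Regular Match
                m.submatches = sub_match_traversing(ce_l, ce_i)  # recursion
                submatches.append(m)
                candidates.remove(ce_i)
                break  # continue with the next leading child
        else:
            submatches.append(Match(ce_l, None))  # Single Side Match
    # Remaining candidates exist in the Integration Copy only.
    submatches.extend(Match(None, ce_i) for ce_i in candidates)
    return submatches

lead = Element("A", [Element("foo"), Element("bar")])
integ = Element("A", [Element("bar"), Element("baz")])
pairs = [(m.leading.name if m.leading else None,
          m.integration.name if m.integration else None)
         for m in sub_match_traversing(lead, integ)]
```

The `for ... else` construct plays the role of "continue with next leading ce_l": the `else` branch runs only when no candidate matched.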
Algorithm 3: Recursive Difference Derivation
input : DifferenceModel: dm // the match model
output: DifferenceModel: dm // the match model with difference elements
foreach Match: m ∈ dm.rootMatches do
    DetectDifferences(m)
end
Function DetectDifferences(Match: m) is
    if m.leading != null AND m.integration != null then // Regular Match
        foreach Match: m_sub ∈ m.submatches do
            DetectDifferences(m_sub)
        end
    else // Single Side Match
        if BelowMinGranularity(m) then
            Match: m_p ← FindCoarseGrainEnoughParent(m)
            Difference: d ← Difference(m_p.leading, CHANGE)
            m_p.differences ← m_p.differences ∪ d
        else if m.leading == null then
            Difference: d ← Difference(m.integration, ADD)
            Match: m_p ← m.parent
            if m_p == null then m_p ← m
            m_p.differences ← m_p.differences ∪ d
        else if m.integration == null then
            Difference: d ← Difference(m.leading, DELETE)
            Match: m_p ← m.parent
            m_p.differences ← m_p.differences ∪ d
        end
    end
end
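The recursion of Algorithm 3 can be sketched in a few lines. This is an illustration under assumptions, not the actual implementation: granularity is modeled by a hypothetical `kind` label with `COARSE_KINDS` as the assumed minimum granularity, differences are plain tuples, and every fine-grained match is assumed to have a coarse-grained ancestor.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

COARSE_KINDS = {"class", "method", "statement"}  # assumed minimum granularity

@dataclass(eq=False)
class Match:
    kind: str
    leading: Optional[str]
    integration: Optional[str]
    parent: Optional["Match"] = None
    submatches: List["Match"] = field(default_factory=list)
    differences: List[Tuple[str, Optional[str]]] = field(default_factory=list)

    def add(self, child: "Match") -> "Match":
        child.parent = self
        self.submatches.append(child)
        return child

def find_coarse_enough_parent(m: Match) -> Match:
    mp = m.parent  # assumes a coarse-grained ancestor exists
    while mp.kind not in COARSE_KINDS:
        mp = mp.parent
    return mp

def detect_differences(m: Match) -> None:
    if m.leading is not None and m.integration is not None:  # Regular Match
        for sub in m.submatches:
            detect_differences(sub)
    elif m.kind not in COARSE_KINDS:  # below minimum granularity
        mp = find_coarse_enough_parent(m)
        mp.differences.append(("CHANGE", mp.leading))
    elif m.leading is None:
        mp = m.parent if m.parent is not None else m
        mp.differences.append(("ADD", m.integration))
    else:
        # The pseudocode does not guard a missing parent in this branch;
        # the sketch applies the same fallback as for ADD (assumption).
        mp = m.parent if m.parent is not None else m
        mp.differences.append(("DELETE", m.leading))

root = Match("class", "A", "A")
foo = root.add(Match("method", "foo", "foo"))
s1 = foo.add(Match("statement", "s1", "s1"))
s1.add(Match("expression", None, "x + 1"))  # too fine-grained on its own
root.add(Match("method", None, "bar"))      # only in the Integration Copy
root.add(Match("method", "baz", None))      # only in the Leading Copy
detect_differences(root)
```

The added expression is reported as a CHANGE of its enclosing statement, while the added and deleted methods are attached to the class match as ADD and DELETE.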
Algorithm 4: Derived Copy Cleanup (for Java technology)
input : DifferenceModel: dm
output: DifferenceModel: dm // without Derived Copy false positives
foreach Difference: d ∈ dm do // Collected by traversing the Match Element tree
    if d.type == DELETE AND TypeOf(d.changedElement) ∈ {Field, Method, Import} then
        Match: m ← d.match.parent
        if TypeOf(m.leading) ∈ {Class} then
            Class: class_l ← m.leading
            Class: class_i ← m.integration
            if class_i extends class_l then // Java inheritance
                dm.differences ← dm.differences \ d
            end
        end
    end
end
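The cleanup rule can be illustrated with a small sketch. It is not the SPLEVO implementation: the classes `ClassElement`, `Match`, and `Difference` are hypothetical, the element type is a plain string, and "extends" is approximated by comparing the integration class's direct superclass name with the leading class's name.

```python
from dataclasses import dataclass
from typing import List, Optional

MEMBER_TYPES = {"Field", "Method", "Import"}

@dataclass(eq=False)
class ClassElement:
    name: str
    superclass: Optional[str] = None  # direct superclass name, if any

@dataclass(eq=False)
class Match:
    leading: object
    integration: object
    parent: Optional["Match"] = None

@dataclass(eq=False)
class Difference:
    kind: str          # "ADD", "DELETE" or "CHANGE"
    element_type: str  # e.g. "Field", "Method", "Import"
    match: Match

def derived_copy_cleanup(differences: List[Difference]) -> List[Difference]:
    kept = []
    for d in differences:
        parent = d.match.parent
        inherited = (
            d.kind == "DELETE"
            and d.element_type in MEMBER_TYPES
            and parent is not None
            and isinstance(parent.leading, ClassElement)
            and isinstance(parent.integration, ClassElement)
            # Assumption: only direct inheritance is checked in this sketch.
            and parent.integration.superclass == parent.leading.name
        )
        if not inherited:  # drop false positives caused by inheritance
            kept.append(d)
    return kept

derived = Match(ClassElement("Base"), ClassElement("Derived", "Base"))
plain = Match(ClassElement("X"), ClassElement("X"))
d1 = Difference("DELETE", "Method", Match("run()", None, parent=derived))
d2 = Difference("ADD", "Method", Match(None, "stop()", parent=derived))
d3 = Difference("DELETE", "Field", Match("count", None, parent=plain))
cleaned = derived_copy_cleanup([d1, d2, d3])
```

Only `d1` is removed: the method is still available in the derived class through inheritance, so reporting it as deleted would be a false positive.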
Algorithm 5: Model Condensation
input : DifferenceModel: dm
output: DifferenceModel: dm // reduced in size
foreach Match: m ∈ dm.rootMatches do
    CondenseMatch(m)
end
Function CondenseMatch(Match: m) is
    foreach Match: sm ∈ m.submatches do
        CondenseMatch(sm)
    end
    if m.submatches == ∅ ∧ m.differences == ∅ then
        Match: m_p ← m.parent
        m_p.submatches ← m_p.submatches \ m
    end
end
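A minimal sketch of the condensation step follows. It uses a hypothetical `Match` class and passes the parent as a parameter instead of storing it. As an assumption of this sketch, root matches are left in place even when empty, because the pseudocode does not say how a match without a parent is removed.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass(eq=False)
class Match:
    name: str
    submatches: List["Match"] = field(default_factory=list)
    differences: List[str] = field(default_factory=list)

def condense_match(m: Match, parent: Optional[Match] = None) -> None:
    # Iterate over a copy, because children may remove themselves from the list.
    for sub in list(m.submatches):
        condense_match(sub, m)
    # Post-order: a match is dropped once it has neither submatches nor differences.
    if not m.submatches and not m.differences and parent is not None:
        parent.submatches.remove(m)

root = Match("A", [
    Match("foo", [Match("s1"), Match("s2")]),            # nothing changed below foo
    Match("bar", [Match("s3", differences=["CHANGE"])]), # change inside bar
])
condense_match(root)
remaining = [m.name for m in root.submatches]
```

Because children are processed before their parent, the whole unchanged subtree below `foo` disappears in a single pass, while `bar` survives through its changed statement.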
#    Requirement                                   Supported by                                                Section
R1   Support Independent and Derived Copies        Derived Copy Cleanup                                        5.3.3.1
R2   Consider Copy Renaming Conventions            Normalization during matching phase                         5.3.1
R3   Support Intended Variability Mechanisms       Granularity level-aware difference derivation               5.3.2
R4   Allow for Configuration of Analysis Scope     ScopeFilter during matching                                 5.3.1
R5   Analyze Independent Source Directories        No assumptions except software model structure              5.3.1
R6   Favor False Positives over False Negatives    Strict hierarchical comparison                              5.3.1.2
R7   Provide Binary Decision                       Omit heuristics                                             5.3.1.2
R8   Support Heterogeneous Software Artifacts      Overall generic algorithm with explicit adaptation points
Table 5.1.: SPLEVO Difference Analysis: Design decisions for consolidation requirements
Algorithm 6: Variation Point Initialization, part 1
input : DifferenceModel: dm
output: VariationPointModel: vpm
foreach Difference: d ∈ dm do // Collected by traversing the Match tree
    VariationPointGroup: vpg ← CreateVariationPointGroup(d)
    vpm.variationPointGroups ← vpm.variationPointGroups ∪ vpg
end
Function CreateVariationPointGroup(Difference: d) is
    VariationPointGroup: vpg
    VariationPoint: vp ← CreateVariationPoint(d)
    vpg.variationPoints ← vpg.variationPoints ∪ vp
    vpg.id ← vp.location.getLabel()
    return vpg
end
Function CreateVariationPoint(Difference: d) is
    VariationPoint: vp
    vp.variants ← CreateVariants(d)
    vp.location ← DetermineVariationPointLocation(d)
    return vp
end
Algorithm 7: Variation Point Initialization, part 2
Function CreateVariants(Difference: d) is
    if d.type == ADD then
        Variant: v ← Variant(d.changedElement, leading = false)
        return ∅ ∪ v
    else if d.type == DELETE then
        Variant: v ← Variant(d.changedElement, leading = true)
        return ∅ ∪ v
    else if d.type == CHANGE then
        Variant: v_l ← Variant(d.match.leading, leading = true)
        Variant: v_i ← Variant(d.match.integration, leading = false)
        return ∅ ∪ v_l ∪ v_i
    end
end
Function DetermineVariationPointLocation(Difference: d) is
    if d.match.leading != null then // prefer the Leading Copy’s SoftwareElement
        return d.match.leading
    else
        return d.match.integration
    end
end
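Algorithms 6 and 7 together can be sketched as follows. This is an illustrative reading with hypothetical classes: elements are plain strings, `getLabel()` is approximated by `str()`, and `d.match` is taken to be the Match element that holds the Difference.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Match:
    leading: Optional[str]
    integration: Optional[str]

@dataclass
class Difference:
    kind: str  # "ADD", "DELETE" or "CHANGE"
    changed_element: Optional[str]
    match: Match

@dataclass
class Variant:
    element: Optional[str]
    leading: bool

@dataclass
class VariationPoint:
    location: Optional[str]
    variants: List[Variant]

@dataclass
class VariationPointGroup:
    id: str
    variation_points: List[VariationPoint]

def create_variants(d: Difference) -> List[Variant]:
    if d.kind == "ADD":
        return [Variant(d.changed_element, leading=False)]
    if d.kind == "DELETE":
        return [Variant(d.changed_element, leading=True)]
    if d.kind == "CHANGE":
        return [Variant(d.match.leading, leading=True),
                Variant(d.match.integration, leading=False)]
    raise ValueError(f"unknown difference type: {d.kind}")

def determine_location(d: Difference) -> Optional[str]:
    # Prefer the Leading Copy's element as the location.
    return d.match.leading if d.match.leading is not None else d.match.integration

def initialize_variation_points(differences: List[Difference]) -> List[VariationPointGroup]:
    groups = []
    for d in differences:
        vp = VariationPoint(determine_location(d), create_variants(d))
        # One group per variation point; str() stands in for getLabel().
        groups.append(VariationPointGroup(str(vp.location), [vp]))
    return groups

diffs = [
    Difference("CHANGE", None, Match("stmt", "stmt'")),
    Difference("ADD", "bar()", Match("A", "A")),
]
groups = initialize_variation_points(diffs)
```

Each Difference initially yields its own Variation Point Group with exactly one Variation Point: two variants for a CHANGE, one for an ADD or DELETE.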
Element Pairs    Possible Similarities
                 1   2   3   4   5
(e1,e2)          d   s   d   d   s
(e2,e3)          d   d   s   d   s
(e1,e3)          d   d   d   s   s
Table 5.2.: SPLEVO Difference Analysis: Similarity examples for three software elements (e_n = software elements, d = different, s = similar)