
Chapter 1 INTRODUCTION

1.4. The state of the art of process mining challenges

1.4.4 Addressing Concept Drift

Bose et al. (2014) introduced the topic of concept drift in process mining and proposed a generic framework and a set of features for detecting changes in event logs and localising changes in a process. They demonstrated that their concept drift approach can effectively detect heterogeneity among cases caused by process changes.
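The window-based idea behind this kind of change detection can be illustrated with a minimal sketch. It is not the feature set or the statistical testing used by Bose et al. (2014): it assumes an event log given as a chronologically ordered list of traces, a single hypothetical feature (the fraction of traces in which one activity is directly followed by another), and a plain difference between two adjacent windows instead of a hypothesis test. The names `follows_rate` and `locate_change` and the window size are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Trace = Sequence[str]


def follows_rate(traces: Sequence[Trace], a: str, b: str) -> float:
    """Fraction of traces in which activity `a` is directly followed by `b`."""
    if not traces:
        return 0.0
    hits = sum(
        any(x == a and y == b for x, y in zip(t, t[1:]))
        for t in traces
    )
    return hits / len(traces)


def locate_change(log: List[Trace], a: str, b: str, window: int) -> Tuple[int, float]:
    """Slide two adjacent windows over the log and return (index, gap) for the
    position where the feature differs most between the window before the
    index and the window starting at the index."""
    best_index, best_gap = -1, 0.0
    for i in range(window, len(log) - window + 1):
        before = follows_rate(log[i - window:i], a, b)
        after = follows_rate(log[i:i + window], a, b)
        gap = abs(before - after)
        if gap > best_gap:
            best_index, best_gap = i, gap
    return best_index, best_gap


# Toy log (an assumption): after the sixth case, activities B and C swap order.
log = [("A", "B", "C")] * 6 + [("A", "C", "B")] * 6
index, gap = locate_change(log, "A", "B", window=4)
print(index, gap)
```

On this toy log the largest gap is found at the seventh case, where the ordering of B and C changes. A realistic detector would compare feature populations with a statistical test and would have to choose the window size carefully.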

Rekhadevi and Appini (2015) described an idea float based framework and specific strategies for identifying when a procedure changes and for delimiting the parts of the procedure that have changed. They demonstrated that process changes can be managed once the idea floats have been determined.

Nithya et al. (2015) used concept drift to determine agent guilt. They proposed a framework based on data strategies across agents to increase the likelihood of determining whether data has been leaked. The concept drift handling system presented in that paper can be used to identify changes in real-life event logs even with a small number of cases.

Raviteja Pochiraju and Kumar (2015) proposed a generic approach and particular strategies for identifying the parts of the process that have changed once a method is modified. The framework is based on various area units that characterise the relationships among activities in order to discover variations.

Aruna and Laxmi Priya (2015) introduced the first online procedure to identify and handle concept drift. The proposed system relies on both abstract interpretation and sequential sampling, combined with the new data stream approaches. Their results also demonstrated that it is possible to efficiently handle non-homogeneous cases generated by process changes.

Li and Kang (2015) proposed a new process mining procedure to rebuild a workflow process whose instances have deviated. The system first builds a Markov transition matrix by analysing the workflow log, and then applies a multi-step workflow mining algorithm to discover the structural relationships between activities. The approach has been shown to be applicable.
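The first step of such an approach, estimating a transition matrix from a workflow log, can be sketched as follows. This is only an illustration under stated assumptions, not the algorithm of Li and Kang (2015): it assumes a first-order model in which the probability of the next activity depends only on the current one, and it estimates each probability as a row-normalised count of directly-follows pairs. The function name `transition_matrix` and the activity names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Sequence


def transition_matrix(log: Sequence[Sequence[str]]) -> Dict[str, Dict[str, float]]:
    """Estimate first-order transition probabilities between activities."""
    counts: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for trace in log:
        # Count every pair (current activity, directly following activity).
        for current, following in zip(trace, trace[1:]):
            counts[current][following] += 1

    matrix: Dict[str, Dict[str, float]] = {}
    for current, row in counts.items():
        total = sum(row.values())
        # Normalise each row so that outgoing probabilities sum to 1.
        matrix[current] = {nxt: n / total for nxt, n in row.items()}
    return matrix


# Toy workflow log (an assumption for illustration).
log = [
    ["register", "check", "approve"],
    ["register", "check", "reject"],
    ["register", "check", "approve"],
    ["register", "approve"],
]
matrix = transition_matrix(log)
print(matrix["register"])
print(matrix["check"])
```

In this toy log, the rare transition from "register" directly to "approve" receives a low probability, which is the kind of signal a mining step could use to separate regular structure from deviating instances.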

Hompes et al. (2015a) proposed a trace clustering approach based on the Markov cluster (MCL) algorithm for detecting common and deviating behaviour based on a set of selected perspectives. In this technique, trace clustering and outlier detection are combined in order to find mainstream and deviating behaviour. The process context is taken into account by using both control-flow and case data, so that both common and exceptional behaviour can be found and explained. However, the MCL algorithm is non-parametric in the number of clusters: the number of clusters cannot be specified directly and depends on the expansion and inflation parameters, which are set manually.

This work was extended in Hompes et al. (2015b) by providing a comparative trace clustering method that is capable of detecting changing behaviour in a process by using both control-flow and case data. The approach compares the clusterings constructed for two selected fragments of an event log in order to detect change points. The comparison covers differences in behaviour over time as well as between distinct case groups, e.g., cases handled by different resources.
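To make the role of the expansion and inflation parameters concrete, the following is a minimal pure-Python sketch of the Markov cluster algorithm itself. It is not the trace clustering implementation of Hompes et al. (2015a): it assumes a small symmetric similarity matrix between traces as input, a fixed number of iterations instead of a convergence check, and a hypothetical cut-off of `1e-6` when reading clusters from the final matrix.

```python
from typing import List, Set

Matrix = List[List[float]]


def normalise_columns(m: Matrix) -> Matrix:
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def multiply(a: Matrix, b: Matrix) -> Matrix:
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_cluster(similarity: Matrix, expansion: int = 2,
                   inflation: float = 2.0, iterations: int = 50) -> List[Set[int]]:
    n = len(similarity)
    # Add self-loops, then turn every column into a probability distribution.
    m = normalise_columns(
        [[similarity[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    )
    for _ in range(iterations):
        # Expansion: raise the matrix to the given power (random-walk steps).
        expanded = m
        for _ in range(expansion - 1):
            expanded = multiply(expanded, m)
        # Inflation: element-wise power, then rescale the columns; this
        # strengthens strong connections and weakens weak ones.
        m = normalise_columns([[v ** inflation for v in row] for row in expanded])

    # Each distinct set of columns that a row still reaches forms one cluster.
    clusters: List[Set[int]] = []
    for row in m:
        members = {j for j, v in enumerate(row) if v > 1e-6}
        if members and members not in clusters:
            clusters.append(members)
    return clusters


# Toy similarity matrix (an assumption): traces 0-2 resemble each other,
# traces 3-5 resemble each other, and the two groups share nothing.
similarity = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
clusters = markov_cluster(similarity)
print(clusters)
```

The number of clusters is never passed in; it emerges from the matrix and from `expansion` and `inflation`. In general, a higher inflation value tends to produce more, smaller clusters, which is why the manual choice of these parameters matters.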

Lu et al. (2016b) proposed a method based on mappings between events to detect deviating events by identifying frequent similar behaviour and dissimilar behaviour among executed process instances, without discovering any normative model.

Kakkad and Sheikh (2016) proposed a generic framework to analyse process changes based on event logs. The framework consists of different feature sets that characterise the relationships among activities in the event log in order to detect changes and identify the regions of change in a process.

Sethi and Kantardzic (2017) presented the Margin Density Drift Detection (MD3) algorithm, which is able to accurately detect concept drift from unlabeled streaming data. The algorithm uses the number of samples in a classifier's region of uncertainty (its margin) as a metric for detecting drift. It is robust to stray changes in the data distribution, is a reliable substitute for supervised drift detectors, and can be used in a variety of data stream environments.
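The margin density signal described above can be sketched in a few lines. This is a simplified illustration, not the published MD3 algorithm: it assumes a fixed linear classifier with hypothetical weights, a margin defined as the region where the absolute score is at most 1, and a fixed threshold on the change in margin density. The names `margin_density` and `drift_suspected` and all numeric values are assumptions.

```python
from typing import Sequence


def margin_density(samples: Sequence[Sequence[float]], weights: Sequence[float],
                   bias: float, margin: float = 1.0) -> float:
    """Fraction of unlabeled samples whose score |w.x + b| lies inside the margin."""
    if not samples:
        return 0.0
    inside = sum(
        abs(sum(w * x for w, x in zip(weights, s)) + bias) <= margin
        for s in samples
    )
    return inside / len(samples)


def drift_suspected(reference: float, current: float, threshold: float) -> bool:
    """Flag drift when margin density moves away from its reference value."""
    return abs(current - reference) > threshold


# Hypothetical linear classifier that separates on the first attribute only.
weights, bias = [1.0, 0.0], 0.0

# Reference batch: most samples lie far from the decision boundary.
reference_batch = [[-3.0, 0.0], [-2.0, 1.0], [2.0, 0.5], [3.0, -1.0], [0.5, 0.0]]
# Later batch: samples crowd into the region of uncertainty, no labels needed.
later_batch = [[-0.5, 0.0], [0.2, 1.0], [0.9, 0.5], [3.0, -1.0], [-0.1, 0.0]]

reference = margin_density(reference_batch, weights, bias)
current = margin_density(later_batch, weights, bias)
print(reference, current, drift_suspected(reference, current, threshold=0.3))
```

No class labels are used at any point, which is what makes the signal usable on unlabeled streams. A change in the data that leaves the margin untouched would not alter the density, consistent with the robustness to stray distribution changes noted above.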

The papers cited above proposed solutions for dealing with concept drift. Nevertheless, most of these works, with the exception of Hompes et al. (2015a, 2015b), considered changes only from the control-flow perspective, whereas the data and resource perspectives are equally essential for gaining more insight. Hence, more methods that allow changes to be detected from other perspectives need to be established. Moreover, drift detection was mostly performed in an offline setting, although it is also very important for online analysis. In addition, while working on concept drift, researchers faced some issues that still need to be addressed. See Table 1.4.

Table 1.4. Summary of the approaches used to deal with concept drift

| Paper Ref. | Used methodology | Outcome | Limitation |
| --- | --- | --- | --- |
| Bose et al. (2014) | Generic framework and set of features | Detect changes in event logs and localise changes in a process | Control-flow perspective only. Challenges encountered: (1) change-pattern specific features; (2) feature selection; (3) holistic approaches; (4) recurring drifts; (5) change process discovery; (6) sample complexity; (7) online (on-the-fly) drift detection |
| Rekhadevi et al. (2015) | Idea float based framework | Process changes can be managed with the identification of idea floats | |
| Raviteja et al. (2015) | Generic approach and particular strategies | Discover variations | |
| Aruna et al. (2015) | Online procedure | Efficiently handle non-homogeneous cases generated by process changes | |
| Li et al. (2015) | Process reconstruction approach based on the Markov transition matrix of the event log | Rebuild the workflow process that faced deviations of workflow instances | |
| Nithya et al. (2015) | Agent guilt identification based framework | Determine changes in real-life event logs even with an insignificant number of cases | Control-flow perspective only |
| Hompes et al. (2015a) | Markov cluster algorithm based trace clustering approach | Detect mainstream and deviating behaviour | The expansion/inflation parameter of the MCL algorithm is set manually |
| Hompes et al. (2015b) | Comparative trace clustering approach | Detect differences in behaviour | Analysis process automation and changes visualization are required |
| Kakkad et al. (2016) | Generic framework and set of features | Detect and localize the changes in a process | Control-flow perspective only |
| Lu et al. (2016b) | Mappings between events based approach | Detect deviating events without discovering a normative model | The approach accuracy is slightly lower when deviations are frequent and more structured. Control-flow only |
| Sethi et al. (2017) | Margin Density Drift Detection (MD3) algorithm | Accurately detect concept drift from unlabeled streaming data | Detect drifts with significantly fewer false alarms. Control-flow only |