BCO5501 Business Process Engineering
The main purpose of this essay is to review a particular research study on the trace clustering technique and to compare its findings with those of other related research studies. The research method used is the analysis and discussion of these related studies in order to draw conclusions about the main research topic, trace clustering. The research study reviewed is SECPI: Searching for Explanations for Clustered Process Instances. The findings of each research study used in this essay are discussed, and the trace clustering technique is examined in detail from different perspectives.
A short summary of the main points of the essay is also provided. The research studies compared with the main reviewed study are Discovering Deviating Cases and Process Variants Using Trace Clustering, A Framework for Trace Clustering and Concept-drift Detection in Event Streams, and Detection of Temporal Changes In Business Processes Using Cluster Techniques.
The main reviewed research study of this essay is SECPI: Searching for Explanations for Clustered Process Instances. The study discusses cluster techniques, how clustering solutions are produced, and the identification of the set of control-flow features without which a process instance would not be placed in its cluster. The flaws of trace clustering techniques are also discussed. Trace clustering is described in the study as a way of dealing with a common problem of event logs, namely that they usually contain several different types of behavior. These different types of behavior can also be termed variants of the process.
Two types of cluster techniques are discussed in this research study. The first type is based on the principle of distance, and the second type follows a model-driven approach. The main problem stated in the study is the difficulty of finding the real reasons why an event log is partitioned in a specific manner. Trace clustering is also described as a state-of-the-art technique in this research study (De Weerdt and vanden Broucke, 2014).
The problem statement introduced above is discussed in detail in the research study from different perspectives. The difficulty of understanding why an event log is partitioned in a specific manner is discussed from a model learning perspective. This perspective shows how the clustering bias of a trace clustering technique affects the composition of the solution it produces. The study identified a considerable amount of clustering bias in current techniques.
For distance-based clustering, data mining techniques are used. In this method, distance serves as the factor that helps the user understand the result of the clustering. To analyze the result, a network graph or statistical data used in a comparative analysis was identified as most effective. The major drawback identified is that the process generates a large number of variables, which decreases the efficiency of the method (De Weerdt and vanden Broucke, 2014).
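The idea of a distance between traces can be illustrated with a minimal sketch. It assumes one common representation, in which each trace is turned into a vector of activity counts and compared with Euclidean distance; the reviewed study does not prescribe this particular measure, and the case names and activities are hypothetical.

```python
from collections import Counter
from math import sqrt


def activity_profile(trace, activities):
    """Count how often each known activity occurs in one trace."""
    counts = Counter(trace)
    return [counts[a] for a in activities]


def euclidean(u, v):
    """Euclidean distance between two equally long numeric vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


# Hypothetical event log: case id -> ordered list of activities.
traces = {
    "case1": ["register", "check", "approve"],
    "case2": ["register", "check", "check", "approve"],
    "case3": ["register", "reject"],
}

# Fixed activity order so that all vectors share the same columns.
activities = sorted({a for t in traces.values() for a in t})
profiles = {c: activity_profile(t, activities) for c, t in traces.items()}

d12 = euclidean(profiles["case1"], profiles["case2"])
d13 = euclidean(profiles["case1"], profiles["case3"])
print(d12, d13)
```

In this toy log, case1 lies closer to case2 than to case3, so a distance-based technique would tend to group the first two cases together. The number of columns grows with the number of distinct activities, which hints at the drawback of many variables noted above.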
For the other type of technique discussed in this research study, the model-driven techniques, visual analysis of the resulting cluster models is identified as the method of analysis. The drawback identified is that a high level of skill is required to interpret the result of the analysis and to understand the impact of the trade-off between precision, recall, and generalization made by process discovery techniques.
The study also presents a new analysis approach for understanding what distinguishes the instances of one cluster from those of another. This approach focuses on each individual case separately and does not give a general explanation for the challenges faced by similar processes. The main motive of the approach is to give an accurate and compact explanation for each case. The dataset used by the new approach is built by converting the process instances into feature vectors. The dataset is then completed by attaching the appropriate cluster label to each process instance. The cluster label allows the results of the data mining techniques applied to the dataset to be checked and tracked (De Weerdt and vanden Broucke, 2014).
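The construction of such a labelled dataset can be sketched as follows. This is only an illustration under stated assumptions: the binary features (presence of an activity and presence of a directly-follows pair), the event log, and the cluster labels are all hypothetical and are not taken from the reviewed study.

```python
def to_feature_vector(trace, features):
    """Binary vector: 1 if the activity or directly-follows pair occurs."""
    present = set(trace) | set(zip(trace, trace[1:]))
    return [1 if f in present else 0 for f in features]


# Hypothetical event log and a hypothetical clustering result.
log = {
    "case1": ["register", "check", "approve"],
    "case2": ["register", "check", "check", "approve"],
    "case3": ["register", "reject"],
}
cluster_labels = {"case1": 0, "case2": 0, "case3": 1}

# Feature set: every activity, then every directly-follows pair in the log.
activities = sorted({a for t in log.values() for a in t})
pairs = sorted({p for t in log.values() for p in zip(t, t[1:])})
features = activities + pairs

# Each row is (feature vector, cluster label) for one process instance.
dataset = [
    (to_feature_vector(trace, features), cluster_labels[case])
    for case, trace in log.items()
]

for row in dataset:
    print(row)
```

Each row pairs the behavior of one process instance with the cluster it was assigned to, which is the form a classifier needs in order to learn what separates the clusters.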
Another part of the analysis uses an algorithm based on a support vector machine (SVM) classifier, an approach used to find how a document is classified under a particular category. In this experiment, the SVM is the base model from which the results are derived. This model was found to be most compatible with cases that have complex or multiple attributes, as these can easily lead to high dimensionality. The research study found large-scale linear classification most appropriate for this type of attribute dataset (De Weerdt and vanden Broucke, 2014).
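To make the role of the linear classifier more concrete, the sketch below trains a small linear SVM on binary feature vectors with cluster labels. It is a plain-Python stand-in that minimizes the regularized hinge loss by sub-gradient descent, not the large-scale linear classification library used in the study; the data, the learning rate, and the regularization strength are hypothetical.

```python
def train_linear_svm(X, y, lr=0.1, lam=0.01, epochs=200):
    """Sub-gradient descent on the L2-regularized hinge loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            margin = label * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # The regularization term shrinks the weights on every step.
            w = [wi * (1 - lr * lam) for wi in w]
            if margin < 1:
                # The instance violates the margin: move towards its label.
                w = [wi + lr * label * xi for wi, xi in zip(w, x)]
                b += lr * label
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical binary feature vectors; +1 and -1 stand for two clusters.
X = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 1], [0, 0, 1, 0]]
y = [1, 1, -1, -1]

w, b = train_linear_svm(X, y)
predictions = [predict(w, b, x) for x in X]
print(w, b, predictions)
```

Because the model is linear, each learned weight can be read as the pull of one feature towards one cluster or the other, which is what makes a linear model convenient when the number of features is high.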
The main success of this paper is the adaptation of an approach, with changes that make it more effective for trace clustering. The algorithm is designed so that each case is explained separately and in more detail, as it analyzes the behavior of that case only. The approach also identifies attributes with no variation and removes them from the data, since they contribute nothing to the analysis and removing them makes the analysis more compact and accurate. In addition, this saves the analyst time.
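The removal of attributes with no variation can be sketched in a few lines. The function name and the sample data are hypothetical; the sketch only assumes that the dataset is a list of equally long feature vectors.

```python
def drop_constant_features(vectors, names):
    """Remove every column that has the same value in all vectors."""
    keep = [
        j for j in range(len(names))
        if len({v[j] for v in vectors}) > 1
    ]
    reduced = [[v[j] for j in keep] for v in vectors]
    return reduced, [names[j] for j in keep]


# Hypothetical data: "register" occurs in every case and carries no information.
names = ["register", "check", "approve"]
vectors = [[1, 0, 1], [1, 1, 1], [1, 0, 0]]

reduced, kept = drop_constant_features(vectors, names)
print(kept)
print(reduced)
```

An attribute that never varies cannot help to tell one cluster from another, so dropping it shortens every vector without changing what a classifier can learn from the data.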
The main finding of the research study is the introduction of a newly created method that helps the user understand the result of trace clustering. One drawback is that the study does not find any significant reason for how the composition of a cluster solution is formed, although further investigation into why cluster solutions form in a certain way is planned. The plan includes steps such as the analysis of attribute templates, which are an important part of the control-flow representation, and the inclusion of non-control-flow attributes (De Weerdt and vanden Broucke, 2014).
Discussion of other related research studies on the topic of cluster techniques
The first research study compared with the main reviewed study is Discovering Deviating Cases and Process Variants Using Trace Clustering. This study focuses on searching for deviating cases, where understanding the data becomes a major problem because of its uncommon characteristics, which make it difficult to determine what is exceptional and what is normal. The authors describe trace clustering as a mechanism that brings similar cases together in order to find variation in the process and to gain a better understanding of the current process. The study concentrates on cluster techniques related to control flow. One of these techniques is outlier detection, which distinguishes regular from exceptional behavior in a binary way, based on a normative process model. The study examines the ability of a trace clustering technique based on the Markov cluster (MCL) algorithm to find variation and fluctuation in processes (Hompes et al., 2015).
The second research study compared in this essay is A Framework for Trace Clustering and Concept-drift Detection in Event Streams. This study focuses on the problem of concept drift. It discusses the use of the Concept-Drift in Event Stream Framework (CDESF) to solve some of the common problems of trace clustering. In addition, it discusses methods that can simplify tracking and identifying concept drift in trace clusters (Junior et al., 2017).
The main result of this research study is that the Concept-Drift in Event Stream Framework (CDESF) was effective in detecting anomalies and drift in different cases. Another finding is that a process that has not yet been completed should be observed in more detail so that problems in the process are detected early (Junior et al., 2017).
The third research study selected for comparison is Detection of Temporal Changes In Business Processes Using Cluster Techniques. This study focuses on the analysis of a current business process and on the use of cluster techniques to improve a currently stable process. It therefore concentrates on finding the most effective cluster techniques, those that deliver a better result in a business process. The main finding of this research study is the identification of a process change named sudden drift (Wwwis.win.tue.nl., 2018).
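The general idea of detecting drift in an event stream can be illustrated with a simple window comparison. The sketch below is not CDESF and does not reproduce its method; it assumes a generic check in which the activity distribution of a reference window is compared with that of a recent window, and the threshold and the traces are hypothetical.

```python
from collections import Counter


def activity_distribution(window):
    """Relative frequency of each activity across all traces in a window."""
    counts = Counter(a for trace in window for a in trace)
    total = sum(counts.values())
    return {a: c / total for a, c in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(a, 0.0) - q.get(a, 0.0)) for a in keys)


def drift_detected(reference, recent, threshold=0.2):
    """Flag drift when the two windows differ by more than the threshold."""
    p = activity_distribution(reference)
    q = activity_distribution(recent)
    return total_variation(p, q) > threshold


# Hypothetical windows of traces taken from an event stream.
old_window = [["register", "check", "approve"]] * 3
new_window = [["register", "reject"]] * 3

print(drift_detected(old_window, old_window))
print(drift_detected(old_window, new_window))
```

Comparing a window with itself gives a distance of zero, while the switch from approvals to rejections changes the distribution enough to be flagged. A framework that works on streams would update such windows continuously as new events arrive.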
The introduction of this essay first stated the purpose of the essay and then gave its outline. The main research study, SECPI: Searching for Explanations for Clustered Process Instances, was then explained in detail, together with the two types of clustering techniques discussed in it. The first type uses distance as the measurement criterion, and the second type is a model-driven approach in which the likeness of each case is analyzed and the case is then assigned to a particular group. In addition, the study introduced a new, modified approach to make the analysis more effective.
Three other research studies related to cluster techniques were compared with this research study. The first is Discovering Deviating Cases and Process Variants Using Trace Clustering. In this study, a trace clustering technique based on the Markov cluster (MCL) algorithm is used to find variation in processes. Its main finding is that positive variation should be used to improve other processes, and negative fluctuation should be identified so that it can be detected early in other cases. The second is A Framework for Trace Clustering and Concept-drift Detection in Event Streams. This study focuses on the concept drift problem of cluster techniques, and it found the Concept-Drift in Event Stream Framework (CDESF) effective in solving this problem. The third is Detection of Temporal Changes In Business Processes Using Cluster Techniques. Its main finding is the identification of a process change named sudden drift, for which the cluster technique of the study was found more effective than other cluster techniques.
De Weerdt, J. and vanden Broucke, S., 2014, September. SECPI: searching for explanations for clustered process instances. In International Conference on Business Process Management (pp. 408-415). Springer, Cham.
Hompes, B.F.A., Buijs, J.C.A.M., Van der Aalst, W.M.P., Dixit, P.M. and Buurman, J., 2015, November. Discovering deviating cases and process variants using trace clustering. In Proceedings of the 27th Benelux Conference on Artificial Intelligence (BNAIC) (pp. 5-6).
Junior, S.B., Tavares, G.M., Ceravolo, P. and Damiani, E., 2017. A Framework for Trace Clustering and Concept-drift Detection in Event Streams. In SIMPDA (pp. 153-154).
Wwwis.win.tue.nl. (2018). Wwwis.win.tue.nl. [online] Available at: http://wwwis.win.tue.nl/~wvdaalst/publications/p806.pdf#page=42 [Accessed 29 Sep. 2018].