Process Mining: Overview and Opportunities

WIL VAN DER AALST, Eindhoven University of Technology

Over the last decade, process mining emerged as a new research field that focuses on the analysis of processes using event data. Classical data mining techniques such as classification, clustering, regression, association rule learning, and sequence/episode mining do not focus on business process models and are often only used to analyze a specific step in the overall process. Process mining focuses on end-to-end processes and is possible because of the growing availability of event data and new process discovery and conformance checking techniques. Process models are used for analysis (e.g., simulation and verification) and enactment by BPM/WFM systems. Previously, process models were typically made by hand without using event data. However, activities executed by people, machines, and software leave trails in so-called event logs. Process mining techniques use such logs to discover, analyze, and improve business processes. Recently, the Task Force on Process Mining released the Process Mining Manifesto. This manifesto is supported by 53 organizations and 77 process mining experts contributed to it. The active involvement of end-users, tool vendors, consultants, analysts, and researchers illustrates the growing significance of process mining as a bridge between data mining and business process modeling. The practical relevance of process mining and the interesting scientific challenges make process mining one of the “hot” topics in Business Process Management (BPM). This paper introduces process mining as a new research field and summarizes the guiding principles and challenges described in the manifesto.

Categories and Subject Descriptors: H.2.8 [Database Management]: Database Applications—Data Mining

General Terms: Management, Measurement, Performance

Additional Key Words and Phrases: Process mining, business intelligence, business process management, data mining

ACM Reference Format: Van der Aalst, W.M.P. 2011. Process Mining: Overview and Opportunities. ACM Trans. Manag. Inform. Syst. 99, 99, Article 99 (February 2012), 16 pages. DOI = 10.1145/0000000.0000000 http://doi.acm.org/10.1145/0000000.0000000

1. INTRODUCTION

Process mining aims to discover, monitor and improve real processes by extracting knowledge from event logs readily available in today’s information systems [Aalst 2011]. Over the last decade there has been a spectacular growth of event data and process mining techniques have matured significantly. As a result, management trends related to process improvement and compliance can now benefit from process mining.

The starting point for process mining is an event log. Each event in such a log refers to an activity (i.e., a well-defined step in some process) and is related to a particular case (i.e., a process instance). The events belonging to a case are ordered and can be seen as one “run” of the process. Event logs may store additional information about events. In fact, whenever possible, process mining techniques use extra information such as the resource (i.e., person or device) executing or initiating the activity, the timestamp of the event, or data elements recorded with the event (e.g., the size of an order).

Author’s address: Department of Mathematics and Computer Science, Eindhoven University of Technology, PO Box 513, 5600 MB, Eindhoven, The Netherlands. [email protected]
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works requires prior specific permission and/or a fee. Permissions may be requested from Publications Dept., ACM, Inc., 2 Penn Plaza, Suite 701, New York, NY 10121-0701 USA, fax +1 (212) 869-0481, or [email protected].
© 2012 ACM 0000-0001/2012/02-ART99 $10.00
DOI 10.1145/0000000.0000000 http://doi.acm.org/10.1145/0000000.0000000
ACM Transactions on Management Information Systems, Vol. 99, No. 99, Article 99, Publication date: February 2012.

[Figure 1 not reproduced. Labels in the diagram: the “world” (business processes, people, machines, components, organizations); a software system that supports/controls the world and records events, e.g., messages, transactions, etc.; event logs; (process) model, which models and analyzes the world and specifies, configures, implements, and analyzes the software system; discovery, conformance, and enhancement link event logs and model.]

Fig. 1. The three basic types of process mining: (a) discovery, (b) conformance, and (c) enhancement.

Event logs can be used to conduct three types of process mining as shown in Fig. 1 [Aalst 2011]. The first type of process mining is discovery. A discovery technique takes an event log and produces a model without using any a-priori information. Process discovery is the most prominent process mining technique. For many organizations it is surprising to see that existing techniques are indeed able to discover real processes merely based on example behaviors stored in event logs. The second type of process mining is conformance. Here, an existing process model is compared with an event log of the same process. Conformance checking can be used to check if reality, as recorded in the log, conforms to the model and vice versa. The third type of process mining is enhancement. Here, the idea is to extend or improve an existing process model thereby using information about the actual process recorded in some event log. Whereas conformance checking measures the alignment between model and reality, this third type of process mining aims at changing or extending the a-priori model. For instance, by using timestamps in the event log one can extend the model to show bottlenecks, service levels, and throughput times. Unlike traditional Business Process Management (BPM) techniques that use handmade models [Weske 2007], process mining is based on facts. Based on observed behavior recorded in event logs, intelligent techniques are used to extract knowledge. Therefore, we claim that process mining enables evidence-based BPM. Unlike existing analysis approaches, process mining is process-centric (and not data-centric), truly intelligent (learning from historic data), and fact-based (based on event data rather than opinions). Process mining is related to data mining. Whereas classical data mining techniques are mostly data-centric [Hand et al. 2001], process mining is process-centric. 
Mainstream business process modeling techniques use notations such as the Business Process Modeling Notation (BPMN), UML activity diagrams, Event-driven Process Chains (EPC), and various types of Petri nets [Aalst and Stahl 2011; Desel and Reisig 1998; Weske 2007]. These notations can be used to model processes with concurrency, choice, iteration, etc.


This paper not only introduces process mining as a new research field, but also familiarizes the reader with the Process Mining Manifesto [TFPM 2011] released by the Task Force on Process Mining in October 2011. The growing interest in log-based process analysis motivated the establishment of a Task Force on Process Mining in 2009. The manifesto aims to promote the topic of process mining. Moreover, by defining a set of guiding principles and listing important challenges, the manifesto hopes to serve as a guide for software developers, scientists, consultants, business managers, and end-users. The goal is to increase the maturity of process mining as a new tool to improve the (re)design, control, and support of operational business processes.

The remainder of this paper is organized as follows. Section 2 introduces the notion of an event log, used as input for process mining. Section 3 shows how process models can be discovered from scratch using only raw event data. Section 4 discusses the second type of process mining: conformance checking. Section 5 elaborates on the third type of process mining: enhancement. The guiding principles and challenges listed in the manifesto are summarized in Section 6. Section 7 discusses tool support and shows some real-life examples. Section 8 concludes the paper.

2. EVENT LOGS AS A STARTING POINT FOR PROCESS MINING

Digital event data is everywhere – in every sector, in every economy, in every organization, and in every home – and will continue to grow exponentially [Manyika et al. 2011]. The omnipresence of such data allows for new forms of process analysis, i.e., based on observed facts rather than hand-made models. Starting point for process mining is an event log. To introduce the basic process mining concepts we use the event log shown in Fig. 2 (log is taken from Chapter 5 of [Aalst 2011]). This event log contains 1391 cases, i.e., instances of some reimbursement process. There are 455 process instances following trace acdeh. Activities are represented by a single character: a = register request, b = examine thoroughly, c = examine casually, d = check ticket, e = decide, f = reinitiate request, g = pay compensation, and h = reject request. Hence, trace acdeh models a reimbursement request that was rejected after a registration, examination, check, and decision step. 455 cases followed this path consisting of five steps, i.e., the first line in the table corresponds to 455 × 5 = 2275 events. The whole log consists of 7539 events. Note that events can have all kinds of additional attributes (timestamps, transactional information, resource usage, etc.). Consider for example one of the 1391 “a” events. Such an event refers to the execution of “register request” for some reimbursement. The event may have a timestamp, e.g., “23-01-2012:8.38”, and an attribute describing the resources involved. Moreover, data attributes of the reimbursement (e.g., name of customer or number of loyalty card) and data attributes of the registration event (e.g., the amount claimed or a booking reference) may have been recorded. All such attributes can be used by process mining techniques. However, the backbone of process mining is the control-flow perspective. Therefore, for simplicity, events in Fig. 2 are described by their activity names only. 
However, it is important to realize that events can have various attributes, e.g., timestamps can be used for bottleneck analysis and resource attributes can be used for organizational mining (e.g., finding allocation rules).
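The counts quoted above can be checked with a few lines of code. The following is a minimal sketch, assuming the log of Fig. 2 is stored as a multiset of traces (a mapping from trace string to frequency); the names `LOG`, `num_cases`, and `num_events` are illustrative choices, not part of any process mining tool.

```python
from collections import Counter

# Trace frequencies as listed in Fig. 2; each character is one activity.
LOG = Counter({
    "acdeh": 455, "abdeg": 191, "adceh": 177, "abdeh": 144,
    "acdeg": 111, "adceg": 82, "adbeh": 56, "acdefdbeh": 47,
    "adbeg": 38, "acdefbdeh": 33, "acdefbdeg": 14, "acdefdbeg": 11,
    "adcefcdeh": 9, "adcefdbeh": 8, "adcefbdeg": 5, "acdefbdefdbeg": 3,
    "adcefdbeg": 2, "adcefbdefbdeg": 2, "adcefdbefbdeh": 1,
    "adbefbdefdbeg": 1, "adcefdbefcdefdbeg": 1,
})


def num_cases(log):
    """Number of process instances: the sum of all trace frequencies."""
    return sum(log.values())


def num_events(log):
    """Number of events: every case contributes one event per activity."""
    return sum(len(trace) * freq for trace, freq in log.items())


print(num_cases(LOG))                   # 1391
print(num_events(LOG))                  # 7539
print(len("acdeh") * LOG["acdeh"])      # 2275 events in the first table row
```

Storing only the activity sequence matches the control-flow view used in Fig. 2; a log with timestamps or resources would need one record per event instead.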

3. DISCOVERY

This section introduces the notion of process discovery, i.e., the automatic construction of models based on observed events.


   #  trace
 455  acdeh
 191  abdeg
 177  adceh
 144  abdeh
 111  acdeg
  82  adceg
  56  adbeh
  47  acdefdbeh
  38  adbeg
  33  acdefbdeh
  14  acdefbdeg
  11  acdefdbeg
   9  adcefcdeh
   8  adcefdbeh
   5  adcefbdeg
   3  acdefbdefdbeg
   2  adcefdbeg
   2  adcefbdefbdeg
   1  adcefdbefbdeh
   1  adbefbdefdbeg
   1  adcefdbefcdefdbeg
1391  (total)

[Petri net diagrams M1 and M2 not reproduced. Transition labels: a = register request, b = examine thoroughly, c = examine casually, d = check ticket, e = decide, f = reinitiate request (f1 and f2 in M2), g = pay compensation, h = reject request. Both models have places start and end.]

Fig. 2. One event log and two potential process models (M1 and M2) aiming to describe the observed behavior.

3.1. Applications of Process Discovery

Organizations use procedures to handle cases. Sometimes such procedures are enforced by the information system. However, in most cases, procedures are informal and may not have been documented at all. Moreover, even when procedures have been documented, reality may be very different. Therefore, it is important to discover the actual processes using event data. The discovered process models may be used
- for discussing problems among stakeholders (to reach consensus; it is important to have a shared view of the real processes),
- for generating process improvement ideas (seeing the actual process and its problems stimulates re-engineering efforts),
- for model enhancement (e.g., bottleneck analysis, see Section 5), and
- for configuring a WFM/BPM system (the discovered process model can serve as a template).

3.2. Learning Process Models From Event Logs

Process discovery techniques produce process models based on event logs such as the one shown in Fig. 2. For example, the classical α-algorithm produces model M1 for this log. This process model is represented as a Petri net [Aalst and Stahl 2011; Desel and Reisig 1998]. A Petri net consists of places (start, p1, p2, p3, p4, p5, and end) and transitions (a, b, c, d, e, f, g, and h). Transitions may be connected to places and places may be connected to transitions. It is not allowed to connect a place to a place or a transition to a transition.


The state of a Petri net, also referred to as marking, is defined by the distribution of tokens over places. A transition is enabled if each of its input places contains a token. For example, in M1, transition a is enabled in the initial marking of M1, because the only input place of a contains a token (black dot). An enabled transition may fire, thereby consuming a token from each of its input places and producing a token for each of its output places. Firing a in the initial marking corresponds to removing one token from start and producing two tokens (one for p1 and one for p2). After firing a, three transitions are enabled: b, c, and d. There is a non-deterministic choice between b and c. Firing b will disable c because the token is removed from the shared input place (and vice versa). Transition d is concurrent with b and c, i.e., it can fire without disabling another transition. Transition e becomes enabled after d and b or c have occurred. Note that transition e in M1 is only enabled if both input places (p3 and p4) contain a token. After executing e, three transitions become enabled: f, g, and h. These transitions are competing for the same token, thus modeling a choice. When g or h is fired, the process ends with a token in place end. If f is fired, the process returns to the state just after executing a. It is easy to check that all traces in the event log can be reproduced by M1. This does not hold for the second process model in Fig. 2. M2 is able to reproduce traces such as acdeh (455 instances), abdeg (191 instances), and acdefbdeh (33 instances). Note that M2 has two transitions corresponding to activity f. To refer to them they are named f1 and f2. M2 also allows for behavior very different from what can be observed in the log, e.g., abeg and abdddddf1bddddeh are possible according to the model but do not appear in the log.
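The firing rule described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the net `M1` below is reconstructed from the textual description (a produces tokens in p1 and p2, b and c share an input place, e needs p3 and p4, f returns to the state after a), and the assignment of p1 to b/c and p2 to d is an assumption, since the figure itself is not available here. The function names are hypothetical.

```python
from collections import Counter

# Assumed structure of M1: transition -> (input places, output places).
# Which of p1/p2 feeds b and c versus d is an assumption of this sketch.
M1 = {
    "a": (["start"], ["p1", "p2"]),
    "b": (["p1"], ["p3"]),
    "c": (["p1"], ["p3"]),
    "d": (["p2"], ["p4"]),
    "e": (["p3", "p4"], ["p5"]),
    "f": (["p5"], ["p1", "p2"]),
    "g": (["p5"], ["end"]),
    "h": (["p5"], ["end"]),
}


def enabled(net, marking, t):
    """A transition is enabled if each of its input places holds a token."""
    return all(marking[p] > 0 for p in net[t][0])


def fire(net, marking, t):
    """Consume one token per input place, produce one per output place."""
    if not enabled(net, marking, t):
        raise ValueError(f"transition {t!r} is not enabled")
    inputs, outputs = net[t]
    new = Counter(marking)
    new.subtract(inputs)
    new.update(outputs)
    return +new  # unary plus drops places with zero tokens


def replays(net, trace, start="start", end="end"):
    """True if the trace is a firing sequence from [start] to [end]."""
    marking = Counter([start])
    for t in trace:
        if t not in net or not enabled(net, marking, t):
            return False
        marking = fire(net, marking, t)
    return marking == Counter([end])


after_a = fire(M1, Counter(["start"]), "a")
print(sorted(t for t in M1 if enabled(M1, after_a, t)))  # ['b', 'c', 'd']
print(replays(M1, "acdeh"), replays(M1, "abeg"))          # True False
```

Under this reconstruction, abeg is rejected because e is not enabled while p4 is empty, which is the synchronization that the text attributes to M1.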
There are also traces in the log that cannot be replayed by M2, e.g., adceh (177 instances), adceg (82 instances), and adcefcdeh (9 instances) are not possible according to M2. The two process models in Fig. 2 are visualized in terms of Petri nets. In fact, both models are so-called WF-nets [Aalst et al. 2011]. A WF-net is a Petri net with one source place and one sink place such that all places and transitions are on a path from source to sink. Both models in Fig. 2 have a source place named start and a sink place end and all nodes are on a path from start to end. In general, the notation used to visualize the result may be very different from the representation used during the actual discovery process. All mainstream BPM notations (Petri nets, EPCs, BPMN, YAWL, UML activity diagrams, etc.) can be used to show discovered processes such as M1 [Aalst 2011; Weske 2007].

3.3. Process Discovery Algorithms

Since the mid-nineties several groups have been working on techniques for automated process discovery based on event logs [Aalst et al. 2004; Aalst et al. 2007; Agrawal et al. 1998; Cook and Wolf 1998; Datta 1998; Dongen and Aalst 2004; 2005; Greco et al. 2006; Weijters and Aalst 2003]. In [Aalst et al. 2003] an overview is given of the early work in this domain. The idea to apply process mining in the context of workflow management systems was introduced in [Agrawal et al. 1998]. In parallel, Datta [Datta 1998] looked at the discovery of business process models. Cook et al. investigated similar issues in the context of software engineering processes [Cook and Wolf 1998]. Herbst [Herbst 2000] was one of the first to tackle more complicated processes, e.g., processes containing duplicate tasks.

Most of the classical approaches have problems dealing with concurrency. The α-algorithm [Aalst et al. 2004] is an example of a simple technique that takes concurrency as a starting point. The α-algorithm scans the event log for particular patterns. For example, if activity a is followed by b but b is never followed by a, then it is assumed that there is a causal dependency between a and b. To reflect this dependency, the corresponding Petri net should have a place connecting a to b. We use the following notation: $a > b$ if and only if there is a trace $\sigma = \langle t_1, t_2, t_3, \ldots, t_n \rangle$ in the log and an $i \in \{1, \ldots, n-1\}$ such that $t_i = a$ and $t_{i+1} = b$; $a \rightarrow b$ if and only if $a > b$ and $b \not> a$; $a \# b$ if and only if $a \not> b$ and $b \not> a$; and $a \| b$ if and only if $a > b$ and $b > a$. These four ordering relations are used to create places connecting the different transitions in the Petri net.

The α-algorithm is simple and efficient, but has problems dealing with complicated routing constructs and noise (like most of the other approaches described in literature). Region-based approaches are able to express more complex control-flow structures without underfitting. State-based regions were introduced in 1989 [Ehrenfeucht and Rozenberg 1989] and generalized in various ways [Cortadella et al. 1998]. In [Aalst et al. 2010; Dongen et al. 2007; Sole and Carmona 2010] it is shown how these state-based regions can be applied to process mining. In parallel, several authors applied language-based regions to process mining [Bergenthum et al. 2007; Werf et al. 2010]. The basic idea of these approaches is to discover places. Note that the addition of places limits the behavior of the Petri net. The idea is to add places that do not exclude any of the behavior seen in the event log.

For practical applications of process discovery it is essential that noise and incompleteness are handled well. Surprisingly, only a few discovery algorithms focus on addressing these issues. Notable exceptions are heuristic mining [Weijters and Aalst 2003], fuzzy mining [Günther and Aalst 2007], and genetic process mining [Medeiros et al. 2007]. ProM’s heuristic miner uses the algorithm described in [Weijters and Aalst 2003] (see also Section 6.2 in [Aalst 2011]). The algorithm first builds a dependency graph based on the frequencies of activities and the number of times one activity is followed by another activity. Based on predefined thresholds, dependencies are added to the dependency graph (or not).
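Both the ordering relations of the α-algorithm and the frequency counts used by heuristic mining start from the same raw material: how often one activity is directly followed by another. The sketch below illustrates this on four traces taken from Fig. 2. The function names and the plain frequency cut-off in `frequent_pairs` are simplifying assumptions for illustration; the heuristic miner of [Weijters and Aalst 2003] uses its own dependency measures, which are not reproduced here.

```python
from collections import Counter

# Four of the traces from Fig. 2 with their frequencies.
LOG = {"acdeh": 455, "abdeg": 191, "adceh": 177, "adbeh": 56}


def directly_follows(log):
    """Count, over all cases, how often x is directly followed by y."""
    counts = Counter()
    for trace, freq in log.items():
        for x, y in zip(trace, trace[1:]):
            counts[(x, y)] += freq
    return counts


def relation(counts, x, y):
    """Classify the pair (x, y) using the four ordering relations."""
    xy = counts[(x, y)] > 0   # x > y
    yx = counts[(y, x)] > 0   # y > x
    if xy and yx:
        return "||"           # both orders observed: concurrency
    if xy:
        return "->"           # causality from x to y
    if yx:
        return "<-"           # causality from y to x
    return "#"                # never directly follow each other


def frequent_pairs(counts, threshold):
    """Keep only directly-follows pairs seen at least `threshold` times."""
    return {pair for pair, n in counts.items() if n >= threshold}


counts = directly_follows(LOG)
print(relation(counts, "a", "b"))   # ->
print(relation(counts, "c", "d"))   # ||
print(relation(counts, "b", "c"))   # #
print(sorted(frequent_pairs(counts, 200)))
```

On this partial log, c and d are classified as concurrent because both cd and dc occur, while b and c never follow each other directly, which is consistent with the choice between them described for M1.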
The dependency graph reveals the “backbone” of the process model. This backbone is used to discover the detailed split and join behavior of nodes. If an activity has multiple input arcs, then the heuristic miner analyzes the log to see whether the join is an AND-join, an XOR-join or an OR-join. In case of an OR-join, the detailed synchronization behavior is learned. If an activity has multiple output arcs, then the “split behavior” is learned in a similar fashion. See Chapter 6 of [Aalst 2011] for a more elaborate introduction to the various process discovery approaches described in literature.

4. CONFORMANCE

In recent years, powerful process mining techniques have been developed that can automatically construct a suitable process model given an event log. Whereas process discovery constructs a model without any a priori information (other than the event log), conformance checking uses a model and an event log as input. The model may have been made by hand or discovered through process discovery. For conformance checking, the modeled behavior and the observed behavior (i.e., event log) are compared.

4.1. Applications of Conformance Checking

Conformance checking techniques relate events in the log to activities in the model, e.g., events are mapped to transition firings in the Petri net. This way it is possible to compare the observed behavior in the event log and the modeled behavior. For example, one can quantify differences (e.g., “80% of the observed cases are possible according to the model”) and diagnose deviations (e.g., “in reality activity x is often skipped although the model does not allow for this”). Conformance checking can be used
- to check the quality of documented processes (assess whether they describe reality accurately),
- to identify deviating cases and understand what they have in common,
- to identify process fragments where most deviations occur,
- for auditing purposes,
- to judge the quality of a discovered process model,
- to guide evolutionary process discovery algorithms (e.g., genetic algorithms need to continuously evaluate the quality of newly created models using conformance checking), and
- as a starting point for model enhancement.

The above list shows that conformance checking can be used for a variety of reasons ranging from evaluating a process discovery algorithm to auditing and compliance monitoring. Note that auditors need to validate information about organizations by determining whether they execute business processes within certain boundaries set by managers, governments, and other stakeholders. Clearly, event logs provide valuable input for this.

4.2. Diagnosing Differences Between Observed Behavior and Modeled Behavior

Typically, four quality dimensions for comparing model and log are considered: (a) fitness, (b) simplicity, (c) precision, and (d) generalization (Chapter 7 of [Aalst 2011]). A model with good fitness allows for most of the behavior seen in the event log. A model has perfect fitness if all traces in the log can be replayed by the model from beginning to end. Often fitness is described by a number between 0 (very poor fitness) and 1 (perfect fitness). Obviously, the simplest model that can explain the behavior seen in the log is the best model. This principle is known as Occam’s Razor. Fitness and simplicity alone are not sufficient to judge the quality of a discovered process model. For example, it is very easy to construct an extremely simple Petri net (“flower model”) that is able to replay all traces in an event log (but also any other event log referring to the same set of activities). Similarly, it is often undesirable to have a model that only allows for the exact behavior seen in the event log. Remember that the log contains only example behavior and that many traces that are possible may not have been seen yet. (Note that in our simple example most traces are frequent, but often there are many “one-of-a-kind” traces.) A model is precise if it does not allow for “too much” behavior. Clearly, the “flower model” lacks precision. A model that is not precise is “underfitting”. Underfitting is the problem that the model over-generalizes the example behavior in the log (i.e., the model allows for behaviors very different from what was seen in the log). A model should also generalize and not restrict behavior to just the examples seen in the log. A model that does not generalize sufficiently is “overfitting”. 
Overfitting is the problem that a very specific model is generated whereas it is obvious that the log only holds example behavior (i.e., the model explains the particular sample log, but a next sample log of the same process may produce a completely different process model). The four quality dimensions for comparing model and log can be quantified in various ways. See [Aalst 2011; Aalst et al. 2012; Adriansyah et al. 2011; Munoz-Gama and Carmona 2011; Rozinat and Aalst 2008] for more details.

4.3. Conformance Checking Algorithms

Basically, there are three approaches to conformance checking. The first approach is to create an abstraction of the behavior in the log and an abstraction of the behavior allowed by the model. An example is the notion of a footprint described in Section 7.3 of [Aalst 2011]. A footprint is a matrix showing causal dependencies between activities. For example, the footprint of an event log may show that x is sometimes followed by y but never the other way around. If the footprint of the corresponding model shows that x is never followed by y or that y is sometimes followed by x, then the footprints of event log and model disagree on the ordering relation of x and y.

The second approach replays the event log on the model. A naive approach towards conformance checking would be to simply count the fraction of cases that can be “parsed completely” (i.e., the proportion of cases corresponding to firing sequences leading from the initial state to the final state). This approach cannot distinguish between an “almost fitting” case and a case that is completely unrelated to the modeled behavior. A better approach is to continue replaying the event log on the model even when transitions are not enabled. Simply “borrow tokens”, force the transition to fire anyway, and record the problem. In the end, the number of “borrowed tokens” and the number of “tokens left behind” (not consumed) indicate the fitness level. See [Rozinat and Aalst 2008] and Section 7.2 in [Aalst 2011].

The third, and most advanced, approach is to compute an optimal alignment between each trace in the log and the most similar behavior in the model. Consider for example the following three alignments between the example log and model M2:

γ1 =  log:    a  c  d  e  h
      model:  a  c  d  e  h

γ2 =  log:    a  d  c  e  h
      model:  a  ≫  c  e  h

γ3 =  log:    a  d  c  e  f   d  b  e  h
      model:  a  ≫  c  e  f2  ≫  b  e  h

γ1 shows a perfect alignment: all moves of the trace in the event log (top part of alignment) can be followed by moves of the model (bottom part of alignment). γ2 shows an optimal alignment for trace adceh in the event log and model M2. The first move of the trace in the event log can be followed by the model (event a). However, in the second position of the alignment, we see a move of the trace in the event log which cannot be mimicked by the model. This move in just the log is denoted as (d, ≫). γ3 shows an optimal alignment for trace adcefdbeh in the event log and model M2. Here, we encounter two situations where log and model cannot move together. Also note the move (f, f2), i.e., event f in the log corresponds to the execution of transition f2. Alignments γ2 and γ3 clearly show the reasons for non-conformance between model and log. Such problems can easily be quantified as shown in [Aalst et al. 2012; Adriansyah et al. 2011].

Conformance can be viewed from two angles: (a) the model does not capture the real behavior (“the model is wrong”) and (b) reality deviates from the desired model (“the event log is wrong”). The first viewpoint is taken when the model is supposed to be descriptive, i.e., capture or predict reality. The second viewpoint is taken when the model is normative, i.e., used to influence or control reality.

5. ENHANCEMENT

It is also possible to extend or improve an existing process model using the event log. A non-fitting process model can be corrected using the diagnostics provided by the alignment of model and log. Moreover, event logs may contain information about resources, timestamps, and case data. For example, an event referring to activity “register request” and case “992564” may also have attributes describing the person that registered the request (e.g., “John”), the time of the event (e.g., “30-01-2012:14.55”), the age of the customer (e.g., “45”), and the claimed amount (e.g., “650 euro”). After aligning model and log it is possible to replay the event log on the model. While replaying one can analyze these additional attributes. For example, it is possible to analyze waiting times in-between activities. Simply measure the time difference between causally related events and compute basic statistics such as averages, variances, and confidence intervals. This way it is possible to identify the main bottlenecks [Aalst 2011]. Information about resources can be used to discover roles, i.e., groups of people frequently executing related activities. Here, standard clustering techniques can be used.
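The waiting-time analysis mentioned above can be illustrated with a small sketch. The events below are hypothetical (only case “992564”, activity “register request”, and the timestamp “30-01-2012:14.55” appear in the text), and the sketch makes a simplifying assumption: it measures the time between consecutive events of the same case, whereas a real analysis would use the causal relations obtained by replaying the log on the model.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

TS_FORMAT = "%d-%m-%Y:%H.%M"  # matches timestamps such as "30-01-2012:14.55"

# Hypothetical events: (case id, activity, timestamp).
EVENTS = [
    ("992564", "register request", "30-01-2012:14.55"),
    ("992564", "check ticket", "30-01-2012:15.25"),
    ("992564", "decide", "31-01-2012:09.25"),
    ("992565", "register request", "30-01-2012:16.00"),
    ("992565", "check ticket", "30-01-2012:17.30"),
]


def waiting_times(events):
    """Minutes between consecutive events of a case, per activity pair."""
    by_case = defaultdict(list)
    for case, activity, ts in events:
        by_case[case].append((datetime.strptime(ts, TS_FORMAT), activity))
    waits = defaultdict(list)
    for trace in by_case.values():
        trace.sort()  # order the events of one case by timestamp
        for (t1, a1), (t2, a2) in zip(trace, trace[1:]):
            waits[(a1, a2)].append((t2 - t1).total_seconds() / 60)
    return waits


for pair, minutes in waiting_times(EVENTS).items():
    print(pair, "mean wait (min):", mean(minutes))
```

In this toy data the step from “check ticket” to “decide” has by far the longest average wait, which is the kind of signal that would be projected onto the model as a bottleneck.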


It is also possible to construct social networks based on the flow of work and analyze resource performance (e.g., the relation between workload and service times). See [Song and Aalst 2008] for an overview of various process mining techniques analyzing the organizational perspective based on event logs. Standard classification techniques can be used to analyze the decision points in the process model [Rozinat and Aalst 2006]. For example, activity e (“decide”) has three possible outcomes (“pay”, “reject”, and “redo”). Using the data known about the case prior to the decision, we can construct a decision tree explaining the observed behavior. Process mining is not restricted to offline analysis and can also be used for predictions and recommendations at runtime. For example, the completion time of a partially handled customer order can be predicted using a discovered process model with timing information [Aalst et al. 2011].

6. PROCESS MINING MANIFESTO

The IEEE Task Force on Process Mining recently released a manifesto describing guiding principles and challenges [TFPM 2011]. The manifesto aims to increase the visibility of process mining as a new tool to improve the (re)design, control, and support of operational business processes. It is intended to guide software developers, scientists, consultants, and end-users. Before summarizing the manifesto, we briefly introduce the task force.

6.1. Task Force on Process Mining

The growing interest in log-based process analysis motivated the establishment of the IEEE Task Force on Process Mining. The goal of this task force is to promote the research, development, education, and understanding of process mining. The task force was established in 2009 in the context of the Data Mining Technical Committee of the Computational Intelligence Society of the IEEE. Members of the task force include representatives of more than a dozen commercial software vendors (e.g., Pallas Athena, Software AG, Futura Process Intelligence, HP, IBM, Fujitsu, Infosys, and Fluxicon), ten consultancy firms (e.g., Gartner and Deloitte) and over twenty universities. Concrete objectives of the task force are: to make end-users, developers, consultants, managers, and researchers aware of the state-of-the-art in process mining, to promote the use of process mining techniques and tools, to stimulate new process mining applications, to play a role in standardization efforts for logging event data, to organize tutorials, special sessions, workshops, panels, and to publish articles, books, videos, and special issues of journals. For example, in 2010 the task force standardized XES (www.xes-standard.org), a standard logging format that is extensible and supported by the OpenXES library (www.openxes.org) and by tools such as ProM, XESame, Nitro, etc. See http://www.win.tue.nl/ieeetfpm/ for recent activities of the task force.

6.2. Guiding Principles

As with any new technology, there are obvious mistakes that can be made when applying process mining in real-life settings. The six guiding principles listed in Table I aim to prevent users and analysts from making such mistakes. As an example, consider Guiding Principle GP4: “Events Should Be Related to Model Elements”. It is a misconception that process mining is limited to control-flow discovery; other perspectives such as the organizational perspective, the time perspective, and the data perspective are equally important. However, the control-flow perspective (i.e., the ordering of activities) serves as the layer connecting the different perspectives. Therefore, it is important to relate events in the log to activities in the model. Conformance checking and model enhancement heavily rely on this relationship. Using Fig. 2, we showed for example alignment γ3, which relates observed trace adcefdbeh to firing sequence acef2 beh in M2. After relating events to model elements, it is possible to “replay” the event log on the model [Aalst 2011]. Replay may be used to reveal discrepancies between an event log and a model, e.g., some events in the log are not possible according to the model. Techniques for conformance checking can be used to quantify and diagnose such discrepancies. Timestamps in the event log can be used to analyze the temporal behavior during replay. Time differences between causally related activities can be used to add average/expected waiting times to the model. These examples illustrate the importance of guiding principle GP4; the relation between events in the log and elements in the model serves as a starting point for different types of analysis.

ACM Transactions on Management Information Systems, Vol. 99, No. 99, Article 99, Publication date: February 2012.

Table I. Six Guiding Principles Listed in the Manifesto

GP1: Event Data Should Be Treated as First-Class Citizens
Events should be trustworthy, i.e., it should be safe to assume that the recorded events actually happened and that the attributes of events are correct. Event logs should be complete, i.e., given a particular scope, no events may be missing. Any recorded event should have well-defined semantics. Moreover, the event data should be safe in the sense that privacy and security concerns are addressed when recording the event log.

GP2: Log Extraction Should Be Driven by Questions
Without concrete questions it is very difficult to extract meaningful event data. Consider, for example, the thousands of tables in the database of an ERP system like SAP. Without questions one does not know where to start.

GP3: Concurrency, Choice and Other Basic Control-Flow Constructs Should Be Supported
Basic workflow patterns supported by all mainstream languages (e.g., BPMN, EPCs, Petri nets, BPEL, and UML activity diagrams) are sequence, parallel routing (AND-splits/joins), choice (XOR-splits/joins), and loops. Obviously, these patterns should be supported by process mining techniques.

GP4: Events Should Be Related to Model Elements
Conformance checking and enhancement heavily rely on the relationship between elements in the model and events in the log. This relationship may be used to “replay” the event log on the model. Replay can be used to reveal discrepancies between event log and model (e.g., some events in the log are not possible according to the model) and can be used to enrich the model with additional information extracted from the event log (e.g., bottlenecks are identified by using the timestamps in the event log).

GP5: Models Should Be Treated as Purposeful Abstractions of Reality
A model derived from event data provides a view on reality. Such a view should serve as a purposeful abstraction of the behavior captured in the event log. Given an event log, there may be multiple views that are useful.

GP6: Process Mining Should Be a Continuous Process
Given the dynamic nature of processes, it is not advisable to see process mining as a one-time activity. The goal should not be to create a fixed model, but to breathe life into process models such that users and analysts are encouraged to look at them on a daily basis.

6.3. Challenges
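One of the challenges discussed in this subsection is concept drift, for instance two activities that are concurrent early in the log and sequential later on. A minimal Python sketch of how such a change could be spotted is to compare the "directly follows" relations observed in an early and a late part of the log. The toy log, the hand-picked split point, and the function names `directly_follows` and `drift_report` are assumptions made for illustration; this is not the technique of [Bose et al. 2011].

```python
from collections import Counter
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def directly_follows(traces: List[List[str]]) -> Counter:
    """Count how often activity x is directly followed by activity y."""
    counts: Counter = Counter()
    for trace in traces:
        counts.update(zip(trace, trace[1:]))
    return counts


def drift_report(traces: List[List[str]], split: int) -> Dict[str, List[Pair]]:
    """Compare the relations seen before and after position `split`.

    The traces are assumed to be ordered by time.
    """
    early = set(directly_follows(traces[:split]))
    late = set(directly_follows(traces[split:]))
    return {
        "disappeared": sorted(early - late),
        "appeared": sorted(late - early),
    }


# Toy log: b and c occur in both orders at first, later b always precedes c.
log = [list("abcd"), list("acbd"), list("abcd"), list("acbd"),
       list("abcd"), list("abcd"), list("abcd"), list("abcd")]
report = drift_report(log, split=4)
print(report)
```

In the toy log, the pairs that only occur when c precedes b show up under "disappeared", which hints that b and c changed from concurrent to sequential. A realistic analysis would also need a statistically sound way to choose the split point and to separate drift from noise.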

Process mining is an important tool for modern organizations that need to manage non-trivial operational processes. On the one hand, there is an incredible growth of event data. On the other hand, processes and information need to be aligned perfectly in order to meet requirements related to compliance, efficiency, and customer service. Despite the applicability of process mining, there are still important challenges that need to be addressed; these illustrate that process mining is an emerging discipline. Table II lists the eleven challenges described in the manifesto [TFPM 2011]. As an example, consider Challenge C4: “Dealing with Concept Drift”. The term concept drift refers to the situation in which the process is changing while being analyzed [Bose et al. 2011]. For instance, in the beginning of the event log two activities may be concurrent whereas later in the log these activities become sequential. Processes may change due to periodic/seasonal changes (e.g., “in December there is more demand” or “on Friday afternoon there are fewer employees available”) or due to changing conditions (e.g., “the market is getting more competitive”). Such changes impact processes and it is vital to detect and analyze them [Bose et al. 2011].

Table II. Some of the Most Important Process Mining Challenges Identified in the Manifesto

C1: Finding, Merging, and Cleaning Event Data
When extracting event data suitable for process mining, several challenges need to be addressed: data may be distributed over a variety of sources, event data may be incomplete, an event log may contain outliers, logs may contain events at different levels of granularity, etc.

C2: Dealing with Complex Event Logs Having Diverse Characteristics
Event logs may have very different characteristics. Some event logs may be extremely large, making them difficult to handle, whereas other event logs are so small that not enough data is available to draw reliable conclusions.

C3: Creating Representative Benchmarks
Good benchmarks consisting of example data sets and representative quality criteria are needed to compare and improve the various tools and algorithms.

C4: Dealing with Concept Drift
The process may be changing while being analyzed. Understanding such concept drifts is of prime importance for the management of processes.

C5: Improving the Representational Bias Used for Process Discovery
A more careful and refined selection of the representational bias is needed to ensure high-quality process mining results.

C6: Balancing Between Quality Criteria such as Fitness, Simplicity, Precision, and Generalization
There are four competing quality dimensions: (a) fitness, (b) simplicity, (c) precision, and (d) generalization. The challenge is to find models that score well in all four dimensions.

C7: Cross-Organizational Mining
There are various use cases where event logs of multiple organizations are available for analysis. Some organizations work together to handle process instances (e.g., supply chain partners) or organizations are executing essentially the same process while sharing experiences, knowledge, or a common infrastructure. However, traditional process mining techniques typically consider one event log in one organization.

C8: Providing Operational Support
Process mining is not restricted to off-line analysis and can also be used for online operational support. Three operational support activities can be identified: detect, predict, and recommend.

C9: Combining Process Mining With Other Types of Analysis
The challenge is to combine automated process mining techniques with other analysis approaches (optimization techniques, data mining, simulation, visual analytics, etc.) to extract more insights from event data.

C10: Improving Usability for Non-Experts
The challenge is to hide the sophisticated process mining algorithms behind user-friendly interfaces that automatically set parameters and suggest suitable types of analysis.

C11: Improving Understandability for Non-Experts
The user may have problems understanding the output or may be tempted to infer incorrect conclusions. To avoid such problems, the results should be presented using a suitable representation and the trustworthiness of the results should always be clearly indicated.

7. PROCESS MINING IN PRACTICE

Although the manifesto lists many open challenges, existing process mining techniques can easily be applied in practice. At TU/e (Eindhoven University of Technology) we have applied process mining in over 100 organizations. To help the reader get started with process mining, we briefly discuss tool support and show two case studies taken from [Aalst 2011].

7.1. Tool Support

The open-source tool ProM has been the de facto standard for process mining during the last decade. Process discovery, conformance checking, social network analysis, organizational mining, decision mining, history-based prediction and recommendation, etc. are all supported by ProM [Aalst 2011; Verbeek et al. 2010]. For example, dozens of different process discovery algorithms are supported by ProM. The functionality of ProM is unprecedented, i.e., there is no product offering a comparable set of process mining algorithms. However, the tool requires process mining expertise and is not supported by a commercial organization. Hence, it has the advantages and disadvantages common to open-source software. Fortunately, there is also a growing number of commercially available software products offering process mining capabilities. Examples are: ARIS Process Performance Manager (Software AG), Comprehend (Open Connect), Discovery Analyst (StereoLOGIC), Flow (Fourspark), Futura Reflect (Futura Process Intelligence), Interstage Automated Process Discovery (Fujitsu), Process Discovery Focus (Iontas/Verint), ProcessAnalyzer (QPR), and Reflect|one (Pallas Athena). All of the products mentioned support process discovery, i.e., constructing a process model based on an event log. For example, Futura Reflect supports genetic process mining as described in [Medeiros et al. 2007]. Some of the systems mentioned have difficulties discovering concurrency, e.g., ARIS Process Performance Manager, Flow, and Interstage Automated Process Discovery. All systems take the timestamps in the event log into account to be able to provide performance-related information, i.e., flow times and bottlenecks can be discovered. None of the commercial software products provides comprehensive support for conformance checking, i.e., the focus is on process discovery and performance measurement. However, ProM supports the different types of conformance checking described in Section 4.3. Some of these products embed process mining functionality in a larger system, e.g., Pallas Athena embeds process mining in their BPM suite BPM|one. Other products aim at simplifying process mining using an intuitive user interface.

7.2. Discovering “Spaghetti Processes”
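The discussion in this subsection rests on two simplification ideas: connecting activities only if one frequently follows the other, and restricting the log to the most frequent activities. The Python sketch below illustrates both on a toy log. It uses plain frequency thresholds, whereas ProM's heuristic miner relies on a more refined dependency measure; the function names `frequent_arcs` and `keep_top_activities` and the toy data are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple


def frequent_arcs(traces: List[List[str]], min_count: int) -> Dict[Tuple[str, str], int]:
    """Keep only 'x directly followed by y' arcs seen at least min_count times."""
    counts: Counter = Counter()
    for trace in traces:
        counts.update(zip(trace, trace[1:]))
    return {arc: n for arc, n in counts.items() if n >= min_count}


def keep_top_activities(traces: List[List[str]], k: int) -> List[List[str]]:
    """Project every trace onto the k most frequent activities."""
    freq = Counter(activity for trace in traces for activity in trace)
    top = {activity for activity, _ in freq.most_common(k)}
    return [[a for a in trace if a in top] for trace in traces]


# Toy log: 'x' is rare behavior that clutters the picture.
log = [list("abd")] * 5 + [list("acd")] * 3 + [list("axd")]

arcs = frequent_arcs(log, min_count=3)
print(sorted(arcs.items()))

projected = keep_top_activities(log, k=3)
print(projected[-1])  # the rare trace after projection
```

Both filters trade detail for readability: raising `min_count` or lowering `k` yields a smaller graph but hides infrequent behavior that may still matter for a particular question.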

There is a continuum of processes ranging from highly structured processes (Lasagna processes) to unstructured processes (Spaghetti processes). Figure 3 shows why unstructured processes are often called “Spaghetti processes”. The model was obtained using ProM’s heuristic miner [Weijters and Aalst 2003]. Hence, low-frequency behavior has been filtered out. Nevertheless, the model is too difficult to comprehend. Note that this is not necessarily a problem of the discovery algorithm. Activities are only connected if they frequently followed one another in the event log. Hence, the complexity shown in Fig. 3 reflects reality and is not caused by the discovery algorithm. Figure 3 is an extreme example used to illustrate the characteristics of a typical Spaghetti process. Given the data set, it is not surprising that the process is unstructured; the 2765 patients did not form a homogeneous group and included individuals with very different medical problems. The process model in Fig. 3 can be simplified dramatically by selecting a group of patients with similar problems or by selecting only the most frequent activities. Nevertheless, its complexity exemplifies some of the challenges mentioned in the manifesto (in particular C1, C2, C6, C10, and C11).

7.3. Analyzing “Lasagna Processes”

Processes in municipalities are typically Lasagna processes. Figure 4 shows a so-called “WOZ process” discovered for a Dutch municipality. We applied the heuristic miner [Weijters and Aalst 2003] to an event log containing information about 745 objections against the so-called WOZ (“Waardering Onroerende Zaken”, i.e., Valuation of Real Estate) valuation. Dutch municipalities need to estimate the value of houses and apartments. The WOZ value is used as a basis for determining the real-estate property tax. The higher the WOZ value, the more tax the owner needs to pay. Therefore, Dutch mu


[Fig. 3: Spaghetti process discovered with ProM’s heuristic miner; the node and arc labels of the graph are omitted here.]

0,992 1718

0,5 3

0,667 45

0,5 2

C_Dwarslaesi e (start) 1

0,5 1

C_Hepatitis, drug induced (complete) 1

0,5 1

0,5 2

C_-Premature Slagen NNO (start) 3

0,667 2

B_Orthopaedische tracti e (start) 2

0,667 2

0,5 1

0,75 97

0,5 2

0,995 1716

C_Dehiscentie (start) 3

0,5 2

O_X arm (complete) 1

0,5 4

0,996 534 B_Scleroseren GI bloedin g (start) 4

1 9484

C_s2 Shock, Cardiaal (start) 47

0,667 11

B_Beademing (start) 2187

0,5 2 B_Defibrilatie (complete) 12

0,5 1

0,5 4

0,5 4 C_Sternumwondinfectie (start) 4

0,889 4

0,96 49 O_X-thorax cito (complete) 53

B_Decubitus zorg stadium b2 (start) 3

0,5 1

0,5 5

0,833 29 O_X-thorax cito (schedule) 60

0,833 32

0,978 995

0,5 1

0,5 2

O_CT thorax (complete) 14

0,5 2

0,667 2

M_MeasurementClinic (complete) 12474

C_Tamponade (start) 7

0,917 48

0,8 4

0,667 2

0,812 31 O_ECG cito (complete) 31

0,833 14

0,929 15

B_Medium care (start) 768

B_Isolatie contact (start) 7

0,667 8

B_Primo luchtmatras (start) 48

0,5 2 O_ECG cito (schedule) 35 O_CT thorax (schedule) 14

0,5 1

0,667 8

0,667 2 B_Wondzorg open thora x (start) 10

B_Drain(s) won d (start) 167

B_Decubitus zorg stadium 1 (start) 7

0,667 6

0,75 5

B _ C A PD (start) 2

0,889 195

0,667 2 B_Jejunumsonde (start) 31

0,982 1050

0,857 19 B_Nefrostomie catheter L (start) 7

B_Ureter catheter L (start) 5

0,975 56 B_Catheter epiduraal (start) 170

0,667 4 B_Jejunostomie (start) 22

0,5 1

O_Wond inspectie (schedule) 4

0,667 3

0,969 66

O_X-thorax dagelijks (schedule) 2308

0,667 5

0,5 1

O_Benzodiazepines (schedule) 1

0,97 106

0,9 99

0,75 11 B_Basiszorg (start) 2010

0,5 1

0,5 2 C_Decubitus stuit st. a2 (start) 3

0,5 1 0,5 2 O_Urine kweek (schedule) 244

0,5 1

0,5 1 B_Bronchiaal toilet (complete) 247

0,667 4

O_Kweek swan gan z (schedule) 5

0,5 1

0,571 6 B_CPAP (complete) 14

0,667 26

0,939 57

0,98 140

0,5 3

0,96 198

0,5 1

0,5 2

B_Extubatie (complete) 198

B_Bloedtoediening met druk (complete) 4

0,8 17

0,875 20

B_Sonde-Voedin g (start) 365

B_PTCA (start) 6

0,8 7 0,5 2

0,75 6

O_Lab. 3x per week (complete) 5

0,933 86

0,8 6

0,5 2

0,958 78

0,5 1

0,947 38

B_Wisselligging (start) 306

0,5 1

0,5 1

O_Pleurapunctie (schedule) 3

0,5 5

O_Virus serologie (schedule) 8

0,5 3

0,964 91 0,992 244

0,5 1

B_Halsinf./subclavia op K O (start) 1294

0,929 20

0,984 86 O_Paracetamo l (schedule) 2

B_Doorbewegen (start) 129

0,5 1

O_Bronchoscopie (schedule) 28

0,667 16

0,5 1

O_Gastro / Duodenscopie (schedule) 32

0,75 3

B_Isolatie Beschermen d (start) 1

0,5 1 B_Isolatie Beschermen d (complete) 1

0,5 1 0,909 25

O_Tracheaspoelin g (schedule) 1

0,833 52

0,833 67

B_Swan Ganz op IC U (start) 18

C_Pneumothorax (complete) 11

0,8 7

0,5 1 O_Kweek sheat h (complete) 7

0,5 1

0,5 1

C_Empyeem (complete) 1

B_Ontlastende LP bij druk (complete) 1

B_Ballonneren (start) 317

0,909 3

0,5 1

0,75 12

C_Bacteriemie (complete) 1

B_Vernevelaa r (start) 25

0,875 49 0,5 2

0,792 39 B_Bronchiaal toile t (start) 373

0,5 1 O_X-thorax 3 x p.w. (complete) 4

B_Bezoek: wake n (start) 52

0,889 23

0,8 12 B_Oogzalven / druppele n (start) 102

0,75 53 0,75 6 B_Intubatie (complete) 95

0,667 2

O_Sputum kweek (complete) 405

0,8 12

0,8 22 B_Beademing gestart op IC U (complete) 46

0,5 2 O_Ascites kweek (complete) 2

0,5 2

C_Bronchitis (mogelijk ) (start) 2

0,5 2

0,985 391

0,5 2

0,5 1

0,8 14

0,5 1

0,667 2

0,5 2

B_Bezoek: kind. toegestaan (start) 1

B _ R e OK (start) 11

0,667 19

0,909 47

0,75 10

O_Sputum kweek (schedule) 428

B_Bezoek: waken (complete) 27

0,857 14

M_MeasurementDecubitus (complete) 824

B_Wondzorg overig (start) 270

0,8 5

0,75 6

0,5 1

0,5 1

B_Orthopaedische tractie (complete) 2

0,833 9

O_Kweek overige (schedule) 49

0,667 2 O_Huiduitstrijk Oksel LiR/ (complete) 2

O_I.V Catheter kweek overig (schedule) 29

0,667 2 B_ERCP (complete) 2

0,5 1 O_Virus serologi e (complete) 8

0,944 44

O_Sinus kweek (complete) 5

B_Oogglazen (complete) 2

0,5 1 B_Bezoek: afw. tijden (start) 70

0,5 1

O_Kweek overige (complete) 47

B_Isolatie beschermende (start) 1

0,5 1

0,923 150

0,5 1

B_Empyeem spoelin g (start) 2

B_Tracheostomi e (complete) 11

0,5 1

C_-SVT, paroxysmaa l (complete) 4

0,5 1

B_Tracheostoma/Tube LO S (start) 85

0,5 1

0,5 1

C_Bloedverlies > 50 ml/uur (complete) 2

0,667 4

C_Rethoratocomie (complete) 1

0,667 2 B_Halsinf./subclavia op kO (complete) 28

B_Brochusscopie (complete) 14

0,929 25

B_Decubitus zorg stadium a4 (start) 3 O_I.V Catheter kweek overig (complete) 27

0,5 1

0,5 1

B_O2 masker/neusslan g (start) 1954

0,5 1

0,5 2

0,75 3

0,5 1

B_Weanen (complete) 316

0,75 7

B_Halsinf./subclavia op IC (start) 112

0,5 1

0,833 8

O_Wegen dagelijks (complete) 53

C_s3 Shock, Hypovolaemisch (complete) 2

0,731 27

0,667 3

0,8 5 0,5 1

O_Wegen 3x per week (schedule) 123

C_Fibro-proliferatieve ARDS (start) 5

0,5 1

B_Amputatie Extremiteit (complete) 2

0,8 4

B _ R e OK (complete) 10

0,5 2

O_Kweek peritoneum (complete) 7

0,5 1

C_s4 Shock, Onbeken d (start) 6

0,5 1

0,5 1

B_Mobiliseren (complete) 49

B_Verpleegvorm prikkelarm (start) 6

0,989 112 O_Gastro / Duodenscopie (complete) 24

0,667 3 O_Pleurapunctie (complete) 3

0,5 1

0,5 1 B_NO beademing (start) 1

0,667 1

B_Reanimatie (start) 26

B_Catheter spinaa l (start) 1

0,667 13

0,5 6 B_Verpleegvorm boomsta m (start) 9

0,5 1 O_Paracetamol (complete) 1

B_Bi- PAP (start) 6

0,5 1 O_Tracheaspoelin g (complete) 1

0,5 1

0,8 4 O_Kweek sheat h (schedule) 7

0,5 1

0,667 3

0,5 1

0,5 2 O_Bronchoscopie (complete) 26

0,8 5 B_Verwijderen Agraves (complete) 5

B_Catheter a demeur e (complete) 18

O_IAP studie (schedule) 2

0,667 5

B_Laparotomie (complete) 13

B_Ontlastende LP bij dru k (start) 1

0,5 3

B_Verband gips (start) 3

0,5 1

B_Verwijderen Agrave s (start) 5

0,5 2

B_Decubitus zorg stadium a3 (start) 4 B_IPPB (start) 8

0,5 1 C_Darmperforatie (complete) 2

B_Anus Praeter Naturalis (start) 64

0,5 1 B_Decubitus zorg stadiumb4 (complete) 1

0,5 1

0,5 2

O_Ramsay-scor e (complete) 3

0,5 1

B_Mobiliseren (start) 237

0,5 1

0,5 1 C_Polyurie (>40ml/kg/24u) (start) 1

0,5 3

B_Perifeer infuus 2 (start) 265

B_O2 masker/neusslan g (complete) 213

B_Beademing gestart op IC U (start) 61

0,918 112

C_-Asystolie (complete) 6

0,5 2

0,5 9 B_Beademing (complete) 1868

0,5 1

0,5 2

B_Uro stoma (start) 12

0,667 3

0,667 6

0,5 1 B_Extubatie (start) 202

0,999 1282

0,944 27

B_Maagsonde (start) 2430

0,8 15

B_Drain(s) sump (complete) 2

0,8 3

B_Bi-PAP (complete) 5

0,978 106

0,5 1

0,9 104

B_Verband spal k (start) 7

0,5 1 B_Intubatie (start) 102

B_Fysiotherapie (start) 371

0,817 448 0,5 1

O_Benzodiazepines (complete) 1

0,667 5 O_Kweek swan gan z (complete) 5

0,875 22

0,5 1

C_Beademingsafhankelijkhei d (start) 27

0,5 1 C_Ischemie (start) 35

C_Candidaemie (start) 3

0,8 1 0,5 1

B_Anus Praeter Naturalis (complete) 3

C_Leverfalen (start) 2

0,5 0,667 3 2 C_Hypotensie (start) 17

0,5 2 C_Oligurie (< 5 ml/kg/24u ) (complete) 5

B_Wondzorg open bui k (start) 33

0,5 1

O_Lab. 3x per week (schedule) 10

O_Cystoscopie (schedule) 1

0,5 1

0,5 1

B_Arterie lijn op IC U (complete) 184

O_Cystoscopie (complete) 1

Fig. 3. Spaghetti process describing the diagnosis and treatment of 2765 patients in a Dutch hospital. The process model was constructed based on an event log containing 114,592 events. There are 619 different activities (taking event types into account) executed by 266 different individuals (doctors, nurses, etc.).

nicipalities need to handle many objections (i.e., appeals) of citizens who assert that the WOZ value is too high. Figure 4 shows the process of handling these objections within a particular municipality. The diagram is not intended to be readable; it is only included to show the contrast with Fig. 3.

[Figure 4 graphic omitted. The transitions of the WF-net are the start and complete events of the activities OZ02 Voorbereiden, OZ04 Incompleet, OZ06 Stop vordering, OZ08 Beoordelen, OZ09 Wacht Beoord, OZ10 Horen, OZ12 Hertaxeren, OZ15 Zelf uitspraak, OZ16 Uitspraak, OZ18 Uitspr. wacht, OZ20 Administatie, and OZ24 Start vordering, plus the single event "Domain: heus1 complete".]

Fig. 4. WF-net discovered based on an event log of a Dutch municipality. The log contains events related to 745 objections against the so-called WOZ valuation. These 745 objections generated 9583 events. There are 13 activities. For 12 of these activities both start and complete events are recorded. Hence, the WF-net has 25 transitions.
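The transition count in the caption follows from simple counting: 12 activities contribute a start and a complete transition each, and the remaining activity contributes one, giving 12 × 2 + 1 = 25. The sketch below illustrates this in Python, assuming events are available as (case, activity, event type) triples. The function name `log_summary` and the single artificial case are illustrative assumptions; they are not part of ProM or of the municipality's log, and reading "Domain: heus1" as the thirteenth activity is inferred from the labels in Fig. 4.

```python
from collections import defaultdict

# Activity names as they appear in Fig. 4. Twelve activities record both a
# start and a complete event; "Domain: heus1" appears with a complete event only.
START_AND_COMPLETE = [
    "OZ02 Voorbereiden", "OZ04 Incompleet", "OZ06 Stop vordering",
    "OZ08 Beoordelen", "OZ09 Wacht Beoord", "OZ10 Horen",
    "OZ12 Hertaxeren", "OZ15 Zelf uitspraak", "OZ16 Uitspraak",
    "OZ18 Uitspr. wacht", "OZ20 Administatie", "OZ24 Start vordering",
]
COMPLETE_ONLY = ["Domain: heus1"]


def log_summary(events):
    """Summarize an event log given as (case, activity, event_type) triples."""
    cases = set()
    types_per_activity = defaultdict(set)
    for case, activity, event_type in events:
        cases.add(case)
        types_per_activity[activity].add(event_type)
    return {
        "cases": len(cases),
        "events": len(events),
        "activities": len(types_per_activity),
        # One transition per distinct (activity, event type) combination.
        "transitions": sum(len(t) for t in types_per_activity.values()),
    }


# Artificial single-case log (an assumption for illustration only): every
# activity occurs once with each of its recorded event types.
toy_log = [("case-1", a, t) for a in START_AND_COMPLETE for t in ("start", "complete")]
toy_log += [("case-1", a, "complete") for a in COMPLETE_ONLY]

summary = log_summary(toy_log)
print(summary["activities"], summary["transitions"])  # 13 25
```

Applied to the real log, the same summary would be expected to report 745 cases and 9583 events, the figures stated in the caption.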

The discovered WF-net has a good fitness: 628 of the 745 cases (about 84%) can be replayed without encountering any problems. The fitness of the model and log at the event level is 0.98876214. This value is based on the approach described in [Aalst 2011; Rozinat and Aalst 2008]. The high value shows that almost all recorded events are explained by the model. Hence, the WOZ process is clearly a Lasagna process. Nevertheless, it was interesting for the municipality to see the deviations highlighted in the model. Figure 5 shows a fragment of the diagnostics provided by ProM’s conformance checker.

The municipality’s log contains timestamps. Therefore, it is possible to replay the event log while taking the timestamps into account. ProM can visualize the phases of the process that take the most time. For example, the place in-between “OZ16 Uitspraak start” (start of announcement of final judgment) and “OZ16 Uitspraak complete” (end of announcement of final judgment) was visited 436 times. The average time spent in this place is 7.84 days. This indicates that activity “OZ16 Uitspraak” (final judgment) takes about a week. It is also possible to simply select two activities and measure the time that passes in-between these activities. On average, 202.73 days pass in-between the completion of activity “OZ02 Voorbereiden” (preparation) and the completion of “OZ16 Uitspraak” (final judgment). Such examples illustrate that process mining, unlike classical Business Intelligence (BI) tools, helps organizations to “look inside” their processes. This is in stark contrast with contemporary BI tools that typically focus on reporting and fancy-looking dashboards.

Fig. 5. Fragment of the WF-net annotated with diagnostics generated by ProM’s conformance checker. The WF-net and event log fit well (fitness is 0.98876214). Nevertheless, several low-frequency deviations are discovered. For example, activity “OZ12 Hertaxeren” (re-evaluation of WOZ value) is started 23 times without being enabled according to the model.

8. CONCLUSION

This paper introduced process mining as a new technology enabling evidence-based process analysis. We introduced the three basic types of process mining (discovery, conformance, and enhancement) using a small example and used some larger examples to illustrate the applicability in real-life settings. Nevertheless, there are still many open scientific challenges and most end-user organizations are not yet aware of the potential of process mining. This triggered the development of the Process Mining Manifesto by an international task force involving 77 process mining experts representing 53 organizations. This manifesto can be obtained from http://www.win.tue.nl/ieeetfpm/. The reader interested in process mining is also referred to the recent book on process mining [Aalst 2011]. Also visit www.processmining.org for sample logs, videos, slides, articles, and software.

ACKNOWLEDGMENTS

The author would like to thank all who contributed to the Process Mining Manifesto: Arya Adriansyah, Ana Karla Alves de Medeiros, Franco Arcieri, Thomas Baier, Tobias Blickle, Jagadeesh Chandra Bose, Peter van den Brand, Ronald Brandtjen, Joos Buijs, Andrea Burattin, Josep Carmona, Malu Castellanos, Jan Claes, Jonathan Cook, Nicola Costantini, Francisco Curbera, Ernesto Damiani, Massimiliano de Leoni, Pavlos Delias, Boudewijn van Dongen, Marlon Dumas, Schahram Dustdar, Dirk Fahland, Diogo R. Ferreira, Walid Gaaloul, Frank van Geffen, Sukriti Goel, Christian Günther, Antonella Guzzo, Paul Harmon, Arthur ter Hofstede, John Hoogland, Jon Espen Ingvaldsen, Koki Kato, Rudolf Kuhn, Akhil Kumar, Marcello La Rosa, Fabrizio Maggi, Donato Malerba, Ronny Mans, Alberto Manuel, Martin McCreesh, Paola Mello, Jan Mendling, Marco Montali, Hamid Motahari Nezhad, Michael zur Muehlen, Jorge Munoz-Gama, Luigi Pontieri, Joel Ribeiro, Anne Rozinat, Hugo Seguel Pérez, Ricardo Seguel Pérez, Marcos Sepúlveda, Jim Sinur, Pnina Soffer, Minseok Song, Alessandro Sperduti, Giovanni Stilo, Casper Stoel, Keith Swenson, Maurizio Talamo, Wei Tan, Chris Turner, Jan Vanthienen, George Varvaressos, Eric Verbeek, Marc Verdonk, Roberto Vigo, Jianmin Wang, Barbara Weber, Matthias Weidlich, Ton Weijters, Lijie Wen, Michael Westergaard, and Moe Wynn.

REFERENCES

AALST, W. VAN DER 2011. Process Mining: Discovery, Conformance and Enhancement of Business Processes. Springer-Verlag, Berlin.
AALST, W. VAN DER, ADRIANSYAH, A., AND DONGEN, B. VAN 2012. Replaying History on Process Models for Conformance Checking and Performance Analysis. WIREs Data Mining and Knowledge Discovery.
AALST, W. VAN DER, DONGEN, B., HERBST, J., MARUSTER, L., SCHIMM, G., AND WEIJTERS, A. 2003. Workflow Mining: A Survey of Issues and Approaches. Data and Knowledge Engineering 47, 2, 237–267.
AALST, W. VAN DER, HEE, K. VAN, HOFSTEDE, A., SIDOROVA, N., VERBEEK, H., VOORHOEVE, M., AND WYNN, M. 2011. Soundness of Workflow Nets: Classification, Decidability, and Analysis. Formal Aspects of Computing 23, 3, 333–363.
AALST, W. VAN DER, REIJERS, H., WEIJTERS, A., DONGEN, B. VAN, MEDEIROS, A., SONG, M., AND VERBEEK, H. 2007. Business Process Mining: An Industrial Application. Information Systems 32, 5, 713–732.
AALST, W. VAN DER, RUBIN, V., VERBEEK, H., DONGEN, B. VAN, KINDLER, E., AND GÜNTHER, C. 2010. Process Mining: A Two-Step Approach to Balance Between Underfitting and Overfitting. Software and Systems Modeling 9, 1, 87–111.
AALST, W. VAN DER, SCHONENBERG, M., AND SONG, M. 2011. Time Prediction Based on Process Mining. Information Systems 36, 2, 450–475.
AALST, W. VAN DER AND STAHL, C. 2011. Modeling Business Processes: A Petri Net Oriented Approach. MIT Press, Cambridge, MA.
AALST, W. VAN DER, WEIJTERS, A., AND MARUSTER, L. 2004. Workflow Mining: Discovering Process Models from Event Logs. IEEE Transactions on Knowledge and Data Engineering 16, 9, 1128–1142.
ADRIANSYAH, A., DONGEN, B. VAN, AND AALST, W. VAN DER 2011. Conformance Checking using Cost-Based Fitness Analysis. In IEEE International Enterprise Computing Conference (EDOC 2011), C. Chi and P. Johnson, Eds. IEEE Computer Society, 55–64.
AGRAWAL, R., GUNOPULOS, D., AND LEYMANN, F. 1998. Mining Process Models from Workflow Logs. In Sixth International Conference on Extending Database Technology. Lecture Notes in Computer Science Series, vol. 1377. Springer-Verlag, Berlin, 469–483.
BERGENTHUM, R., DESEL, J., LORENZ, R., AND MAUSER, S. 2007. Process Mining Based on Regions of Languages. In International Conference on Business Process Management (BPM 2007), G. Alonso, P. Dadam, and M. Rosemann, Eds. Lecture Notes in Computer Science Series, vol. 4714. Springer-Verlag, Berlin, 375–383.
BOSE, R., AALST, W. VAN DER, ZLIOBAITE, I., AND PECHENIZKIY, M. 2011. Handling Concept Drift in Process Mining. In International Conference on Advanced Information Systems Engineering (CAiSE 2011), H. Mouratidis and C. Rolland, Eds. Lecture Notes in Computer Science Series, vol. 6741. Springer-Verlag, Berlin, 391–405.
COOK, J. AND WOLF, A. 1998. Discovering Models of Software Processes from Event-Based Data. ACM Transactions on Software Engineering and Methodology 7, 3, 215–249.
CORTADELLA, J., KISHINEVSKY, M., LAVAGNO, L., AND YAKOVLEV, A. 1998. Deriving Petri Nets from Finite Transition Systems. IEEE Transactions on Computers 47, 8, 859–882.
DATTA, A. 1998. Automating the Discovery of As-Is Business Process Models: Probabilistic and Algorithmic Approaches. Information Systems Research 9, 3, 275–301.
DESEL, J. AND REISIG, W. 1998. Place/Transition Nets. In Lectures on Petri Nets I: Basic Models, W. Reisig and G. Rozenberg, Eds. Lecture Notes in Computer Science Series, vol. 1491. Springer-Verlag, Berlin, 122–173.
DONGEN, B. VAN AND AALST, W. VAN DER 2004. Multi-Phase Process Mining: Building Instance Graphs. In International Conference on Conceptual Modeling (ER 2004), P. Atzeni, W. Chu, H. Lu, S. Zhou, and T. Ling, Eds. Lecture Notes in Computer Science Series, vol. 3288. Springer-Verlag, Berlin, 362–376.
DONGEN, B. AND AALST, W. VAN DER 2005. Multi-Phase Mining: Aggregating Instances Graphs into EPCs and Petri Nets. In Proceedings of the Second International Workshop on Applications of Petri Nets to Coordination, Workflow and Business Process Management, D. Marinescu, Ed. Florida International University, Miami, Florida, USA, 35–58.
DONGEN, B. VAN, BUSI, N., PINNA, G., AND AALST, W. VAN DER 2007. An Iterative Algorithm for Applying the Theory of Regions in Process Mining. In Proceedings of the Workshop on Formal Approaches to Business Processes and Web Services (FABPWS’07), W. Reisig, K. Hee, and K. Wolf, Eds. Publishing House of University of Podlasie, Siedlce, Poland, 36–55.
EHRENFEUCHT, A. AND ROZENBERG, G. 1989. Partial (Set) 2-Structures - Part 1 and Part 2. Acta Informatica 27, 4, 315–368.
GRECO, G., GUZZO, A., PONTIERI, L., AND SACCÀ, D. 2006. Discovering Expressive Process Models by Clustering Log Traces. IEEE Transactions on Knowledge and Data Engineering 18, 8, 1010–1027.
GÜNTHER, C. AND AALST, W. VAN DER 2007. Fuzzy Mining: Adaptive Process Simplification Based on Multi-perspective Metrics. In International Conference on Business Process Management (BPM 2007), G. Alonso, P. Dadam, and M. Rosemann, Eds. Lecture Notes in Computer Science Series, vol. 4714. Springer-Verlag, Berlin, 328–343.
HAND, D., MANNILA, H., AND SMYTH, P. 2001. Principles of Data Mining. MIT Press, Cambridge, MA.
HERBST, J. 2000. A Machine Learning Approach to Workflow Management. In Proceedings 11th European Conference on Machine Learning. Lecture Notes in Computer Science Series, vol. 1810. Springer-Verlag, Berlin, 183–194.
TFPM – IEEE TASK FORCE ON PROCESS MINING. 2011. Process Mining Manifesto. In BPM Workshops. Lecture Notes in Business Information Processing Series, vol. 99. Springer-Verlag, Berlin.
MANYIKA, J., CHUI, M., BROWN, B., BUGHIN, J., DOBBS, R., ROXBURGH, C., AND BYERS, A. 2011. Big Data: The Next Frontier for Innovation, Competition, and Productivity. McKinsey Global Institute.
MEDEIROS, A., WEIJTERS, A., AND AALST, W. VAN DER 2007. Genetic Process Mining: An Experimental Evaluation. Data Mining and Knowledge Discovery 14, 2, 245–304.
MUNOZ-GAMA, J. AND CARMONA, J. 2011. Enhancing Precision in Process Conformance: Stability, Confidence and Severity. In IEEE Symposium on Computational Intelligence and Data Mining (CIDM 2011), N. Chawla, I. King, and A. Sperduti, Eds. IEEE, Paris, France.
ROZINAT, A. AND AALST, W. VAN DER 2006. Decision Mining in ProM. In International Conference on Business Process Management (BPM 2006), S. Dustdar, J. Fiadeiro, and A. Sheth, Eds. Lecture Notes in Computer Science Series, vol. 4102. Springer-Verlag, Berlin, 420–425.
ROZINAT, A. AND AALST, W. VAN DER 2008. Conformance Checking of Processes Based on Monitoring Real Behavior. Information Systems 33, 1, 64–95.
SOLE, M. AND CARMONA, J. 2010. Process Mining from a Basis of Regions. In Applications and Theory of Petri Nets 2010, J. Lilius and W. Penczek, Eds. Lecture Notes in Computer Science Series, vol. 6128. Springer-Verlag, Berlin, 226–245.
SONG, M. AND AALST, W. VAN DER 2008. Towards Comprehensive Support for Organizational Mining. Decision Support Systems 46, 1, 300–317.
VERBEEK, H., BUIJS, J., DONGEN, B. VAN, AND AALST, W. VAN DER 2010. ProM 6: The Process Mining Toolkit. In Proc. of BPM Demonstration Track 2010, M. L. Rosa, Ed. CEUR Workshop Proceedings Series, vol. 615. 34–39.
WEIJTERS, A. AND AALST, W. VAN DER 2003. Rediscovering Workflow Models from Event-Based Data using Little Thumb. Integrated Computer-Aided Engineering 10, 2, 151–162.
WERF, J., DONGEN, B. VAN, HURKENS, C., AND SEREBRENIK, A. 2010. Process Discovery using Integer Linear Programming. Fundamenta Informaticae 94, 387–412.
WESKE, M. 2007. Business Process Management: Concepts, Languages, Architectures. Springer-Verlag, Berlin.

ACM Transactions on Management Information Systems, Vol. 99, No. 99, Article 99, Publication date: February 2012.