Process Mining Introduction PDF
Document Details
Wil van der Aalst
Summary
This document introduces process mining, a data science approach focusing on real-world processes. It covers concepts like data explosion, process models, and different perspectives, making it ideal for understanding and improving business processes.
Full Transcript
Chapter 1 Introduction
prof.dr.ir. Wil van der Aalst
www.processmining.org

Overview
Chapter 1 Introduction
Part I: Preliminaries
− Chapter 2 Process Modeling and Analysis
− Chapter 3 Data Mining
Part II: From Event Logs to Process Models
− Chapter 4 Getting the Data
− Chapter 5 Process Discovery: An Introduction
− Chapter 6 Advanced Process Discovery Techniques
Part III: Beyond Process Discovery
− Chapter 7 Conformance Checking
− Chapter 8 Mining Additional Perspectives
− Chapter 9 Operational Support
Part IV: Putting Process Mining to Work
− Chapter 10 Tool Support
− Chapter 11 Analyzing “Lasagna Processes”
− Chapter 12 Analyzing “Spaghetti Processes”
Part V: Reflection
− Chapter 13 Cartography and Navigation
− Chapter 14 Epilogue

Data explosion

Not just data … processes matter …
Process Mining: Data Science in Action

Example process model
[Figure: Petri net with places start, c1, c2, c3, c4, c5, end and transitions register request, examine thoroughly, examine casually, check ticket, decide, reinitiate request, pay compensation, reject request]

Same process in terms of BPMN rather than Petri nets
[Figure: BPMN model with start and end events and the same eight activities]

What are process models used for?
− insight: while making a model, the modeler is triggered to view the process from various angles;
− discussion: the stakeholders use models to structure discussions;
− documentation: processes are documented for instructing people or certification purposes (cf. ISO 9000 quality management);
− verification: process models are analyzed to find errors in systems or procedures (e.g., potential deadlocks);
− performance analysis: techniques like simulation can be used to understand the factors influencing response times, service levels, etc.;
− animation: models enable end users to “play out” different scenarios and thus provide feedback to the designer;
− configuration: models can be used to configure a system.
Limitations
Executable models may be used to force people to work in a particular manner. However, most models are not well aligned with reality. Most hand-made models are disconnected from reality and provide only an idealized view of the processes at hand: “paper tigers”.
Given (a) the interest in process models, (b) the abundance of event data, and (c) the limited quality of hand-made models, it seems worthwhile to relate event data to process models: process mining!

What is Process Mining?
Process mining aims to discover, monitor and improve real processes by extracting knowledge from event logs readily available in today's (information) systems.
Process mining includes:
− (automated) process discovery
− conformance checking (i.e., monitoring deviations by comparing model and log)
− social network / organizational mining
− automated construction of simulation models
− model extension
− model repair
− case prediction
− history-based recommendations
Unlike traditional approaches, the goal is not to construct a single static model. Process mining techniques can be used to dynamically generate process maps based on the most recent data.
[Photo: Professor Wil van der Aalst (godfather of process mining), Department of Mathematics & Computer Science, Eindhoven University of Technology]
http://www.win.tue.nl/ieeetfpm/lib/exe/fetch.php?media=shared:process_mining_manifesto-small.pdf

What is the difference between BPM and Process Mining?
BPM
− Business Process Management (BPM) techniques and tools revolve around process models.
− It focuses on improving corporate performance by managing and optimising a company's business processes.
− Unfortunately, process models are often completely disconnected from actual event data.
− Analysis results are unreliable because they are not based on observed facts, but on an idealized model of reality.
Process Mining
− Process mining aims to bridge the gap between BI and BPM.
− The starting point for process mining is an event log.
− Each event in such a log refers to an activity and is related to a particular case.
− The events belonging to a case are ordered and describe one “run” of the process.
http://www.processmining.org/_media/publications/p651.pdf

Positioning Process Mining
[Figure: positioning diagram with labels process mining, process discovery, conformance checking, data mining / machine learning, predictive analytics, BPM]

Process Mining Versus Data Mining
− Both start from data.
− Data mining techniques are typically not process-centric.
− Topics such as process discovery, conformance checking, and bottleneck analysis are not addressed by traditional data mining techniques.
− End-to-end process models and concurrency are essential for process mining.
− Process mining assumes event logs where events have timestamps and refer to cases (process instances).
− Process mining and data mining need to be combined for more advanced questions.

Perspectives
− The control-flow perspective focuses on the control flow, i.e., the ordering of activities.
− The organizational perspective focuses on information about resources hidden in the log, i.e., which actors (e.g., people, systems, roles, and departments) are involved and how they are related.
− The case perspective focuses on properties of cases, e.g., cases can also be characterized by the values of the corresponding data elements.
− The time perspective is concerned with the timing and frequency of events.
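
The event-log assumptions above (each event refers to an activity and a case, and the events of a case are ordered by time) can be illustrated with a minimal Python sketch. The toy log, its timestamps, and the helper names `build_traces` and `directly_follows` are assumptions made for illustration; they are not taken from the slides. Counting directly-follows pairs is only one simple way to look at the control-flow perspective, not a full discovery technique.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical toy event log: (case id, activity, timestamp).
# Rows are deliberately unordered to show that ordering comes from timestamps.
events = [
    ("case2", "check ticket", "2024-01-01T09:40"),
    ("case1", "register request", "2024-01-01T09:00"),
    ("case1", "examine casually", "2024-01-01T09:30"),
    ("case2", "register request", "2024-01-01T09:10"),
    ("case1", "check ticket", "2024-01-01T10:00"),
    ("case2", "examine casually", "2024-01-01T10:20"),
    ("case1", "decide", "2024-01-01T11:00"),
    ("case2", "decide", "2024-01-01T11:10"),
    ("case1", "reject request", "2024-01-01T11:30"),
    ("case2", "pay compensation", "2024-01-01T12:00"),
]


def build_traces(log):
    """Group events by case and order each case by timestamp (one 'run' per case)."""
    by_case = defaultdict(list)
    for case_id, activity, timestamp in log:
        by_case[case_id].append((datetime.fromisoformat(timestamp), activity))
    return {
        case_id: [activity for _, activity in sorted(case_events)]
        for case_id, case_events in by_case.items()
    }


def directly_follows(traces):
    """Count how often activity x is immediately followed by activity y."""
    counts = Counter()
    for trace in traces.values():
        counts.update(zip(trace, trace[1:]))
    return counts


traces = build_traces(events)
df_counts = directly_follows(traces)

for case_id, trace in sorted(traces.items()):
    print(case_id, "->", ", ".join(trace))
for (first, second), count in sorted(df_counts.items()):
    print(f"{first} -> {second}: {count}")
```

In this toy log the two cases perform "examine casually" and "check ticket" in opposite orders, the kind of observation that hints at concurrency between the two activities.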
All Types of Process Mining
Organization managers want to know:
− What is the most frequent path in my organization's process?
− How are the cases distributed over my organization's process?
− To what extent do the cases comply with my process model?
− What are the routing probabilities in my process?
− What are the throughput times of my cases?
− What are the service times for my tasks?
− How much time was spent between any two tasks in my process?
− How are my cases actually being executed?
− What are the business rules in my process? Are these rules indeed being obeyed?
− How many of my people are typically involved in a case?
− Who is active? Who is idle?
− What are the communication structure and dependencies among my people?
− Who are the important people in my organization?
− Who subcontracts work to whom?
− What are the bottlenecks in my process?
Van der Aalst, W. M., van Dongen, B. F., Günther, C. W., Rozinat, A., Verbeek, E., & Weijters, T. (2009). ProM: The Process Mining Toolkit. In Proceedings of the Business Process Management Demonstration Track (BPMDemos 2009), Ulm, Germany, September 8, 2009.

Starting point: event log
XES, MXML, SA-MXML, CSV, etc.

Simplified event log
a = register request, b = examine thoroughly, c = examine casually, d = check ticket, e = decide, f = reinitiate request, g = pay compensation, and h = reject request

Process discovery
[Figure: discovered Petri net with places start, c1 to c5, end and transitions a (register request), b (examine thoroughly), c (examine casually), d (check ticket), e (decide), f (reinitiate request), g (pay compensation), h (reject request)]

Another example
[Figure: Petri net with places start, c1 to c5, end and transitions a (register request), b (examine thoroughly), d (check ticket), e (decide), h (reject request)]

Play-In
From event log to process model.

Play-Out
From process model to event log.

Replay
Event log and process model together yield an extended model showing times, frequencies, etc., as well as diagnostics, predictions, and recommendations.

Replay
Connecting models to real events is crucial!
Possible uses:
− Conformance checking
− Repairing models
− Extending the model with frequencies and temporal information
− Constructing predictive models
− Operational support (prediction, recommendation, etc.)

Play In: Simple process allowing for many traces
Traces: abdeg adbeg adbeg adbeh abdeh abdeh adbeh adbeh adbeh
[Figure: process model with start, end and activities register travel request (a), get support from local manager (b), check budget by finance (d), decide (e), accept request (g), reject request (h)]

Question
Traces: abde adbf acde adce adbe abdf adbe adcf abdf acdf
Create a process model that allows for the traces shown above.

Answer
Traces: abde adbf acde adce adbe abdf adbe adcf abdf acdf
[Figure: process model with start, end and activities register travel request (a), get support from local manager (b), get detailed motivation letter (c), check budget by finance (d), accept request (e), reject request (f)]

Replay
Trace: aceg
? check budget (d) is missing!
[Figure: process model with start, end and activities register travel request (a), get support from local manager (b), get detailed motivation letter (c), check budget by finance (d), decide (e), reinitiate request (f), accept request (g), reject request (h)]

Another event log
[Figure: Petri net with places start, c1 to c5, end and transitions a (register request), b (examine thoroughly), c (examine casually), d (check ticket), e (decide), f (reinitiate request), g (pay compensation), h (reject request)]

Process Mining Tools: Overview
Open-source:
− Apromore (U. Melbourne, U. Tartu, QUT, …)
− bupaR (U. Hasselt)
− pm4py (various)
− ProM (TU/e)
Lightweight:
− Disco
− QPR Xpress
Enterprise-level:
− Celonis
− Minit
− myInvenio
− ProcessGold
− QPR Enterprise
− Signavio PI
− …

Desire lines in process models

Trends and terms
− Business Process Management (BPM)
− Business Intelligence (BI)
− Six Sigma
− Online Analytical Processing (OLAP)
− Business Activity Monitoring (BAM)
− Complex Event Processing (CEP)
− Corporate Performance Management (CPM)
− Visual Analytics (VA)
− Predictive Analytics (PA)
− Continuous Process Improvement (CPI)
− Total Quality Management (TQM)

Six Sigma
Six Sigma was originally developed by Motorola in the early 1980s.
DMAIC approach:
− Define the problem and set targets,
− Measure key performance indicators and collect data,
− Analyze the data to investigate and verify cause-and-effect relationships,
− Improve the current process based on this analysis,
− Control the process to minimize deviations from the target.
Key principle: if you can’t measure it, you can’t manage it!

[μ-6σ, μ+6σ] with a 1.5σ shift
A process that “runs at Six Sigma” has only 3.4 defective cases per million cases, i.e., on average 99.9997% of the cases are handled properly.

Six sigma quality management approach
Let's go through a simple example of calculating Six Sigma using medical data.
Scenario: Suppose you are monitoring the time it takes to prepare a medication in a hospital pharmacy. You have collected data from 10 different instances (for simplicity) on the time taken (in minutes) to prepare a specific medication.
Sample Data (in minutes): 8, 9, 10, 9, 8, 7, 10, 9, 8, 9
Steps to Calculate Six Sigma:

Six sigma example

Healthcare Reference Model

Example process discovery for hospital (627 gynecological oncology patients, 24331 events)

Simple healthcare process model

Extension

References
These slides are adapted from the process mining online course and other supporting materials listed in the blackboard reference list.
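
As a rough check on the Six Sigma figures quoted in the slides above, the following Python sketch reproduces the 3.4 defects per million from the 1.5σ shift and computes the mean and sample standard deviation of the pharmacy preparation times. The slides do not state a specification limit, so the `upper_spec_limit` argument and the 12-minute value used below are hypothetical; the helper names are likewise assumptions, and this is not necessarily the calculation shown on the original slide.

```python
import math
import statistics


def normal_upper_tail(z: float) -> float:
    """Probability that a standard normal variable exceeds z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# With the mean shifted 1.5 sigma toward a limit placed at 6 sigma,
# only 4.5 sigma of margin remains on that side. The opposite tail
# (7.5 sigma away) is negligible and is ignored here.
SHIFT = 1.5
dpmo_six_sigma = normal_upper_tail(6.0 - SHIFT) * 1_000_000

# Sample data from the slide: medication preparation times in minutes.
prep_times = [8, 9, 10, 9, 8, 7, 10, 9, 8, 9]
mean_time = statistics.mean(prep_times)
std_time = statistics.stdev(prep_times)  # sample standard deviation (n - 1)


def sigma_level(upper_spec_limit: float, mean: float, std: float) -> float:
    """Number of standard deviations between the mean and an upper limit."""
    return (upper_spec_limit - mean) / std


print(f"Defects per million at Six Sigma (1.5 sigma shift): {dpmo_six_sigma:.2f}")
print(f"Mean preparation time: {mean_time:.2f} min")
print(f"Sample standard deviation: {std_time:.2f} min")

# Hypothetical upper specification limit of 12 minutes (not from the slides).
HYPOTHETICAL_LIMIT = 12
level = sigma_level(HYPOTHETICAL_LIMIT, mean_time, std_time)
print(f"Sigma level for a {HYPOTHETICAL_LIMIT}-minute limit: {level:.2f}")
```

For the sample data the mean is 8.7 minutes and the sample standard deviation is about 0.95 minutes; the resulting sigma level depends entirely on the limit chosen.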