From relational database to valuable event logs for process mining – A procedure

The huge potential of process mining applications has, luckily, already been discovered in a variety of business settings. In industry, more and more companies are learning about its potential value. Meanwhile, academic researchers continue their quest for the best algorithm, the most meaningful metrics, the most understandable visualisations, etcetera. Whatever ‘best’, ‘meaningful’, and ‘understandable’ may be… These are food for thought and discussion on their own. But I’d like to address a different mini-research-topic-on-its-own: the event log.

An implicit assumption in process mining (both research and applications) is the existence of an event log.

The event log can generally be described as ‘A collection of events. An event 1) refers to a specific activity, 2) that took place at a certain moment in time, and 3) can be assigned to a unique case’.

Companies indeed often have this type of information. Yet in many cases it is hidden in a database system and does not present itself in a ready-to-go format. Hence the challenge of restructuring this data into an event log. To reach a valuable event log, a thorough understanding of the event log structure and of its link to the original data structure is necessary.
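To make the restructuring challenge a bit more concrete, here is a minimal sketch in Python. It is not the procedure itself, only an illustration of the three-part event definition above (case, activity, timestamp). The `purchase_order` table, its columns, and the activity names are all hypothetical, and the sketch assumes the simplest possible situation: one row per case, with one timestamp column per step.

```python
import sqlite3

# Hypothetical source data: one row per purchase order, with one
# timestamp column per step. All names here are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchase_order (
        po_id       INTEGER PRIMARY KEY,
        created_at  TEXT,
        approved_at TEXT,
        paid_at     TEXT
    );
    INSERT INTO purchase_order VALUES
        (1, '2016-03-01 09:00', '2016-03-02 14:30', '2016-03-10 11:00'),
        (2, '2016-03-01 10:15', '2016-03-05 08:45', NULL);
""")

# Each timestamp column becomes its own set of events. A NULL timestamp
# means the step has not happened (yet), so no event is emitted for it.
EVENT_LOG_QUERY = """
    SELECT po_id AS case_id, 'Create order' AS activity, created_at AS event_time
      FROM purchase_order WHERE created_at IS NOT NULL
    UNION ALL
    SELECT po_id, 'Approve order', approved_at
      FROM purchase_order WHERE approved_at IS NOT NULL
    UNION ALL
    SELECT po_id, 'Pay order', paid_at
      FROM purchase_order WHERE paid_at IS NOT NULL
    ORDER BY case_id, event_time
"""

# Every row is one event: (case, activity, timestamp).
event_log = conn.execute(EVENT_LOG_QUERY).fetchall()

for case_id, activity, event_time in event_log:
    print(case_id, activity, event_time)
```

Even in this toy setting a few decisions are already hiding in the query: the purchase order was chosen as the case, each timestamp column was mapped to exactly one activity, and missing timestamps were silently dropped. In a real database, where the relevant information is spread over many tables, these choices are far less obvious.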

When teaching my students how to build a valuable event log, and pointing them to all the different decisions and options they have to bear in mind, I noticed the trial-and-error learning curve and the lack of proper teaching material. To them, building a decent event log came across as a set of arbitrary decisions, and they could not (yet) grasp the consequences of these decisions.

Thinking back on how I learned the ‘art of building event logs’, it struck me that I started my first event log building project almost 10 years ago (in 2007). Actually, the struggles of my students now coincide with the struggles of the companies I had the privilege to work with. Over the years, I guided different companies on their journey into the world of process mining. Along this journey, I too had my fair share of trial-and-error.

So this is the outcome: I wrote a procedure to build valuable event logs from data that is stored in relational databases. The procedure is the result of distilling and structuring the experiences I had over the years. It is not based on academic research, but has to be seen as a report of ‘lessons learned’, poured into a manual. The aim is twofold:

  • to help novice process (mining) analysts take those first steps on the learning curve, and
  • to create awareness of the importance of a well-thought-out event log structure.

I hope I can succeed in both aims. This leaves me only one more thing to say: happy event log building!

Mieke Jans

This entry was posted in Business, Education, Tutorial by Mieke Jans.

About Mieke Jans

Mieke Jans obtained a PhD in the domain of Accounting Information Systems at Hasselt University. Her PhD topic on process mining for forensic purposes led her to a career start in consulting at Deloitte Belgium. As part of the data analytics team, she worked on very diverse topics, ranging from predictive asset management to supporting IT audits. The common denominator of all projects was business analytics. In September 2014, Mieke returned to the academic world, pursuing the research that interests her most: how to turn process analytics techniques into workable support for financial auditors?

One thought on “From relational database to valuable event logs for process mining – A procedure”

  1. Actually, a new focus is needed on discovering the best way to restructure an event log. Depending on the dataset, to obtain a thorough understanding of the event log structure, it is important to develop a way to plan the subject you want to discover.
    For this, if there is someone who knows the context of the business that generated the log, it is important that they be involved in the alignment.
