There is no doubt about the current role of Machine Learning in the fascinating world of Business Intelligence. Predicting whether a customer will remain loyal to the company, understanding customers' behavior, or anticipating market fluctuations are typical examples in which Machine Learning can be pivotal. Unfortunately, most successful Machine Learning algorithms, like Random Forests, Neural Networks, or Support Vector Machines, do not provide any mechanism to explain how they arrived at a particular conclusion and behave like a "black box". This means that they are neither transparent nor interpretable. We can understand transparency as the algorithm's ability to explain its reasoning mechanism, while interpretability refers to the algorithm's ability to explain the semantics behind the problem domain.
The huge potential of process mining applications has, luckily, already been discovered in a variety of business settings. In industry, more and more companies are learning about its potential value. Meanwhile, academic researchers continue their quest for the best algorithm, the most meaningful metrics, the most understandable visualisations, and so on. Whatever 'best', 'meaningful', and 'understandable' may be… These are food for thought and discussion in their own right. But I'd like to address a different mini-research-topic-on-its-own: the event log.
An implicit assumption in process mining (both in research and in applications) is the existence of an event log: a record of which activities were executed, for which case, and when.
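To make this assumption concrete, here is a minimal sketch of what such an event log might look like. The column names (`case_id`, `activity`, `timestamp`) and the order-handling activities are illustrative assumptions, not taken from any particular system; real logs typically carry many more attributes.

```python
# A minimal event log sketch: each event records a case identifier,
# an activity name, and a timestamp (all names here are hypothetical).
from datetime import datetime

event_log = [
    {"case_id": "order-001", "activity": "Register order", "timestamp": datetime(2023, 5, 1, 9, 0)},
    {"case_id": "order-001", "activity": "Check stock",    "timestamp": datetime(2023, 5, 1, 9, 30)},
    {"case_id": "order-002", "activity": "Register order", "timestamp": datetime(2023, 5, 1, 10, 0)},
    {"case_id": "order-001", "activity": "Ship order",     "timestamp": datetime(2023, 5, 2, 14, 0)},
]

def trace(log, case_id):
    """Return the time-ordered sequence of activities for one case."""
    events = [e for e in log if e["case_id"] == case_id]
    return [e["activity"] for e in sorted(events, key=lambda e: e["timestamp"])]

print(trace(event_log, "order-001"))  # → ['Register order', 'Check stock', 'Ship order']
```

Grouping events per case and ordering them by timestamp yields the traces that process mining algorithms take as input.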
The picture above shows a list of possible actions that might be going on in an organization. As a process modeler, would you include all of them as activities in your diagram? Or do you think some are too detailed, or rather not detailed enough, to be considered an activity? Maybe you would even argue that some of them are not relevant enough to appear in a process model at all…