New Algorithm by Pusan National University Scientists Can Repair Missing Data in Event Logs with Superior Accuracy

BUSAN, South Korea, Dec. 15, 2021 /PRNewswire/ -- Digitalization has enabled businesses to record their operations in event logs, where each activity in a business process is stored as data with attributes such as a timestamp and an event name. These logs give an overview of operations and can be used to develop process models that optimize the business process. However, the optimization is only as good as the stored data, and event logs with missing events lead to poor analysis and poor data models.

In a collaborative study, researchers from Pusan National University, South Korea, including Dr. Sunghyun Sim and Prof. Hyerim Bae, along with Prof. Ling Liu from Georgia Institute of Technology, have developed a method that can restore missing data in an event log. The study, published in IEEE Transactions on Services Computing, relies on imputation methods that exploit correlations between available data to infer missing information. "Since data is collected from multiple perspectives in numerous information systems, there is a relationship between the collected data. Starting with this point, our study suggested a method of restoring missing event values by utilizing the relationship among entities in the event log, which can overcome human error or system," explains Dr. Sim.

In event logs, events have attributes that are linked to other events in "single event" or "multiple event" relationships. In the former case, each attribute of an event corresponds to a unique attribute in another event. Based on this relationship, the researchers developed a systematic event imputation (SEI) method that restores a missing value by simply referring to the available value it is linked to.
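
As a rough illustration of the lookup idea described above, the following Python sketch fills a missing attribute from a linked attribute when the two are observed to correspond one-to-one. It is a minimal sketch under stated assumptions, not the researchers' SEI implementation: the attribute names ("activity", "resource"), the sample log, and the function names are all hypothetical.

```python
from typing import Dict, List, Optional

Event = Dict[str, Optional[str]]


def learn_one_to_one(events: List[Event], src: str, dst: str) -> Dict[str, str]:
    """Collect src -> dst pairs from complete events; keep only unambiguous ones."""
    seen: Dict[str, set] = {}
    for e in events:
        s, d = e.get(src), e.get(dst)
        if s is not None and d is not None:
            seen.setdefault(s, set()).add(d)
    # A src value linked to more than one dst value is not a unique correspondence.
    return {s: next(iter(ds)) for s, ds in seen.items() if len(ds) == 1}


def impute_by_lookup(events: List[Event], src: str, dst: str) -> List[Event]:
    """Return a copy of the log with missing dst values filled from src."""
    mapping = learn_one_to_one(events, src, dst)
    repaired = []
    for e in events:
        fixed = dict(e)  # leave the original event untouched
        if fixed.get(dst) is None and fixed.get(src) in mapping:
            fixed[dst] = mapping[fixed[src]]
        repaired.append(fixed)
    return repaired


# Hypothetical log: None marks a missing value.
log = [
    {"case": "c1", "activity": "Register", "resource": "Desk-A"},
    {"case": "c1", "activity": "Approve", "resource": "Manager-B"},
    {"case": "c2", "activity": None, "resource": "Desk-A"},
    {"case": "c2", "activity": "Approve", "resource": None},
]
repaired = impute_by_lookup(log, src="resource", dst="activity")
```

Here the missing activity in case c2 is recovered from the resource it is linked to. Values with several possible matches are deliberately left missing, since that situation calls for a different approach.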

However, in the latter case, where attributes have multiple correspondences, a simple matching of attributes is not possible. For such situations, the researchers developed a multiple event imputation (MEI) method, in which missing events are first estimated and then used to build event sequences, or event chains. These chains are compared with an event log that has no missing data to restore the missing event attributes.
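
The idea of comparing event chains against a complete log can be sketched in Python as follows. This is a simplified stand-in and not the MEI method itself: it assumes that a missing event can be guessed from its immediate neighbors, and the trace contents and function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

Context = Tuple[Optional[str], Optional[str]]


def context_counts(complete_traces: List[List[str]]) -> Dict[Context, Counter]:
    """Count which activity occurs between each (previous, next) pair."""
    counts: Dict[Context, Counter] = {}
    for trace in complete_traces:
        padded = ["<start>", *trace, "<end>"]
        for prev, cur, nxt in zip(padded, padded[1:], padded[2:]):
            counts.setdefault((prev, nxt), Counter())[cur] += 1
    return counts


def impute_trace(trace: List[Optional[str]],
                 counts: Dict[Context, Counter]) -> List[Optional[str]]:
    """Fill each missing event (None) with the most frequent fit for its neighbors."""
    padded = ["<start>", *trace, "<end>"]
    for i in range(1, len(padded) - 1):
        if padded[i] is None:
            candidates = counts.get((padded[i - 1], padded[i + 1]))
            if candidates:  # unseen contexts are left missing
                padded[i] = candidates.most_common(1)[0][0]
    return padded[1:-1]


# Hypothetical complete log used as the reference.
complete = [
    ["Register", "Check", "Approve", "Pay"],
    ["Register", "Check", "Reject"],
    ["Register", "Check", "Approve", "Pay"],
]
counts = context_counts(complete)
fixed = impute_trace(["Register", None, "Approve", "Pay"], counts)
```

In this sketch, two adjacent missing events stay unresolved because neither has a known neighbor on both sides. Handling such gaps is presumably where a more elaborate chain-based estimate is needed.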

These imputation methods were applied simultaneously by a bagging recurrent event imputation (BREI) algorithm, which uses bootstrap sampling and recurrent event imputation (REI) to repair the event log. On tests with real-world event logs, the researchers found that their algorithm improved restoration accuracy by 10–30% compared to existing restoration algorithms. Moreover, it could restore almost 90% of the data accuracy even when more than half of the data was missing.
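
For readers unfamiliar with bagging, the general technique can be sketched in Python as below: several bootstrap samples are drawn from the available traces, a simple imputer is fitted on each, and their guesses are combined by majority vote. This is generic bagging under stated assumptions, not the BREI algorithm. The base imputer (most frequent successor of the previous event), the majority vote, the parameter values, and the sample traces are all hypothetical.

```python
import random
from collections import Counter
from typing import Dict, List, Optional


def fit_successors(traces: List[List[str]]) -> Dict[str, Counter]:
    """Base imputer: count which activity follows each activity."""
    table: Dict[str, Counter] = {}
    for trace in traces:
        for prev, cur in zip(["<start>", *trace], trace):
            table.setdefault(prev, Counter())[cur] += 1
    return table


def bagged_guess(traces: List[List[str]], prev: str,
                 n_models: int = 25, seed: int = 0) -> Optional[str]:
    """Guess the event that follows `prev` by voting over bootstrap models."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    votes: Counter = Counter()
    for _ in range(n_models):
        # Bootstrap sample: draw traces with replacement, same size as the input.
        sample = rng.choices(traces, k=len(traces))
        table = fit_successors(sample)
        if prev in table:
            votes[table[prev].most_common(1)[0][0]] += 1
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical traces observed without gaps.
traces = [["Register", "Check", "Approve"]] * 4 + [["Register", "Check", "Reject"]]
guess = bagged_guess(traces, prev="Register")
```

Voting over many resampled models tends to make the final guess less sensitive to quirks of any single sample, which is the usual motivation for bagging.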

Apart from optimizing business processes, the researchers are optimistic that such an algorithm can be extended to other applications that rely on the quality of data. One promising avenue lies in improving the data fed to AI systems, and this method has the potential to accelerate the development of AI technologies. "It is possible to improve the performance of artificial intelligence by improving the quality of data in its learning process. The algorithm will also help prevent model malfunction by improving the quality of data it collects in real-time in a real-time environment," elaborates Prof. Bae.

The high accuracy and versatility of the new algorithm are expected to lead to its widespread application in industry in the near future.


Title of original paper: Bagging Recurrent Event Imputation for Repair of Imperfect Event Log With Missing Categorical Events

Journal: IEEE Transactions on Services Computing

Na-hyun Lee
+82 51 510-7928

SOURCE Pusan National University