Published January 8, 2021 | Version 1
Dataset Open

Supplementary Material for "Process Data Properties Matter: Introducing Gated Convolutional Neural Networks (GCNN) and Key-Value-Predict Attention Networks (KVP) for Next Event Prediction with Deep Learning"

Description

Supplementary material for the article:

Heinrich, Kai; Zschech, Patrick; Janiesch, Christian; Bonin, Markus: Process Data Properties Matter: Introducing Gated Convolutional Neural Networks (GCNN) and Key-Value-Predict Attention Networks (KVP) for Next Event Prediction with Deep Learning. In: Decision Support Systems, 2021.

Abstract: "Predicting next events in predictive process monitoring enables companies to manage and control processes at an early stage and reduce their action distance. In recent years, approaches have steadily moved from classical statistical methods towards the application of deep neural network architectures, which outperform the former and enable analysis without explicit knowledge of the underlying process model. While the focus of prior research is on the long short-term memory network architecture, more deep learning architectures offer promising extensions that have proven useful for other applications of sequential data. In our work, we introduce a gated convolutional neural network and a key-value-predict attention network to the task of next event prediction. In a comprehensive evaluation study on 11 real-life benchmark datasets, we show that these two novel architectures surpass prior work in 34 out of 44 metric-dataset combinations. For our evaluation, we consider the effects of process data properties, such as sparsity, variation, and repetitiveness, and discuss their impact on the prediction quality of the different deep learning architectures. Similarly, we evaluate their classification properties in terms of generalization and handling class imbalance. Our results provide guidance for researchers and practitioners alike on how to select, validate, and comprehensively benchmark (novel) predictive process monitoring models. In particular, we highlight the importance of sufficiently diverse process data properties in event logs and the comprehensive reporting of multiple performance indicators to achieve meaningful results."
Data is available under Creative Commons Attribution-ShareAlike (CC-BY-SA): http://creativecommons.org/licenses/by-sa/4.0/

Software code is available under GNU General Public License (GNU GPL v3): http://www.gnu.org/licenses/gpl-3.0.html

*** Permission to use this data for academic publications is granted explicitly. ***

The dataset was created jointly by researchers working at the Technische Universität Dresden and TU Dortmund University.

Files

_readme.txt

Files (214.7 kB)

Name Size

Checksum: md5:ea3fc6ab1981bc82caddebee8d6a8d15
PID: http://hdl.handle.net/11304/b203f94f-a361-4bfa-983f-97be15b991c5
Size: 3.4 kB

Checksum: md5:1e98c701d9e7e55a9763971dfe6b6557
PID: http://hdl.handle.net/11304/8b6ede29-3db2-4f6f-804d-ff6c595d388c
Size: 17.0 kB

Checksum: md5:c904855cb2f2de147eddb4dc82edffa8
PID: http://hdl.handle.net/11304/7d2b8d49-b184-4da1-9484-4de209463291
Size: 55.5 kB

Checksum: md5:a2c3dab8ef6b3aae1539777eff9ef135
PID: http://hdl.handle.net/11304/237e96cc-f328-40fb-8228-b355f628bc4b
Size: 138.8 kB
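
The MD5 checksums listed above can be used to confirm that a downloaded file arrived intact. The following is a minimal sketch in Python, not part of the published material: the function names `md5_of_file` and `matches_checksum` and the local path in the usage comment are hypothetical, and the sketch assumes the checksum is given in the `md5:<hex digest>` form shown in the file list.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a listed checksum such as 'md5:<hex digest>'."""
    expected = listed.strip().lower()
    # Drop the 'md5:' prefix used in the file list, if present.
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


# Usage (hypothetical local path; substitute the file you downloaded):
# ok = matches_checksum("downloaded_file", "md5:ea3fc6ab1981bc82caddebee8d6a8d15")
```

A result of `False` suggests the download is incomplete or the file was altered, in which case downloading it again is the usual first step. MD5 is adequate for detecting accidental corruption but is not a safeguard against deliberate tampering.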

Additional details

Identifiers

B2SHARE Legacy Record ID
08b7ff704f724b94a61b4a6cac0fe1e0