Published January 8, 2021 | Version 1 | Dataset | Open
Supplementary Material for "Process Data Properties Matter: Introducing Gated Convolutional Neural Networks (GCNN) and Key-Value-Predict Attention Networks (KVP) for Next Event Prediction with Deep Learning"
Description
Supplementary material for the article:
Heinrich, Kai ; Zschech, Patrick ; Janiesch, Christian ; Bonin, Markus: Process Data Properties Matter: Introducing Gated Convolutional Neural Networks (GCNN) and Key-Value-Predict Attention Networks (KVP) for Next Event Prediction with Deep Learning. In: Decision Support Systems, 2021.
Abstract: "Predicting next events in predictive process monitoring enables companies to manage and control processes at an early stage and reduce their action distance. In recent years, approaches have steadily moved from classical statistical methods towards the application of deep neural network architectures, which outperform the former and enable analysis without explicit knowledge of the underlying process model. While the focus of prior research is on the long short-term memory network architecture, more deep learning architectures offer promising extensions that have proven useful for other applications of sequential data. In our work, we introduce a gated convolutional neural network and a key-value-predict attention network to the task of next event prediction. In a comprehensive evaluation study on 11 real-life benchmark datasets, we show that these two novel architectures surpass prior work in 34 out of 44 metric-dataset combinations. For our evaluation, we consider the effects of process data properties, such as sparsity, variation, and repetitiveness, and discuss their impact on the prediction quality of the different deep learning architectures. Similarly, we evaluate their classification properties in terms of generalization and handling class imbalance. Our results provide guidance for researchers and practitioners alike on how to select, validate, and comprehensively benchmark (novel) predictive process monitoring models. In particular, we highlight the importance of sufficiently diverse process data properties in event logs and the comprehensive reporting of multiple performance indicators to achieve meaningful results."
Data is available under Creative Commons Attribution-ShareAlike (CC-BY-SA)
http://creativecommons.org/licenses/by-sa/4.0/
Software code is available under GNU General Public License (GNU GPL v3)
http://www.gnu.org/licenses/gpl-3.0.html
*** Use of this data for academic publications is explicitly permitted. ***
The dataset was created jointly by researchers working at the Technische Universität Dresden and TU Dortmund University.
Files
_readme.txt
Total size: 214.7 kB
| Checksum | PID | Size |
|---|---|---|
| md5:ea3fc6ab1981bc82caddebee8d6a8d15 | http://hdl.handle.net/11304/b203f94f-a361-4bfa-983f-97be15b991c5 | 3.4 kB |
| md5:1e98c701d9e7e55a9763971dfe6b6557 | http://hdl.handle.net/11304/8b6ede29-3db2-4f6f-804d-ff6c595d388c | 17.0 kB |
| md5:c904855cb2f2de147eddb4dc82edffa8 | http://hdl.handle.net/11304/7d2b8d49-b184-4da1-9484-4de209463291 | 55.5 kB |
| md5:a2c3dab8ef6b3aae1539777eff9ef135 | http://hdl.handle.net/11304/237e96cc-f328-40fb-8228-b355f628bc4b | 138.8 kB |
Additional details
Identifiers
- B2SHARE Legacy Record ID: 08b7ff704f724b94a61b4a6cac0fe1e0