Clustering mixed-type player behavior data for churn prediction in mobile games (CROSBI ID 320214)
Journal contribution | original scientific paper | international peer review
Authorship
Perišić, Ana ; Pahor, Marko
English
Clustering mixed-type player behavior data for churn prediction in mobile games
Marketers have long understood the importance of customer segmentation and customer churn prediction modelling. However, linking these processes remains a challenge. Customer segmentation is often performed by applying a clustering algorithm to customer behavioral data, which is itself a challenging task because datasets on customer behavior typically comprise mixed data types. This research focuses on clustering player behavior data for churn prediction modelling in the mobile games market and on constructing a dissimilarity measure capable of handling categorical and quantitative data simultaneously. The problem of finding an appropriate dissimilarity measure for mixed-type data with unbalanced categorical features and highly skewed numerical features is handled by establishing a hybrid dissimilarity measure constructed as a normalized linear combination of distances. Distances are calculated conditional on feature type, following the principles of Gower’s coefficient: for numerical features, distances are calculated by applying a modified winsorized Huber loss, while for categorical features, we incorporate a distance measure based on variable entropy. In conjunction with the PAM clustering algorithm, the established dissimilarity measure is applied to real-world datasets, and its performance is compared to several state-of-the-art clustering algorithms. This research also investigates the potential of customer segmentation as an integral part of churn prediction modelling in online games, which is operationalized by applying the proposed clustering method to a real dataset comprising mixed-type data originating from a casual mobile game. The benefits of customer segmentation are supported by the data, since churn prediction models exhibit higher performance when clustering is performed prior to churn classification.
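The abstract does not give the formulas, so the following is only a rough sketch of the general idea (a Gower-style average of per-feature distances, fed to a medoid-based clusterer), not the authors' measure. Everything specific here is an assumption made for illustration: the function names, the 5%/95% winsorizing quantiles, the Huber threshold `delta`, the use of normalized Shannon entropy as a weight on categorical mismatches, the equal feature weights, and the simple alternating k-medoids loop, which stands in for PAM and omits its build and swap phases.

```python
import numpy as np

def huber(r, delta):
    """Standard Huber loss: quadratic for |r| <= delta, linear beyond it."""
    a = np.abs(r)
    return np.where(a <= delta, 0.5 * a**2, delta * (a - 0.5 * delta))

def numeric_distance(col, lower=0.05, upper=0.95, delta=0.5):
    """Pairwise distance for one numeric feature, scaled to [0, 1] (assumed form)."""
    lo, hi = np.quantile(col, [lower, upper])
    clipped = np.clip(col, lo, hi)            # winsorize the skewed tails
    scale = hi - lo if hi > lo else 1.0
    z = (clipped - lo) / scale                # range-normalize, as in Gower
    d = huber(z[:, None] - z[None, :], delta)
    m = d.max()
    return d / m if m > 0 else d

def categorical_distance(col):
    """Simple mismatch, weighted by the feature's normalized entropy (assumed form)."""
    values, counts = np.unique(col, return_counts=True)
    p = counts / counts.sum()
    k = len(values)
    weight = -(p * np.log(p)).sum() / np.log(k) if k > 1 else 0.0
    mismatch = (col[:, None] != col[None, :]).astype(float)
    return weight * mismatch                  # unbalanced features count less

def mixed_dissimilarity(numeric, categorical):
    """Equal-weight average of the per-feature distance matrices."""
    parts = [numeric_distance(numeric[:, j]) for j in range(numeric.shape[1])]
    parts += [categorical_distance(categorical[:, j])
              for j in range(categorical.shape[1])]
    return sum(parts) / len(parts)

def k_medoids(D, k, n_iter=100, seed=0):
    """Alternating k-medoids on a precomputed matrix (a simplification of PAM)."""
    rng = np.random.default_rng(seed)
    medoids = rng.choice(len(D), size=k, replace=False)
    for _ in range(n_iter):
        labels = D[:, medoids].argmin(axis=1)     # assign to nearest medoid
        new = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size:                      # pick the most central member
                within = D[np.ix_(members, members)].sum(axis=1)
                new[c] = members[within.argmin()]
        if np.array_equal(new, medoids):
            break
        medoids = new
    return medoids, D[:, medoids].argmin(axis=1)
```

With a numeric array `num` of shape (n, p) and a categorical array `cat` of shape (n, q), `k_medoids(mixed_dissimilarity(num, cat), k)` returns the medoid indices and one cluster label per row. Those labels could then be used to fit a separate churn classifier per segment, which is the workflow the abstract describes.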
Mixed type data ; Clustering ; Distance measure ; Segmentation ; Churn prediction ; Customer behavior
Publication data
31 (1)
2023.
165-190
published
1435-246X
1613-9178
10.1007/s10100-022-00802-8
Related fields
Economics, Interdisciplinary social sciences, Interdisciplinary natural sciences, Mathematics