Companies often forget that the battle for skilled workers is fought not only on the labor market but also within the companies themselves. Those who succeed in permanently retaining qualified employees secure their wealth of experience and their own competitiveness. This is the conclusion of a pilot project on the topic of machine learning, which the Windhoff Group carried out together with the German Pension Insurance Association (DRV Bund).
“At DRV Bund, the topic of strategic human resources planning will be more important than ever in the coming years. Due to Germany’s demographics, we find ourselves in a dilemma: On the one hand, as the number of retirees increases, so do the volume of applications and thus our staffing needs. On the other hand, a very large number of employees is leaving us in the medium term due to reaching retirement age themselves, which reduces our workforce. On top of that, there is an increasingly intense war for talent,” explains Dr. Michael Tekieli, responsible for People Analytics at DRV Bund.
Changing priorities
So far, the issue of fluctuation has not been one of the pain points, but it will definitely increase in relevance in the near future, Dr. Tekieli explains. “To avoid fluctuation becoming a pain point, we need targeted solutions. These should enable us to at least anticipate changes in the personnel landscape early and accurately and, in the best case, use our knowledge to proactively and effectively counteract departures that might impede our company’s competitiveness.”
With this in mind, the pilot project focused on two questions: To what extent can machine learning help identify attrition risks? Can explainable artificial intelligence (XAI) help reveal the reasons for non-age-related fluctuation? To answer these questions, the project managers decided to use Smart Predict in conjunction with the SAP Analytics Cloud.
In the first step, it was important to create a coherent data basis from internal (ERP and HR systems) as well as external sources. In practice, this meant that an employee’s personal and professional data was supplemented by aspects from the corporate environment. In total, 40 descriptive attributes were coded.
Next, the most important influencing factors were identified: age, age of the youngest child, actual hours worked excluding absenteeism, length of service in months, absolute salary increase in the past twelve months, and the severity of constraints due to actions taken during the pandemic, as measured by the Covid Stringency Index. The search for suitable descriptive variables drew on scientific publications and the creativity of the entire project team. The target variables were non-age-related terminations within different time horizons (the coming 1, 3, 6, or 12 months). The attributes were collected on a monthly basis from 2018 through 2020 for each of the more than 25,000 employees. This resulted in a data set of 650,000 rows, or 230 megabytes.
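To illustrate how target variables for several time horizons can be derived from monthly snapshots, here is a minimal Python sketch. It is not the project's actual data preparation: the function names, the column names such as `leaves_within_3m`, and the rule that a departure counts if it falls 1 to h months after the snapshot are all assumptions made for illustration.

```python
from datetime import date
from typing import Dict, Optional

HORIZONS = (1, 3, 6, 12)  # months ahead, as listed in the text


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end (day of month is ignored)."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def label_snapshot(snapshot_month: date, exit_month: Optional[date]) -> Dict[str, int]:
    """Return one 0/1 target per horizon for a single employee-month row.

    exit_month is the month of a non-age-related departure, or None if the
    employee did not leave (hypothetical input format).
    """
    labels = {}
    for h in HORIZONS:
        if exit_month is None:
            labels[f"leaves_within_{h}m"] = 0
        else:
            gap = months_between(snapshot_month, exit_month)
            # Assumed rule: the departure lies 1..h months after the snapshot.
            labels[f"leaves_within_{h}m"] = int(1 <= gap <= h)
    return labels


# Snapshot from March 2020 for an employee who leaves in July 2020 (4 months later).
example = label_snapshot(date(2020, 3, 1), date(2020, 7, 1))
print(example)
```

In this example the departure is four months away, so the 1-month and 3-month targets are 0 while the 6-month and 12-month targets are 1.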
Smart Predict
Smart Predict makes it possible to analyze collected data as a self-service. Decisive arguments for the use of the SAP Analytics Cloud (SAC) were fast results through automated machine learning, transparency thanks to XAI, and highly predictive, powerful machine learning algorithms. The analysis works intuitively and can be performed without any programming skills, so neither IT experts nor data science resources are required.
To counteract reservations about the new technology, a so-called out-of-sample test was conducted. Among other things, the project revealed that an algorithm trained on data prior to June 2020 would have retrospectively detected departing employees in the second half of 2020 with a hit rate of 12.5 percent. The test also showed that the learned correlations were robust, i.e., transferable into the future. Overall, more than 50 percent of non-age-related attrition was detected by machine learning.
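The idea of an out-of-sample test can be sketched in a few lines of Python: train only on rows before a cutoff date, then compare the model's flags with what actually happened afterwards. The sketch below uses made-up toy data and hypothetical function names. It computes two common measures, precision and recall; which measure each of the project's reported figures corresponds to is not stated in the source.

```python
from datetime import date


def split_by_cutoff(rows, cutoff):
    """Time-based split: rows before the cutoff train the model, the rest test it."""
    train = [r for r in rows if r["month"] < cutoff]
    test = [r for r in rows if r["month"] >= cutoff]
    return train, test


def precision_recall(flagged, left):
    """Compare the set of flagged employees with the set that actually left."""
    true_pos = len(flagged & left)
    precision = true_pos / len(flagged) if flagged else 0.0  # share of flags that were right
    recall = true_pos / len(left) if left else 0.0           # share of leavers that were flagged
    return precision, recall


# Toy monthly snapshots (invented data, not from the project).
rows = [
    {"employee": "A", "month": date(2020, 3, 1)},
    {"employee": "A", "month": date(2020, 8, 1)},
    {"employee": "B", "month": date(2020, 5, 1)},
    {"employee": "B", "month": date(2020, 6, 1)},
]
train, test = split_by_cutoff(rows, date(2020, 6, 1))

# Hypothetical outcome: the model flags four employees, two employees actually leave.
flagged = {"A", "B", "C", "D"}
left = {"A", "E"}
precision, recall = precision_recall(flagged, left)
print(len(train), len(test), precision, recall)
```

Splitting by date rather than at random matters here, because a random split would let the model see information from the very period it is supposed to predict.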
Another method of increasing acceptance was a plausibility check of the patterns that the predictive analytics model uses to generate forecasts. These patterns also proved to be an important means of understanding which employees might leave the company and why.
However, modern machine learning algorithms are so complex that the effect of influencing factors cannot be directly understood. This is referred to as a black-box phenomenon. Consequently, approaches to successively explain the black box have been created in recent years. For example, SAC has been using SHAP values since the Q3 2021 release. With this, the importance and influence of various attributes can be explained on a local level, i.e., in relation to the individual employee. This enables plausibility and causality checks by domain experts as well as gaining new insights.
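What a local explanation looks like can be shown with a small Python sketch that computes exact Shapley values, the concept on which SHAP is based, for a single employee. This is not SAC's implementation: the toy risk function, the feature names, and the single all-zero baseline are assumptions for illustration, and practical SHAP implementations typically average over a background data set and use approximations instead of enumerating every feature subset.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values for one instance relative to a baseline.

    Features outside a coalition take their baseline value. The cost grows
    exponentially with the number of features, so this only suits toy cases.
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        mixed = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return model(mixed)

    phi = {}
    for feat in names:
        others = [k for k in names if k != feat]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {feat}) - value(s))
        phi[feat] = total
    return phi


def toy_risk(x):
    """Hypothetical attrition score with an interaction between two factors."""
    return (0.1 + 0.3 * x["no_raise"] + 0.2 * x["short_tenure"]
            + 0.2 * x["no_raise"] * x["short_tenure"])


employee = {"no_raise": 1, "short_tenure": 1}
baseline = {"no_raise": 0, "short_tenure": 0}
phi = shapley_values(toy_risk, employee, baseline)
print(phi)
```

The contributions add up to the difference between this employee's score and the baseline score, which is what lets a domain expert see how much each attribute pushed an individual prediction up or down.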
Convincing results
Undoubtedly, modern technology allows IT experts and business users alike to extract objective answers from historical HR data. Thus, fluctuation can be anticipated months in advance and even probabilities can be quantified. Consequently, active HR retention measures (for example, targeted upskilling opportunities) can be initiated to keep employees in the company.
“I was initially very skeptical about the use of automated machine learning, since the supposedly important step of hyperparameter tuning is omitted. I was all the more positively surprised by the prediction quality. A key success factor for automated machine learning is certainly data quality. If all the steps are completely automated, the quality of the training data becomes even more important. The rather unpopular and dry topic of data integrity should therefore be at least as important to organizations as machine learning,” Dr. Tekieli summarizes.