Interpretable and Accurate Identification of Job Seekers at Risk of Long-Term Unemployment: Explainable ML-Based Profiling

  • Original Research
SN Computer Science

Abstract

To tackle the adverse societal and individual consequences of long-term unemployment, many public employment services (PES) have implemented data-driven profiling systems to promptly identify vulnerable job seekers. More recently, PES have increasingly relied on more complex machine learning (ML) models because of their higher accuracy. However, concerns are growing about algorithmic opacity, which hinders comprehension of and trust in the predictions. The current study focuses on the explainability of the ML-based profiling model deployed at the Flemish PES (VDAB), which predicts clients’ likelihood of securing sustainable employment. We compare two explainability techniques: (1) TreeSHAP, a state-of-the-art method grounded in the theoretical properties of Shapley values, and (2) TreeInterpreter, a computationally cheaper approximation that forgoes some of these properties. Based on multiple evaluation metrics, our findings suggest that for tree-based models, approximations to the SHAP (SHapley Additive exPlanations) values yield very similar insights and maintain explanatory performance while minimizing computational overhead. This enables institutions with large client bases to generate real-time explanations without having to sacrifice model accuracy. Additionally, our analysis identifies key predictors of job seekers’ employment prospects, offering valuable insights for PES and related agencies striving to improve their support for job seekers in need. Clients’ online behavior, which acts as a proxy for hard-to-measure job search intensity and motivation, emerges as a key component of the profiling model and presents promising opportunities for future profiling efforts.
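To make the contrast between the two techniques concrete, the following is a minimal sketch on a hand-built toy tree, not the VDAB model or the authors' code. It assumes that the TreeInterpreter-style attribution is the path-based scheme described by Saabas, and it computes exact Shapley values by brute force over feature subsets, using the cover-weighted expectation that path-dependent TreeSHAP targets (TreeSHAP itself reaches the same values in polynomial time). The tree, its node statistics, and the client vector are hypothetical.

```python
from itertools import combinations
from math import factorial

# Hypothetical toy tree: "value" is the mean training outcome at a node,
# "n" is the number of training rows that reached it.
LEAF_A = {"value": 0.2, "n": 30}
LEAF_B = {"value": 0.6, "n": 30}
LEAF_C = {"value": 0.9, "n": 40}
INNER = {"feature": 1, "threshold": 0.5, "left": LEAF_A, "right": LEAF_B,
         "value": 0.4, "n": 60}
ROOT = {"feature": 0, "threshold": 0.5, "left": INNER, "right": LEAF_C,
        "value": 0.6, "n": 100}
N_FEATURES = 2


def child(node, x):
    """Return the child that instance x falls into at an internal node."""
    return node["left"] if x[node["feature"]] <= node["threshold"] else node["right"]


def path_attributions(root, x):
    """Path-based attribution: credit each change in node mean to the split feature."""
    contrib = [0.0] * N_FEATURES
    node = root
    while "feature" in node:
        nxt = child(node, x)
        contrib[node["feature"]] += nxt["value"] - node["value"]
        node = nxt
    return contrib


def expected_value(node, x, known):
    """Expected tree output when only the features in `known` are observed."""
    if "feature" not in node:
        return node["value"]
    if node["feature"] in known:
        return expected_value(child(node, x), x, known)
    left, right = node["left"], node["right"]
    # Unknown feature: average both branches, weighted by training cover.
    return (left["n"] * expected_value(left, x, known)
            + right["n"] * expected_value(right, x, known)) / node["n"]


def exact_shapley(root, x):
    """Brute-force Shapley values (exponential in the number of features)."""
    phi = [0.0] * N_FEATURES
    for i in range(N_FEATURES):
        others = [j for j in range(N_FEATURES) if j != i]
        for size in range(len(others) + 1):
            weight = (factorial(size) * factorial(N_FEATURES - size - 1)
                      / factorial(N_FEATURES))
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (expected_value(root, x, s | {i})
                                    - expected_value(root, x, s))
    return phi


client = [0.0, 1.0]  # hypothetical feature vector
saabas = path_attributions(ROOT, client)
shapley = exact_shapley(ROOT, client)
prediction = expected_value(ROOT, client, {0, 1})
print("path-based:", saabas)
print("Shapley:   ", shapley)
```

In this toy case both attribution vectors sum to the prediction minus the root mean, but they split the credit differently between the two features. That is one way to picture a path-based approximation that keeps additivity while giving up other Shapley properties.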

Data Availability Statement

The data that support the findings of this study are available from Vlaamse Dienst voor Arbeidsbemiddeling en Beroepsopleiding (VDAB), but restrictions apply to the availability of these data, which were used under license for the current study and so are not publicly available. The data are, however, available from the authors upon reasonable request and with the permission of Vlaamse Dienst voor Arbeidsbemiddeling en Beroepsopleiding (VDAB).

Notes

  1. https://github.com/slundberg/shap.

  2. https://github.com/Trusted-AI/AIX360.

Acknowledgements

We would like to express our gratitude to Dr. Karolien Scheerlinck, Stijn Van De Velde, Joris Van Den Bossche, and Dieter Verbeemen of the VDAB AI Team for their assistance and feedback provided throughout this research project.

Author information

Corresponding author

Correspondence to Wouter Dossche.

Ethics declarations

Conflict of interest

This research is supported by the Career Management Analytics research chair, sponsored by the Flemish PES (VDAB: Vlaamse Dienst voor Arbeidsbemiddeling en Beroepsopleiding).

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

See Fig. 5.

Fig. 5

SHAP decision plot for a local explanation of a client with a predicted employment likelihood of 8.9%. The plot shows the 20 most important drivers behind the model’s decision, with the highest-impact features plotted at the top. The decision line starts at the base rate (the mean of the outcome variable over all training observations, shown by the vertical line) and incrementally adds the attribution values of all features until the final prediction is reached (colored bar above). The distance between the vertical line and the start of the blue decision line at the bottom of the plot equals the sum of the attribution values of the features left out of the plot. The client’s feature values are shown in brackets (e.g., the most influential feature for this client was the average unemployment duration in previous unemployment episodes, 1112 days).
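The arithmetic behind such a decision line can be sketched as follows. This is an illustrative reconstruction, not the plotting code used for the figure. The base rate, feature names, and attribution values are hypothetical and were chosen only so that the path ends at the 8.9% mentioned in the caption.

```python
from itertools import accumulate


def decision_path(base_value, attributions, top_k):
    """Return the start offset, the plotted feature names (bottom to top),
    and the x-coordinates of the decision line."""
    # Rank features from least to most influential by absolute attribution.
    ranked = sorted(attributions.items(), key=lambda kv: abs(kv[1]))
    n_hidden = max(len(ranked) - top_k, 0)
    hidden, shown = ranked[:n_hidden], ranked[n_hidden:]
    # Features left out of the plot shift the starting point of the line.
    start = base_value + sum(v for _, v in hidden)
    # Each plotted feature then moves the line by its attribution value.
    xs = list(accumulate((v for _, v in shown), initial=start))
    return start, [name for name, _ in shown], xs


# Hypothetical values for illustration only.
base_rate = 0.40
attributions = {
    "avg_prev_unemployment_days": -0.20,
    "age": -0.08,
    "online_clicks": -0.05,
    "language_level": 0.02,
    "region": -0.001,
}

start, names, xs = decision_path(base_rate, attributions, top_k=3)
print("line starts at:", start)
print("plotted features (bottom to top):", names)
print("final prediction:", xs[-1])
```

Because the attributions are additive, the last coordinate equals the base rate plus the sum of all attribution values, regardless of how many features are shown.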

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Dossche, W., Vansteenkiste, S., Baesens, B. et al. Interpretable and Accurate Identification of Job Seekers at Risk of Long-Term Unemployment: Explainable ML-Based Profiling. SN COMPUT. SCI. 5, 536 (2024). https://doi.org/10.1007/s42979-024-02884-4

  • DOI: https://doi.org/10.1007/s42979-024-02884-4
