Interpretable and Accurate Identification of Job Seekers at Risk of Long-Term Unemployment: Explainable ML-Based Profiling

Dossche, Wouter; Vansteenkiste, Sarah; Baesens, Bart; Lemahieu, Wilfried

doi:10.1007/s42979-024-02884-4

Original Research
Published: 10 May 2024

Volume 5, article number 536 (2024)

SN Computer Science

Abstract

To tackle the adverse societal and personal consequences of long-term unemployment, many public employment services (PES) have implemented data-driven profiling systems to promptly identify vulnerable job seekers. More recently, PES have increasingly relied on more complex machine learning (ML) models because of their higher accuracy. However, concerns are growing about algorithmic opacity, which hinders comprehension of and trust in the predictions. The current study focuses on the explainability of the ML-based profiling model deployed at the Flemish PES (VDAB), which aims to predict clients’ likelihood of securing sustainable employment. We compare two explainability techniques: (1) TreeSHAP, a state-of-the-art method grounded in the theoretical properties of Shapley values, and (2) TreeInterpreter, a computationally feasible approximation that forgoes some of these properties. Using multiple evaluation metrics, our findings suggest that, for tree-based models, approximations to the SHAP (SHapley Additive exPlanations) values yield very similar insights and maintain explanatory performance while minimizing computational overhead. This enables institutions with large client bases to generate real-time explanations without having to sacrifice model accuracy. Additionally, our analysis identifies key predictors of job seekers’ employment prospects, offering valuable insights for PES and related agencies striving to improve their support for job seekers in need. Clients’ online behavior, which acts as a proxy for hard-to-measure job search intensity and motivation, emerges as a key component of the profiling model and presents promising opportunities for future profiling efforts.
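To make the contrast between the two techniques concrete, the following is a minimal sketch on a hand-built toy tree. It is not the VDAB model or the authors' code: the tree, its leaf values, the cover counts, and the two features (imagined here as months unemployed and weekly site clicks) are all hypothetical. The path-based function follows the idea behind TreeInterpreter (credit each split on the decision path with the change in node mean), while the Shapley function enumerates all feature subsets by brute force with a cover-weighted value function, which is feasible only for tiny examples and stands in for what TreeSHAP computes efficiently.

```python
from itertools import combinations
from math import factorial

# Hypothetical toy tree (NOT the VDAB model). Feature 0 and feature 1 are
# made-up stand-ins, e.g. months unemployed and weekly site clicks.
def leaf(value, cover):
    return {"value": value, "cover": cover}

def split(feature, threshold, left, right):
    return {"feature": feature, "threshold": threshold,
            "left": left, "right": right,
            "cover": left["cover"] + right["cover"]}

TREE = split(0, 12,
             split(1, 5, leaf(0.5, 20), leaf(0.9, 40)),
             split(1, 5, leaf(0.1, 30), leaf(0.5, 10)))

def expected_value(node, x, known):
    """Mean prediction when only the features in `known` are observed."""
    if "value" in node:
        return node["value"]
    left, right = node["left"], node["right"]
    if node["feature"] in known:
        child = left if x[node["feature"]] <= node["threshold"] else right
        return expected_value(child, x, known)
    # Unobserved feature: average both branches, weighted by training cover.
    total = left["cover"] + right["cover"]
    return (left["cover"] * expected_value(left, x, known)
            + right["cover"] * expected_value(right, x, known)) / total

def path_contributions(tree, x, n_features):
    """Path-based attribution: credit each split on the decision path."""
    contrib = [0.0] * n_features
    node = tree
    while "value" not in node:
        feature = node["feature"]
        child = node["left"] if x[feature] <= node["threshold"] else node["right"]
        contrib[feature] += (expected_value(child, x, set())
                             - expected_value(node, x, set()))
        node = child
    return contrib

def shapley_values(tree, x, n_features):
    """Exact Shapley values by enumerating every feature subset."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (expected_value(tree, x, s | {i})
                                    - expected_value(tree, x, s))
    return phi

x = (6, 8)  # hypothetical client
base = expected_value(TREE, x, set())
prediction = expected_value(TREE, x, {0, 1})
saabas = path_contributions(TREE, x, 2)
shap = shapley_values(TREE, x, 2)
print(base, prediction, saabas, shap)
```

In this toy case both attributions sum to the gap between the prediction and the average prediction, but they divide that gap differently between the two features: the path-based method depends on the order in which the tree happens to split, whereas the Shapley values average over all orderings.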

Data Availability Statement

The data that support the findings of this study are available from Vlaamse Dienst voor Arbeidsbemiddeling en Beroepsopleiding (VDAB), but restrictions apply to the availability of these data, which were used under license for the current study and so are not publicly available. The data are, however, available from the authors upon reasonable request and with the permission of Vlaamse Dienst voor Arbeidsbemiddeling en Beroepsopleiding (VDAB).

Acknowledgements

We would like to express our gratitude to Dr. Karolien Scheerlinck, Stijn Van De Velde, Joris Van Den Bossche, and Dieter Verbeemen of the VDAB AI Team for their assistance and feedback provided throughout this research project.

Author information

Authors and Affiliations

Department of Work and Organisation Studies (WOS), KU Leuven, Campus Leuven, Naamsestraat 69, 3000, Leuven, Belgium
Wouter Dossche & Sarah Vansteenkiste
Research Center for Management Informatics (LIRIS), KU Leuven, Naamsestraat 69, 3000, Leuven, Belgium
Bart Baesens & Wilfried Lemahieu
Department of Decision Analytics and Risk, University of Southampton, Southampton, UK
Bart Baesens

Authors

Wouter Dossche
Sarah Vansteenkiste
Bart Baesens
Wilfried Lemahieu

Corresponding author

Correspondence to Wouter Dossche.

Ethics declarations

Conflict of interest

This research is supported by the Career Management Analytics research chair, sponsored by the Flemish PES (VDAB: Vlaamse Dienst voor Arbeidsbemiddeling en Beroepsopleiding).

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

See Fig. 5.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Dossche, W., Vansteenkiste, S., Baesens, B. et al. Interpretable and Accurate Identification of Job Seekers at Risk of Long-Term Unemployment: Explainable ML-Based Profiling. SN COMPUT. SCI. 5, 536 (2024). https://doi.org/10.1007/s42979-024-02884-4

Received: 27 September 2023
Accepted: 07 April 2024
Published: 10 May 2024
DOI: https://doi.org/10.1007/s42979-024-02884-4
