Predicting corporate credit rating based on qualitative information of MD&A transformed using document vectorization techniques

Jinwook Choi (Korea University Business School, Seoul, Republic of Korea)
Yongmoo Suh (Korea University Business School, Seoul, Republic of Korea)
Namchul Jung (School of Business Administration, Hongik University, Seoul, Republic of Korea)

Data Technologies and Applications

ISSN: 2514-9288

Article publication date: 13 March 2020

Issue publication date: 2 June 2020

Abstract

Purpose

The purpose of this study is to investigate the effectiveness of qualitative information extracted from firms’ annual reports in predicting corporate credit ratings. In practice, qualitative information, such as published reports or management interviews, is known to be an important input to credit rating assignment alongside quantitative information such as financial values. Nevertheless, prior studies have rarely employed qualitative information when developing prediction models for corporate credit ratings, which leaves room for further research.

Design/methodology/approach

This study adopted three document vectorization methods, Bag-of-Words (BOW), Word to Vector (Word2Vec) and Document to Vector (Doc2Vec), to transform unstructured textual data into numeric vectors that Machine Learning (ML) algorithms can accept as input. For the experiments, we used a corpus of Management’s Discussion and Analysis (MD&A) sections from 10-K financial reports, together with financial variables and corporate credit rating data.
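As an illustration of the simplest of the three methods, the sketch below builds Bag-of-Words count vectors in plain Python. It is a minimal example under stated assumptions, not the authors' pipeline: the tokenizer, the function names (`tokenize`, `build_vocabulary`, `bow_vector`) and the two sample sentences are hypothetical and do not come from the study's MD&A corpus.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(documents):
    """Assign each distinct token in the corpus a fixed column index (alphabetical order)."""
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: idx for idx, tok in enumerate(tokens)}


def bow_vector(text, vocabulary):
    """Return the term-count vector of one document; tokens outside the vocabulary are ignored."""
    vector = [0] * len(vocabulary)
    for tok, count in Counter(tokenize(text)).items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = count
    return vector


# Hypothetical MD&A-style snippets, invented for illustration only.
docs = [
    "Revenue increased while debt decreased.",
    "Debt increased and liquidity risk increased.",
]

vocabulary = build_vocabulary(docs)
vectors = [bow_vector(doc, vocabulary) for doc in docs]

for doc, vec in zip(docs, vectors):
    print(vec, "<-", doc)
```

Each document becomes a fixed-length numeric vector with one column per vocabulary term, which is the form an ML classifier can take as input. Word2Vec and Doc2Vec replace these sparse counts with learned dense vectors and are not shown here.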

Findings

Results from a series of multi-class classification experiments show that predictive models trained on both financial variables and vectors extracted from MD&A data outperform benchmark models trained only on traditional financial variables.

Originality/value

This study proposed a new approach to corporate credit rating prediction that uses qualitative information extracted from MD&A documents as an input to ML-based prediction models. It also adopted and compared three textual vectorization methods in the domain of corporate credit rating prediction and showed that BOW mostly outperformed Word2Vec and Doc2Vec.

Acknowledgements

This research was supported by the Korea University Business School Research Grant.

Citation

Choi, J., Suh, Y. and Jung, N. (2020), "Predicting corporate credit rating based on qualitative information of MD&A transformed using document vectorization techniques", Data Technologies and Applications, Vol. 54 No. 2, pp. 151-168. https://doi.org/10.1108/DTA-08-2019-0127

Publisher

Emerald Publishing Limited

Copyright © 2020, Emerald Publishing Limited
