The Interaction of Borrower and Loan Characteristics in Predicting Risks of Subprime Automobile Loans

We use data from a very large UK automobile loan firm to study how the characteristics of borrowers and loans interact in predicting subsequent loan performance. Our broader findings confirm earlier research on subprime auto loans. More importantly, unmarried borrowers living under furnished tenancy agreements who have relatively new jobs have a default probability of more than 60%, compared with an average default rate of 7% among all subprime borrowers in the dataset. This category also includes those who live in a less prosperous part of the UK such as the north-west, are full-time self-employed, have other large loan arrears, fall into the bottom 25% of monthly income, secure loans with a high loan to value (LTV) ratio, purchase expensive automobiles on shorter loan duration payment plans, and depend heavily on government support. The same holds for borrowers who go into arrears, except that the highest probability in this context is around 40%, compared with 6% for the overall sample. These findings should improve the understanding of subprime auto loan performance in relation to borrower and loan features and help auto finance firms improve their predictive models and decision-making.


Introduction
The subprime market is mostly made up of subprime mortgage loans, as well as some specialist lenders including payday lenders. The market has seen significant growth, due to an increase in the availability of lending to those who would normally be denied credit (Tashman 2007; Chomsisengphet and Anthony Pennington-Cross 2006). Since its peaks in 2000 and 2007, however, the market has been going through a period of low growth in line with the economy. The subprime mortgage crisis started in 2007 due to increased default rates, early terminations, and deterioration of lending standards (Demyanyk and Hemert 2009). Mayer et al. (2009) found that the delinquency rates in subprime mortgages increased from 5.6% in 2005 to over 23% in 2008 during the subprime mortgage crisis. The boom was linked to a deterioration in underwriting standards and a reduction in the performance of these loans. Ioannidou et al. (2009) believed the world financial crisis may have been caused by an increased default rate on subprime mortgage loans, which led to a virtual shutdown of interbank credit markets and a series of dramatic interventions by all major credit banks. The general perception is that the recent subprime mortgage crisis was caused by reduced lending standards, which permitted companies to lend to borrowers who were not creditworthy (referred to as subprime borrowers) and were unable to repay the loan.
Despite its obvious pitfalls, the subprime market has some major benefits linked to the mortgage market, in particular increasing the number of homeowners and, in turn, the wealth that this creates (Chomsisengphet and Anthony Pennington-Cross 2006). Subprime auto loans could also help workers keep jobs that require reliable transport in the absence of efficient and cost-effective public transport. Interestingly, some authors believe that despite the crash of the subprime market, new lending strategies such as risk-based pricing will increase the efficiency of the market (Tashman 2007). Lenders of subprime loans charge a premium through higher interest rates and fees to counterweigh the higher default risk. Hence, the high interest rates offset high default rates (Adams et al. 2007; Zywicki and Adamson 2008). Some authors believe that the main aim of a lender is to maximize profits; however, this is not necessarily related to risk. Roszbach (2004) argued that a higher-risk loan can be profitable even if the lender is certain that it will default. Having said that, researchers, on average, believe that subprime borrowers are more risky because of their impaired or limited credit history, and that they pose a potential risk to the whole financial system because of its interconnectivity.
Considering the huge impact of the 2007 financial crisis that originated in the subprime mortgage market, there is growing interest in identifying the characteristics of loans and borrowers that explain the defaults and missed payments on these subprime loans. Some recent studies have explored the impact of subprime borrowers' demographic and loan characteristics on the performance of loans. Studies by Lax et al. (2004), Danis and Anthony Pennington-Cross (2008), Amromin and Paulson (2009), Demyanyk and Hemert (2009), Daglish (2009), and Ghulam and Hill (2017) fall into this category. Except for the last study, however, the authors have mainly ignored subprime consumer finance and car loans. Although the last study dealt directly with subprime auto loan defaults and missed payments, it did not analyze the interactions of loan and borrower demographic characteristics.
This study fills an important gap in this regard. Similar to Ghulam and Hill (2017), it first looks at which characteristics affect a borrower's ability to repay a subprime loan and the risks involved with these loans. It then specifically explores how loan characteristics and the personal characteristics of the borrower, when considered simultaneously, increase the risk of default and missed payments or, in some cases, reduce it. More specifically, the study examines the interactions of borrower demographics, characteristics of the loan and securitized assets, and the role of the subprime borrower's other loans. These interactions were excluded in Ghulam and Hill (2017), and this paper therefore seeks to give a broader and more conclusive picture of which borrowers are more likely to be in arrears or to default completely. Alongside contributing to the existing literature on subprime lending, this study provides valuable input to subprime lenders for profiling borrowers as high, medium, or low risk.
Broadly confirming Ghulam and Hill (2017) and the wider empirical literature, we conclude that being married, living in tenancy accommodation, engaging in full-time self-employment, and the duration of current employment have a significant impact on subprime loan performance (defaults, current and past arrears, early payoffs, and write-offs). Some very important loan features such as LTV, APR, and term are also significant determinants of the above-mentioned destinations, and the same could be said of the age and price of the automobile. More importantly, our extension of the earlier study confirms that the income of the borrower and the price of the car are essential predictors of defaults and of the chance of going into arrears.
Building some interesting interactions and scenarios revealed that unmarried borrowers with furnished tenancy arrangements and relatively new jobs, living in the UK's less prosperous North West, have a 60% default probability compared to the average 7% default rate of all subprime borrowers in our sample. This group also included those who were self-employed and had other loan arrears worth 4000 Pounds or more, as well as those who fell into the bottom 25% of monthly income, secured loans with high LTV, purchased expensive automobiles with a shorter loan duration plan, and had a high dependency on government support. The same could be said of going into arrears, except that the highest probability in this context is around 40% compared to 6% for the overall sample.
Furthermore, we identified the least risky borrowers within the pool of perceived risky borrowers in relation to defaults and arrears. These appear to be the ones who are married, homeowners, full-time employed, and based in the South of England, with zero arrears on other loans. These borrowers also happen to be high income earners (falling under the top 10% category), with bottom 10% LTV and car price, longer duration loans, job stability, and less dependency on state financial benefits to supplement their income.
The rest of the paper proceeds as follows. Section 2 briefly explains the loans and demographic features that could have impact on payments and defaults. The methodology section explains the method chosen and evaluates the data structure. The results section looks at the findings in detail and compares these results with the literature. The conclusion section summarizes the findings and discusses limitations, as well as identifying areas for further research.

A Survey of Characteristics Related to Subprime Loans and Borrowers That Are Linked to Defaults and Missed Payments
An extensive review of the literature revealed the factors which could potentially affect the performance of a loan, including a subprime loan. These include variables pertaining to borrower demographics, loan characteristics, securitized assets, and other loans apart from the loan being analyzed.

Demographic Variables Affecting Loan Performance
On average, subprime borrowers tend to be more risky, as there is more chance of default and nonrepayment. The income of the borrower is considered one of the important predictors of default. The general assumption is that a borrower with a lower income has a higher probability of default. Backing up this assumption, Dinh and Kleimeier (2007) found that individuals with a lower income defaulted more. Equally, borrowers with a larger income and a bank account are more likely to make payments (Einav et al. 2012). Higher income and longer employment time lead to a lower annual percentage rate (APR): higher income indicates more resources available to repay, while the length of employment shows stability and is therefore seen as producing less risky loans (Agarwal et al. 2008). Education can also be used to predict the probability of default. It was found that individuals with a higher level of education have a lower chance of default (Kočenda and Vojtek 2009). Dinh and Kleimeier (2007) supported the view that a better education would reduce the risk of default, as the individual is more likely to have stable employment and the potential to earn a higher income.
Longer current employment is seen as less risky and as a sign of more stability for a borrower. Self-employment and entrepreneurial employment are seen as more risky, with a greater chance of default due to variable income and more liquidity constraints (Capozza and Thomson 2005). Marshall et al. (2010) found that borrowers who do not have a monthly salary have a higher probability of default. Zywicki and Adamson (2008) agree, and state that lenders tend to charge higher interest rates to self-employed borrowers due to their less predictable income. A homeowner is expected to be less risky and have a lower chance of default. Marshall et al. (2010) found that those who are not homeowners are more likely to default on their loans. Similarly, Agarwal et al. (2011) found that homeowners are 17% less likely to default and 25% less likely to file for bankruptcy. The length of time the borrower has lived at the current address can also affect risk. The findings of Marshall et al. (2010) suggest that a longer period at the current address reduces the likelihood of defaulting. Regarding the residential status of subprime borrowers, Dinh and Kleimeier (2007) found that borrowers who were dependents and stayed with their parents had a lower probability of default due to fewer outgoings.
Married couples are less likely to default as they may have a dual income and are therefore less risky. Kočenda and Vojtek (2009) agreed with this assumption. On the flip side, Dinh and Kleimeier (2007) found that those who are married are more likely to default than single borrowers due to higher household expenditures. They conclude that this could be due to dependents, which create more financial pressure. In a similar study, Marshall et al. (2010) found that those who have more children are more likely to default on their loans. Agarwal et al. (2011) found that defaults and bankruptcies rise and fall over an individual's life. Their study found that the youngest (30 years and under) and the oldest (60 years and older) individuals have the lowest bankruptcy risk. The middle-aged groups (30-60 years) are more likely to default and have a higher rate of bankruptcy. In another study, Agarwal et al. (2008) had contrasting results, finding that younger and older adults were charged a higher rate of interest, with the least risky age being the early 50s. Yap et al. (2011) agree, finding that borrowers who are 53 years of age and older have the lowest risk of default.

Loan and Securitized Assets Characteristics and Macroeconomic Conditions Affecting Loan Performance
Subsequent to demographic characteristics of the borrowers, the next set of information and related variables are comprised of loan characteristics. Borrowers are more likely to default on larger loans, and therefore lenders need to be more aware of the amount being lent to avoid over-borrowing and increased default rates (Adams et al. 2007). Also, a larger down payment or deposit will reduce the default risk, as this in turn decreases the LTV ratio. LTV is often used when underwriting loans to predict the level of risk. Bar-Gill (2008) agrees by finding that a small down payment is linked with increased delinquency in subprime loans. Marshall et al. (2010) conclude that purchasing behavior is a good indicator of default risk, finding that borrowers who purchase less expensive items are able to make a bigger down payment and are more likely to make repayments, therefore decreasing the risk of default. In terms of the length of the loan, borrowers are more likely to default early into the loan, which is in line with the market assumption that default rates are higher in the first 12-24 months of the loan (Malik and Thomas 2010).
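The link between the deposit and the LTV ratio mentioned above can be illustrated with a short calculation. This is only a sketch under stated assumptions: the function name and the example figures are hypothetical, and the car is assumed to be valued at its purchase price, which the source does not specify.

```python
def loan_to_value(car_price, deposit, car_value=None):
    """Return LTV as amount financed divided by the value of the securing asset."""
    if car_value is None:
        # Assumption: the securing asset is valued at its purchase price.
        car_value = car_price
    amount_financed = car_price - deposit
    return amount_financed / car_value


# Hypothetical figures, not taken from the dataset.
ltv_no_deposit = loan_to_value(5000.0, 0.0)
ltv_with_deposit = loan_to_value(5000.0, 1000.0)
```

With these hypothetical numbers, a larger deposit leaves less to finance against the same asset value, so the LTV ratio falls.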
Prevailing economic conditions (regional and national) are also important determinants of subprime loan defaults and missed payments, similar to prime loans. It was found that subprime borrowers tend to have a lower income and fewer savings and are therefore less likely to be able to repay a loan when a trigger event such as unemployment happens (Capozza and Thomson 2005). Zywicki and Adamson (2008) added to this, as they believe subprime borrowers are more likely to lose their jobs during recessions and uncertain economic environments, and are likely to want to repay the loan but be unable to do so because they have fewer savings. Rajan et al. (2015) found that an economic decline, in particular a fall in house prices, can cause an increase in delinquency. Bonfim (2009) also agrees that economic decline increases the rate of default, although this may be due to the loosening of lending standards when the economy is strong. Another reason for default on subprime loans could be that the late payment penalties can be more attractive than the rates of other personal loans such as those from payday lenders, so that borrowers choose to default to meet short-term liquidity needs (Zywicki and Adamson 2008; Capozza and Thomson 2006) during a volatile economic environment.

Data and Modeling Approach
This section reviews the relevant approach and methods used to determine the drivers of the risk of outright default and/or missed payments on subprime auto loans. We chose to use the logistic regression method (initially a simple binary logit and then the multinomial version), which is often used to evaluate risk and default probabilities. Logistic regression looks at a borrower's loan performance over a specific time period and characterizes it in relation to their regular repayments. Regarding the methodological framework for predicting defaults and credit risk, Halteh et al. (2018) developed cutting-edge tree-based stochastic models to model credit risk. Addo et al. (2018) built binary classifiers based on machine and deep learning models on real data to predict loan default probability. The performance of classification was tested on separate data. The authors concluded that tree-based models performed much better than models based on multilayer artificial neural networks. The authors argued that these results opened a serious debate about predicting credit risk with deep learning systems in enterprises.
The data used is comprised of the borrower's individual characteristics in relation to their payment history in order to determine any relation between the characteristics and payments (principal and interest). The data was obtained from a large UK vehicle financing company supplying secured loans to individuals in the UK with impaired credit history, considered as 'subprime', as discussed above. The data is comprised of subprime loan agreements signed from October 2012 to October 2013. Interestingly, and as expected, 7% of these borrowers defaulted within thirteen months of signing the contract. The data contains a broad range of information such as characteristics of the borrowers and loans, vehicle specification the loan is secured to, and their repayment history over the thirteen months as specified previously. The sample covers the whole of the UK borrowers from different regions including south, east, west, midland, and north regions. The company anonymized the data prior to releasing it to us for our research to assure confidentiality.
As discussed above, the variables used for the logistic regression contain a large variety of the borrowers' loan and vehicle information, with a sample size of 10,670. In order to make the interpretation more straightforward and for ease of estimation, continuous variables such as income, deposit, APR, etc. were log transformed. For categorical variables, dummies were created for each category. A standard logistic regression function relates the set of potential predictor variables to the probability of an outcome, such as Y = 0 for not defaulting or not being in arrears and Y = 1 for default or arrears. The simple logistic regression model can be written as follows:

$$\ln\left(\frac{P(Y = 1)}{1 - P(Y = 1)}\right) = \beta_0 + \beta_1 X_1 + \dots + \beta_k X_k \qquad (1)$$

where P(Y = 1) is the outcome of interest, X_1, ..., X_k are the explanatory variables, and \beta_0, ..., \beta_k are the coefficients to be estimated. This equation is solved to obtain:

$$P(Y = 1) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \dots + \beta_k X_k)}} \qquad (2)$$

Equation (2) is used to derive the conditional probability that a borrower falls into the category of defaulter given the different explanatory variables. Subsequently, multinomial logistic regression is utilized to classify loan performance into six categories: 1 = live deal not in arrears, 2 = live deal in arrears, 3 = live deal previously in arrears (looking at previous 10 months), 4 = car repossessed or surrendered, 5 = loan settled early (by customer, car dealer, or insurance company), and 6 = bad debt written off, which allows for more specific and useful information. The first category was dropped and treated as the reference category in the interpretation of empirical estimates. The choice of multinomial logistic regression is based on the fact that this method is useful when the response variable has more than two values and there is no ordering of the categories (Chatterjee and Hadi 2006). The regression allows the modeling of the five remaining categories to determine which characteristics (related to borrowers, loan, and vehicle) determine the probability that a borrower will end up in a specific category.
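The two models described above can be sketched in a few lines of Python. This is a minimal illustration rather than the estimation code used in the study: the variable names, the coefficients, and the intercepts are hypothetical, and only the functional forms (binary logit, and multinomial logit with category 1 as the reference) follow the text.

```python
import math


def logit_probability(intercept, coefs, x):
    """P(Y = 1 | x) for a binary logit: 1 / (1 + exp(-(b0 + sum of b_k * x_k)))."""
    z = intercept + sum(coefs[name] * x[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-z))


def multinomial_probabilities(intercepts, coefs_by_category, x, reference=1):
    """Category probabilities for a multinomial logit.

    The reference category has its linear predictor fixed at 0, so every other
    category is modeled relative to it.
    """
    scores = {reference: 0.0}
    for category, coefs in coefs_by_category.items():
        scores[category] = intercepts[category] + sum(
            coefs[name] * x[name] for name in coefs
        )
    denominator = sum(math.exp(score) for score in scores.values())
    return {category: math.exp(score) / denominator for category, score in scores.items()}


# Hypothetical borrower: continuous variables are log transformed and
# categorical variables enter as 0/1 dummies, as described in the text.
borrower = {"log_income": math.log(1500.0), "married": 1, "furnished_tenant": 0}

# Hypothetical coefficients (NOT estimates from the paper).
binary_coefs = {"log_income": -0.5, "married": -0.4, "furnished_tenant": 0.6}
p_default = logit_probability(1.0, binary_coefs, borrower)

# One hypothetical coefficient set per non-reference category (2 to 6).
category_coefs = {c: {"log_income": -0.1 * c, "married": -0.2} for c in range(2, 7)}
category_intercepts = {c: 0.5 for c in range(2, 7)}
category_probs = multinomial_probabilities(category_intercepts, category_coefs, borrower)
```

Because the reference category's score is fixed at zero, the six category probabilities always sum to one, and each estimated coefficient is read relative to "live deal not in arrears".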
The variables selected for the regression model and presented in Table 1 were derived from the empirical literature and from the input of one of the authors, who worked in the finance firm from which the data was gathered. The variables were grouped into three categories, namely demographic, loan, and securitized asset characteristics. The last column of Table 1 shows the expected sign for each variable, i.e., positive indicates a higher chance of default and going into arrears and negative indicates a lower chance of default. Married borrowers and those with common-law marital status were expected to be less likely to default on the loan. Residential statuses of furnished, unfurnished, or other accommodation were assumed to be positively related to default probability.
We expected that full-time employment would lead to stability of income and a lower likelihood of default. Borrowers in a residential area with better job prospects are less likely to default compared to others. The variables related to loan characteristics and securitized assets, such as the percentage of total income made up of government benefits, LTV, length of the agreement, and price and age of the car, all have positive expected signs, which means that higher values of these variables may lead to more defaults and arrears.

Table 1 (rows recovered from the original layout; variable: description, expected sign)
lcarprice: Price of the car which the loan is secured to (log), +
lcarage: Age of the car which the loan is secured to, in months (log), +

Loan performance classification: categorical variable numbered 1-6, where 1 = live deal not in arrears; 2 = live deal in arrears; 3 = live deal previously in arrears (looking at previous 10 months); 4 = car repossessed or surrendered; 5 = loan settled early (by customer, car dealer, insurance company); 6 = bad debt written off
default: default status of the agreement constructed from the loan classification categories; default = 1 for groups 2, 4, 6 and nondefault = 0 for groups 1, 3, 5

Table 2 indicates the distribution of the ten continuous variables mentioned in Table 1. These are the length of the current residence, employment period in the current job, total income of the borrower, percentage of total income from government benefits, deposit, loan to value, APR, length of agreement, and price and age of the car. The estimates in Table 2 show that 10% of the borrowers had lived in their current residence for only 7 months, a further 10% for 13 years, and only 1% for 29 years and above. The median length of current residence is 4.58 years. This indicates that most borrowers, because of the short time spent in their current jobs, may need to move home frequently. The majority of these borrowers had been in their current employment for a short time: 50% of them had worked for only 4 years in their current job. Although more than 50% of the borrowers did not receive any income from state benefits, 25% of the borrowers had a monthly income of £2253, of which more than one third came from state benefits. Only 1% of borrowers paid the highest deposit, i.e., £2300, for the car, while the majority of customers (roughly 50%) paid either no deposit or a negligible amount. Higher LTV is an indication of a larger loan amount and, not surprisingly, a significant portion of customers opted for very high LTV, which led to higher monthly APR, i.e., 6%, 4.03%, and 3.57%, respectively.
Almost 10% of the borrowers purchased an approximately 10 year old car at a price of £2995, while 1% of the borrowers purchased an almost 15.5 year old car at £7620. Broadly speaking, the length of agreements appears to be between 2.5 and 3 years. Overall, the distribution of these variables shows a lot of variation, which is examined further in the empirical regression analysis.

Table 3 is constructed to check the consistency of the assumptions about expected signs by identifying factors which differentiate defaulters from nondefaulters. The nonparametric tests performed are single-variable tests and do not control for other factors at the same time. Based on the descriptive statistics in Table 3 and nonparametric tests of equality of means (Mann-Whitney), we found that the differences between the defaulter and nondefaulter groups are statistically significant for the following variables.

Demographic, residential, and employment characteristics: The Z statistics for borrowers who were single, who lived in furnished accommodation in the north-west, and who were part-time self-employed are statistically significant, and borrowers with these characteristics are therefore more likely to default or go into arrears. On the other hand, borrowers who are married or divorced, are homeowners, and live in the south-east, east, or west are less likely to default or go into arrears. Similarly, more time spent in current employment reduces the chance of default and arrears. An increase in the total income of the borrower produced fewer defaults, but an increasing proportion of unearned income such as state benefits would increase the possibility of defaulting and going into arrears. For other demographic factors, the Z statistics and related probability values show that the differences in the means of defaulters and nondefaulters are not statistically significant.
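The single-variable comparison of defaulters and nondefaulters can be sketched as follows. This is a simplified illustration, not the authors' procedure: the records are hypothetical, the default flag follows the grouping stated in the variable list (default = 1 for groups 2, 4, 6), and the Z score uses the plain normal approximation without a tie correction.

```python
import math

DEFAULT_CATEGORIES = {2, 4, 6}  # default = 1 for groups 2, 4, 6; 0 for groups 1, 3, 5


def is_default(category):
    """Map a loan performance category (1-6) to the binary default flag."""
    return 1 if category in DEFAULT_CATEGORIES else 0


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def mann_whitney_z(group_a, group_b):
    """Return (U for group_a, Z score) using the normal approximation, no tie correction."""
    n1, n2 = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    u_a = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return u_a, (u_a - mean_u) / sd_u


# Hypothetical records: (loan performance category, months in current job).
records = [(1, 60), (2, 5), (3, 48), (4, 8), (5, 72), (6, 3), (1, 36), (2, 12)]
defaulters = [months for category, months in records if is_default(category) == 1]
nondefaulters = [months for category, months in records if is_default(category) == 0]
u_stat, z_stat = mann_whitney_z(defaulters, nondefaulters)
```

A negative Z score here means the defaulter group tends to have lower values of the variable, which is the direction the text reports for time in current employment. Like the tests in Table 3, this compares one variable at a time and does not control for the others.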
Loan and securitized asset characteristics: The statistics contained in Table 3 indicate that a higher deposit paid by the customer towards the car reduces the chances of defaults and arrears. The same could be said of the monthly annual percentage rate. Higher LTV increases the chances of defaults and arrears, and the same applies to the age of the car which the loan is secured against. If the borrower has the burden of other loans apart from the amount borrowed from the finance company being analyzed, this affects the probability of default. The estimates in the table further indicate that when borrowers have arrears on other loans apart from this loan (in particular where the loan amount is more than £2000 and £4000), the Z statistics are statistically significant, indicating that the higher the arrears on other loans, the higher the risk of the borrower also defaulting on the current car loan.

Table 4 contains estimates of the important determinants of default, represented as odds ratios. Five different specifications of the regression model are estimated and presented to show that dropping some variables does not affect the signs and statistical significance of the regression coefficients, thereby demonstrating the stability of the base model, in particular the one which contains all variables. More specifically, these specifications compare the odds ratios of the base model with the odds ratios of the models without residential and employment duration, income, loan characteristics, and automobile characteristics. Based on the base model estimates, the odds ratios in this table for the demographic variables ms2, rs2, and a6 (a borrower who is married, living in furnished accommodation, and living in the North West region) are statistically significant. In particular, married and divorced borrowers are less likely to default. On the other hand, being a tenant (furnished, unfurnished, or council), being full-time self-employed, and residing in the North West of the UK all have a positive impact on defaults and arrears.
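Since Table 4 reports odds ratios rather than probabilities, it can help to see how an odds ratio translates into a change in default probability. The sketch below is illustrative only: the function name and the odds ratio of 2.0 are hypothetical, while the 7% baseline is the sample default rate quoted in the text.

```python
def apply_odds_ratio(baseline_probability, odds_ratio):
    """Shift a baseline probability by an odds ratio and return the new probability."""
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


baseline = 0.07  # overall sample default rate reported in the text
hypothetical_odds_ratio = 2.0  # illustrative value, NOT an estimate from Table 4
shifted = apply_odds_ratio(baseline, hypothetical_odds_ratio)
```

An odds ratio above one raises the probability and one below one lowers it, but the change in probability is not proportional to the odds ratio: doubling the odds from a 7% baseline gives a probability somewhat below 14%.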

Estimation and Explanation
As the income of the borrower increases, the odds of default decrease irrespective of the demographic characteristics. Studies by Morton (1975), Von Furstenberg and Green (1974), Chinloy (1995), Adams et al. (2007), Einav et al. (2012), Agarwal et al. (2011), and Dinh and Kleimeier (2007) all suggest that higher income borrowers are less risky, and therefore less likely to default on loans. Combining this with the demographic information, we could say that the odds of default are higher for borrowers who are unmarried, living in furnished accommodation, and staying in the North West region. These borrowers are also likely to default when their income levels are low. On the other hand, it can be inferred that a borrower who is married, does not live in furnished accommodation, and lives outside the north region is, irrespective of income level, least likely to default. This is because the North West is a relatively less well-off region, and living in furnished accommodation on a lower income, with a joint account but potentially only one source of income and hence higher expenses on dependents, leads to a higher chance of default.
The odds ratio in this table for the price of the car (lcarprice) indicates a high probability of a borrower defaulting when the price of the car increases. A higher price may lead to a larger loan being taken by the borrower. In addition, the estimates show that a higher loan to value leads to higher chances of default, whereas lower LTVs produce fewer defaults. Other studies also support the view that a borrower with a larger loan is more likely to default, and that a borrower with a larger deposit, which decreases the LTV, has a lower probability of defaulting (Adams et al. 2007). Danis and Anthony Pennington-Cross (2008), Morton (1975), Von Furstenberg and Green (1974), Chinloy (1995), Calem and Wachter (1999), Amromin and Paulson (2009), Demyanyk and Hemert (2009), and Bar-Gill (2008) also support this. It is evident that a larger loan amount may create additional pressure to make regular payments and thus could lead to defaulting. Table 4 reveals that a higher APR leads to lower odds of default. This may sound unusual, but the results indicate that this is due to the accurate pricing of the loans. Zywicki and Adamson (2008) have similar findings, which indicate that borrowers would default on the cheapest loan as it would cost them less. The estimates in the table further reveal that the longer the term of the loan, the higher the chance of default. Ghulam and Hill (2017) reached similar conclusions. In addition, we find that a borrower who owns an older car is highly likely to default; this may be attributed to the fact that as the car ages it attracts greater repair and maintenance expenses.

Interactions of Borrowers and Loans Characteristics in Predicting Defaults and Currently in Arrears
We built different scenarios, and Table 5 analyzes the impact of each scenario through interactions between the demographic variables and the loan and securitized asset variables. More specifically, subprime borrowers' demographic characteristics are interacted with other variables such as income, loan to value, term of the loan, age of the car, and price of the car. In presenting our estimates in Table 5, '1' and '0' indicate whether a condition applies or does not apply. Starting with the interactions of demographic variables with the borrower's income, the estimates presented in the table indicate that the probability of default decreases as the income of the borrower increases, irrespective of the demographic characteristics of the borrower. Interestingly, if the borrower is married, does not live in furnished accommodation with tenant status, is not full-time self-employed, and does not have other loans worth 4000 Pounds or other loans in arrears, then this group has the lowest probability of default with top 1% income earner status (only a 2% probability of default). On the other hand, in the completely opposite scenario in relation to demographic, residential, and employment status, a borrower with a top 1% car price produced the highest default probability (44%).
These two extreme scenarios indicate that, from the point of view of a lender seeking to minimize risk, income is one of the most important risk-reducing predictors, while car price is the variable associated with the highest default probability after controlling for demographic, employment, and residential status predictors. All other important predictors such as LTV, term of the loan, and age of the car appear to be relatively less strong. Interestingly, at the median level of borrower income, loan LTV, term, and car price and age, the estimated default probability appears to be in the range of 34 to 36% with extreme demographic, residential, and employment status (not married, lives in the north, full-time self-employed, and already with a large loan in arrears). The probability is reduced to only 4% at the median level of borrower income, loan LTV and term, and car price and age with the least extreme demographic, residential, and employment status (married, does not live in the north region, is not full-time self-employed, and does not already have a large loan in arrears). Furthermore, even in the extreme scenario for the non-loan and non-asset characteristics, a top 1% income, holding other factors constant, could reduce the default probability to half of that in the extreme scenario with a top 1% car price.
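The scenario construction described above, where each demographic condition is switched on or off and combined with different levels of a continuous variable, can be sketched as a grid of predicted probabilities. This is a minimal sketch under stated assumptions: the coefficients, the intercept, and the income levels are hypothetical placeholders and not the values behind Table 5.

```python
import itertools
import math

# Hypothetical coefficients for illustration only (NOT the paper's estimates).
INTERCEPT = 2.0
FLAG_COEFS = {
    "unmarried": 0.5,
    "furnished_tenant": 0.7,
    "north_west": 0.3,
    "full_time_self_employed": 0.4,
    "other_arrears_over_4000": 0.8,
}
LOG_INCOME_COEF = -0.6

# Hypothetical monthly income levels standing in for sample percentiles.
INCOME_LEVELS = {"P25": 1000.0, "P50": 1500.0, "P99": 4000.0}


def default_probability(flags, income):
    """Predicted default probability for one scenario under the logit form."""
    z = INTERCEPT + LOG_INCOME_COEF * math.log(income)
    z += sum(FLAG_COEFS[name] * value for name, value in flags.items())
    return 1.0 / (1.0 + math.exp(-z))


def build_scenarios():
    """Enumerate every 0/1 combination of the flags at each income level."""
    names = list(FLAG_COEFS)
    table = {}
    for combo in itertools.product([0, 1], repeat=len(names)):
        flags = dict(zip(names, combo))
        for label, income in INCOME_LEVELS.items():
            table[(combo, label)] = default_probability(flags, income)
    return table


scenarios = build_scenarios()
riskiest = scenarios[((1, 1, 1, 1, 1), "P25")]  # every condition applies, low income
safest = scenarios[((0, 0, 0, 0, 0), "P99")]    # no condition applies, high income
```

Each key pairs a tuple of '1'/'0' flags with a percentile label, mirroring the layout described for Table 5, so the extreme and median scenarios discussed in the text correspond to single entries of such a grid.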
Interestingly, variations in loan characteristics, approximated by the term (duration) of the loan and LTV, do not change the probability of default much compared to borrower income and securitized asset characteristics, irrespective of extreme or less extreme demographic, residential, and employment characteristics of the borrower; moving from P25 to P99 increases the probability on average by only 2 to 3%. Another important message from the table is the role of previous arrears. Every time the status changes from no previous arrears on a loan worth more than 4000 Pounds to previous arrears on such a loan, the probability of default increases by 70-100%, irrespective of income level and of the price and age of the securitized asset. Hence, previous credit history appears to be very important in this context, although this finding is to be expected given the nature of the borrowers in our sample. Among the demographic variables, being married has strong implications for default risk. For different income levels and price ranges of the automobile, a change of status from not married to married reduces the probability of default by 35% to 40%. The impact of residential status alone on the probability of default is interesting too. Furnished tenancy seems to increase the probability of default by 65 to 100% for different income levels, 55-65% for different LTV and loan term levels, and 50-70% for different automobile age and price levels. As discussed above, automobile age and price seem to affect the chance of default irrespective of the demographic, residential, and employment characteristics. The increase in both age and price from P25 to P99 appears to increase the probability of default by 60% to 80%. The similar increase in default probability for both variables is not surprising given that the age and price of the car are likely to be negatively related.
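Because the changes reported above mix small figures (2 to 3%) with large ones (70-100%), it can help to separate percentage-point changes from relative changes in probability. The sketch below assumes, as an interpretation that the text does not state explicitly, that the larger figures are relative changes; the input probabilities are hypothetical and do not come from Table 5.

```python
def percentage_point_change(p_before, p_after):
    """Absolute change in probability, expressed in percentage points."""
    return (p_after - p_before) * 100.0

def relative_change(p_before, p_after):
    """Change relative to the starting probability, expressed in percent."""
    return (p_after - p_before) / p_before * 100.0

# Hypothetical example: a default probability rising from 4% to 7%.
pp = percentage_point_change(0.04, 0.07)
rel = relative_change(0.04, 0.07)
print(f"{pp:.1f} percentage points, {rel:.1f}% relative increase")
```

In this illustration, a rise from 4% to 7% is a change of 3 percentage points but a 75% relative increase, which shows how both kinds of figure can describe the same table.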
Uncertainty in income, approximated by full-time self-employment, could increase the risk of stopped payments on loans or complete write-off of the loans. Similarly, part-time self-employment could be linked with a high hazard of going into arrears. Surprisingly, loans to unemployed borrowers have a high chance of being paid off early. This could perhaps be because these borrowers are able to secure only low-value loans, which they can then repay early. The estimates attached to the area of residence reveal some interesting results. Borrowers residing in the Midlands have a high chance of defaulting. As discussed above, the area of residence with the strongest impact on current arrears and default is the North West of the UK. The stability of employment, measured by months since current employment commenced, has a strong impact on arrears (current and past) as well as on defaults. More months in current employment reduce the relative risk of ending up at both destinations (arrears and defaults).
Borrowers with a higher income have a reduced relative risk of arrears (current and past) as well as a reduced chance of default. Interestingly, higher income does not increase the relative risk of early payoff either. As expected, higher LTV increases the relative risk of past arrears as well as defaults. The impact of loan pricing, measured by the interest rate charged (APR), on defaults and write-offs is statistically significant and negative, indicating that a higher rate reduces the relative risk of both destinations of subprime loans. Longer duration reduces the hazard of early payoffs as well as defaults, perhaps due to lower monthly instalments.
When it comes to the securitized asset, as discussed before, we use the price and age of the car. The regression coefficients, converted into relative risk ratios, indicate that a higher price increases the relative risk of arrears (current and past) and defaults. Age of the automobile has the same effect on defaults but not on current arrears. Similarly, the increase in the relative risk of write-off for this variable is statistically significant. The role of loan problems other than the current loan is addressed by three dummy variables: other loans in arrears of less than £2000, between £2000 and £4000, and greater than £4000. The first variable has a positive (risk-increasing) effect on defaults and early payoffs. The increase in relative risk is significant for the second variable on current arrears and for the third on both defaults and write-offs. Interestingly, borrowing with a joint account increases the risk of defaults, perhaps due to the higher number of dependents, as discussed before.
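The conversion from multinomial logit coefficients to relative risk ratios mentioned above is, in the standard formulation, an exponentiation of each coefficient. A minimal sketch follows; the coefficient values and variable names are hypothetical and are not taken from Table 6.

```python
import math

def relative_risk_ratio(coefficient):
    """Exponentiate a multinomial logit coefficient to get a relative risk ratio."""
    return math.exp(coefficient)

# Hypothetical coefficients for one destination (e.g. default) versus the base outcome.
hypothetical_coefs = {"car_price": 0.30, "apr": -0.20, "no_effect": 0.0}

rrr = {name: relative_risk_ratio(beta) for name, beta in hypothetical_coefs.items()}
for name, value in rrr.items():
    if value > 1:
        direction = "raises"
    elif value < 1:
        direction = "lowers"
    else:
        direction = "does not change"
    print(f"{name}: RRR = {value:.2f} ({direction} the relative risk)")
```

A ratio above 1 means that a one-unit increase in the variable raises the risk of that destination relative to the base outcome, while a ratio below 1 lowers it.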
A clear message from the above analysis is that not all variables affect the different destinations of subprime loans. Marital status has a strong effect on defaults, and residential status has a strong effect on current arrears. Employment affects different destinations differently, except defaults. The area of residence does not affect write-offs and early payoffs. Job stability and income have a strong effect on all destinations except early payoffs. Government benefits affect early payoffs, defaults, and write-offs. The impact of loan LTV on past arrears and defaults is substantial. Loan duration affects defaults and early payoffs. The price of the loan affects defaults and write-offs. The age and price of the car increase the risk of default and arrears. The risk of complete write-off also rises with the age of the car. Lastly, problems with other loans are also very important in determining whether borrowers default or fall into arrears.
Table 6. Analysis of arrears, defaults, early repayments, and write-offs (relative risk ratios).

Interactions of Borrowers and Loans/Securitized Asset Characteristics in Predicting Going into Arrears
In an effort to understand the joint impact of borrower and loan/securitized asset characteristics in predicting going into arrears without completely defaulting, we ran a logistic regression with the same explanatory variables but aggregated current and historical arrears into a dummy variable equal to 1 if the borrower is currently in arrears or was unable to pay for some time and then resumed paying installments. Table 7 presents estimated probabilities based on different interactions of the borrowers' demographic, residential, employment, and loan origination area characteristics with loan and securitized asset characteristics.
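The construction of the combined arrears indicator described above can be sketched as follows. The record layout and the field names (`current_arrears`, `past_arrears`) are hypothetical placeholders, since the structure of the underlying dataset is not described here.

```python
def arrears_dummy(currently_in_arrears, past_arrears_resumed):
    """Return 1 if the loan is in arrears now or was in arrears before payments resumed."""
    return 1 if (currently_in_arrears or past_arrears_resumed) else 0

# Hypothetical loan records for illustration only.
loans = [
    {"id": "A", "current_arrears": True, "past_arrears": False},
    {"id": "B", "current_arrears": False, "past_arrears": True},
    {"id": "C", "current_arrears": False, "past_arrears": False},
    {"id": "D", "current_arrears": True, "past_arrears": True},
]

for loan in loans:
    loan["arrears"] = arrears_dummy(loan["current_arrears"], loan["past_arrears"])

flags = [loan["arrears"] for loan in loans]
print(flags)
```

The resulting 0/1 column would then serve as the dependent variable of the logistic regression in place of the default indicator.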
More specifically, subprime borrowers' demographic characteristics were interacted with other variables such as borrower income, loan to value, price of the car, and months in current employment. As in the previous exercise, '1' and '0' in the table indicate whether a condition applies or does not apply. Starting with interactions of demographic variables with borrowers' income, the probability estimates in the table indicate that the probability of going into arrears decreases with increasing borrower income, irrespective of the demographic characteristics of the subprime borrower. Interestingly, if the borrower is married, does not live in furnished accommodation with tenant status, is not full-time self-employed, and does not have other loans worth up to 4000 Pounds or other loans in arrears, then he/she has the lowest probability of going into arrears with top 1% income earner status (only a 2% probability of going into arrears, similar to defaults). On the other hand, in the opposite scenario in relation to demographic, residential, and employment status, a borrower with a top 1% car price produced the highest probability of going into arrears (31%; for defaults, this figure is 44%).
Similar to outright defaults, these two opposite scenarios indicate that, for a lender pursuing a relatively low-risk strategy, income is one of the crucial predictors for minimizing the risk of borrowers going into arrears, while car price is the strongest arrears-producing indicator across different combinations of demographic, employment, and residential status predictors. All other important predictors, such as months in current employment, LTV, and age of the securitized asset (car), appear to be relatively weaker predictors of going into arrears. At the median level of borrower income, loan LTV, and car price and age, the estimated probability of going into arrears is in the range of 20 to 24% with extreme demographic, residential, and employment status (not married, lives in the north, full-time self-employed, and already has a large other loan in arrears). This probability is reduced to only 4% at the median level of borrower income, loan LTV and term, and car price and age with the least extreme demographic, residential, and employment status.
Interestingly, similar to defaults, variations in loan characteristics approximated by LTV (moving from P25 to P99) do not appear to change the probability of going into arrears significantly compared to borrower income and securitized asset characteristics, irrespective of extreme or less extreme demographic, residential, and employment characteristics of the borrower; the increase from P25 to P99 raises the probability on average by only 1 to 2%. The role of previous arrears also seems to be crucial in predicting arrears. Every time the status changes from no previous arrears on a loan worth more than 4000 Pounds to previous arrears on such a loan, the probability of going into arrears increases by 25% to 40%, regardless of income level and of the price and age of the securitized asset. Hence, similar to defaults, previous credit history appears to be one of the important predictors in this context but, as discussed before, this finding is to be expected given the nature of the borrowers in our sample.
Among the demographic variables, being married has strong implications for the risk of going into arrears. For different income levels and price ranges of the automobile, a change of status from not married to married reduces the probability of going into arrears by around 25%. Similar to defaults, the impact of residential status alone on the probability of going into arrears is interesting too. Furnished tenancy arrangements seem to increase the probability of going into arrears by 40 to 67% for different income levels, 33-60% for different LTV levels, and 40-60% for different automobile age and price levels. As discussed above, automobile age and price seem to affect almost all groups similarly, irrespective of extreme or less extreme demographic, residential, and employment characteristics. The increase in both age and price from P25 to P99 appears to increase the probability of going into arrears by 40% to 44%. Again, the similar increase in the probability of arrears for both variables is not surprising given that automobile age and price are likely to be negatively related.

Developing of Some More Extreme Scenarios to Predict Defaults and Going into Arrears
As discussed before, these borrowers would tend to default as they were already in arrears on other loans exceeding £4000. The North West region is a high unemployment area; therefore, borrowers who do not have a stable income from a stable job and have other loans to service would be expected to miss instalments on the car loan and be currently in arrears. If this trend continues, the probability of default or current arrears would be high for such borrowers. The least risky borrowers (with only a 2% probability of default or current arrears compared to the 7.1% average of the whole sample) are those who are married, homeowners, and full-time employed, with loans originating from the South of England, zero other loan arrears, income in the top 10% of subprime borrowers, low LTV and car price, longer loan duration and current employment, and a low share of income from government benefits.
More interestingly, holding the loan and securitized asset variables at the median level (P50) while considering the less extreme scenario in relation to demographic, employment, and residential characteristics also produces a lower chance of default and current arrears. This is exemplified by the following scenario: marital status = married, residential status = homeowner, employment status = full time, loan origin = South, other loans arrears = 0, and P50 values for total monthly income of the borrower, LTV, car price, term of loan, months in current employment, and income share from government benefits. These factors produce a default probability of 3.4%, which is about half of the overall average default probability.
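The P25, P50, and P99 levels used throughout the scenarios are percentiles of the continuous variables. The sketch below shows one way such levels could be computed with Python's standard library; the `percentile_levels` helper and the income values are hypothetical, and the interpolation method used for the paper's tables is not stated.

```python
import statistics

def percentile_levels(values, levels=(25, 50, 99)):
    """Return the requested percentiles of a sample as a {'P25': ..., ...} dict."""
    # quantiles(..., n=100) returns 99 cut points, i.e. P1 through P99.
    cuts = statistics.quantiles(values, n=100)
    return {f"P{level}": cuts[level - 1] for level in levels}

# Hypothetical monthly incomes in pounds, for illustration only.
incomes = [900 + 15 * i for i in range(200)]

levels = percentile_levels(incomes)
print(levels)
```

Holding a variable "at P50" in a scenario then means substituting its median value into the fitted model while the 0/1 conditions are varied.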
When it comes to going into arrears (current and past), different scenarios were analyzed, all of which were statistically significant; scenario number 4 (Table 8, 39.4%) has the highest probability of going into arrears. This scenario is part of the table below, and its results are similar to those in the table above. It can therefore be summarized that, similar to defaults, borrowers who were single, furnished tenants, and full-time self-employed, with loans originating from the North West of the UK, other loan arrears >£4000, low income, high LTV and car price, shorter loan duration and current employment, and a high share of income from government benefits were most likely to fall into this high-risk category of going into arrears. Similar to defaults, the least risky borrowers, with only a 1.6% chance of going into arrears (compared to the 6.3% average for the whole sample), are those who are married, homeowners, and full-time employees, with loans originating from the South of England, zero other loan arrears, income in the top 10% of subprime borrowers, low LTV and car price, long loan duration and current employment, and a low share of government benefits in total income.

Conclusions
By extending the analysis of Ghulam and Hill (2017) on subprime auto loan performance, this study looks into the role of borrowers' demographic, residential, and employment characteristics alongside loan and securitized asset features in assessing the risk of subprime loans going into arrears and default, as well as write-offs and early payoffs. A large dataset from a UK subprime car finance firm is utilized to build models of defaults and other destinations. Regression models are supplemented by nonparametric tests to distinguish between defaulters and nondefaulters based on marital, residential, and employment status; the term, magnitude, and price of the loan; and securitized asset features such as automobile age and price. After presenting and explaining the odds and relative risk ratios derived from logistic and multinomial regression results, different scenarios were built alongside interactions of the above-mentioned variables to predict the risk of default and other destinations of subprime loans.
In agreement with the above-mentioned earlier study and the broader empirical literature, we conclude that being married, living in tenancy accommodation, engaging in full-time self-employment, and the duration of current employment have significant implications for subprime loan performance (defaults, current and past arrears, early payoffs, and write-offs). Loan features such as LTV, APR, and term are also important determinants of the above-mentioned destinations, and the same could be said of securitized asset features such as the age and price of the automobile. More importantly, the extension of the earlier study confirms that borrower income and the price of the car appear to be strong predictors of defaults and of the possibility of going into arrears.
In the most extreme scenario, unmarried borrowers with furnished tenancy arrangements, a relatively new job, residence in a relatively less prosperous region such as the North West of the UK, full-time self-employment, other loan arrears worth more than 4000 Pounds, monthly income in the bottom 25%, a high-LTV loan, an expensive automobile bought on a shorter-duration payment plan, and a high dependency on government support have a probability of default of more than 60%, compared to the average default rate of 7% for overall subprime borrowers in our dataset. This is almost true of the chances of going into arrears too, except that the highest probability in this context is around 40% compared to 6% for the overall sample. We identified the least risky borrowers in relation to defaults and arrears to be those who are married, homeowners, full-time employed, and based in the South of England with zero other loan arrears. In addition, these borrowers are high income earners (in the top 10%), have LTV and car prices in the bottom 10%, hold the longest-duration loans, have stable jobs, and are less dependent on financial benefits from the government to supplement their income.
The study suggests that the different scenarios built and presented in this paper could provide an important tool for lenders, who could use these estimates to build a risk profile of borrowers given the bad credit record of these subprime borrowers. Lenders could choose the combinations that maximize their returns without significantly exposing themselves to an increasing risk of borrowers going into default or arrears, given the extremely risky credit market in which they operate and the current economic and financial climate. We believe that one of the best uses of these estimated probabilities could be in the credit scoring models of subprime lenders in the automobile finance market.
One of the interesting implications of the study is that those who opt for self-employment, for a number of reasons, appear to struggle to pay back the loans they have taken for business growth, to settle previous loans, or to meet daily expenses. This cohort of borrowers is, to some extent, already rationed in the normal credit market and consequently finds it hard to secure loans in the subprime market due to bad credit history. They need special help from the government in the form of guarantees and subsidies if these individuals are to become entrepreneurs (see Dvouletý 2017 for more discussion in this regard).
Despite the fact that our study is rich in terms of empirical undertaking and uses a popular and standard methodology, it has certain limitations. First, our sample comprises subprime borrowers who have a history of defaulting on debt payments. Hence, conclusions drawn from the study cannot be generalized to prime borrowers, despite some features common to both prime and subprime borrowers. Second, we did not test the performance of our model in terms of the number of correct predictions. The study nonetheless provides some interesting insights in relation to subprime credit defaults, but more research is needed to assess the predictive performance of the chosen methodological framework.
Author Contributions: Y.G. worked on building and estimating the statistical models. He was the main contributor to writing and drafting this research paper. He conducted the statistical analysis, developed scenarios, provided interpretations of the empirical estimates, addressed reviewer comments, and answered queries. K.D. and S.N. contributed to drafting the paper, implementing changes, and clarifying the reviewer comments and queries. S.H. helped in data collection, reviewing literature, and drafting the paper.
Funding: This research received no external funding.