Mathematical modelling of decision making: the case of motor insurance choices

This paper uses a statistical mechanical model as a framework to investigate how socioeconomic factors such as gender and place of residence influence individuals' choices between comprehensive and third-party motor insurance policies in Ghana. Data from a general insurance firm, covering five years of transactions, were used for this investigation. Partial least squares and ordinary least squares are used to estimate the parameters of the interacting and non-interacting models, respectively, in the multipopulation Curie-Weiss model within a discrete choice framework. The findings show that both location and gender have discernible influences on how people choose their motor insurance. We encourage insurance companies to intensify their campaigns on the importance of motor insurance to all vehicle owners, especially those in rural areas, in order to reduce the risk and associated losses from vehicular accidents on Ghanaian roads.


Introduction
The concept of insurance, driven by the need to secure oneself against risks, can be described as a form of risk management, primarily used to safeguard against potential financial losses. This practice is particularly relevant in many countries where increased vehicle usage heightens the likelihood of accidents, theft, and fire, necessitating the development of motor vehicle insurance to mitigate these risks [6]. Motor insurance, as discussed in [5] and [32], is a crucial contract in vehicle operations, providing a guarantee against catastrophic losses by transferring the risk from the owner to the insurer in exchange for a premium. Often referred to as automobile insurance, its popularity has led policymakers to mandate its purchase to protect not only the at-fault driver but also innocent third parties [3,36]. In Ghana, the primary forms of motor insurance are third-party and comprehensive coverage, which present significant decision-making challenges for individuals based on their socioeconomic attributes.
Theoretically, numerous models exist to facilitate informed decision-making, with discrete choice experiments (DCE) being among the most common. The application of DCE in studying human resource policy issues is a relatively new but growing area. As noted by [4], consumers express their preferences through their choices, which collectively influence the demand for various goods and services. The decision by vehicle owners, whether private or commercial, to purchase motor insurance is significantly influenced by their knowledge and involvement in the transportation sector [11]. A notable concern is the lack of adequate education provided to vehicle owners and aspiring drivers, which complicates their insurance policy selection process and challenges insurers in identifying key factors to attract these potential clients [44]. Thus, using discrete choice experiments to identify the factors influencing an individual's decision to choose a specific motor insurance provider is crucial.
Recent times have seen various mathematical applications in the insurance sector. Studies by [40,42] and [10] have applied mathematical models to predict risks and classify insurance types. A paper by [30] proposed stochastic mean fields using a general maximum principle applicable in both insurance and finance. In [23,27,31], mathematical modelling was used to inform effective insurance policies.
In [28], a continuous-time scenario was considered where an insurance contract is affected by catastrophic events influenced by climate change. The authors aimed to optimize the utility for insured parties by determining optimal premiums and necessary actions to minimize claims. In a paper by [29], the causal relationships in the growth of voluntary insurance were explored, proposing various models for enhancing insurance systems. Shared protection insurance, as discussed by [2], represents a decentralized approach where members pool resources to cover losses, challenging traditional insurance models. In [24] and [17], the focus was on developing specialized insurance schemes based on driving behaviours and other criteria, while [25] applied advanced mathematical techniques for insurance valuations.
We begin by making assumptions about how people might alter their decisions and behaviours as a result of social and economic characteristics, and then we use statistical mechanics tools to deduce the population-level implications of such assumptions. This study aims to explore the complex dynamics within groups of individuals and their impact on group behaviour, building upon studies such as [13] and [12]. The concept of self-organization, which is evident in various systems from biology to economics, is crucial for understanding these dynamics. Even minor adjustments in socio-economic frameworks can lead to significant shifts in group behaviour. For instance, a small influx of immigrants can alter language accents, and targeted government interventions can notably decrease crime rates, as referenced in [26] and [7].
Phase transition describes abrupt, large-scale changes that occur due to minor modifications in the interactions among components, a concept borrowed from statistical mechanics. Spin models, particularly the multipopulation Curie-Weiss model, exemplify these transitions, as noted in studies such as [39] and [14]. The versatility of this model is evident in its applications across diverse fields, such as social sciences [15], finance [21], chemistry [18], and ecology [43], because of its analytical tractability. Studies by [16] and [22] further underscore the utility of these models in the social sciences, suggesting that integrating these models can enhance our understanding of how social forces shape societal outcomes.
Furthermore, the multipopulation Curie-Weiss model serves as an essential tool for analysing binary choices within socio-economic contexts, such as education decisions or health-related behaviours. This concept is discussed in [22]. The model suggests that individuals from similar backgrounds tend to make similar choices, a pattern also observed in other research, such as [34], which examines the role of socio-economic factors in educational achievement across various African nations. Through the lens of these models, we gain deeper insights into how social and economic factors intertwine to influence individual and collective decisions in society.
For a deeper examination of the applications of statistical mechanical models in social sciences and energy fields, we recommend the following sources: [12,13,20] and [35]. Previous research has explored factors influencing motor insurance choices in Ghana, yet there has been limited focus on the effects of gender and residence [19,37,38]. Our study investigates the influence of gender and residence on the choice between comprehensive and third-party motor insurance policies. We employ the multipopulation Curie-Weiss model to analyse how these socio-economic characteristics impact these decisions.
The rest of the paper is organised as follows. Information on the data and on the multipopulation Curie-Weiss model (Hamiltonian) used in the study is provided in Sect. 2. Section 3 presents the study's results and discussions, and Sect. 4 presents the study's conclusion.

Materials and methods
Data
Table 1 summarizes the policy counts: 20,205 comprehensive policies and 11,767 third-party policies. By gender, there were more females than males; Table 2 summarizes this distribution. By residence, most policyholders are located in urban areas (19,168 urban against 12,804 rural, according to the counts reported in Tables 6 and 7), as detailed in Table 3.

Study variables
The study focused on motor insurance policy as the primary outcome variable, which was categorized as either a comprehensive insurance policy (coded +1) or a third-party insurance policy (coded -1). The study included covariates such as gender, differentiated as male or female, and type of residence, classified as either urban or rural.
Here, $\zeta_p$ represents the choice made by individual $p$, $T_{pq}$ denotes the interaction strength or coefficient of interaction between individuals $p$ and $q$, and $y_p$ indicates the external field of influence. The function $H_M$ on the configuration $\zeta$ is called the Hamiltonian (energy) function of the model. It represents the utility or total satisfaction of individuals resulting from their choices and the influences on them while making a decision [33]. The higher $H_M$, the higher the level of satisfaction of the population. $H_M$ is divided into two parts: the first part models the social incentives of the population, while the second part models the private incentives of an individual within the population. $T_{pq}$ measures the influence individual $p$ has on individual $q$ and represents the strength of an individual's social incentive. When $T_{pq}$ is positive, conformity or imitation is rewarded; when $T_{pq}$ is negative, it is not rewarded [41].
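The two-part structure described above can be sketched as a worked equation. This is an assumed form, inferred only from the description (a social part built from $T_{pq}$ and a private part built from $y_p$, with higher values meaning higher satisfaction); the sign and normalization conventions are assumptions and may differ from the paper's displayed equation.

```latex
H_M(\zeta) \;=\; \underbrace{\sum_{p,q=1}^{M} T_{pq}\,\zeta_p\,\zeta_q}_{\text{social incentives}}
\;+\; \underbrace{\sum_{p=1}^{M} y_p\,\zeta_p}_{\text{private incentives}},
\qquad \zeta_p \in \{+1,-1\}.
```

With this convention, a positive $T_{pq}$ raises $H_M$ when $\zeta_p = \zeta_q$, which matches the statement that conformity is rewarded.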

Discrete choice
Our interest now is how to model the private incentive $y_p$ of an individual. To do that, we consider only the private incentive for now and ignore interactions, i.e., we put $T_{pq} = 0$ for every $p, q \in \{1, 2, \ldots, M\}$, where $M$ is the total number of individuals in the population. Our general utility function now becomes

$$H_M(\zeta) = \sum_{p=1}^{M} y_p \zeta_p. \tag{2.3}$$

Each individual is assigned a vector of socio-economic attributes $b_p$, where each component of the vector is a binary variable that takes the value 1 or 0 [22]. Since each individual has his or her own distinct socio-economic attributes, we let $b_p$ be defined for $p = 1, \ldots, M$ as

$$b_p = \left(b_p^{(1)}, b_p^{(2)}, \ldots, b_p^{(k)}\right),$$

where the $b_p^{(j)}$ are $\{0, 1\}$-valued. For example, consider the situation where the socio-economic characteristics of interest are residence $b_p^{(1)}$ and gender $b_p^{(2)}$, each coded as a binary indicator (2.6). An attribute does not contribute to a person's private incentive if it has a value of zero for them. Every sensible person examines their position before making a choice or decision. Therefore, it makes sense for $y_p$, which drives an individual to act, to depend on the socio-economic characteristics $b_p$. Define $\alpha = (\alpha_0, \alpha_1, \ldots, \alpha_k)$, where $\alpha_j$, for $j = 1, \ldots, k$, is the relative weight or importance of attribute $j$, that is, the private incentive attached to that attribute when an individual is making a decision. The result in our scenario is

$$y_p = \sum_{j=1}^{k} \alpha_j b_p^{(j)} + \alpha_0. \tag{2.7}$$

Here, $\alpha_0$ represents the base private incentive that every person possesses, irrespective of their socio-economic characteristics. According to [22], the likelihood that a person with characteristics $b_p$ will choose "comprehensive insurance policy" is given by

$$P(\zeta_p = +1) = \frac{e^{y_p}}{e^{y_p} + e^{-y_p}}. \tag{2.8}$$
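As a minimal sketch of the non-interacting choice rule, the following Python snippet evaluates the private incentive as a weighted sum of binary attributes plus a base term, and then the probability in equation (2.8). The function names and the numerical weights are hypothetical and chosen only for illustration; they are not estimates from the paper.

```python
import math

def private_incentive(alpha0, alphas, b):
    """y_p = alpha_0 + sum_j alpha_j * b_p^(j) for binary attributes b."""
    return alpha0 + sum(a * x for a, x in zip(alphas, b))

def prob_comprehensive(y):
    """P(zeta_p = +1) = e^y / (e^y + e^-y), the form of equation (2.8)."""
    return math.exp(y) / (math.exp(y) + math.exp(-y))

# Hypothetical weights: base incentive 0.2, attribute weights 0.5 and -0.3.
# The individual has the first attribute (1) but not the second (0).
y = private_incentive(alpha0=0.2, alphas=[0.5, -0.3], b=[1, 0])
p_plus = prob_comprehensive(y)
p_minus = 1.0 - p_plus

# The expected choice E[zeta_p] = p_plus - p_minus coincides with tanh(y).
expected_choice = p_plus - p_minus
```

An attribute coded 0 drops out of the sum, which is the sense in which it "does not contribute" to the private incentive.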

The multipopulation Curie-Weiss model
This section concentrates on finding an appropriate parametrization for the interaction coefficient $T_{pq}$, as well as a methodical approach that enables us to estimate the model's defining parameters from data. Each individual $p$ in our discrete choice model is given $k$ binary socio-economic attributes, so the population can be partitioned into $2^k$ groups. Members of the same group or partition share the same socio-economic characteristics. It follows from equation (2.7) that every member of a group $g$ has the identical private incentive, denoted by $y_g$.
For the purposes of what follows, we assume that for any pair of groups $g$ and $g'$, $T_{pq} = \tilde{T}_{gg'}$ for every $p \in I_g^M$ and $q \in I_{g'}^M$, where $I_g^M$ denotes the set of individuals in group $g$. This assumption and equation (2.2) lead to equation (2.10). In the private incentive component, the third equality uses the fact that $y_p = y_g$ for every $p \in I_g^M$. Note that

$$\hat{w}_g = \frac{1}{M_g} \sum_{p \in I_g^M} \zeta_p,$$

with $M_g$ the number of individuals in group $g$, is the average decision of the individuals in group or partition $g$. The first term in equation (2.10) is rewritten in the same way, using $T_{pq} = \tilde{T}_{gg'}$ for every $p \in I_g^M$ and $q \in I_{g'}^M$ in the third equality. The Hamiltonian can then be written in terms of $w_g = \frac{M_g}{M} \hat{w}_g$ and a rescaled coupling $T_{gg'}$ obtained from $\tilde{T}_{gg'}$, where $\frac{M_g}{M}$ and $\frac{M_{g'}}{M}$ are the relative sizes of the groups $I_g^M$ and $I_{g'}^M$, respectively. Consequently, $w_g$ and $w_{g'}$ represent the weighted average decisions of the members of groups $g$ and $g'$, respectively. As a result of the change in our model from individual to group decisions, our generic Hamiltonian takes the form given in equation (2.15).

$H_M$ measures the level of satisfaction of the population in choosing comprehensive insurance. When $T_{gg'}$ is positive, groups $g$ and $g'$ are content with the same decisions; when it is negative, group imitation is discouraged or not rewarded. $y_g$ is group $g$'s private incentive. In this paper, we use the two attributes gender and residence, so the number of socio-economic attributes is $k = 2$ and there are 4 partitions, since each attribute assumes two values. We examine how one's gender (male or female) and place of residence (urban or rural) influence the kind of motor insurance policy they choose to purchase. For $k = 2$, the Hamiltonian is built from the couplings $T_{gg'}$, the weighted averages $w_g$, and the private incentives $y_g$, for $g, g' = 1, \ldots, 4$. There are four intra-group interactions ($T_{11}$, $T_{22}$, $T_{33}$, $T_{44}$) that control how strongly members of the same group imitate one another, and twelve inter-group interactions that control the magnitude of imitation between members of different groups. The case $k = 1$ of this model was considered in [16], where the thermodynamic limit of the model was established and its solution rigorously derived. Some of the model's properties were also examined. In particular, it was demonstrated that the model factorizes completely, so that the self-consistent equations [22]

$$\hat{w}_g = \tanh(U_g)$$

describe the model's equilibrium states in their entirety. Specifically, this enables us to express the probability in equation (2.8) that person $p$, a member of group $g$, makes a given choice in closed form as

$$P(\zeta_p) = \frac{e^{U_g \zeta_p}}{e^{U_g} + e^{-U_g}}, \tag{2.17}$$

where

$$U_g = \sum_{g'} T_{gg'} w_{g'} + y_g.$$

The probability that $p$ will choose $\zeta_p = +1$ is $\frac{e^{U_g}}{e^{U_g} + e^{-U_g}}$, and that of $\zeta_p = -1$ is $\frac{e^{-U_g}}{e^{U_g} + e^{-U_g}}$. For all $p$ in $g$, let $E$ denote the expectation of $\zeta_p$. Then

$$E(\zeta_p) = \frac{e^{U_g} - e^{-U_g}}{e^{U_g} + e^{-U_g}} = \tanh(U_g).$$
The general formulation for $U_g$ in our situation, when $k = 2$, has the following form:

$$U_g = \sum_{g'=1}^{4} T_{gg'} w_{g'} + \alpha_1 b_g^{(1)} + \alpha_2 b_g^{(2)} + \alpha_0.$$

This is a linear regression model that relates a group's utility to the parameters $T_{gg'}$ and $\alpha_j$. Recall that

$$\hat{w}_g = \tanh(U_g).$$

This is the fundamental relation required to estimate the interacting model from actual data. The parameters that need to be estimated are $T_{gg'}$, $\alpha_j$, and $\alpha_0$.
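To illustrate how the self-consistent relation between the group averages and $\tanh(U_g)$ ties the averages to the parameters, the sketch below solves it by fixed-point iteration for four groups. Everything numerical here is hypothetical (couplings, private incentives, relative group sizes), and the weighting $w_g = \frac{M_g}{M}\hat{w}_g$ is an assumption about the paper's convention. The couplings are kept small so that the iteration is a contraction; with strong couplings the model can have several equilibria and this simple loop is not guaranteed to converge.

```python
import math

def equilibrium(T, y, sizes, tol=1e-12, max_iter=10_000):
    """Iterate w_hat[g] = tanh(sum_h T[g][h] * sizes[h] * w_hat[h] + y[g])."""
    n = len(y)
    w_hat = [0.0] * n
    for _ in range(max_iter):
        new = [
            math.tanh(sum(T[g][h] * sizes[h] * w_hat[h] for h in range(n)) + y[g])
            for g in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new, w_hat)) < tol:
            return new
        w_hat = new
    raise RuntimeError("fixed-point iteration did not converge")

# Hypothetical parameters for k = 2 attributes, i.e. four groups.
T = [[0.3 if g == h else 0.1 for h in range(4)] for g in range(4)]
y = [0.1, 0.5, 0.0, 0.8]          # private incentives y_g
sizes = [0.34, 0.07, 0.33, 0.26]  # relative group sizes M_g / M (sum to 1)

w_hat = equilibrium(T, y, sizes)

# Utilities U_g evaluated at the equilibrium, for comparison with atanh(w_hat).
U = [sum(T[g][h] * sizes[h] * w_hat[h] for h in range(4)) + y[g] for g in range(4)]
```

Note that the third group has zero private incentive in this toy setup, yet its equilibrium average is non-zero because of the social term.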
The version of the model in equation (2.3), which does not involve interactions, is obtained when $T_{gg'} = 0$ for every $g, g'$. The non-interacting Hamiltonian can be expressed similarly, as in equation (2.19). In our case, where $k = 2$, the group utility reduces to

$$y_g = \alpha_1 b_g^{(1)} + \alpha_2 b_g^{(2)} + \alpha_0.$$

Hence, we can say that $U_g = y_g$ for $g = 1, \ldots, 4$.
The non-interacting model also factorizes completely, giving the equilibrium average choices

$$\hat{w}_g = \tanh(y_g).$$

Starting from actual statistical data, this will be our primary tool for estimating the parameters of the non-interacting model.

Estimation
The model predicts that person $p$, a member of group $g$, has a probability of choosing +1 equal to $\frac{e^{U_g}}{e^{U_g} + e^{-U_g}}$, where $U_g = \sum_{g'} T_{gg'} w_{g'} + y_g$ and $y_g = \sum_{j} \alpha_j b_g^{(j)} + \alpha_0$.
In our analysis, the empirical average choice is the observed quantity, while $\tanh(U_g)$ is the model's prediction. We apply the least squares method to the model, minimizing the squared norm of the difference between the observed and predicted quantities. In our case, the observed quantity is the estimated average choice $\hat{w}_g$ of group $g$. We must therefore determine the parameter values that minimize

$$\sum_{g} \left(\hat{w}_g - \tanh(U_g)\right)^2. \tag{2.20}$$

Minimizing (2.20) directly is computationally expensive because of the non-linear function $\tanh(U_g)$. Interestingly, this stimulus-response type of experiment closely mirrors the real-world applications of a social behaviour model, such as linking stimuli provided by policy incentives and media to individual behavioural responses within a population. This issue was first encountered by [9] in the 1950s while developing a statistical method for bioassays. A similar approach is also used in statistical mechanics, for example, when determining the best order parameter for a specific Hamiltonian. Since $U_g$ is a linear function of the model's parameters and $\tanh$ is an invertible function, the above error is replaced by the following [8]:

$$\sum_{g} \left(\tanh^{-1}(\hat{w}_g) - U_g\right)^2. \tag{2.21}$$

The method breaks down when the average opinion $\hat{w}_g$ of a group equals +1 or -1, since then $\tanh^{-1}(\hat{w}_g) = \pm\infty$. If the groups are large enough, however, $\hat{w}_g = \pm 1$ will not occur.
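The difference between the non-linear objective and its linearized counterpart can be sketched in a few lines. The helper names and the rounded average choices below are illustrative assumptions, not values taken from the paper's tables.

```python
import math

def loss_nonlinear(w_hat, U):
    """Sum of squared differences between observed averages and tanh(U_g)."""
    return sum((w - math.tanh(u)) ** 2 for w, u in zip(w_hat, U))

def loss_linearized(w_hat, U):
    """Sum of squared differences between atanh(observed averages) and U_g."""
    return sum((math.atanh(w) - u) ** 2 for w, u in zip(w_hat, U))

def safe_atanh(w_hat):
    """atanh with an explicit guard for unanimous groups (w_hat = +1 or -1)."""
    if abs(w_hat) >= 1.0:
        raise ValueError("atanh is infinite at +/-1; the group is unanimous")
    return math.atanh(w_hat)

# Illustrative average choices for four groups (hypothetical, rounded values).
w_hat = [0.11, 0.57, 0.01, 0.70]

# If the utilities reproduce the data exactly, both objectives vanish.
U_exact = [safe_atanh(w) for w in w_hat]
nonlinear_at_exact = loss_nonlinear(w_hat, U_exact)
linearized_at_exact = loss_linearized(w_hat, U_exact)
```

Because the linearized objective is quadratic in the parameters that enter $U_g$ linearly, it can be minimized with standard linear least squares, which is the practical reason for the transformation.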
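From equation (2.21), we rewrite $U_g$ for $g = 1, \ldots, 4$ as

$$U_g = \sum_{g'=1}^{4} T_{gg'} w_{g'} + \alpha_1 b_g^{(1)} + \alpha_2 b_g^{(2)} + \alpha_0.$$

Now that our model is a set of linear equations, it can be expressed as

$$U = DX,$$

where $U$ is a $4 \times 1$ matrix whose entries are the inverse hyperbolic tangents $\tanh^{-1}(\hat{w}_g)$ of the equilibrium states, $D$ is a $4 \times 19$ matrix (it has more columns than rows, since there are sixteen couplings $T_{gg'}$ and three parameters $\alpha_1$, $\alpha_2$, $\alpha_0$), and $X$ is the vector of the model's parameters.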
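One way to picture the $4 \times 19$ system is to build the matrix explicitly. In the sketch below, row $g$ holds the weighted averages in the block of columns belonging to $T_{g1}, \ldots, T_{g4}$, zeros in the other coupling columns, and then the two attribute values and a constant 1. The column ordering, the helper name `design_row`, and all numerical values are assumptions made for illustration; the paper does not spell out its layout.

```python
def design_row(g, w, b_g):
    """Row g of D: w sits in the block for T_{g,1..n}; then b1, b2 and 1."""
    n = len(w)
    row = [0.0] * (n * n)
    row[g * n:(g + 1) * n] = w
    return row + [float(b_g[0]), float(b_g[1]), 1.0]

# Hypothetical weighted averages w_g and binary attribute codes per group.
w = [0.04, 0.04, 0.004, 0.18]
b = [(1, 1), (0, 0), (1, 0), (0, 1)]
D = [design_row(g, w, b[g]) for g in range(4)]

# Hypothetical parameter vector X = (T_11, ..., T_44, alpha_1, alpha_2, alpha_0).
T = [[0.1 * (g + 1) - 0.05 * h for h in range(4)] for g in range(4)]
alpha1, alpha2, alpha0 = -0.2, 0.3, 0.1
X = [t for row in T for t in row] + [alpha1, alpha2, alpha0]

# The matrix product D X reproduces the group utilities U_g term by term.
U_from_D = [sum(d * x for d, x in zip(row, X)) for row in D]
U_direct = [
    sum(T[g][h] * w[h] for h in range(4)) + alpha1 * b[g][0] + alpha2 * b[g][1] + alpha0
    for g in range(4)
]
```

With four equations and nineteen unknowns the system is underdetermined, so ordinary least squares has no unique solution; this is consistent with the use of partial least squares for the interacting model.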
For the non-interacting model, we recall that $U_g = y_g$, and we rewrite the equation for $g = 1, \ldots, 4$ as

$$U_g = \alpha_1 b_g^{(1)} + \alpha_2 b_g^{(2)} + \alpha_0.$$

The resulting system of linear equations has three unknown parameters: $\alpha_1$, $\alpha_2$, and $\alpha_0$. It is formalized in the same way as the interacting model: $U$ is a $4 \times 1$ matrix, $A$ is a $4 \times 3$ matrix, and $X$ is the vector of the parameters of the non-interacting model. As a result, $U = AX$ [34]. We used Matlab version R2016a for our computations, and the results are shown in the next section.
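The $4 \times 3$ system for the non-interacting model is small enough to solve by hand through the normal equations. The sketch below does this in plain Python rather than Matlab. It uses the policy counts reported in Tables 6 and 7 and assumes one particular attribute coding (female = 1, urban = 1, in the spirit of case 1) together with the unweighted group average; both are assumptions, so the output is not expected to reproduce the estimates in Table 9.

```python
import math

def solve_linear(mat, vec):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(vec)
    aug = [row[:] + [vec[i]] for i, row in enumerate(mat)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def ols(A, u):
    """Least-squares solution of A x ~ u via the normal equations A^T A x = A^T u."""
    cols = len(A[0])
    AtA = [[sum(row[i] * row[j] for row in A) for j in range(cols)] for i in range(cols)]
    Atu = [sum(row[i] * ui for row, ui in zip(A, u)) for i in range(cols)]
    return solve_linear(AtA, Atu)

# Group average choices (comprehensive - third party) / total, Tables 6 and 7.
w_hat = [(6042 - 4838) / 10880, (1785 - 486) / 2271,
         (5333 - 5200) / 10533, (7045 - 1243) / 8288]
u = [math.atanh(w) for w in w_hat]

# Columns: b1 (gender, female = 1), b2 (residence, urban = 1), intercept.
# The coding is an assumption made for this sketch.
A = [[1, 1, 1],   # g = 1: females, urban
     [0, 0, 1],   # g = 2: males, rural
     [1, 0, 1],   # g = 3: females, rural
     [0, 1, 1]]   # g = 4: males, urban

alpha1, alpha2, alpha0 = ols(A, u)
fitted = [alpha1 * r[0] + alpha2 * r[1] + alpha0 for r in A]
```

Because the four groups form a balanced two-by-two design, each attribute weight here equals the difference between the mean transformed averages of the two levels of that attribute.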

Results and discussion
We now apply our estimation procedure to a real-world problem. An insurance company in Ghana provided the data used for this estimation.
The data cover the individuals who purchased motor insurance coverage from 2015 to 2020. A total of 31,972 people were included; 20,205 opted for comprehensive coverage, while 11,767 chose third-party policies. Individuals in this study were categorized into groups based on socio-economic characteristics, namely place of residence (rural or urban) and gender (male or female). Attributes were assigned to these groups, each consisting of individuals sharing the same socio-economic characteristics. We explore the cases listed in Table 4, which details the socio-economic attributes. In the subsequent discussion, we assume that $b_g^{(1)} = 0$ indicates that the attribute has no impact on the group's private incentives. We examined the effect of the absence of private incentives on the strength of social interactions, denoted by $T_{gg'}$, in the analysis of the four cases. Subsequently, we applied the estimation method described earlier and the parametrization of the attributes to analyse motor insurance policy choices made by Ghanaians.

Motor insurance choice
The dataset contains information on gender and includes a population of over 31,000 individuals from both rural and urban areas nationwide. In this analysis, the binary attributes $b_g^{(1)}$ and $b_g^{(2)}$, representing gender and place of residence respectively, are used to categorize groups of people. We aim to investigate how an individual's gender and place of residence influence their choice of motor insurance policy. The population's selection of motor insurance policies is segmented in Table 5 based on socio-economic characteristics.
This partitioning yields four groups, indexed by $g = 1, \ldots, 4$. In particular, $g = 1$ represents females in the urban group, $g = 2$ represents males in the rural group, $g = 3$ represents females in the rural group, and $g = 4$ represents males in the urban group.
For this case study, we focus on people choosing between comprehensive and third-party motor insurance. In so doing, we formalise $\zeta_p$ as

$$\zeta_p = \begin{cases} +1, & \text{if individual } p \text{ chooses comprehensive motor insurance}, \\ -1, & \text{if individual } p \text{ chooses third-party motor insurance}. \end{cases}$$

We reorganize our population, taking into account each individual's preferences and attributes. Table 6 offers a breakdown of individuals by gender and residence in relation to comprehensive policies. In urban areas, there are 6042 females and 7045 males covered by comprehensive policies. In rural areas, there are 5333 females and 1785 males covered by comprehensive policies.
Table 7 presents data on gender and residence concerning third-party policies. In urban areas, 4838 females and 1243 males are covered by third-party policies. In rural areas, 5200 females and 486 males are covered by third-party policies. We calculate the weighted averages for the different groups using the data from Tables 6 and 7. The weighted average is, as recalled from the definition above, built from the choices $\zeta_p$ of the members of each group. When the coding of $\zeta_p$ is substituted into it, the average is expressed in terms of $M_g^{\mathrm{Com}}$ and $M_g^{\mathrm{Third}}$, where $M_g^{\mathrm{Com}}$ is the number of people in group $g$ who choose comprehensive motor insurance and $M_g^{\mathrm{Third}}$ is the number of individuals in group $g$ who choose third-party motor insurance, for $g = 1, \ldots, 4$.
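The group averages can be sketched directly from the counts quoted above. With the +1/-1 coding, the plain average choice of a group is the difference between its comprehensive and third-party counts divided by the group size; the size-weighted version multiplies this by the group's share of the population. Which of the two the paper tabulates is not recoverable from this text, so both are computed here, and the variable names are illustrative.

```python
# (comprehensive, third-party) counts per group, from Tables 6 and 7.
counts = {
    1: (6042, 4838),  # females, urban
    2: (1785, 486),   # males, rural
    3: (5333, 5200),  # females, rural
    4: (7045, 1243),  # males, urban
}

# Total population across all groups.
M = sum(com + third for com, third in counts.values())

def average_choice(com, third):
    """Mean of zeta_p over a group with +1 = comprehensive, -1 = third party."""
    return (com - third) / (com + third)

# Plain group averages and size-weighted averages (M_g / M) * average.
w_hat = {g: average_choice(com, third) for g, (com, third) in counts.items()}
w = {g: (sum(counts[g]) / M) * w_hat[g] for g in counts}

for g in sorted(counts):
    print(f"g={g}: size={sum(counts[g])}, w_hat={w_hat[g]:.4f}, w={w[g]:.4f}")
```

All four averages are positive, meaning that comprehensive cover is the majority choice in every group, although only marginally so for females in rural areas.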
Accordingly, the averages are computed for each group in turn, beginning with $g = 1$ (females in urban areas). We now compute the values of the attributes $b_g^{(1)}$ and $b_g^{(2)}$, for gender and residence respectively, based on the four cases in Table 4; see Table 8. The values given to the attributes in each of the four cases describe how pertinent each trait is to the Hamiltonian's private incentive component. For example, in case 1, being male has no effect on a group's private incentives, whereas being female has an impact. Similarly, living in a rural location has no impact on a group's private incentives, whereas living in an urban setting has an impact.
The estimates of the inter- and intra-group coupling strengths and of the attribute weights are shown below. Table 9 shows the estimates for the non-interacting discrete choice model and Table 10 those for the interacting model, for the cases defined in Tables 4 and 8.

Non-interacting case
Considering the non-interacting model's estimates, when a group's private incentives $\alpha_j$ add up to a positive value, the group will choose comprehensive motor insurance over third-party motor insurance. Conversely, when the sum is negative, the group will choose third-party over comprehensive motor insurance. In Table 9, the sum of the $\alpha_j$ for cases 2, 3, and 4 is positive, suggesting that those considered in these cases will pick a comprehensive policy as their motor insurance option. The sum of the $\alpha_j$ in case 1 is negative, which suggests that when there is no interaction, people will choose third-party motor insurance. For all four cases considered, the estimates of the private incentive for residence, $\alpha_1$, are minuscule and close to zero, implying that, when there is no interaction, one's residence has no effect on an individual's private motivation in choosing a motor insurance policy. Cases 1 and 4 have the same negative value for the private incentive for residence estimate $\alpha_2$. Being in an urban environment does not increase a group's private incentive. The negative value therefore suggests that people in rural areas will want to get third-party motor insurance.
For cases 2 and 3, on the other hand, we have a positive result, which indicates that people in urban regions will select comprehensive motor insurance. Being rural does not influence the group's private incentive in these groups. This suggests that a person's place of residence is a key determining factor in decisions about motor insurance policies.
Case 2 has a negative base private incentive $\alpha_0$, which means that when socioeconomic factors such as gender and place of residence are disregarded, people will opt for third-party insurance as their motor insurance option. In contrast, when gender and residence are not taken into account, people in cases 1, 3, and 4 will select comprehensive motor insurance, because these cases have positive base private incentives.

Interacting case
The utility function of the interacting model incorporates both social and private incentives. When the social incentive $T_{gg'}$ for groups $g$ and $g'$ is positive, individuals in these groups tend to copy one another; when $T_{gg'}$ is negative, individuals in these groups tend to choose differently. In this case, conformity is not rewarded.
$T_{11}$ represents the level of interaction among females in an urban area. The estimate for $T_{11}$ is negative (see Table 10), indicating that imitation is not rewarded, whereas $T_{22}$ measures the degree to which rural-dwelling males engage with one another. Since the estimate for $T_{22}$ is positive, imitation is rewarded. The positive estimate for $T_{24}$ suggests that males in the rural area have a greater influence on males in the urban area. The values for $\alpha_1$ and $\alpha_2$ are negative, indicating that location and gender have no discernible influence or role in how people choose their motor insurance when interaction is present. Because there is no base private incentive, persons considered in these circumstances will base their decision on where they live or which gender they identify with when purchasing motor insurance.

Model diagnostics and validation
In this section, we examine how well our interacting model fits the data. Partial least squares (PLS) was used to estimate the parameters of the interaction model identified in [16]. $U_g$ and $w_g$ are our dependent and independent variables, respectively. PLS analyses (predicts) a collection of dependent variables using a set of independent variables (predictors). With this method, the independent variables produce orthogonal factors known as latent vectors, which are then arranged in decreasing order of their eigenvalues [1].
Our analysis and results were produced using MATLAB version R2016a. Table 11 displays the root mean square error of prediction (RMSEP) and the percentage of variation explained by the latent vectors employed in the estimation of the independent variables ($w_g$) and the dependent variable ($U_g$).
According to Table 11, the first two latent vectors account for 100% of the variation in $w_g$ and 94.79% of the variation in $U_g$.

Conclusion
This paper presents a study that aimed to understand how socio-economic characteristics affect consumers' decisions about motor insurance in Ghana. We used a statistical mechanics model called the multipopulation Curie-Weiss model, which is a theoretical model that describes the behaviour of a system of interacting populations. The model was applied to motor insurance data from Ghana to analyse how socio-economic characteristics, such as gender and residence, influence consumers' insurance choices.
To estimate the parameters of the interacting and non-interacting models, we used partial least squares and ordinary least squares, respectively. The results of the analysis showed that, for the non-interacting model, the sum of the estimated parameters for males in rural, females in rural, and males in urban areas was positive, indicating that individuals in these cases were more likely to choose a comprehensive policy as their motor insurance option. On the other hand, the sum of the estimated parameters for females in urban areas was negative, which suggests that when there is no interaction, females in urban regions tend to choose third-party motor insurance. The results of the interacting model showed that males in rural areas interacting among themselves, represented by $T_{22}$, and the influence of males in rural areas on females in rural areas, represented by $T_{23}$, tend towards the same policies since the estimates are positive, while others (represented by $T_{33}$ and $T_{44}$) prefer different policies. This implies that the level of interaction between different groups of individuals plays a significant role in determining their insurance choices.
Based on these findings, we suggest that insurance companies in Ghana should increase their marketing efforts to educate vehicle owners about the value of motor insurance. This is particularly important for individuals in rural areas, who may not be aware of the benefits of motor insurance and the risks of not having it. By increasing awareness and understanding of the importance of motor insurance, insurance companies can help to lower the risk and losses from vehicular accidents on Ghanaian roads.
In future work, we will consider incorporating delays in the same model, which would allow for the analysis of how the timing of decisions and events may affect insurance choices. This could provide additional insights into the factors that influence consumers' insurance decisions and help to inform more effective marketing strategies for insurance companies.

Table 1
Summary of policies

Table 2
Summary of gender

Table 4
Socio-Economic Attributes of Individuals

Table 5
Population classification based on attributes

Table 6
Motor insurance policy: Comprehensive

Table 7
Motor insurance policy: Third Party

Table 8
Socio-Economic Attributes for Groups

Table 9
Motor Insurance Choices: Estimation for the non-interacting model

Table 10
Motor Insurance Choices: Estimation for the interacting model

Table 11
Variance of $w_g$ and $U_g$ explained by the latent vectors and RMSEP

Taking into consideration the variances described by the independent and the dependent variables shows that the first two latent vectors are sufficient. As the number of latent vectors rises, the RMSEP declines.