Poverty‐Armed Conflict Nexus: Can Multidimensional Poverty Data Forecast Intrastate Armed Conflicts?

Poverty is widely acknowledged as a significant factor in the outbreak of armed conflicts, particularly fueling armed conflict within national borders. There is a compelling argument positing that poverty is a primary catalyst for intrastate armed conflicts; reciprocally, these conflicts exacerbate poverty. This article introduces a statistical model to forecast the likelihood of armed conflict within a country by scrutinizing the intricate relationship between intrastate armed conflicts and various facets of poverty. Poverty, arising from factors such as gender inequality and limited access to education and public services, profoundly affects social cohesion. Armed conflicts, a significant cause of poverty, result in migration, economic devastation, and adverse effects on social unity, particularly affecting disadvantaged and marginal groups. Forecasting and receiving early warnings for intrastate armed conflicts are crucial for international policymakers to take precautionary measures. Anticipating and proactively addressing potential conflicts can mitigate adverse consequences and prevent escalation. Hence, forecasting intrastate armed conflicts is vital, prompting policymakers to prioritize the development of effective strategies to mitigate their impact. While not guaranteeing absolute certainty in forecasting future armed conflicts, the model shows a high degree of accuracy in assessing security risks related to intrastate conflicts. It utilizes a machine‐learning algorithm and annually published fragility data to forecast future intrastate armed conflicts. Despite the widespread use of machine‐learning algorithms in engineering, their application in social sciences still needs to be improved. This article introduces an innovative approach to examining the correlation between various dimensions of poverty and armed conflict using machine‐learning algorithms.


Introduction
The connection between poverty and intrastate armed conflicts is not straightforward. On the one hand, poverty can lead to conflicts; on the other hand, conflicts can exacerbate poverty. To effectively address the root causes of poverty and armed conflicts, it is crucial to understand the complex interplay between them.
While poverty is often defined in monetary terms, it is essential to note that the poverty that fuels armed conflicts is multidimensional and stems from inequality that leads to grievances. Grievances are better understood as multidimensional poverty, affecting all aspects of life.
Although it is widely perceived that grievance is a significant factor contributing to the outbreak of intrastate armed conflict, there are numerous instances where a high level of grievance did not escalate into armed conflict. Hence, it is essential to understand the circumstances under which a state may experience armed conflict due to poverty. Cederman et al. (2010) suggest that when large ethnic groups with high mobility are excluded, the likelihood of civil war increases. Additionally, a history of previous conflicts within the country can increase the risk of future conflict (Cederman et al., 2010, p. 88). Lindemann and Wimmer (2018), who studied the Ethnic Power Relations dataset, note that not all politically marginalized groups experience armed conflict due to ethnopolitical exclusion (Lindemann & Wimmer, 2018, p. 1). According to their research, when the dissatisfaction related to ethnopolitical inequality is aggravated by state violence that targets members of a specific group, conflicts can turn into armed rebellions. This is more likely to happen when the state's repressive institutions have limited control over the territory or when a neighboring state provides refuge. In such circumstances, leaders of the excluded groups may seize the opportunity to organize an armed rebellion (Lindemann & Wimmer, 2018, p. 13).
Grievances in society are one of the significant contributors to deteriorating social cohesion. Grievances arising from the political context reflect political unrest and populism. This occurs when public officials make unstable policies and regulations for the state. The resulting uncertainty frustrates the public, making it difficult for people to make decisions regarding economic, social, and daily aspects of life.
When a government is politically unstable, it cannot meet public demands or deliver services successfully.
It has been argued that an unstable government can negatively impact a state's political situation and the country's economic and social systems (Abbas et al., 2023, pp. 1-2). It can be assumed that structural forms of exclusion can influence the outbreak of conflicts within a state. This study aims to better understand the root causes of intrastate conflicts by exploring how different types of exclusion and societal grievances affect the likelihood of armed conflict. Instead of focusing solely on the kind of government system in place, this article attempts to identify a systematic relationship between societal grievances and the probability of intrastate armed conflict occurring.
At this point, an important question must be addressed: What type of grievance are we discussing? If we assert that grievances increase the likelihood of intrastate conflict, it is of utmost importance to measure which types of grievance can cause the outbreak of intrastate armed conflict. Grievances that result in an outbreak of armed conflict may concern the economy, social cohesion, political rights, or several factors that affect one another.
Grievances can arise for various reasons and can be linked to multidimensional poverty. Individual well-being is linked to multidimensional poverty, characterized by monetary, educational, and living standards (Oxford Poverty and Human Development Initiative, 2023). Multidimensional poverty is also closely related to fragility in social well-being. Fragility in one aspect of social well-being can negatively impact an individual's well-being and, in many instances, can lead to societal grievances. Measuring fragility across different dimensions of well-being can help us better forecast the decline in social cohesion. This research aims to create a forecasting model using machine-learning algorithms.
Using suitable fragility metrics in different areas of social life, we can forecast the likelihood of the outbreak of intrastate conflict.

Poverty and Armed Conflict Relationship
History has demonstrated the inexorable intertwining of conflicts and poverty, emphasizing the imperative for enhanced understanding to combat their profound repercussions on humanity. While traditionally approached within separate academic domains (poverty within development studies and economics, and armed conflict within security and peace studies), it is increasingly apparent that these phenomena are not isolated but interconnected facets of global challenges. This realization is underscored by empirical evidence indicating that some of the world's poorest nations have been ravaged by major civil wars, with a significant likelihood of relapse into armed conflict within the first five years of peace (United Nations Development Programme, 2005).
Understanding the relationship between conflict and poverty is complex, given the intricate feedback mechanisms between these phenomena. Academic inquiry has predominantly focused on elucidating how poverty can catalyze conflict and vice versa, with recent attention primarily directed towards exploring poverty's role in instigating war (Justino, 2011). The complex interplay between violent conflict and poverty manifests through various channels, including conflict as a cause of chronic poverty, insecurity exacerbating poverty, and poverty serving as a trigger for conflict.
Examining the impacts of military institutions and armed conflict on economic development unveils a critical nexus between conflict and poverty. Notably, civil wars precipitate a sharp increase in military expenditure relative to GDP, often at the expense of social spending, thereby perpetuating stagnation and underdevelopment (Collier & Hoeffler, 2006; Loayza et al., 1999). Moreover, conflicts weaken governance institutions and impede service provision, amplifying immediate and long-term human costs, particularly among vulnerable groups (Stewart & FitzGerald, 2000).
While there is consensus on the transmission mechanism validating poverty as a trigger for conflict, modern conflicts are recognized as multi-causal phenomena influenced by various short- and long-term factors beyond economic deprivation (Fearon & Laitin, 2003; Goodhand, 2001).
Conflict-induced disruptions in agriculture and investment contribute to increased economic uncertainty, leading to reliance on informal markets and elevated production costs (Justino, 2011). Furthermore, weakened social networks diminish informal risk mitigation mechanisms, exacerbating the economic toll of conflict on households.
In addition to its economic ramifications, armed conflict inflicts profound capability deprivations, undermining society's ability to realize valuable functions (Sen, 2011). The atrocities perpetrated during conflicts, ranging from massacres to forced displacement, result in severe freedom deprivation, limiting individuals' prospects for leading dignified lives.
In conclusion, the intricate relationship between conflicts and poverty necessitates a holistic approach integrating political, economic, and social strategies to address root causes and mitigate their impacts.
By fostering greater understanding and implementing targeted interventions, societies can aspire to break the cycle of violence and poverty, paving the way for a more equitable and prosperous future.

Research Methodology
This study utilizes fragility metrics based on the Fragile States Index, published annually by the Fund for Peace. This index evaluates countries based on cohesion, political stability, economic stability, social stability, and cross-cutting groups. The evaluations are based on 12 indicators within these groups and have been published yearly since 2007. We created a machine-learning algorithm that utilizes an open-source dataset provided by the Fund for Peace to forecast future outbreaks of intrastate armed conflicts. Regression analysis is frequently used in machine-learning models to forecast a dependent variable y based on an independent variable x (Kassambara, 2017, p. 6). This statistical method can also explain the interaction between the dependent variable and the independent variables that affect it (Bulut, 2018, p. 219). Our algorithm assumes a statistical relation between fragility indicators and the outbreak of armed conflict within a state. This study uses fragility data from 2007 to 2012 to measure the relation and make forecasts. The indicators and grouping of the Fragile States Index are described in Table 1.
To forecast intrastate armed conflicts, we used the Fragile States Index indicators as independent variables and data derived from the ACLED dataset as the dependent variable. A country is coded as a case of intrastate armed conflict if it experienced armed conflict with more than 250 conflict-related deaths over roughly ten years, from 2012 until 2022. In the dataset, a case of intrastate armed conflict is marked as "1," and a case of no intrastate conflict is marked as "0" (ACLED, 2024). Cockayne et al. (2010) set the threshold for conflict-related deaths at 250 within ten years, and for a civil war to be considered active, there must be at least 25 conflict-related deaths within a year (Cockayne et al., 2010, p. v). Therefore, when a logistic regression model is built over a ten-year window, 250 deaths in a country (25 per year over ten years) indicate that "armed conflict has occurred." Under these criteria, it is found that between 2012 and 2022, 57 out of 177 countries experienced intrastate armed conflict, based on the ACLED dataset.
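The coding rule for the dependent variable can be sketched as follows. This is a minimal illustration in Python rather than the R code used in the study; the country names and death counts are hypothetical placeholders, and the strict "more than 250" comparison follows the wording above.

```python
# Minimal sketch of the dependent-variable coding rule described above.
# Country names and death totals are hypothetical, not ACLED figures.
DEATH_THRESHOLD = 250  # conflict-related deaths over the 2012-2022 window


def conflict_status(total_deaths: int, threshold: int = DEATH_THRESHOLD) -> int:
    """Return 1 ("armed conflict occurs") or 0 ("armed conflict does not occur")."""
    return 1 if total_deaths > threshold else 0


deaths_2012_2022 = {"Country A": 1840, "Country B": 12, "Country C": 250}
labels = {country: conflict_status(deaths) for country, deaths in deaths_2012_2022.items()}
print(labels)
```

Whether a country with exactly 250 deaths should be coded "1" is a boundary choice the text does not settle; the sketch codes it "0".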
The Fragile States Index provides complete fragility data between 2007 and 2012 for 177 countries. Since data on South Sudan only started to be published in 2012, South Sudan was excluded to protect the integrity of the forecasting model, and the model was built on 177 countries. In the forecasting model, the independent variables are the fragility data taken from the Fragile States Index (2007-2012). Armed conflict status, the dependent variable, is obtained from the armed conflict data between 2012 and 2022. The occurrence of armed conflict in a given country during that period is recorded in the dataset as "conflict status" and marked as "1" or "0." To ensure accurate results in machine learning, it is essential to consider the proportional difference between the categories of the dependent variable. This study built the training set around category "1," which had the fewest observations. The dataset was divided into two sets: the training and test sets.
To maintain balance, the training set comprised 114 observations, of which 57 were from category "0" and 57 were from category "1." The test set was used to evaluate the accuracy of the trained model and covered all the data, with 177 observations, equal to the total number of countries in the dataset. The data distribution of the dataset used in the model is presented in Table 2.
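One way to build such a balanced training set is to keep every observation from the smaller category and draw an equally sized random sample from the larger one. The sketch below assumes simple random undersampling with a fixed seed; the article does not state how the 57 category "0" countries were chosen, so both the sampling rule and the placeholder rows are assumptions.

```python
import random


def balanced_training_set(rows, label_key="ConflictStatus", seed=42):
    """Keep all rows of the smaller class and sample the same number from the larger class."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the split is reproducible
    positives = [r for r in rows if r[label_key] == 1]
    negatives = [r for r in rows if r[label_key] == 0]
    minority, majority = sorted([positives, negatives], key=len)
    return minority + rng.sample(majority, len(minority))


# Placeholder data with the class sizes reported in Table 2: 57 "1" and 120 "0".
rows = [{"country": f"C{i}", "ConflictStatus": 1 if i < 57 else 0} for i in range(177)]

train_set = balanced_training_set(rows)  # 57 + 57 = 114 observations
test_set = rows                          # the model is evaluated on all 177 countries
print(len(train_set), len(test_set))
```

Because the test set contains the training observations, the reported test performance is partly in-sample.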
The formula for multiple linear regression, which accounts for the effect of several independent variables on a dependent variable under the assumption of linear relationships, is shown as equation 1:

\[ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \varepsilon \tag{1} \]

where $y$ is the dependent variable, $x_i$ is the $i$th independent variable, $\beta_0, \beta_1, \dots, \beta_k$ are the model parameters, $\varepsilon$ is the error term, and $k$ is the number of explanatory variables in the model (Bulut, 2018, p. 233). The forecasting equation for the multiple linear regression model can be expressed as equation 2:

\[ \hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \dots + \hat{\beta}_k x_k \tag{2} \]

In this forecasting equation, the difference between the actual value $y$ and the forecasted value $\hat{y}$ is the residual value ($e$), which can be formulated as in equation 3:

\[ e = y - \hat{y} \tag{3} \]

To convert this forecasting equation into a probability value, the transformation in equation 4 is applied:

\[ P = \frac{\exp(\hat{y})}{1 + \exp(\hat{y})} \tag{4} \]

In the multiple linear regression model equation, the values of the dependent variable $y$ can range over $(-\infty, \infty)$. After the transformation into a probability equation, the probability values lie between 0 and 1.
However, since a categorical dependent variable is needed in the logistic regression model, the formula is transformed again, and the binary logistic regression formula is used in this research. This conversion can be formulated as in equation 5:

\[ \ln\left(\frac{P}{1 - P}\right) = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \dots + \hat{\beta}_k x_k \tag{5} \]

This conversion is described as "logit conversion" (Bulut, 2018, pp. 275-276). Depending on these multivariate sub-factors (independent variables), the model is created to forecast a categorical dependent variable.

Table 2. Data distribution in the model.
Total amount of data              177
Amount of data in category "1"    57
Amount of data in category "0"    120
Training set data distribution    "0" category: 57; "1" category: 57
Test set data distribution        "0" category: 120; "1" category: 57
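The two transformations described above are inverses of each other, which a short numerical sketch can make concrete. The intercept, coefficients, and indicator values below are hypothetical and are not the fitted values of the article's model.

```python
import math


def linear_predictor(intercept, coefs, x):
    """Linear combination of the independent variables (unbounded)."""
    return intercept + sum(b * xi for b, xi in zip(coefs, x))


def to_probability(y_hat):
    """Logistic transformation: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-y_hat))


def logit(p):
    """Logit conversion: log-odds of a probability, the inverse of to_probability."""
    return math.log(p / (1.0 - p))


# Hypothetical coefficients and indicator values, for illustration only.
intercept, coefs = -4.0, [0.5, -0.3]
x = [7.0, 5.0]

y_hat = linear_predictor(intercept, coefs, x)  # -4.0 + 3.5 - 1.5 = -2.0
p = to_probability(y_hat)                      # a value strictly between 0 and 1
print(y_hat, p, logit(p))                      # logit(p) recovers y_hat
```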

Results
The logistic regression model was created with the "glm()" function from R's "stats" package, which is used to fit generalized linear models in the R programming language. Because the dependent variable is categorical, family = "binomial" was specified for the first model. The training set used to create the model is called "trainSet" and is specified in the programming code as "data = trainSet." In the output of the first model, called "modelLogit," the weight coefficients for each independent variable are listed under "coefficients," and since the number of observations in the training set was 114 (consisting of 57 "1" and 57 "0"), the null "degrees of freedom" was determined as n − 1 = 113 (number of observations minus 1).
When the results are analyzed, it is observed that the "null deviance" is 158 and the "residual deviance" is 92.91. It can be interpreted that the model, which started at the "null deviance" value, decreased to the "residual deviance" value once the independent (also called explanatory) variables were added, and its explanatory power increased. The results for the first model are shown in Table 3. Based on this assessment, it can be concluded that the independent variables improve the model.
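The reported deviance figures can be cross-checked by hand. The sketch below assumes the standard definitions for a binary logistic regression on ungrouped data: the null deviance is minus twice the log-likelihood of an intercept-only model, and the AIC equals the residual deviance plus twice the number of estimated parameters (12 indicators plus an intercept for the first model).

```python
import math

# Balanced training set: 57 observations in each category.
n_ones, n_zeros = 57, 57
n = n_ones + n_zeros
p_bar = n_ones / n  # the intercept-only model forecasts 0.5 for every country

null_loglik = n_ones * math.log(p_bar) + n_zeros * math.log(1 - p_bar)
null_deviance = -2 * null_loglik  # about 158.04, matching the reported 158
null_df = n - 1                   # 113

residual_deviance = 92.91         # reported for the first model
n_parameters = 12 + 1             # 12 indicators plus the intercept (assumption)
aic_first_model = residual_deviance + 2 * n_parameters  # about 118.91

print(round(null_deviance, 2), null_df, round(aic_first_model, 2))
```

The result agrees with the AIC of 118.9 that the article reports for the initial model.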
Although the Akaike Information Criterion (AIC) is essential for comparing different models, it cannot be considered significant on its own. The model and the R programming language were used to examine the correlation values between the independent and dependent variables by generating a correlation matrix. Figure 1 presents the correlation matrix created.

When we analyze the values presented in
According to Alpar, variables in a model should have a correlation value between 0.30 and 0.90. Variables outside this range should be adjusted (Alpar, 2013, p. 291).
As can be seen in Figure 1, which was created with the "corrplot" function in the "corrplot" package, used in the R programming language to visualize the correlation between the specified variables, the correlation values between some variables are above 0.9.
A new logistic regression model must therefore be created to remove the independent variables with multicollinearity problems. To create the new model, VIF values are checked one by one: among the independent variables with VIF values above 5, the variable with the largest value is removed, and the process is repeated until the most appropriate model is found.
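The removal loop described above can be sketched in a few lines. This illustration computes each VIF as the corresponding diagonal element of the inverse correlation matrix of the predictors, which is the standard linear-model definition; the article's R workflow may compute VIF differently, and the three toy columns (x1, x2, x3) are hypothetical.

```python
def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * c for v, c in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """VIF of each predictor: diagonal of the inverse correlation matrix."""
    names = list(columns)
    corr = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    inverse = invert(corr)
    return {name: inverse[i][i] for i, name in enumerate(names)}


def drop_high_vif(columns, limit=5.0):
    """Repeatedly drop the predictor with the largest VIF until none exceeds the limit."""
    kept = dict(columns)
    while len(kept) > 1:
        current = vif(kept)
        worst = max(current, key=current.get)
        if current[worst] <= limit:
            break
        del kept[worst]
    return kept


# Hypothetical indicator columns: x2 is almost a copy of x1, x3 is nearly unrelated.
columns = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "x2": [1.1, 2.0, 2.9, 4.2, 5.1, 5.9],
    "x3": [3.0, 1.0, 2.0, 3.0, 1.0, 2.0],
}
scores = vif(columns)
kept = drop_high_vif(columns)
print(scores, sorted(kept))
```

In this toy case one of the two near-duplicate columns is removed and the unrelated column is kept, mirroring the one-at-a-time elimination described in the text.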
The "step" function in the R programming language uses the AIC to choose among candidate forecasting models. Forward selection, backward elimination, and stepwise approaches were tried, and the coefficients with low significance were removed from the model. As a result of these experiments, the stepwise approach was determined to be the most appropriate method. The R programming codes for creating the final logistic model are given in Table 4.
The final model, created by removing the independent variables with low significance, has only four independent variables (C1.Median, E1.Median, E3.Median, and S1.Median) compared to the 12 independent variables at the initial stage. This reduction in the number of independent variables has reduced the complexity of the model. When comparing the AIC of both models, it is noted that the AIC value of the initial model is 118.9, while the AIC value of the final model is 106.3. This decrease in the AIC value indicates an improvement in the model. The model summaries produced by R also show that the independent variables in the final model reach higher significance levels than those in the initial model, as indicated by the significance symbols following the coefficients.

Tests for Model Fitness
When examining goodness-of-fit tests for logistic regression models, it is observed that opinion is divided. One view suggests that logistic regression models should be evaluated based on their forecasting success percentages. This is because the test methods for logistic regression models with categorical dependent variables are less successful than those for linear regression models. On the other hand, some believe that the application of tests is crucial in determining the success of logistic models, and many alternative test methods are available (Allison, 2014, p. 1). A meaningful way to evaluate the goodness of fit of a model is by examining the scatter plots of the residual values. In Figure 2, the residuals vs. fitted graph displays the relationship between the residual and forecasted values but does not yield a significant result. This is because logistic regression outcomes are binary (0 or 1), so it is normal not to obtain results as clear as in linear regression. The Normal Q-Q plot in the upper right corner of Figure 2 shows the residual values concentrated along the normality line, which supports the model fit. The "scale-location" plot in the lower left corner displays the relationship between the square root of the residual values and the forecasted values.
If the square roots of the residual values are spread evenly around the line indicated on the graph, follow no specific pattern, and are homogeneously distributed over the entire graph, this is considered an indicator of constant variance (homoscedasticity) (Bobbitt, 2020; Moreno, 2019). The "scale-location" plot shows that the constant-variance assumption cannot be accepted. Although the square roots of the residual values lie equally above and below the red horizontal line, they are not homogeneously distributed on the graph and form a distinct pattern. Therefore, the "binned residual plot" method, which provides good results in checking model fit in logistic regression models for categorical variable estimation, is used to determine the model fit.
In the "Residuals vs Leverage" graph shown in Figure 2, each observation is represented as a point.
The acceptable limits, or "Cook's distance," are marked by a dashed red line. Based on the graph, we can conclude that all observations fall within Cook's distance limits, indicating no extreme values in the model. In the "binned residual plot" graph shown in Figure 3, the fact that most of the observations are within the limits indicates the appropriateness of the model (Kasza, 2015). Although a few observations in the binned residual plot of the final model are outside the limits, they do not form a specific pattern; therefore, the model is acceptable.
Other methods that can be applied for model fit include the Hosmer-Lemeshow goodness-of-fit test, whose hypotheses are as follows. H0: There is no discrepancy between values forecasted and values observed. H1: There is a discrepancy between values forecasted and values observed.
Since the p-value obtained from the Hosmer-Lemeshow goodness-of-fit test is 0.5137, which satisfies the p > 0.05 criterion at the 95% confidence level, the hypothesis H0 cannot be rejected, and there is no evidence of a discrepancy between the observed and forecasted values. Failing to reject H0 is the desired result in the Hosmer-Lemeshow goodness-of-fit test, so the model is appropriate according to this test.
With "PseudoR2," a function in the "DescTools" package, the McFadden test can be easily implemented in the R programming language. The McFadden measure is based on log-likelihood values and is suited to testing the appropriateness of logistic regression models (Bartlett, 2014).
McFadden values between 0.2 and 0.4 indicate that the model is good (Bartlett, 2014).
The value of this test for the final model was 0.3908607, which confirms model suitability.
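The McFadden value can be tied back to the deviance and AIC figures reported for the models. The sketch below assumes the measure was computed on the balanced training set of 114 observations and that the final model estimates five parameters (four indicators plus an intercept); for ungrouped binary data, McFadden's measure equals one minus the ratio of residual to null deviance.

```python
import math

null_deviance = 2 * 114 * math.log(2)  # intercept-only model on 57 "1" and 57 "0"
mcfadden_r2 = 0.3908607                # reported for the final model

# McFadden: 1 - (residual deviance / null deviance), rearranged for the residual deviance.
residual_deviance_final = null_deviance * (1 - mcfadden_r2)

n_parameters = 4 + 1                   # four indicators plus the intercept (assumption)
aic_final_model = residual_deviance_final + 2 * n_parameters

in_good_range = 0.2 <= mcfadden_r2 <= 0.4
print(round(residual_deviance_final, 2), round(aic_final_model, 2), in_good_range)
```

Under these assumptions the implied AIC is about 106.3, consistent with the value reported for the final model.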
The histogram of the residual values in Figure 4 shows a distribution close to normal.
In analyzing the final model, it is helpful to examine how the independent variables interact with the dependent variable. To do so, we can plot the output of the "allEffects" function in the R programming language's "effects" library, as shown in Figure 5. From the graph, we can see that the independent variables C1.Median, E1.Median, E3.Median, and S1.Median have a linear relationship with the dependent variable ConflictStatus, which is categorical and represented by 1 and 0. Specifically, there is a positive linear relationship between C1.Median, E3.Median, S1.Median, and ConflictStatus, while E1.Median has a negative linear relationship.
After conducting the fit tests for the model, the necessary codes were generated to make forecasts.
The model's forecasts are based on probabilities ranging from 0 to 1. These probabilities were converted to binary values. They can also be converted to percentages to express the likelihood of intrastate armed conflict. However, percentages alone cannot determine which countries are forecasted to experience intrastate armed conflict. For this reason, binary logistic regression is a better method for this research: using binary values for forecasting is more appropriate for naming the countries concerned.
In the R programming language, the standard threshold value for generating the results as "1" and "0" is set as 0.5. However, the "optimalCutoff" function within the "InformationValue" library makes it possible to determine the most appropriate threshold value, resulting in better forecasting by the model. The forecasts were generated using a threshold value of 0.736755.
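The idea behind searching for a threshold can be sketched as follows. This illustration picks the cutoff that minimizes the misclassification rate on toy data; it is an assumption that this matches the criterion used in the study, and the probabilities below are hypothetical.

```python
def misclassification_error(actuals, scores, cutoff):
    """Share of observations whose thresholded forecast differs from the actual class."""
    wrong = sum((s >= cutoff) != bool(a) for a, s in zip(actuals, scores))
    return wrong / len(actuals)


def best_cutoff(actuals, scores):
    """Try every observed probability as a cutoff and keep the lowest-error one."""
    candidates = sorted(set(scores))
    return min(candidates, key=lambda c: misclassification_error(actuals, scores, c))


# Hypothetical actual classes and forecasted probabilities.
actuals = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.10, 0.25, 0.40, 0.55, 0.60, 0.70, 0.80, 0.95]

cutoff = best_cutoff(actuals, scores)
default_error = misclassification_error(actuals, scores, 0.5)
tuned_error = misclassification_error(actuals, scores, cutoff)
print(cutoff, default_error, tuned_error)
```

Raising the threshold above 0.5 trades missed "1" cases for fewer false alarms, which is consistent with the pattern in the reported results (high success for "0", lower for "1").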
Out of the 120 values in the data with "0" (no armed conflict), 115 were correctly forecasted. However, out of the 57 values with "1" (armed conflict), only 32 were correctly forecasted.

Tests for Forecasting Performance of the Model
The ROC curve and AUC values are used to measure the ability of a model to make accurate forecasts. This is done by comparing the forecasted values to the actual values. According to El Khouli et al. (2009), models with an AUC value between 0.9 and 1 are considered excellent, those between 0.8 and 0.9 are good, those between 0.7 and 0.8 are normal, those between 0.6 and 0.7 are poor, and those between 0.5 and 0.6 are unsuccessful (El Khouli et al., 2009, p. 1001). The ROC curve graph is presented in Figure 6. The final model's AUC value is 0.8642, which is classified as good.
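The AUC can be read as the probability that a randomly chosen conflict country receives a higher forecasted probability than a randomly chosen non-conflict country. The sketch below computes it that way on hypothetical data and applies the classification bands quoted above; the handling of values that fall exactly on a band boundary is an assumption.

```python
def auc(actuals, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for a, s in zip(actuals, scores) if a == 1]
    neg = [s for a, s in zip(actuals, scores) if a == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_band(value):
    """Bands quoted from El Khouli et al. (2009); boundary handling is an assumption."""
    if value >= 0.9:
        return "excellent"
    if value >= 0.8:
        return "good"
    if value >= 0.7:
        return "normal"
    if value >= 0.6:
        return "poor"
    return "unsuccessful"


# Hypothetical actual classes and forecasted probabilities.
actuals = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.10, 0.25, 0.40, 0.55, 0.60, 0.70, 0.80, 0.95]

toy_auc = auc(actuals, scores)  # 14 of 15 pairs are ranked correctly
print(round(toy_auc, 4), auc_band(toy_auc), auc_band(0.8642))
```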
When the forecasts of the final model are evaluated, approximately 83% are correct, and 17% are incorrect.
When the data in the Confusion Matrix regarding the model's forecasting performance is analyzed, the correct forecasting rate is given by the "accuracy" parameter. For the model's forecasting performance to be considered suitable, the "accuracy" value should be greater than the "no information rate" value. The 95% confidence interval for accuracy ranges from 0.767 to 0.8826. Since the probability value of "P-Value [ACC > NIR]" is below 0.01, the criterion for the suitability of the model's performance is met at a confidence level of over 99%.
In addition to these values, the Kappa statistic in the Confusion Matrix is used to evaluate the similarity between forecasted and actual values. As an evaluation criterion, there is no similarity for values less than 0, values between 0.2 and 0.4 are characterized as low similarity, values between 0.4 and 0.6 as moderate similarity, values between 0.6 and 0.8 as good similarity, and values between 0.8 and 1 as excellent similarity (McHugh, 2012). The Kappa statistic was found to be 0.57 for the final model. Based on this result, the similarity between the model's forecasted and actual values can be evaluated as moderate.
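The accuracy and Kappa figures follow directly from the counts reported earlier (115 of 120 "0" cases and 32 of 57 "1" cases forecasted correctly). The sketch below recomputes them; treating the no information rate as the share of the largest class is an assumption about how it was obtained.

```python
# Confusion-matrix counts taken from the reported results.
correct_0, wrong_0 = 115, 5   # actual "0": forecasted "0" / forecasted "1"
correct_1, wrong_1 = 32, 25   # actual "1": forecasted "1" / forecasted "0"
total = correct_0 + wrong_0 + correct_1 + wrong_1  # 177 countries

accuracy = (correct_0 + correct_1) / total  # about 0.83

# No information rate: accuracy of always forecasting the largest class (assumption).
no_information_rate = max(correct_0 + wrong_0, correct_1 + wrong_1) / total

# Cohen's Kappa: agreement beyond what the marginal totals would produce by chance.
forecast_0 = correct_0 + wrong_1  # all countries forecasted as "0"
forecast_1 = correct_1 + wrong_0  # all countries forecasted as "1"
actual_0 = correct_0 + wrong_0
actual_1 = correct_1 + wrong_1
expected = (forecast_0 * actual_0 + forecast_1 * actual_1) / total ** 2
kappa = (accuracy - expected) / (1 - expected)

print(round(accuracy, 4), round(no_information_rate, 4), round(kappa, 2))
```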
The model was developed using the R programming language. Based on the model, it was observed that the category "0" can be forecasted with a precision of 82%. This value represents the proportion of correct forecasts among all forecasts made for that category. Additionally, the sensitivity value of the final model was calculated, and the result obtained was 0.9583333; that is, the sensitivity of the final model for the "0" category is 96%.
A good model should have sensitivity and precision values evaluated together, with both values approaching 1.
This can only be accomplished if the model misforecasts neither the "1" nor the "0" values. The F1 score, which can evaluate both values simultaneously, is widely used to measure the model's success. The Confusion Matrix indicates the precision, recall, and F1 scores. In this study, the F1 score for the final model is 0.8846154. When the number of positive and negative cases differs, as in the dataset used in this study, it is more appropriate to check the F1 value instead of the precision and recall values separately.
When analyzing the final model's forecasting performance for the value "1," representing the occurrence of armed conflict, it was found that the model's precision value is 0.8649, sensitivity value is 0.5614, and F1 score is 0.6809. The "balanced accuracy" value, which evaluates the overall forecasting performance of the model in both categories simultaneously, is observed to be 0.7599.
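The per-category figures above can be reproduced from the same counts (115 and 5 for actual "0", 32 and 25 for actual "1"). The sketch below is a plain recomputation, not the R output itself.

```python
def class_metrics(true_pos, false_pos, false_neg):
    """Precision, sensitivity (recall), and F1 score for one category."""
    precision = true_pos / (true_pos + false_pos)
    sensitivity = true_pos / (true_pos + false_neg)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return precision, sensitivity, f1


# Category "0": 115 correct, 25 conflict countries wrongly forecasted as "0", 5 missed.
precision_0, sensitivity_0, f1_0 = class_metrics(true_pos=115, false_pos=25, false_neg=5)

# Category "1": 32 correct, 5 peaceful countries wrongly forecasted as "1", 25 missed.
precision_1, sensitivity_1, f1_1 = class_metrics(true_pos=32, false_pos=5, false_neg=25)

# Balanced accuracy: mean of the two sensitivities.
balanced_accuracy = (sensitivity_0 + sensitivity_1) / 2

print(round(precision_0, 4), round(sensitivity_0, 4), round(f1_0, 4))
print(round(precision_1, 4), round(sensitivity_1, 4), round(f1_1, 4))
print(round(balanced_accuracy, 4))
```

The gap between the two sensitivities (about 0.96 against 0.56) is what pulls balanced accuracy below overall accuracy.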

Discussions and Conclusions
This study has created a binary logistic regression model that forecasts a country's likelihood of armed conflict. The forecast is based on the median of the country's fragility data from the previous ten years. The fragility data uses twelve sub-indicators, each tested in the machine-learning model. These models are based on the relationship between the median value of the annual fragility data between 2007 and 2012 and the occurrence of armed conflict from 2012 to 2022. The logistic regression method is used to create the model in the R programming language. With this model, it is possible to forecast the likelihood of armed conflict in the future by adding new fragility data. As explained in detail in the research methodology review, graphical evaluations and model goodness-of-fit tests are conducted to assess the model's accuracy. After these tests, it can be concluded that the model is appropriate. The logistic regression model has been improved with these tests and methods to enhance the goodness of fit. It achieves an AUC value of 0.8642 using only four of the 12 fragility indicators. Upon analyzing the forecasting performance of the final logistic regression model, it was observed that the model forecasts the category "0" (representing "no armed conflict") with better efficacy, while it obtains relatively unsuccessful results in category "1" (representing "armed conflict will occur"). The final logistic regression model for forecasting armed conflict uses "security apparatus" (C1.Median), "economic decline" (E1.Median), "human flight and brain drain" (E3.Median), and "demographic pressure" (S1.Median) as sub-indicators of fragility. Other indicators are not included due to multicollinearity, which negatively affects overall performance. The sub-indicators impact each other when forecasting an armed conflict; hence, the adjustment was implemented in the logistic regression model. The four selected indicators (out of the 12 sub-indicators) significantly affected the occurrence of armed conflict. However, it would be a mistake to assume that the excluded indicators do not affect the occurrence of armed conflict.
The identified indicators are just the most successful ones in explaining the model. Adding more indicators to the model may make it more complex and reduce its forecast power.
The R programming language was used to create a logistic regression model that can forecast whether an armed conflict will occur in a country as "armed conflict occurs" or "armed conflict does not occur," represented by "1" and "0." As a result of the fit tests for model suitability, it was concluded that the model reached an AUC of about 0.86 and was successful based on this value.
According to the evaluation of the model's forecasting performance, the model correctly forecasted 115 out of 120 values for the category "0," representing "armed conflict does not occur." In comparison, it could correctly forecast only 32 out of 57 values for the category "1," representing "armed conflict occurs." Based on these results, when the overall forecasting accuracy of the model is evaluated for a total of 177 countries, it can be assessed that the model is quite successful in the "0" category, with a correct forecasting rate of around 95%, and relatively unsuccessful in the "1" category, with a correct forecasting rate of about 56%. It is presumed that this difference in performance between categories arises because the available data is limited to only 16 years and that better performance can be achieved as more data accumulates over the years. In addition, since the number of countries with armed conflicts is considerably lower than the number of countries without armed conflicts in the available data, more data is needed to make more successful forecasts in both categories.
In social sciences, many models that utilize data science are used due to the increasing amount of data.
However, most of these are linear and explanatory models aiming to determine the relationship between variables. Non-linear models that aim to forecast are generally used in limited areas, such as forecasting election results (Grimmer et al., 2021, p. 398). The binary logistic regression model created in this research applies a method rarely used in the social sciences: using data science to forecast categorical data. The study is expected to constitute a starting point for future models, since it shows that a model built with the R programming language can forecast the occurrence of armed conflict in a country from poverty and fragility data. To improve the model, it would be helpful to increase the amount of data over the years and add new independent variables to explain the outbreak of conflict better.
By transferring the median values of the fragility data between 2016 and 2022 to the model, countries with a high probability of armed conflict between 2022 and 2032 were identified. Based on this calculation, it is forecasted that 43 of the 177 countries included in the calculation may experience internal armed conflict between 2022 and 2032. Note that these results are statistical calculations. Although it is impossible to make a definitive judgment, it would be helpful to consider them as a risk assessment for the specified countries. The countries with a high risk of armed conflict between 2022 and 2032, according to the values of their fragility indicators, are indicated in Table 5.

Social Inclusion • 2024 • Volume 12 • Article 8396

Table 1. Fragile States Index and indicators.
Columns: Fragile States Index main groups; Sub-indicators; Description of indicator in the model.
Table 3 for the first model, it becomes apparent that the coefficients
The presence of multicollinearity in a forecasting model negatively affects its performance. Unpredictable relationships between independent variables can cause this issue (Pennsylvania State University, 2018). To eliminate the problem of multicollinearity between independent variables and indirectly reduce variance inflation, we can examine variance inflation factor (VIF) values. This issue can be mitigated by removing the independent variables with high VIF values from the model. To remedy this problem, a correlation matrix can be created for the model.

Table 4. Appropriate model selection with the stepwise method call.
After conducting tests to evaluate the model's accuracy and forecasting capability, codes were developed to monitor the forecasted values within the model. This allowed for comparing the forecasted and actual values for a specified country. The overall steps for the model creation are described in Figure 7.

Table 5. Countries at high risk of armed conflict between 2022 and 2032, according to the model.