Measuring the unmeasurable multidimensional poverty for economic development: Datasets, algorithms, and models from the poorest region of Luzon, Philippines

Poverty is among the oldest social problems and is difficult to reverse. It is multidimensional and often regarded as unmeasurable, so measuring rural multidimensional poverty by decomposing it is critical. Most poverty studies are generic, exposed to large sampling errors, and intended for macroeconomic decisions. Measuring poverty for a specific locality under various configurations is therefore crucial for economic development. This work presents a processed and analyzed dataset from the large community-based monitoring system of Goa, Camarines Sur. The locale is situated in the poorest district of the poorest province in the poorest region of Luzon, Philippines. Research about poverty in this area is limited, and measurement of poverty at the level of a specific locality is scarce. The datasets contain the multidimensional poverty indicators, health and nutrition, housing and settlement, water and sanitation, basic education from elementary to senior high school, income classifications, employment and livelihood, peace and order, a summary of calamity occurrences experienced by residents, disaster risk reduction preparedness, figures of diagnostic analytics, tables of descriptive analytics, poverty analytics, measurement of decomposed poverty, a summary of disaggregated configurations, graphs of predictive and prescriptive analytics, and population dynamics. This work is vital for analyzing poverty through rural and multidimensional approaches using poverty incidence, poverty gap, severity statistics, the Watts index, and classifications. It may also serve as a basis for measuring poverty in nearby regions and nations that use complete enumeration of households and their members. Using the analyzed and processed data, further classifications and regressions can be done. The data can be freely used by the government, private organizations, charitable institutions, businesses, academia, and researchers to target policies.
An advantage of the dataset is that it addresses multifaceted poverty, which requires different interventions. It will facilitate the creation of programs to alleviate poverty and promote local economic development.


Keywords: Rural poverty; Calamity occurrences; Disaster risk preparedness; Advanced econometrics; Data analytics; Regression; Policy; Philippines

[...] 63,749 members that are being evaluated. Four groups were created, each of which served as the base for the multisectoral analysis [1]. The four sectors were classified using the municipality's sectoral classification system. The researchers served as the analysts and statisticians of the data to derive insights, create predictive summaries, perform advanced econometrics, and provide policies for economic development. The Partido State University and University of the Philippines Los Baños are authorized to utilize and analyze the datasets and disseminate their immediate summaries and findings for the socio-economic advancement of Filipinos.

Value of the Data
• Measuring multidimensional poverty is difficult and challenging, and it is therefore deemed unmeasurable, especially for rural areas and specific localities. The dataset is useful for providing information in designing programs and policies for poverty alleviation and economic development that are well-targeted for the poorest regions.
• The datasets will provide baseline data and indicators of multidimensional poverty for other developing countries and poor regions of the world to devise policies for economic development. Future researchers may adopt the data analytics procedures, variables used, methods employed, policy proposals, and computational styles to create similar studies on measuring the unmeasurable aspects of multidimensional poverty for other poor regions.
• The poverty analytics dataset can be used to compute, verify, and simulate poverty incidence, poverty gap, severity statistics, the Watts index, and classifications for various sectors and locales of poor regions.
• The processed datasets provide descriptive and diagnostic analytics that measure population dynamics, health and nutrition, housing and settlement, water and sanitation, basic education from elementary to senior high school, income classifications, employment and livelihood, peace and order, a summary of calamity occurrences experienced by residents, and disaster risk reduction preparedness.
• Local government units, national government agencies, researchers, scholars, extensionists, policy-makers, academicians, charitable institutions, and social enterprises will find the prescriptive analytics dataset useful in designing programs and monitoring impacts for socioeconomic advancement, not just in the Philippines but in the rest of the world.
• The predictive analytics dataset and econometric models could provide empirical evidence to support poverty theories in rural studies, adding to the scarce literature on multidimensional poverty measurement in the Philippines and other poor countries.

Background
Poverty has long been one of the most persistent social problems and has not yet been fully resolved. Haughton and Khandker (2009) claim that it has a negative effect on the growth of society and economic prosperity [2]. It is difficult to handle and reverse. Numerous economists have tried to explain the causes of global poverty and the changes required. Nonetheless, assessing poverty in rural areas is more challenging, and there is a dearth of published research on the subject. Thus, researchers have been compiling the necessary datasets from large community-based monitoring systems to measure multidimensional poverty in their region and to arrive at sound policy interventions that can alleviate poverty and promote economic development. Measuring the unmeasurable has been attempted in the natural sciences by various researchers [3, 4, 7, 11]. However, economic issues and social events are difficult to quantify and challenging to analyze, and some are unmeasurable [10]. Research on the multifaceted nature of poverty in rural areas of the Philippines, the decentralization of poverty cases, economic progress, and sociological advancement is limited, particularly in the Bicol region. Moreover, only a few studies have used complete enumeration techniques to examine the depth of poverty. In the Partido district, no research has been done to quantify poverty using predictive analytics and advanced econometrics. The Bicol region is the poorest region in Luzon, and Camarines Sur is the poorest province in the Bicol region, with a poverty incidence of 38.7% [6]. Thus, the dataset is vital in addressing multidimensional poverty and promoting economic growth, not just in this locale but, through the models and indicators used in this work, in the rest of the world.

Data Description
Poverty is a substantial absence of well-being [2]. It has no standard definition because poverty is notoriously difficult to measure. Different countries or regions define poverty differently, and each definition is based on a poverty line or food threshold. Countries or regions around the globe have different poverty lines (z) or food thresholds (f). Thus, it is important to note that modeling poverty depends on the region or locality. For instance, in the Philippines, a household is deemed poor if it earns PhP10,481 or below (around 186.18 USD) based on 2019 statistics. Considering the foregoing, the datasets are very useful in understanding and modeling poverty, not just for the specific locale of this work; they may also serve as a benchmark in measuring poverty around the globe. The constructs and indicators provided with this work are generally reproducible by scholars across regions.
These variable sets are helpful when modeling and analyzing poverty using multidimensional parameters. The community-based monitoring system (CBMS) was mined, wrangled, and clustered; models were fitted through all-in, bidirectional, forward selection, and backward elimination approaches; variables were categorized and related work was reviewed; the local government provided a benchmark; poverty was physically observed through in-person visits; and data transparency and availability were taken into consideration. The poverty models presented in this work are intended to describe, diagnose, and predict multidimensional poverty through multiple constructs and indicators. By utilizing the datasets and generating useful statistics and analytics, a prescription to alleviate poverty and promote economic development could be provided.
The dataset in the repository contains seven components, namely: the multidimensional poverty variables, the population dynamics, the descriptive analytics, the diagnostic analytics, the poverty analytics, the predictive policy, and the prescriptive analytics [1]. The multidimensional poverty variables are contained in an Excel file with 34 indicators at household magnitude (21) and proportion measurements (15) of 4 sectors (Isarog, Poblacion, Ranggas, and Salog) of 34 barangays.
Table 1 reveals 31 variables at magnitude measurements with corresponding descriptions and definitions. These sets of variables are very useful in multidimensional poverty modeling and analysis with multidimensional constructs. The indicators were chosen based on the clustering, mining, and wrangling of the community-based monitoring system (CBMS); model fitting through all-in, bidirectional, forward selection, and backward elimination approaches; the authors' regression and classification; a review of related work; a benchmark from the local government; physical observation of poverty through personal visits; and data availability and transparency.
Table 2 depicts 17 variables at proportion measurements that can be utilized for descriptive, diagnostic, predictive, and prescriptive modeling and analysis of poverty at various configurations. They were selected based on the core poverty indicators of the community-based monitoring system (CBMS), empirical evidence of causation from the authors' various models, a review of related literature, a benchmark from the local government unit, physical observation of the faces of poverty, and data availability and transparency. Table 3 summarizes the 17 variables with the 7 main poverty clusters at magnitude and proportion measurements. These clusters are vital for poverty characterization and visualization purposes.
Table 4 lists the 4 primary sectors of the municipality. The four sectors were classified using the municipality's sectoral classification system. The Isarog sector takes its name from Mt. Isarog, the tallest forested peak in Southern Luzon. It is also known as the Upland Sector because 12 of its barangays are situated on the slopes of Mt. Isarog. The Poblacion sector serves as the commercial and economic hub and consists of 10 barangays. The Salog sector takes its name from the Bicol word for "riverine," which describes towns near rivers. All barangays in the Salog sector have access to several streams. The Ranggas is the longest river in the area, and it runs from Mt. Isarog to the seaside communities of Partido.
Fig. 1 shows an output of the dataset concerning the multisectoral calamity occurrences across all the sectors. It was processed through descriptive analytics by summarizing all occurrences of natural disasters among the 4 sectors. Based on the results, it is evident that the region is prone to natural hazards, particularly typhoons. The majority of households suffer from the onslaught of typhoons because the area lies along the typhoon belt of the Philippines. Fig. 2 is an output of diagnostic analytics in which the disaster risk preparedness of households is analyzed and visualized. More households in the Poblacion sector are prepared compared with the other sectors. This output yields insights into disaster risk preparedness in connection with calamity occurrence. Fig. 3 depicts the results of poverty analytics on the district and is complemented by Table 5. Four metrics were utilized: the headcount ratio, which measures the incidence of poverty; the gap ratio, which measures the depth of poverty; the severity ratio, which measures the intensity of poverty; and the Watts index, which measures the degree of poverty with regard to its distribution. The poverty incidence is the proportion of the population that lives in poverty. The headcount ratio has the drawback of ignoring the degree of poverty: as the poor get poorer, the headcount index stays the same. Thus, the poverty gap index is necessary. It expresses the average income shortfall in the population as a percentage of the poverty line. It assesses how far, on average, the impoverished fall below the poverty line in order to establish the depth of poverty. To assess the intensity of poverty, the squared poverty gap index is derived. By squaring each poverty gap, the statistic gives more weight to incomes that fall further below the poverty line. It is the weighted sum of poverty gaps, where each weight is proportional to the size of the gap. The Watts index decomposition was made to reflect the total distributional property of poverty. Fig. 4 illustrates the classification results from predictive analytics, which assess the performance of the model. It shows combinations of interacting variables, such as the number of households, access to sanitary toilet facilities, and informal settlement, in connection with the probability of being poor or living below the poverty line. The graphs provide significant information on the empirical relationship and causation of variables, which may be reconstructed using other variables or reproduced in other poor regions.
The researchers also employed a variety of interaction variables, as the figure illustrates. Four interactions have an adverse effect on poverty outcomes: households without access to safe water × household members; households without access to safe water × informal settlers; households without access to safe water × households living in temporary housing; and households without access to a sanitary toilet facility × households living in temporary housing. As housing, water, and sanitation indicators improve, the chance of a family becoming impoverished decreases. Informal settlers are more likely than formal settlers to experience poverty. Additionally, as the number of household members rises, so does the likelihood of becoming poor. The chance that a household will fall below the poverty line is lowest for households with fewer members and access to a sanitary toilet; it then rises for households with a large number of members and access to a sanitary toilet; finally, it falls for households without access to a sanitary toilet and fewer members. The findings demonstrate that household size, a strong predictor of poverty, is also an important factor influencing health dynamics. These results can be reused or replicated by other scholars and researchers of poverty in poor regions of the world.
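To make the role of an interaction variable concrete, the following minimal Python sketch computes the predicted probability of being poor from a logit with one interaction term (household size × lack of a sanitary toilet). The coefficient values and the names `COEF` and `poverty_probability` are hypothetical and chosen only for illustration; they are not estimates from the dataset.

```python
import math

def inverse_logit(log_odds):
    """Map a log-odds value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical coefficients, for illustration only (not estimates from the dataset).
COEF = {
    "intercept": -2.0,
    "household_size": 0.25,      # effect per additional household member
    "no_sanitary_toilet": 0.8,   # 1 = household lacks a sanitary toilet
    "size_x_no_toilet": 0.1,     # interaction: household size x no sanitary toilet
}

def poverty_probability(household_size, no_sanitary_toilet):
    """Predicted probability of being poor for one household profile."""
    interaction = household_size * no_sanitary_toilet
    log_odds = (
        COEF["intercept"]
        + COEF["household_size"] * household_size
        + COEF["no_sanitary_toilet"] * no_sanitary_toilet
        + COEF["size_x_no_toilet"] * interaction
    )
    return inverse_logit(log_odds)

# Compare small and large households with and without a sanitary toilet.
for size in (3, 8):
    for no_toilet in (0, 1):
        p = poverty_probability(size, no_toilet)
        print(f"size={size}, no_toilet={no_toilet}: p={p:.3f}")
```

With positive coefficients such as these, the predicted probability rises with household size and with lack of sanitation, and the interaction term makes the sanitation penalty larger for bigger households. Actual signs and magnitudes would have to come from the fitted models.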

Experimental Design, Materials and Methods
The researchers generated, filtered, cleaned, and analyzed the dataset from the large community-based monitoring system of the local government unit of Goa, Camarines Sur, covering 2018-2020. The researchers utilized a causal-explanatory design, predictive analytics, and advanced econometric models to generate the processed datasets. The researchers are authorized to process the dataset for policy-making purposes. In data generation and analysis, the researchers used MS Excel, R, Stata, Python, and SPSS to disaggregate, decompose, and scrutinize the datasets.
Constructs are the fundamental theoretical concepts or ideas that we attempt to quantify or examine. We derive these abstract and frequently unobservable ideas from economic theories or assumptions. The multiple constructs used in measuring multidimensional poverty are depicted through our models and analytics. Indicators are the observable and measurable variables that stand in for the constructs. We employ these particular measures or data points to quantify or capture the relevant constructs.
Table 6 summarizes the multidimensional poverty indicators composed by the authors that can be utilized by the public for economic development and poverty alleviation efforts. There are 2 dependent variables: poverty outcomes at the income (z) threshold (poverty based on a poverty line set by a specific region or country), and poverty outcomes at the food (f) threshold (poverty based on a food line set by a specific region or country). Nineteen multidimensional poverty indicators may be utilized in the models as independent variables, together with 13 interacting variables. These indicators were chosen based on the authors' econometric modelling, clustering, benchmarks from related literature, data availability, and diagnostic analytics.
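As an illustration of how raw household records could be coded into the two dependent variables and a few binary indicators, consider the minimal Python sketch below. The field names, the function `code_household`, and the food threshold value are hypothetical. The income threshold reuses the PhP 10,481 figure quoted earlier in the text, and incomes must be expressed in the same unit and period as the thresholds.

```python
# Income threshold (z): the PhP 10,481 figure quoted in the text.
INCOME_THRESHOLD_Z = 10481.0
# Food threshold (f): hypothetical placeholder value, not taken from the dataset.
FOOD_THRESHOLD_F = 7000.0

def code_household(record):
    """Code one raw household record (dict with hypothetical field names)
    into binary (0/1) indicators and a household-size count."""
    income = record["income"]
    return {
        # Dependent variables: poor at the income (z) and food (f) thresholds.
        "poor_income": 1 if income <= INCOME_THRESHOLD_Z else 0,
        "poor_food": 1 if income <= FOOD_THRESHOLD_F else 0,
        # Independent variables coded as 1 = condition present, 0 = absent.
        "makeshift_housing": int(record["housing"] == "makeshift"),
        "informal_settler": int(record["tenure"] == "informal"),
        "no_safe_water": int(not record["safe_water"]),
        "no_sanitary_toilet": int(not record["sanitary_toilet"]),
        # Continuous or count variable.
        "household_size": len(record["members"]),
    }

example = {
    "income": 9000.0,
    "housing": "makeshift",
    "tenure": "formal",
    "safe_water": True,
    "sanitary_toilet": False,
    "members": ["head", "spouse", "child"],
}
print(code_household(example))
```

Coding every indicator so that 1 marks the deprived state keeps the expected signs easy to read: a positive coefficient then means the deprivation is associated with a higher chance of being poor.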
To determine the attributes of variables that influence poverty status, a study may utilize the various models devised by Filipino economists who work on poverty economics and economic development, with modifications and multiple fittings [5, 8, 9]. The poverty statuses of households based on income and food thresholds are the dependent variables, while multidimensional variables, calamity occurrences, and disaster risk preparedness are the independent factors. In addition, the models include a number of control variables and intervening variables.
Where:
Y = logit(p) = log[p / (1 - p)], with p = probability that a household or individual is poor;
α = the intercept, or individual effects of socio-economic conditions, education, health and nutrition, water and sanitation, housing and settlement, employment and livelihood, peace and order, calamity occurrences, and disaster risk preparedness, which is assumed to be constant;
X = vector of independent variables (socio-economic conditions, education, health and nutrition, water and sanitation, housing and settlement, employment and livelihood, peace and order, calamity occurrences, and disaster risk preparedness), including control variables;
β = vector of coefficients, intercepts, or effects of these variables on poverty outcomes;
i = intervening variables, or combined effects of the various conditions listed above; and
μ = error term.
Logistic regression was employed to reveal the link between multidimensional variables and poverty outcomes. The econometric models were used for predictive analytics. This is an econometric design concerned with establishing cause and effect between given variables with binary outcomes in a rural setting. The logit models in this study (Models 1 and 2) were estimated using the terms defined above.
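For readers who want to see the mechanics, the sketch below fits a logit of the generic form logit(p) = α + βX by batch gradient ascent on the log-likelihood, then reports odds ratios and a classification matrix at a 0.5 cutoff. It is a standard-library illustration under stated assumptions, not the authors' estimation routine (the text reports using Stata, R, and other packages). The toy data and the two regressors are hypothetical.

```python
import math

def sigmoid(t):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-t))

def linear_predictor(w, x):
    """alpha + beta . x, where w[0] is the intercept."""
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))

def fit_logit(X, y, lr=0.5, epochs=5000):
    """Estimate the intercept and coefficients by batch gradient ascent."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = yi - sigmoid(linear_predictor(w, xi))
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w

def classification_matrix(X, y, w, cutoff=0.5):
    """Count true/false positives and negatives at the given cutoff."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for xi, yi in zip(X, y):
        pred = 1 if sigmoid(linear_predictor(w, xi)) >= cutoff else 0
        key = ("t" if pred == yi else "f") + ("p" if pred == 1 else "n")
        counts[key] += 1
    return counts

# Hypothetical toy data: columns are [no_safe_water, informal_settler]; y = 1 if poor.
X = [[1, 1]] * 4 + [[1, 0]] * 3 + [[0, 1]] * 3 + [[0, 0]] * 4
y = [1, 1, 1, 0] + [1, 1, 0] + [1, 0, 0] + [0, 0, 0, 1]

w = fit_logit(X, y)
odds_ratios = [math.exp(b) for b in w[1:]]
matrix = classification_matrix(X, y, w)
print("coefficients:", [round(b, 3) for b in w])
print("odds ratios:", [round(r, 2) for r in odds_ratios])
print("classification matrix:", matrix)
```

An odds ratio above 1 means the indicator raises the odds of being poor, holding the other regressor fixed. For real data a statistical package is preferable, since it also reports standard errors and p-values.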

Poverty in multidimensional constructs may be measured by the abovementioned indicators. Because poverty is multidimensional, it must be captured by multidimensional variables. These variables include health and nutrition, housing and settlement, water and sanitation, basic education, livelihood and income, employment, natural disasters, disaster risk preparedness, and peace and order, with the presence of intervening variables. The model was designed by combining economic theories from microeconomics and development economics, mathematical theories, and statistical principles. Various indicators must be transformed into binary, categorical, and continuous outcomes to be used effectively. Proper coding of the data may be achieved by following the data descriptors in Table 6. The a-priori section gives the expected sign of each coefficient, which is useful in poverty prediction. Positive coefficients are expected to increase poverty outcomes measured by income and food, while negative coefficients are expected to reduce them. The model that we have constructed is a product of poverty econometrics that validates correlation and causation from large data inputs through multidimensional constructs. By utilizing the indicators, the coefficients and odds ratios of the independent variables may be measured, providing significant insights into the probability of poverty outcomes in the municipality under the ceteris paribus assumption.

The headcount ratio (HCR) calculates the proportion of the population that is impoverished. The indicator function I(·) returns 1 when the expression in brackets is true and 0 when it is false. A household is deemed poor if its income (y_i) is less than the poverty line (z), in which case I(·) equals 1. The readability and simplicity of the headcount index are its main benefits. One drawback of the headcount ratio is that it does not take into account the severity of poverty: as the poor get poorer, the headcount index does not change [2].

Poverty Gap Metrics
$$P_1 = \frac{1}{N}\sum_{i=1}^{N}\frac{G_i}{z}$$
where $G_i = (z - y_i)\, I(y_i < z)$. The poverty gap index indicates the depth of poverty. It is defined as the average poverty gap in the population, with the non-poor having a zero gap, expressed as a proportion of the poverty line. By calculating the average distance below the poverty line, it establishes the level of poverty. The indicator is closer to zero when the proportion of the population living in poverty is lower and closer to one when that proportion is higher [2].

Poverty Severity. Squared Poverty Gap Index.
$$P_\alpha = \frac{1}{N}\sum_{i=1}^{N}\left(\frac{G_i}{z}\right)^{\alpha}, \quad \alpha \ge 0$$
where α = sensitivity of the index to poverty; z = poverty line; y_i = value of expenditure (income) per capita for the household of the i-th person; and G_i = z - y_i (with G_i = 0 when y_i > z) = poverty gap for individual i. The squared poverty gap index, sometimes referred to as the poverty severity index, is related to the poverty gap index: it is the case α = 2. Each poverty gap ratio is squared, and the average is then taken. By squaring each poverty gap, the metric gives greater weight to incomes that fall further below the poverty line. The squared poverty gap index is thus a weighted sum of poverty gaps in which the weight varies with the size of the gap. It also takes inequality among the poor into account [12].

Watts Index.
$$W = \frac{1}{N}\sum_{i=1}^{q}\left[\ln(z) - \ln(y_i)\right] = \frac{1}{N}\sum_{i=1}^{q}\ln\left(\frac{z}{y_i}\right)$$
where the N individuals in the population are indexed in ascending order of income (or expenditure), and the sum is taken over the q individuals whose income (y_i) is below the poverty line (z). The poverty line is divided by income, logs are taken, the results are summed over the poor, and the sum is then divided by the total population. This is one of the earliest measures of poverty that takes distribution into account [2].

The input and output were categorized according to the estat classification output. "True" denotes the binary outcome of whether a household is poor or not. There are 1121 non-poor samples and 4648 poor samples in all. All 13,767 data are correctly classified by the model, demonstrating [...] and multidimensional poverty [8, 9]. Moreover, Fig. 5 showcases the results of poverty measurements through multiple constructs and indicators. It reflects the summary of Table 7, which can be translated into a chart to better understand the poverty cases in a particular region. Furthermore, Table 10 depicts the results of non-linear (logistic) regression on poverty outcomes that were measured or predicted by different constructs and indicators. Tables 6 and 7 reveal the indicators utilized, and Models 1 and 2 served as benchmarks for the analysis. In the table, x refers to the coefficient of the variables, while P gives the p-values for the four disaggregated configurations (4 sectors: Isarog, Poblacion, Salog, and Ranggas) and then for the combined level of the municipality of Goa.
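The headcount, poverty gap, squared poverty gap, and Watts measures described in this section can be computed with a few lines of Python. The sketch below is a minimal illustration; the function names and the four-person income vector are hypothetical, and the poverty line z = 100 is an arbitrary example value.

```python
import math

def fgt_index(incomes, z, alpha):
    """Foster-Greer-Thorbecke index P_alpha: the mean of (G_i / z) ** alpha over the
    whole population, where G_i = z - y_i for the poor (y_i < z) and 0 otherwise.
    alpha = 0 gives the headcount ratio, 1 the poverty gap, 2 the squared gap."""
    total = 0.0
    for y in incomes:
        if y < z:
            total += ((z - y) / z) ** alpha
    return total / len(incomes)

def watts_index(incomes, z):
    """Watts index: sum of ln(z / y_i) over the poor, divided by the total
    population N. Incomes must be strictly positive."""
    return sum(math.log(z / y) for y in incomes if y < z) / len(incomes)

# Hypothetical incomes for four individuals and an arbitrary poverty line.
incomes = [50.0, 80.0, 100.0, 200.0]
z = 100.0

headcount = fgt_index(incomes, z, 0)   # share of the population below z
gap = fgt_index(incomes, z, 1)         # average shortfall as a share of z
severity = fgt_index(incomes, z, 2)    # squared gaps weight the poorest more
watts = watts_index(incomes, z)

print(f"P0={headcount:.4f} P1={gap:.4f} P2={severity:.4f} W={watts:.4f}")
```

In this example two of four people are poor, so the headcount ratio is 0.5, while the gap and severity indices are smaller because they average the relative shortfalls (0.5 and 0.2) and their squares over the whole population.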

Limitations
The analyzed data provided in this work have no known limitation related to sampling, because they come from a complete enumeration of all households and household members. The data are therefore not exposed to sampling error, thus satisfying the internal and external validity of sample assumptions. The data cover multidimensional variables that were carefully analyzed and filtered [1]. The dataset has no known biases and no known limitations in quality or size. However, time restrictions are present. Generating reliable and verifiable datasets takes time. The data cover 2018-2020 conditions, and the next data update will take time to occur.

[...] Women who died due to pregnancy related causes | 1 (HH with women who died due to pregnancy related causes), 0 (HH without women who died due to pregnancy related causes)
7. γ | Malnourished children 0-5 years old | 1 (HH with children aged 0-5 who are malnourished), 0 (HH without children aged 0-5 who are malnourished)
8. I | Total number of HH with children 0-5 years old | Total number of households with children 0-5 years old
9. ι | Total population of children aged 0-5 years old | Total population of children aged 0-5 years old from HH
10. δ | Households living in makeshift housing | 1 (HH who are living in makeshift housing), 0 (HH who are not living in makeshift housing)
11. ε | Households who are informal settlers | 1 (HH who are informal settlers), 0 (HH who are not informal settlers)
12. ζ | Households without access to safe water | 1 (HH without access to safe drinking water), 0 (HH with access to safe drinking water)
13. η | Households without access to sanitary toilet facility | 1 (HH without access to sanitary toilet facility), 0 (HH with access to sanitary toilet facility)
14. θ | Children aged 6-11 years old who are not attending elementary | 1 (HH with children not attending elementary), 0 (HH with children attending elementary)
15. K | Total # of HH with children aged 6-11 | Total number of HH with children aged 6-11 years old (qualified for elementary level)
16. κ | Total population of children aged 6-11 years old | Total population of children aged 6-11 years old (qualified for elementary level) from HH
17. λ | Children aged 12-15 years old who are not attending Junior High School | 1 (HH with children not attending junior high school), 0 (HH with children attending junior high school)
18. O | Total # of HH with children aged 12-15 years old | Total number of HH with children aged 12-15 years old (qualified for junior high school level)
19. ο | Total population of children aged 12-15 years old | Total population of children aged 12-15 years old (qualified for junior high school level) from HH
20. μ | Children aged 16-17 years old not attending Senior High School | 1 (HH with children not attending senior high school), 0 (HH with children attending senior high school)
21. π | Total # of HH with children aged 16-17 | Total number of HH with children aged 16-17 years old (qualified for senior high school level)
22. | Total population of children aged 16-17 | Total population of children aged 16-17 years old (qualified for senior high school level)
[...] with food shortage), 0 (HH without food shortage)
26. ψ | Unemployed members of the labor force | 1 (HH with unemployment), 0 (HH without unemployment)
27. ϒ* | Total # of HH with members of the labor force | Total number of HH with members of the labor force (active)
28. υ* | Total population of members of the labor force | Total population of labor force (active) from HH
29. ω | Victims of crime | 1 (HH with victims of crime), 0 (HH without victims of crime)
30. H | Total Number of Households | Frequency count of all households
31. N | Total Population | Frequency count of all household members

Fig. 1. Multisectoral calamity occurrence analytics that can be used or reproduced by the public in other regions.

Fig. 2. Multisectoral disaster risk preparedness analytics that can be used or reproduced by the public in other regions.

Fig. 3. Poverty analytics at rural level that can be utilized or reproduced by the public in other regions.

Fig. 4. Classification results and predictive margins of multidimensional variables that can be used or reproduced by the public in other regions.
The aforementioned models may be modified by adding or removing indicators so that they can be used for predictive analytics specific to a locality or region. To evaluate the extent of poverty, the following measures may be utilized and generated. Onsay (2021) generated the following measures and indices to analyze rural poverty [5]:

Headcount Ratio.
$$P_0 = \frac{1}{N}\sum_{i=1}^{N} I(y_i < z) = \frac{N_p}{N}$$
where N_p = number of poor and N = total population (or sample).

Fig. 5. Multidimensional poverty measurements through the multiple constructs and indicators.

Table 1
Overview of data indicator characteristics at magnitude measurements.

Table 2
Overview of data indicator characteristics at proportion measurements.

Table 3
Multidimensional clusters of variables that can be used for poverty modeling and analysis at magnitude and proportion measurements.

Table 4
The sectors and barangays of the municipality for population dynamics and poverty analytics.

Table 5
Results of poverty analytics on all the barangays and sectors of the municipality.

Table 7
Intervening variables that can be utilized by the public for economic development.

Table 8
Classification matrix of poverty prediction that can be utilized by the public to derive insights about multidimensional poverty.

Table 9
Measuring multidimensional poverty through multiple constructs and indicators.