Identifying inter-seasonal drought characteristics using binary outcome panel data models

Abstract
This study focuses on the spatiotemporal and inter-seasonal characteristics of meteorological drought. The Random Effect Logistic Regression Model (RELRM) and the Conditional Fixed Effect Logistic Regression Model (CFELRM) are used to identify these characteristics at selected stations. The Log-likelihood Ratio Chi-Square Test (LRCST) and the Wald chi-square test (WCT) are used to assess the significance of RELRM and CFELRM. The Hausman test (HT) is applied to select the appropriate model between RELRM and CFELRM. For instance, HT suggests the CFELRM as the appropriate model for spring-to-summer spatiotemporal drought modelling. The significant coefficient from CFELRM indicates that an increment in the moisture conditions of the spring season decreases the probability of drought in the summer. The odds ratio of 0.1942 means that a one-unit increase in the spring moisture condition multiplies the odds of being in the higher category (drought) by 0.1942, a reduction of about 81%. Similarly, for summer-to-autumn using RELRM, the computed odds ratio of 0.0673 corresponds to a reduction of about 93% in the odds of being in the higher category.


Introduction
Drought is the greatest recurring natural hazard and a source of huge losses in the agriculture sector (Cunha et al. 2019; Mondol et al. 2021; Ha et al. 2022; Orimoloye et al. 2022; Zarei et al. 2023), natural ecosystems (Deng et al. 2021; Yao et al. 2022) and forestry (Anderegg et al. 2020; Sánchez-Pinillos et al. 2022). Its impacts take hold of an area slowly and may last for a long period; in severe instances, drought can be sustained for many years and distress the environment, agriculture and socio-economic sectors (Haile et al. 2020; Chen et al. 2021; Han and Yang 2021; Jordaan 2022; Savelli et al. 2022; Sharma and Sen 2022). Researchers have broadly segregated the drought [...] Papadopoulos et al. 2021). Generally, these regression models mainly deal with a continuous dependent variable. Linear regression models cannot perform appropriately for a categorical dependent variable. Therefore, categorical frameworks are commonly developed, for example, logistic regression, multiple logistic regression and ordered logistic regression (Ford and Labosier 2014; Bachmair et al. 2017; Meng et al. 2017; Niaz, Zhang, et al. 2021; Niaz, Tanveer, et al. 2022). Ford and Labosier (2014) used logistic regression to find drought persistence at various stations. Meng et al. (2017) used logistic regression with a categorical dependent variable to find the spatial pattern of drought. Niaz, Raza, et al. (2022) recently used a proportional odds model to find inter-seasonal drought characteristics at various stations. However, these studies have not addressed the spatiotemporal effect of meteorological drought. Knowledge of the spatiotemporal and inter-seasonal drought characteristics can indirectly support the reduction of the potential negative impacts of drought. Therefore, finding the spatiotemporal characteristics of meteorological drought for prediction is vital, and this issue underpins the development of the new methodology. The current research examines spatiotemporal drought characteristics across the four seasons (winter, spring, summer and autumn). Binary outcome panel data models, namely the Random Effect Logistic Regression Model (RELRM) and the Conditional Fixed Effect Logistic Regression Model (CFELRM), are utilized for identifying the spatiotemporal and inter-seasonal drought characteristics. Further, to substantiate the significance of RELRM and CFELRM for the seasonal investigations, the Log-likelihood Ratio Chi-Square Test (LRCST) and the Wald chi-square test (WCT) are utilized. The tests suggest that both models can be used for the current analysis. However, based on the Hausman test (HT), the more appropriate model is chosen to describe the spatiotemporal and inter-seasonal drought persistence. Moreover, the outcomes of the current analysis provide the groundwork for paying more attention to early warning systems for effectively managing water resources to avoid negative drought impacts in Pakistan.

Description of the study area
The geographic coordinates of Pakistan (Figure 1) are latitude 30°23′21.84″ N and longitude 69°21′11.59″ E, and the country covers a total area of 796,096 km². Owing to the variability across this domain, its climate changes with the country's topography, and it is extremely exposed to the impact of climate change because of its geographical location, low technological resources, high population level, high internal variability and low resource base. The seasonal and annual rainfall patterns and extreme weather are changing, leading to droughts, landslides, cyclones and floods. In Pakistan, the winter and summer seasons often provide rain due to two primary meteorological phenomena: the summer monsoon from South Asia and the western parts. The average temperature reported for the rainfall systems during the monsoon and winter seasons is between 12 and 20 °C and between 19 and 35 °C, respectively, and around 31% and 45% of the yearly rainfall occurs during the winter and monsoon seasons, respectively. Moreover, Pakistan has a semi-industrialized economy with a well-integrated agriculture sector. Most of the population of Pakistan depends on the agricultural sector. Hence, extreme drought events can be dangerous for the agricultural sector, ultimately affecting the country's population. Drought is progressively threatening Pakistan's agricultural and economic sectors (Anjum et al. 2012; Haroon et al. 2016; Waseem, Khurshid, et al. 2022; Hussain et al. 2023). Besides the ecological losses, drought can disturb the economy and negatively affect food security in the country (Idrees et al. 2022; Waseem, Jaffry, et al. 2022). Therefore, drought assessment and monitoring should be adequately executed to assist policymakers and water managers in improving drought management policies.

Data and methods
The results of this research are derived from time-series data ranging from January 1971 to December 2017 for 42 meteorological stations in Pakistan. The climatological characteristics of the selected stations are suitable for the current research. The SPI at a three-month time scale (SPI-3) is utilized for the current analysis. To identify the spatiotemporal and inter-seasonal drought characteristics at various stations, this study uses binary outcome panel data models, namely RELRM and CFELRM. The LRCST and WCT are utilized to measure the significance of RELRM and CFELRM (Figure 2).

Standardized precipitation index (SPI)
SPI has been widely used in assessing and monitoring drought at varying time scales (Angelidis et al. 2012; Stagge et al. 2015; Niaz, Almazah, Zhang, et al. 2021; Niaz, Hussain, Zhang, et al. 2021; Niaz, Zhang, et al. 2021; Cerpa Reyes et al. 2022; Elbeltagi et al. 2023). The calculation of the SPI is based solely on precipitation observations. SPI can be used to compare drought events in different climate regions (Cunha et al. 2019; Mondol et al. 2021; Ha et al. 2022; Orimoloye et al. 2022). SPI is related to the soil moisture deficit at short timescales, whereas it can be linked to groundwater and reservoir storage at longer timescales (Cerpa Reyes et al. 2022; Kamali and Asghari 2022; Elbeltagi et al. 2023). Several techniques are used to standardize observed precipitation to quantify the SPI values (Naresh Kumar et al. 2009; Türkeş and Tatlı 2009; Farahmand and AghaKouchak 2015). In the current analysis, we adopted the transformation method of Farahmand and AghaKouchak (2015) to obtain the SPI (see also Ali et al. 2020; Niaz, Tanveer, et al. 2022).
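As an illustration of the standardization step, the following minimal sketch computes an SPI-3 series nonparametrically: three-month precipitation totals are ranked, converted to empirical probabilities with the Gringorten plotting position, and mapped to standard-normal scores. This is an assumption made for illustration only; the study standardizes with fitted probability distributions, SPI is usually computed separately for each calendar month, and the precipitation values below are hypothetical.

```python
from statistics import NormalDist

def rolling_sum(values, window=3):
    """Return the sum over each consecutive `window`-month span."""
    return [sum(values[i - window + 1 : i + 1]) for i in range(window - 1, len(values))]

def empirical_spi(totals):
    """Map precipitation totals to standard-normal scores via their empirical rank."""
    n = len(totals)
    order = sorted(range(n), key=lambda i: totals[i])  # indices from driest to wettest
    spi = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        p = (rank - 0.44) / (n + 0.12)      # Gringorten plotting position
        spi[idx] = NormalDist().inv_cdf(p)  # inverse standard-normal CDF
    return spi

# Hypothetical monthly precipitation (mm); not data from the study.
monthly_precip = [10, 0, 5, 20, 30, 15, 2, 9]
totals = rolling_sum(monthly_precip)          # three-month accumulations
spi3 = empirical_spi(totals)
drought = [1 if s <= 0 else 0 for s in spi3]  # 1 = drought (SPI <= 0), 0 = no drought
```

Ties between totals are ranked in an arbitrary order here; a production implementation would need an explicit rule for ties and for zero-precipitation periods.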

Binary response panel data modelling
Binary response modelling has been applied in several studies (Manski 1988; Sueyoshi 1995; Horowitz and Savin 2001; Kleinbaum and Klein 2010; Chauhan et al. 2016). The use of panel data analysis is particularly prominent in the literature (Honoré and Lewbel 2002; Arellano and Carrasco 2003; Sutradhar et al. 2008; Charbonneau 2017; Semykina and Wooldridge 2018). The panel model for a binary response is given by
$$z_{it} = 1 \quad \text{if } x_{it}\beta + \mu_i + \epsilon_{it} > 0,$$
and zero otherwise. Here $z$ indicates the binary response, $i$ indicates the individual or item, and $t$ indexes the observations within $i$, which vary over time. $x$ indicates the explanatory variables, $\beta$ is the regression coefficient, $\mu_i$ is the individual-specific effect and $\epsilon_{it}$ is called the idiosyncratic error because it varies across $i$ as well as across $t$. Ideally, $\epsilon_{it}$ and $\epsilon_{is}$ may be correlated within a group but are uncorrelated across groups. $\mu_i$ is the unobserved individual-specific effect, which varies across the groups. We decide between fixed and random effect models by examining the relationship between $\mu_i$ and $x_{it}$. The random effect model assumes that $\mu_i$ and $x_{it}$ are not correlated, which also means that the conditional distribution $f(\mu_i \mid x_{it})$ does not depend on $x_{it}$. If there is a correlation between $\mu_i$ and $x_{it}$, then we prefer the fixed effect model.
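The choice between the fixed and random effect models described above is formalized in this study by the Hausman test. The sketch below shows the statistic for the special case of a single slope coefficient, which is an assumption made for brevity; the function name and all numbers are hypothetical and are not the study's estimates. Under the null hypothesis that the individual effect is uncorrelated with the regressor, the statistic is compared with a chi-square distribution with one degree of freedom.

```python
import math

def hausman_single(b_fe, var_fe, b_re, var_re):
    """Hausman statistic and p value for one coefficient.

    b_fe, var_fe: fixed-effect estimate and its variance
    b_re, var_re: random-effect estimate and its variance
    """
    diff_var = var_fe - var_re
    if diff_var <= 0:
        raise ValueError("variance difference must be positive for the test to be defined")
    h = (b_fe - b_re) ** 2 / diff_var
    # Upper tail of chi-square(1): P(X > h) = erfc(sqrt(h / 2))
    p_value = math.erfc(math.sqrt(h / 2.0))
    return h, p_value

# Hypothetical estimates, for illustration only.
h, p = hausman_single(b_fe=-1.64, var_fe=0.0036, b_re=-1.60, var_re=0.0030)
preferred = "fixed effect" if p < 0.05 else "random effect"
```

A small p value points to the fixed effect model; a large one gives no evidence against the random effect model.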

Conditional fixed-effect logistic regression model
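Conditional fixed-effect logistic regression gives a consistent estimate, unlike unconditional fixed-effect logistic regression. The nonlinear binary response model is given by
$$P(z_{it} = 1 \mid x_{it}, \mu_i) = M(x_{it}\beta + \mu_i).$$
$M$ is a nonlinear function; hence, linear regression cannot be used for estimation. Various nonlinear functions for $M$ appear in the literature, but the most common is the logit function,
$$M(u) = \frac{e^{u}}{1 + e^{u}}.$$
The range of this function lies between zero and one. It is also the cumulative distribution function (CDF) of the logistic distribution.
If $M$ is the logistic CDF, then we obtain the log-likelihood as
$$\ln L = \sum_{i}\sum_{t}\left\{ z_{it}\ln M(x_{it}\beta + \mu_i) + (1 - z_{it})\ln\left[1 - M(x_{it}\beta + \mu_i)\right]\right\}.$$
Estimating the parameters of this model is not an easy task because the unknown parameter $\mu_i$ is involved. In the linear regression model, it is easy to eliminate $\mu_i$ by using differences or the within transformation. The logit functional form enables us to eliminate $\mu_i$ from the estimating equation by conditioning on the minimal sufficient statistic for $\mu_i$, the sum $\sum_t z_{it}$. In this way, we obtain a conditional likelihood to estimate the parameters of the model. For $T = 2$, the conditional probabilities are
$$P(z_{i1} = 0, z_{i2} = 1 \mid z_{i1} + z_{i2} = 1) = \frac{e^{(x_{i2} - x_{i1})\beta}}{1 + e^{(x_{i2} - x_{i1})\beta}}, \qquad P(z_{i1} = 1, z_{i2} = 0 \mid z_{i1} + z_{i2} = 1) = \frac{1}{1 + e^{(x_{i2} - x_{i1})\beta}},$$
that is, the logistic distribution function evaluated at $(x_{i2} - x_{i1})\beta$. In general, the conditional probability of the response sequence $(z_{i1}, z_{i2}, \ldots, z_{iT})$ given its sum $H_i = \sum_t z_{it}$ is
$$P\left(z_{i1}, \ldots, z_{iT} \,\Big|\, \sum_t z_{it} = H_i\right) = \frac{\exp\left(\sum_t z_{it} x_{it}\beta\right)}{\sum_{d \in B_i}\exp\left(\sum_t d_t x_{it}\beta\right)},$$
where $B_i$ is the set of sequences $(d_1, \ldots, d_T)$ of zeros and ones with $\sum_t d_t = H_i$. The denominator is a sum over all $\binom{T}{H_i}$ different sequences of $T$ zeros and ones that have the same sum as the observed sequence. The conditional log-likelihood function is the sum over $i$ of the logarithm of this conditional probability.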
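The conditional probability described above can be evaluated directly by enumerating every sequence of zeros and ones with the same number of ones as the observed sequence. The sketch below does this for one station and a single regressor; the function name and the input values are hypothetical, and brute-force enumeration is only practical for short panels.

```python
import math
from itertools import combinations

def conditional_log_prob(z, x, beta):
    """Log of P(z_1..z_T | sum of z) for one station under the conditional logit.

    z: observed 0/1 responses, x: regressor values, beta: slope coefficient.
    The individual-specific effect cancels and does not appear.
    """
    T, h = len(z), sum(z)
    numerator = beta * sum(zt * xt for zt, xt in zip(z, x))
    # Denominator: one term for every placement of h ones among T periods.
    terms = [beta * sum(x[t] for t in ones) for ones in combinations(range(T), h)]
    m = max(terms)  # log-sum-exp for numerical stability
    log_denominator = m + math.log(sum(math.exp(v - m) for v in terms))
    return numerator - log_denominator

# Hypothetical two-period example: the result equals the logistic CDF at (x2 - x1) * beta.
log_p = conditional_log_prob(z=(0, 1), x=(0.5, -1.0), beta=-1.6)
```

Stations whose responses are all zero or all one have a conditional probability of one, so they contribute nothing to the conditional log-likelihood.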

Random effect logistic model
If the individual-specific effect $\mu_i$ is not correlated with the explanatory variable $x_{it}$, random effect logistic regression is used. Here $\mu_i$ is assumed to be a random individual-specific effect, which is usually specified as Gaussian. However, the choice of the random effect distribution depends on whether the joint likelihood with the binomial distribution can be handled analytically. Writing $\eta_{it} = x_{it}\beta + \mu_i$, the random effect logistic probability function can be expressed in exponential family form as
$$f(z_{it} \mid x_{it}, \mu_i) = \exp\left\{ z_{it}\eta_{it} - \ln\left(1 + e^{\eta_{it}}\right) \right\}.$$
For a binomial logistic model, such as a grouped random effect logistic model with $k_{it}$ as the binomial denominator, the probability function in exponential family form is given as
$$f(z_{it} \mid x_{it}, \mu_i, k_{it}) = \exp\left\{ z_{it}\eta_{it} - k_{it}\ln\left(1 + e^{\eta_{it}}\right) + \ln\binom{k_{it}}{z_{it}} \right\}.$$
The log-likelihood function of the Bernoulli model can be expressed as
$$\ln L = \sum_{i}\sum_{t}\left\{ z_{it}\eta_{it} - \ln\left(1 + e^{\eta_{it}}\right) \right\}.$$
In the current analysis, the dependent variable is dichotomous: one indicates that the drought persists, while zero indicates that the drought does not persist. Two consecutive seasons are used to characterize the drought persistence in the selected periods. For instance, to find drought persistence in winter-to-spring, the winter and spring seasons are used to calculate drought persistence. Hence, for winter-to-spring drought persistence, the moisture condition of the preceding (winter) season is considered as the independent variable. Several researchers have used this variable to identify drought persistence from the preceding season in various publications (Ford and Labosier 2014; Meng et al. 2017; Aryal and Zhu 2021; Niaz, Zhang, et al. 2021; Niaz, Raza, et al. 2022). In the current analysis, the significance of the previous season for the current season is identified by the binary outcome panel data models, RELRM and CFELRM. The LRCST and WCT are applied to evaluate the significance of RELRM and CFELRM, and both tests are significant for both models. However, based on the HT, the selected model is employed to find the spatiotemporal and inter-seasonal drought persistence in the selected seasons.
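To illustrate how a Gaussian random effect enters the likelihood, the sketch below integrates the Bernoulli logit likelihood of one station over an individual effect drawn from a normal distribution with mean zero and standard deviation `sigma`. Statistical software commonly uses Gauss-Hermite quadrature for this integral; a plain trapezoid rule is swapped in here for simplicity. The function names, the single regressor and all values are assumptions for illustration.

```python
import math

def bernoulli_logit_loglik(z, x, beta, mu):
    """Sum over t of z*eta - log(1 + exp(eta)), with eta = x*beta + mu."""
    total = 0.0
    for zt, xt in zip(z, x):
        eta = xt * beta + mu
        # log(1 + exp(eta)) written in a numerically stable form
        total += zt * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return total

def marginal_likelihood(z, x, beta, sigma, n_grid=2001, width=8.0):
    """Station likelihood with mu ~ N(0, sigma^2) integrated out (trapezoid rule)."""
    lo, hi = -width * sigma, width * sigma
    step = (hi - lo) / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        mu = lo + k * step
        density = math.exp(-0.5 * (mu / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        weight = 0.5 if k in (0, n_grid - 1) else 1.0  # trapezoid end-point weights
        total += weight * math.exp(bernoulli_logit_loglik(z, x, beta, mu)) * density
    return total * step

# Hypothetical station: three years of preceding-season SPI-3 and persistence outcomes.
x_station = (0.8, -1.1, 0.3)
lik = marginal_likelihood(z=(0, 1, 0), x=x_station, beta=-1.6, sigma=1.0)
```

The full random effect log-likelihood is the sum of the logarithms of these station-level integrals, maximized over `beta` and `sigma`.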

Results
The data from 42 stations in Pakistan are processed for the current analysis. The stations are selected based on the availability of monthly data for the 47 years. The characteristics of the precipitation data, including the average and standard deviation observed at various stations, are provided in Figure 3. The climatological features of precipitation are presented for specific stations; the characteristics observed at other stations can be presented in the same way. Other characteristics of the precipitation at various stations are given in Table 1. The greatest precipitation is observed at Murree station, with a mean of 146.18 mm. The precipitation of other stations can be read accordingly. The precipitation observations are standardized to quantify the drought severity.
Several probability distributions are used to standardize the precipitation values. The distributions are selected based on the Bayesian Information Criterion (BIC). The selected distributions and their BIC values are provided in Table 2. These distributions are selected as appropriate for the standardization of the precipitation data at a three-month time scale. The empirical and theoretical distributions at various stations are presented in Figure 4. The temporal behaviour of the SPI-3 at several stations can be observed in Figure 5. Based on the SPI-3 values, drought is classified into two classes: drought (SPI ≤ 0) and no drought (SPI > 0) (Li et al. 2015).
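The BIC-based selection of a distribution mentioned above can be sketched as follows, using BIC = k ln(n) - 2 ln(L), where k is the number of fitted parameters, n the number of observations and L the maximized likelihood; the candidate with the smallest BIC is preferred. The two candidates (exponential and lognormal) and the precipitation totals are assumptions chosen because their maximum likelihood estimates have closed forms; the distributions actually selected in the study are those listed in Table 2.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion; smaller values indicate a better trade-off."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

def exponential_loglik(data):
    """Maximized log-likelihood of an exponential distribution."""
    rate = len(data) / sum(data)  # MLE of the rate
    return sum(math.log(rate) - rate * v for v in data)

def lognormal_loglik(data):
    """Maximized log-likelihood of a lognormal distribution (data must be > 0)."""
    logs = [math.log(v) for v in data]
    n = len(logs)
    mu = sum(logs) / n
    var = sum((v - mu) ** 2 for v in logs) / n  # MLE of the log-scale variance
    return sum(-lv - 0.5 * math.log(2.0 * math.pi * var) - (lv - mu) ** 2 / (2.0 * var)
               for lv in logs)

# Hypothetical three-month precipitation totals (mm); not data from the study.
totals = [12.4, 30.1, 55.0, 8.7, 21.3, 140.2, 66.5, 17.9]
scores = {
    "exponential": bic(exponential_loglik(totals), 1, len(totals)),
    "lognormal": bic(lognormal_loglik(totals), 2, len(totals)),
}
best = min(scores, key=scores.get)  # candidate with the smallest BIC
```

Zero-precipitation periods would need separate handling before a distribution with strictly positive support is fitted.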
For the numerical representation, '1' is assigned if drought occurs and '0' is assigned if it does not. The temporal variation in the number of months per year with SPI ≤ 0 is presented in Figure 6. It shows the total number of drought months in each year. The maximum and minimum drought occurrences at various stations can be observed accordingly.
The average number of months with SPI ≤ 0 for selected stations across years can be observed in Figure 7. Figure 8 indicates the drought count over the 47 years for the selected stations. The annual drought averages for Dir, Astore, Badin, Kalat, Bahawalnagar and Jacobabad are 8.2127, 8.1276, 7.1489, 5.8085, 5.3404 and 5.8723 months, respectively; similar behaviour can be observed at other stations. Further, the drought frequency at a given station can be quantified by dividing the total number of months with SPI ≤ 0 by the total number of months (from January 1971 to December 2017). In the current research, the data of each station are categorized into four seasons (winter, spring, summer and autumn). Therefore, knowledge of the seasonal drought frequency is important. Seasonal drought frequency is calculated as the percentage of seasons in which drought (SPI ≤ 0) occurs over the studied period for each season (autumn, winter, spring and summer). The varying behaviour of seasonal drought frequency is presented in Figure 9.
Seasonal drought persistence is calculated as the total number of years in which drought continues from one season to the next divided by the total number of years in which SPI ≤ 0 in the preceding season. The probability of drought persistence is presented in Figure 10. It can be observed that the probability of drought persistence is high in summer-autumn. Table 3 presents information about the winter-spring drought persistence modelling. The log-likelihood values, WCT, LRCST and p values for the RELRM and CFELRM are given. The significant values of the tests of both models indicate that the RELRM and CFELRM are important. However, the HT is employed to substantiate the more suitable model for the current data set. The p value (0.6847) of the HT confirms that the RELRM is an appropriate choice for the spatiotemporal winter-spring drought persistence modelling. In Table 4, the results of the RELRM are given and interpreted accordingly. The coefficient of SPI, with an odds ratio of 0.1624, indicates that an increment in the SPI-3 values in the winter season will decrease the probability of drought occurrence in spring.
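The definitions of drought frequency and seasonal drought persistence used in this section can be sketched as below. The function names and the SPI-3 values are hypothetical, and the two series are assumed to be aligned so that each pair holds a season and the season that follows it.

```python
def drought_frequency(spi_values):
    """Share of periods with SPI <= 0."""
    return sum(1 for s in spi_values if s <= 0) / len(spi_values)

def drought_persistence(prev_season, next_season):
    """Years with drought in both seasons divided by years with drought in the first."""
    in_drought = [(p, n) for p, n in zip(prev_season, next_season) if p <= 0]
    if not in_drought:
        return float("nan")  # no drought in the preceding season: ratio undefined
    return sum(1 for _, n in in_drought if n <= 0) / len(in_drought)

# Hypothetical seasonal SPI-3 values for five years at one station.
winter = [-0.8, 0.4, -1.2, 0.1, -0.3]
spring = [-0.5, -0.2, 0.6, 0.9, -1.1]

winter_frequency = drought_frequency(winter)                     # 3 of 5 winters in drought
winter_to_spring_persistence = drought_persistence(winter, spring)  # 2 of those 3 persist
```

For autumn-to-winter, the following winter may fall in the next calendar year, so the pairing of the two series has to be chosen accordingly.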
Further, $\rho = \sigma_{\mu}^{2} / (\sigma_{\mu}^{2} + \sigma_{\epsilon}^{2})$ is the proportion of the total variance contributed by the panel-level variance component, where $\sigma_{\mu}^{2}$ is the panel-level variance and $\sigma_{\epsilon}^{2}$ is the variance of the error term $\epsilon$. The $\epsilon$ is distributed with mean zero and variance $\pi^{2}/3$. When $\rho$ is zero, the panel-level variance component is unimportant. In the current analysis, $\rho = 0.1124$, showing that the panel-level variation is about 11%, and the likelihood ratio test is significant, which validates the use of panel data modelling. Table 5 presents the results of the spring-summer spatiotemporal drought persistence modelling. The log-likelihood values, WCT, LRCST and p values for the RELRM and CFELRM are given. The HT, with a p value of 0.0009, shows that the CFELRM is an appropriate choice for the spatiotemporal spring-to-summer drought persistence modelling. In Table 6, the results obtained from CFELRM are given. The significant coefficient of SPI-3 has an odds ratio of 0.1942 (the 95% confidence interval of the coefficient is -1.7754 to -1.5427), indicating that an increment in the moisture conditions of the spring season will decrease the probability of drought in the summer season. The odds ratio of 0.1942 means that a one-unit increase in spring SPI-3 multiplies the odds of being in the higher category (i.e. 1 = drought) by 0.1942, a reduction of about 81%. A station is therefore most likely to stay in the lower category (i.e. 0 = no drought). Table 7 provides a description of the spatiotemporal drought persistence modelling in summer-to-autumn. The log-likelihood values, WCT, LRCST and p values for the RELRM and CFELRM are provided accordingly. The HT, with a p value of 0.541, substantiates that the RELRM is an appropriate choice for the summer-to-autumn spatiotemporal drought persistence analysis. In Table 8, the coefficient of SPI-3, with an odds ratio of 0.0673, indicates that with an increase in the SPI-3 values in the summer season, drought is less likely to occur in autumn. The likelihood ratio test for $\rho$ [...] Table 9. The HT, with a p value of 0.5683, suggests that the RELRM is more appropriate for the persistence of autumn-to-winter drought. The odds ratio (Table 10) of the SPI-3 is significant for the autumn-to-winter drought. The 95% confidence interval for the SPI-3 coefficient is -0.8481 to -0.6938. The odds ratio shows that an increment in moisture conditions during the autumn season will reduce the probability of drought in the winter season. In conclusion, the binary outcome panel data models provide comprehensive spatiotemporal information about the selected stations of Pakistan. Further, the results obtained from the current research apply to the current scenario and application site; they cannot be generalized to other climatic conditions, because the climatology of a region changes the outcomes and influences any extrapolation.
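The quantities reported in this section can be related to one another with a few lines of arithmetic. The sketch below converts an odds ratio into a percentage change in the odds, exponentiates an interval given on the coefficient (log-odds) scale, and applies the usual logistic-model definition of the panel-level variance share, rho = panel variance / (panel variance + pi^2 / 3). Reading the reported intervals as coefficient-scale intervals is an assumption based on their negative sign, and the function names are hypothetical.

```python
import math

def odds_change_percent(odds_ratio):
    """Percentage change in the odds for a one-unit increase in the regressor."""
    return (odds_ratio - 1.0) * 100.0

def coefficient_ci_to_odds_ratio_ci(lower, upper):
    """Exponentiate a log-odds interval to obtain an odds-ratio interval."""
    return math.exp(lower), math.exp(upper)

def rho(panel_variance):
    """Share of total variance due to the panel-level component (logistic errors)."""
    return panel_variance / (panel_variance + math.pi ** 2 / 3.0)

def panel_variance_from_rho(r):
    """Invert rho to recover the implied panel-level variance."""
    return r / (1.0 - r) * math.pi ** 2 / 3.0

change = odds_change_percent(0.1942)                         # spring-to-summer odds ratio
or_low, or_high = coefficient_ci_to_odds_ratio_ci(-1.7754, -1.5427)
implied_variance = panel_variance_from_rho(0.1124)           # winter-to-spring rho
```

With these inputs, the odds of summer drought fall by roughly 81% per unit increase in spring SPI-3, and the reported odds ratio of 0.1942 lies inside the exponentiated interval.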

Discussion
Early warning and monitoring are critical elements of drought preparedness and mitigation plans. Several studies have developed techniques to monitor and predict drought occurrences in different climatic conditions (Nayak and Hassan 2021; Alawsi et al. 2022; Cerpa Reyes et al. 2022; Dikshit et al. 2022; Kamali and Asghari 2022; Elbeltagi et al. 2023). Linear regression is utilized frequently in the literature for statistical prediction (Güner Bacanli 2017; Kim et al. 2020; Papadopoulos et al. 2021). Generally, these regression models mainly deal with a continuous dependent variable. Linear regression models cannot work properly for a categorical response variable. Thus, categorical modelling is generally used, for instance, logistic regression, multiple logistic regression and ordered logistic regression (Ford and Labosier 2014; Bachmair et al. 2017; Meng et al. 2017; Seo et al; Niaz, Zhang, et al. 2021; Niaz, Raza, et al. 2022). Ford and Labosier (2014) employed logistic regression to examine drought persistence at different stations. Meng et al. (2017) utilized logistic regression to investigate the spatial pattern of drought. Kuwayama et al. (2019) used panel data models with fixed effects that exploited spatial and temporal variations in drought conditions. Sun et al. (2019) used panel data models to identify key factors affecting regional drought; that study found that the panel data model objectively and comprehensively reflected the actual drought situation in the region. Moreover, Defrance et al. (2020) used a panel data model to address the spatial impact of drought. Although the mentioned studies have described drought occurrences and their impacts in great detail based on several variables, the identification of inter-seasonal drought characteristics is also important. For this purpose, Niaz, Raza, et al. (2022) recently applied a proportional odds model to identify inter-seasonal drought characteristics at various stations. However, these findings have not focused on the spatiotemporal effect of meteorological drought. This study is mainly focused on spatiotemporal and inter-seasonal drought characteristics. The SPI at a three-month time scale is utilized to identify the drought characteristics in various seasons. The binary outcome panel data models, RELRM and CFELRM, are employed to find the spatiotemporal and inter-seasonal drought persistence at certain stations. The LRCST and WCT are used to measure the significance of RELRM and CFELRM, and both tests are significant for both models. Hence, the LRCST and WCT substantiate that the use of RELRM and CFELRM is appropriate for the current analysis. However, based on the HT, the selected model is utilized for identifying the spatiotemporal and inter-seasonal drought persistence at certain stations. Further, in the literature, none of the authors have considered binary outcome panel data models for meteorological drought analysis, specifically for seasonal drought analysis. Thus, the spatiotemporal study of drought occurrences based on binary outcome panel data models is an innovative step in the literature that may help meteorologists and scientific researchers to substantiate their modelling. Moreover, the outcomes of the current analysis provide the groundwork for paying more attention to early warning systems for effectively managing water resources to avoid negative drought impacts in Pakistan. In addition, these models can be more effective for other drought indices that incorporate more climate variables.

Conclusion
Knowledge of spatiotemporal and inter-seasonal drought characteristics can indirectly support the reduction of the potential negative influences of drought. The binary outcome panel data models, RELRM and CFELRM, are utilized to find the spatiotemporal and inter-seasonal drought persistence at selected stations. The LRCST and WCT are used to measure the significance of RELRM and CFELRM. The log-likelihood values, WCT, LRCST and p values for the RELRM and CFELRM are calculated for the drought persistence modelling in various seasons. For instance, in the winter-spring drought persistence modelling, the significant values of the tests indicate the importance of both models. However, the HT is employed to substantiate the more suitable model for the current data set. The p value (0.6847) of the HT confirms that the RELRM is an appropriate choice for the spatiotemporal winter-spring drought persistence modelling. The coefficient of SPI, with an odds ratio of 0.1624, indicates that increasing the SPI-3 values in winter will decrease the probability of drought occurrence in spring. The $\rho = 0.1124$ shows that the panel-level variation is about 11%, and the likelihood ratio test is significant, which validates the use of panel data modelling. Further, the CFELRM is an appropriate choice for the spatiotemporal spring-to-summer drought persistence modelling. The significant coefficient of SPI-3, with an odds ratio of 0.1942 (the 95% confidence interval of the coefficient is -1.7754 to -1.5427), indicates that an increment in moisture conditions during the spring season will reduce the probability of drought in the summer season. The odds ratio of 0.1942 means that a one-unit increase in spring SPI-3 multiplies the odds of being in the higher category (i.e. 1 = drought) by 0.1942, a reduction of about 81%; a station is therefore most likely to stay in the lower category (i.e. 0 = no drought). The HT, with a p value of 0.541, substantiates that the RELRM is an appropriate choice for the summer-to-autumn spatiotemporal drought persistence analysis. Similarly, the RELRM is selected for the autumn-to-winter spatiotemporal drought persistence analysis. For the autumn-to-winter model, the 95% confidence interval for the SPI-3 coefficient is -0.8481 to -0.6938, and the odds ratio indicates that an increment in moisture conditions during the autumn season will decrease the probability of drought in the winter season. The significant value of $\rho$ (0.1422) is the panel-level share of variance, which amounts to 14.22% of the total variance (LR test of rho = 0: chibar2(01) = 502.35; Prob >= chibar2 = 0.000). In conclusion, the binary outcome panel data models provide comprehensive spatiotemporal information about the selected stations of Pakistan. The outcomes of the current analysis contribute to a growing literature and can serve as an early warning for the effective management of water resources to avoid negative drought impacts in Pakistan.

Figure 1. Geographical locations of selected stations.

Figure 2. Framework for the application of binary panel data models.

Figure 3. Mean and standard deviation for selected geographical stations.

Figure 4. Theoretical vs. empirical distributions at several stations. The results of several stations are presented; the theoretical vs. empirical distributions at other stations can be observed accordingly.

Figure 5. The temporal behaviour of SPI-3 at various stations. Specific stations are presented; the temporal behaviour of SPI-3 at the other selected stations can be observed accordingly.

Figure 6. Temporal behaviour of SPI-3 ≤ 0 at various stations. The maximum and minimum drought for each year is provided. For instance, the maximum and minimum numbers of drought months per year at Astore are 12 and 5, respectively, from January 1971 to December 2017.

Figure 7. The latitude and longitude of the selected stations and their monthly average drought for SPI-3 < 0 in the selected time period.

Figure 8. The total number of counts with SPI-3 ≤ 0 at the selected stations.

Figure 9. Seasonal drought frequency for each selected station.

Figure 10. Drought persistence probability (percent) for selected stations.

Table 1. Climatological characteristics of the precipitation at the various stations.

Table 2. The various probability distributions and their BIC values.

Table 3. Information regarding the winter-spring drought persistence modelling. The log-likelihood values, WCT, LRCST and p values for the RELRM and CFELRM are given. The significant values of the tests of both models indicate that the RELRM and CFELRM are important.

Table 4. The results obtained from the RELRM for the winter-to-spring meteorological drought persistence. The coefficient of SPI, with an odds ratio of 0.1624, shows that an increment in the SPI-3 values in the winter season will reduce the probability of drought occurrence in spring. The rho = 0.1124 shows that the panel-level variation is about 11%, and the likelihood ratio test is significant, which validates the use of panel data modelling. LR test of rho = 0: chibar2(01) = 262.10; Prob >= chibar2 = 0.0001.

Table 5. The results of the spring-summer spatiotemporal drought persistence modelling. The log-likelihood values, WCT, LRCST and p values for the RELRM and CFELRM are given.

Table 6. The results obtained from the CFELRM for the spring-to-summer meteorological drought persistence. They imply that a station is most likely to stay in the lower category (i.e. 0 = no drought).

Table 7. The log-likelihood values, LRCST and WCT for RELRM and CFELRM, computed for the summer-to-autumn meteorological drought persistence.

Table 8. The results obtained from the RELRM for the summer-to-autumn meteorological drought persistence.

Table 9. The values obtained from the various tests for RELRM and CFELRM for the autumn-to-winter meteorological drought persistence.

Table 10. The results obtained from the RELRM for the autumn-to-winter meteorological drought persistence.