Measuring provincial digital finance development efficiency based on stochastic frontier model

Abstract: Effective development of digital finance is vital to closing regional economic disparities. This study investigates the efficiency of digital finance development in China and its implications for closing those disparities. Using the stochastic frontier model, we estimate the development efficiency of digital finance in 31 provinces in China from 2011 to 2020 and reveal its temporal evolution and spatial distribution. The results show that the efficiency of digital finance development in each province tends to rise quickly at first and then decline slowly. Provinces with a higher level of digital finance development generally have higher development efficiency at the beginning of the sample period; their efficiency then declines rapidly after reaching its maximum and even falls below the national average by the end of the period, with significant regional disparities observed. The imbalance of development efficiency among provinces is increasing, and the potential for development efficiency in the central and western regions is relatively greater. These findings have important implications for promoting high-quality economic development and common prosperity in China. In the future, efforts should be made to prevent the development efficiency of digital finance from declining rapidly in all provinces (especially in the eastern region) and to bridge the gap in development efficiency among provinces, so as to provide a better environment for promoting high-quality economic development and common prosperity.


Introduction
The original concept of digital finance referred to the provision of financial services through various digital technologies, such as credit cards, chip cards, and automated teller machines (ATMs) (Banks, 2001). As early as 2004, some financial institutions and industries in Europe had already recognized the potential of the Internet to empower finance and expand the industry, and began investing in Internet-related financial businesses (Barbesino, 2005). As the Internet and mobile applications gradually became ubiquitous in people's lives, the definition of digital finance expanded accordingly (Manyika et al., 2016). Digital finance surpasses the geographical limitations of traditional finance, and offers more accurate services at lower costs, with prominent features such as remote access, real-time transactions, and inclusiveness. In China, digital finance emerged in 2013 and has grown rapidly since 2016, with its market size ranking among the highest in the world. Industries such as third-party payments, online lending, digital insurance, and digital currency have all reached the forefront of the world.
In addition to benefiting individual users, digital finance can also benefit the macroeconomic development and the daily operations of the financial sector. Firstly, digital finance can improve the payment and transfer efficiency of individual users, thereby promoting consumer willingness and increasing total consumption, ultimately promoting economic development. Additionally, due to the traceability of digital transactions, government spending and tax losses can be reduced, and the circulation of counterfeit and low-quality banknotes can decrease (Ozili, 2018). Secondly, from an international perspective, digital finance has also helped many developing countries overcome difficult times, created a large number of jobs (Rizzo, 2014), and changed the payment methods of some underdeveloped countries in commerce and trade (CGAP). Finally, for financial institutions, digital finance can promote the expansion of their balance sheets and reduce service costs (Scott et al., 2017). In summary, digital finance can drive the existing financial system to reshape to a certain extent, improve resource allocation efficiency, and enhance capabilities of risk control. This is of great practical significance for breaking through the constraints of traditional finance, enhancing financial support for the development of the real economy, and expanding the inclusiveness and popularity of financial services.
Although the development of digital finance brings many benefits, the serious challenges it faces cannot be ignored. For example, the impact of digital finance on income inequality exhibits a Kuznets effect: income inequality follows a "bell-shaped" curve as digital finance develops (Yao and Ma, 2022). Many regions in China have not yet crossed the turning point of this bell-shaped curve, which is a problem that requires attention in the future.
Digital finance has quietly permeated every aspect of economic and social life and plays a significant role in both. However, our understanding of the evolutionary characteristics and regional differences of digital finance development efficiency is still limited, and the topic deserves in-depth study. Therefore, we use the Peking University Digital Inclusive Finance Index and the stochastic frontier model (SFM) to measure the efficiency of digital finance development and to capture its dynamics from a macro perspective. This allows us to better identify the operating mechanism of the digital finance market and make better use of its key role. In theory, this helps to enhance the scientific understanding of the current level and effects of digital finance development in China, enrich the theoretical literature on digital finance development, and provide a theoretical basis for addressing its imbalanced development. In practice, it helps to examine the development status of the industry objectively and to promote the coordinated and balanced development of digital finance across regions.

Literature review
Existing studies have mainly focused on the level of development of digital finance and its relationship with economic and social development, regional imbalances, and technological innovation in small and medium-sized enterprises. For example, Zhang and Xing (2021) used county-level data from 2014 to 2018 to explore the dynamic distribution and regional differences of digital inclusive finance development in rural areas. They found that the overall level of digital inclusive finance in rural areas and in the eastern, central, and western regions tended to rise, and the absolute differences decreased significantly, but there was a weak trend of dispersion in the later period.
The digital finance development efficiency discussed in this paper essentially belongs to technical efficiency, which reflects the input-output status of the industry. Technical efficiency refers to the situation where it is technically impossible to increase any output (and/or decrease any input) without reducing another output (and/or increasing any other input) in a given feasible input/output vector (Koopmans, 1951). Debreu (1951) and Farrell (1957) further clarified the meaning of technical efficiency, which refers to the maximum proportional reduction of all inputs being consistent with the equivalent output, reflecting the degree to which the output approaches the production frontier under given input conditions.
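As a compact restatement of the definition above, the output-oriented form of technical efficiency can be written as a ratio of observed output to frontier output. The symbols $TE$, $y$, $x$ and $f(\cdot)$ are notation assumed here for illustration and are not taken from the paper.

```latex
% Output-oriented technical efficiency (illustrative notation):
% y    = observed output
% f(x) = maximum output attainable from the input vector x (the production frontier)
TE = \frac{y}{f(x)}, \qquad 0 < TE \le 1
```

A value of $TE = 1$ means the unit operates on the frontier, while smaller values indicate output further below the frontier for the given inputs.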
The accurate measurement of technical efficiency depends heavily on appropriate methods. Aigner and Chu (1968) provided a deterministic method for measuring technical efficiency, achieving the leap from theory to practice. They specified the Cobb-Douglas production function (C-D function) as the measurement model, used quadratic programming as the estimation method, assumed that all deviations from the production frontier were attributable to inefficiency, and restricted the "residual" to be positive. The main shortcomings of this method are that the selected model is parametric and deterministic, and requires a prior function form that does not allow for measurement errors and other statistical noise.
In the late 1970s, the data envelopment analysis (DEA) method emerged, reigniting people's interest in the theory and practice of measuring technical efficiency. The DEA was first proposed by Charnes et al. (1979), who defined the optimal frontier as the envelope of all observed production possibilities and solved it using mathematical programming methods. Improvements were made by Fare and Lovell (1978), Banker and Cooper (1984), and Fare et al. (1985). The DEA is popular in evaluating technical efficiency mainly because it is non-parametric and can handle multiple outputs. Its shortcomings are that it also requires a deterministic model and does not allow for measurement errors.
Because the deviation from the production frontier cannot be entirely attributed to inefficiency, and may also be related to measurement errors and other statistical noise (such as adverse weather), Aigner et al. (1977) and Meeusen and van den Broeck (1977) proposed the stochastic frontier model (SFM), which was further refined and presented in academia by Greene (2008) and Kumbhakar and Lovell (2003). The SFM is a measurement method that allows for both inefficiency and measurement errors and has outstanding noise handling capabilities. It decomposes the random disturbance term into two independent parts, a random error term and a non-negative inefficiency term, and estimates the inefficiency term so that the factors influencing inefficiency can be analyzed. Cornwell and Schmidt (1996) further validated that the error term in previous models could be decomposed into inefficiency and noise, and also demonstrated that the SFM can be applied well to panel data. After incorporating fixed effects or random effects correction methods, the estimation of firm efficiency can be more accurate. Jondrow et al. (1982) addressed the estimation of inefficiency for individual observations and suggested using the total (composed) error to estimate the expected value of the inefficiency component.
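The idea attributed to Jondrow et al. (1982) above, recovering the expected inefficiency from the composed error, can be sketched as follows. This is a minimal illustration, assuming a production frontier with composed error $\varepsilon = v - u$, normal noise $v$ and half-normal inefficiency $u$; the function names and the numerical values of `sigma_u`, `sigma_v` and the residuals are hypothetical and are not estimates from this paper.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def jlms_half_normal(eps, sigma_u, sigma_v):
    """Conditional mean E[u | eps] for eps = v - u with half-normal u.

    u | eps is a normal(mu_star, sigma_star^2) truncated at zero, so its mean
    is mu_star + sigma_star * pdf(z) / cdf(z) with z = mu_star / sigma_star.
    """
    sigma2 = sigma_u ** 2 + sigma_v ** 2
    mu_star = -eps * sigma_u ** 2 / sigma2
    sigma_star = sigma_u * sigma_v / math.sqrt(sigma2)
    z = mu_star / sigma_star
    return mu_star + sigma_star * norm_pdf(z) / norm_cdf(z)


# Hypothetical composed residuals: more negative values imply more inefficiency.
for eps in [-0.8, -0.2, 0.0, 0.3]:
    print(eps, round(jlms_half_normal(eps, sigma_u=0.5, sigma_v=0.2), 4))
```

The estimate is always positive and shrinks as the residual becomes less negative, which matches the intuition that output far below the frontier is attributed mostly to inefficiency rather than noise.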
The development of the stochastic frontier model has extended to the consideration of sample structure. Yélou et al. (2010) demonstrated, based on an empirical study of the dairy industry in Quebec, Canada, that the threshold effect of the industry should be considered when using the stochastic frontier model. This means that heterogeneity generated by different groups can lead to distorted results. Lee and Lee (2014) proposed a truncated tail SFM, which optimized the complexity of calculations, and concluded through Monte Carlo simulations that "this model approximates the distribution of inefficiency precisely, as the data-generating process not only follows the uniform distribution but also the truncated half-normal distribution if the inefficiency threshold is small" (p. 1). Tsionas (2019) proposed a stochastic frontier model that simultaneously considers technological heterogeneity and universality, making it possible to take into account the production efficiency of different countries. Chen (2014) proposed dividing the industry into several groups based on technological level and measuring the efficiency of enterprises in each group through empirical studies of the computer industry in Taiwan, China. Mastromarco et al. (2012) used the PCCE (Pooled Common Correlated Effects) estimation method to analyze foreign direct investment (FDI) and other indicators of openness in a stochastic frontier model, overcoming the potential bias problem of fixed effects in panel stochastic frontier models.
The SFM not only has the advantage of error handling but also has better actual measurement efficiency than the DEA. Specifically, the choice of production function plays a decisive role in the performance of DEA and SFM, indicating that SFM has better estimation results when the relevant production function is clearly selected, while DEA performs better when the function form is severely non-standard (Gong and Sickles, 1992). Since there is a rank correlation between SFM estimation and the maximum likelihood combination error, this means that the stochastic frontier adjustment of the combined error will not change its ranking, and therefore will not change the ranking results of the deterministic model based on maximum likelihood estimation.
Some domestic studies have used the SFM to measure the efficiency of regional R&D (Research and Experimental Development) innovation, financial development of "Belt and Road" countries, and Chinese OFDI (Outward Foreign Direct Investment) (Bai et al., 2009;Zhang and Yang, 2020). Given these advantages of the SFM in measuring technical efficiency, we also adopt it as our measurement model.

Samples and data
We first need to select appropriate data on digital finance. At present, there are two representative datasets on digital finance in China: one is the China Digital Inclusive Finance Index developed by the Internet Finance Research Center of Peking University, and the other is the urban digital finance index jointly developed by the Guangzhou Institute of International Finance and Guangzhou University (Guo et al., 2020;Liao et al., 2022). The former includes three first-level indicators: coverage scope, service depth and degree of digitization. It is compiled from Ant Financial user data from 2011 to 2020 and reveals the development of digital finance in 31 provinces, 337 prefecture-level cities and 2,800 counties in China. It has been widely recognized and used in the academic community. The latter builds an urban digital finance index system along three dimensions (digital financial services, digital financial technology and the digital financial operating environment), analyzes the overall and spatial characteristics of urban digital finance, and reveals the development of digital finance in 278 cities at or above the prefecture level in China from 2010 to 2020.
Based on our research objectives and data availability, we choose the 31 provinces of China as the sample. Referring to the work of Fu and Huang (2018) and Zhang and Hu (2022), we use the China Digital Inclusive Finance Index as the digital finance output data (the explained variable). Because the index has only been updated to 2020, the sample period is 2011-2020 (see Table 1). Table 1 shows that the five provinces of Shanghai, Beijing, Zhejiang, Guangdong and Fujian are among the best in digital finance development. It also shows that the index differs significantly across provinces, with the provincial mean ranging from 185.91 to 280.96 and the standard deviation ranging from 91.30 to 112.76. Furthermore, the distribution of the index across provinces is negatively skewed, that is, its longer tail lies on the left. These results suggest notable differences in digital finance development across provinces in China.
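To make the descriptive statistics above concrete, the following sketch computes the mean, the population standard deviation and the moment-based skewness of a series. The sample values are hypothetical and chosen only to show a left-tailed distribution; they are not taken from Table 1, and the paper does not state which skewness formula it uses.

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def std_dev(values):
    """Population standard deviation (divides by n)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def skewness(values):
    """Moment-based skewness: third central moment over the cubed std. deviation."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical index values with one low outlier, i.e. a longer left tail.
sample = [1.0, 8.0, 9.0, 10.0, 10.0]
print(mean(sample), std_dev(sample), skewness(sample))
```

A negative result from `skewness` corresponds to the left-tailed shape described in the text, while a symmetric series yields a value of zero.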
Referring to the regional classification standards of the National Bureau of Statistics in 2017, we assign the 11 provinces of Beijing, Tianjin, Liaoning, Shandong, Jiangsu, Shanghai, Zhejiang, Fujian, Hebei, Guangdong and Hainan to the eastern region; the 8 provinces of Jilin, Heilongjiang, Anhui, Jiangxi, Henan, Hubei, Hunan and Shanxi to the central region; and the 12 provinces of Guangxi, Shaanxi, Gansu, Qinghai, Neimenggu, Xinjiang, Xizang, Sichuan, Guizhou, Yunnan, Ningxia and Chongqing to the western region. We further examine the average level of digital finance development in the whole country and in the eastern, central and western regions, and find that the average level of the eastern region is consistently high, reflecting a first-mover advantage, while the development of digital finance in the central and western regions lags behind and is lower than the national average. The input data (explanatory variables) of digital finance development in each province are divided into two parts: capital input and labor input. The capital input is further divided into fiscal input and financial institution input. The former is represented by R&D investment in each province, and the latter is represented by the capital input of the banking industry (accounting for about 68.6% of the capital input of all financial institutions), the insurance industry (about 18%) and the securities industry (about 12%) in each province. The labor input is represented by the number of financial employees in each province. The input data are mainly collected from national and provincial statistical yearbooks and the RESSET database.
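The regional grouping described above can be encoded as a simple lookup, which also makes it easy to verify that the three groups contain 11, 8 and 12 provinces and cover all 31 without overlap. This is an illustrative sketch: the names `REGIONS`, `PROVINCE_TO_REGION` and `regional_means` are hypothetical, the spellings Shanxi (central) and Shaanxi (western) follow standard pinyin, and the example values are made up.

```python
# Province lists as given in the text (Shanxi is central, Shaanxi is western).
REGIONS = {
    "eastern": ["Beijing", "Tianjin", "Liaoning", "Shandong", "Jiangsu", "Shanghai",
                "Zhejiang", "Fujian", "Hebei", "Guangdong", "Hainan"],
    "central": ["Jilin", "Heilongjiang", "Anhui", "Jiangxi", "Henan", "Hubei",
                "Hunan", "Shanxi"],
    "western": ["Guangxi", "Shaanxi", "Gansu", "Qinghai", "Neimenggu", "Xinjiang",
                "Xizang", "Sichuan", "Guizhou", "Yunnan", "Ningxia", "Chongqing"],
}

# Reverse lookup: province name -> region name.
PROVINCE_TO_REGION = {p: region for region, provinces in REGIONS.items() for p in provinces}


def regional_means(values):
    """Average a {province: value} mapping within each region that has data."""
    totals, counts = {}, {}
    for province, value in values.items():
        region = PROVINCE_TO_REGION[province]
        totals[region] = totals.get(region, 0.0) + value
        counts[region] = counts.get(region, 0) + 1
    return {region: totals[region] / counts[region] for region in totals}


# Hypothetical values for three provinces, only to show the call.
print(regional_means({"Beijing": 2.0, "Shanghai": 4.0, "Shanxi": 1.0}))
```

Because the reverse lookup is a dictionary, a province listed in two regions would silently collapse to one entry, so checking that it has 31 keys is a quick consistency test of the classification.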

Model setting
At present, the C-D function and the translog production function (translog function) are the main functional forms used in the SFM, and each has its own advantages and drawbacks (Christensen et al., 1973). The C-D function is simple and easy to understand. Its parameters can be interpreted directly as the output elasticities of capital and labor, so its economic meaning is clear. The translog function takes fuller account of the interaction between capital and labor and of their effect on real output, which relaxes the restriction that the elasticity of substitution of the C-D function equals 1.
Referring to the work of Battese and Coelli (1995), when considering only the two inputs of capital (K) and labor (L), we construct the SFM based on the C-D function as shown in Equation (1).

$$\ln Y_{it} = \beta_0 + \beta_1 \ln K_{it} + \beta_2 \ln L_{it} + v_{it} - u_{it} \tag{1}$$

where the subscript $i$ represents each sample province, and its value range is 1-31. The subscript $t$ represents each year, and its value range is 2011-2020. The variable $\ln Y_{it}$ is the logarithm of the digital financial inclusion index of each province, which is the output term of the production function. The variables $\ln K_{it}$ and $\ln L_{it}$ are the logarithms of capital and labor of each province, and are the input terms of the production function. The term $u_{it}$ is a non-negative stochastic variable related to technical inefficiency, and $v_{it}$ captures observation error and other stochastic factors. The intercept $\beta_0$ and the coefficients $\beta_1$ and $\beta_2$ are the parameters to be estimated.
The translog function can be regarded as an extension of the C-D function, and its form is shown in Equation (2).

$$\ln Y_{it} = \beta_0 + \beta_1 \ln K_{it} + \beta_2 \ln L_{it} + \beta_3 (\ln K_{it})^2 + \beta_4 (\ln L_{it})^2 + \beta_5 \ln K_{it} \ln L_{it} + v_{it} - u_{it} \tag{2}$$

The meaning of each parameter in Equation (2) is the same as that in Equation (1). The translog function can be regarded as an approximate second-order Taylor expansion of the output function $f(\ln K, \ln L)$ at the point $(0,0)$. If $\beta_3 = \beta_4 = \beta_5 = 0$, the translog function reduces to the C-D function. Based on this, we choose the translog function as the benchmark model for subsequent estimation.
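The nesting relationship between the two functional forms can be checked numerically. The sketch below evaluates only the deterministic part of the frontier (without the noise and inefficiency terms), assuming the translog form has squared terms in each input and a single interaction term. The coefficient values are hypothetical, and the elasticity functions are simply the partial derivatives of that assumed form.

```python
def cobb_douglas_log_output(ln_k, ln_l, beta):
    """Deterministic part of the C-D frontier in logs; beta = (b0, b1, b2)."""
    b0, b1, b2 = beta
    return b0 + b1 * ln_k + b2 * ln_l


def translog_log_output(ln_k, ln_l, beta):
    """Deterministic part of the translog frontier; beta = (b0, ..., b5)."""
    b0, b1, b2, b3, b4, b5 = beta
    return (b0 + b1 * ln_k + b2 * ln_l
            + b3 * ln_k ** 2 + b4 * ln_l ** 2 + b5 * ln_k * ln_l)


def capital_elasticity(ln_k, ln_l, beta):
    """Output elasticity of capital: derivative of ln Y with respect to ln K."""
    _, b1, _, b3, _, b5 = beta
    return b1 + 2.0 * b3 * ln_k + b5 * ln_l


def labor_elasticity(ln_k, ln_l, beta):
    """Output elasticity of labor: derivative of ln Y with respect to ln L."""
    _, _, b2, _, b4, b5 = beta
    return b2 + 2.0 * b4 * ln_l + b5 * ln_k


# Hypothetical coefficients; setting b3 = b4 = b5 = 0 recovers the C-D case.
restricted = (0.5, 0.4, 0.3, 0.0, 0.0, 0.0)
general = (0.5, 0.4, 0.3, 0.05, -0.02, 0.01)
print(translog_log_output(1.2, 0.7, restricted), cobb_douglas_log_output(1.2, 0.7, restricted[:3]))
print(capital_elasticity(1.2, 0.7, general), labor_elasticity(1.2, 0.7, general))
```

In the restricted case the elasticities are the constants `b1` and `b2`, whereas in the general case they vary with the input levels, which is the extra flexibility the translog form provides.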

Descriptive statistics
The descriptive statistical results of each variable are shown in Table 2. The values of $\ln Y$, $\ln K$ and $\ln L$ are stable and can be estimated directly.

Model selection
(1) Individual Effect and First-order Difference Autocorrelation Test
In order to reveal the development state of digital finance in each province, we number the 31 provinces in Table 1 sequentially from 1 to 31 and assign each a dummy variable. We use the least squares dummy variable (LSDV) model to examine the individual-specific intercepts, and show the results in Table 3.
The test results show that the p-values of most individual dummy variables are 0.000, that is, the dummies are statistically significant, so the null hypothesis that all individual dummy coefficients are 0 can be rejected. We therefore conclude that there is an individual effect, so the pooled (mixed) regression model is not suitable for the data.
Then we use the first-order difference method to test the autocorrelation of the panel data, and find that the first-difference estimator (FD) is quite different from the within-group estimator (FE). Since the p-value of the first-order autocorrelation test is 0.0000, the null hypothesis of no first-order autocorrelation is rejected, which indicates that the disturbances are serially correlated.
(2) Time Effect Test
In order to test whether there is a time effect in the model, we define annual dummy variables numbered 1 to 10 to represent the years 2011-2020 sequentially. Taking 2011 as the base period, we show the test results in Table 4.
Note: *, ** and *** represent statistical significance at the 10%, 5% and 1% levels, respectively; t-values are in brackets.
The coefficients of the annual dummy variables are significantly negative, which preliminarily indicates that there is a time effect. Furthermore, we conduct a joint test of whether all annual dummy coefficients are 0, obtaining F(9,30) = 66.18 and Prob > F = 0.000. Thus, we reject the null hypothesis and accept that there is a time effect in the model.
(3) SFM with Mixed Effect and Fixed Effect Test
Firstly, we estimate the pooled (mixed-effect) model for Equation (2) and show the results in Table 5. Most coefficients are statistically significant, suggesting that the overall explanatory power of the model is high.
Since the development of digital finance differs across provinces, there may be omitted variables that do not change over time. Therefore, we further estimate the fixed effect model with robust standard errors for Equation (2), and also show the results in Table 5. In this case, ρ = 0.981, meaning that the variance of the compound disturbance term of the model mainly comes from the individual effect. Comparing the results of the two models, since the p-value of the fixed effect model is 0.0000, we reject the null hypothesis that the individual effects are all 0 and finally select the fixed effect model. This means that each province has its own intercept term.

Estimation of efficiency term
We enter the input-output data into Frontier 4.1 (software designed for stochastic frontier analysis), obtain the maximum likelihood estimation (MLE) results of the C-D function and translog function models respectively, and show the results in Table 6. Most parameter estimates are statistically significant at the 10% level, suggesting that both models have good explanatory power. The $\gamma$ values of the two models are close to 1, indicating that the differences between the real output and the ideal output are mainly caused by the technical inefficiency term. The test result is consistent with our expectation. In addition, the technical inefficiency factor accounts for more than 90% of the variance of the joint stochastic disturbance term $v_{it} - u_{it}$, indicating that the error variance comes mostly from the technical inefficiency term and only slightly from the stochastic error. Therefore, it is reasonable to choose the stochastic frontier production function with a technical inefficiency term.
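The variance share discussed above is commonly expressed through the following parameterization, which is the one reported by Frontier 4.1. The symbols $\sigma_u^2$ and $\sigma_v^2$, denoting the variance parameters of the inefficiency and noise terms, are notation assumed here for illustration.

```latex
% Variance parameterization commonly used in stochastic frontier estimation:
% sigma_u^2 = variance parameter of the inefficiency term u
% sigma_v^2 = variance of the noise term v
\sigma^2 = \sigma_u^2 + \sigma_v^2, \qquad
\gamma = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_v^2}, \qquad 0 \le \gamma \le 1
```

A value of $\gamma$ near 1 indicates that deviations from the frontier are dominated by inefficiency, while a value near 0 indicates that they are mostly noise. Under a half-normal assumption the variance of $u$ itself is $(1 - 2/\pi)\,\sigma_u^2$, so $\gamma$ is best read as an indicator of the relative weight of inefficiency rather than as an exact variance share.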
After obtaining the above results, we use the log-likelihood values to conduct a likelihood ratio test. The value of the one-sided likelihood ratio test statistic is 27.26, which is larger than the critical value of 11.345 for 3 degrees of freedom at the 1% significance level. Therefore, the null hypothesis $\beta_3 = \beta_4 = \beta_5 = 0$ is rejected, and the translog function is finally selected. However, the estimated annual digital finance development efficiency of each province may still be inexact because the inefficiency term is not yet included. Next, we estimate the inefficiency term to obtain a more accurate value of total efficiency.
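The model-selection step above can be reproduced with a few lines of arithmetic. The sketch below takes the reported statistic of 27.26 and the critical value of 11.345 from the text, and uses a plain chi-square reference distribution with 3 degrees of freedom, as the text does. The two log-likelihood values passed to `lr_statistic` are hypothetical, chosen only so that the function returns the reported statistic; they are not the values from Table 6.

```python
import math


def lr_statistic(loglik_restricted, loglik_unrestricted):
    """Likelihood ratio statistic: -2 * (ln L_restricted - ln L_unrestricted)."""
    return -2.0 * (loglik_restricted - loglik_unrestricted)


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom.

    For 3 degrees of freedom the tail has the closed form
    erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    if x <= 0.0:
        return 1.0
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


LR_REPORTED = 27.26          # statistic reported in the text
CRITICAL_1PCT_DF3 = 11.345   # critical value quoted in the text

reject_cobb_douglas = LR_REPORTED > CRITICAL_1PCT_DF3
p_value = chi2_sf_df3(LR_REPORTED)
print(reject_cobb_douglas, p_value)

# Hypothetical log-likelihoods (C-D restricted, translog unrestricted).
print(lr_statistic(100.0, 113.63))
```

Evaluating `chi2_sf_df3` at the quoted critical value returns approximately 0.01, which confirms that 11.345 corresponds to the 1% level for 3 degrees of freedom.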

Estimation of inefficiency term
We estimate Equation (2) with the OLS model, the stochastic frontier half-normal model and the stochastic frontier exponential model, respectively, and show the results in Table 7.
Comparing the results of the three models, we find that the fit of the OLS model is relatively poor, so we abandon it. The variance parameter obtained from the stochastic frontier half-normal model is larger, indicating that the inefficiency term plays a dominant role in the compound disturbance term, so the null hypothesis of no inefficiency term should be rejected. In addition, we can also reject the null hypothesis according to the p-value and the likelihood ratio test. That is to say, an inefficiency term exists, and the SFM can be used. Finally, we test the stochastic frontier exponential model and draw the same conclusion that the null hypothesis should be rejected.
According to the above test results, we use the stochastic frontier half-normal model to estimate the technical inefficiency term of digital finance development in each province. The results are shown in Table 8. The results show that the technical inefficiency term of digital finance development in each province generally declines at the beginning of the sample period and then rises. Since 2013, the technical inefficiency term in the eastern region has been the first to rise and has remained higher than that in the central and western regions, indicating that the marginal input-output contribution of digital finance in the eastern region is declining.
To ensure the robustness of the results, we also use the stochastic frontier exponential model to estimate the technical inefficiency term. Comparing the estimation results of the two models, we find that their correlation coefficient is as high as 91.5%, indicating that they are highly correlated and that the estimation results of both models are reasonable. Therefore, we substitute the estimates of the inefficiency term in Table 8 into Equation (2) and obtain new coefficient estimates with Frontier 4.1 (see Table 9). The results show that after adding the inefficiency term, the $\gamma$ value rises and most coefficients are statistically significant, indicating that the regression results and the model setup are reasonable.
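The robustness comparison described above rests on two simple computations: correlating the inefficiency estimates from the two distributional assumptions, and converting an inefficiency estimate into an efficiency score. The sketch below assumes the common convention for log-form production frontiers that technical efficiency equals $\exp(-u)$, which the text does not state explicitly. The two series of inefficiency estimates are hypothetical and are not the values in Table 8.

```python
import math


def technical_efficiency(u):
    """Map a non-negative inefficiency estimate u to an efficiency score in (0, 1]."""
    if u < 0.0:
        raise ValueError("the inefficiency term must be non-negative")
    return math.exp(-u)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical inefficiency estimates from two distributional assumptions.
u_half_normal = [0.10, 0.25, 0.40, 0.15, 0.60]
u_exponential = [0.08, 0.22, 0.45, 0.18, 0.52]

print(pearson(u_half_normal, u_exponential))
print([round(technical_efficiency(u), 3) for u in u_half_normal])
```

A correlation close to 1 between the two series suggests that the ranking of units is not sensitive to the distribution assumed for the inefficiency term, which is the sense in which the text treats the half-normal estimates as robust.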

Analysis of estimation results
After completing the above tests, we finally obtain the estimates of the digital finance development efficiency of each province and show them in Table 10. The results in Table 10 reveal the following facts.
Firstly, from the vertical (time-series) perspective, the development efficiency of digital finance in each province rises rapidly at the beginning of the sample period and then declines steadily and slowly. The development efficiency of digital finance in most provinces reached its maximum during 2012-2014 and declined continuously from 2016. The cause may be that digital finance in each province receives more support from policies and markets at the initial stage of development, so the marginal revenue of the industry is higher and the output-input ratio is relatively larger. After a period of rapid development, the industry tends toward saturation and marginal revenue decreases. The same input brings less output, so efficiency gradually decreases. Additionally, in order to regulate the industry and curb disorderly expansion, some government intervention appears and may limit rapid expansion. Ultimately, while the development level of digital finance in each province continues to rise slowly, its efficiency gradually declines.
Secondly, from the horizontal (cross-sectional) perspective, the development efficiency of digital finance across provinces has evolved from divergence to convergence and then to divergence again. Provinces with a higher development level of digital finance generally have higher development efficiency at the beginning of the sample period. For instance, the five provinces of Shanghai, Beijing, Zhejiang, Guangdong and Fujian have a relatively higher development level of digital finance initially, and their development efficiency is also higher. Development efficiency and development level reinforce each other, indicating that the development of digital finance may initially exhibit a scale effect and a Matthew effect, so it is important to obtain a first-mover advantage. Thereafter, although the development level of digital finance in those provinces continues to rise slowly, their development efficiency declines rapidly over time. By 2020, the development efficiency of these five provinces had fallen from the top five to the bottom five and was lower than its initial value. By contrast, provinces such as Qinghai, Guizhou, Gansu and Xinjiang had a relatively lower development level of digital finance in the early stage, but by 2020 their development efficiency ranked at the top of all provinces and was far higher than its initial value. The result is consistent with the aforementioned vertical analysis. The cause may again be that as the development level of digital finance improves, its marginal product declines.
Finally, from the regional perspective, the development efficiency of digital finance in the eastern, central and western regions also first rises and then declines over the sample period (see Figure 2). The development efficiency of digital finance in the eastern region is the highest and grows fastest at the beginning of the sample period, but it also declines fastest after 2013. The development efficiency of digital finance in the central and western regions starts at a lower level and declines relatively slowly after reaching its maximum. This result is also consistent with the aforementioned vertical analysis, which again shows that the development potential of digital finance in the western region is greater.

Conclusions
Over the past decade, digital finance has developed significantly in all provincial regions in China, providing a solid foundation and strong impetus for sustainable economic development. Using the digital financial inclusion index of 31 provincial regions across China for each year from 2011 to 2020 as the output indicator, we measured the digital financial development efficiency of each region with the SFM and revealed its time-series evolution and spatial distribution differences. The results show that the digital financial development level of each provincial region generally rose continuously, whereas digital financial development efficiency rose rapidly and then declined slowly, and first converged and then diverged across regions. The digital financial development efficiency of provinces with a higher digital financial development level, and of the eastern region, was higher at the beginning of the period and declined faster after reaching its peak, so that its value was already lower than the national average by the end of the period. The imbalance in digital financial development efficiency among provincial regions tends to widen, and the central and western regions have greater potential.
The findings of this paper offer the following insights. Firstly, digital finance development plays a significant role, and in order to further promote it in all provinces in China, efforts should continue to curb the rapid decline of digital finance development efficiency in all regions (especially in the eastern provinces) and to reduce the unevenness of development efficiency among regions. Secondly, because the digital finance development efficiency of the eastern region declines faster after reaching its peak while the central and western regions have greater potential for development efficiency, resources should be allocated reasonably and guided to flow where they are most productive, so as to continuously improve the quality of digital financial development in each region. Thirdly, there are significant differences in the level and efficiency of digital financial development among regions, so the central government should strengthen overall planning and promote the coordinated development of each region through functional positioning, policy inclination and cross-regional mutual assistance.

Use of AI tools declaration
No artificial intelligence (AI) tools were used in the creation of this article.