Research on the Effects of the Geographical Adjacency and Informatization Level on Input and Output of Regional Innovation - Based on a Spatial Econometrical Analysis of the 21 Cities in Guangdong Province

Based on knowledge diffusion theory, this paper uses data for the 21 cities in Guangdong province from 2006 to 2008 to conduct an exploratory data analysis, constructs an input-output model of innovation that takes the informatization level of a region into account, estimates both a common econometric model and spatial econometric models based on geographical adjacency between regions, and then compares the estimation results of these models. The analysis shows that the innovation outputs of the 21 cities are spatially correlated; that the econometric model accounting for adjacency is more precise than the common one; that the accumulation of capital for innovation is the dominant engine of innovation; and that labor input and improvement of the informatization level stimulate innovation only weakly. However, when the informatization level is taken into account, the results of the new model are more precise. Based on the empirical research, this paper proposes several policy recommendations.


Introduction
Generally, studies of the relation between the input and output of innovation do not consider spatial factors. However, the creative activities of contiguous regions are often related to each other, a phenomenon that can be explained either by the externalities of the agglomeration economy or by the spatial stickiness of tacit knowledge. Both explanations emphasize the importance of knowledge as a part of the innovation capital that drives innovation production. At the same time, improving the informatization level of a region could raise the efficiency of its knowledge flow and thus enhance its innovation ability. So, when examining the input and output of regional innovation, taking both geographical adjacency and informatization level into account may yield more exact conclusions.
As an early experimental area of the reform and opening-up policy, Guangdong province currently faces both domestic and international challenges. The latter refers to the emergence of trade protectionism and the fierce competition for markets, resources, talent, technology, standards, etc.; the former is the special domestic environment, characterized by the low position of the industrial system in the world industry chain, the lack of strong independent innovation ability, and the pressure of a huge population and of the environmental burden of extensive economic growth, all of which call for a transformation of the development approach. So, building links between the creative activities of the cities, making use of the comparative advantages of different cities, narrowing the gap in innovation ability between cities and enhancing the integrated innovation ability of the whole region may pave the way for Guangdong province to face these challenges.
First, this paper applies Exploratory Data Analysis (EDA) to the relevant data of the 21 prefecture-level cities in Guangdong province to test whether the creative activities of different cities are spatially correlated. Then, by comparing the results of an OLS model without spatial factors with those of spatial econometric models, we check this test further. Finally, we introduce an indicator reflecting regional informatization into the common innovation input-output model and compare the common spatial econometric model with the adjusted one, which helps to show how the informatization level affects regional innovation production.

Review of References
Both foreign and domestic scholars have made a series of studies of the effects of spatial correlation on innovation input and output, which offer references for later studies. Some of the literature discusses how the regional informatization level affects innovation production. This paper classifies these studies into three topics: the relation between geographical adjacency and regional innovation, the relation between geographical distance and regional innovation, and the relation between informatization level and regional innovation. Fritsch and Slavtchev (2007) [1] find that the relation between a college and innovation depends on whether enterprises are adjacent to the college; they give an explanation of this kind of adjacency and also underline that the quality of a college's research is strongly correlated with the interaction between the college and contiguous enterprises. Yuming Zhang and Kai Li (2007) [2] applied exploratory data analysis to the innovation output data of 31 Chinese provinces and showed that the spatial distribution of regional innovation output is not a random process. Ciriaci and Palma (2008) [3] include geographical adjacency in their model when estimating regional export competitiveness; through a spatial econometric analysis, they find that geographical adjacency promotes the flow of knowledge between two regions. Wan and Yan (2009) [4] constructed a new knowledge production function that treats venture capital as one of the independent variables; their empirical analysis using the SLM model showed that the spatial flow of venture capital is vital to regional innovation.
Yuming Wu (2010) [5] designed two indicator systems representing the innovation ability of colleges and the quality of the regional innovation environment respectively, and calculated scores for the two systems by factor analysis; the research indicates that the effects of the regional innovation environment on the ability of colleges appear in an agglomerated form. Xiaoye Qian et al. (2010) [6] analyzed, allowing for spatial dependence, how human capital affects economic growth through innovation; their study indicated that the proportion of employees with higher education promotes regional innovation but has little effect on economic growth. Jing [7][8] constructed two matrices based on geographical distance and economic distance respectively and carried out an empirical study using both static and dynamic spatial panel data models; a significant spatial correlation of Chinese regional innovation was found, and location and socio-economic characteristics were shown to be the engines that promote innovation production and build spatial correlation.
Geographical proximity between regions is another research perspective beyond geographical adjacency; it studies how the geographical distance between innovation agents affects their innovation behavior. Yuan Shu and Guowei Cai (2007) [9] analyzed technology upgrading and the trend of its spatial diffusion, and confirmed technology diffusion from Beijing, Shanghai and Guangzhou to other Chinese provinces; moreover, the strength of technology diffusion depends on geographical distance. Rodríguez-Pose and Crescenzi (2008) [10] integrated research, knowledge spillover, the regional innovation system and regional economic growth into one model; they find a knowledge self-constrained area with a radius of 200 kilometers: if a contiguous region lies within this radiation zone, it can share the knowledge spillover of the core area and improve its ability of innovation production. Zhang et al. (2009) [11][12][13] take an ecological perspective to study the relationships between technology communities and how these relate to the economic growth of the communities. The study shows that regional community density, a community's geographical proximity to the nearest community and its domain overlap with the nearest community each have an inverted U-shaped relationship with the community's growth.
In summary, scholars all over the world have made many studies of the relationships between geographical adjacency, geographical distance, informatization level and regional innovation. Their efforts supply many references that form the foundation of this paper.

Information Technology for Manufacturing Systems III
However, previous studies have a few defects. Firstly, when studying the relationship between innovation input and output and geographical adjacency, previous studies mostly took provinces rather than cities as samples, which neglects the fact that there is strong heteroskedasticity among provinces, especially in developing countries such as China. Secondly, with informatization more advanced than ever, explicit knowledge can flow beyond geographical frontiers, which may enlarge the range of sources of spatial correlation: explicit knowledge produced by creative activities in areas not adjacent to the host area can also be transmitted to it through the information network. Previous studies may therefore be biased because they do not consider spatial correlation completely. Thirdly, by raising the circulation efficiency of knowledge, the advance of the informatization level could improve regional innovation ability; yet previous studies of the informatization level seem to concentrate only on its relationship with economic growth and neglect its relationship with regional innovation. So, in this paper, we take the 21 prefecture-level cities in Guangdong province as the research sample and study the relationship between the input and output of regional innovation and geographical adjacency. We also design an indicator of informatization level and treat it as an important explanatory variable in our empirical model, which allows a further study of the relationship between informatization and regional innovation production.

Explanation of Indicator Design and Data Sources
Explanation of Indicators. To measure the innovation output of a city, this paper uses the number of patents granted, denoted $zlsq_{it}$, where i and t index the city and the year respectively (the two subscripts keep this meaning throughout the paper). According to new growth theory, the two main elements that drive the growth of innovation production are the specialized inputs of labor and capital. We use the converted full-time equivalent of R&D personnel to represent the labor input and the funds for S&T activities to represent the capital input. The data are taken from the statistical yearbooks on science and technology of the corresponding years; however, data for a few cities are missing from these yearbooks, so we use the statistical yearbooks of those cities to fill the gaps. In addition, we extract the per capita business volume of postal and telecommunication services, which reflects the informatization of a city, from the Guangdong statistical yearbooks of these years and convert it to a usable form.

Analytical Approach and Model Construction
Exploratory Data Analysis. Exploratory Data Analysis (EDA) uses Moran's I as a tool to measure the relationship between the attributes of different objects distributed in space. The value of Moran's I ranges from -1 to 1. A positive value indicates that the attributes of the study objects are positively spatially correlated; a negative value indicates that they are negatively spatially correlated. When it is 0, no spatial correlation can be found between the attributes of objects at different locations, namely the objects are randomly distributed. Moran's I and its standardized statistic, written here in their standard form, are as follows:
\[
I = \frac{n}{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}} \cdot \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\,(x_i-\bar{x})(x_j-\bar{x})}{\sum_{i=1}^{n}(x_i-\bar{x})^{2}}, \qquad z = \frac{I - E(I)}{\sqrt{\mathrm{Var}(I)}}
\]
In the above formulas, $n$ is the number of cities and $\bar{x}$ is the mean of the $x_i$; $x_i$ denotes the value of an attribute of the i-th city, here the logarithm of the number of patents granted to the i-th city; $w_{ij}$ stands for the adjacency relationship between two spatial units: it equals 1 if spatial unit i is adjacent to spatial unit j, and 0 otherwise. The null hypothesis H0 is that there is no spatial correlation between the spatial units studied. If H0 holds, the global Moran's I approximately follows a normal distribution, so we can use the z value to test H0. When z > 1.96, we reject H0 at the 5% significance level and accept that the attributes of the spatial units are spatially correlated.
The Econometrical Model. We design three kinds of econometric models in this paper. The first is the common regression model, which neglects spatial correlation; its parameters are estimated by OLS. The second kind takes spatial correlation into account and comprises two regression models: the spatial lag model (SLM) and the spatial error model (SEM).
The third kind considers both spatial correlation and the regional informatization level, adding a new explanatory variable to SLM and SEM respectively. The first model is based on the innovation production function of new growth theory; the third combines new growth theory with a production function that regards informatization as one kind of input. In practice, it also reflects the fact that the advance of the informatization level improves the circulation efficiency of knowledge, which can stimulate the increase of innovation output. The innovation production function of new growth theory is equation (1), and the production function containing the informatization level is equation (2). In equation (1), $Y$ refers to the output of innovation, and $L$ and $K$ stand for the input of labor and the input of capital in the R&D departments respectively. In equation (2), $Y$, $L$ and $K_O$ refer to the output of the economy, the corresponding input of labor and the input of capital in the economy respectively, and $K_I$ stands for information capital in the economy (which the informatization level can reflect). Considering also the significance of informatization to knowledge circulation, we construct the basic theoretical model of innovation input and output, which shares the form of equation (2), except that $Y$, $L$, $K_O$ and $K_I$ refer to variables in the R&D sector only, not the whole economy. Taking the total differential of equation (2) and dividing both sides by $Y$ gives equation (3). In equation (3), $g_O$, $g_I$ and $g_L$ refer to the growth rates of the input of capital specialized in innovation, the input of information and the input of labor specialized in innovation respectively, and $g_{TFP}$ stands for the growth rate of the conversion level from input to output of innovation, which is a function of knowledge accumulation and technology level. We further assume that $g_{TFP}$ is constant and that the theoretical model has a Cobb-Douglas form, which turns equation (3) into equation (4).
According to Table 1, there is significant spatial correlation between the innovation outputs of different cities: the z value of the corresponding Moran's I is significant at the 1% level. Because all values of Moran's I are positive, the innovation production activities of different cities, more precisely of cities adjacent to each other, are positively spatially correlated, which means that the R&D activities of a city affect its neighbors' R&D activities. Cities with high R&D intensity are surrounded by cities with high R&D intensity, and cities with low R&D intensity are surrounded by cities with low R&D intensity. The scatter diagrams for 2006 to 2008 (Figure 1) further confirm the spatial correlation of the creative activities of different cities.
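The global Moran's I test described in the Exploratory Data Analysis paragraph can be sketched in a few lines of Python. This is a minimal illustration, not the authors' GeoDa computation: the function name `morans_i`, the four-city adjacency matrix and the attribute values are hypothetical, and the z value uses the usual moments of Moran's I under the normality assumption, which the paper does not spell out.

```python
import math


def morans_i(x, w):
    """Return (I, z) for attribute values x and a binary adjacency matrix w.

    x: list of floats, one value per spatial unit (e.g. log patents granted).
    w: n-by-n list of lists with w[i][j] = 1 if units i and j are adjacent, else 0.
    The z value assumes the normality-based mean and variance of Moran's I.
    """
    n = len(x)
    mean = sum(x) / n
    dev = [xi - mean for xi in x]

    s0 = sum(w[i][j] for i in range(n) for j in range(n))  # sum of all weights
    den = sum(d * d for d in dev)
    if s0 == 0 or den == 0:
        raise ValueError("need at least one adjacency and non-constant values")

    num = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    i_stat = (n / s0) * (num / den)

    # Expected value and variance of I under the null of no spatial correlation.
    e_i = -1.0 / (n - 1)
    s1 = 0.5 * sum((w[i][j] + w[j][i]) ** 2 for i in range(n) for j in range(n))
    s2 = sum(
        (sum(w[i]) + sum(w[j][i] for j in range(n))) ** 2 for i in range(n)
    )
    var_i = (n * n * s1 - n * s2 + 3 * s0 * s0) / ((n * n - 1) * s0 * s0) - e_i ** 2
    z = (i_stat - e_i) / math.sqrt(var_i)
    return i_stat, z


# Hypothetical example: four cities on a line (1-2, 2-3, 3-4 adjacent).
# The values are made up and are not the Guangdong data.
w_example = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
x_example = [1.0, 2.0, 3.0, 4.0]
i_example, z_example = morans_i(x_example, w_example)
print(f"Moran's I = {i_example:.3f}, z = {z_example:.3f}")
```

With the 21-city adjacency matrix and the logged patent counts in place of the toy inputs, a z value above 1.96 would correspond to rejecting the null hypothesis of no spatial correlation at the 5% level.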
Advanced Engineering Forum Vols. 6-7
The Analysis of Econometrical Results. Applying the spatial statistics software GeoDa to our sample data set, we obtain the following estimation tables (Table 2, Table 3):
Robust LM(error) 2.585 (0.107)* 2.700 (0.100)*
Note: *, ** and *** denote statistical significance at the 10%, 5% and 1% levels respectively. OLS, OLS+I, SLM, SLM+I, SEM and SEM+I denote the types of regression model estimated; the suffix +I means that the informatization level has been included in the model.
To check for multicollinearity between the explanatory variables, which would affect the efficiency of the estimates, we take the set of all explanatory variables and regress each variable in turn on the remaining ones, until every variable in the set has served as the dependent variable once. We then calculate the variance inflation factor $VIF_i$ for each variable. All the VIF values are less than 10, so we need not worry about multicollinearity. The specific analysis of the estimation results is as follows:
(1) In terms of the coefficients of the explanatory variables, all the coefficients of the funds for S&T activities, which stand for the elasticity of innovation output with respect to capital, are significant at the 1%-5% level, which means that capital specialized in innovation is the main engine that promotes the growth of innovation output.
Although there is no serious multicollinearity among the explanatory variables, regressing the non-converted full-time equivalent of R&D personnel on the funds for S&T activities across cities in the same year shows that the two are strongly linearly related. This may indicate that the 21 prefecture-level cities in Guangdong province as a whole lack a salary-based mechanism for motivating personnel; a mechanism based on salary differences could reduce the linear relation between the two variables through the gap between the salary levels of different groups of personnel. At the same time, the coefficients of the converted full-time equivalent of R&D personnel are positive in all models but not significant, which suggests that labor input specialized in innovation may promote the growth of innovation output, but the effect is not significant. The coefficients of $\ln rjdxyw_{it}$ in the models that include it are not statistically significant, which means that informatization does not significantly affect innovation production; however, introducing it makes all the constant terms, which correspond to the conversion efficiency from input to output of innovation, larger than they are without informatization.
(2) Regarding goodness of fit, the R-squared values of the six models are close to or above 0.8, which indicates a good fit. After the informatization variable is introduced, the R-squared values become larger, so the informatization level is an important factor affecting the efficiency of innovation production. Because the sample is a cross-sectional data set, it is necessary to test for heteroskedasticity. Using the Breusch-Pagan test, we find that none of the Breusch-Pagan statistics is significant, so we cannot reject the null hypothesis that the residuals are homoskedastic. The estimation results of this paper are therefore credible.
(3) We divide the results into two cases according to whether spatial correlation is considered. Firstly, the R-squared values of SLM and SEM, which consider spatial correlation, are 0.783 and 0.819, compared with 0.778 for OLS. Secondly, comparing SLM with SEM, the coefficient of the spatially lagged variable in SLM is not significant, but the coefficient of the spatially correlated error (LAMBDA) in SEM is significant at the 5% level, which is consistent with the earlier descriptive statistical result. So the residuals of OLS are spatially correlated, and without estimating SEM we cannot get exact results. Thirdly, looking at the values of the Akaike criterion and the Schwarz criterion, both are larger in SLM than in OLS, while the same indicators in SEM are larger than in SLM. Taken together, we find that SEM is a better choice than OLS and SLM.
(4) The values of Lagrange Multiplier (lag) and Robust LM (lag) are not statistically significant in the models with or without the informatization factor. In contrast, the significance probabilities of Lagrange Multiplier (error) and Robust LM (error), with or without informatization, are close to 10%. We regard them as statistically significant at the 10% level, and they are more significant than Lagrange Multiplier (lag) and Robust LM (lag). So, according to Anselin's simple rule for choosing a regression model, we regard SEM as the most appropriate model in this paper. Moreover, after the introduction of the informatization variable, although the goodness of fit improves, the value of LAMBDA (the coefficient of the spatially correlated error) changes from 0.041 to 0.046 and the statistical significance of the Lagrange Multiplier (error) value changes from 0.452 to 0.446 (a reduction of significance). Both changes suggest that the advance of informatization promotes the flow of knowledge across geographical boundaries and, to a certain extent, weakens the effect on innovation production of the spatial correlation of creative activities that originates from geographical adjacency.
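The model choice in point (4) follows the decision rule commonly attributed to Anselin for the LM diagnostics of an OLS regression. The sketch below is one way to write that rule down, under stated assumptions: the function name `choose_spatial_model`, the use of p-values as inputs, the default 10% threshold and the tie-break by the smaller robust p-value are illustrative choices, not part of the paper, and the p-values in the example are hypothetical rather than taken from Tables 2 and 3.

```python
def choose_spatial_model(p_lm_lag, p_lm_error, p_rlm_lag, p_rlm_error, alpha=0.10):
    """Pick OLS, SLM or SEM from the p-values of the LM diagnostics.

    p_lm_lag / p_lm_error: p-values of LM(lag) and LM(error).
    p_rlm_lag / p_rlm_error: p-values of Robust LM(lag) and Robust LM(error).
    alpha: significance threshold (an assumption; the paper works near 10%).
    """
    lag_sig = p_lm_lag < alpha
    err_sig = p_lm_error < alpha

    # Neither LM test rejects: no evidence of spatial dependence, keep OLS.
    if not lag_sig and not err_sig:
        return "OLS"
    # Only one LM test rejects: choose the matching spatial model.
    if lag_sig and not err_sig:
        return "SLM"
    if err_sig and not lag_sig:
        return "SEM"

    # Both LM tests reject: fall back on the robust versions.
    rlag_sig = p_rlm_lag < alpha
    rerr_sig = p_rlm_error < alpha
    if rlag_sig and not rerr_sig:
        return "SLM"
    if rerr_sig and not rlag_sig:
        return "SEM"
    # Both or neither robust test rejects: take the more significant one
    # (illustrative tie-break, assumed here).
    return "SLM" if p_rlm_lag < p_rlm_error else "SEM"


# Hypothetical p-values: only the error diagnostics are significant at 10%.
print(choose_spatial_model(0.60, 0.04, 0.70, 0.08))
```

The paper applies the rule somewhat more loosely than this sketch, treating significance probabilities close to 10% as significant, so a strict threshold comparison may not reproduce its reading of borderline values.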

Summary
This paper builds a basic econometric model that takes the informatization level into account and uses data for the 21 prefecture-level cities in Guangdong province from 2006 to 2008 as the research sample. We estimate it by OLS, SLM and SEM and reach the following conclusions:
(1) The non-converted full-time equivalent of R&D personnel and the funds for S&T activities of different cities in the same year are strongly linearly related, which indicates that Guangdong province lacks a mechanism based on salary differences to motivate R&D personnel. Such a mechanism may weaken the multicollinearity between the two variables.
(2) Whether or not spatial correlation is considered, the funds for S&T activities, which stand for the input of capital specialized in innovation, appear to be the core engine promoting the increase of innovation output. More input of funds may lead to more innovation. By contrast, the input of labor specialized in innovation improves innovation production only weakly; that is, the quality of labor input specialized in innovation matters more than its quantity.
(3) The results of EDA show that there is a strong spatial correlation among the creative activities of the 21 prefecture-level cities in Guangdong province. The advance of the informatization level weakens the effect of geographical adjacency on regional innovation, which means that the development of the information network promotes the flow of knowledge across regions and lowers the dependence of innovation on geographical adjacency. At the same time, a good information network benefits communication and interaction between cities far away from each other and allows innovation resources to be allocated more appropriately across space.