Article

The Distribution of Rural Accommodation in Extremadura, Spain-between the Randomness and the Suitability Achieved by Means of Regression Models (OLS vs. GWR)

by José-Manuel Sánchez-Martín 1,*, José-Luis Gurría-Gascón 2 and Juan-Ignacio Rengifo-Gallego 2
1 Faculty of Business, Finance and Tourism, University of Extremadura, 10071 Cáceres, Spain
2 Faculty of Letters, University of Extremadura, 10071 Cáceres, Spain
* Author to whom correspondence should be addressed.
Sustainability 2020, 12(11), 4737; https://doi.org/10.3390/su12114737
Submission received: 21 May 2020 / Revised: 6 June 2020 / Accepted: 8 June 2020 / Published: 10 June 2020

Abstract

Regression comes in many forms, all of which aim to build models that explain the variability of a dependent variable from a set of explanatory variables. In many cases, however, the territorial criterion is not treated as a relevant factor of analysis, a deficiency that has encouraged the rise of spatial statistics. Given the variety of regression techniques, it is not clear which is best suited to the analysis of tourism. In this sector, the supply of accommodation is expected to be strongly related to the presence of resources, so it has been taken as a case for comparing two regression techniques, ordinary least squares (OLS) and geographically weighted regression (GWR), with the objective of determining which of the two is better suited to this type of analysis. The model has been drawn up using various methods, although relying on the declared preferences of rural tourists, obtained from a survey, proved the most efficient. These preferences have been taken as independent variables with the aim of explaining the distribution of accommodation establishments. The results show that accounting for the spatial relations between the variables included in the model improves its explanatory power, so GWR is much more suitable than OLS, even when a system as complex as the distribution of accommodation establishments is analysed. Likewise, it is noteworthy that the distribution of accommodation does not always follow the guidelines marked by demand; far from it, in some areas it appears to be random.

1. Introduction

Tourism has developed greatly in many parts of the world. It has become an essential industry in the achieving of the socioeconomic development of a multitude of places [1]. It is for this reason that numerous studies analyse the activity from various perspectives. These include those studying the economy of tourism in various countries [2,3,4], those which concentrate on the analysis of the tourism satellite account [5,6,7], or those which analyse specific products and aspects such as sun and beach tourism [8], rural tourism [9], and cultural tourism [10].
There is a considerable diversity of themes analysed from the point of view of tourism, although all of them need data [11]. Until recent years, the information available concentrated on economic, heritage, social, and territorial aspects and, naturally, on the characteristics of tourists and accommodation. This information was varied, although its volume was limited. Nowadays, thanks to the continuous progress of technology, the amount of information has increased exponentially. Indeed, big data brings together the huge volume of tourist information which exists and can be used to design and administer smart tourist destinations [12], to predict tourist demand [13,14], or even to analyse tourist behaviour [15]. Many studies accordingly examine the use of big data [16] and data mining [17]. Nevertheless, recent studies are also appearing which put forward the need to resort to small data [18] and stress the need to use only the necessary information.
Despite the disparity of approaches and subjects, it is clear that tourism needs more and more information and also the application of techniques allowing its processing and comprehension. Nevertheless, it continues to be a phenomenon which is difficult to analyse as it is characterised by a variety of both themes and spatiotemporal factors [19].
This information needs to be correctly processed by means of a multitude of statistical techniques [20], which are criticised on occasion [21], owing to their concept. These techniques may be of various types, and numerous studies use qualitative analyses applied to very varied themes. Among them, those which analyse social networks as distribution channels [22], the impact of tourism on residents [23], the image projected by a destination [24] or accommodation [25], and even the very internationalisation of quality standards [26] stand out. Others resort to quantitative analyses to ascertain the effect of seasonality and the rating of the accommodation on the price [27] or the characteristics of the tourists [28], among other subjects.
The literature includes intense epistemological debates on qualitative research compared with quantitative research applied to the analysis of tourism [29]. At the same time, studies have appeared which detail the research methods most frequently used on this subject [30,31,32]. In spite of this, the use of quantitative techniques clearly predominates. These include regression analyses in their multiple variants. Numerous studies allude to structural equation modelling [33], quantile regression [34,35], and multiple linear regression [36,37,38], and some even attribute a key role to geostatistical analysis [25,39,40,41]. Among the latter, geographically weighted regression is becoming more and more important, owing to the influence of the surrounding area on tourist parameters [42].
Two clear trends are observed by those who advocate the use of quantitative analyses applied to tourism. On the one hand, there are those who favour the direct application of quantitative statistical techniques using a variable number of numerical parameters. On the other, we have those who defend the decisive role of territory in statistical analyses. In other words, two ways of approaching the statistical analyses of data can be found, which are the use of quantitative and exclusively numerical methods [43,44] and the use of geostatistics, with which the spatial criterion becomes important [41] in statistical calculations qualified by the territorial nature.
It is clear that the use of quantitative techniques has great advantages as it facilitates the analysis of considerable volumes of data and is capable of obtaining models and even predicting estimated values from these models. There is no doubt that this is one of the major contributions of multiple regression together with the discovery of outliers [45]. However, they do not always capture the essence of tourist activities, which on numerous occasions qualify all relationships, owing to the importance of territory. It should indeed be stressed that the role of geographical space in tourism is a vital and inherent part of the same [46,47,48].
The use of multiple regression to establish models and calculate the necessary supply of accommodation according to the tourist potential of a space is not new [49]. After having analysed a large amount of bibliography on the location of tourist accommodation, some authors concluded that the most logical course of action is to use geographical information systems [50,51]. The idea is that the location of the supply can be explained by using them; this can be complemented with the application of statistical techniques.
The reliance on this type of software stems from the conceptualisation of tourism as going beyond purely economic considerations, with territory and spatial relations acquiring considerable importance [52]. Many studies carried out decades ago [53] stressed the role of spatial analysis and remain valid today [54]. In consequence, both geography and territory science give tourism a different meaning, complementing it and providing it with a key tool: the geographical information system. With this contribution, together with progress in hardware and software, we have an efficient instrument for performing a tourist analysis based on the territory itself, albeit without renouncing quantitative analysis.
The need for including territory in tourist analyses arises as a consequence of the fact that tourist resources, accommodation, and complementary services, together with the tourists themselves, travel movements, etc., can be represented on the territory [55]. If this is the case, an obvious circumstance which is not questioned by the literature, it would be logical to think that it is necessary to adapt analytical techniques to this situation. It is likewise necessary to mention the importance of the criterion of proximity in tourism, as nearby spaces affect and are modified by this activity, influencing and altering strictly statistical models [56,57].
Studies applying geostatistical techniques to facilitate an understanding of tourism in its most varied facets have recently gained relevance. In this sense, they rely on analysing patterns or mapping clusters according to the formulations of Getis–Ord [58,59] or Moran [60,61]. Despite the multiple applications of both groups of geostatistical techniques, when spatial relationships are to be modelled, it is necessary to resort to geographically weighted regression [62].
The literature consulted confirmed that the application of geostatistical techniques is increasing in tourism studies. It is thus clear that the importance of the territory is vital if the results are to be interpreted correctly. In order to do so, the aim of the paper is to apply and compare the results obtained by means of ordinary least squares regression (OLS) and geographically weighted regression (GWR) in the distribution of the supply of rural accommodation in Extremadura, Spain.
If on a technical level there are numerous options for analysing tourism, when specific aspects of the latter are studied, significant problems also arise. In the case chosen as a subject of study, rural accommodation establishments in a specific territorial context, doubts always arise as to their appropriate location [63]. At the same time, we have tried to determine their optimum location [51,55]. Despite the considerable efforts which have been made, only in few cases have satisfactory explanations been obtained for the location of rural accommodation establishments. This is due to the fact that their location does not always follow a logical coherence which would imply resorting to areas with the most favourable characteristics or which are adapted to the preferences of the demand. Indeed, at times, their distribution appears to be random [46] as they are not always located in the areas with the greatest capacity for attracting tourists. These facts would appear to be corroborated by observation of the preferences of the demand for this type of establishment and the location of both the accommodation and tourist attractions. This imbalance may constitute a major hindrance for establishments and in consequence may affect their profitability.
The result of the circumstances mentioned is the need for detecting the main factors which attract tourists who stay at rural accommodation establishments and comparing them with the distribution of the lodging capacity they offer. In order to do so, various techniques are used, both statistical and geostatistical, which if appropriate allow the verification of the most significant models or guidelines for the implementation of rural accommodation establishments.
The study takes as a starting point the following hypotheses: (1) the distribution of rural accommodation establishments is not always due to criteria of suitability as the implementation of an establishment depends on human decisions which are at times variable and random; (2) it is necessary to delimit correctly the independent variables which may explain the location of the establishments; in order to do this, it is necessary to find out the opinion of tourists; (3) as the territorial component is not taken into account, OLS obtains worse results than GWR in the general regression model; (4) the local regression models obtained by GWR reveal their suitability on allowing the obtaining of a differentiated equation for each entity analysed; (5) it is possible to detect outliers in the territory which must be studied specifically; (6) predictive calculations on the lodging capacity of rural accommodation may serve as a basis for planning action when this is necessary as there will not always be a balance between actual values and those calculated by the models built.

2. Materials and Methods

2.1. Study Area and Procedure

The study area chosen is Extremadura, Spain, an eminently rural region with a surface area of 41,634 km². This territory contains numerous tourist resources of a cultural and natural character which serve to attract tourists. Likewise, it has a considerable lodging capacity if its nature as an inland destination is taken into account. Despite this, the distribution of accommodation establishments in the region is unequal, although it does follow certain guidelines depending on the type of accommodation [41,46,64,65] (Figure 1).
In 2019, a total of 1768 accommodation establishments were registered with a maximum capacity of 42,214 beds (45,957 if extra beds are included) which have a variable distribution depending on the type of accommodation [66] (Table 1).
In the specific case of the accommodation establishments considered as representing rural tourism, according to Decree 65/2015 which classifies their type [67], there has been an increase out of all proportion in recent decades [63]. This increase, which has been uncoordinated and poorly planned, has given rise to a clear imbalance between the supply and the demand. This has meant that during the recent economic crisis, some establishments have closed down [65]. It is evident that the reason for this needs to be understood. According to the literature focussing on the study area, apart from the economic situation, the most frequently heard explanation is the location of these establishments in areas lacking the attractions preferred by the demand [41,56,63,65].
This type of accommodation is subjected to other strains. These include few travellers and few overnight stays in comparison with other surrounding areas, low occupation levels, and shorter average stays [68] (Table 2).

2.2. Materials

The materials used for carrying out this research are of two types: alphanumeric and cartographical. Both have been implemented on a GIS, which has allowed the application of the techniques necessary for demonstrating the initial hypotheses.
The cartography used to perform the analysis is that of the National Geographic Institute (Instituto Geográfico Nacional, IGN) [69], which is distributed under the Creative Commons CC-BY 4.0 International licence. The scale of reference is 1:100,000 and the resolution is 20 m, which is sufficient for the type of analysis required by the research. It gives a vision of the territory which aims to describe the appearance and the details of the surface area together with the geographical objects found on it, whether these are natural or a product of human activity. It allows geographical and alphanumerical queries to be made. Its class structure of geographical objects is based on the phenomena that can be represented at the scale mentioned with simple geometries (point, line, and area) [70].
The alphanumeric information comes from various sources. That referring to tourist accommodation establishments comes from the Register of Tourist Companies of Extremadura [66]. It has been updated to 31 December 2019. This alphanumeric database has been conveniently georeferenced for its subsequent implementation in a geographical information system.
On the other hand, the information on the declared preferences of rural tourists regarding tourist resources comes from a direct survey. It contains data which allow the extraction of their sociodemographic profiles, the type of accommodation where they stay, the variety of tourism they practise, and naturally the elements they value in choosing their destination (Table 3). This survey was completed by 710 tourists in 2015 and was repeated by 140 in 2019 with the aim of contrasting the changes. As there are no significant differences between both years, neither in the preferences of tourists nor in the sociodemographic aspects, the entire sample was used (850 surveys). This is a reliable survey, since with 95% confidence the sampling error is 3.36%. From it, the tourist resources preferred by those staying in the study area have been discovered.
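The reported sampling error can be checked with the usual margin-of-error formula for a proportion. The sketch below is only a verification aid: it assumes simple random sampling, an effectively infinite population, and the most conservative proportion p = 0.5, none of which is stated explicitly in the text.

```python
import math
from statistics import NormalDist


def sampling_error(n: int, confidence: float = 0.95, p: float = 0.5) -> float:
    """Margin of error for a proportion under simple random sampling.

    Assumptions of this sketch (not taken from the article): infinite
    population and the most conservative case p = 0.5.
    """
    # Two-sided critical value of the standard normal (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return z * math.sqrt(p * (1.0 - p) / n)


# 850 surveys in total (710 in 2015 plus 140 in 2019).
error_pct = 100.0 * sampling_error(850)
print(f"{error_pct:.2f}%")  # 3.36%
```

Under these assumptions, the 850 surveys give the 3.36% figure quoted above.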

2.3. Research Procedure

Among the resources finally selected to explain the distribution of rural accommodation establishments, the criterion of distance was taken into account. To calculate it, we resorted on occasion to network analyst tools and at other times to the Euclidean distance taken from the resource to the capital of the municipality (Table 4).
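As an illustration of the Euclidean variant of this distance criterion, the following minimal sketch computes the straight-line distance between a resource and a municipal capital. The coordinates are hypothetical and are assumed to be expressed in metres in a common projected coordinate system; the network-based distances mentioned above would require a road graph and are not covered here.

```python
import math


def euclidean_distance_km(resource_xy, capital_xy):
    """Straight-line distance in km between two points whose coordinates
    are given in metres in the same projected system (an assumption)."""
    dx = resource_xy[0] - capital_xy[0]
    dy = resource_xy[1] - capital_xy[1]
    return math.hypot(dx, dy) / 1000.0


# Hypothetical coordinates in metres, for illustration only.
resource = (250_000.0, 4_400_000.0)
capital = (253_000.0, 4_404_000.0)
print(euclidean_distance_km(resource, capital))  # 5.0
```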
Owing to the problems faced by rural accommodation establishments, it is necessary to find out, by establishing regression models, whether they are located in optimum areas. At a methodological level, the process followed is summarised in five stages (Figure 2).
The first stage consisted of the obtaining of information on the supply of accommodation, tourist resources, the territory analysed, and tourists. With these data, a database was designed which was implemented in a GIS in the second stage. This software, together with the statistical analysis of the database, allowed us to obtain the tourist potential of each of the municipalities analysed. The tourism potential is obtained considering the tourism resources that tourists prefer. The third stage consisted of determining the independent variables which explain the location of rural accommodation establishments and providing two differentiated models. The first of them is obtained from ordinary least squares regression (OLS) and the second by means of geographically weighted regression (GWR) in which the criterion of proximity is considered. The fourth stage concentrates on the analysis of the results obtained, while in the fifth, both models are discussed.

2.4. Exploratory Regression (OLS)

Tourist activities have a privileged position in the geographical space in which they are carried out. In a large proportion of cases, they are conditioned by a series of factors with a certain spatial continuity. The tourist resources present in the territory play their role of attraction in the centre where they are to be found, although they also benefit nearby areas; from this, it can be deduced that proximity may have a direct effect on the configuration of the tourist space [52].
It is likewise evident that resources may overlap in the territory analysed, which may give rise to an incorrect interpretation of the existing relationships. After assessing the advantages and disadvantages of various techniques which allow the detection of strong correlations between the predicting variables, we decided to carry out exploratory regressions. This yields a set of candidate models from which to choose.
In order to carry out these exploratory regressions, we decided to take as regressive variables the 12 which may influence the location of rural accommodation establishments (Table 4) and as a dependent variable the lodging capacity in rural accommodation. Both types of variables were allocated to each of the 388 population centres located in the area of analysis.
The application of the exploratory regressions was dependent on compliance with the following requirements:
  • Adjusted R-squared ≥ 0.3 (relaxed to ≥ 0.2 in a second run);
  • Variance inflation factor (VIF) ≤ 3. This value reflects how much redundancy (multicollinearity) among the explanatory variables can be tolerated. When the VIF exceeds about 7.5, multicollinearity may make a model unstable, which is why 7.5 is the software default; a stricter limit of 3 is applied here;
  • Coefficient p-value < 0.05. With this threshold, passing models contain only explanatory variables whose coefficients are statistically significant at the 95 percent confidence level.
The viable models obtained are analysed to select that with the best requirements for the application of OLS and GWR regressions.
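The screening logic described by these requirements can be sketched as follows. This is not the ArcGIS Exploratory Regression tool, only a simplified NumPy illustration on synthetic data: it enumerates every combination of candidate variables and keeps those meeting the adjusted R-squared and VIF thresholds. The coefficient p-value check is omitted for brevity, and all function and variable names are hypothetical.

```python
from itertools import combinations

import numpy as np


def _ols_r2(A, target):
    """R-squared of the least-squares fit of target on design matrix A."""
    beta, *_ = np.linalg.lstsq(A, target, rcond=None)
    resid = target - A @ beta
    ss_tot = ((target - target.mean()) ** 2).sum()
    return 1.0 - (resid @ resid) / ss_tot


def adjusted_r2(X, y):
    """Adjusted R-squared of y regressed on X (intercept added here)."""
    n, k = X.shape
    r2 = _ols_r2(np.column_stack([np.ones(n), X]), y)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def max_vif(X):
    """Largest variance inflation factor among the columns of X."""
    n, k = X.shape
    if k == 1:
        return 1.0  # a single regressor cannot be collinear with others
    vifs = []
    for j in range(k):
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        r2_j = _ols_r2(A, X[:, j])
        vifs.append(1.0 / max(1.0 - r2_j, 1e-12))
    return max(vifs)


def explore(X, y, names, min_adj_r2=0.3, max_allowed_vif=3.0):
    """Return (variables, adjusted R2) for every passing combination."""
    passing = []
    for size in range(1, X.shape[1] + 1):
        for cols in combinations(range(X.shape[1]), size):
            Xs = X[:, list(cols)]
            score = adjusted_r2(Xs, y)
            if score >= min_adj_r2 and max_vif(Xs) <= max_allowed_vif:
                passing.append((tuple(names[c] for c in cols), score))
    return sorted(passing, key=lambda item: item[1], reverse=True)


# Synthetic data: x3 is almost a copy of x0, so models containing both
# should be rejected by the VIF criterion.
rng = np.random.default_rng(0)
n = 388  # number of population centres in the study
x0, x1, x2 = rng.normal(size=(3, n))
x3 = x0 + 0.1 * rng.normal(size=n)
X = np.column_stack([x0, x1, x2, x3])
y = 2.0 * x0 - x1 + rng.normal(size=n)

for model, score in explore(X, y, ["x0", "x1", "x2", "x3"])[:3]:
    print(model, round(score, 3))
```

The thresholds are passed as parameters so that the relaxed run (adjusted R-squared ≥ 0.2) can be reproduced by changing a single argument.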
The basis of exploratory regression is ordinary least squares regression. This is used frequently in tourism studies either as a single technique or together with others [71,72]. Its main task consists of predicting the values attained by the dependent variable from the explanatory variables which make up the model.
There tends to be considerable discussion on the final configuration of this model regarding possible problems of multicollinearity [73,74]. Nevertheless, more and more references are appearing which stress the important role played by the residuals in explaining the model [75]. Some authors even recommend checking for homoskedasticity or heteroskedasticity, starting from the covariance matrix, to validate the model [76].
Although this technique is frequently used, the possible problems that may derive from the choice of an erroneous model should always be considered. The most frequent among them are perhaps the omission of explanatory variables, the existence of nonlinear relations, the presence of outliers, multicollinearity, non-constant variance of the residuals, etc. Despite this, almost all these disadvantages have solutions, among which revising the model stands out.
For the purpose of this article, ordinary least squares regression is carried out by means of the analysis module integrated in the spatial statistics tools included in the ArcGIS v.10.5 software.
The ordinary least squares method uses the following Equation (1) [77]:
$$y_i = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n + \varepsilon$$
in which:
$y_i$ = the value observed for the dependent variable at point $i$; $\beta_0$ = the y-intercept (constant value); $\beta_n$ = the regression coefficient or slope of the explanatory variable $n$ at point $i$; $x_n$ = the value of the variable $n$ at point $i$; $\varepsilon$ = the error of the regression equation.
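Equation (1) can be estimated outside a GIS with a few lines of NumPy. The sketch below is an illustration on synthetic, noise-free data rather than the ArcGIS implementation used in the study; the function name `fit_ols` and the sample values are hypothetical.

```python
import numpy as np


def fit_ols(X, y):
    """Estimate b0..bn of y = b0 + b1*x1 + ... + bn*xn + e by least squares.

    Returns the coefficient vector (intercept first) and the residuals.
    """
    A = np.column_stack([np.ones(len(y)), X])  # column of ones for b0
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta, y - A @ beta


# Synthetic example: the true coefficients are 10, 2 and -1.
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 4.0]])
y = 10.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1]

beta, residuals = fit_ols(X, y)
print(np.round(beta, 6))
```

Because the example contains no noise, the fit recovers the generating coefficients and the residuals vanish up to floating-point error.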
For its application the following requirements must be complied with [77,78]:
  • The model must be linear; this can be checked with scatter plots;
  • The data used must not depend on any external factor;
  • The explanatory variables must not be correlated with each other;
  • The explanatory variables must be measured with negligible error;
  • The residuals must have a mean of zero;
  • The residuals must have a homogeneous variance and follow a normal distribution.
The definitive ordinary least squares regression model was subject to different tests to confirm its reliability. Among them, the F-statistic, Wald, Koenker, and Jarque–Bera statistics tests stand out.
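Of these tests, the Jarque–Bera statistic is the simplest to reproduce by hand, since it depends only on the skewness and kurtosis of the residuals. The following sketch uses the textbook definition; whether the software applies exactly this variant is not confirmed by the text, and the residual values are hypothetical.

```python
import math


def jarque_bera(residuals):
    """Jarque-Bera statistic and asymptotic p-value for normality.

    JB = n/6 * (S^2 + (K - 3)^2 / 4), where S is the sample skewness and
    K the sample kurtosis. Under normality JB is asymptotically
    chi-square with 2 degrees of freedom, whose survival function is
    exp(-x / 2). The approximation is poor for small samples.
    """
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    return jb, math.exp(-jb / 2.0)


# Hypothetical residuals with a long right tail.
sample = [-1.2, -0.8, -0.5, -0.3, -0.1, 0.0, 0.2, 0.4, 0.9, 6.0]
jb, p_value = jarque_bera(sample)
print(round(jb, 3), round(p_value, 4))
```

A small p-value indicates that the residuals depart from normality, which is the situation the p-value (JB) criterion is designed to flag.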

2.5. Geographically Weighted Regression (GWR)

Ordinary least squares regression does not include in its formulation the criterion of proximity, which may compromise the models obtained in specific systems such as that of tourism. In a system as complex as the latter, it is inevitable to resort to the proximity analysis to explain certain facts. The presence of certain tourist resources thus favours not only the areas in which they are located but also those nearby, which shows the importance of proximity [79]. The need for including distance as a key parameter has been demonstrated by numerous publications which use geostatistical analyses [41,58,80,81,82].
In this context, geographically weighted regression emerges as a viable alternative for building models in which spatial relationships are analysed. This variety of regression allows the construction of a model by means of explanatory variables in the same way as OLS. However, it goes further, as it lets the analyst specify whether the kernel is constructed with a fixed distance or is allowed to vary in extent as a function of feature density [77]. Likewise, it allows specifying how the extent of the kernel is determined, whether by means of the corrected Akaike's information criterion (AICc), cross validation (CV), or a bandwidth parameter.
This regression builds a different equation for each entity by incorporating the dependent and explanatory variables of the entities that fall within the bandwidth of each target entity.
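The difference between a fixed and an adaptive kernel can be illustrated with a short sketch. A Gaussian kernel is assumed here for simplicity; the text does not state which kernel function was used, and the distances are hypothetical.

```python
import math


def gaussian_weights(distances, bandwidth):
    """Gaussian kernel: weight 1 at distance 0, decaying with distance."""
    return [math.exp(-0.5 * (d / bandwidth) ** 2) for d in distances]


def adaptive_bandwidth(distances, k):
    """Bandwidth equal to the distance to the k-th nearest observation
    (k counts from 1 and includes the regression point itself)."""
    return sorted(distances)[k - 1]


# Hypothetical distances (km) from one regression point to five entities.
distances_km = [0.0, 5.0, 10.0, 20.0, 40.0]

# Fixed kernel: the same 10 km bandwidth everywhere.
fixed = gaussian_weights(distances_km, bandwidth=10.0)

# Adaptive kernel: the bandwidth stretches until it reaches 4 entities,
# so sparse areas get a wider kernel than dense ones.
local_bandwidth = adaptive_bandwidth(distances_km, k=4)
adaptive = gaussian_weights(distances_km, local_bandwidth)

print([round(w, 3) for w in fixed])
print(local_bandwidth, [round(w, 3) for w in adaptive])
```

In both cases nearer entities receive larger weights; only the rule that sets the bandwidth changes.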
The following Equation (2) should be used:
$$y_i = \beta_{i0} + \sum_{k=1}^{m} \beta_{ik} x_{ik} + \varepsilon_i$$
in which $y_i$ is the dependent variable at location $i$; $x_{ik}$ is the $k$th independent variable at location $i$; $m$ is the number of independent variables; $\beta_{i0}$ is the intercept parameter at location $i$; $\beta_{ik}$ is the local regression coefficient for the $k$th independent variable at location $i$; and $\varepsilon_i$ is the random error at location $i$ [48] (p. 2).
GWR allows coefficients to vary continuously over the study area, and a set of coefficients can be estimated at any location—typically on a grid so that a coefficient surface can be visualised and interrogated for relationship heterogeneity. GWR makes a point-wise calibration concerning a “bump of influence” around each regression point where nearer observations have more influence on estimating the local set of coefficients than observations farther away [47]. GWR measures the inherent relationships around each regression point i, where each set of regression coefficients is estimated by weighted least squares.
To calculate it, the following Equation (3) should be used:
$$\hat{\beta}_i = \left( X^{T} W_i X \right)^{-1} X^{T} W_i \, y$$
in which $X$ is the matrix of the independent variables with a column of 1s for the intercept; $y$ is the dependent variable vector; $\hat{\beta}_i = (\beta_{i0}, \ldots, \beta_{im})^{T}$ is the vector of $m + 1$ local regression coefficients; and $W_i$ is the diagonal matrix denoting the geographical weighting of each observed data point for regression point $i$ [48] (pp. 2–3).
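Equation (3) translates almost directly into code. The sketch below is a minimal illustration with NumPy on a synthetic grid, not the ArcGIS implementation; it assumes a fixed Gaussian kernel, and the function name, bandwidth, and data are hypothetical.

```python
import numpy as np


def gwr_local_coefficients(coords, X, y, bandwidth):
    """Solve Equation (3) at every observation with a fixed Gaussian kernel.

    Returns an (n, m + 1) array; row i holds the local intercept and
    slopes estimated for regression point i.
    """
    n = len(y)
    A = np.column_stack([np.ones(n), X])  # column of 1s for the intercept
    betas = np.empty((n, A.shape[1]))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (d / bandwidth) ** 2)  # diagonal of W_i
        AtW = A.T * w  # same result as A.T @ np.diag(w)
        betas[i] = np.linalg.solve(AtW @ A, AtW @ y)
    return betas


# Synthetic 10 x 10 grid where the true slope grows from west to east.
coords = np.array([(e, n) for e in range(10) for n in range(10)], dtype=float)
rng = np.random.default_rng(0)
x = rng.normal(size=len(coords))
y = 1.0 + (1.0 + 0.5 * coords[:, 0]) * x

betas = gwr_local_coefficients(coords, x[:, None], y, bandwidth=2.0)
west = betas[coords[:, 0] == 0, 1].mean()
east = betas[coords[:, 0] == 9, 1].mean()
print(round(west, 2), round(east, 2))
```

A global OLS fit would return a single slope for the whole grid, whereas the local estimates follow the west-to-east gradient built into the data, which is the behaviour that motivates the use of GWR.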

3. Results

3.1. Basis of the Regression Model: Activities Carried Out by Rural Tourists

The items used to determine the activities carried out by tourists staying in a rural accommodation establishment indicate which aspects are most attractive to them (Table 5). A detailed analysis shows that they carry out activities related to generic rural tourism, although they also make cultural visits. They declare that they practise tourism in rivers, gorges, or reservoirs and gastronomy. Finally, albeit less significantly, they also declare that they watch birds and visit mines or geological formations or practise sport. Meanwhile, the remainder of activities go unnoticed.
If the activities preferred by tourists staying in rural accommodation establishments are considered, it is possible to determine the factors which most attract them. From this, it can be deduced that it is a good way of understanding which should be the parameters used to analyse the implementation of rural accommodation establishments. These parameters include some which are highly characteristic of certain territories, although there are others which can be enjoyed over a large part of Extremadura, as is the case with gastronomy. The most noteworthy are:
  • Cultural heritage. For this reason, proximity to historical ensembles has been chosen as a regressive variable; these ensembles also include World Heritage cities located outside the rural context;
  • Proximity to water resources. As a consequence of their importance, bathing areas and reservoirs have been selected as factors justifying the presence of a rural accommodation establishment. We have also included the distances to two protected landscapes (the Garganta de los Infiernos Nature Reserve and the Tajo-Internacional Nature Reserve). The former is a well-known bathing area and the latter offers river cruises which take tourists even as far as Portugal;
  • Birdwatching. In this sense, we have included the distance to Monfragüe National Park, the main centre of attraction for ornithological tourism;
  • Visiting mines, caves, or geological formations. Owing to its importance, the distance to the Villuercas-Ibores-Jara Geopark has been considered.
The replies obtained in each of the types of resources analysed justify the fact that the main attractions sought by tourists staying at rural establishments act as regressive variables. It is logical to think that the areas containing a large proportion of these attractions are the most appropriate for maintaining this type of accommodation and will have more advantages if they are near a highway as a consequence of their better access.
The tourists themselves define a logical model for the distribution of rural accommodation establishments, which is at variance with the actual situation (Figure 3a,b). It can be observed that there is a natural preference for certain areas although the offer of these accommodation establishments follows a markedly different pattern [63]. It can also be appreciated that there are certain differences depending on the population living in each municipality, especially if this is compared with the distribution of other types of accommodation [64].
The distribution of rural accommodation and the presence of the resources which support the activities carried out by tourists closely coincide in certain areas. In this sense, there is an important relationship between mountain areas and the presence of bathing areas; the latter represent the highest concentrations of the lodging capacity in rural accommodation establishments. This situation is understandable if it is taken into account that summer is the time of the year when most rural tourists are received [64,65]. It can likewise be observed that in the proximity of the sierras, numerous beds in rural accommodation establishments are also concentrated, although in many cases they do not coincide with areas prepared for bathing with the subsequent loss of attraction for summer tourism. Moreover, there are also accommodation establishments in areas lacking the main attractions sought by the demand (Table 6).
This most inconsistent situation gives rise to three major groups of rural accommodation establishments. The first of these, located in mountain areas and with the presence of bathing areas, has the best competitive advantages for attracting tourists. The second also has a certain capacity of attraction, although it is not so competitive during the long summer period characteristic of Extremadura, where temperatures often exceed 35 degrees and on occasion reach 40 degrees [83]. Finally, the third, which is unequally distributed over the territory, is that near Special Bird Protection Areas (Zonas de Especial Protección para Aves, ZEPAs). This distribution, which is more or less satisfactory, is faithfully reflected in its capacity for capturing tourists according to the time of the year. This is corroborated when the areas located in the north of Extremadura, a mountain region with natural bathing areas, attract a large number of tourists and overnight stays during the summer. Meanwhile, other areas have greater potential during the rest of the year [84].
It is obvious that the essential attractions sought by the demand are mountain or sierra areas, bathing areas, and some natural spaces, although one should always stress the enormous territorial variability regarding the lodging capacity. Since there are clear determinants which explain the presence of rural accommodation establishments in specific areas, it would be viable, at least in theory, to model their distribution on the presence or proximity of specific factors. Nevertheless, some of these factors condition one another, which makes it advisable to improve these models by using specific techniques which make it possible to eliminate the effects of collinearity.

3.2. Exploratory Regression Model

Taking into account the contrasts between the distribution of accommodation establishments and the presence of the main factors mentioned by the demand staying in rural establishments, we chose to carry out an exploratory regression. Its main advantage is that it tests all possible combinations of the candidate explanatory variables which attempt to explain the number of rural accommodation beds in each population centre.
This type of regression also allows complementary criteria, in addition to the R-squared value, to be used to select among the candidate models.
After its application, a considerable variety of models were found, of which three are worthy of note. The first takes into account all the variables which may act as regressors, while the second has been improved, taking into account exclusively the variables complying with some reliability requirements. In contrast, the third model uses as regressive variables the factors which can be deduced from the activities carried out by the type of tourist analysed (Table 7).
To obtain the exploratory regression models, slightly different criteria were used (Table 8). For the first model, a minimum adjusted R-squared value exceeding 0.3 was required; this value was not attained in any of the 4095 combinations analysed. Likewise, a p-value of <0.05 was required; this condition was met in 399 cases. A VIF value of <3.00 was also required; this condition was met by 1695 models. By contrast, no combination met the criteria of a Jarque-Bera p-value (JB) of >0.05 or a spatial autocorrelation (SA) of >0.20. For the second model, the adjusted R-squared threshold was lowered to 0.2, which was exceeded by two models. Meanwhile, a p-value of <0.05 was obtained by 49 models, and a VIF value of <3.00 by 63. Nevertheless, even with a more permissive threshold of >0.1 for the p-value (JB) or the spatial autocorrelation (SA), no viable model was achieved. Finally, the third model comprises 511 tests. A total of 17.81% of these meet the minimum adjusted R-squared value considered, and between 29.75% and 43.64% meet the p-value and VIF criteria. However, the robustness of the models is compromised once the p-value (JB) or the spatial autocorrelation (SA) criteria are applied.
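The screening described above can be illustrated with a minimal Python sketch. It is not the tool used by the authors: the function names (`n_subsets`, `adjusted_r2`, `exploratory_regression`) are hypothetical, and only the adjusted R-squared criterion is applied. The sketch fits every non-empty subset of the candidate regressors by ordinary least squares and keeps those that exceed a threshold. Since k candidate variables produce 2^k − 1 subsets, the totals of 4095 and 511 combinations reported here would be consistent with 12 and 9 candidate variables respectively, although this is an inference rather than a figure stated in the text.

```python
from itertools import combinations

import numpy as np


def n_subsets(k):
    """Number of non-empty subsets of k candidate variables."""
    return 2 ** k - 1


def adjusted_r2(X, y):
    """Adjusted R-squared of an OLS fit of y on X (an intercept is added here)."""
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def exploratory_regression(X, y, names, min_adj_r2):
    """Fit every non-empty subset of columns; return the number of subsets
    tried and the (variable names, adjusted R-squared) pairs above the threshold."""
    tried = 0
    passing = []
    for size in range(1, len(names) + 1):
        for cols in combinations(range(len(names)), size):
            tried += 1
            score = adjusted_r2(X[:, list(cols)], y)
            if score > min_adj_r2:
                passing.append((tuple(names[c] for c in cols), score))
    return tried, passing
```

A fuller screening would add the remaining criteria (coefficient p-values, VIF, normality of residuals and spatial autocorrelation) as further filters on each candidate subset.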
These first results show how difficult it is to explain the distribution of the beds available in rural accommodation establishments in a logical manner. It should also be pointed out that the beds are territorially concentrated in some places, which undoubtedly contributes to a similar concentration in the residuals obtained with these regression tests [55].
The significance of the variables considered in each model certifies the consistency of each of them (Table 9). It can thus be observed that while Model 1 attaches much importance to most variables, Model 2 distinguishes them better. In this sense, it is revealed that the relief (mountains and sierras) and the water resources (bathing areas and reservoirs) reach significant values; cultural attractions and the distance from highways are also noteworthy. Model 3 shows similarities with Model 1 in the variables which have been used jointly.
This analysis demonstrates that Model 2 is more accurate, as it selects fewer variables, all of which are highly significant from a statistical point of view. Moreover, they agree with the main attractions which tourists emphasise. This same model reflects the importance of being near mountain and bathing areas, as both variables are negatively significant in 100% of cases, which is somewhat similar to the case of historical ensembles. In addition, there is a positive relationship with sierras, reservoirs, and the distance from highways.
This initial analysis reflects the possibility of finding two well differentiated types of areas. On the one hand, we have mountain areas clearly linked to bathing areas, and on the other, areas far from these spaces containing reservoirs and protected areas.
Despite the interest of the analysis of statistical significance, the fact that multicollinearity is one of the most serious problems that may arise in the application of multiple regression cannot be ignored. For this reason, its analysis is essential if we are to determine the goodness of the regression models constructed (Table 10). It is noteworthy that Model 1 has three variables with unacceptable VIF values, which gives rise to their participation in numerous models with multicollinearity (violations). In contrast, Model 2 represents precisely the opposite as all the variables included have very low VIF values, especially if it is taken into account that the maximum limit established is 7.5. Clearly, having low VIF values means that no violation by collinearity occurs. For its part, Model 3 is not robust as there are three variables with a VIF much higher than the value statistically permitted (7.5).
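For reference, the variance inflation factor of a regressor is 1 / (1 − R²), where R² comes from regressing that variable on all the other regressors. The following minimal Python sketch shows the calculation; the function name `vif` and the use of NumPy are assumptions made for illustration and do not reproduce the software used in the study.

```python
import numpy as np


def vif(X):
    """Variance inflation factor of each column of X (n observations x k regressors)."""
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        target = X[:, j]
        # Regress column j on the remaining columns plus an intercept.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ beta
        ss_res = resid @ resid
        ss_tot = ((target - target.mean()) ** 2).sum()
        r2 = 1.0 - ss_res / ss_tot
        out[j] = 1.0 / (1.0 - r2)
    return out
```

A value close to 1 indicates that a regressor is almost unrelated to the others, while values above the limit of 7.5 mentioned above indicate redundancy among the explanatory variables.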
The assessment of the models obtained indicates that Model 2 offers the best conditions for explaining the presence of the supply of rural accommodation (Table 11). Indeed, the most viable model constructed consists of the main attractions required by the rural tourist. Nevertheless, it should be taken into account that this model attains an R² of only 0.21, although it is in keeping with the SA criterion. In addition, the variables selected by the model are significant at the 0.01 level, from which it can be deduced that the result is unlikely to be due to chance.
The above analyses show that there is a model which can provide an explanation according to the activities carried out by tourists, although it provides a reduced explanation of the dependent variable. For this reason, it is difficult to find a logical explanation for the distribution of the supply of rural accommodation.

3.3. Ordinary Least Squares Regression (OLS)

The results given by OLS for Model 2 include interesting data, because the VIF has a low value for all the regressive variables, which shows that there is no collinearity between them. Likewise, the coefficients obtained for each of these variables contribute to the explanation of the model; three of them are negative (historical ensemble, bathing area, and mountain) (Table 12). Since the explanatory variables are distances to each attraction, this means that the centres nearer these attractions have a higher number of beds, which is in keeping with the activities carried out by the tourists staying in rural accommodation establishments, at least during the summer. The model is completed with two positive coefficients (reservoir and highway), from which it can be deduced that the further a population centre is from highways and reservoirs, the greater its capacity for rural accommodation. All this is faithfully reflected in the current situation, as shown in Figure 3. Furthermore, both the probability and the robust probability are significant, from which it can be deduced that the likelihood of the coefficients being effectively zero is low.
Despite the fact that the regressive variables correspond to the attractions valued by tourists who stay in rural accommodation establishments, a full diagnosis of the model must be carried out (Table 13). It is from this analysis that the first signs of complexity arise, since the model explains only 21% of the distribution of beds in rural accommodation establishments. Moreover, the Koenker and Wald statistics are significant. OLS thus indicates that the lodging capacity is not strongly linked to the presence of the essential attractions for rural demand. This may mean that the location of these establishments does not follow a logical criterion but rather responds to other factors which have little to do with the attraction capacity and the tourist potential of Extremadura.
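To make the Koenker diagnostic more concrete, the sketch below computes the studentised Breusch-Pagan statistic in its usual form: the squared OLS residuals are regressed on the same explanatory variables, and the statistic is n times the R² of that auxiliary regression. It is assumed here that the Koenker statistic reported in Table 13 follows this standard definition; the function name `koenker_bp` is hypothetical. Under the null hypothesis of constant residual variance, the statistic is compared with a chi-squared distribution whose degrees of freedom equal the number of regressors.

```python
import numpy as np


def koenker_bp(X, y):
    """Studentised Breusch-Pagan (Koenker) statistic: n * R^2 of the
    auxiliary regression of squared OLS residuals on the regressors."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    e2 = (y - A @ beta) ** 2
    # Auxiliary regression of the squared residuals on the same design matrix.
    gamma, *_ = np.linalg.lstsq(A, e2, rcond=None)
    fitted = A @ gamma
    ss_res = ((e2 - fitted) ** 2).sum()
    ss_tot = ((e2 - e2.mean()) ** 2).sum()
    return n * (1.0 - ss_res / ss_tot)
```

A significant value suggests that the relationship between the regressors and the dependent variable is not constant across the data, which is one of the usual motivations for testing a local technique such as GWR.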
Both the low level of explanation of the model and the negative assessments of the main tests used are a clear sign that there is no obvious trend to explain the distribution of the beds of accommodation establishments. This peculiarity is due to the fact that setting up an establishment of this type is a personal decision that does not always follow a logical criterion.
The graphic analysis of the histograms showing the distribution of the regressive variables, together with the linear regressions obtained by means of scatter plots, demonstrates that there are no atypical values conditioning the relationship and the tendency (Figure 4). However, considerable skew can be observed in the number of beds in rural accommodation establishments, with a clear predominance of lower values. Indeed, of the 388 municipalities analysed, 141 have no beds and only 57 population centres have over 50 beds.
The analysis of the standardised residuals reflects a Gaussian distribution of frequencies in which the positive skew is minimal (Figure 5a). This is corroborated by the spatial autocorrelation index obtained by means of Moran’s I (Figure 5b), the value of which is 0.063515, certifying a random distribution of the residuals. Moreover, their territorial distribution reveals no concentrations that may lead us to suspect the omission of a specific regressor (Figure 5c).
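As a reminder of how the index is built, the following minimal Python sketch computes the global Moran's I for a vector of values (for example, regression residuals) and a spatial weights matrix. The function names `morans_i` and `expected_morans_i` are assumptions made for illustration, and the choice of spatial weights (contiguity, inverse distance, etc.) is left to the analyst.

```python
import numpy as np


def morans_i(values, W):
    """Global Moran's I for values observed at n locations.

    W is an n x n spatial weights matrix with zeros on the diagonal.
    """
    z = np.asarray(values, dtype=float)
    z = z - z.mean()
    W = np.asarray(W, dtype=float)
    s0 = W.sum()
    return (len(z) / s0) * (z @ W @ z) / (z @ z)


def expected_morans_i(n):
    """Expected value of Moran's I under spatial randomness."""
    return -1.0 / (n - 1)
```

Values close to the expectation under randomness indicate no spatial pattern, clearly positive values indicate clustering of similar residuals, and clearly negative values indicate dispersion; statistical significance is assessed separately through a z-score or a permutation test.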

3.4. Geographically Weighted Regression

The application of OLS has shown that it is difficult to obtain a satisfactory global regression model. This is essentially due to the unsatisfactory explanation of the dependent variable and the failure of some of the indexes which check the suitability of the set of regressive variables. Other parameters, however, suggest that the model may be applicable, although they show ambiguities. For this reason, it is worth discovering whether any changes occur when the territorial aspect is conceptualised beyond its mere geographical representation. The need thus arises to contrast this model with GWR, with the essential objective of revealing whether any other territorial pattern exists which has escaped analysis.
The conceptualisation of the model applied by GWR is similar with the aim of being able to contrast the results and confirm which technique is best fitted to the situation deriving from the territorial distribution of rural establishments. The decision was taken to use the same regressive variables and naturally maintain the lodging capacity in rural accommodation establishments as a dependent variable. Nevertheless, this type of regression allows the adding of a variable which acts as spatial weighting for each individual entity. For this reason, the number of rural accommodation establishments has been included as a calibration factor of the model.
From then on, different combinations were applied to select the kernel of the function and the most suitable method for calculating its extent. For the kernel, it was decided to use both fixed and adaptive types. The former covers a fixed distance (in metres), while the latter considers a variable distance depending on a specific number of neighbours. In both cases, the extent is calculated automatically when the corrected Akaike’s information criterion (AICc) or cross validation (CV) is used, and is selected by the authors when the bandwidth parameter is used.
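The difference between the two kernel types can be sketched as follows. This is a minimal illustration that assumes a Gaussian weighting function and projected coordinates in metres; the function names are hypothetical and the kernel actually implemented by the software used in the study may differ. With a fixed kernel, the same bandwidth is applied at every location, whereas with an adaptive kernel the bandwidth at each location is the distance to its k-th nearest neighbour.

```python
import numpy as np


def gaussian_weights(dist, bandwidth):
    """Gaussian distance decay: weight 1 at distance 0, decreasing with distance."""
    return np.exp(-0.5 * (dist / bandwidth) ** 2)


def fixed_kernel_weights(coords, i, bandwidth):
    """Weights of all observations around location i with a fixed bandwidth (metres)."""
    dist = np.linalg.norm(coords - coords[i], axis=1)
    return gaussian_weights(dist, bandwidth)


def adaptive_kernel_weights(coords, i, n_neighbours):
    """Weights around location i with the bandwidth set to the distance to the
    n_neighbours-th nearest neighbour (requires n_neighbours < number of points)."""
    dist = np.linalg.norm(coords - coords[i], axis=1)
    # Index 0 of the sorted distances is location i itself (distance 0).
    bandwidth = np.sort(dist)[n_neighbours]
    return gaussian_weights(dist, bandwidth)
```

In sparsely settled areas an adaptive kernel widens automatically so that each local regression still draws on enough observations, while a fixed kernel keeps the same spatial scale everywhere.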
A synthesis of the results obtained shows that, overall, they are better than those obtained by OLS (Table 14).
An analysis of the different methods used reflects that AICc and CV perform better than the bandwidth parameter, with both fixed and adaptive kernels. This is due to the fact that the condition number does not exceed the critical value of 30. Even so, the most effective result is achieved by applying AICc with fixed kernels. In this case, the automatic distance calculation indicates that the models are optimised by using a bandwidth of 63,627 metres. It can likewise be observed that the difference between the R² and the adjusted R² is not very high and that the proportion of variance of the dependent variable explained is reasonable if the complexity of the model is taken into account.
The territorial distribution of the local regression models reveals that the R² follows certain patterns (Figure 6a). In this sense, the best adjustments are obtained in the district of Alcántara and its vicinity, where they are over 40%, although there are other areas in which 30% is exceeded, such as the Villuercas-Ibores-Jara Geopark. In contrast, the adjustments of the areas of the north of the province of Cáceres or the south-centre of the province of Badajoz are lower than 20%.
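The way in which a local model and its local R² arise at a single location can be sketched in Python as a weighted least squares fit. This is a simplified illustration under stated assumptions: the function name `gwr_local_fit` is hypothetical, the weights are taken as given (for example, from a distance-decay kernel centred on the location), and the local R² uses one common definition based on weighted sums of squares, which may not match the exact formula of the software used in the study.

```python
import numpy as np


def gwr_local_fit(X, y, weights):
    """Weighted least squares fit at one location.

    X: n x k matrix of regressors, y: n observed values,
    weights: n spatial weights of the observations relative to the location.
    Returns the local coefficients (intercept first) and a local R-squared.
    """
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    # Scaling rows by sqrt(w) turns weighted least squares into ordinary least squares.
    sw = np.sqrt(weights)
    beta, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    pred = A @ beta
    y_bar = np.average(y, weights=weights)
    rss = (weights * (y - pred) ** 2).sum()
    tss = (weights * (y - y_bar) ** 2).sum()
    return beta, 1.0 - rss / tss
```

Repeating this fit at every population centre, each time with the weights centred on that centre, yields one set of coefficients and one local R² per location, which is what allows the quality of the adjustment to be mapped.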
The territorial analysis of the residuals obtained by the model shows a negative imbalance in those areas with the highest number of beds (Figure 6b). In the north of the province of Cáceres, the model thus estimates a higher lodging capacity than that existing in a large proportion of the centres. This is due to the considerable potential for development which these areas still have as a consequence of the superimposition of the resources preferred by the demand from tourists. In contrast, in these areas there are also centres in which the model estimates a number of beds lower than the actual number; these coincide with the centres with smaller and more aged populations and with a very limited capacity for enterprise. In the same way, other areas which are very well delimited geographically can be observed in which there are centres with a considerable imbalance, both positive and negative, between the values calculated by the model and those actually found (Table 15).
The distribution of the values observed and those predicted by the model (Figure 7a,b) corroborates certain guidelines. On the one hand, it is clear that the lodging capacity of rural accommodation establishments is concentrated in very few areas. In them, there are population centres with many beds in this kind of accommodation also intermingling with others in which the number of beds is more limited or even non-existent. On the other hand, when the distribution of the number of beds estimated by the regression model is analysed, it can be observed that they are concentrated in specific areas such as the north-east and other well-defined areas.
The comparison of the beds available in rural accommodation establishments with those calculated by the model reveals that the areas where the latter operates best maintain a certain balance between both variables (Figure 8). When other areas in which the model operates less adequately are analysed, the results are similar. Nevertheless, these adjustments are acceptable when R² exceeds 0.3. In contrast, the opposite is true in those areas of Extremadura in which rural tourism is more developed, such as the northern district, as a consequence of their less satisfactory adjustments. This scenario is completed by the existence of numerous population centres which have no beds available, although the model allocates some to them.
An analysis of the results obtained by means of GWR shows the complexity of trying to explain the distribution of rural accommodation establishments by tourist preferences, which reveals that the implementation of the same has not followed an objective criterion. Far from it, in certain places, their location follows an almost random criterion which may perhaps be explained by other variables which have nothing to do with the tourism and business logic that should govern any tourist business.

4. Discussion

Regression models aim to explain the relationship between independent variables and the dependent variable, although in cases in which human decisions or behaviour intervene, it is always difficult to find strong causal relationships [75]. This is clear in the case analysed, in which we aim to determine whether the location of rural accommodation establishments, and consequently the presence of this type of supply, follows logical criteria or depends on inconsistent personal decisions, as was found in some studies [64].
In order to obtain probable explanations which reveal the incidence of a set of explanatory variables on an explained variable, regressions of multiple types are generally applied. Taking into account the literature analysed, we decided to use ordinary least squares regression, which is one of the techniques most frequently used by scientists [85,86,87,88]; however, given its limitations when contextualising spatial relationships, geographically weighted regression is becoming more and more accepted in analyses of tourism [89,90]. The essential objective of using these two types of analysis is to determine which of the two best adapts to the creation of models explaining the location of the supply.
In both cases, it has been found that it is very difficult to achieve a valid model which complies with the statistical requirements necessary so that predictions can be made according to the same. In this sense, it is clear that GWR obtains better results than OLS (hypothesis 3), although both types of regression show that the distribution of rural accommodation establishments is not ideal; this means that they also refute other techniques such as grouping analysis [57], hot spot analysis, or cluster and outlier analysis [41].
The difficulty in defining a suitable model to explain the distribution of rural accommodation establishments is confirmed when the preferences of the demand for rural tourism in the area studied are taken into account. For this reason, it can be inferred that their location depends on a statistically complex environment in which decisions are made on many occasions, owing to a personal criterion supported by expansive tourist policies which sometimes lack rigour. These policies pursued the profusion of the supply to the detriment of other necessary options. We can thus understand declarations calling for the need to implement a supply of quality tourist products which are capable of making destinations more attractive. Among these products, the following stand out: dehesas as an essential resource to support currently non-existent agrotourism [91], bathing areas and reservoirs [92], protected areas [93], the historical heritage [94], and very specific segments such as hunting [95].
The demonstration that the distribution of the pool of rural accommodation establishments is not ideal, as part of this supply is located in areas which do not have the attractions that the demand considers necessary, makes it clear that it is necessary to delimit satisfactorily the explanatory variables which can reliably define the location of this type of supply (hypothesis 1). In order to do so, it is advisable to use the criterion of the demand of the tourist type analysed. In this sense, surveys in which tourists state the activities they prefer provide criteria which are ultimately much more realistic and effective than those of any other method. It should not be forgotten that when tourists indicate their preferences, they convert them into rigorous variables which would explain the presence of accommodation establishments (hypothesis 2). As a result, it is considered that the main attractions that must act as explanatory variables are, above all, mountain areas, bathing areas, and historical-artistic ensembles, in locations which are also far from highways and reservoirs. The simple superimposition of these locations on the supply of rural accommodation establishments shows that some of the places in which the supply is available do not have this kind of attraction. This undoubtedly means that its location does not follow logical criteria, which explains why, in these areas lacking the attractions required by tourists, the occupation level is appreciably lower than in other areas better adapted to the preferences of tourists.
This fact is revealed by both techniques, although it is obvious that geographically weighted regression is much more efficient, as it is capable of determining the areas to which the regression models are better adapted. In this respect, it should be remembered that this type of regression is capable of generating a different equation for each of the locations selected (hypothesis 4), which accounts for its analytical superiority. It is interesting that there are areas in which the adjustment of the models can be improved; they coincide with areas in which the attractions required by tourists do not abound. Moreover, the models described identify some outliers, in short imbalances, owing to a specific factor such as a low demographic volume (hypothesis 5). In this sense, it is understood that small population centres, normally with fewer than 500 inhabitants, lack the endogenous capacity for enterprise and also suffer from a scarcity of services and tourist equipment.
Taking into account the aspects listed, it can be affirmed that the distribution of rural accommodation establishments does not correspond as it should to the presence of the attractions preferred by tourists. It is also noteworthy that ordinary least squares regression is not the most effective technique for explaining this distribution, not even when the aim is to provide a generic model. In contrast, geographically weighted regression operates better, as it explains a higher percentage of the variance and also identifies areas in which the models constructed are more effective.
The analyses carried out reveal that GWR is much superior to OLS for predicting the supply of accommodation, even when the models analysed are based on behaviour as variable as human decisions, which neither obey physical laws nor follow coherent guidelines. This is because, although both types of regression obtain general models, only GWR allows local models adapted to each individual case (hypothesis 6). Moreover, it has another inherent advantage when tackling spatial statistics, i.e., the consideration of multiple types of spatial relationships, which is particularly useful for tourism. In this discipline, it has been shown on numerous occasions that proximity is an essential element, in line with the first law of geography, which indicates that interactions between places are inversely proportional to the cost of the distance between them [79].
The very complexity of the model analysed leads us to think that this type of technique is very useful, even in erratic systems such as that analysed, and facilitates decision-making at least in the places better adapted to the models designed. Nevertheless, both the global and local models defined by GWR can be improved from a statistical perspective by adding another type of explanatory variable, although when this possibility was explored with the inclusion of the most emblematic protected natural spaces of the study area, it was found that statistical consistency fell. Due to this, the results given here were obtained after multiple improvements and discarding the most problematical models, seeking the best relationship between a logical explanation and one adapted to the preferences of tourists and statistical robustness.
The applicability of this research allows both the local and regional administration to define a tourist policy conducive to the search for alternatives in the designing of tourist products in the most problematical areas and to stress the value of other attractions of these territories which are not mentioned by tourists. Indeed, it can be observed that many areas, in which the supply is not adapted to the resources required by visitors, contain other elements which may be of interest such as protected natural spaces. In them, specific products could be created, orientated towards nature tourism, ornithological tourism, or even agrotourism, as many spaces feature the typical landscape of the southwest of the Iberian Peninsula, the “dehesa”.
All this opens a new line of analysis based on finding explanations for the suitable location of investments, in this case, illustrated with the supply of rural accommodation in a specific context, Extremadura. This line would improve the model, perhaps including more attractions so that the tourist can define it better.

5. Conclusions

The analysis of the distribution of the supply of rural accommodation in Extremadura, using two types of regression widely accepted by the scientific community, leads to the following conclusions:
The first shows that the location of the supply of rural accommodation is random over a large part of the territory, as is shown by the fact that there is no correspondence between the resources required by the tourist and those present in said areas. Fortunately, the centres in which there are fewer tourist attractions according to the demand tend not to have many beds. In contrast, where this correspondence does exist, an important pool of lodging is detected. In line with the above, it is shown that the north of the province of Cáceres, recognised by the INE as an outstanding area for rural tourism at a national level, together with other areas with the highest relief in Extremadura, does have the necessary attractions and moreover a considerable number of rural accommodation beds.
The second is that the most important attractions for the demand from rural tourists who stay in this type of accommodation are varied, although those predominating are altitude, bathing areas, and historical-artistic ensembles. It is indeed observed that the distance to these elements has a negative relation with the number of beds available. In other words, the nearer the centres are to them, the greater their capacity for rural accommodation.
The third reveals that the application of OLS does not provide a convincing explanation for the model defined by tourists. As a global statistic, OLS defines a single equation to explain the relationship between the independent variables and the dependent variable, but the variability of circumstances reflected in the territory conceals different realities which cannot be synthesised in a single equation.
The fourth is that GWR gives better results for both the global model and the local models which it is capable of creating. Moreover, it makes it possible to distinguish the spaces in which the model is more reliable from those in which the level of adjustment is appreciably lower. As has been mentioned, the reason for this is that different situations require a differentiated analysis, which this regression allows, as it has not only a global output but also a local one, calculating a specific equation for each case analysed.
The fifth shows that the configuration of spatial relations is necessary to explain models in which territorial proximity is vital. In this case, tourist activities are a clear exponent as proximity to resources is a clear sign which can explain the presence of the supply of accommodation.
The sixth and last conclusion refers to the need to continue with a line of research into the application of spatial statistics to all those systems in which the territory can be considered as one more variable to analyse. When both regression models were compared, the immediate conclusion was the better performance of GWR compared with the poor results of OLS.

Author Contributions

Conceptualization: J.-M.S.-M., J.-I.R.-G., and J.-L.G.-G.; methodology: J.-M.S.-M.; formal analysis: J.-M.S.-M. and J.-L.G.-G.; investigation: J.-M.S.-M.; resources: J.-M.S.-M.; data curation: J.-M.S.-M. and J.-I.R.-G.; writing (original draft preparation): J.-M.S.-M., J.-I.R.-G., and J.-L.G.-G.; writing (review and editing): J.-M.S.-M. and J.-L.G.-G.; supervision: J.-M.S.-M. and J.-L.G.-G.; project administration: J.-M.S.-M.; funding acquisition: J.-M.S.-M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by DISEÑO Y ELABORACIÓN DE PRODUCTOS ESTRATÉGICOS DIFERENCIADOS PARA LA POTENCIACIÓN DEL TURISMO RURAL EN EXTREMADURA. DE LA DETECCIÓN DE PROBLEMAS A LA PROPUESTA DE SOLUCIONES BASADAS EN CRITERIOS GEOESTADÍSTICOS, grant number PRI-IB16040 and the APC was funded by the Consejería de Economía e Infraestructuras de la Junta de Extremadura (the branch of the regional government that covers economy and infrastructure) and by the European Regional Development Fund (ERDF).

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

References

  1. Jackson, J. Developing regional tourism in China: The potential for activating business clusters in a socialist market economy. Tour. Manag. 2006, 27, 695–706. [Google Scholar] [CrossRef]
  2. Rogerson, C.M. Regional tourism in South Africa: A case of ‘mass tourism of the South’. GeoJournal 2004, 60, 229–237. [Google Scholar] [CrossRef]
  3. Braun, P. Regional tourism networks: The nexus between ict diffusion and change in Australia. Inf. Technol. Tour. 2003, 6, 231–243. [Google Scholar] [CrossRef]
  4. Ateljevic, J. Tourism entrepreneurship and regional development: Example from New Zealand. Int. J. Entrep. Behav. Res. 2009, 15, 282–308. [Google Scholar] [CrossRef]
  5. Frechtling, D.C. The tourism satellite account: Foundations, progress and issues. Tour. Manag. 1999, 20, 163–170. [Google Scholar] [CrossRef]
  6. Smeral, E. Tourism satellite accounts: A critical assessment. J. Travel Res. 2006, 45, 92–98. [Google Scholar] [CrossRef]
  7. Xu, C.; Jones, C.; Munday, M. Tourism inward investment and regional economic development effects: Perspectives from tourism satellite accounts. Reg. Stud. 2019, 1–12. [Google Scholar] [CrossRef]
  8. Perna, F.; Custódio, M.J.; Oliveira, V. Tourism destination competitiveness: An application model for the south of Portugal versus the Mediterranean region of Spain: Competitivtour. Tour. Manag. Stud. 2018, 14, 19–29. [Google Scholar] [CrossRef] [Green Version]
  9. Sánchez-Rivero, M.; Martín, J.M.S.; Gallego, J.I.R. Methodological approach for assessing the potential of a rural tourism destination: An application in the province of Cáceres (Spain). Curr. Issues Tour. 2014, 19, 1084–1102. [Google Scholar] [CrossRef]
  10. Ebejer, J. Urban heritage and cultural tourism development: A case study of Valletta’s role in Malta’s tourism. J. Tour. Cult. Chang. 2018, 17, 306–320. [Google Scholar] [CrossRef]
  11. Li, J.; Xu, L.; Tang, L.; Wang, S.; Li, L. Big data in tourism research: A literature review. Tour. Manag. 2018, 68, 301–323. [Google Scholar] [CrossRef]
  12. Xiang, Z.; Fesenmaier, D.R. Big data analytics, tourism design and smart tourism. In Co-Creation and Well-Being in Tourism; Springer Science and Business Media LLC: Cambridge, UK, 2016; pp. 299–307. [Google Scholar]
  13. Song, H.; Liu, H. Predicting tourist demand using big data. In Co-Creation and Well-Being in Tourism; Springer Science and Business Media LLC: Cambridge, UK, 2016; pp. 13–29. [Google Scholar]
  14. Yang, Y.; Pan, B.; Song, H. Predicting hotel demand using destination marketing organization’s web traffic data. J. Travel Res. 2013, 53, 433–447. [Google Scholar] [CrossRef] [Green Version]
  15. Miah, S.J.; Vu, H.Q.; Gammack, J.; McGrath, M. A big data analytics method for tourist behaviour analysis. Inf. Manag. 2017, 54, 771–785. [Google Scholar] [CrossRef] [Green Version]
  16. Alaei, A.; Becken, S.; Stantic, B. Sentiment analysis in tourism: Capitalizing on big data. J. Travel Res. 2017, 58, 175–191. [Google Scholar] [CrossRef]
  17. Gu, Z.; Zhang, Y.; Chen, Y.; Chang, X. Analysis of attraction features of tourism destinations in a mega-city based on check-in data mining—A case study of Shenzhen, China. ISPRS Int. J. Geo Inf. 2016, 5, 210. [Google Scholar] [CrossRef] [Green Version]
  18. Xu, F.; Nash, N.; Whitmarsh, L. Big data or small data? A methodological review of sustainable tourism. J. Sustain. Tour. 2019, 28, 144–163. [Google Scholar] [CrossRef]
  19. Yang, Y.; Zhang, H. Spatial-temporal forecasting of tourism demand. Ann. Tour. Res. 2019, 75, 106–119. [Google Scholar] [CrossRef]
  20. Reid, L.J.; Andereck, K.L. Statistical analyses use in tourism research. J. Travel Res. 1989, 28, 21–24. [Google Scholar] [CrossRef]
  21. Nunkoo, R. The Routledge handbook of tourism impacts. In The Routledge Handbook of Tourism Impacts; Routledge: London, UK, 2019; p. 14. [Google Scholar]
  22. Chatzigeorgiou, C.; Christou, E. Adoption of social media as distribution channels in tourism marketing: A qualitative analysis of consumers’ experiences. J. Tour. Herit. Serv. Mark. 2020, 6, 25–32. [Google Scholar]
  23. Dai, L.; Wang, S.; Xu, J.; Wan, L.; Wu, B. Qualitative analysis of residents′ perceptions of tourism impacts on historic districts: A case study of Nanluoguxiang in Beijing, China. J. Asian Arch. Build. Eng. 2017, 16, 107–114. [Google Scholar] [CrossRef] [Green Version]
  24. Mano, A.; Costa, R. Imagem projetada de Portugal como destino turístico: Análise qualitativa do portal oficial de promoção turística. Rev. Tur. Desenvolv. 2018, 29, 65–83. [Google Scholar]
  25. Martín, J.M.S.; Delgado, L.M.M.; Gallego, J.I.R. La reputación online de los alojamientos rurales en Extremadura desde una óptica geoestadística. BAGE 2019, 82, 1–36. [Google Scholar] [CrossRef] [Green Version]
  26. Tarí, J.J.; Molina-Azorin, J.F.; Pereira-Moliner, J.; López-Gamero, M.D. The internalization of a sectorial standard for quality management: A qualitative analysis in tourism. In Chinese Acquisitions in Developed Countries; Springer: Cham, Switzerland, 2017; pp. 97–108. [Google Scholar]
  27. Wang, X.; Sun, J.; Wen, H. Tourism seasonality, online user rating and hotel price: A quantitative approach based on the hedonic price model. Int. J. Hosp. Manag. 2019, 79, 140–147. [Google Scholar] [CrossRef]
  28. Beerli, A.; Martín, J.D. Tourists’ characteristics and the perceived image of tourist destinations: A quantitative analysis—A case study of Lanzarote, Spain. Tour. Manag. 2004, 25, 623–636. [Google Scholar] [CrossRef]
  29. Walle, A.H. Quantitative versus qualitative tourism research. Ann. Tour. Res. 1997, 24, 524–536. [Google Scholar] [CrossRef]
  30. Nunkoo, R. Handbook of research methods for tourism and hospitality management. In Handbook of Research Methods for Tourism and Hospitality Management; Edward Elgar Publishing: Cheltenham, UK, 2018; pp. 3–23. [Google Scholar]
  31. Hall, C.M. Quantitative and qualitative content analysis. In Handbook of Research Methods in Tourism and Hospitality; Edward Elgar Publishing: Cheltenham, UK, 2018; pp. 395–406. [Google Scholar]
  32. Dwyer, L.; Gill, A.; Seetaram, N. Handbook of Research Methods in Tourism: Quantitative and Qualitative Approaches; Edward Elgar Publishing: Cheltenham, UK, 2012; p. 528. [Google Scholar]
  33. Nunkoo, R.; Ramkissoon, H. Structural equation modelling and regression analysis in tourism research. Curr. Issues Tour. 2012, 15, 777–802. [Google Scholar] [CrossRef]
  34. Lv, Z.; Xu, T. A panel data quantile regression analysis of the impact of corruption on tourism. Curr. Issues Tour. 2016, 20, 1–14. [Google Scholar] [CrossRef]
  35. Santeramo, F.G.; Morelli, M. Modelling tourism flows through gravity models: A quantile regression approach. Curr. Issues Tour. 2015, 19, 1–7. [Google Scholar] [CrossRef]
  36. Ferreira, F.A.; Castro, C. The impact of terrorism and corruption on tourism in Turkey: A regression analysis. J. Comput. Methods Sci. Eng. 2019, 1–8. [Google Scholar] [CrossRef]
  37. Kim, N.; Song, H.; Pyun, J.H. The relationship among tourism, poverty, and economic development in developing countries. Tour. Econ. 2016, 22, 1174–1190. [Google Scholar] [CrossRef]
  38. Dogru, T.; Sirakaya-Turk, E. Engines of tourism’s growth: An examination of efficacy of shift-share regression analysis in South Carolina. Tour. Manag. 2017, 58, 205–214. [Google Scholar] [CrossRef]
  39. Gutiérrez, J.; García-Palomares, J.C.; Romanillos, G.; Salas-Olmedo, M.H. The eruption of Airbnb in tourist cities: Comparing spatial patterns of hotels and peer-to-peer accommodation in Barcelona. Tour. Manag. 2017, 62, 278–291. [Google Scholar] [CrossRef] [Green Version]
  40. Sarrión-Gavilán, M.D.; Márquez, M.D.B.; Mora-Rangel, E.O. Spatial distribution of tourism supply in Andalusia. Tour. Manag. Perspect. 2015, 15, 29–45. [Google Scholar] [CrossRef]
  41. Martín, J.M.S.; Gallego, J.I.R.; Blas-Morato, R. Hot spot analysis versus cluster and outlier analysis: An enquiry into the grouping of rural accommodation in Extremadura (Spain). ISPRS Int. J. Geo Inf. 2019, 8, 176. [Google Scholar] [CrossRef] [Green Version]
  42. Xu, F.; Hu, M.; La, L.; Wang, J.; Huang, C. The influence of neighbourhood environment on Airbnb: A geographically weighed regression analysis. Tour. Geogr. 2019, 22, 192–209. [Google Scholar] [CrossRef]
  43. Baggio, R.; Klobas, J. Quantitative Methods in Tourism; Channel View Publications, Ltd.: Bristol, UK, 2017. [Google Scholar]
  44. Rezaei, S. Quantitative methods, applications, and trends in Asian tourism research. In Quantitative Tourism Research in Asia; Springer: Singapore, 2018; pp. 1–10. [Google Scholar]
  45. Etxeberría, J. Regresión Múltiple; La Muralla: Madrid, Spain, 2007; p. 160. [Google Scholar]
  46. Martín, J.M.S.; Rivero, M.S.; Gallego, J.I.R. Analysis of the balance between tourist potential and supply of rural tourism accommodation using spatial statistical techniques. An application to the province of Cáceres (Spain). Cuad. Tur. 2017, 39, 699–702. [Google Scholar]
  47. Fotheringham, A.S.; Charlton, M.; Brunsdon, C. Geographically weighted regression: A natural evolution of the expansion method for spatial data analysis. Environ. Plan. A 1998, 30, 1905–1927. [Google Scholar] [CrossRef]
  48. Lu, B.; Charlton, M.; Harris, P.; Fotheringham, A.S. Geographically weighted regression with a non-Euclidean distance metric: A case study using hedonic house price data. Int. J. Geogr. Inf. Sci. 2014, 28, 660–681. [Google Scholar] [CrossRef]
  49. Smith, S.L. Room for rooms: A procedure for the estimation of potential expansion of tourist accommodations. J. Travel Res. 1977, 15, 26–29. [Google Scholar] [CrossRef]
  50. Yang, Y.; Luo, H.; Law, R. Theoretical, empirical, and operational models in hotel location research. Int. J. Hosp. Manag. 2014, 36, 209–220. [Google Scholar] [CrossRef]
  51. Sánchez, J.M. El sistema de información geográfica como herramienta de análisis turístico. Una aplicación para la localización idónea de alojamientos rurales en la provincia de Cáceres mediante análisis multicriterio. Rev. Estudios Turíst. 2009, 182, 71–94. [Google Scholar]
  52. Sánchez, M.; Sánchez, J.M. Nuevas técnicas de investigación aplicadas al turismo: La estadística espacial. In M. Figuerola & C. Martín. Modelos De Gestión E Innovación En Turismo; Civitas: Madrid, Spain, 2019; pp. 109–132. [Google Scholar]
  53. Goodrich, J.N. Differences in perceived similarity of tourism regions: A spatial analysis. J. Travel Res. 1977, 16, 10–13. [Google Scholar] [CrossRef]
  54. Wilson, J. The Routledge handbook of tourism geographies. In The Routledge Handbook of Tourism Geographies; Routledge: London, UK, 2012; pp. 163–173. [Google Scholar]
  55. Martín, J.M.S.; Rivero, M.S.; Gallego, J.I.R. La evaluación del potencial para el desarrollo del turismo rural. Aplicación metodológica sobre la provincia de Cáceres. Geofocus 2013, 13, 99–130. [Google Scholar]
  56. Sánchez, J.M.; Pérez, M.N.; Jurado, J.C.; Granados, M.M. Detección de áreas óptimas para la implantación de alojamientos rurales en Extremadura. Una aplicación SIG. Lurralde 1999, 22, 367–384. [Google Scholar]
  57. Martín, J.M.S.; Gallego, J.I.R.; Sánchez-Rivero, M. Caracterización espacial del turismo en Extremadura mediante análisis de agrupamiento (grouping analysis). Un ensayo técnico. GeoFocus 2017, 19, 207–235. [Google Scholar] [CrossRef] [Green Version]
  58. Getis, A.; Ord, J.K. The analysis of spatial association by use of distance statistics. Geogr. Anal. 2010, 24, 189–206. [Google Scholar] [CrossRef]
  59. Ord, J.K.; Getis, A. Local spatial autocorrelation statistics: Distributional issues and an application. Geogr. Anal. 2010, 27, 286–306. [Google Scholar] [CrossRef]
  60. Griffith, D.A. Teaching spatial autocorrelation by simulation. J. Geogr. High. Educ. 1987, 11, 143–153. [Google Scholar] [CrossRef]
  61. Anselin, L. A local indicator of multivariate spatial association: Extending Geary’s c. Geogr. Anal. 2018, 51, 133–150. [Google Scholar] [CrossRef]
  62. Fotheringham, A.S.; Brunsdon, C.; Charlton, M. Geographically Weighted Regression: The Analysis of Spatially Varying Relationships; John Wiley & Sons: New York, NY, USA, 2002; p. 284. [Google Scholar]
  63. Díaz, M.V. Tills y otros depósitos relacionados con la dinámica glaciar en la Sierra do Xistral: Interpretación de litofacies y reconstrucción paleoglaciar. BAGE 2018, 76, 276–302. [Google Scholar] [CrossRef]
  64. Gallego, J.I.R.; Martín, J.M.S. Análisis de la distribución territorial de los alojamientos rurales y convencionales en los núcleos rurales de Extremadura. An. Geogr. Univ. Complut. 2019, 39, 463–484. [Google Scholar] [CrossRef]
  65. Sánchez, J.M.; Rengifo, J.I. Evolución del sector turístico en la Extremadura del siglo XXI: Auge, crisis y recuperación. Lurralde 2019, 42, 19–50. [Google Scholar]
  66. Junta de Extremadura. Extremadura Turismo. Available online: https://turismoextremadura.com/ (accessed on 12 January 2020).
  67. DOE Núm. 74 Decreto 65/2015, de 14 de Abril, Por El Que Se Establece la Ordenación y Sistema de Clasificación de Los Alojamientos de Turismo Rural de la Comunidad Autónoma de Extremadura; Diario Oficial de Extremadura (DOE): Mérida, Spain, 2015.
  68. Instituto Nacional de Estadística (INE). Available online: https://www.ine.es/daco/daco42/ocuptr/eotr0019.xlsx (accessed on 15 January 2020).
  69. Instituto Geográfico Nacional (IGN). Centro de Descargas. Available online: http://centrodedescargas.cnig.es/CentroDescargas/ (accessed on 19 January 2020).
  70. Instituto Geográfico Nacional (IGN). Base Topográfica Nacional 1:100.000 (BTN100). Available online: http://www.ign.es/web/resources/docs/IGNCnig/CBG%20-%20BTN100.pdf (accessed on 19 January 2020).
  71. Santos, C.A.S.M.; Vieira, J.A.C. An analysis of visitors’ expenditures in a tourist destination: OLS, quantile regression and instrumental variable estimators. Tour. Econ. 2012, 18, 555–576. [Google Scholar] [CrossRef]
  72. Brida, J.G.; Scuderi, R. Determinants of tourist expenditure: A review of microeconometric models. Tour. Manag. Perspect. 2013, 6, 28–40. [Google Scholar] [CrossRef] [Green Version]
  73. Wilcox, R.R. Robust regression: Testing global hypotheses about the slopes when there is multicollinearity or heteroscedasticity. Br. J. Math. Stat. Psychol. 2018, 72, 355–369. [Google Scholar] [CrossRef]
  74. Wilcox, R.R. Multicolinearity and ridge regression: Results on type I errors, power and heteroscedasticity. J. Appl. Stat. 2018, 46, 946–957. [Google Scholar] [CrossRef]
  75. Sánchez, J.M. La matriz de correlación lineal en climatología. Los riesgos interpretativos: Su reducción o eliminación. Estudios Geogr. 1995, 219, 411–434. [Google Scholar]
  76. Meloun, M.; Militký, J. Detection of single influential points in OLS regression model building. Anal. Chim. Acta 2001, 439, 169–191. [Google Scholar] [CrossRef]
  77. ESRI. Available online: https://doc.arcgis.com/es/insights/analyze/regression-analysis.htm (accessed on 19 February 2020).
  78. Berry, W. Understanding Regression Assumptions; SAGE Publications: Ventura, CA, USA, 1993. [Google Scholar]
  79. Tobler, W.R. A computer movie simulating urban growth in the Detroit region. Econ. Geogr. 1970, 46, 234. [Google Scholar] [CrossRef]
  80. Anselin, L. Local indicators of spatial association-LISA. Geogr. Anal. 2010, 27, 93–115. [Google Scholar] [CrossRef]
  81. Tu, J.; Xia, Z.-G. Examining spatially varying relationships between land use and water quality using geographically weighted regression I: Model design and evaluation. Sci. Total Environ. 2008, 407, 358–378. [Google Scholar] [CrossRef] [PubMed]
  82. De Toro, P.; Nocca, F.; Renna, A.; Sepe, L. Real estate market dynamics in the city of Naples: An integration of a multi-criteria decision analysis and geographical information system. Sustainability 2020, 12, 1211. [Google Scholar] [CrossRef] [Green Version]
  83. Sánchez, J.M. Propuesta Metodológica Para La Generación De Información Climática En La Provincia De Cáceres; Fundicot-Extremadura Universidad de Extremadura: Cáceres, Spain, 1995. [Google Scholar]
  84. Sánchez, M. Memoria Turística de Extremadura Por Territorios. Año 2018. Available online: https://www.turismoextremadura.com/.content/observatorio/2019/EstudiosYMemoriasAnuales/Memoria_turistica_por_territorios_2018.pdf (accessed on 12 December 2019).
  85. Thrane, C. Students’ summer tourism: Determinants of length of stay (LOS). Tour. Manag. 2016, 54, 178–184. [Google Scholar] [CrossRef]
  86. Wang, Y.; Davidson, M.C. A review of micro-analyses of tourist expenditure. Curr. Issues Tour. 2010, 13, 507–524. [Google Scholar] [CrossRef] [Green Version]
  87. Wu, L.; Zhang, J.; Fujiwara, A. Tourism participation and expenditure behaviour: Analysis using a scobit based discrete–continuous choice model. Ann. Tour. Res. 2013, 40, 1–17. [Google Scholar] [CrossRef]
  88. Zheng, B.; Zhang, Y. Household expenditures for leisure tourism in the USA, 1996 and 2006. Int. J. Tour. Res. 2011, 15, 197–208. [Google Scholar] [CrossRef]
  89. Jin, C.; Xu, J.; Huang, Z. Spatiotemporal analysis of regional tourism development: A semiparametric geographically weighted regression model approach. Habitat Int. 2019, 87, 1–10. [Google Scholar] [CrossRef]
  90. Ahmadi, M.; Baaghide, M.; Roudbari, A.D.; Asadi, M. Modeling the role of topography on the potential of tourism climate in Iran. Model. Earth Syst. Environ. 2018, 4, 13–25. [Google Scholar] [CrossRef]
  91. Sánchez-Martín, J.M.; Blas-Morato, R.; Rengifo-Gallego, J.I. The dehesas of Extremadura, Spain: A potential for socio-economic development based on agritourism activities. Forests 2019, 10, 620. [Google Scholar] [CrossRef] [Green Version]
  92. Martín, J.M.S.; Sánchez-Rivero, M.; Gallego, J.I.R. Water as a tourist resource in Extremadura: Assessment of its attraction capacity and approximation to the tourist profile. Sustainability 2020, 12, 1659. [Google Scholar] [CrossRef] [Green Version]
  93. Martín, J.M.S.; Gallego, J.I.R.; Sánchez-Rivero, M. Protected areas as a center of attraction for visits from world heritage cities: Extremadura (Spain). Land 2020, 9, 47. [Google Scholar] [CrossRef] [Green Version]
  94. Martín, J.M.S.; Gurría-Gascón, J.-L.; García-Berzosa, M.-J. The cultural heritage and the shaping of tourist itineraries in rural areas: The case of historical ensembles of Extremadura, Spain. ISPRS Int. J. Geo Inf. 2020, 9, 200. [Google Scholar] [CrossRef] [Green Version]
  95. Delgado, L.M.M.; Gallego, J.I.R.; Martín, J.M.S. Hunting tourism as a possible development tool in protected areas of Extremadura, Spain. Land 2020, 9, 86. [Google Scholar] [CrossRef] [Green Version]
Figure 1. Location of the study area.
Figure 2. Methodological diagram of the stages followed in the research.
Figure 3. Distribution of rural accommodation establishments and main (a) and other (b) explanatory variables.
Figure 4. Variable distributions and relationships.
Figure 5. Analysis of the residuals: (a) Gaussian distribution of the standardised residuals; (b) Moran’s I spatial autocorrelation; (c) territorial distribution of the standardised residuals.
Figure 6. Local R2 (a) and residuals (b).
Figure 7. Beds in rural accommodation establishments (a) and predicted values (b).
Figure 8. Graphic distribution of the beds observed vs. the beds calculated.
Table 1. Main tourist parameters of Extremadura (2019).

| Type of Establishment | Establishments | Beds | Beds Including Extra Beds |
|---|---|---|---|
| Hotels | 433 | 18,373 | 19,994 |
| Rural accommodation | 862 | 8933 | 11,055 |
| Tourist apartment | 396 | 3015 | 3015 |
| Hostels | 45 | 1827 | 1827 |
| Campsites | 32 | 10,066 | 10,066 |
| Total | 1768 | 42,214 | 45,957 |
Table 2. Main parameters of rural tourism establishments (2019).

| Autonomous Region | Beds * | Travellers | Overnight Stays | Average Stay | Occupancy (%) |
|---|---|---|---|---|---|
| Andalusia | 21,642 | 395,723 | 1,358,967 | 3.43 | 18.06 |
| Aragón | 9644 | 194,182 | 581,749 | 3.00 | 17.03 |
| Principality of Asturias | 15,221 | 302,765 | 973,077 | 3.21 | 19.26 |
| Balearic Islands | 10,470 | 329,479 | 1,249,695 | 3.79 | 46.55 |
| Canary Islands | 3199 | 78,420 | 366,049 | 4.67 | 29.22 |
| Cantabria | 8784 | 263,220 | 625,328 | 2.38 | 24.56 |
| Castilla y León | 33,113 | 829,451 | 1,825,065 | 2.20 | 15.88 |
| Castilla-La Mancha | 14,283 | 313,030 | 688,244 | 2.20 | 13.05 |
| Cataluña | 18,790 | 496,748 | 1,244,328 | 2.50 | 19.03 |
| Valencia | 10,414 | 186,649 | 467,731 | 2.51 | 12.49 |
| Extremadura | 7758 | 224,908 | 490,949 | 2.18 | 17.64 |
| Galicia | 6496 | 228,449 | 425,974 | 1.86 | 19.81 |
| Madrid | 3753 | 169,950 | 294,231 | 1.73 | 21.76 |
| Murcia | 2406 | 56,850 | 125,553 | 2.21 | 13.97 |
| Navarra | 6821 | 136,901 | 449,619 | 3.28 | 19.89 |
| Basque Country | 4322 | 165,247 | 415,703 | 2.52 | 28.19 |
| La Rioja | 1168 | 38,958 | 87,777 | 2.25 | 21.09 |
* Data for August 2019.
Table 3. Aspects included in the survey completed by the tourists.

| Questions | Answers |
|---|---|
| Sex | Male/Female |
| Age | Year of birth |
| Studies | No studies/Primary studies/Secondary/University |
| Income | Up to EUR 20,000/EUR 20,000 to 30,000/Over EUR 30,000 |
| Place of residence | Town or City |
| Place of overnight stay | Town/Towns |
| Type of tourism practised | Rural/Cultural/Rural and cultural |
| Accommodation type chosen | Tourist hostel/Tourist apartment/Campsite/House of one’s own or of friends or family/Holiday cottage/Hostelry/Hostel or guest house/Spa hotel/1 to 3 star hotel/4 or 5 star hotel/Rural hotel/Not stated/Other accommodation |
| Duration of the stay | No. of days |
| Activities carried out | Cultural visits/rural tourism/birdwatching/enjoying gastronomy/visiting caves or geological formations/using bathing areas |
| Daily expense per person | <EUR 25/EUR 25 to 50/EUR 50 to 75/>EUR 75 |
| Noteworthy reasons for choosing the destination | Bathing area/rivers or reservoirs/natural areas/type of vegetation/accessibility/cultural resources/climate/altitude/other (open reply) |
| Assessment of the destination | 1 to 10 |
| Observations | Free opinion |
Table 4. Alphanumerical variables and how they were obtained.

| Variable | How Obtained |
|---|---|
| Mountain | Euclidean distance |
| Sierra | Euclidean distance |
| Bathing area | Shortest distance by road |
| Reservoir | Shortest distance by road |
| Monfragüe National Park | Shortest distance by road |
| Garganta de los Infiernos Nature Reserve | Shortest distance by road |
| Villuercas-Ibores-Jara Geopark | Shortest distance by road |
| Tajo-Internacional Nature Reserve | Shortest distance by road |
| Historical Ensemble | Shortest distance by road |
| Interpretation Centre | Shortest distance by road |
| Distance to highway | Shortest distance by road |
| Resident population | Direct allocation to the centre |
Table 5. Activities and motivations.

| Activities/Motivations | N | % |
|---|---|---|
| Rural tourism | 1267 | 28.82 |
| Cultural visits | 1108 | 25.20 |
| Tourism in rivers, gorges, or reservoirs | 636 | 14.46 |
| Gastronomy | 635 | 14.44 |
| Birdwatching | 245 | 5.57 |
| Visiting mines, caves, or geological formations | 149 | 3.39 |
| Practising sport | 143 | 3.25 |
| Stargazing | 69 | 1.57 |
| Visiting wine cellars | 61 | 1.39 |
| Participating in events (congresses or meetings) | 39 | 0.89 |
| Visiting film or TV series locations | 18 | 0.41 |
| Learning Spanish | 14 | 0.32 |
| Hunting | 13 | 0.30 |
Table 6. Tourist figures in rural accommodation establishments according to types of landscape.

| Tourist Territories | Beds | Travellers | Overnight Stays | Average Stay | Annual Occupancy (%) | Type of Landscape |
|---|---|---|---|---|---|---|
| Valle del Jerte and La Vera | 2595 | 46,546 | 100,383 | 2.16 | 31.36 | Mountain |
| Sierra de Gata, Las Hurdes, and Valle del Alagón | 1438 | 30,163 | 71,981 | 2.39 | 21.91 | Mountain |
| Valle del Ambroz and Tierras de Granadilla | 921 | 37,810 | 89,137 | 2.36 | 46.97 | Mountain |
| Alqueva, Sierra Suroeste, and Tentudía | 505 | 10,618 | 20,288 | 1.94 | 25.41 | Sierra |
| Trujillo, Miajadas, and Montánchez | 671 | 19,427 | 40,484 | 2.08 | 25.51 | Peneplain |
| Monfragüe Biosphere Reserve | 491 | 5900 | 12,816 | 2.17 | 27.30 | Protected space |
| Vegas del Guadiana | 239 | 10,775 | 20,819 | 1.93 | 25.96 | River valley |
Table 7. Explanatory variables applied in each model.

| Distance to: | Model 1 (Not Improved) | Model 2 (Improved) | Model 3 (Improved) |
|---|---|---|---|
| Historical Ensemble | X | X | X |
| Interpretation Centre | X | | |
| Highway | X | X | X |
| Reservoir | X | X | X |
| Bathing Area | X | X | X |
| Monfragüe National Park | X | | X |
| Garganta de los Infiernos Nature Reserve | X | | X |
| Mountains | X | X | X |
| Sierras | X | X | |
| Villuercas-Ibores-Jara Geopark | X | | X |
| Tajo-Internacional Nature Reserve | X | | X |
| Population in 2019 | X | | |
Dependent variable: Beds in rural accommodation establishments.
Table 8. Criteria used in exploratory regression.

| Model | Search Criterion | Cutoff | Trials | # Passed | % Passed |
|---|---|---|---|---|---|
| MODEL 1 | Min Adjusted R-Squared | >0.30 | 4095 | 0 | 0 |
| MODEL 1 | Max Coefficient p-value | <0.05 | 4095 | 399 | 9.74 |
| MODEL 1 | Max VIF Value | <3.00 | 4095 | 1695 | 41.39 |
| MODEL 1 | Min Jarque–Bera p-value | >0.05 | 4095 | 0 | 0 |
| MODEL 1 | Min Spatial Autocorrelation p-value | >0.20 | 34 | 0 | 0 |
| MODEL 2 | Min Adjusted R-Squared | >0.20 | 63 | 2 | 3.17 |
| MODEL 2 | Max Coefficient p-value | <0.05 | 63 | 49 | 77.78 |
| MODEL 2 | Max VIF Value | <3.00 | 63 | 63 | 100 |
| MODEL 2 | Min Jarque–Bera p-value | >0.1 | 63 | 0 | 0 |
| MODEL 2 | Min Spatial Autocorrelation p-value | >0.1 | 16 | 0 | 0 |
| MODEL 3 | Min Adjusted R-Squared | >0.20 | 511 | 91 | 17.81 |
| MODEL 3 | Max Coefficient p-value | <0.05 | 511 | 152 | 29.75 |
| MODEL 3 | Max VIF Value | <3.00 | 511 | 223 | 43.64 |
| MODEL 3 | Min Jarque–Bera p-value | >0.1 | 63 | 0 | 0 |
| MODEL 3 | Min Spatial Autocorrelation p-value | >0.1 | 16 | 0 | 0 |
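The trial counts in Table 8 (4095, 63 and 511) are consistent with an exhaustive search over every non-empty combination of the candidate variables marked in Table 7 (12, 6 and 9 variables respectively). The following is a minimal sketch of that count, assuming such an exhaustive search; the function names are illustrative and are not taken from the software used by the authors.

```python
from itertools import combinations


def count_trials(n_candidates: int) -> int:
    """Count every non-empty subset of the candidate explanatory variables."""
    return sum(
        1
        for size in range(1, n_candidates + 1)
        for _ in combinations(range(n_candidates), size)
    )


def pct_passed(passed: int, trials: int) -> float:
    """Percentage of trials that satisfy a search criterion, to two decimals."""
    return round(100 * passed / trials, 2)


# Number of candidate variables per model, counted from the X marks in Table 7.
candidates = {"MODEL 1": 12, "MODEL 2": 6, "MODEL 3": 9}
for model, n in candidates.items():
    print(model, count_trials(n))

# Share of Model 1 trials meeting the coefficient p-value cutoff (Table 8).
print(pct_passed(399, 4095))
```

Enumerating the subsets explicitly, rather than computing 2^k − 1 directly, mirrors how each trial corresponds to one candidate regression.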
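Table 9. Summary of variable significance.

| Distance to: | % Significant (Model 1) | % Significant (Model 2) | % Significant (Model 3) | % Negative (Model 1) | % Negative (Model 2) | % Negative (Model 3) | % Positive (Model 1) | % Positive (Model 2) | % Positive (Model 3) |
|---|---|---|---|---|---|---|---|---|---|
| Historical Ensemble | 99.46 | 100.00 | 98.83 | 100.00 | 100.00 | 100.00 | 0.00 | 0.00 | 0.00 |
| Interpretation Centre | 98.54 | | | 100.00 | | | 0.00 | | |
| Highway | 92.14 | 93.75 | 89.84 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 100.00 |
| Reservoir | 88.62 | 78.12 | 91.80 | 0.00 | 0.00 | 0.00 | 100.00 | 100.00 | 100.00 |
| Bathing Area | 83.54 | 100.00 | 87.11 | 98.44 | 100.00 | 100.00 | 1.56 | 0.00 | 0.00 |
| Monfragüe National Park | 70.41 | | 77.73 | 26.46 | | 26.17 | 73.54 | | 73.83 |
| Garganta de los Infiernos Nature Reserve | 63.62 | | 73.05 | 90.77 | | 92.19 | 9.23 | | 7.81 |
| Mountains | 51.22 | 96.88 | 55.08 | 79.15 | 100.00 | 84.77 | 20.85 | 0.00 | 15.23 |
| Sierras | 34.47 | 81.25 | | 5.86 | 0.00 | | 94.14 | 100.00 | |
| Villuercas-Ibores-Jara Geopark | 31.59 | | 39.45 | 43.65 | | 38.67 | 56.35 | | 61.33 |
| Tajo-Internacional Nature Reserve | 31.15 | | 41.42 | 27.00 | | 22.27 | 73.00 | | 77.73 |
| Population in 2019 | 3.17 | | | 35.84 | | | 64.16 | | |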
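Table 10. Summary of multicollinearity.

| Distance to: | Model 1 VIF | Model 1 Violations | Model 2 VIF | Model 2 Violations | Model 3 VIF | Model 3 Violations |
|---|---|---|---|---|---|---|
| Historical Ensemble | 50.45 | 1664 | 1.49 | 0 | 1.94 | 0 |
| Interpretation Centre | 2.55 | 0 | | | | |
| Highway | 2.35 | 0 | 1.46 | 0 | 2.83 | 0 |
| Reservoir | 1.37 | 0 | 1.22 | 0 | 1.33 | 0 |
| Bathing Area | 61.55 | 1664 | 1.43 | 0 | 2 | 0 |
| Monfragüe National Park | 66.05 | 1632 | | | 57.15 | 208 |
| Garganta de los Infiernos Nature Reserve | 5.62 | 490 | | | 42.97 | 200 |
| Mountains | 6.12 | 601 | 1.90 | 0 | 23.07 | 200 |
| Sierras | 2.01 | 0 | 1.86 | 0 | | |
| Villuercas-Ibores-Jara Geopark | 1.33 | 0 | | | 4.79 | 50 |
| Tajo-Internacional Nature Reserve | 3.18 | 64 | | | 4.49 | 56 |
| Population in 2019 | 1.07 | 0 | | | | |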
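For readers unfamiliar with the variance inflation factor reported in Table 10, the sketch below shows one common way of computing it: each explanatory variable is regressed on the others, and VIF = 1 / (1 − R²). The data are synthetic and purely hypothetical; only the 3.00 cutoff is taken from Table 8, and the code is not the authors' implementation.

```python
import numpy as np


def vif(X: np.ndarray) -> np.ndarray:
    """Return the VIF of each column: 1 / (1 - R^2) of that column on the rest."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        target = X[:, j]
        # Regress column j on an intercept plus all remaining columns.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ beta
        centred = target - target.mean()
        r2 = 1.0 - (resid @ resid) / (centred @ centred)
        out[j] = 1.0 / (1.0 - r2)
    return out


# Hypothetical data: b is almost a copy of a, c is unrelated to both.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = a + 0.1 * rng.normal(size=500)
c = rng.normal(size=500)

values = vif(np.column_stack([a, b, c]))
CUTOFF = 3.0  # Max VIF cutoff listed in Table 8
for name, value in zip("abc", values):
    print(name, round(float(value), 2), "violation" if value > CUTOFF else "ok")
```

Two nearly redundant distance variables would behave like `a` and `b` here: both exceed the cutoff, while an unrelated variable stays close to 1.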
Table 11. Assessment of models.

| Method | Model | Parameters | Independent Variables |
|---|---|---|---|
| Highest Adjusted R-Squared | MODEL 1 | AdjR2: 0.24; AICc: 3850; JB: 0.0; K(BP): 0.0; VIF: 1.40; SA: 0.0 | (−) Mountain ***; (−) Bathing area ***; (+) Reservoir ***; (−) Historical Ensemble ***; (−) Interpretation Centre ***; (+) Highway *** |
| Highest Adjusted R-Squared | MODEL 2 | AdjR2: 0.21; AICc: 3864; JB: 0.0; K(BP): 0.0; VIF: 1.40; SA: 0.0 | (−) Mountain ***; (−) Bathing area ***; (+) Reservoir ***; (−) Historical Ensemble ***; (+) Highway *** |
| Highest Adjusted R-Squared | MODEL 3 | AdjR2: 0.22; AICc: 3860; JB: 0.0; K(BP): 0.0; VIF: 1.45; SA: 0.0 | (−) Bathing area ***; (+) Reservoir ***; (−) Garganta de los Infiernos NR ***; (−) Historical Ensemble ***; (+) Highway *** |
| Residual Normality (JB) | MODEL 1 | JB: 0.0; AdjR2: 0.086; AICc: 3918; K(BP): 0.000004; VIF: 1.00; SA: 0.0 | (+) Sierra *** |
| Residual Normality (JB) | MODEL 2 | JB: 0.0; AdjR2: 0.086; AICc: 3918; K(BP): 0.000004; VIF: 1.00; SA: 0.0 | (+) Sierra *** |
| Residual Normality (JB) | MODEL 3 | JB: 0.0; AdjR2: 0.008; AICc: 3949; K(BP): 0.000004; VIF: 0.13; SA: 0.0 | (+) Reservoir ** |
| Residual Spatial Autocorrelation (SA) | MODEL 1 | SA: 0.0465; AdjR2: 0.25; AICc: 3849; JB: 0.0; K(BP): 0.0; VIF: 60.62 | (+) Sierra; (−) Bathing area **; (+) Reservoir ***; (+) Monfragüe NP; (−) Garganta de los Infiernos NR **; (−) Villuercas-Ibores-Jara Geopark; (−) Tajo Internacional NR; (−) Historical Ensemble ***; (−) Interpretation Centre ***; (+) Highway *** |
| Residual Spatial Autocorrelation (SA) | MODEL 2 | SA: 0.00006; AdjR2: 0.21; AICc: 3864; JB: 0.0; K(BP): 0.000005; VIF: 1.40 | (−) Mountain ***; (−) Bathing area ***; (+) Reservoir ***; (−) Historical Ensemble ***; (+) Highway *** |
| Residual Spatial Autocorrelation (SA) | MODEL 3 | SA: 0.0465; AdjR2: 0.23; AICc: 1391; JB: 0.0; K(BP): 0.0; VIF: 53.39 | (−) Bathing area ***; (+) Reservoir ***; (+) Monfragüe NP; (−) Garganta de los Infiernos NR **; (−) Villuercas-Ibores-Jara Geopark; (−) Tajo Internacional NR; (−) Historical Ensemble ***; (+) Highway *** |
Note: AdjR2: adjusted R-squared; VIF: max variance inflation factor; AICc: Akaike’s information criterion; SA: global Moran’s I p-value; JB: Jarque–Bera p-value; model variable sign: (+/−); K(BP): Koenker (BP) statistic p-value; model variable significance: (** = 0.05; *** = 0.01).
Table 12. Summary of ordinary least squares (OLS) results: model variables and diagnostics.

| Variable | Intercept | Mountain | Bathing Area | Reservoir | Historical Ensemble | Highway |
|---|---|---|---|---|---|---|
| Coefficient (a) | 38.08944 | −0.00014 | −0.000402 | 0.000381 | −0.000636 | 0.000525 |
| StdError | 4.869659 | 0.000029 | 0.000107 | 0.000135 | 0.000118 | 0.000089 |
| t-Statistic | 7.821788 | −4.792055 | −3.767674 | 2.818674 | −5.409349 | 5.886486 |
| Probability (b) | 0.000000 * | 0.000003 * | 0.000201 * | 0.005075 * | 0.000000 * | 0.000000 * |
| Robust_SE | 5.811659 | 0.000027 | 0.000092 | 0.000143 | 0.000131 | 0.000098 |
| Robust_t | 6.55397 | −5.129564 | −4.378485 | 2.662947 | −4.83881 | 5.328641 |
| Robust_Pr (b) | 0.000000 * | 0.000001 * | 0.000019 * | 0.008069 * | 0.000003 * | 0.000000 * |
| VIF (c) | -------- | 1.299913 | 1.355292 | 1.195813 | 1.403449 | 1.339888 |
* An asterisk next to a number indicates a statistically significant p-value (p < 0.01); (a) Coefficient: this represents the strength and type of relationship between each explanatory variable and the dependent variable; (b) Probability and Robust Probability (Robust_Pr); (c) variance inflation factor (VIF).
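As a reading aid for Table 12, the sketch below recomputes each t-statistic as coefficient divided by standard error and evaluates the fitted OLS equation for one hypothetical population centre. The distance unit (metres) and the example distances are assumptions made for illustration; small differences from the published t-statistics arise because the tabulated coefficients are rounded.

```python
# Coefficients and standard errors copied from Table 12.
coef = {
    "Intercept": 38.08944,
    "Mountain": -0.00014,
    "Bathing Area": -0.000402,
    "Reservoir": 0.000381,
    "Historical Ensemble": -0.000636,
    "Highway": 0.000525,
}
std_err = {
    "Intercept": 4.869659,
    "Mountain": 0.000029,
    "Bathing Area": 0.000107,
    "Reservoir": 0.000135,
    "Historical Ensemble": 0.000118,
    "Highway": 0.000089,
}

# t-statistic = coefficient / standard error.
t_stat = {name: coef[name] / std_err[name] for name in coef}


def predicted_beds(distances: dict) -> float:
    """Evaluate the OLS equation; distances are assumed to be in metres."""
    return coef["Intercept"] + sum(coef[k] * d for k, d in distances.items())


# Hypothetical centre located 10 km from every resource.
example = {name: 10_000.0 for name in coef if name != "Intercept"}
beds = predicted_beds(example)

for name, value in t_stat.items():
    print(f"{name}: t = {value:.3f}")
print(f"Predicted beds for the hypothetical centre: {beds:.2f}")
```

The signs read directly: a negative coefficient means that predicted beds fall as the distance to that resource grows, a positive one that they rise.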
Table 13. OLS diagnostics.

| Diagnostic | Value |
|---|---|
| Input Features | Population Centres |
| Dependent Variable | Rural Accommodation |
| Number of Observations | 388 |
| Akaike’s Information Criterion (AICc) | 3864.24795 |
| Multiple R-Squared | 0.223397 |
| Adjusted R-Squared | 0.213232 |
| Joint F-Statistic | 21.977147 |
| Prob (>F), (5, 382) degrees of freedom | 0.000000 * |
| Joint Wald Statistic | 43.364549 |
| Prob (>chi-squared), (5) degrees of freedom | 0.000000 * |
| Koenker (BP) Statistic | 32.437577 |
| Prob (>chi-squared), (5) degrees of freedom | 0.000005 * |
| Jarque–Bera Statistic | 3271.23973 |
| Prob (>chi-squared), (2) degrees of freedom | 0.000000 * |
* An asterisk next to a number indicates a statistically significant p-value (p < 0.01).
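The adjusted R-squared and the joint F-statistic in Table 13 can be recovered from the multiple R-squared, the number of observations (388) and the number of explanatory variables (5) using the standard textbook formulas. The sketch below is only an arithmetic check under that assumption, not a reproduction of the software output.

```python
n_obs = 388        # Number of Observations (Table 13)
k_vars = 5         # Explanatory variables in the OLS model (Table 12)
r_squared = 0.223397

residual_df = n_obs - k_vars - 1   # 382, as in the F-test degrees of freedom

# Adjusted R^2 penalises R^2 for the number of explanatory variables.
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / residual_df

# Joint F-statistic for the hypothesis that all slopes are zero.
f_statistic = (r_squared / k_vars) / ((1 - r_squared) / residual_df)

print(f"Residual degrees of freedom: {residual_df}")
print(f"Adjusted R-squared: {adj_r_squared:.6f}")
print(f"Joint F-statistic: {f_statistic:.3f}")
```

Both values agree with Table 13 to within the rounding of the published R-squared.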
Table 14. Synthesis of geographically weighted regression (GWR) models obtained.

| Kernel | Parameter | Corrected Akaike’s Information Criterion (AICc) | Cross Validation (CV) | Bandwidth Parameter |
|---|---|---|---|---|
| Fixed | Bandwidth (metres) | 63,627.4 | 2,677,852.7 | 15,000 |
| Fixed | Residual Squares (a) | 2,815,576.7 | 3,759,062.1 | 1,248,124.3 |
| Fixed | Effective Number (b) | 28.0 | 6.0 | 31.7 |
| Fixed | Sigma (c) | 88.4 | 99.2 | 123.9 |
| Fixed | AICc (d) | 4598.9 | 4676.7 | 1443.2 |
| Fixed | R2 (e) | 0.475 | 0.299 | 0.665 |
| Fixed | R2 Adjusted (f) | 0.436 | 0.290 | 0.538 |
| Fixed | Condition number (g) | <30 | <30 | >30 |
| Adaptive | Neighbours | 158 | 388 | 15 |
| Adaptive | Residual Squares (a) | 2,572,646.7 | 3,531,296.9 | 33,308.1 |
| Adaptive | Effective Number (b) | 35.2 | 12.3 | 7.7 |
| Adaptive | Sigma (c) | 85.4 | 96.9 | 159.4 |
| Adaptive | AICc (d) | 4579.1 | 4661.0 | −32,474.5 |
| Adaptive | R2 (e) | 0.520 | 0.342 | 0.966 |
| Adaptive | R2 Adjusted (f) | 0.474 | 0.322 | 0.791 |
| Adaptive | Condition number (g) | <30 | <30 | >30 |
(a): The sum of the squared residuals of the model; (b): this value reflects a trade-off between the variance of the fitted values and the bias in the coefficient estimates; (c): the square root of the normalised residual sum of squares, that is, the residual sum of squares divided by the effective degrees of freedom of the residual; (d): a measure of model performance, useful for comparing different regression models; (e): a measure of goodness of fit; (f): the adjusted R-squared normalises the numerator and denominator by their degrees of freedom; (g): this assesses the local multicollinearity.
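To make the role of the bandwidth in Table 14 concrete, the sketch below fits a single local regression of the kind GWR performs at each location: a weighted least squares fit whose weights decay with distance from the focal point. It assumes a fixed Gaussian kernel and uses synthetic, hypothetical data; the function names are illustrative and do not correspond to the software used by the authors.

```python
import numpy as np


def gaussian_weights(dist: np.ndarray, bandwidth: float) -> np.ndarray:
    """Fixed Gaussian kernel: weights decay smoothly with distance."""
    return np.exp(-0.5 * (dist / bandwidth) ** 2)


def local_coefficients(coords, X, y, focus, bandwidth):
    """Weighted least squares centred on the observation with index `focus`."""
    dist = np.linalg.norm(coords - coords[focus], axis=1)
    w = gaussian_weights(dist, bandwidth)
    design = np.column_stack([np.ones(len(y)), X])  # intercept + regressors
    xtw = design.T * w                              # X'W without building W
    return np.linalg.solve(xtw @ design, xtw @ y)


# Synthetic data (hypothetical): the true slope grows from west to east.
rng = np.random.default_rng(42)
n = 300
coords = rng.uniform(0, 100_000, size=(n, 2))       # coordinates in metres
x = rng.normal(size=n)
slope = 1.0 + 2.0 * coords[:, 0] / 100_000
y = 5.0 + slope * x + 0.1 * rng.normal(size=n)

west, east = np.argmin(coords[:, 0]), np.argmax(coords[:, 0])
beta_west = local_coefficients(coords, x, y, west, bandwidth=15_000)
beta_east = local_coefficients(coords, x, y, east, bandwidth=15_000)
beta_wide = local_coefficients(coords, x, y, west, bandwidth=1e12)

print("Local slope, westernmost point:", round(float(beta_west[1]), 2))
print("Local slope, easternmost point:", round(float(beta_east[1]), 2))
print("Slope with a very wide bandwidth:", round(float(beta_wide[1]), 2))
```

With a very wide bandwidth every observation receives almost the same weight, so the local fit collapses to the global OLS fit. This is consistent with the cross-validation column of the fixed kernel in Table 14, where a bandwidth of about 2,678 km yields an effective number of 6.0, the parameter count of the OLS model.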
Table 15. Extreme results of positive and negative residuals (detail).

| Centre | Beds Observed | Beds Calculated | Residuals |
|---|---|---|---|
| Navaconcejo | 298 | 155 | 143 |
| Hervás | 253 | 139 | 114 |
| Jarandilla de la Vera | 254 | 140 | 114 |
| Gargantilla | 8 | 126 | −117 |
| Talaveruela de la Vera | 10 | 152 | −141 |
| Viandar de la Vera | 0 | 150 | −150 |
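The residuals in Table 15 are the difference between observed and calculated beds. The short check below recomputes them from the tabulated values; the one-unit discrepancies for two centres are presumably due to the calculated beds being rounded to whole numbers in the table, which is an assumption rather than something the source states.

```python
# (centre, beds observed, beds calculated, residual reported) from Table 15.
rows = [
    ("Navaconcejo", 298, 155, 143),
    ("Hervás", 253, 139, 114),
    ("Jarandilla de la Vera", 254, 140, 114),
    ("Gargantilla", 8, 126, -117),
    ("Talaveruela de la Vera", 10, 152, -141),
    ("Viandar de la Vera", 0, 150, -150),
]

# Residual = observed - calculated; positive means more beds than predicted.
recomputed = {name: observed - calculated for name, observed, calculated, _ in rows}

for name, _, _, reported in rows:
    kind = "above" if recomputed[name] > 0 else "below"
    print(f"{name}: recomputed {recomputed[name]}, reported {reported} ({kind} the model)")
```

Positive residuals mark centres with more beds than the model predicts, and negative residuals mark centres with fewer.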

Sánchez-Martín, J.-M.; Gurría-Gascón, J.-L.; Rengifo-Gallego, J.-I. The Distribution of Rural Accommodation in Extremadura, Spain-between the Randomness and the Suitability Achieved by Means of Regression Models (OLS vs. GWR). Sustainability 2020, 12, 4737. https://doi.org/10.3390/su12114737
