Statistical Relations of the Qualitative Attributes of Real Properties Subject to Mass Appraisal

Abstract
Research background: Every real property may be described by a multitude of attributes. In the process of real estate appraisal, only those attributes that significantly affect value are taken into account. Mass appraisal involves a simultaneous valuation of many similar real properties, carried out in the same manner and at the same time. The algorithm applied to mass appraisal ought to ensure a uniform and objective approach to the valuation of all real properties of the same type. Purpose: The purpose of the paper is to define the weights of attributes in the process of real estate mass appraisal on the basis of the relationships between unit property value and the values of attributes. Research methodology: The weights were defined on the basis of partial correlation coefficients for qualitative attributes (the Spearman rank correlation coefficient and the τB Kendall correlation coefficient). Results: The signs of certain correlation coefficients were inconsistent with the actual direction of the relations between the analysed attributes. The problem was avoided by employing partial correlation coefficients, on the basis of which the weights of individual attributes were calculated. Of all the analysed coefficients, the partial τB Kendall correlation coefficient is methodologically the most suitable. Novelty: The use of partial correlation coefficients for determining attribute weights is an innovative approach.


Introduction
The process of real estate valuation is a complex undertaking, which requires a property appraiser to take into account a multitude of factors. The impact of individual factors on real property value is unequal: some exert a greater influence, whereas others have less impact. Determining the weights of the market attributes affecting real estate value is a highly difficult task. In the process of establishing the value of real estate, property appraisers are obliged to take into account the provisions set forth in the acts of law. Market properties (attributes) that ought to be taken into account in the process of appraisal include location, physical, technical, use-related and legal characteristics. These factors significantly affect price differentiation. Owing to the specificity of various types of real properties and of the individual real property markets themselves, a property appraiser decides each time which attributes to take into account in valuing a real property. For instance, in the appraisal of land real property the attributes may include: location, land area, plot shape, plot technical infrastructure, access, surroundings and neighbourhood.
In order to standardise the description of a real property, states are defined for each attribute applied in the valuation process, from the worst to the best, which converts the analysed attributes to an ordinal scale. For instance, the "neighbourhood" attribute can be described as: troublesome, unfavourable, average, favourable.
Leaving aside the fact that such a presentation of an attribute is imprecise, its significance and impact on real property value may vary depending on the intended purpose of the analysed real property. For example, if there is a busy road in the vicinity of a real property, then for a land plot designated for residential buildings such a neighbourhood will be unfavourable or troublesome, whereas for a plot designated for industrial purposes the proximity of the road is highly favourable. Furthermore, an attribute such as "area", even though it can be expressed on a ratio scale, is frequently described by property appraisers as small, average or large. As with the preceding attribute, its significance will vary for real properties of different designated purposes.
The next stage of appraising a real property entails determining the strength with which individual attributes affect real property value, that is, determining their weights. In accordance with the Common General Rules of Valuation (Common General Rules of Valuation 2008), the degree of the attributes' influence on real property value can be defined depending on the condition of the market, taking into account: a) the results of the analysis of data concerning the market prices and characteristics of similar real properties traded in the real estate market specified for the needs of the appraisal; b) the analogy to local markets similar in terms of their type and area; c) the examination and/or observation of the preferences of potential real property buyers; d) another reliable manner.
Real property appraisers typically employ the second or the third method. The first one raises the greatest doubts, since it involves conducting a statistical analysis in accordance with the assumptions of quantitative studies (including, inter alia, having an adequately extensive database at one's disposal, which is difficult to achieve in the case of atypical real properties).
This article proposes the use of correlation coefficients for the objective determination of the attributes' impact on real property value.
Such an unbiased approach to determining the influence of individual attributes on real property value takes on special significance in mass real property appraisal. Mass appraisal is an appraisal of a multitude of real properties conducted at the same time with the use of the same tool (an algorithm) (Hozer, Kokot, Kuźmiński, 2002). In order to ensure comparability and to allow the results to be generalised, the employed tool ought to be automated, while the impact of the human factor ought to be limited to a minimum.
When analysing the relationships between attributes in the real property market, the following problems may be encountered. Attributes in the real property market are most frequently measured on an ordinal scale, although other scales, such as the Likert and Osgood scales, are also applied (Foryś, Gaca, 2016). The use of coefficients based on ranks is therefore the most justified, namely the ρ Spearman rank correlation coefficient, the τ Kendall coefficient (Doszyń, 2017) or the Γ general correlation coefficient. However, in the practice of real property appraisal (and also in academic papers concerning the use of statistical methods in real property appraisal), property appraisers (or researchers who are also property appraisers) most often apply the Pearson product-moment correlation coefficient (inter alia Sawiłow, 2010; Walkowiak, Zydroń, 2012), even though it should not be used at all for such attributes.
In developed real property markets, statistical methods constitute a well-recognised and frequently applied tool of real property market analysis (Bruce, Sundell, 1977). In Poland, statistical methods were employed for the analysis of factors affecting the value of real estate by, inter alia, J. Hozer, M. Zwolankowska, S. Kokot and W. Kuźmiński (2000). The purpose of the article is to determine the weights of attributes by using partial correlation coefficients based on the ordinal scale.

Research methodology
The following coefficients were used to analyse the interrelation of the attributes affecting real property value:
- ρ Spearman rank correlation coefficient,
- τB Kendall correlation coefficient,
- Γ general correlation coefficient.
For comparison, the results obtained on the basis of the above-specified coefficients were juxtaposed with the results obtained for the r Pearson product-moment correlation coefficient.
The Spearman rank correlation coefficient is computed according to formula (1) (Kendall, 1948, p. 29).
The Kendall correlation coefficient occurs in several variants: τA, τB and τC. The τA coefficient is calculated on the assumption that there are no tied ranks, whereas the τC coefficient is the most suitable for data in the form of a contingency table. In the data used in this research, where the number of observations is substantially higher than the number of variants of the examined attributes, tied ranks are certain to occur. For this reason the τB Kendall coefficient was applied. It is computed as follows (Parker et al., 2011, p. 5):
- Observations are linked in all possible pairs: $(x_i, y_i)$ and $(x_j, y_j)$, $i \neq j$.
- If $x_i > x_j$ and $y_i > y_j$, or $x_i < x_j$ and $y_i < y_j$, then such a pair is called concordant. The number of such pairs is equal to $n_c$.
- If $x_i > x_j$ and $y_i < y_j$, or $x_i < x_j$ and $y_i > y_j$, then such a pair is called discordant. The number of such pairs is equal to $n_d$.
- If $x_i = x_j$ or $y_i = y_j$, then such a pair is neither concordant nor discordant. It is a tied pair.
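The pair classification above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample data are hypothetical, and the tie-corrected denominator is the standard textbook form of τB, which may differ in notation from the formula given by Parker et al. (2011).

```python
from itertools import combinations
from math import sqrt


def classify_pairs(x, y):
    """Count concordant, discordant and tied pairs over all pairs i < j."""
    n_c = n_d = tied_x_only = tied_y_only = tied_both = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        dx, dy = xi - xj, yi - yj
        if dx == 0 and dy == 0:
            tied_both += 1
        elif dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            n_c += 1          # both differences have the same sign
        else:
            n_d += 1          # the differences have opposite signs
    return n_c, n_d, tied_x_only, tied_y_only, tied_both


def kendall_tau_b(x, y):
    """Standard tau-b (assumed form): ties shrink the denominator."""
    n_c, n_d, tied_x_only, tied_y_only, _ = classify_pairs(x, y)
    pairs_untied_on_x = n_c + n_d + tied_y_only
    pairs_untied_on_y = n_c + n_d + tied_x_only
    denominator = sqrt(pairs_untied_on_x * pairs_untied_on_y)
    if denominator == 0:
        raise ValueError("tau-b is undefined when one variable is constant")
    return (n_c - n_d) / denominator


# Hypothetical data: an attribute coded 0-2 and unit values of five plots.
attribute = [0, 1, 1, 2, 2]
unit_value = [100, 150, 120, 200, 200]
print(classify_pairs(attribute, unit_value))
print(round(kendall_tau_b(attribute, unit_value), 3))
```

With only three attribute states and many observations, most pairs are tied on the attribute, which is why a tie-aware variant such as τB is needed here.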
Having determined the above values, the τB Kendall coefficient is computed in accordance with formula (2).
The Γ general correlation coefficient is yet another coefficient. It is calculated on the basis of the following assumptions:
- There is a set of n objects described by two variables, x and y.
- Both variables form sets of values $\{x_i\}_{i \le n}$ and $\{y_i\}_{i \le n}$.
- For each pair of the i-th and j-th observations a score is assigned for the x variable, designated by $a_{ij}$, and for the y variable, designated by $b_{ij}$.
- The values $a_{ij}$ and $b_{ij}$ are antisymmetric, i.e. $a_{ij} = -a_{ji}$ and $b_{ij} = -b_{ji}$.
The Γ general correlation coefficient is computed on the basis of the following formula (Walesiak, 2016, p. 40):
$$\Gamma = \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij} b_{ij}}{\sqrt{\sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}^{2} \, \sum_{i=1}^{n}\sum_{j=1}^{n} b_{ij}^{2}}} \tag{3}$$
The Spearman coefficient and the τA Kendall coefficient are particular cases of the Γ general correlation coefficient. If $r_i$ and $s_i$ are respectively the ranks of the x and y variables and $a_{ij} = \operatorname{sgn}(r_j - r_i)$, $b_{ij} = \operatorname{sgn}(s_j - s_i)$, then the above coefficient becomes the τA Kendall coefficient.
If $a_{ij} = r_j - r_i$, $b_{ij} = s_j - s_i$, then the above coefficient becomes the ρ Spearman rank correlation coefficient.
All the coefficients determined with formulas (1)-(3) take values in the interval ⟨-1; 1⟩. If a coefficient is equal to 1 or -1, the relation is functional. The closer the value is to 1 or -1, the stronger the relation is.
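The two special cases of Γ can be checked numerically. The sketch below assumes the usual form of Kendall's general coefficient (the sum of the products $a_{ij} b_{ij}$ divided by the square root of the product of the sums of squares); the function names and the sample data are hypothetical and are not taken from the study.

```python
from math import sqrt


def average_ranks(values):
    """Ranks 1..n; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            ranks[k] = mean_rank
        pos = end + 1
    return ranks


def sign(d):
    """Score used for tau-A: only the direction of the rank difference counts."""
    return (d > 0) - (d < 0)


def difference(d):
    """Score used for Spearman's rho: the size of the rank difference counts."""
    return d


def general_gamma(r, s, score):
    """General correlation coefficient with a_ij = score(r_j - r_i), b_ij likewise."""
    num = sum_a2 = sum_b2 = 0.0
    for i in range(len(r)):
        for j in range(len(r)):
            a = score(r[j] - r[i])
            b = score(s[j] - s[i])
            num += a * b
            sum_a2 += a * a
            sum_b2 += b * b
    if sum_a2 == 0 or sum_b2 == 0:
        raise ValueError("coefficient is undefined when one variable is constant")
    return num / sqrt(sum_a2 * sum_b2)


# Hypothetical data without ties, so tau-A is well defined.
r = average_ranks([3, 1, 4, 2, 5])
s = average_ranks([2, 1, 5, 3, 4])
tau_a = general_gamma(r, s, sign)        # 8 concordant, 2 discordant pairs
rho = general_gamma(r, s, difference)    # equals 1 - 6*sum(d^2)/(n^3 - n) here
print(round(tau_a, 3), round(rho, 3))
```

In this sample the Spearman value exceeds the Kendall value because the rank-difference score also rewards how far apart the ranks are, not only their order.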
When examining the correlation between the attributes and the value of 1 m², it may turn out that the attributes are strongly correlated with one another. This may affect the strength and direction of the relation between a given attribute and the value of 1 m² of the real property. In order to eliminate the impact of the remaining attributes when testing the correlation between a given attribute and the value of 1 m², partial correlation coefficients were calculated according to formula (4) (Han, Zhu, 2008, p. 160), where: y is the vector of the explained variable values, x is the vector of the explaining variable values, and z is the vector (or matrix) of the remaining variables.
The weights of the attributes were then calculated from the partial correlation coefficients:
$$w_i = \frac{|r_i|}{\sum_{j=1}^{n} |r_j|} \tag{5}$$
where: $w_i$ is the weight of the i-th attribute, $r_i$ is the partial correlation coefficient between the value of 1 m² and the i-th attribute, i is the number of the analysed attribute, and n is the number of attributes.
The greater the share of the absolute value of the partial correlation coefficient of a given attribute in the sum of the absolute values of the coefficients of all attributes, the greater the weight that ought to be assigned to that attribute.
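A minimal sketch of both steps follows. It assumes the standard recursive definition of a partial correlation coefficient, which is not necessarily the exact form given by Han and Zhu (2008), and weights equal to each coefficient's share in the sum of absolute values. The function names and the correlation matrix are invented for illustration and do not come from the study.

```python
from math import sqrt


def partial_corr(corr, i, j, controls):
    """Partial correlation of variables i and j given the variables in `controls`.

    `corr` is a full correlation matrix (list of lists). One control variable
    is removed per recursion level using the first-order formula.
    """
    if not controls:
        return corr[i][j]
    k, rest = controls[0], controls[1:]
    r_ij = partial_corr(corr, i, j, rest)
    r_ik = partial_corr(corr, i, k, rest)
    r_jk = partial_corr(corr, j, k, rest)
    return (r_ij - r_ik * r_jk) / sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))


def attribute_weights(partials):
    """Share of each |partial coefficient| in the sum of absolute values."""
    total = sum(abs(p) for p in partials)
    return [abs(p) / total for p in partials]


# Invented matrix: index 0 = unit value, 1 = attribute A, 2 = attribute B.
corr = [
    [1.0, -0.1, 0.6],
    [-0.1, 1.0, -0.5],
    [0.6, -0.5, 1.0],
]
p_a = partial_corr(corr, 0, 1, [2])   # A versus value, controlling for B
p_b = partial_corr(corr, 0, 2, [1])   # B versus value, controlling for A
print(round(p_a, 3), round(p_b, 3))
print([round(w, 3) for w in attribute_weights([p_a, p_b])])
```

In this invented case the raw correlation of attribute A with the value is negative, while the partial coefficient is positive once attribute B is controlled for. The denominator also shows why input coefficients equal to 1 or -1 are problematic: the term under the square root becomes zero.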
The attribute of "physical properties" was determined on the basis of plot shape. It was assumed that the optimal shape for a land plot is a rectangle whose sides are in the ratio 3 : 2. Given the plot's perimeter, the area of a 3 : 2 rectangle with that perimeter was computed and compared to the actual area (the actual area was divided by the area obtained for the assumed 3 : 2 rectangle). If that ratio was greater than 0.9, the attribute assumed the value of 2. If it was within the range of 0.5-0.9, the attribute assumed the value of 1. If it was less than 0.5, the attribute assumed the value of 0 (more on the determination of that attribute in Dmytrów, Gnat, Kokot, 2018).
The attribute of "area" assumed the value of 0 if the plot area was greater than 1,200 m² (the greater the area, the lower the relative value of 1 m²). If the area was within the range of 500-1,200 m², the attribute assumed the value of 1, and if it was less than 500 m², the value of 2. This division of real properties was discussed with experts (real estate appraisers).
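The two coding rules above can be written down directly. The sketch below uses hypothetical function names. A 3 : 2 rectangle with sides 3k and 2k has perimeter 10k and area 6k², so its area follows from the perimeter alone.

```python
def shape_score(perimeter_m, area_m2):
    """Code the "physical properties" attribute from plot perimeter and area."""
    k = perimeter_m / 10.0            # sides are 3k and 2k, perimeter is 10k
    reference_area = 6.0 * k * k      # area of the 3 : 2 rectangle
    ratio = area_m2 / reference_area
    if ratio > 0.9:
        return 2
    if ratio >= 0.5:                  # the range 0.5-0.9
        return 1
    return 0


def area_score(area_m2):
    """Code the "area" attribute: smaller plots receive the higher state."""
    if area_m2 > 1200:
        return 0
    if area_m2 >= 500:                # the range 500-1,200 m2
        return 1
    return 2


# Hypothetical rectangular plots, all with a perimeter of 100 m.
for width, length in [(30, 20), (40, 10), (45, 5)]:
    perimeter, area = 2 * (width + length), width * length
    print(width, length, shape_score(perimeter, area), area_score(area))
```

A 30 m by 20 m plot matches the reference rectangle exactly (ratio 1.0), while the elongated 45 m by 5 m plot covers only 37.5% of the reference area and receives the lowest shape state.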
During the first stage, the correlation coefficients between the value of 1 m² and the values of individual attributes were calculated. The results are presented in Table 1. The values of the r Pearson product-moment correlation coefficient are given for comparison only; as mentioned before, they should not be used for the examined attributes. The computed values demonstrate that, apart from location, the influence of the remaining attributes on the value of 1 m² of a real property was moderate at most (for buildings erected and technical infrastructure). In the case of the Γ general correlation coefficient and the τB Kendall coefficient, all the attributes apart from physical properties and area had a significant impact on the value of 1 m² of a real property, although in the case of neighbourhood the values of the coefficients were so low that it was difficult to note any meaningful influence of that attribute on real property value. In the case of the ρ Spearman coefficient, the impact of neighbourhood additionally turned out to be insignificant.
The calculated correlation coefficients show that the relation between the value of 1 m² and neighbourhood is negative, which should not occur (a higher value of 1 m² of a real property accompanied by a "worse" neighbourhood). A negative relation was also established between the value of 1 m² and plot area, which indicates that the market assigned a higher value to a greater area of land when real properties were designated for residential development.
The calculated τB Kendall correlation coefficients (Table 2) demonstrate that the relation between neighbourhood and buildings erected and technical infrastructure was statistically significant. The attribute of plot area was significantly correlated with physical properties, access and location. Similar results were obtained for the other correlation coefficients.
In order to eliminate the impact of the remaining attributes on the value of 1 m² of a real property, partial correlation coefficients were calculated on the basis of formula (4) (the Pearson partial correlation coefficients were again computed for comparison only). Because some values of the general correlation coefficient were equal to 1 or -1, some of the obtained partial coefficients fell outside the interval ⟨-1; 1⟩. Such results did not make sense, so the general correlation coefficient was excluded from further analysis. The calculated partial correlation coefficients are presented in Table 3. When the partial coefficients are compared with the regular ones in Table 1, it can be observed that individual coefficients differ significantly. The τB Kendall and ρ Spearman partial correlation coefficients show that the relations between the value of 1 m² and all the attributes, apart from physical properties and buildings erected, were statistically significant.
Eliminating the impact of the remaining attributes increased the values of all coefficients, apart from the ones measuring the relation between the value of 1 m² and physical properties. The values of all the partial coefficients are positive, i.e. the signs correctly reflect the relations between the value of 1 m² and individual attributes.
When comparing the corresponding values of the τB Kendall and ρ Spearman correlation coefficients, it becomes evident that the latter are always higher (just as they were in the case of the regular correlation coefficients presented in Table 1). This outcome may be explained by the fact that the τB Kendall coefficient uses only the relation of predominance (which of two values is greater), whereas the ρ Spearman coefficient also accounts for the difference between ranks. That is why the Spearman coefficient yields values closer to those calculated for the Pearson product-moment correlation coefficient. However, it is difficult to evaluate the actual difference between attribute values (it is hard to determine whether the difference between the attribute values 0 and 1 is the same as between 1 and 2). For this reason the τB Kendall coefficient seems better suited to analysing the interrelations between attributes, and it ought to be used when computing the weights of individual attributes.
The weights calculated on the basis of formula (5) are presented in Table 4. The order of the significance of the attributes obtained with the use of the τB Kendall and ρ Spearman partial correlation coefficients was the same. The greatest weights were determined for location and technical infrastructure, and the lowest for physical properties and buildings erected. The weights assigned to individual attributes on the basis of the individual coefficients differed considerably from one another. In the case of the ρ Spearman and r Pearson coefficients, the weights were distributed rather evenly, whereas the weights determined on the basis of the τB Kendall coefficient assigned by far the greatest weight to location, followed by technical infrastructure. The advantage of the τB Kendall coefficient is further evident in that, for the attributes that insignificantly affect the value of 1 m² (physical properties and buildings erected), the weights assigned on its basis were distinctly lower than those obtained with the remaining coefficients.

Conclusions
The results of the conducted analyses demonstrate that when studying the relation between the value of 1 m² of a real property and attributes presented on an ordinal scale, partial correlation coefficients ought to be used. The values of regular correlation coefficients are distorted by the strong correlation between the attributes themselves. The application of the Pearson product-moment correlation coefficient should be avoided, since attributes are variables whose variants are recorded on an ordinal scale. Among the determined partial correlation coefficients, the τB Kendall coefficients turn out to be the most useful. The ρ Spearman rank correlation coefficient should be used only for variables measured on at least an interval scale.
The application of the τB Kendall coefficients enables an objective determination of the influence that individual attributes exert on the value of a real property. In the analysed example, 44% of real property value was determined by location, 28% by technical infrastructure, 11% by access, 8% by neighbourhood and 7% by area. The value of a real property was only very slightly affected by the buildings erected on it and by the physical properties of the land plot (1% each).
The proposed manner of defining the impact of individual attributes on real property value may find its application in the practice of real estate appraisal, for instance in the method of statistical analysis of the market or in automatic algorithms of mass real property appraisal. It is worth noting that this approach is universal: it can be applied to all types of real properties in both urban and rural areas. The main limitation of this approach is the necessity of having a large and reliable database. Only then can the relationships between the attributes and the values of real properties be estimated by means of correlation coefficients.