Research on Spatial and Dynamic Planning Methods for Settlement Buildings Based on Data Mining

Traditional settlements have attracted wide academic attention for their unique settlement patterns, exquisite residential buildings, and rich historical and cultural connotations, and their protection and development is an important issue for rural revitalization. Therefore, from the perspective of big data mining (BDM), this paper explores the application of BDM to the architectural space and settlement protection of traditional settlements in Hainan and provides new ideas for their protection and renewal. The attribute elements of the spatial data of settlement groups are analyzed by the decision tree classification mining method. To avoid the multivalued tendency of the ID3 algorithm and improve its efficiency in generating decision trees, an improved ID3 algorithm is proposed by introducing user interest and simplifying the calculation process. At the same time, a graph-theoretic recognition method for grid patterns is proposed. Based on the intersection graph and direction relation graph of straight line patterns, grid pattern recognition is realized by solving for connectivity, intersection, and the subsequent construction of the maximum complete subgraph. Experiments show that the improved ID3 algorithm has better running efficiency than the parallel algorithm based on the cooccurrence matrix. The analysis of the architectural space of traditional settlements in Hainan helps to better grasp social activities and provides direction for the protection and renewal of traditional settlements from the perspective of tourists and residents.


Introduction
Hainan Island is an ideal research place because of its special status as an island-type natural geographical region. The relative independence and integrity of geography and geomorphology provide a relatively independent and complete geographical space for the survival of its population, so that its cultural evolution, historical changes, and the related evolution of traditional settlements and architectural space forms naturally remain relatively independent. That is to say, Hainan Island has the coincidence of geographical area, administrative area, and cultural area in research space and keeps relative independence. How to select a reasonable research area is the premise for the success of scientific research [1]. As a research area, it is necessary to avoid separating the integrity of regional natural geography and landform, human history development and evolution, regional living environment, and so on due to administrative division.
In China, the protection and renewal process of traditional settlements is mostly a top-down process, ignoring the existence of "people" as the main body of villages. The traditional settlements after protection are out of touch with the times, resulting in the loss of original villagers and hollowing out of villages, which cannot meet people's needs for places of social activities [2,3]. Big data mining (BDM) is dynamic and time-sensitive, closely related to people's social activities, and rich in data. Through the analysis and application of data throughout urban planning, to a certain extent, it can make up for the problems of lack of fair participation, lack of dynamics, and lack of planning foresight in the current planning process [4].
Through the comprehensive analysis of traditional data and network data, the collection can reflect the opinions of most residents and tourists on villages, which makes the protection and renewal of traditional settlements more humane. Under the background of information age, traditional settlements have been faced with the problem of how to balance protection and inheritance, development, and renewal. The data of BDM come from traditional survey data and network data. This paper analyzes the problems existing in the protection and renewal of traditional settlements from two aspects of traditional data and network data and provides a new way of thinking for planners and managers. It has certain guiding significance for how to give consideration to the inheritance of history and culture and meet the needs of people's social activities in the current social background.
From the perspective of big data mining (BDM), this paper discusses its application in the architectural space and settlement protection of traditional settlements in Hainan, so as to provide new ideas for the protection and renewal of traditional settlements in Hainan. The decision tree classification mining method is used to analyze the attribute elements of settlement spatial data. In order to avoid the multivalued tendency of ID3 algorithm and improve the efficiency of ID3 algorithm in generating decision tree, an improved ID3 algorithm is proposed by introducing user interest and simplifying the calculation process of the algorithm. At the same time, a graph theory recognition method of grid pattern is proposed. For the intersection graph and direction relation graph of line pattern, grid pattern recognition is realized by solving the maximum complete subgraph, connected graph, intersection graph, and subsequent construction.

Related Work
Settlement and architectural space form are the main contents of architectural research. The research on traditional settlements and architectural space forms in domestic architecture began with residential buildings, and at first, the in-depth anatomical study was made with single residential buildings. Zhang et al. [5] made detailed investigation and research on traditional residential buildings in Yunnan and Sichuan and accumulated a large amount of research and mapping data.
Xie [6] systematically studied the formation of settlements and analyzed the relationship between topography and various settlements. The early research on settlement geography mainly focused on the influence of natural conditions on settlement layout and the location factors of settlement layout [7]. Junli et al. [7] deeply studied the types, distribution, and evolution of rural settlements. Gong and Xin [8] conducted special investigation and research on rural settlements and bazaars. Alnaim [9] divided the architectural form of traditional dwellings from three aspects: climate, terrain, and materials. Bi et al. [10] used the "comprehensive analysis method of historical people's region" to study the residential settlements in southern China from various angles and comprehensively, taking the regional life circle as the basic research unit, and analyzed and discussed the architectural patterns and evolution of different regions, enriching the research content of residential settlements. The study of traditional settlements from the perspective of human geography has gradually become universal.
Tokarev et al. [11] from the ecological perspective, combined with settlement geography, dialect theory and ethnology, culturology, sociology, and other humanities, constructed the research framework of the ecological system theory of rural settlements in Southeast China. Stober et al. [12] discussed the adaptation mechanism of settlement spatial form to ecosystem, the ecological connotation of settlement spatial form, and the ecological aesthetic characteristics of settlement. Dewa Nyoman Angga et al. [13] took Huizhou traditional settlements as an example, focusing on the ecological factors and simple natural view of human settlements in four aspects: site selection planning, single building, relationship with natural resources, and multieffect utilization of space. Wijaya and Dwejendra [14] discussed the overall relationship between the living environment and the surrounding natural environment.
Rimisho et al. [15] used the method of architectural typology to analyze the prototype of traditional ancient settlements and systematically summarized and described the basic layout and spatial structure characteristics of traditional ancient settlements. Fayaz et al. [16] probed into the boundary cultural characteristics of Xiamei ancient dwellings from three aspects: the natural and humanistic environment, the relationship between Zou's historical development and settlement space formation and residential form, and the decorative features of buildings and their cultural values. Ibraev and Keneshov [17] explored the methods of inheriting and developing the traditional urban style of historical blocks in modern urban design and analyzed the thinking brought by the lack of urban style, especially in historical blocks with rich traditional characteristics. Shuvalov [18] explained the fractal architectural aesthetics in an easy-to-understand way and used the method of fractal quantitative analysis and comparison to deeply discuss the theoretical methods and manifestations of fractal architectural aesthetics. Baranov [19] designed 38 specific index evaluation systems from the five environmental angles of humanity, safety, life, economy, and ecology.
Through factor analysis, 38 original indexes are reduced in dimension to obtain two main factors: quality of life factor and natural ecological environment factor, and their influences on urban development are analyzed.

Basic Concept
3.1. Settlement. Settlements often refer to villages in ancient books in China. "In the history of the development of traditional settlements, there are many types of settlements in the literature" [20], which show the context of the development of traditional settlements. In modern times, settlements generally refer to all settlements, including both rural settlements and urban settlements.
Settlement refers to the place where human beings live together. It needs to be formed over a period of time, occupying a relatively fixed space in a period of time, with a certain scale. The settlement contains not only residential forms such as "room houses" but also other architectural forms and facilities related to living and covers the surrounding environment on which the "room houses" depend. Settlements are generally presented in the form of a dynamic organic whole, which involves the influence of various factors related to settlement activities, such as entity, space, time, politics, economy, culture, religion, society, folklore, nature, and so on, and changes with time.

Traditional Settlement.
Traditional settlement refers to the settlement formed by the influence of "tradition." Obviously, traditional settlements need to be inherited and extended for a certain period of time, and their construction methods, spatial forms, artistic styles, decorative techniques, and living habits all follow a certain "traditional" pattern.
The settlement form has relatively stable continuity and pays attention to the continuation of the past and historical architectural form, production, and lifestyle. The settlement style can reflect certain historical context and the inheritance of local culture.
Compared with modern rural areas and cities, traditional settlements have different performances: consanguinity and geographical relationship are clear, clan and patriarchal etiquette system still continue, religious beliefs, moral standards, life patterns, and ideology still retain more traditional components, and more characteristics of Chinese traditional residential buildings are still retained in the appearance of settlements.

Architectural Space.
Settlement generally solves the relationship between settlement and settlement group from a macroperspective. It covers the nature, scale, and mutual location of settlements in different periods. In the same period, the relationship between different settlements is different, and the living mode has different changes.
Architectural space form refers to the internal relations among various spaces of human activities in architecture and the rational organization of space forms. The spatial form of architecture studies the spatial organization relationship of architecture from a microperspective, covering the relationship between architecture and various functional spaces within architecture.

Big Data Mining. Data mining refers to mining valuable knowledge or rules from a large amount of incomplete and irregular information [21]. The object of data mining is mainly business data in large databases. Business data are cleaned and integrated, converted, analyzed, evaluated, and represented as knowledge [22], which ultimately provides users with valuable information and helps users make corresponding decisions on business data. The data source can be stored in any form, including structured, unstructured, and semistructured data content.
In different application scenarios and different mining technologies, the process of data mining will be different. However, according to previous summary, the basic processes of data mining generally include data preparation, data mining, evaluation, and representation, as shown in Figure 1.

Data Preparation.
Data preparation is the first stage of the whole data mining process. It is very important and takes up more than 60% of the time.
The quality of the data preparation stage directly affects the feasibility of the subsequent steps and the quality and effectiveness of the whole mining process [23]. Usually, the data preparation stage can be subdivided into three steps: data integration, data cleaning, and data preprocessing.
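The three data preparation steps can be illustrated with a minimal Python sketch. The record fields (`village`, `households`, `rating`), the village names, and the size bins are hypothetical and are not taken from the study data; the sketch only shows how integration, cleaning, and preprocessing (here, discretization) might be chained.

```python
def integrate(*sources):
    """Data integration: merge records from several sources by village name."""
    merged = {}
    for source in sources:
        for record in source:
            merged.setdefault(record["village"], {}).update(record)
    return list(merged.values())


def clean(records, required):
    """Data cleaning: drop records with a missing required field."""
    return [r for r in records if all(r.get(k) is not None for k in required)]


def preprocess(records, field, bins):
    """Data preprocessing: discretize a numeric field into labeled levels."""
    result = []
    for r in records:
        level = next(name for upper, name in bins if r[field] <= upper)
        result.append({**r, field + "_level": level})
    return result


# Hypothetical survey data and network data (placeholders only).
survey = [
    {"village": "A", "households": 80},
    {"village": "B", "households": None},
    {"village": "C", "households": 620},
]
network = [
    {"village": "A", "rating": 4.2},
    {"village": "C", "rating": 3.1},
    {"village": "B", "rating": 4.8},
]
bins = [(100, "small"), (500, "medium"), (float("inf"), "large")]

prepared = preprocess(
    clean(integrate(survey, network), ["households", "rating"]),
    "households",
    bins,
)
```

Discretizing numeric fields at this stage is one possible choice; it suits decision tree methods such as ID3 that operate on categorical attribute values.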

Data Mining. Data mining is the actual process of mining and analysis. On the basis of data preparation, algorithms related to the mining topic are selected to mine and analyze the data.
This stage is the most critical part of the whole mining process, and the quality of the algorithm directly affects the validity and accuracy of the results. At this stage, the emphasis is on the design and optimization of algorithms, which is extremely technical and difficult and is often a key and hot research field.

Evaluation and Representation.
The evaluation and representation stage evaluates the mined models and eliminates those that do not meet the evaluation criteria. At the same time, the results are expressed through various visual tools in this stage, which makes them easier for people to understand and accept and presents the relevant information to users.

Decision Tree Classification Mining Algorithm for Settlement Groups. Spatial data, including attribute, space, and time, are the object of spatial data mining. The purpose of this section is to explore hidden information and extract practical and useful rules or models describing the data from a large number of noisy attribute elements.
Iterative Dichotomiser 3 (ID3) is an algorithm used to generate decision trees and is widely used in the field of machine learning. This section focuses on the feature selection process used in the ID3 algorithm, which relies on basic information measures computed from the dataset. ID3 is a greedy algorithm: it first finds the most discriminating attribute in the training sample set, that is, the attribute with the largest information gain.
Then, the training sample set is divided into several subsets, and each subset selects its own most discriminating attribute for further division. The procedure executes recursively from top to bottom until every subset contains only data of the same type. Finally, a decision tree is obtained [24].
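The greedy selection step can be sketched in Python as follows. This is a minimal illustration of standard ID3 attribute selection, not the implementation used in the experiments; the records and the attribute names (`record_id`, `terrain`, `layout`, `protected`) are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on `attribute`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - remainder


def best_attribute(rows, attributes, target):
    """Greedy ID3 choice: the attribute with the largest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical toy records (not from the paper's dataset).
rows = [
    {"record_id": 1, "terrain": "coastal", "layout": "grid", "protected": "yes"},
    {"record_id": 2, "terrain": "coastal", "layout": "linear", "protected": "yes"},
    {"record_id": 3, "terrain": "inland", "layout": "grid", "protected": "no"},
    {"record_id": 4, "terrain": "inland", "layout": "linear", "protected": "no"},
]

chosen = best_attribute(rows, ["terrain", "layout"], "protected")
gain_id = information_gain(rows, "record_id", "protected")
```

In this toy data, `terrain` separates the classes perfectly and is chosen. The unique `record_id` column reaches the same maximal gain although it cannot generalize, which illustrates why plain information gain favors attributes with many values.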
The ID3 algorithm has a clear theoretical basis, a simple procedure, and strong learning ability and is widely used in the fields of data mining and machine learning. At the same time, the ID3 algorithm has the following shortcomings: (1) It prefers attributes with a large number of values and treats them as more important for classification, although attributes with more values are not always the optimal attributes; that is, it has a multivalued tendency. (2) Due to the limitation of its calculation principle, the efficiency of decision tree generation is obviously affected when dealing with large training sets.
In order to overcome the shortcomings of the above ID3 algorithm, this paper has improved the ID3 algorithm accordingly, and the basic idea of the improved ID3 algorithm comes from this, as follows.
(1) User interest is introduced to solve the multivalue tendency of ID3 algorithm.
In the improved ID3 algorithm, the user interest degree α is introduced to indicate the user's interest in uncertain knowledge, where 0 ≤ α ≤ 1 is determined by the decision maker according to prior knowledge or domain knowledge. User interest α is a vague concept, which usually refers to prior knowledge about a certain matter, including domain knowledge and expert advice. After this improvement, the information entropy is calculated by formula (1).
(2) Simplify the calculation process of the ID3 algorithm by using the Maclaurin formula.
According to the implementation principle of the ID3 algorithm,
$$\mathrm{Gain}(A) = I(p, n) - E(A),$$
in which I(p, n) is a fixed quantity for a given node, so the complexity of the algorithm depends on
$$E(A) = \sum_{i=1}^{v} \frac{p_i + n_i}{p + n}\, I(p_i, n_i), \quad (2)$$
where p and n are the numbers of positive and negative samples at the node and p_i and n_i are the corresponding numbers in the i-th of the v subsets produced by attribute A. Then, every time a nonleaf node is selected, the algorithm has to perform logarithmic operations several times. When dealing with large training sets, the efficiency of decision tree generation is seriously affected. Therefore, the ID3 algorithm is improved according to the Maclaurin formula [25].
Let f(x) be continuous at x_0; then,
$$f(x) = f(x_0) + a.$$
Let f(x) be differentiable at x_0; then,
$$f(x) = f(x_0) + f'(x_0)(x - x_0) + o(x - x_0), \quad (4)$$
in which a and o(x − x_0) tend to 0 as Δx → 0, i.e., as x → x_0. Equation (4) is a first-order polynomial near x_0, and in order to improve the accuracy, a higher-order polynomial near x_0 is
$$P_n(x) = a_0 + a_1 (x - x_0) + \cdots + a_n (x - x_0)^n,$$
where a_k = f^{(k)}(x_0)/k!. According to the Taylor formula,
$$f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k + R_n(x),$$
where R_n(x) is the remainder. Setting x_0 = 0 gives the Maclaurin formula, from which ln(1 + x) ≈ x for small x.
Convert (2) into
$$E(A) = \frac{1}{(p + n)\ln 2} \sum_{i=1}^{v} \left( -p_i \ln \frac{p_i}{p_i + n_i} - n_i \ln \frac{n_i}{p_i + n_i} \right), \quad (10)$$
in which 1/((p + n) ln 2) is a constant and is expressed by M. At the same time,
$$\ln \frac{p_i}{p_i + n_i} = \ln\left(1 - \frac{n_i}{p_i + n_i}\right) \approx -\frac{n_i}{p_i + n_i}, \qquad \ln \frac{n_i}{p_i + n_i} \approx -\frac{p_i}{p_i + n_i}.$$
Finally, the improved ID3 algorithm is obtained:
$$E(A) \approx M \sum_{i=1}^{v} \frac{2 p_i n_i}{p_i + n_i}.$$
It contains only addition, multiplication, and division instead of logarithmic operations, which can obviously improve the generation efficiency of the decision tree when dealing with large training sets [26].
After the above improvement, combining formulas (1) and (10) gives the information entropy calculation formula E(A) of the improved ID3 algorithm, which incorporates the user interest α and requires no logarithmic operations. In the improved ID3 algorithm, when testing attributes at each node, E(A) is used as the measure of information, thus constructing the decision tree. When applying the improved ID3 algorithm, if an attribute with fewer values is more important than an attribute with more values in the generation of the decision tree, the value of the user interest α should be set appropriately, so that an attribute originally far from the root node can appear on a node closer to the root node. Then, the improved ID3 algorithm is used to reconstruct the decision tree and extract the classification rules [27].
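The effect of removing the logarithm can be sketched in Python. This is a hedged illustration, not the paper's implementation: it applies the first-order approximation ln(1 + x) ≈ x to each term of formula (10), which gives the log-free surrogate M · Σ 2·p_i·n_i/(p_i + n_i). The user interest α is left out, and the attribute names and sample counts are hypothetical.

```python
import math


def exact_split_entropy(subsets):
    """E(A) computed with logarithms; `subsets` is a list of (p_i, n_i) counts."""
    total = sum(p + n for p, n in subsets)
    acc = 0.0
    for p, n in subsets:
        for count in (p, n):
            if count > 0:  # treat 0 * ln(0) as 0
                acc -= count * math.log(count / (p + n))
    return acc / (total * math.log(2))


def approx_split_entropy(subsets):
    """Log-free surrogate: every logarithm replaced by ln(1 + x) ~ x."""
    total = sum(p + n for p, n in subsets)
    m = 1.0 / (total * math.log(2))  # the constant M, computed once per node
    return m * sum(2.0 * p * n / (p + n) for p, n in subsets if p + n > 0)


# Hypothetical (positive, negative) counts per attribute value.
candidates = {
    "attribute_a": [(2, 3), (4, 0), (3, 2)],
    "attribute_b": [(6, 2), (3, 3)],
}

# The attribute with the smallest E(A) has the largest information gain.
best_exact = min(candidates, key=lambda a: exact_split_entropy(candidates[a]))
best_approx = min(candidates, key=lambda a: approx_split_entropy(candidates[a]))
```

The surrogate is not numerically equal to the exact entropy, so it serves to rank attributes rather than to report entropy values. Both measures are zero for pure subsets, and in this example both select the same attribute, although agreement is not guaranteed for every dataset.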

Grid Pattern Recognition Based on Graph Theory. The grid pattern is one of the typical distribution patterns of architectural complexes and is common in well-planned urban communities, schools, factories, etc. It is a geographical distribution feature that must be considered in architectural complex synthesis (such as typification) and multiscale expression.
Graph theory is a branch of mathematics used to study the relationships among a group of concrete things, and it has a solid mathematical foundation. Any system that can be described by binary relations can use graph theory to provide its mathematical model [28]. A graph in graph theory is composed of a set of points representing concrete things and a set of line segments representing the connections between things. Its graphic form is intuitive and clear, and its data structure is simple and unified, which can reveal the inherent connections and changes between things in essence.
Taking the building complex in Figure 2 as an example, the flow of the grid pattern extraction algorithm is as follows.
First, establish the connection graph G = <V, E> of the straight line patterns. Each node in V represents a straight line pattern, and an edge in E between two nodes indicates that the two straight line patterns intersect or meet. As shown in Figure 2, straight line patterns {1, 2, 3} intersect with {4, 5, 6}, and {8, 9, 10, 11} intersect with {12, 13}. This step ensures that the straight line pattern groups that make up a grid all intersect or connect. Next, the connected components ORG_CC of the direction relation graph ORG are extracted. If ORG has only one connected component, either there is no grid pattern or the direction threshold is set too large and should be reset. Then, the network InitG of the initial grid pattern is obtained, as shown in Figure 3.
Buildings filled with diagonal lines in Figure 4 are excluded from GA and GB. Thus, the final result of the grid pattern is generated.
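The graph-building step of this flow can be sketched in Python. The sketch covers only the construction of the connection graph and the extraction of its connected components; the direction relation graph and the maximum complete subgraph step are not shown. It assumes that every line pattern in {1, 2, 3} intersects every one in {4, 5, 6}, likewise for {8, 9, 10, 11} and {12, 13}, and that pattern 7 exists but intersects nothing; these assumptions go beyond what the description of Figure 2 states.

```python
from collections import deque


def build_graph(nodes, edges):
    """Undirected adjacency sets: nodes are line patterns, edges are intersections."""
    adjacency = {v: set() for v in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def connected_components(adjacency):
    """Breadth-first search over every unvisited node, in sorted node order."""
    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), set()
        while queue:
            v = queue.popleft()
            component.add(v)
            for w in adjacency[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        components.append(component)
    return components


# Assumed intersection relation for the Figure 2 example (see the lead-in).
nodes = range(1, 14)
edges = [(a, b) for a in (1, 2, 3) for b in (4, 5, 6)]
edges += [(a, b) for a in (8, 9, 10, 11) for b in (12, 13)]

graph = build_graph(nodes, edges)
# Isolated line patterns cannot belong to a grid, so only larger components remain.
grid_candidates = [c for c in connected_components(graph) if len(c) > 1]
```

Under these assumptions, two candidate groups remain, one per cluster of mutually intersecting line patterns, and each group would then be checked against the direction criterion.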
Based on the idea of "decomposition-combination" and the hierarchy of its cognitive characteristics, the complex nonlinear cognitive concept of grid pattern is decomposed into the orthogonal combination of linear cognitive concept of linear pattern, which makes the cognitive meaning more concise. The extraction process of grid pattern only involves the direction parameter, which has clear geographical meaning and is easy to adjust.

BDM Process of Traditional Settlement Architectural Space. Firstly, the spatial data mining model of the problem building is established, that is, the spatial data mining model of the contradictory problem building, p = g * l or p = (g_1 ∧ g_2) * l, in which g, g_1, and g_2 represent the target and l is the condition, each of which can be formally expressed by primitives.
In the contradictory problem, if the design condition l cannot achieve the goal g, then the problem p = g * l is called an incompatible problem and is recorded as g ↑ l. If the design condition l cannot realize g_1 and g_2 at the same time, the problem p = (g_1 ∧ g_2) * l is called the opposite problem and is recorded as (g_1 ∧ g_2) ↑ l. Most architectural designs are multiobjective and multiconditional, and the spatial data mining model of the problem building is
$$p = (g_1 \wedge g_2 \wedge \cdots \wedge g_n) * (l_1 \wedge l_2 \wedge \cdots \wedge l_m), \quad (g_1 \wedge g_2 \wedge \cdots \wedge g_n) \uparrow (l_1 \wedge l_2 \wedge \cdots \wedge l_m). \quad (15)$$
Each kind of knowledge is explained and evaluated by calculating the support degree and credibility of positive and negative knowledge and positive and quantitative knowledge in the dataset. The support degree can be understood as the percentage of selected tuples in the total number of tuples, while the credibility is the percentage of selected cases among eligible cases.
If the whole universe is U, the sample set that satisfies the rule is E_1, and the sample set that satisfies the precondition is E_2, the support and credibility can be expressed as
$$\text{support} = \frac{|E_1|}{|U|}, \qquad \text{confidence} = \frac{|E_1|}{|E_2|}. \quad (16)$$
In the knowledge formula, support and credibility are usually expressed by the following formula:
$$\ell = (\text{support}, \text{confidence}). \quad (17)$$
That is, each piece of knowledge is expressed together with its support and credibility.
Test the knowledge discovered in practice. The knowledge mined from architectural design data, combined with the design conditions, is used to solve the problems in design and is applied to the design scheme formulation process. After several stages such as scheme implementation and application effect evaluation, the mined architectural space knowledge is tested and evaluated in design practice, as shown in Figure 5.
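The support and credibility (confidence) measures can be sketched in Python as follows, taking support as the share of the universe U that satisfies the whole rule and confidence as the share of the samples meeting the precondition that also satisfy the rule. The records, the field names (`courtyard`, `region`), and the rule itself are hypothetical and serve only to show the calculation.

```python
def support_confidence(universe, precondition, conclusion):
    """Return (support, confidence) for the rule `precondition -> conclusion`."""
    if not universe:
        raise ValueError("universe must not be empty")
    e2 = [x for x in universe if precondition(x)]  # samples meeting the precondition
    e1 = [x for x in e2 if conclusion(x)]          # samples meeting the whole rule
    support = len(e1) / len(universe)
    confidence = len(e1) / len(e2) if e2 else 0.0
    return support, confidence


# Hypothetical design records (placeholders, not the study data).
records = (
    [{"courtyard": "multi", "region": "north"}] * 3
    + [{"courtyard": "multi", "region": "south"}] * 1
    + [{"courtyard": "single", "region": "north"}] * 2
    + [{"courtyard": "single", "region": "south"}] * 4
)

# Hypothetical rule: multi-courtyard dwellings are located in the north.
support, confidence = support_confidence(
    records,
    precondition=lambda r: r["courtyard"] == "multi",
    conclusion=lambda r: r["region"] == "north",
)
```

Rules whose support or confidence falls below a chosen threshold would be discarded before the knowledge is tested in design practice; the threshold values are a design choice that this sketch does not fix.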

Performance Analysis of the Algorithm.
In this paper, the ID3 algorithm and the C4.5 algorithm are also used to classify the attribute values of five traditional settlement buildings, and the classification rules obtained are the same as those of the improved ID3 algorithm. After many tests, however, the running times of the three algorithms differ on the five traditional settlement buildings. Compared with the other algorithms, the improved algorithm obtains more accurate values and has a wider application range. According to the characteristics of the spatial form and structure of traditional settlements in Hainan, the spatial layout is carried out to build a new settlement in Hainan with the characteristics of the times. Figure 6 shows the running time of the classification algorithms.
It can be seen from Figure 6 that the running time of the three algorithms increases gradually with the number of training sets to be classified. To sum up, the improved ID3 algorithm is not only effective but also achieves the intended improvement: simplifying the calculation method improves the efficiency of the algorithm. As for the C4.5 algorithm, its efficiency is relatively low in this case, and it is conjectured that this is related to the way the C4.5 algorithm processes continuous data.
The speedup ratio usually refers to the ratio of the time spent running a job on a single-processor system to the time spent running the same job in parallel on multiple processors, and it is used to measure the performance of parallel systems or program parallelization. The higher the speedup ratio, the better the parallel performance. In order to test the parallel processing performance of the improved ID3 algorithm, this experiment runs datasets of different sizes, D1 (1 million score records), D2 (4 million), D3 (7 million), and D4 (10 million), on a single machine and on clusters with different numbers of nodes, records the required time, and averages the results over multiple runs. Then, the speedup ratio under each condition is calculated, as shown in Figure 7.
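The ratio defined above can be computed with a short Python sketch. The timings below are hypothetical placeholders chosen only to show the calculation; they are not the measured values behind Figure 7.

```python
def speedup_ratios(timings):
    """Map node count to T(1 node) / T(n nodes); `timings` must contain key 1."""
    baseline = timings[1]
    return {nodes: baseline / t for nodes, t in sorted(timings.items())}


# Hypothetical averaged running times in seconds (placeholders, not measurements).
timings = {1: 120.0, 2: 66.0, 4: 40.0, 8: 30.0}
ratios = speedup_ratios(timings)

for nodes, ratio in ratios.items():
    print(f"{nodes} node(s): speedup {ratio:.2f}")
```

With these placeholder timings, the ratio grows with the number of nodes while each added node contributes less, which is the shape expected when communication overhead increases with cluster size.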
It can be seen from Figure 7 that the improved ID3 algorithm has an obvious speedup effect. With the same number of nodes, the larger the dataset, the greater the speedup ratio, which reflects Spark's ability to handle BDM. As the number of nodes increases, the speedup ratio increases correspondingly, but its growth eventually slows, because the network overhead in the calculation process also increases with the number of nodes. Moreover, the experimental data are not large enough, and the cluster is still small (limited by experimental conditions), so the experiment cannot fully reflect the actual capability of BDM parallel processing.
In addition, this experiment also compares the improved ID3 algorithm with the existing ID3 algorithm MCF based on cooccurrence matrix and runs in the same dataset and the same experimental environment and obtains the results as shown in Figures 8-11.

Analysis of Architectural Space of Traditional Settlements in Hainan.
Traditional settlement spatial form is to analyze the expression of settlement in the overall environment from a macroperspective. The spatial form analysis of traditional settlement architecture is to go deep into the internal space of the settlement from the microscopic point of view and analyze the specific internal spaces of the settlement architecture and the structural relationships among various buildings [29].
Due to the dense population pattern and strong family culture in Qiongbei area, it is natural to choose the way of increasing the number of main residential buildings vertically and horizontally, extending auxiliary long houses on one side or both sides and maintaining a unified, regular, and orderly courtyard where large families live in compact communities. This situation is mainly in Haikou, Dingan, Wenchang, Chengmai, and other places, and further to the southern region, there are fewer courtyards in this pattern. Figure 12 shows a scatter chart of evaluation content scores of well-developed villages.
There are two clues to the inheritance and evolution of the traditional courtyard space form in each region. First, based on the basic building units that represent the clear characteristics of the region, there are relatively "regular" variations to produce different types, and the "shadow" of the basic building units always exists in these variation types. The differences of climate and environment in different regions of Hainan, the imbalance of economic development, the complex origin of ethnic groups, and the characteristics of living together make the spatial form of houses in different regions show different regional differentiation. Figure 13 shows a score graph of evaluation content of slow-developing villages.

Street Space Protection.
Traditional streets and lanes are complex spaces interwoven with various cultures and spatial implications. The functions of streets and alleys are not limited to meeting traffic needs but also include places for villagers to live and communicate. Streets and alleys are not only an important means of organizing landscape but also the main carrier for people to "read" the information of traditional settlements in Hainan. They are the most important material elements of the characteristics of village spatial pattern.
Remediation Measures. Traditional streets and lanes in good condition must keep their original composition and cannot be arbitrarily transformed. Their direction, width, and shape shall be strictly maintained. The occupation area and elevation form of residential buildings on both sides of streets and of businesses along streets should remain unchanged, and the status quo should be protected. Demolition, material replacement, and illegal selling are not allowed. In case pipelines really need to be laid underground for infrastructure construction, correspondingly qualified units must be invited to carry out on-site investigation, mapping, photographing, and numbering and to draw detailed pavement drawings. After the pipelines are laid, the pavement must be restored according to the drawings, using the original materials, specifications, and techniques, and all relevant elements of street space should be uniformly arranged in the same style.

Optimization of Village Ecological Environment.
Combined with the renovation of the road system in overlapping village, the spare open space in the village is sorted out and transformed into green space, while the vegetable fields and shrub lands outside the village are preserved and optimized, the recreation and leisure facilities in Hainan traditional settlement style are built, and the ecological environment is optimized in cooperation with the construction of civilized ecological villages. The reconstruction and renewal of coincidence village belong to the type of ecological civilization village. This kind of conservation and renewal village is mainly based on maintaining the basic characteristics of the village's natural environment, so as to improve the sustainable development ability of the village. Village renovation and renewal involve relatively little protection. Therefore, compared with famous historical and cultural towns (villages) and towns and villages with special customs, the renovation and renewal efforts should be greater. However, we should also pay attention to consciously leaving room for the creation of village characteristics and customs in the future. According to the characteristics of traditional Hainan settlement spatial form and structure, the spatial layout is carried out to build a new Hainan settlement with the characteristics of "times."

Conclusion
The protection and renewal of traditional settlements should take protection as the first premise. This paper analyzes the architectural space of Hainan traditional settlements through BDM, which can help us better grasp social activities and provide a direction for the protection and renewal of traditional settlements from the perspective of tourists and residents. Traditional settlements in different regions of Hainan Island show obvious regional differentiation in living space. But as far as the whole traditional settlement space form is concerned, it shows similar consistency, namely, it is composed of residential buildings, public buildings, and ecological environment, which highlights the regional characteristics of Hainan Island. According to different types of traditional settlements in Hainan Island, the protection and renewal system of traditional settlements covering the whole island is constructed, which is composed of famous historical and cultural towns (villages), characteristic towns (villages), and civilized ecological villages. Famous historical and cultural towns (villages) focus on protecting the historical authenticity of traditional settlements. It not only ensures the true preservation of historical authenticity but also strengthens the regional distinctive features of traditional settlements and architectural space forms while retaining the most common basic features of natural and simple traditional settlements in Hainan Island.
Because the research of BDM in traditional settlements is still a developing field, there is no systematic research method and paradigm for the application of BDM in traditional settlements. Because of the inconvenience and difficulty in obtaining data, only the network data of Internet and mobile phone platform in BDM are used as the data of village renewal research. Therefore, this paper puts forward more ideas, but the content is not profound and comprehensive enough. In the future research, we need to solve the inconvenience and difficulty of data acquisition and collect more data that can be used for village renewal research.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.