Built environment and early infection of COVID-19 in urban districts: A case study of Huangzhou

Since COVID-19 spread rapidly worldwide, many countries have experienced significant growth in the number of confirmed cases and deaths. Earlier studies have examined various factors that may contribute to the contagion rate of COVID-19, such as air pollution, smoking, humidity, and temperature. Because neighborhood-level studies detailing the spatial settings of built environment attributes are lacking, this study explored the variations in the size of COVID-19 confirmed case clusters across the urban district of Huangzhou in the city of Huanggang. Clusters of infectious cases in the initial outbreak of COVID-19 were identified geographically through GIS methods. The hypothesized relationships between built environment attributes and clusters of COVID-19 cases were investigated with a structural equation model. The results show statistically significant direct and indirect influences of commercial vitality and transportation infrastructure on the number of confirmed cases in an infectious cluster. The clues to what induces a high risk of contagion are evidenced and provided for decision-making in response to the initial stage of possible severe epidemics, indicating that local public health authorities should implement sufficient measures and adopt effective interventions in areas and places with a high probability of crowding.


Introduction
Since the initial cases of the 'unknown pneumonia' were reported in Wuhan, China, in December 2019 (Lu, Stratton, & Tang, 2020), this infectious disease has broken out and spread rapidly all over the world (Wei, Wang, & Kraak, 2020). The novel coronavirus was soon identified as the causative agent by the Chinese Centre for Disease Control and Prevention. Subsequently, the pneumonia was named COVID-19 by the World Health Organization. COVID-19 has since become a severe pandemic of international concern. The regional hazard has transformed into an international pandemic and an unprecedented global emergency in a highly globalized era (Sohrabi et al., 2020). Alongside these dark sentiments, images of solidarity have emerged (The Lancet, 2020). The impact will also drive a change in the way we think about cities and health (Acuto, 2020).
As evidenced in many confirmed cases, the clinical symptoms of COVID-19 are dry cough, fatigue, fever, dyspnea, and bilateral lung infiltrates (Sohrabi et al., 2020). Some younger infected persons have relatively mild symptoms without fever, and most of them recovered after about one week. The severe cases developed rapidly into acute respiratory distress syndrome, septic shock, hard-to-correct metabolic acidosis, and coagulation dysfunction. Notably, aged patients and patients with chronic disease have worse conditions (Deng & Peng, 2020). Thus, it is essential to meet the challenge of COVID-19 spread with strategies and knowledge from an interdisciplinary perspective (Franch-Pardo, Napoletano, Rosete-Verges, & Billa, 2020; Megahed & Ghoneim, 2020; Rahman et al., 2020). However, few studies define the pathophysiological characteristics of COVID-19, and its mechanism of spread remains considerably uncertain.
Against this backdrop, this study investigated the spatial spread of the epidemic in the Huangzhou district of Huanggang, Hubei province, China. Anonymized data on the residential locations of infected cases at the initial stage were used to map the COVID-19 contagion distribution through geographic information systems (GIS). Various aspects of the urban built environment are correlated not only with organizational-level outcomes (Wang & Vermeulen, 2020) but also with individual-level outcomes regarding population physical and mental health (Saarloos, Kim, & Timmermans, 2009). Using data on the built environment around confirmed cases, we investigated the relationships between the number of clustered COVID-19 confirmed cases and the built environment attributes of the corresponding neighborhood-level urban space. Human-scale measurements of urban components and attributes that affect the sense of potential risk of contact infection can be used to identify features and hot spots of COVID-19 clustering in certain areas (Shach-Pinsly, 2019). The objective of this investigation is to provide evidence on the potential risk of COVID-19 spreading and clustering in relation to the conceptualized constructs of urban commercial development, medical service capacity, and transportation infrastructure. Using a density-based clustering algorithm, the confirmed cases were spatially categorized into clusters of different sizes. The number of cases in a cluster defines its size. Subsequently, the points of interest (POIs) of various types of commercial facilities and medical services, road network lengths, building density, and average housing price at the community level surrounding the clusters (within a 1000 m buffer) were incorporated into the distribution map of COVID-19 infected cases with ArcGIS. Finally, we explored the relationships between the clustering of infected cases and fine-scale built environment attributes.
The remainder of this paper is organized as follows. The short review of the spatial transmission of COVID-19 is presented in Section 2, which is followed by Section 3 regarding the proposed hypothetical conceptual framework. To examine the hypotheses, the methodology in this study is addressed in Section 4, which includes the descriptions of COVID-19 confirmed cases and related neighborhood-level urban attributes in Huangzhou as the data sources, and the methods of geo-analysis on COVID-19 clustering and structural equation model for estimating the hypotheses. Then, the results of the estimation are shown in Section 5. Discussions based on the results are elaborated in Section 6. In the last section of the conclusion, the study is summarized.

Literature review on the spatial spread of COVID-19
Most of the initial COVID-19 cases were geographically linked to the Huanan seafood wholesale market in Wuhan, which suggests the virus was transmitted from animal to human (Li, Guan et al., 2020), although the intermediate animal that transmitted the virus to humans is still unknown. Subsequent human-to-human transmission has been confirmed based on reports of infected clusters of family members and healthcare workers in different geographical regions (Chan et al., 2020). Nevertheless, genomic evidence suggests that the virus was introduced from another, unknown location, and that human-to-human transmission may have occurred earlier (Shereen, Khan, Kazmi, Bashir, & Siddique, 2020; Yu, Bin, Da, Zhang, & Corlett, 2020). Further investigation revealed that more than 70 % of the patients at the beginning of the outbreak had no exposure to the market (Li, Guan et al., 2020). COVID-19 is clearly highly transmissible. So far, the reproductive number of COVID-19 has been much higher than that of SARS, as have the numbers of people infected and killed (Liu, Gayle, Wilder-Smith, & Rocklöv, 2020; Mahase, 2020). As viral transmission intensified, Wuhan was quarantined on 23 January 2020. Subsequent viral spread led to movement restrictions across 16 cities in Hubei province, affecting 50 million people (Sohrabi et al., 2020). Since 26 January 2020, 30 provinces in China have initiated a Level-1 major public health emergency response to control the outbreak.
Typically, respiratory viruses are most contagious when a patient is symptomatic (Li, Guan et al., 2020; Phan et al., 2020). According to the clinical evidence from previous respiratory syndromes, respiratory droplets and personal contact are the primary transmission routes (Zumla, Chan, Azhar, Hui, & Yuen, 2016). Close contact with infected persons, especially when they cough or sneeze, is thought to be the main risk for respiratory virus transmission (Harapan et al., 2020). In the case of COVID-19, the source of infection is mainly infected patients, including asymptomatic ones, as some patients do not have any symptoms or abnormalities on initial presentation, which has complicated diagnosis and prevention (Guan et al., 2020). As research has progressed, a rich body of evidence has indicated human-to-human transmission during the asymptomatic incubation period of COVID-19 (Chan et al., 2020; Guan et al., 2020; Li, Guan et al., 2020; Rothe et al., 2020; Zhou et al., 2020). Thus, at the initial stage, the spread rate of an epidemic in a city depends on human mobility in urban areas. Compulsory measures should be considered to restrict residents' congregation while ensuring the supply of living resources.
As one of the most important characteristics of an epidemic, spatial spread mainly depends on the transmission mechanism, human mobility, and control strategy (Gross et al., 2020; Rahmani & Mirmahaleh, 2021). In response to the virus spread, GIS and spatial statistics can be applied to track the epidemic through scientific information, for example by finding spatial correlations with related variables and identifying transmission dynamics (Ellis et al., 2016; Xiong, Wang, Chen, & Zhu, 2020). The spatiotemporal characteristics of confirmed cases have been correlated with the profile of symptoms and demographic variables, making information about contagion more precise. Spatial variations have been estimated through geographical and geospatial analyses, which provide insights into the distribution patterns of COVID-19 and effective measures (Franch-Pardo et al., 2020). Data on identified cases may offer valuable information for understanding the spread and making targeted policies, since early identification of areas with a high potential risk of contagion and an understanding of the spatial spread of the outbreak are crucial for decisions on implementing prevention and control measures against COVID-19 infections at the initial stage (Kang, Choi, Kim, & Choi, 2020). Currently, neither a vaccine to prevent infections nor specific therapeutic medicines to cure patients have been produced (Deng & Peng, 2020; Wang, Horby, Hayden, & Gao, 2020). The critical and effective control strategies are to block human-to-human transmission and to prevent people from being exposed to the virus sources. Disrupting the transmission chain is considered key to stopping contagious diffusion (Chan et al., 2015). Public health interventions and transportation restrictions have been evaluated regarding their effects on COVID-19 transmission (Su et al., 2020; Tang et al., 2020; Zhao et al., 2020).
The identification of significant clusters of infected cases is essential to practical and scientific endeavors (Grubesic, Wei, & Murray, 2014;Sarwar, Waheed, Sarwar, & Khan, 2020). Multiple dimensions of the spatial segmentation of identity and count cases are required for revealing the epidemic risk and prevention level. Although the spread of COVID-19 is a serious international concern, the responses depend on local governance, socioeconomic, and cultural contexts (Shaw, Kim, & Hua, 2020). National-level spatial distribution and dependency have confirmed the positive relationship between the frequency of flights, trains, and buses and the number (daily and cumulative) of COVID-19 infected cases. The neighborhood-level spatial track of infected patients provides essential information to detect the risky areas and the infection rate, which is also necessary for impeding the transmission with quarantine and disinfection methods. The spatial clustering facilitates the early detection of the risk of contagion outbreaks at the neighborhood-level associated with the human-to-human transmission of viruses or a localized infection source.
Understanding the urban settings related to effective preparedness is as important as knowing the spatial attributes of risk and spread rate, and it calls for innovative solutions. Residents' daily behavior depends on the environment in which it takes place, while residents also influence the environment by their presence and activities (Saarloos et al., 2009). Different configurations of urban spaces define users' perceptions, sense of control, social distance, and behavioral patterns (Morello & Ratti, 2009), which may provide information about the potential risk of contact and air-borne contagion. The heterogeneity in the distribution and clustering of confirmed COVID-19 cases at the initial stage could be correlated with particular urban settings. However, neighborhood-scale research on the risk and rate of the spread of COVID-19 is lacking. The certainty of existing evidence on community-level effects of spatial attributes on COVID-19 infection and transmission is yet to be estimated, which poses a barrier to decision-making. This study still confronts the challenge of systematically analyzing the relationship between the built environment and epidemic spread, because detailed, high-quality data on infected persons and comprehensive built environment data are scarce.

Hypothetical conceptual framework
Epidemiological evidence has indicated that human-to-human transmission caused by confirmed and suspected cases exists even at the community level during the incubation period (Chan et al., 2020). At the initial stage of epidemic spread, individual suspected virus carriers behave as normal persons in response to urban physical and social settings as a part of the built environment (Moudon & Lee, 2003). Their activities and movements are influenced by businesses and organizations located in the environment of daily life (Saarloos et al., 2009). Urban setting features, such as the location of commercial zones and public transportation and the socioeconomic gradient, contribute to and shape human exposure (Andrianou et al., 2020; Wild, 2005). Areas of community commercial activity, where close contact between people is highly likely, potentially carry a high risk of an uncontained outbreak of COVID-19 infection (Lee et al., 2020). From the geographical perspective, these influences may have a spatial dimension, which may allow the predicted distribution and clustering of infectious cases and the areas with high contagion risk to be mapped.
In this study, the conceptual framework presumes that the commercial prosperity, medical service, and transportation infrastructure surrounding the cluster of COVID-19 confirmed cases are treated as unmeasurable influential factors. These latent theoretic constructs regarding commercial prosperity, medical service, and transportation infrastructure at the community level are conceptualized by measured variables with spatial dimension regarding POIs of the daily life commercial, medical and transportation facilities, and road network lengths in the surrounding area of clusters as we set. Also, average housing price and average building density are regarded as the exogenous variables related to socioeconomic factors and spatial attributes surrounding the cluster of COVID-19 confirmed cases, which influence the endogenous latent variables and cluster size. The conceptual framework of the hypothetical structure is shown in Fig. 1.
The hypotheses of the theoretic constructs are described in Table 1, and the relationships between variables are listed in Table 2. Three latent variables of perceived urban facilities are presumed to be measured by the corresponding indicators, referred to as h1-h10. The number of POIs is used to reflect urban vitality and commercial prosperity, and it includes food services, markets, hair salons, convenience stores, public toilets, and ATMs within a 1000 m radius around each COVID-19 cluster (Mitchell & Bendixson, 2015). Each commercial POI is a possible venue where people gather with close exposure and contact in daily life. The medical service indicators are the numbers of clinic and drugstore POIs within 1000 m of each COVID-19 cluster. For transportation infrastructure, the indicators comprise the number of bus stop POIs and the road network length within a 1000 m radius surrounding each COVID-19 cluster. All indicators are log-transformed before running the structural model. In the structural model, hypotheses H1-H6 specify the relationships among the latent endogenous variables and COVID-19 cluster size. H7-H9 represent the hypothetical effects of building density on commercial prosperity, transportation infrastructure, and COVID-19 cluster size. Similarly, H10-H12 represent the hypothetical effects of housing prices on commercial prosperity, medical service, and COVID-19 cluster size. All hypotheses have been estimated through covariance-based SEM, and the results are described in Section 5.
Accordingly, this study aims to bridge the gap by first constructing a structural equation model of the relationships among the confirmed COVID-19 cluster size (the number of confirmed persons in a cluster) and latent variables representing commercial prosperity, medical service capacity, and transportation infrastructure within a 1000 m radius around each COVID-19 confirmed cluster. The model explicitly indicates the correlations between neighborhood-level urban spatial attributes and the risk of confirmed cases clustering. It is meant to give decision-makers a sense of the potential procedures for prevention and control based on a quantification of urban spatial influence. Based on the data gathered from the Centre for Disease Control and Prevention of Huangzhou district in Huanggang, we employed DBSCAN to spatially cluster confirmed cases by their residential locations, and SEM to analyze the relationships between the cluster size and the exogenous and endogenous factors related to urban settings and spatial attributes. The findings of this study can potentially inform decision-makers on detecting the potential risk correlated with urban settings and spatial attributes and on how to implement and lift prevention and protection measures spatially.

Data sources
Huanggang is a major municipality (prefecture) adjacent to Wuhan in easternmost Hubei province. It has two directly administered divisions, namely Huangzhou and Longganghu. The geographic relation of Huanggang city and Huangzhou district is shown in Fig. 2. Since the COVID-19 outbreak, the world has been concerned about the course of the epidemic in Wuhan. Epidemic treatment, prevention, and control in Wuhan received widespread support from the whole country. Besides the high growth rate of infection in Wuhan, the virus transmission rate in adjacent cities in Hubei province was remarkably higher than that in other provinces. Wuhan, which comprises an urban area spanning 1528 km² and has more than 11 million residents, was quarantined from 23 January 2020. Movement restrictions were subsequently imposed across 16 cities of Hubei Province, affecting 50 million people.
The data on COVID-19 confirmed cases in this study were obtained from the Huangzhou district. The economic development of Huangzhou has benefited from its transportation infrastructure, such as its proximity to Tianhe airport of Wuhan (90 km), an extensive road network, several new expressways, and two national main north-south railway lines. However, during the initial stage of the COVID-19 outbreak, the inevitable frequent flow of population from Wuhan led to a high possibility of rapid infection spread in Huanggang, especially in the Huangzhou district. The spatial distribution data of 639 COVID-19 confirmed cases in the study area were extracted from "the Report of Epidemic Diseases" of the Huangzhou District Center for Disease Control and Prevention.
At the preparation stage of this study, spatial analysis using GIS played a crucial role in rapidly collecting and screening data on urban settings and spatial attributes from multiple sources, and in tracking and clustering the COVID-19 confirmed cases. The numbers of POIs regarding commercial and medical facilities and transportation, including foods and drinks, markets, ATMs, hair salons, public lavatories, bus stops, clinics, and drugstores at the community level within a 1000 m radius surrounding the clusters of COVID-19 confirmed cases, were accessed from the online open-source database of Gaode Map Lab (https://maplab.amap.com). Then, the spatial pattern of the various POIs in the Huangzhou district was incorporated into the map layer via ArcGIS tools. All data regarding urban settings and spatial attributes were linked with the corresponding clusters in the ArcGIS environment. In addition, we calculated the length of the road network, the mean housing price, and the average building density within the 1000 m radius around each cluster location. The POI counts were log-transformed to normalize the data for further modeling.

Covariates: (1) Building density; (2) Housing price. The k-mean values of (1) building density and (2) housing price within a 1000 m radius centered on a COVID-19 infected cluster.

Table 2. Hypothetical links between components in SEM.
Measurement model
h1-h6: Commercial prosperity is conceptualized by 6 manifest items
h7-h8: Transportation infrastructure is conceptualized by 2 manifest items
h9-h10: Medical service is conceptualized by 2 manifest items

Structural model
H1: Transportation infrastructure's effect on Commercial prosperity
H2: Transportation infrastructure's effect on Medical service
H3: Transportation infrastructure's effect on COVID-19 cluster size
H4: Medical service's effect on Commercial prosperity
H5: Commercial prosperity's effect on COVID-19 cluster size
H6: Medical service's effect on COVID-19 cluster size
H7: Effect of Building density on Commercial prosperity
H8: Effect of Building density on Transportation infrastructure
H9: Effect of Building density on COVID-19 cluster size
H10: Effect of Housing price on Medical service
H11: Effect of Housing price on Commercial prosperity
H12: Effect of Housing price on COVID-19 cluster size
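The POI indicators described in this section (a count within a 1000 m radius of each cluster, followed by a logarithm) can be illustrated with a minimal sketch. It is not the ArcGIS workflow used in the study: the function names, the sample coordinates, and the use of log(1 + count) to keep zero counts finite are all assumptions, and coordinates are assumed to be projected (x, y) positions in meters.

```python
import math

def count_within_radius(center, points, radius_m=1000.0):
    """Count points whose Euclidean distance to center is at most radius_m.

    Coordinates are assumed to be projected (x, y) values in meters.
    """
    cx, cy = center
    return sum(1 for (x, y) in points if math.hypot(x - cx, y - cy) <= radius_m)

def log_indicator(count):
    """Log-transform a count; the +1 offset is an assumption for zero counts."""
    return math.log1p(count)

# Hypothetical cluster location and food-service POI coordinates (meters).
cluster_center = (0.0, 0.0)
food_pois = [(100.0, 200.0), (900.0, 300.0), (1200.0, 0.0), (-600.0, -700.0)]

n_food = count_within_radius(cluster_center, food_pois)  # POIs inside the buffer
food_indicator = log_indicator(n_food)                   # value fed to the model
```

In this toy input, three of the four POIs fall inside the buffer; the same two functions would be applied per cluster and per POI type.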

Density-based clustering algorithm
Density-based clustering is an unsupervised learning method that groups similar data points in order to find the underlying structure of the data. Based on the idea that a cluster in data space is a contiguous region of high point density, separated from other such clusters by contiguous regions of low point density, density-based clustering algorithms deal efficiently with arbitrarily shaped clusters and identify both high-density regions and outliers (Dane, Borgers, & Feng, 2019; Daszykowski & Walczak, 2009). DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a well-known data clustering algorithm. The key idea is that, for each point of a cluster, the neighborhood of a given radius must contain at least a minimum number of points. Unlike k-means, DBSCAN is a non-parametric approach that does not require the number of clusters as an input parameter. Instead, it infers the number of clusters from the data, and it can discover clusters of arbitrary shape, in contrast to the spherical clusters of k-means. In DBSCAN, a point belongs to a cluster if it is close to many points from that cluster (see Fig. 3). Noise is defined relative to a given set of clusters and is simply the set of points in data space not belonging to any cluster. The cutoff distance of the neighborhood is fundamental to DBSCAN because it is used to approximate local density. Therefore, the algorithm has two parameters that determine the clustering outcome: (i) Eps (ε): the cutoff distance that specifies the neighborhoods; two points are considered neighbors if the distance between them is not more than ε; and (ii) MinPts: the minimal number of data points for defining a cluster.
DBSCAN stands out as a suitable algorithm for clustering the data of COVID-19 confirmed cases because it can discover clusters of arbitrary shape without prior knowledge of the number of clusters. Therefore, by using DBSCAN to discover clusters of COVID-19 confirmed cases through GIS, we categorize the points of infected cases into three classes on the map: (I) core points: a data point of a COVID-19 confirmed case is a core point if its ε-neighborhood contains at least MinPts points; (II) border points: a data point is a border point if its ε-neighborhood contains fewer than MinPts points but the point lies within the ε-neighborhood of a core point; (III) outliers: a data point is an outlier if it is neither a core point nor a border point. A cluster is defined as a maximal set of density-connected points of COVID-19 confirmed cases. The ε is set to 30 m, which is the largest distance between two dwelling buildings in the same residential community, while MinPts is set to 2. The classification starts with an arbitrary data point that satisfies the core point condition. It then retrieves all points that are density-reachable from that core point to obtain the cluster containing them, and repeats until all points have been processed.
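The procedure above can be sketched in a few lines of plain Python. This is an illustrative implementation under stated assumptions, not the GIS tooling used in the study: the coordinates are hypothetical projected positions in meters, the ε-neighborhood is taken to include the point itself, and the names `dbscan`, `NOISE`, and `cases` are introduced here only for the example.

```python
import math
from collections import Counter

NOISE = -1  # label for outliers

def dbscan(points, eps=30.0, min_pts=2):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE.

    Assumption: a point's eps-neighborhood includes the point itself.
    """
    def neighbors(i):
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if math.hypot(xi - xj, yi - yj) <= eps]

    labels = [None] * len(points)
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        cluster_id += 1  # i is a core point: start a new cluster
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:  # j is also core: keep expanding
                queue.extend(j_neighbors)
    return labels

# Hypothetical residential locations of confirmed cases (meters).
cases = [(0, 0), (10, 0), (25, 5), (500, 500), (510, 505), (2000, 2000)]
labels = dbscan(cases, eps=30.0, min_pts=2)
cluster_sizes = Counter(label for label in labels if label != NOISE)
```

Here `cluster_sizes` maps each cluster id to its number of cases, which corresponds to the cluster size used as the outcome variable. Whether a point counts toward its own neighborhood changes what MinPts = 2 means, and the source does not state which convention was applied.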

Structural equation modeling
Structural equation modeling (SEM) refers to a broad and flexible framework and a family of modeling techniques for analyzing relationships between measured variables and latent constructs. The intricate causal networks enabled by SEM characterize real-world processes better than simple correlation-based models. Therefore, SEM has been widely used in economics, sociology, and behavioral science, and it is particularly appropriate for analyzing complex relationships among many variables. SEM can generally be viewed as a combination of factor analysis and regression or path analysis (Hox & Bechger, 1999). In contrast to regression, SEM enables researchers to set up and reliably examine hypothetical relationships among theoretical constructs and those between the constructs and their observed indicators (Muthén, 2002).
SEM comprises two or more equations in the model, which differs from the single equation regression model with a single dependent variable and multiple independent variables. The two essential components of SEM are the measurement model and the structural model. The measurement model quantifies the loadings of measured variables (measurements) on their expected latent variables (constructs). In contrast, the structural model quantifies the assumed causation among a set of dependent and independent constructs simultaneously with the measurement model's estimation. SEM is often used to test theories or hypotheses that can be represented by a path diagram. The network of directional paths is postulated with direct and indirect effects among variables. A direct effect indicates the direct regulation of a response variable based on a causal variable, while an indirect effect implies the regulation is mediated through other variables.
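The distinction between direct and indirect effects can be made concrete with a small sketch. In a linear path model, an indirect effect is commonly computed as the product of the coefficients along the mediating path, and the total effect as the sum of the direct and indirect effects. The coefficients below are hypothetical values chosen for illustration and are not the estimates reported in this study; the variable names are likewise assumptions.

```python
def path_product(coefficients, path):
    """Multiply the path coefficients along consecutive edges of a path."""
    effect = 1.0
    for source, target in zip(path, path[1:]):
        effect *= coefficients[(source, target)]
    return effect

# Hypothetical standardized path coefficients (not the paper's estimates).
coefficients = {
    ("transport", "commerce"): 0.5,
    ("commerce", "cluster_size"): 0.4,
    ("transport", "cluster_size"): 0.1,
}

# Direct effect: the single edge from cause to response.
direct = path_product(coefficients, ["transport", "cluster_size"])
# Indirect effect: mediated through commercial prosperity.
indirect = path_product(coefficients, ["transport", "commerce", "cluster_size"])
total = direct + indirect
```

With these illustrative numbers, the mediated path contributes more to the total effect than the direct path does, which is the kind of pattern a path diagram makes visible.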
Various estimation procedures are available for SEM, but the most commonly used estimator is maximum likelihood (ML). ML yields estimates of the parameter vector θ that maximize the likelihood L that the empirical covariance matrix S is drawn from a population for which the model-implied covariance matrix Σ(θ) is valid (Reinartz, Haenlein, & Henseler, 2009; Schermelleh-Engel, Moosbrugger, & Müller, 2003). The log-likelihood (logL) function is expressed as:

\[ \log L = -\tfrac{1}{2}(N-1)\left\{ \log\lvert\Sigma(\theta)\rvert + \mathrm{tr}\left[S\,\Sigma(\theta)^{-1}\right] \right\} + c \]

where N is the sample size (the number of clusters in this study), tr is the trace of a matrix, and c is a constant containing terms of the Wishart distribution that do not change once the sample is given (Schermelleh-Engel et al., 2003). The maximization of logL is equivalent to the minimization of the function

\[ F_{ML} = \log\lvert\Sigma(\theta)\rvert - \log\lvert S\rvert + \mathrm{tr}\left[S\,\Sigma(\theta)^{-1}\right] - p \]

where F_ML is the fitting function evaluated at the final estimates and p is the number of observed variables. The ML estimator assumes that the joint distribution of variables in the model is a multivariate normal distribution.
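As a numerical illustration of the fitting function, the sketch below evaluates the commonly used ML discrepancy F_ML = log|Σ(θ)| − log|S| + tr[S Σ(θ)⁻¹] − p for small covariance matrices. That this standard form matches the paper's own expression is an assumption, the matrices are hypothetical, and the helper names are introduced only for this example. In practice an SEM package performs the minimization over θ; this sketch only evaluates the function for a given Σ.

```python
import math

def inverse_times_and_logdet(a, b):
    """Return (a^-1 b, log det a) using Gauss-Jordan elimination.

    Assumes a is symmetric positive definite, so every pivot is positive.
    """
    n = len(a)
    m = [list(a[i]) + list(b[i]) for i in range(n)]  # augmented matrix [a | b]
    logdet = 0.0
    for col in range(n):
        pivot = m[col][col]
        if pivot <= 0.0:
            raise ValueError("matrix is not positive definite")
        logdet += math.log(pivot)
        m[col] = [v / pivot for v in m[col]]
        for row in range(n):
            if row != col:
                factor = m[row][col]
                m[row] = [vr - factor * vc for vr, vc in zip(m[row], m[col])]
    return [r[n:] for r in m], logdet

def f_ml(s, sigma):
    """ML discrepancy between sample covariance s and model-implied sigma."""
    p = len(s)
    identity = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    sigma_inv_s, logdet_sigma = inverse_times_and_logdet(sigma, s)
    _, logdet_s = inverse_times_and_logdet(s, identity)
    trace = sum(sigma_inv_s[i][i] for i in range(p))  # tr(Sigma^-1 S) = tr(S Sigma^-1)
    return logdet_sigma - logdet_s + trace - p

# Hypothetical sample covariance and a model that forces zero covariance.
S = [[2.0, 0.5], [0.5, 1.0]]
sigma_independent = [[2.0, 0.0], [0.0, 1.0]]

perfect_fit = f_ml(S, S)               # zero up to rounding when Sigma equals S
misfit = f_ml(S, sigma_independent)    # positive when the model misses the covariance
```

The function is zero when the model reproduces S exactly and grows as the model-implied matrix departs from it, which is why minimizing it is equivalent to maximizing the log-likelihood.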

Results
The data on COVID-19 confirmed cases were extracted from the central area of Huangzhou, the main district of Huanggang prefecture, which has a population of 396.3 thousand and an area of 362.37 km². The anonymized COVID-19 confirmed cases at the initial stage of the outbreak, from 21 January 2020 to 18 February 2020, were converted into spatial points on the map of Huangzhou district through ArcGIS. As Fig. 4 shows, the spatial distribution of the 639 COVID-19 confirmed cases in Huangzhou is heterogeneous. Most of the confirmed cases are concentrated in the central area, where residents engage in many daily activities. Commercial activities are among the most active factors and play a vital role in people's congregating flows. According to the clustering result based on DBSCAN, a total of 190 clusters of confirmed COVID-19 cases were identified in the Huangzhou district. The distribution of clusters is depicted in Fig. 5. The numbers of different POIs are listed in Table 3. Because of limitations in data collection, the number of housing price points is limited. Therefore, spatial interpolation based on the available housing prices was performed to obtain the average housing price within the 1000 m radius of each cluster. The statistical surface of housing prices in the Huangzhou district is shown in Fig. 6, where the average housing price decreases as the color changes from red through yellow to green.
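The text does not state which spatial interpolation method was used for housing prices. As one common possibility, the sketch below uses inverse distance weighting (IDW), in which the estimate at a location is a weighted mean of sampled prices with weights that decay with distance. The choice of IDW, the power of 2, the function name, and the sample values are all assumptions made purely for illustration.

```python
import math

def idw(target, samples, power=2.0):
    """Inverse distance weighted estimate at target from (x, y, value) samples.

    Coordinates are assumed to be projected positions in meters.
    """
    tx, ty = target
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in samples:
        distance = math.hypot(x - tx, y - ty)
        if distance == 0.0:
            return value  # the target coincides with a sampled location
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical housing price samples: (x, y, price per square meter).
price_samples = [(0.0, 0.0, 8000.0), (100.0, 0.0, 6000.0)]

midpoint_price = idw((50.0, 0.0), price_samples)   # equally distant from both samples
at_sample_price = idw((0.0, 0.0), price_samples)   # exactly on the first sample
```

Averaging such estimates over locations within 1000 m of a cluster would give a per-cluster housing price covariate; kriging or another method would be an equally plausible reading of the text.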
Three indexes are used to examine the model's goodness-of-fit: the Comparative Fit Index (CFI) (Bentler, 1990), the Tucker Lewis Index (TLI) (Tucker & Lewis, 1973), and the Standardized Root Mean Square Residual (SRMR) (Jöreskog & Sörbom, 1981; Bentler, 1995). The rule-of-thumb minimum standard for the CFI and TLI is 0.90 (Schermelleh-Engel et al., 2003). The CFI and TLI values estimated in this study are 0.95 and 0.90, respectively, which meet this cutoff, so the model is acceptable. In addition, the estimated value of SRMR is 0.05, which is below the 0.10 threshold for an acceptable fit and at the 0.05 threshold for a good fit.
The SEM diagram depicts the relationships among variables regarding the built environment and the COVID-19 cluster size (see Fig. 7). The detailed estimation results of SEM are shown in Table 4. With regard to the measurement model of commercial prosperity, hypotheses h2 and h5 have been verified, whereas the remaining hypotheses are not supported. Therefore, commercial prosperity is conceptualized by POIs regarding food services and markets. The medical service is measured by the number of POIs regarding clinics and drugstores, which indicates that h7 and h8 are verified based on the estimation results. Likewise, the transportation infrastructure can be measured by the number of bus stop POIs and the road network length, supporting hypotheses h9 and h10. The loadings of the measurement model are listed in Table 3. Concerning the hypotheses for the relationships among COVID-19 cluster size and latent variables, H1-H4 are confirmed, whereas H5 is rejected. The COVID-19 cluster size is significantly and directly influenced by the surrounding commercial prosperity. Meanwhile, the COVID-19 cluster size is indirectly affected by medical service and transportation infrastructure through the mediation of commercial prosperity. Regarding the covariates of housing price and building density, H6, H7, H9, and H10 are supported.

Discussion
The heterogeneous spread of COVID-19 in urban built environments was detected through the residential locations of COVID-19 confirmed cases in the Huangzhou district of Huanggang. GIS techniques and methods were used to analyze localized infection distributions and to obtain more useful information on different dimensions of vulnerable locations. The valid results of the SEM estimation identify the influences of built environment attributes on the number of COVID-19 confirmed cases in the defined clusters, yielding neighborhood-level spatial information on the potential risk of community contagion. More specifically, urban disparities in commercial prosperity, medical service, and transportation infrastructure at the community level play an essential role in the number of COVID-19 confirmed cases in clusters. The relationships between latent variables and measured factors are expected to benefit the early implementation of public health detection, interventions, and responses to disease threats (Saarloos et al., 2009; Shaw et al., 2020; Rahmani & Mirmahaleh, 2021). The hypotheses of the measurement models of the three latent variables in SEM were examined, and the three latent variables were constructed from the manifest items. Commercial prosperity is conceptualized by two POI indicators, the numbers of markets and food services, with loadings of 0.888 and 0.915, respectively. An increase in the numbers of markets and food services will indirectly increase the COVID-19 cluster size in the surrounding neighborhoods through enhanced commercial prosperity. The number of bus stops and the length of the road network conceptualize the transportation infrastructure around each COVID-19 cluster location.
Well-developed transportation infrastructure may result in higher accessibility (Wang & Mu, 2018; Wang, Chen, Mu, & Zhang, 2020) and more prosperous commercial conditions, which may, in turn, lead to a higher likelihood of population movement in the communities surrounding clusters of COVID-19 confirmed cases. Although the direct relationship between medical service and COVID-19 cluster size is not significant, an increase in the number of clinic and drugstore POIs raises the level of medical service, which in turn increases the COVID-19 cluster size through the mediation of commercial prosperity.
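As a supplementary check on the measurement model, the two loadings reported above for commercial prosperity (0.888 and 0.915) can be converted into the average variance extracted (AVE) and the composite reliability (CR). The sketch below assumes that these loadings are standardized and applies the commonly used Fornell-Larcker formulas. Neither statistic is reported in this study, so the calculation is illustrative only.

```python
def average_variance_extracted(loadings: list[float]) -> float:
    """Mean of the squared standardized loadings of one latent variable."""
    return sum(value ** 2 for value in loadings) / len(loadings)


def composite_reliability(loadings: list[float]) -> float:
    """(sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances).

    Assumes standardized loadings, so each error variance is 1 - loading^2.
    """
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1 - value ** 2 for value in loadings)
    return squared_sum / (squared_sum + error_variance)


# Loadings of the market and food service indicators, as reported in the text.
commercial_prosperity_loadings = [0.888, 0.915]

ave = average_variance_extracted(commercial_prosperity_loadings)
cr = composite_reliability(commercial_prosperity_loadings)

print(f"AVE = {ave:.3f}, CR = {cr:.3f}")
```

Under these assumptions, both values lie above the conventional rules of thumb (AVE above 0.5, CR above 0.7), which is consistent with treating the two indicators as measures of a single construct.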
The statistically significant associations between the theoretical constructs and COVID-19 cluster size underscore the importance of residents' behavior in controlling human-to-human transmission risk and highlight the need to better understand high-risk behaviors in specific urban spaces. We speculate that a booming commercial atmosphere and well-developed transportation infrastructure portend more urban mobility in a specific urban area, which may lead to more social contacts than in other types of places. Due to the disparities in commercial prosperity and transportation infrastructure, some urban areas may be confronted with more intense stress from the spread of COVID-19 and thus face a higher potential risk of infection (Shach-Pinsly, 2019; Wang, 2021).
In areas of high building density in Huangzhou, residents' mobility may be enhanced by better supporting facilities. However, building density correlates inversely with commercial prosperity. Housing price positively influences the level of medical service and commercial prosperity at the community level, which further influences the COVID-19 cluster size. The effects of housing prices, to some extent, reflect the impact of residents' socioeconomic characteristics on the possibility of infection. The activities and movements of residents living in a residential community with relatively high housing prices in Huangzhou could be affected by convenient facilities, commercial prosperity, and sophisticated medical resources. In Huangzhou, the residential districts with relatively high prices are located mainly in the central city, where the population density and visitor flow are higher than in other places. From this perspective, people who live in these areas may face a relatively higher risk of infection during the COVID-19 epidemic.

Conclusion
The present COVID-19 pandemic is an international public health crisis. This study reveals the relationships between built environment attributes and the size of confirmed COVID-19 clusters in the central district. Huangzhou, the studied district, is a local hub of travel and trade that is closely connected with the provincial capital (Wuhan) and with the other districts and counties in Huanggang prefecture. It has a remarkably high potential to become a sub-epicenter influencing the surrounding regions. However, compared with Wuhan, less attention was paid to Huangzhou at the early stage of the COVID-19 outbreak.
The results of this study, for the first time, identify the clusters of aggregated infected cases at the community level and reveal the direct positive effect of commercial prosperity on the risk of contagion clustering, measured by the number of COVID-19 confirmed cases in a cluster. Well-supplied medical service is associated with high commercial prosperity at the community level, which indirectly and positively affects the COVID-19 cluster size. Community transportation infrastructure is key to developing convenient commercial and medical facilities, which in general ensure citizens' quality of life (Peng, Feng, & Timmermans, 2019; Peng, Feng, & Timmermans, 2019; Gan, Yang, Feng, & Timmermans, 2020). However, when it comes to the prevention and control of contagion, a well-developed built environment with high population density is prone to a high transmission rate of the epidemic.
This paper has interpreted the influence of urban settings and spatial attributes and how they correlate with the clustering of infection. Given the significant effects of urban commercial prosperity and transportation infrastructure on the aggregation of COVID-19 confirmed cases, ad hoc policies for prevention and control are expected. To some extent, during a pandemic outbreak, urban settings and the functions that rely on physical urban spaces should be taken into account in the response so that prevention and containment measures are effective.
To date, more than 55 % of the world's population resides in urban areas, and this proportion is still increasing and is expected to reach 68 % by 2050 (United Nations, 2018). Infectious diseases may emerge in urban areas again in the future. Since epidemics emerge in urban settings and propagate rapidly in the context of urbanization, built environment attributes are crucial to the transmission of emerging infectious diseases (Lee et al., 2020; Wang, 2021). Containment measures in urban areas with specific facilities are essential to prevent transmission at the community level. Furthermore, apart from the vaccines that were still under development and/or validity testing at the time of this research, understanding the spatial characteristics of COVID-19 spread is critical to its mitigation, especially when considering the development of future resilient cities.

Declaration of Competing Interest
The authors declare that there is no conflict of interest.