Height–diameter allometry for the management of city trees in the tropics

Trees are important components of urban greenery because of their large stature and longevity, and their ability to enhance the environmental quality of city landscapes. However, benefits and hazards associated with trees depend on their size, which changes over time and varies among species. While urban trees are often measured during routine management, the full value of these data is rarely realised. Our study uses nation-wide inspection records from Singapore to develop allometric models for 54 species grown in the urban tropics (n = 345 794), a region that is poorly represented in allometric literature. We use the height–diameter relationship to demonstrate how analyses of existing datasets can be used to support decisions on tree inspection and maintenance. The accuracy of models developed separately for each species (single-species models) and using the pooled data for all species (mixed-effects model) was compared. Model outputs and derived metrics were used to detect height outliers and priority regions that may require greater inspection effort, which we summarise using spatial visualisations and an online web application. Model parameters also varied according to each species’ pruning intensity and maximum height, and can thus provide a useful heuristic when selecting species to plant. Such data-driven approaches have the potential to support both management and research, though changes to workflows may be needed to take advantage of new sources of data. Integrating multiple datasets into decision-making will require expertise across multiple disciplines, and coordinated action among stakeholders. By sharing the code used to develop the allometric models in a new open-source R package ‘allometree’, we hope to promote reproducibility and facilitate the application of allometric equations to other tree parameters, study regions and management objectives.


Introduction
Urban trees provide many benefits, both direct and indirect, which are increasingly recognised by both researchers and practitioners (Van den Berg 2017). These benefits, known as 'ecosystem services', include environmental regulation, resource provision, increased biodiversity and aesthetic enhancement (Willis and Petrokofsky 2017). The benefits provided by individual trees accumulate across an urban forest, leading to substantial effects that help make cities more liveable and adaptable to climate change (Watkins et al 2020).
In the course of their work, foresters and landscape managers generate detailed information about trees in the form of tree inventories. This information is potentially useful in research and management (Jim and Tan 2017), and many studies have compared such data across greening projects and cities worldwide (McPherson 2014, Berland and Lange 2017). One common use of inventory data is to establish allometric relations between easily measurable parameters such as trunk diameter and other parameters such as tree biomass or total leaf area that can only be measured by destructive sampling. These allometric relations can then be used to study not only how the biomass and structure of trees change as they grow, but also how the performance and benefits of trees change over time (McPherson and Peper 2012, McPherson et al 2016). Applications of this approach include carbon accounting (Ngo and Lum 2018) and estimating the impact of tree shade upon the energy balance of buildings (Ko et al 2015).
While such performance-based approaches to city management are being increasingly adopted (Cortinovis and Geneletti 2020), there remains untapped potential in many urban tree inventory datasets. Several tools and databases have been developed to quantify the benefits of urban trees (e.g. EcoLayers 2012, Citree 2015, Urban Forest Ecosystems Institute 2017, U.S. Department of Agriculture Forest Service 2019), but these have mainly been used to raise awareness of the value of urban trees rather than to improve their management (Pincetl 2010, Pincetl et al 2013, Laurans et al 2013). This is partly because these tools do not provide the detailed information on maintenance requirements and potential hazards that is necessary to ensure that tree planting programmes are cost-effective and safe (Song et al 2018).
Allometric scaling relationships between tree height and trunk diameter can be used to characterise species-specific patterns of growth, structural support and failure risk (King et al 2006, Hulshof et al 2015), all of which are important considerations when planting trees in urban areas. From a theoretical perspective, the height-diameter relationship is thought to reflect an evolutionary trade-off between growth and survival for trees growing in natural forests; for example, rapid height growth might allow a young tree to reach the forest canopy sooner, but its thinner trunk could make it more vulnerable to strong winds (King 1996). Since height-diameter allometry is highly influenced by each species' growth strategy and the available environmental resources (Pacala and Tilman 1994), this allometric relationship may provide a means to identify trees at risk of structural failure (Petty and Swain 1985, Mattheck et al 2002). The ability to detect individual trees that exhibit growth atypical of the species could be useful, for example, in guiding inspection, pruning and removal priorities.
Allometric relationships between height, diameter and biomass are used for a range of ecological purposes, including to estimate carbon storage in forests (Hulshof et al 2015). Such relationships can vary widely, even within a species, according to environmental conditions (Wang et al 2006, Feldpausch et al 2011, Hulshof et al 2015) and management (McHale et al 2009). In a city, where trees must be regularly pruned for safety reasons (Harris 1975), management procedures need to take account of factors such as wood density and tree growth form (King et al 2006). Such factors, along with associated pruning measures, may also influence height-diameter allometry in cities. There have been several studies of height-diameter relationships in urban trees (McPherson 2014, Berland and Lange 2017), but most of these have been of broadleaved, deciduous species in temperate regions. In contrast, our knowledge of tree allometry in the tropics comes largely from studies in natural environments (Osunkoya et al 2007, Feldpausch et al 2011), which may not be applicable to trees growing in urban areas (McHale et al 2009, Ngo and Lum 2018). Given that many tropical regions are urbanising rapidly (Song et al 2017), it is important to fill this knowledge gap on tree allometry in tropical cities (Roy et al 2012, Song et al 2018).
In this study, we use nation-wide inspection records from Singapore to develop height-diameter allometric models for 54 tropical tree species, and show how these models may be applied to support decisions on tree inspection and maintenance. We examine the performance and application of two modelling approaches: developing separate models for individual species (single-species models); and pooling data to develop models with species as random effects (mixed-effects model). We test the allometric variability between species, and between different types of plant attributes, for the purpose of species selection when planting trees. Finally, we show how parameters and metrics derived from the models can be visualised spatially to detect priority regions for subsequent manual inspection. The framework and code developed here can be applied to quantify allometric equations for other tree parameters, or for other study regions and management objectives, and we are therefore sharing it as an open-source R package.

Urban tree data
Data were obtained from the Singapore National Parks Board (NParks), a government statutory board responsible for managing greenery, including urban street trees. The inventory dataset represented a 'snapshot' of trees in 2018. It contained 389 025 trees belonging to 69 species, with information on height (estimated as height ranges), girth (measured with a diameter tape at breast height) and geo-location. Tree heights were converted to the middle value of the given height range, and girth measurements were converted to diameters. All analyses were performed in the software R 3.6.0 (R Core Team 2019). The dataset was filtered to 345 794 trees (54 species, each n ≥ 100; details in supplementary information).
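The data preparation steps described above can be sketched as follows. This is a minimal Python illustration only (the study itself used R): the height classes, species names and records are hypothetical, and the girth-to-diameter conversion assumes a circular trunk cross-section.

```python
import math
from collections import Counter

# Hypothetical height classes in metres (lower, upper); the actual classes
# used in the inventory are not given in the text.
HEIGHT_CLASSES = {"0-6": (0.0, 6.0), "6-12": (6.0, 12.0), "12-18": (12.0, 18.0)}


def height_midpoint(height_class):
    """Return the middle value of a height range."""
    lower, upper = HEIGHT_CLASSES[height_class]
    return (lower + upper) / 2.0


def girth_to_diameter(girth_m):
    """Convert trunk girth (circumference) to diameter, assuming a circular trunk."""
    return girth_m / math.pi


def filter_species(records, min_n=100):
    """Keep only records of species with at least `min_n` trees."""
    counts = Counter(r["species"] for r in records)
    return [r for r in records if counts[r["species"]] >= min_n]


# Hypothetical inventory: 120 trees of one species, 30 of another.
records = (
    [{"species": "Species A", "height_class": "6-12", "girth_m": 1.2}] * 120
    + [{"species": "Species B", "height_class": "0-6", "girth_m": 0.4}] * 30
)
kept = filter_species(records)
prepared = [
    {
        "species": r["species"],
        "height_m": height_midpoint(r["height_class"]),
        "diameter_m": girth_to_diameter(r["girth_m"]),
    }
    for r in kept
]
print(len(prepared), prepared[0]["height_m"], round(prepared[0]["diameter_m"], 3))
```

Only the species with at least 100 records survives the filter, mirroring the n ≥ 100 threshold used in the study.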

Allometric model development
A wide variety of models have been used to estimate the growth and allometry of forest trees (Pretzsch et al 2015, Hulshof et al 2015). Our study focuses on intuitive empirical models that may be easily adopted by practitioners working in an urban context, while bearing in mind that tree heights can be reduced by pruning. For the purpose of reproducibility and use beyond our study context, we developed an open-source R package 'allometree', which can be found at https://xp-song.github.io/allometree (Song and Lai 2020). The relationship between height and diameter was examined using six allometric linear equations used for urban trees (McPherson et al 2016):
Linear: $y_i = a + b x_i + \epsilon_i / \sqrt{w_i}$ (1)
Quadratic: $y_i = a + b x_i + c x_i^2 + \epsilon_i / \sqrt{w_i}$ (2)
Cubic: $y_i = a + b x_i + c x_i^2 + d x_i^3 + \epsilon_i / \sqrt{w_i}$ (3)
Quartic: $y_i = a + b x_i + c x_i^2 + d x_i^3 + e x_i^4 + \epsilon_i / \sqrt{w_i}$ (4)
Log-log: $\ln(y_i) = a + b \ln(\ln(x_i + 1)) + \epsilon_i / \sqrt{w_i}$ (5)
Exponential: $\ln(y_i) = a + b x_i + \epsilon_i / \sqrt{w_i}$ (6)
where $y_i$ = height of individual tree $i$ ($i = 1, 2, 3, \ldots, n$; $n$ = number of observations); $x_i$ = diameter; $a, b, c, d, e$ = parameters to be estimated; $\epsilon_i$ = normally distributed error term; and $w_i$ = known weight that takes one of the four following forms: $w_i = 1$, $w_i = 1/\sqrt{x_i}$, $w_i = 1/x_i$ or $w_i = 1/x_i^2$.
Equations (1)-(6) were used to fit linear regressions for each species (54 single-species models), as well as to fit a linear mixed model with species specified as a random effect on all parameters (R package 'lme4' v1.1-21). The mixed-effects model was used to estimate errors at both the species and population levels, and to test allometric variability among different species. The best-fit model across equations (1)-(6) was selected based on the lowest bias-corrected Akaike's information criterion (AICc) (Burnham and Anderson 2004). To make the AICc values of models with a transformed response variable (equations (5) and (6)) comparable to those of untransformed models, $\ln(y_i)$ was multiplied by the geometric mean height ($\dot{y}$) of each species when using the single-species models, and of all species when using the mixed-effects model (Draper and Smith 1998, Xiao et al 2011, McPherson et al 2016). Observations with a Cook's distance more than four times the mean for the respective best-fit models were excluded as outliers (Cook 1979).
The filtered data were subsequently refitted to the best-fit single-species (n = 336 796) and mixed-effects (n = 329 531) models. These models were then used to predict the height of each species across their respective diameter ranges (details in supplementary information, available online at https://stacks.iop.org/ERL/15/114017/mmedia).
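To illustrate the model-selection step, the following Python sketch (not the 'allometree' implementation, which is in R) fits four of the candidate forms by ordinary least squares and ranks them by AICc. It makes several assumptions: all weights are set to 1, the log-log and exponential forms follow McPherson et al (2016) as ln(y) = a + b ln(ln(x + 1)) and ln(y) = a + bx, the data are synthetic, and all function names are hypothetical. Log-transformed responses are multiplied by the geometric mean height so that residual sums of squares are comparable across models.

```python
import math
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def ols(X, y):
    """Least squares via the normal equations; returns (coefficients, RSS)."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    rss = sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2 for row, yi in zip(X, y))
    return beta, rss


def aicc(rss, n, k):
    """Bias-corrected AIC for a Gaussian fit with k parameters (incl. variance)."""
    return n * math.log(rss / n) + 2 * k + 2 * k * (k + 1) / (n - k - 1)


def compare_models(x, y):
    """Return AICc per candidate equation (weights fixed at 1 in this sketch)."""
    n = len(y)
    gm = math.exp(sum(math.log(v) for v in y) / n)  # geometric mean height
    log_y = [gm * math.log(v) for v in y]  # rescaled so RSS is comparable
    designs = {
        "linear": ([[1.0, xi] for xi in x], y),
        "quadratic": ([[1.0, xi, xi**2] for xi in x], y),
        "log-log": ([[1.0, math.log(math.log(xi + 1.0))] for xi in x], log_y),
        "exponential": ([[1.0, xi] for xi in x], log_y),
    }
    scores = {}
    for name, (X, response) in designs.items():
        _, rss = ols(X, response)
        scores[name] = aicc(rss, n, k=len(X[0]) + 1)
    return scores


# Synthetic trees generated from the log-log form with small multiplicative noise.
rng = random.Random(42)
x = [5.0 + 95.0 * i / 199 for i in range(200)]  # diameters (cm, hypothetical)
y = [math.exp(1.5 + math.log(math.log(xi + 1.0)) + rng.gauss(0, 0.03)) for xi in x]
scores = compare_models(x, y)
best = min(scores, key=scores.get)
print(best)
```

The cubic and quartic forms, the three non-unit weights and the Cook's distance filter are omitted for brevity; they would extend `designs` and the fitting step in the same pattern.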

Model validation
We performed ten-fold cross-validations across the full dataset to compute averaged goodness-of-fit statistics R² and root mean square error (RMSE) for predicted heights (Barnston 1992). The single-species and mixed-effects models produced a median R² of 0.42 and 0.43, and median RMSE of 2.05 m and 2.00 m, respectively. Further, we validated our model against another independently collected dataset (see supplementary table S4). Prediction errors were similar to those produced by cross-validation (details in supplementary figures S5, S6 and tables S2-S4).
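The cross-validation procedure can be sketched in Python as below. This is an illustration under stated assumptions rather than the study's code: a single log-log curve of the assumed form ln(y) = a + b ln(ln(x + 1)) is fitted to synthetic data, back-transformation bias is ignored, and the function names are hypothetical.

```python
import math
import random


def fit_loglog(x, y):
    """Fit ln(y) = a + b * ln(ln(x + 1)) by simple least squares."""
    u = [math.log(math.log(xi + 1.0)) for xi in x]
    v = [math.log(yi) for yi in y]
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    b = sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v)) / sum((ui - mu) ** 2 for ui in u)
    return mv - b * mu, b


def predict_loglog(a, b, x):
    """Predict heights on the original scale from fitted a and b."""
    return [math.exp(a + b * math.log(math.log(xi + 1.0))) for xi in x]


def k_fold_cv(x, y, k=10, seed=0):
    """Return the mean R^2 and mean RMSE of held-out height predictions."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal-sized folds
    r2s, rmses = [], []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        a, b = fit_loglog([x[i] for i in train], [y[i] for i in train])
        obs = [y[i] for i in fold]
        pred = predict_loglog(a, b, [x[i] for i in fold])
        mean_obs = sum(obs) / len(obs)
        ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
        ss_tot = sum((o - mean_obs) ** 2 for o in obs)
        r2s.append(1.0 - ss_res / ss_tot)
        rmses.append(math.sqrt(ss_res / len(obs)))
    return sum(r2s) / k, sum(rmses) / k


# Synthetic data: 300 trees with multiplicative noise around a log-log curve.
rng = random.Random(1)
x = [5.0 + 95.0 * i / 299 for i in range(300)]
y = [math.exp(1.5 + math.log(math.log(xi + 1.0)) + rng.gauss(0, 0.1)) for xi in x]
mean_r2, mean_rmse = k_fold_cv(x, y)
print(round(mean_r2, 2), round(mean_rmse, 2))
```

Each tree is held out exactly once, so the averaged statistics describe prediction of unseen trees rather than fit to the training data.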

Allometric scaling of mixed-effect model parameters
The log-log (equation (5)) mixed-effects model provided the best fit to the pooled data from all species. We thus used this model to examine the variability in the allometric scaling of model parameters among species, and among factors such as pruning intensity, maximum height and wood density. The effect of each of these factors on parameters a and b in the model was quantified; the Kruskal-Wallis test was used to test whether differences were significant (Mckight and Najab 2010).
An expert survey of three senior arborists was used to classify pruning intensity: (1) 'non-structural' pruning removes dead branches and reduces crown weight; (2) 'structural-I' pruning is aimed at height reduction by making pruning cuts at unions between branches; (3) 'structural-II' pruning reduces height by cutting off large structural branches at locations without unions. The intraclass correlation coefficient (ICC) across the surveys was 0.56 (P < 0.001); the most frequently selected category for each species was used. Maximum height for each species was defined as either 'medium' or 'tall' based on whether there was at least one individual in the tallest height class (>24 m), and results were cross-checked with existing databases (National Parks Board 2019). Finally, the mean wood density for each species was determined using datasets from the publicly available TRY database (Kattge et al 2011) and local surveys (Ngo and Lum 2018). Species were categorised as having either 'high' or 'low' wood density based on the median quantile (further details on plant attributes in supplementary table S1 and figure S2).
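As an illustration of the Kruskal-Wallis comparison, the Python sketch below computes the H statistic for two groups of species (for example, 'medium' versus 'tall'). The parameter values are hypothetical, no tie correction is applied, and the p-value uses the chi-square approximation with one degree of freedom, which is only a rough guide for samples this small.

```python
import math


def average_ranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def kruskal_wallis_two_groups(group1, group2):
    """Kruskal-Wallis H (no tie correction) and p-value for two groups (df = 1)."""
    pooled = list(group1) + list(group2)
    n = len(pooled)
    ranks = average_ranks(pooled)
    r1 = sum(ranks[: len(group1)])
    r2 = sum(ranks[len(group1):])
    h = 12.0 / (n * (n + 1)) * (r1**2 / len(group1) + r2**2 / len(group2)) - 3.0 * (n + 1)
    h = max(h, 0.0)  # guard against tiny negative values from rounding
    p = math.erfc(math.sqrt(h / 2.0))  # chi-square survival function, df = 1
    return h, p


# Hypothetical values of parameter a for 'medium' and 'tall' species.
medium = [1.8, 1.9, 2.0, 2.1, 2.2]
tall = [2.3, 2.4, 2.5, 2.6, 2.7]
h, p = kruskal_wallis_two_groups(medium, tall)
print(round(h, 3), p < 0.05)
```

With three pruning categories the same rank-sum formula applies with one term per group and two degrees of freedom.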

Spatial visualisation of trees
Spatial visualisations of results from the allometric models were used to detect height outliers and priority regions for inspection. To illustrate this, tree height was predicted from the measured diameter of each tree in the full dataset (n = 345 794), using the best-fit mixed-effects model (equation (5)). The RMSE of predicted heights and the standardised effect sizes (between predicted and measured heights) were calculated. Across the city, trees were mapped based on the median value for all trees within each cell in a hexagonal spatial grid for (1) parameter a, (2) parameter b, and (3) the RMSE value. Finally, each tree was mapped according to its (4) standardised effect size, which represents the extent to which it is taller or shorter than predicted from its diameter.
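The mapping step can be sketched in Python as follows. Two simplifications are assumed here and are not taken from the study: the standardised effect size is computed as the residual divided by the standard deviation of all residuals, and square grid cells stand in for the hexagonal grid. Coordinates and heights are hypothetical.

```python
import math
import statistics
from collections import defaultdict


def standardised_effect_sizes(observed, predicted):
    """Residuals (observed - predicted) divided by their standard deviation."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    sd = statistics.stdev(residuals)
    return [r / sd for r in residuals]


def cell_medians(points, values, cell_size):
    """Median of `values` within square grid cells (a stand-in for hexagons)."""
    cells = defaultdict(list)
    for (x, y), v in zip(points, values):
        cells[(math.floor(x / cell_size), math.floor(y / cell_size))].append(v)
    return {cell: statistics.median(vs) for cell, vs in cells.items()}


# Hypothetical trees: measured and predicted heights (m) and map coordinates (m).
observed = [10.0, 12.0, 9.0, 20.0]
predicted = [11.0, 11.0, 10.0, 14.0]
points = [(50.0, 50.0), (120.0, 40.0), (130.0, 60.0), (160.0, 90.0)]

ses = standardised_effect_sizes(observed, predicted)
medians = cell_medians(points, ses, cell_size=100.0)
tallest_outlier = max(range(len(ses)), key=lambda i: ses[i])
print(sorted(medians), tallest_outlier)
```

Positive values flag trees that are taller than predicted for their diameter, negative values flag shorter trees; the per-cell medians summarise where such trees cluster.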

Comparison between model types
In the single-species analyses, the R² values of the models ranged from 0.20 to 0.70, and were higher than those of the mixed-effects model for 30 of the 54 species (figure 2; details in supplementary table S2). However, RMSE was larger for 39 species, with prediction errors being particularly high for the single-species models of Syzygium myrtifolium and S. zeylanicum (figures 1 and 2). Species with a larger sample size did not necessarily have a smaller prediction error (figure 1).
For the mixed-effects model, the log-log model (equation (5); $w_i = 1$) provided the best fit (details in supplementary table S3 and figure S3). Fixed effects alone explained 67% of the variance in height (marginal R² = 0.67), and this was increased to 81% by including the random effects (conditional R² = 0.81) (Nakagawa and Schielzeth 2013). The adjusted ICC showed that the 'species' random effect explained about 41% of the variance in height across species.
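The relationship between these three statistics can be illustrated with a simple variance partition in the style of Nakagawa and Schielzeth (2013). In the Python sketch below, the variance components are hypothetical values chosen only so that the marginal and conditional R² match the reported 0.67 and 0.81; they are not estimates from the study, and the partition ignores the random slopes present in the fitted model.

```python
def variance_partition(var_fixed, var_random, var_residual):
    """Marginal R^2, conditional R^2 and adjusted ICC from variance components."""
    total = var_fixed + var_random + var_residual
    marginal_r2 = var_fixed / total  # fixed effects only
    conditional_r2 = (var_fixed + var_random) / total  # fixed + random effects
    adjusted_icc = var_random / (var_random + var_residual)  # random share of non-fixed variance
    return marginal_r2, conditional_r2, adjusted_icc


# Hypothetical variance components (see the assumptions stated above).
marginal, conditional, icc = variance_partition(0.67, 0.14, 0.19)
print(round(marginal, 2), round(conditional, 2), round(icc, 2))
```

Under these illustrative values the adjusted ICC comes out near 0.42, close to the reported figure of about 41%; the small difference is to be expected from rounding and from the simplifications noted above.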

Factors that explain variability in allometric scaling
The best-fit mixed-effects model (equation (5)) was used to assess allometric scaling among species (figure 3), as well as the effects of pruning intensity and related plant attributes (figure 4). In this model, parameter a provided an indication of overall height, while parameter b provided an indication of the species' growth strategy (figure 3(a)). Trees that were initially slender and rapidly reached their maximum height had a low value of b, while trees that showed a more balanced increase in both height and diameter (i.e. less skewed) had a higher value of b (figure 3(a)).
The parameters a and b were positively correlated (R² = 0.83), with very few species exhibiting extremes of either 'high-a, low-b' or 'low-a, high-b' (figure 3(b)). In general, 'high-a' species were likely to be taller at maturity and required higher pruning intensity (figure 4). A higher value for parameter b was also associated with taller trees at maturity, but its association with other attributes was less clear (figure 4). There were no significant associations across species between wood density and parameters a and b (figure 4).

Spatial mapping of tree structure
Using information from the allometric models, we mapped spatial variation in tree populations that tended to be taller, as defined by parameter a (figure 5(a)), and trees that scaled in either a skewed or balanced manner between height and diameter, as defined by parameter b (figure 5(b)). Mapping the RMSE between the predicted and observed values highlights areas where individual tree heights deviated greatly from the allometric expectation for the species (figure 5(c)). Finally, we identified individual trees that were either shorter or taller than predicted, while taking into account differences between species (figure 5(d)).

Considerations for the choice of allometric model
In natural environments, allometric relationships between height and diameter are often described using the power function $y = ax^b$ (Brym and Ernest 2018). This produces a linear relationship when the logarithms are taken for both y and x, which closely resembles equation (5) in our study, adopted from McPherson et al (2016). Because they are more easily interpreted, practitioners tend to prefer linear models over alternatives such as generalised non-linear models. Other functions have been shown to provide a slightly better fit in temperate (Wang et al 2006) and moist tropical forests (Banin et al 2012), although the differences are generally minor (Hulshof et al 2015); the power function has been shown to perform best for tropical trees worldwide, especially in the wet tropics (Purves et al 2007, Feldpausch et al 2011).
Growing conditions in urban areas vary widely with soil conditions, surrounding infrastructure and level of management (Brym and Ernest 2018). Unlike in closed forest, growth and plant resource allocation are usually not limited by light (Erwin 1988). Because of pruning, however, a wider range of polynomial relationships may be needed to describe the growth of urban trees, making it difficult to generalise about the influence of life history strategy upon height-diameter allometry. For instance, species that are pruned more intensely, such as Khaya grandifoliola and Casuarina equisetifolia, tend to show a dip or plateau in the single-species allometric curves (figure 2). Conversely, non-vertical pruning and compact planting arrangements of S. myrtifolium and S. zeylanicum along roads can produce very tall, narrow hedges (exponential relationship; equation (6)). Such effects of management upon tree form could explain the low accuracy of some single-species models, despite relatively large sample sizes (figures 1 and 2). In these cases, using the mixed-effects model avoids extreme model parameterisation, and can be expected to produce more robust predictions for 'average' cases.
Comparisons between the single-species and mixed-effects models clearly reveal the trade-off between precision and generality. Whereas models fitted separately for each species provide greater precision, they may be sensitive to outliers. Such models, especially if they include exponential or polynomial terms, usually do not extrapolate well; for example, predictions cannot be extrapolated below the minimum diameter for species with polynomial models (equations (1)-(4)), as these would yield a non-zero height intercept. On the other hand, the mixed-effects model assumes that the best-fit equation represents a shared biology across all trees. Since extreme values are constrained by the overall mean, parameter estimates for each species tend to be more robust against both outliers and incomplete sampling along the diameter range. If growth form, environmental and management conditions are relatively similar, the 'average' model may be used to estimate height-diameter relationships for new species that are not represented in the dataset. We thus recommend the use of the single-species models if higher sensitivity to unusual growth or management conditions is required, and the mixed-effects model for extrapolation beyond the diameter range, use with new species, and general-purpose predictions.
Figure 2. Allometric curves for the single-species and mixed-effects models predicting the relationship between tree height and diameter. The numbers at the left and right corners within each plot are the respective R² of the single-species and mixed-effects models (see supplementary figure S5 for plots showing RMSE). Shaded regions along the curves denote a 95% prediction interval, and raw data are shown as points. Dotted lines denote extrapolations beyond the maximum diameter and height ranges for each species. Further details are in supplementary tables S1 to S4.
The prototype web application to explore the height-diameter relationship of each species can be found at https://xpsong.shinyapps.io/allometree-sg/.
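The extrapolation issue noted above for polynomial models can be seen in a small numerical example. The Python sketch below uses hypothetical parameter values and assumes the log-log form ln(y) = a + b ln(ln(x + 1)) of McPherson et al (2016); it is not output from the fitted models.

```python
import math


def quadratic_height(x, a, b, c):
    """Polynomial (quadratic) curve: height equals a at zero diameter."""
    return a + b * x + c * x**2


def loglog_height(x, a, b):
    """Back-transformed log-log curve: y = exp(a) * ln(x + 1) ** b."""
    return math.exp(a) * math.log(x + 1.0) ** b


# Hypothetical parameters; diameters in arbitrary units.
for diameter in (0.0, 1.0, 5.0):
    print(
        diameter,
        round(quadratic_height(diameter, 4.0, 0.3, -0.001), 2),
        round(loglog_height(diameter, 1.5, 1.0), 2),
    )
```

At zero diameter the quadratic curve still predicts a height equal to its intercept, whereas the log-log curve passes through the origin for any positive b, which is one reason the latter behaves more sensibly when extrapolating to small trees.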

Using allometric parameters for urban tree species selection
An examination of variability in the log-log model (equation (5)) parameters showed differences in each species' overall height (parameter a) and allometric scaling (parameter b), based on growth and management conditions in tropical Singapore (figure 3). Parameter values also varied significantly based on species' maximum height and pruning intensity (figure 4), and may thus provide a useful heuristic for selecting species. While parameter variability based on wood density was unclear (figure 4), wood density is known to directly affect trunk diameter, biomass and the height-diameter relationship (Ducey 2012, Iida et al 2012). However, despite its role in structural support, wood density is not a useful indicator of failure risk on a large scale, owing to its relatively complex and destructive method of measurement (Ngo and Lum 2018). Rather, information on each species' wood density may be used alongside outputs from height-diameter models to provide a more holistic assessment of hazard risk. Future work could investigate the usefulness of wood density as an indicator for species selection in relation to urban planting criteria (Watkins et al 2020).
Choosing species with larger values for parameter a can help maximise vertical coverage (e.g. for aesthetic appeal, higher canopy for shading), while species with low values of a may be planted beneath taller species to produce 'multi-tiered' layering of tree canopies (figure 3). Species with high values of both a and b are taller and show a more balanced increase in both height and diameter, while 'high-a, low-b' species may be tall but also slender when young (figure 3); species within these categories (e.g. K. grandifoliola, K. senegalensis, C. equisetifolia, S. grande) may have to be pruned more intensely (figures 3 and 4; details in supplementary table S1). In general, if structural stability and low maintenance of tree heights are the primary goals for a planting site, species with 'moderate-to-high-b' (i.e. balanced allometric scaling) and 'low-to-moderate-a' (i.e. moderate height) are recommended (figure 3(b); e.g. upper left quadrant).
Figure 5. … (5)). The RMSE in (c) is shown using a percentile-based colour scale. The SES in (d) represents the extent that trees are abnormally taller (positive) or shorter (negative) than predicted by their measured diameter; their median values within hexagonal regions are in supplementary figure S7. A web application to explore the spatial distribution of trees can be found at https://xpsong.shinyapps.io/allometree-sg/.
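The species-selection heuristic can be expressed as a simple classification of species by their a and b values. The Python sketch below is illustrative only: the species names and parameter values are hypothetical, and a median split is used as a simplification of the 'low', 'moderate' and 'high' gradations discussed in the text.

```python
import statistics


def classify_species(params):
    """Label each species by whether a and b lie above or below the median."""
    median_a = statistics.median(a for a, _ in params.values())
    median_b = statistics.median(b for _, b in params.values())
    labels = {}
    for name, (a, b) in params.items():
        a_label = "high-a" if a > median_a else "low-a"
        b_label = "high-b" if b > median_b else "low-b"
        labels[name] = f"{a_label}, {b_label}"
    return labels


# Hypothetical (a, b) estimates for four species.
params = {
    "Species A": (2.6, 1.1),
    "Species B": (2.5, 0.6),
    "Species C": (1.9, 1.0),
    "Species D": (1.8, 0.5),
}
labels = classify_species(params)
low_maintenance = [name for name, label in labels.items() if label == "low-a, high-b"]
print(labels)
print(low_maintenance)
```

Here only 'Species C' falls in the 'low-a, high-b' group, which corresponds to the moderate-height, balanced-scaling candidates suggested for sites where stability and low maintenance are the priority.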

Spatial mapping to inform maintenance efforts
Tree failure is a major concern in urban areas, and trees must be managed to ensure that their diameter is sufficient for structural support (Ball and Watt 2013). Assessment metrics such as the observed height-to-diameter ratio have often been used to identify height outliers that may pose potential hazards, and to assess the risk of tree failure (Wonn and O'Hara 2001, Mattheck et al 2002, Klein et al 2019) (supplementary figure S7(a)). While such metrics offer practitioners a simple means to assess the risk of storm damage, variation in tree morphology and growth form between different species may result in unnecessary pruning that can be harmful to tree health and longevity. Rather than using the absolute size or age of trees, allometric models provide a better gauge of maturity by showing whether each tree corresponds to expected patterns of growth.
Our height-diameter models show outliers that are taller or shorter than expected (figures 5(c) and (d)), which may indicate individuals in need of special attention. Furthermore, urban areas that inherently require more or less management effort can be highlighted by summarising the parameter a and b values of tree populations across the landscape (figures 5(a) and (b)). For example, height inspections may be prioritised in regions with many 'high-a' (taller) species (figure 5(a)). Areas that contain many 'low-b' species may contain trees that grow disproportionately tall at small diameters (figure 5(b)), and thus require monitoring of structural support while the trees are still relatively young. Model outputs may subsequently be used to identify young trees that are growing abnormally and older trees at risk of failure.

Integrating allometric research with management practices
Many factors need to be evaluated in managing urban trees, including the differing priorities of stakeholders (figure 6(a)). In reality, no single stakeholder has a preeminent role, which means that no single dataset is likely to be the ideal tool for all urban greening issues. To provide a more complete picture for practice, multiple objectives often need to be assessed alongside inputs from stakeholders and subject-matter experts (figure 6(a)). For example, at locations where shade is needed (e.g. near buildings, roads, and places with high human traffic; figure 6), species with spreading crowns may be most suitable. The R package 'allometree' can be used to model allometric parameters important for shade provision if such data are available (e.g. crown and leaf area; McPherson et al 2016). Other important criteria in urban tree management decisions may include plant health and vitality, which can be assessed using vegetation indices derived from multispectral satellite imagery or LiDAR data (Alonzo et al 2016) (figure 6(b)). If attracting biodiversity is a priority, data on the fruiting or flowering habit of trees may be useful, and analyses should consider the proximity of planting sites to potential sources of biodiversity (e.g. habitat connectivity between planting sites and forest cover in figure 6(b)). Allometric relationships can also be extended to estimate the potential benefits of trees, alongside important considerations such as expected demand and costs (Song et al 2018).
At present, most tree inventory data worldwide are collected manually, but rapid advances in remote sensing may soon reduce manual effort and shift research and development toward integrating analyses into workflows for tree management and real-time monitoring. For instance, alternative sources of remotely sensed data can help to generate and supplement urban tree inventories (figure 6(b)). LiDAR scanning (Suárez et al 2005) and online images from Google Street View (Wang et al 2018) have been used to estimate the height and diameter of trees, and even to identify trees to the genus-level (Berland and Lange 2017). While rapid technological advancements and computing resources offer many opportunities for these new techniques to be used, systems and workflows will need to evolve to take advantage of such innovations. This includes streamlining data pipelines and integrating analyses into operational workflows. The dissemination of modular code and analyses can contribute to flexibility in combining multiple datasets and objectives for human input, to provide a more holistic assessment of performance.

Conclusions
Our study used two approaches to develop height-diameter allometric models for 54 urban tropical species, based on an inventory of 345 794 trees planted across the city of Singapore. Goodness-of-fit statistics were relatively similar between the two approaches, but clear trade-offs were observed between the precision and generality of predictions. Models fitted separately for each species were more sensitive to unusual growth or management conditions, while the model with data pooled across all species was more suited for general-purpose and extrapolated predictions. The log-log equation provided the best fit to the pooled data from all species; variation in model parameters corresponded to the types of pruning intensity associated with each species, as well as plant attributes such as maximum height. Value combinations of model parameters can thus provide a useful heuristic for species selection in urban areas. Model outputs and derived metrics were also used to detect individual trees showing unusual growth and parts of the city that may require more maintenance effort. Such data-driven approaches offer great potential, not only to practitioners responsible for managing urban trees, but also to researchers interested in understanding tree growth in the urban environment.