Non-destructive estimation of leaf area and leaf weight of Cinchona officinalis L. (Rubiaceae) based on linear models

Abstract
Non-destructive methods that accurately estimate leaf area (LA) and leaf weight (LW) are simple and inexpensive, and represent powerful tools in the development of physiological and agronomic research. The objective of this research was to generate mathematical models for estimating the LA and LW of Cinchona officinalis leaves. A total of 220 leaves were collected from C. officinalis plants 10 months after transplantation. Length, width, weight, and leaf area were measured for each leaf. Data for 80% of the leaves formed the training set, and data for the remaining 20% formed the validation set. The training set was used for model fitting and selection, whereas the validation set allowed assessment of the model's predictive ability. The LA and LW were modeled using six linear regression models based on the length (L) and width (Wi) of leaves. The models were assessed using the following statistics: goodness of fit (R²), root mean squared error (RMSE), Akaike's information criterion (AIC), and the deviation between the regression line of the observed versus expected values and the reference line, determined by the area between these lines (ABL). For LA estimation, the model LA = 11.521(Wi) − 21.422 (R² = 0.96, RMSE = 28.16, AIC = 3.48, and ABL = 140.34) was chosen, while for LW estimation, LW = 0.2419(Wi) − 0.4936 (R² = 0.93, RMSE = 0.56, AIC = 37.36, and ABL = 0.03) was selected. In conclusion, the LA and LW of C. officinalis can be estimated through linear regression on leaf width, which proved to be a simple and accurate tool.


Introduction
Cinchona officinalis is a plant species of high medicinal value because its bark contains alkaloids such as quinine, which has been used as an effective treatment for malaria for more than 300 years (Cóndor et al. 2009; Canales et al. 2020). In Peru, there are few forest relicts with C. officinalis, located mainly in the regions of Cajamarca and Piura (Huamán 2020; Fernandez-Zarate et al. 2022). This species is being decimated by agriculture, livestock, and logging (Zevallos 1989; Castañeda et al. 2019; Carbajal 2022), conditions that have led to the implementation of conservation and recovery programs in Peru (Albán-Castillo et al. 2020). The success of these programs will depend largely on the silvicultural management of the species at the nursery level. In that sense, the study of leaf area (LA) allows the analysis of ecophysiological factors such as transpiration, nutrient assimilation percentage, photosynthetic efficiency, specific leaf area, and leaf area index, which are reflected in crop growth (Torri et al. 2009; Keramatlou et al. 2015; Montelatto et al. 2020), while leaf weight (LW) allows the relative growth rate (Suárez et al. 2018) and biomass accumulation (Tieszen 1982) to be measured.
Mathematical models describing the relationship between plant growth, dry matter generation, and leaf area (LA) are often used to study crop growth. Analysis of these variables over time allows the effect of environmental and management conditions on crop development to be assessed (Suárez et al. 2018). Destructive sampling is often required for studies that evaluate plant growth; however, this is impossible in trials whose purpose is to monitor leaf growth over a given period (Fallovo et al. 2008; Kishor Kumar et al. 2017; Suárez et al. 2018).
There are direct and indirect methods to determine the LA of plants (Kishor Kumar et al. 2017). Among the direct methods, conventional planimeters, scanners, and fixed cameras combined with image-processing software stand out (Granier et al. 2002; Liu et al. 2017; Suárez et al. 2018). However, these methods require leaf extraction (Cristofori et al. 2007; Fallovo et al. 2008). On the other hand, most non-destructive methods estimate LA and leaf weight through regression models based on leaf length and width (Pompelli et al. 2012; Tondjo et al. 2015; Weraduwage et al. 2015; Kishor Kumar et al. 2017; Liu et al. 2017). Regression models that consider leaf structural parameters such as length (L), L², width (Wi), Wi², or a combination of length and width (L × Wi) have been found to estimate LA with high accuracy (Cristofori et al. 2007; Meng et al. 2019). Leaf weight has commonly been determined through oven drying, a method that is destructive and time-consuming (Dwyer et al. 2014; Freschet et al. 2015).
Models for estimating LA as a function of leaf length and width have been generated, for example, in coffee (Antunes et al. 2008), kaffir lime (Budiarto et al. 2022), cocoa (Suárez et al. 2018), grape (Tsialtas et al. 2008), strawberry (Demirsoy et al. 2005), hazelnut (Cristofori et al. 2007), walnut (Zellers et al. 2012), cherry (Demirsoy et al. 2005), and other species of small fruits (Fallovo et al. 2008). However, the literature review found no such reports for species of the genus Cinchona. In this context, the objective of this research was to generate a mathematical model that non-destructively estimates the LA and LW of C. officinalis L. (Rubiaceae), using leaf length and width as independent variables.

Study area
Data on the L, Wi, and weight of leaves of C. officinalis were collected and processed in June 2022 at the Fernandez Zarate family forest nursery, located in the community of La Cascarilla (5°40′21.12″ S and 78°53′55.65″ W), province of Jaen, at an elevation of 1810 m. The site is characterized by an annual precipitation of 1730 mm, a minimum temperature of 13 °C, and a maximum temperature of 20.5 °C.

Characteristics of the leaves of C. officinalis
The leaves are elliptic-ovate, simple, opposite, and decussate, with a petiole, an acute or acuminate apex, and an entire margin (Zevallos 1989; Huamán 2020).

Data collection
Fifty-five 10-month-old C. officinalis plants were randomly selected at the nursery. Four leaves were taken from each plant (Figure 1), giving a total of 220 leaves without visible damage that could alter their shape (Suárez et al. 2018).
The fresh weight of each selected leaf was then determined using a Kmt Style electronic balance (200 ± 0.01 g). In addition, each leaf was photographed according to the methodology described by Fernandez-Zarate et al. (2022). Leaf length (L) from base to apex and maximum leaf width (Wi) were measured using ImageJ software (Figure 2) (Baker et al. 1996). These measurements were performed independently by three people in order to decrease bias in image processing.

Statistical analysis
Box and whisker plots were produced for each morphometric variable evaluated (L, Wi, LA, and LW). In addition, since the data were not normally distributed, pairwise Spearman correlation coefficients were calculated between the independent variables (L, Wi, L², Wi², (L + Wi)², and L × Wi) and the dependent variables (LA and LW).
Linear regression was performed between each dependent variable (LA and LW) and each of the independent variables L, Wi, L², Wi², (L + Wi)², and the product L × Wi (Keramatlou et al. 2015). Finally, the root mean squared error (RMSE) and Akaike's information criterion (AIC) were calculated for each model.
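As an illustration of this step, the following minimal sketch fits the six single-predictor models by ordinary least squares and reports R², RMSE, and AIC for each. The measurements are invented for the example, the function names are hypothetical, and the AIC is computed in its least-squares form, n·ln(RSS/n) + 2k with k = 2, which is an assumption because the text does not state which AIC formula was used.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def fit_statistics(x, y, a, b, k=2):
    """R^2, RMSE and least-squares AIC (assumed form) for y = a + b*x."""
    n = len(x)
    mean_y = sum(y) / n
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - mean_y) ** 2 for yi in y)
    return {
        "r2": 1.0 - rss / tss,
        "rmse": math.sqrt(rss / n),
        "aic": n * math.log(rss / n) + 2 * k,  # assumes rss > 0
    }


# Hypothetical measurements, invented for illustration only.
length = [9.0, 11.0, 13.5, 15.0, 17.5]
width = [4.0, 5.0, 6.0, 7.0, 8.0]
area = [25.0, 36.0, 48.0, 59.0, 71.0]

# The six candidate predictors named in the text.
predictors = {
    "L": length,
    "Wi": width,
    "L^2": [l ** 2 for l in length],
    "Wi^2": [w ** 2 for w in width],
    "(L + Wi)^2": [(l + w) ** 2 for l, w in zip(length, width)],
    "L x Wi": [l * w for l, w in zip(length, width)],
}

results = {}
for name, x in predictors.items():
    a, b = fit_line(x, area)
    results[name] = (a, b, fit_statistics(x, area, a, b))

for name, (a, b, stats) in results.items():
    print(f"{name:>10}: LA = {b:.3f}*x + ({a:.3f})  "
          f"R2={stats['r2']:.3f}  RMSE={stats['rmse']:.3f}  AIC={stats['aic']:.2f}")
```

Because every candidate model here has the same number of parameters, ranking by this form of AIC is equivalent to ranking by residual sum of squares; the criterion would only discriminate further if models with more predictors were added.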

Model validation
Of the 220 leaves sampled, data from 80% of the leaves were randomly selected to form a training set used to fit the models, and data from the remaining 20% were used as a validation set to estimate the predictive capacity of the fitted models (Suárez et al. 2022). The model was chosen on the basis of the highest coefficient of determination (R²) and the lowest RMSE and AIC. In addition, to analyze trends in the deviation of observed from expected values, scatter plots of observed versus expected values were made, and a reference line y = x and the regression line of the observed versus expected values were superimposed. Deviation of the regression line from the reference line indicated bias. To quantify this bias, the area between these lines (ABL) was calculated; the lower the ABL, the lower the bias and the more accurate the model's predictions (Suárez et al. 2018).

Results
A descriptive analysis was carried out of the independent variables length (L) and width (Wi) and the dependent variables leaf area (LA) and leaf weight (LW) of C. officinalis leaves. Figure 3 shows box and whisker plots of the variables used to build the linear models for non-destructive estimation of leaf area and leaf weight of C. officinalis. The box plots show the distribution of L and Wi (Figure 3(A)), LA (Figure 3(B)), and LW (Figure 3(C)) in terms of quartile ranges and outliers. The circles correspond to extreme values, that is, those far from the median, in the Wi, LW, and LA data.
Spearman correlation coefficients (used because the data were not normally distributed) were determined pairwise between the morphometric variables (Figure 4). The variable L × Wi showed the highest correlation with LA and LW (rs = 1 and rs = 0.98, respectively).
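For readers who wish to reproduce the rank correlations, a minimal sketch of the Spearman coefficient is given here. It computes the Pearson correlation of average ranks, which is the standard definition when ties are present. The function names and the sample values are hypothetical and are not taken from the study data.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation as the Pearson correlation of ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical leaf measurements, invented for illustration only.
length = [9.0, 11.0, 13.5, 15.0, 17.5]
width = [4.0, 5.0, 6.0, 7.0, 8.0]
area = [25.0, 36.0, 48.0, 59.0, 71.0]

product = [l * w for l, w in zip(length, width)]
rs_product_area = spearman(product, area)
print(f"rs(L x Wi, LA) = {rs_product_area:.2f}")
```

A coefficient of 1 only indicates that the two variables are perfectly monotonically related; it does not by itself show that a linear model fits without error, which is why the regression statistics are examined separately.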
The linear relationships between the independent variables (L, Wi, L², Wi², (L + Wi)², L × Wi) and the dependent variable LA are shown in Figure 5(A-F). The predictors are highly correlated with LA (R² = 0.925–0.998). Table 1 shows the regression coefficients, RMSE, and AIC of the six models used to estimate LA. No single model was best on all of the statistics considered: Model 6 achieved the highest R², whereas Model 2 had the lowest RMSE and AIC.
The linear relationships between the independent variables (L, Wi, L², Wi², (L + Wi)², L × Wi) and the dependent variable LW are shown in Figure 6(A-F). The predictors are highly correlated with LW (R² = 0.89–0.97). Table 2 shows the regression coefficients, RMSE, and AIC of the six models used to estimate the LW of C. officinalis leaves. Again, no single model was best on all of the statistics considered: Model 6 had the highest R², Model 1 achieved the lowest RMSE, and Model 2 had the lowest AIC.
Deviation of the regression line of the observed versus expected values from the baseline is evidence of bias, as shown in Figures 7 and 8. In this case, the baseline and the regression line were very close in the visual analysis of the competing models.
When modeling the LA of C. officinalis and considering the R², RMSE (smaller is better), and AIC (smaller is better), two models (Model 2 and Model 6) stood out. However, Model 2 had only one input parameter, and its regression line of observed versus predicted values was closer to the baseline (lower ABL, Figure 7). Therefore, Model 2 is proposed as the best of the six evaluated models for estimating the LA of C. officinalis.
When modeling the LW of C. officinalis and considering the same statistics, three models (Models 1, 2, and 6) stood out. However, Models 1 and 2 had only one input parameter, and of these, the regression line of observed versus predicted values for Model 2 was closer to the baseline (lower ABL, Figure 8). Therefore, Model 2 is proposed as the best of the six evaluated models for estimating the LW of C. officinalis.
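The validation procedure behind the ABL comparison can be sketched as follows. The example uses synthetic data generated with a fixed random seed, splits 220 records 80/20, fits a width-based model on the training set, and computes the area between the observed-versus-predicted regression line and the line y = x. The integration limits (the range of predicted values in the validation set) and all function names are assumptions, since the exact ABL computation is described in the cited source rather than in this text.

```python
import random


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def area_between_lines(intercept, slope, x_min, x_max):
    """Area between y = intercept + slope*x and y = x on [x_min, x_max]."""
    def signed_area(lo, hi):
        # Integral of (intercept + (slope - 1) * x) from lo to hi.
        return intercept * (hi - lo) + (slope - 1.0) * (hi ** 2 - lo ** 2) / 2.0

    if slope != 1.0:
        crossing = intercept / (1.0 - slope)
        if x_min < crossing < x_max:
            # The lines cross inside the interval: add both sides unsigned.
            return (abs(signed_area(x_min, crossing))
                    + abs(signed_area(crossing, x_max)))
    return abs(signed_area(x_min, x_max))


# Synthetic leaves (width, area), invented for illustration only.
rng = random.Random(42)
leaves = []
for _ in range(220):
    w = rng.uniform(3.0, 10.0)
    leaves.append((w, 11.5 * w - 21.0 + rng.gauss(0.0, 3.0)))

# Random 80/20 split into training and validation sets.
rng.shuffle(leaves)
n_train = len(leaves) * 80 // 100
train, valid = leaves[:n_train], leaves[n_train:]

# Fit on the training set, then predict the validation set.
a, b = fit_line([w for w, _ in train], [la for _, la in train])
predicted = [a + b * w for w, _ in valid]
observed = [la for _, la in valid]

# Regress observed on predicted and measure the deviation from y = x.
a_ov, b_ov = fit_line(predicted, observed)
abl = area_between_lines(a_ov, b_ov, min(predicted), max(predicted))
print(f"train={len(train)} valid={len(valid)} ABL={abl:.2f}")
```

Because the area scales with the units and range of the response, ABL values are only comparable between models of the same variable; an ABL for LA cannot be compared with one for LW.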

Discussion
LA is used to infer plant biomass accumulation (Weraduwage et al. 2015) in addition to estimating leaf growth (Nihayati et al. 2018; Budiarto et al. 2022). The calculation of LA and LW through non-destructive methods has been used in various physiological studies (photosynthetic capacity) and to assess the agronomic behavior of plants (fertilization intensity, water availability) (Lizaso et al. 2003; Swart et al. 2004; Blanco and Folegatti 2005; Rouphael et al. 2007; Suárez et al. 2018). For this reason, the L and Wi of a leaf have been used in regression as predictors of leaf variables that are more complex to measure non-destructively (LA and LW) (Ma et al. 1992; Serdar and Demirsoy 2006; Rouphael et al. 2007; Bakhshandeh et al. 2011; Erdoğan 2012; Pompelli et al. 2012; Fascella et al. 2013; Nehbandani et al. 2013; Keramatlou et al. 2015; Oliveira et al. 2015; Pezzini et al. 2018; Suárez et al. 2018; Wang et al. 2019; Montelatto et al. 2020; Sabouri et al. 2022).
The results showed a high correlation between the independent and dependent variables, with Spearman correlation coefficients ranging from 0.98 to 1 for the estimation of LA and from 0.96 to 0.98 for the estimation of LW of C. officinalis. Similarly, Spearman correlation coefficients of 0.93 have been reported between the weight and length of a leaf and of 0.89 between the weight and width (Suárez et al. 2018), whereas Serdar and Demirsoy (2006) determined a high correlation between the width, length, and area of chestnut leaves, with correlation coefficients ranging from 0.95 to 0.98.
The highest R² values (0.998 and 0.966) were obtained for the linear model using the product of leaf length and leaf width (L × Wi) to estimate the LA and LW of C. officinalis, respectively. However, the ranking of the models changes when RMSE and AIC are considered, which is evidence that the models either underestimate or overestimate the LA or LW of the leaves (Basak et al. 2019; Mela et al. 2022). In this study, in addition to R² and RMSE, we calculated the AIC and the area between the observed versus predicted regression line and the baseline (ABL) as additional criteria for selecting a model, taking bias into account in the same way as reported by Suárez et al. (2018).
Considering the models that were superior in more than one statistic, whether through a higher R² or a lower RMSE, AIC, or ABL, we found that the best model to predict the LA and LW of C. officinalis leaves is Model 2, which uses leaf width as the independent variable, since it had a lower RMSE, AIC, and ABL (for estimating LA) and a lower AIC and ABL (for estimating LW). Similar results were reported by Sabouri et al. (2022), who found that models using the L or Wi of the leaves provided more accurate estimates of LA and LW. Cristofori et al. (2007) found R² values between 0.70 and 0.81 when estimating LA as a function of the L or Wi of apple leaves, with higher R² when the Wi of leaves was used as the independent variable. For rose leaves, Rouphael et al. (2007) presented several models that allowed LA and LW to be estimated as a function of leaf L and Wi. Rouphael et al. (2007) developed three mathematical models that estimated LA and the fresh and dry weight of maize leaves from measurements of leaf L and Wi, resulting in strong relationships of L and Wi with LA and LW (R² > 0.85).

Conclusion
The results obtained in this research showed that the LA and LW of leaves of C. officinalis can be estimated through linear regression based on leaf width. Based on R², RMSE, AIC, and ABL, Model 2 is recommended for estimating both the LA (LA = 11.521(Wi) − 21.422) and the LW (LW = 0.2419(Wi) − 0.4936). The use of these models in estimation and their validation was based on a data set obtained for this purpose. The equations were shown to be a simple, accurate, and time-saving tool for evaluating the growth of C. officinalis plants. There are some limitations regarding accuracy; however, the use of a larger amount of data would decrease the extent of deviation. In addition, to increase model accuracy, the incorporation of environmental factors, nursery management practices, and other growth factors is suggested. Finally, future work should test the applicability of the suggested models to other species of the genus Cinchona.
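As a worked example of how the two recommended equations would be applied, the sketch below evaluates them for a single leaf. The coefficients are those given in the text; the function names and the input width of 6.0 are hypothetical, and the units are assumed to follow those of the original measurements.

```python
def estimate_leaf_area(width):
    """Leaf area from maximum leaf width: LA = 11.521(Wi) - 21.422."""
    return 11.521 * width - 21.422


def estimate_leaf_weight(width):
    """Leaf fresh weight from maximum leaf width: LW = 0.2419(Wi) - 0.4936."""
    return 0.2419 * width - 0.4936


width = 6.0  # hypothetical measured width, for illustration only
la = estimate_leaf_area(width)
lw = estimate_leaf_weight(width)
print(f"Wi = {width}: LA = {la:.3f}, LW = {lw:.4f}")
```

Both lines have negative intercepts, so the predicted LA falls to zero at a width of about 1.86 and the predicted LW at about 2.04. The equations should therefore only be applied within the range of leaf widths covered by the sampled leaves.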