Machine learning-based prediction of surface checks and bending properties in weathered thermally modified timber

Machine learning (ML)-based models, decision tree and ANFIS, were used to predict the degree of surface checking and bending properties of 30-month weathered thermally modified timber. The results showed that the investigated initial board properties did not allow accurate predictions of surface checks. ML regression and clustering analysis confirmed important variables for accurate predictions of bending properties were dynamic stiffness, acoustic velocity, density and lowest local bending modulus. ML models performed better than conventional regression models used for timber grading, and a prediction accuracy of 80 – 90% for bending stiffness and 50 – 70% for bending strength could be achieved.


Introduction
Thermally modified timber (TMT) is recommended for use in outdoor above-ground situations, where it is directly exposed to weather [1][2][3][4].In general, thermal modification prolongs service life in these conditions because dimensional stability and decay resistance are improved by the treatment.The degree of the effects of thermal modification on wood properties depends on properties of the raw material and process conditions [5,6].Previous research focussed mainly on effects of thermal modification on chemistry, anatomy, mechanical properties and decay resistance of small clear wood specimens [e.g.1,[7][8][9][10].Much less effort has been put on effects of thermal modification on the resistance of timber to weathering, in particular with regard to checking [6,11].Checks are longitudinal separations in wood tissue, which occur when moisture-induced stress exceeds tensile strength of wood [12,13].Checks typically first occur upon seasoning and develop during weathering.Checks start growing and are visible at the surface, i. e. surface checks, or in the core, i.e. internal checks.In unmodified wood, the degree of surface checking is known to be higher in juvenile wood, when pith is present, in and around knots, and when growth rings are oriented parallel to the exposed surface instead of perpendicular [14][15][16][17][18].The degree of checking is typically higher for thermally modified (TM) than for unmodified wood of the same species [19][20][21][22][23][24].Deep surface checks (i.e.equal or greater than 50% of the timber's thickness) were found in Norway spruce TMT after a 30-month weathering [24].These checks are longer and more frequent in TMT than unmodified timber, especially on the pith side of timbers, where checks develop predominantly along growth rings.Checks may affect negatively the aesthetics and mechanical properties of TMT.In 30-month weathered TMT, the presence of deep checks had no significant effect on static bending properties [24], although they could have a considerable impact on shear strength or capacity of timber connections [25].Deep checks also provide more favourable conditions for wood-decay fungi because of increased levels of moisture content (MC) in the surrounding wood tissue [26].Previous studies on unmodified Norway spruce clear wood showed a positive correlation between size of surface checks after weathering and density that was explained by differences in wood's moisture uptake and sorption [27,28].In contrast, Sandberg [17] found no such relationship for unmodified Norway spruce and Scots pine clear wood specimens.Relationships between the degree of surface checking and other wood properties have not been studied for TMT.
Previous studies show that principles of strength grading apply to Abbreviations: ANFIS, Adaptive neuro-fuzzy inference system; ANN, Artificial neural network; CI, Confidence interval; FIS, Fuzzy inference system; MC, Moisture content; ML, Machine learning; MLP, Multilayer perceptron; NN, Neural network; PLS, Partial least squares; RI, Relative importance; SD, Standard deviation; TM, Thermally modified; TMT, Thermally modified timber.TMT, while the loss in bending strength due to thermal modification can be as much as 40% on average [29,30].In detail, linear regression analyses with response variables bending strength and stiffness gave coefficients of determination (R 2 ) of 0.4-0.5 and 0.8-0.9,respectively, depending on predictor variable or combination of predictor variables.Similar to unmodified timber, important variables for accurate predictions are density, acoustic velocity, dynamic stiffness, and size and position of knots.A 30-month weathering from outdoor above-ground exposure (EN 335-1, use class 3.2) had no considerable effect on relationships between predictor and response variables important for strength grading timber, and hence, on the accuracy in prediction [24].Van Blokland et al. [24,29,30] provided proof of concept but did not assess model performance.Validation of models indicates reliability of the obtained accuracy when the model is dealing with unseen data.Principles of strength grading were used to predict internal check occurrence in TMT after 30-month weathering based on initial board properties [31].A machine learning-based model using decision tree regression proved useful and enabled predictions with 67% accuracy.Important variables for accurate predictions included annual ring width, density, acoustic velocity, dynamic stiffness and initial moisture content.It is possible to obtain predictor variables after kiln drying or following thermal modification.This has no noticeable effect on the accuracy in prediction of either TMT's bending properties or internal check occurrence [24,[29][30][31].
In contrast to statistical modelling traditionally used in timber grading, machine learning (ML) modelling provides a powerful tool for predictive regression [32].ML models can deal with complex datasets encompassing nonlinearity or missing data.Common techniques are partial least squares (PLS) regressions and artificial neural networks (ANNs) that use multilayer perceptron (MLP) neural network (NN) [33,34].MLP NN applied to small data sets can be challenging.The risk is overfitting and lack of generalization, which can cause the model's goodness-of-fit parameters (especially for training data) to be misleading and can make hyperparameter tuning and selection of training algorithm to be a critical task [35].Small data sets with sample size (N) in the range of a few to a couple of hundreds are typical in the field of wood science [31,33,36].Other, more robust, ML-based models have been used to predict wood properties with success.Examples are adaptive neuro-fuzzy inference system (ANFIS) [36], group method of data handling NN [37] and decision tree [38].ANFIS is a fuzzy-based model that benefits from a neural structure.It is a strong tool when analyzing dataset with small to medium size [32].ANFIS performed better compared to MLP ANN when monitoring wood machining in sawmilling applications (90 observations) and better than PLS regression when predicting physical and mechanical properties of wood based on infrared spectroscopy (240 observations) [33,39].ANFIS comprises the main steps in a fuzzy inference system (FIS) that are the input fuzzification, rule inference, fire strength computation, aggregation, and deffuzification [39].Decision tree can be used to study relationships between variables through exploratory data analysis such as variable clustering analysis and to study the relative importance (RI) of input variables [38,40].Decision tree was used to predict internal check occurrence in 30-month weathered TMT and to monitor mechanical degradation of wood by ultraviolet radiation [31,38].Performance of ML-models is also determined by choosing the right type of model for the job and quality of training data (e.g.size, missing data) [41].
This study will further analyse the data set from van Blokland et al. [24] and focus on predicting (1) the degree of surface checking and (2) bending properties (response variables) in 30-month weathered thermally modified Norway spruce timber using initial board properties, typically determined at the sawmill (predictor variables).The intention regarding (1) was to investigate if possibilities for prediction exists, while (2) aims to evaluate and maximise model's performance.To achieve this, predictive regression-based ML models, ANFIS and decision tree were developed and validated, and models' performance was evaluated.While variable clustering analysis simultaneously gave insight in relationships between the parameters of investigation.The main hypothesis is that a well-trained ML model should be capable of predicting the degree of surface checking and bending properties in weathered TMT using initial board properties.This will enable timber manufacturers and engineers alike to perform fast and reliable grading at early stage and select boards with better resistance to weathering.

Experimental work
Eighty-four (84) Norway spruce (Picea abies [L.] Karst.)logs were block sawn and two 'side matched' boards per log were obtained from the main yield.Boards were then kiln dried and planed to cross-sectional dimensions of 45 × 145 mm 2 at the sawmill, and cut to lengths of 3.6-4.8m.Subsequently, 84 boards, one from each log, were thermally modified (TM) at an industrial treatment plant using the ThermoWood® Thermo-D process, while the other 84 side matched boards remained unmodified and served as control sample.All 168 boards were exposed to the weather in South Sweden from spring 2017 until autumn 2019.Boards were placed horizontally on racks about 1 m above ground; one third of the TM and control boards were oriented flat pith side up (pith up), one third flat pith side down (pith down), and the last third on their edge (edge).After weathering, all boards were conditioned at room temperature conditions (approx.20 • C/ 60% RH), which should result in a MC of ~ 5% for TM and ~ 12% for control boards [30].Initial board properties were obtained after kiln drying and after thermal modification.Bending properties and the degree of surface checking were obtained after conditioning (Fig. 1).Bending tests were performed according to EN 408 [42] standard but no corrections to 12% MC were made.At the end of bending tests, a representative 20 mm thick knotfree section was cut per board from the unbroken part at least 300 mm from board's ends.This section was used to visually assess the degree of surface checking after 30-month weathering.Surface checks with a minimum depth of 1 mm on the board's flat sides, i.e. pith or bark side, were included.The depth of surface checks was taken as distance d [mm] over which the check had propagated into the board (Fig. 2).The degree of surface checking of each board was defined by the total number and the maximum depth of surface checks on the pith and bark side.An overview of the initial board properties and details of the preparation of boards, weathering setup and determination of board properties are found in van Blokland et al. [31] and therein included references.

Data analysis 2.2.1. Predictor and response variables
In total, 13 predictor variables and 4 response variables were included in the data analysis (Table 1).The following four variables were obtained on TM boards also before thermal modification, i.e. after kiln drying: MC i , ρ, v a,res and E a,res , and shown in Table 1 in parenthesis.
The influence of when these four variables were obtained was included in the analysis.Spearman's correlation coefficient

Statistical analysis
Parameters describing checks in wood, such as length, area or depth, typically do not follow normal distribution [16,24].Therefore, probability plot was used to evaluate the fit of normal distribution to the data.It shows an estimated cumulative distribution function by plotting the value of each observation versus its estimated cumulative probability [43].Normal distributed data aligns with a straight fit line and is reflected by a large p-value.The significance level in this study is 0.05.The probability plot also shows 95% confidence intervals (CIs) and mean and standard deviation (SD) values, and was used to estimate percentiles and to compare distributions of response variables between sample sets.Relationships between variables were studied for both sample sets through correlation analysis.For this, Spearman's rank correlation was used, which can evaluate monotonic relationships between two variables and deal with both continuous and ordinal variables [44].In addition, Spearman's rank correlation does not require linearity and absence of outliers, like Pearson's correlation.The p-value corresponding to each correlation coefficient (r s ) was calculated, in which the p ≤ 0.05 refers to a statistically significant correlation between two variables.r s was calculated separately for control and TM boards.

ML models
A ML model based on decision tree was used to study whether the degree of surface checking (C no and C max ) and bending properties (E m,g and f m ) can be predicted using the initial board properties listed in section 2.2.1.ML models were developed using the whole data set of 168 boards.The combined data set offers better possibilities to train the model and likely contains more variation as it includes both TM and   control boards.This should result in better performing ML models that are more robust to variation in input parameters [41].Another advantage is that the ML models are applicable to both unmodified and TM Norway spruce timber.The classification and regression trees (CART) algorithm was used for decision tree classification [40].Decision tree is developed using a subset of the data (training data) and node splitting at each branch point was based on the decision to minimise the least square error of the relationship between the model's estimated values and experimental values.After reaching maximum tree size, the R 2 of this relationship was calculated again, but this time using the remaining data (test data) to assess model performance.To find the optimal size of tree, tree pruning was performed by removing nodes from leaf to root based on the criteria that the R 2 of a pruned tree should be within one standard error of the maximum tree's R 2 .The RI of predictor variables was calculated and ranked through identifying their contribution in enhancing model performance.RI of the most important variable(s) is always set to 100 and other variables are ranked accordingly.Hyperparameters of the tree were set to those of defined in Nasir et al. [38].
The performance of the model was validated using two approaches: a 70/30% ratio train/test data (70/30) and a 5-fold cross validation method (5-fold).The R 2 for train (R 2 tr ) and test (R 2 ts ) data were reported.Details of decision tree modelling and cross validation were discussed in Leo Breiman et al. [45] and an example of decision tree classification in the context of the current work was given by van Blokland et al. [31].
The performance of the decision tree model was compared with an ANFIS model.The FIS structure of ANFIS was generated by Fuzzy Cmean (FCM) clustering method using four clusters similar to Nasir et al. [36].The input parameters were first fed into a principal component analysis (PCA) model for data reduction [33].The optimal number of predictor variables was found to be eight, which preserved 99% of the total variance of the data.The output of PCA was then linked to ANFIS for predicting response variables C no , C max , E m,g and f m .More details on ANFIS structures and its hyperparameters can be found in Nasir and Cool [39].Variable clustering was then performed to study the similarity between boards' initial properties (predictor variables) and to find common characteristics between predictor and response variables.An agglomerative hierarchical clustering method was applied, in which variables with higher similarity level have a smaller inter-observation distance [37].Details of this method and its implementation in a similar context can be found in Fathi et al. [37].The RI of and similarity between variables was also studied separately for control and TM boards to better study the effect of thermal modification on the interaction between variables.

Surface checking and bending properties
After 30 months of weathering, surface checks were present in both TM and control boards.Fig. 3 shows neither surface checking indices follows a normal distribution for TM boards.For control boards, C no did not follow the shown normal distribution, while C max did.In contrast, C max of TM boards shows significant deviation from the normal distribution.In detail, Fig. 3b shows C max of TM boards follows a similar trend as C max of control boards up to a maximum surface check depth of ~ mm, but changes after that point.Parameters used to characterise checks in unmodified wood of Scots pine and Norway spruce, i.e. check length and area, followed a non-normal distribution [16].In contrast, C max of unmodified Norway spruce boards was normally distributed (Fig. 3b).Both the total number and maximum depth of surface checks were significantly higher for TM boards compared to the control, as discussed earlier in detail by van Blokland et al. [24].
Unlike the surface checking parameters, the probability plot in Fig.
shows that bending properties follow the shown normal distribution.Fig. 4a also shows that there is no significant difference in E m,g between TM and control boards, whereas f m was significantly reduced after thermal modification (Fig. 4b).The considerable decrease in f m while E m,g remains largely unaffected is in line with previous studies, which together include various species, treatment types and conditions, outdoor above-ground exposure or not, and both small clear wood specimens and timber [46][47][48][49].The main reason for this loss in strength is the change in chemical composition due to heat treatment, primarily the loss in hemicelluloses [7,50].While other mechanisms such as the  decrease in polymerization of cellulose and cross-linking of the cell wall matrix were listed as contributing factors [51].It was shown earlier the increased formation of surface and internal checks due to 30 month weathering had no significant effect on bending properties of Thermo-Wood® Thermo-D and unmodified Norway spruce timber, but did influence bending failure modes [24,31].An extensive overview and indepth discussion of the effects of thermal modification on bending properties of Norway spruce timber from the same timber batch as used in the present study can be found in van Blokland et al. [30].The level and variation of boards' initial properties, used in this study for predictions, were presented and discussed earlier in van Blokland et al. [24].

Relationships between variables
Table 2 shows r s of relationships between boards' initial properties, and surface checking parameters and bending properties.The r s of relationships between surface checking parameters, between bending properties, and between surface checking parameters and bending properties are included.For the predictor variables obtained on TM boards after kiln drying, only MC i showed different results and therefore r s in this scenario was given in parenthesis (Table 2).Relationships of predictor and response variables with correlation coefficients around 0.6 and up are typically employed in timber grading [52,53].It is important to note that N influences correlation coefficients.For  example, the correlation coefficient between the number of internal checks and density in pine TMT was only ~ 0.4 (N = 38), but this relationship was later used successfully for predictions of internal check occurrence in Norway spruce TMT with reasonable accuracy in prediction (N = 84) [31,54].
C no was positively correlated with density, velocity and stiffness, while the relationship with W yr was negatively correlated.This change of sign is consistent with the correlation between density and W yr that is typically negative for Norway spruce timber [55,56].The r s of these relationships was in general stronger for TM than for control boards (Table 2).The presence of pith, peak amplitude from ultrasonic stress wave and orientation of boards during exposure seemed to have no influence on C no , since r s -values were low and not significant.MC i correlated negatively with C no for TM boards.That is, the number of surface checks after 30-month weathering was larger for TM boards with a low MC i .Van Blokland et al. [31] argued that the reduction in equilibrium moisture content due to thermal modification is larger for Norway spruce timber with a smaller W yr and a higher density.Hence, TM boards with a low MC i are associated with higher density, velocity and stiffness, and smaller W yr .This relationship was reversed when MC i was obtained after kiln drying (Table 2), because during kiln drying higher density timber dries slower compared to low density timber [57].Indeed, MC i and density were correlated positively after kiln drying (r s ~ 0.35, p≪0.05).For control boards, no significant correlations were found between C max and boards' initial properties with exception of orientation (Table 2).In fact, surface checking in unmodified Norway spruce timber reaches a greater depth when the surface is exposed to the weather [24].For TM boards, C max correlated positively with density, velocity and stiffness, and negatively with W yr .The change of sign is consistent with the above discussion.Orientation did not correlate with C max for TM boards, as well as pith.In contrast, check length in Norway spruce and Scots pine was larger in boards with the pith enclosed [15,17], but in those studies only unmodified wood was investigated and the unexposed surface and check depth were not evaluated.
In general, r s of relationships between boards' initial properties, and bending stiffness and strength were comparable among samples in terms of goodness-of-fit and direction (Table 2).This with exception of MC i , which was higher for stronger and stiffer control boards (positive correlation), but lower for stronger and stiffer TM boards (negative correlation).Again, this change in sign is consistent with the above discussion on the effect of thermal modification on the MC i .Correlations were significant for W yr , MC i , density, velocity and stiffness, but not for pith, peak and orientation.Also, correlations between boards' initial properties and bending strength were typically stronger for control than for TM boards.This may lead to lower accuracy in ML-based prediction of the bending strength of TM boards, as is the case for linear regression-based predictions [29,30,46,58].
Lastly, Table 2 shows r s of relationships between surface checking parameters and bending properties.A higher number of surface checks was coupled with deeper surface checks for TM boards, but not for control boards.Bending strength and stiffness were correlated as expected for unmodified and TM Norway spruce timber [46,52,53,59], and stiffer boards were typically stronger.The r s -values between bending stiffness and surface checking parameters were similar to when stiffness was obtained from static instead of dynamic tests (Table 2).The r s -values between bending strength and surface checking parameters were consistent with those for bending stiffness, but slightly weaker.

ML-based prediction 3.3.1. Degree of surface checking
A summary of the decision tree model shows that prediction accuracy was poor for both C no and C max (Table 3).Specifically, the model showed severe overfitting for C no .The type of validation had some influence on prediction accuracy.Validation by splitting train/test data in a 70/30 ratio was somewhat better than 5-fold cross validation.Compared to decision tree, a higher accuracy in prediction of C no and C max was achieved by ANFIS both for train and for test data (Table 3), although still poor.In contrast, a R 2 of 0.67 could be achieved when predicting internal check occurrence in TMT using the same data set as the present study [31].This implies that the investigated initial board properties cannot be used for predicting the degree of surface checking in weathered TMT.Since both decision tree and ANFIS were used, it seems that poor performance in prediction accuracy is unlikely due to the type of ML model.More likely, the poor performance can be explained in part by the relationships between the investigated initial board properties and surface checking parameters, which were poor as well (section 3.2).Other reasons could be quality and size of the data, which may be supported by the higher R 2 tr of ~ 0.8 (Table 3) obtained when omitting the validation step.When splitting the data set into a TM and control sample, R 2 tr -values of ~ 0.7-0.85(not shown) were obtained, independent of treatment level.This suggests it may be worth further investigating surface check prediction in unmodified and TM Norway spruce timber based on initial board properties.
Fig. 5 shows RI of the models' input parameters for predicting C no and C max .Overall, the most important decision for prediction of C no and C max is based on whether a board is TM or not.The model can make this decision by using MC i or treatment, because after thermal modification MC i depends greatly on treatment level [6,24].The high RI of treatment emphasizes the negative effect of thermal modification on the resistance of timber to surface checking during weathering, as discussed above (Fig. 3) and previously by van Blokland et al. [24].However, this finding has no practical relevance for sawmill grading operations.For that purpose, other variables are needed to identify TM boards with better resistance to surface checking.This variable could be acoustic velocity (v a,res ), which, in general, was an important variable for accurate predictions of C no and C max .When the data was split into control and TM boards (not shown), predictor variables with high RI were dynamic stiffness (E a,res then E a,tof ) often followed either by density (ρ or ρ OD ) or acoustic velocity (v a,res or v a,tof ).MC i had low RI, and orientation the highest RI but only for predictions of C max .Compared to the results in Fig. 5, these results were more in line with Table 2.When initial board properties were obtained after kiln drying instead of after thermal modification, E b,90,nom became more important for prediction of C no and dynamic stiffness and density for C max , while MC i lost importance (not shown).This change in RI of variables was unexpected, because, except for MC i , r s did not depend on which process step the initial board properties were obtained (Table 2).This change could be associated with the decision tree's poor performance in predicting C no and C max .Especially, results shown in Fig. 5a are questionable, since the decision tree failed to predict C no and R 2 was ~ 0 (Table 3).Having very poor performance in decision tree modelling, clustering analysis provides a more meaningful tool to assess relationships between study variables [37].
The dendrogram in Fig. 6 shows the results of variable clustering.Seven clusters were identified.Overall, the results obtained from clustering analysis confirmed the ML findings from decision tree.C no is clustered with acoustic velocity and E b,90,nom consistent with results shown in Table 2. C max forms a cluster with treatment and pith.Probably  [24].That is, no relationship was found between pith and C max (Table 2), while r s was 0.51 for the C max -treatment relationship (p≪0.05) because TM board check more than control boards during weathering and 0.29 for the pith-treatment relationship (p≪0.05) because TM boards contained pith more often (27% vs. 6%) [24].
Clustering analysis also confirms that MC i has little similarity with surface checking indices.Moreover, the dendogram shows orientation, peak and W yr have little similarity with surface checking indices.W yr proved earlier to be the most important parameter to identify internal checking in weathered TMT [31].Water uptake and sorption depend on density and annual ring width [27], and hence, it can be expected that moisture-induced stress during weathering also depends on these variables.However, other parameters that influence physical and mechanical properties involved in check formation due to moisture-induced stress, such as shrinkage properties and resistance to fracture initiation and propagation, also depend, or at least in part, on density [60,61].It is likely the dependency of moisture-induced stress and resistance against check formation on material parameters, such as ring width or density, counterbalanced each other and/or are not properly represented by the investigated initial board properties.Again, a better match with the correlation analysis was found when the data was split in control and TM boards (not shown).For control boards, C max was most similar to orientation, while C no formed a cluster with dynamic stiffness and density.For TM boards, C no and C max were most similar to density.

Bending properties
The performance of ML models for prediction of bending properties is shown in Table 4.In general, the ML models performed well and R 2 ts -values of ~ 0.8-0.9 and ~ 0.5-0.7 could be achieved for predictions of E m,g and f m , respectively.The accuracy was somewhat higher for the 5-fold validation, especially for predictions of f m .For predictions of E m,g , ANFIS performed better than decision tree.A higher accuracy was achieved for E m,g than f m , which is typical when predicting wood's bending properties, both for regression and ML models, for clear wood and timber, and for unmodified and TM wood [37,46,52,62].Compared to a previous study on Norway spruce TMT that used multiple linear   regression model [30], decision tree and ANFIS performed better.Similar R 2 -values were achieved, whereas the models in the former study were not validated.Indeed, without validation, R 2 tr -values from the decision tree model were as much as 0.98 for E m,g and 0.94 for f m (Table 4).Splitting the data into control and TM boards or using initial board properties obtained at different process steps (after kiln drying or after thermal modification) did not affect model performance (not shown).
Overall, the property with the highest RI was dynamic stiffness whether E m,g or f m was predicted (Fig. 7).This is consistent with Spearman's correlation coefficients (Table 2) and with current knowledge on strength grading of timber and TMT [30,52].However, there are differences in RI between E m,g and f m predictions (Fig. 7a vs. Fig.7b).The most noticeable difference is the critical role of treatment on the decision tree model when predicting f m , which could also be measured through MC i as discussed in section 3.2, whereas this variable had a low RI when predicting E m,g .This is confirmed by Fig. 4 and the literature, which show that E m,g is not significantly affected by thermal treatment, at least not for commercial modification processes, while f m is typically affected to a high degree [46,47].Acoustic velocity and density, which can also be measured through W yr (section 3.2), were typically ranked after dynamic stiffness in terms of RI for E m,g predictions, and after variables treatment and dynamic stiffness for f m predictions.The RI of acoustic velocity from stress waves (v a,tof ) typically had a lower RI than acoustic velocity from resonance (v a,res ), as was previously shown in linear regression analyses on Norway spruce timber and TMT [29].
Density (ρ and ρ OD ), in general, had higher RI for prediction of f m than for E m,g .In addition, E b,90,nom had a rather low RI, particularly for f m predictions.The observations regarding density and E b,90,nom contradict with the r s results in Table 2 and the literature on strength grading of timber and TMT [30,52], but agree with the clustering analysis presented below.It should be mentioned that E b,90,nom was more important while MC i lost importance for accurate predictions of TMT's f m after splitting the data in control and TM boards (not shown).When initial board properties obtained after kiln drying were used for predictions, MC i showed some importance for predictions of E m,g and f m (not shown).A positive correlation can be expected between MC i and E m,g , and MC i and f m , because ρ and MC i correlate positively after kiln drying (section  3.2) and the relationships between density and these bending properties are also positive, both for unmodified and TMT [30,52].The dendrogram in Fig. 8 shows five clusters could be identified.Similar to the variable clustering analysis for surface checking indices, treatment and pith, W yr , peak and orientation each formed a separate cluster, whereas the other predictor variables and response variables were similar and formed one large cluster.In this large cluster, f m was most similar to MC i followed by density.Together these variables formed one sub cluster.The remaining variables formed a second sub cluster and had a high similarity level with E m,g -over 80%.This was in agreement with the RI in Fig. 7 and Spearman's correlation coefficients.Although E b,90,nom represents only 90 mm board length, a rather high similarity was found between this parameter and the other variables related to board stiffness that were determined over much larger board lengths of 2.6-4.8 m.Since E b,90,nom can be interpreted as a measure of a board's severest knot or cluster of knots, this finding underlines the impact of knots on measures of axial time-of-flight, axial resonance frequency and modulus of elasticity [63,64].A higher similarity between E b,90,nom and f m was expected.Especially, since E b,90,nom was developed to increase the accuracy of f m predictions as an improvement of more traditional measures of knot sizes, such as knot area ratio [65].The low similarity between W yr and bending properties in clustering analyses (Figs. 8 and 9) was in contrast with results from Spearman's correlation coefficients and RI from ML models, which showed W yr had about similar r s with bending properties as density.
Fig. 9 shows results from clustering analysis when the data was split into a control and a TM board sample.Some differences could be observed after data splitting (Fig. 8 vs. Fig.9).For control boards (Fig. 9a), E b,90,nom formed a cluster with acoustic velocity, while f m was most similar to stiffness and density.For TM boards (Fig. 9b), f m was most similar to E b,90,nom , while MC i and W yr formed one cluster.The similarity between MC i and W yr for TMT was observed and discussed previously [31].The lower dependency of bending properties on MC i than on velocity and stiffness is in line with r s in Table 2 and also became clear from RI analysis after data splitting.Nevertheless, after data splitting, both correlation and RI analyses show MC i has some importance for accurate predictions of bending properties.Thus, MC i is not only a measure of treatment level as discussed earlier.MC i was obtained directly after treatment, and thus the variation in MC i was determined by treatment level and board's anatomical and physical characteristics, and not by climate conditions.Since MC i is commonly and easily determined at sawmills, it may provide a valuable predictor in addition to the common predictors suitable for f m predictions of Norway spruce timber and TMT [29,30].
Overall, performance of the ML models discussed in sections 3.3.1 and 3.3.2did not depend on when initial board properties were determined except for MC i .This is in line with previous work on prediction of internal check occurrence and bending properties in weathered TMT [24,31].It implies that the variation of ρ and v a,res remains largely unaffected by thermal treatment.Future studies should focus on including features that have stronger correlations with surface checking parameters.For this, more in-depth and fundamental research into moisture-induced check formation in TMT is required, where, for example, tools like finite element modelling can provide a way to study the complicated interaction between material parameters, wood material orientation, and external loads [66,67].The cost of data acquisition encourages looking for those features that can be measured quickly.Various non-destructive evaluation tools such as near-infrared spectroscopy and ultrasonic Lamb wave propagation have been proven useful for this purpose [36,68].At the same time, further efforts are needed to provide accurate and objective measure of the degree of surface checking in TMT after weathering.For this, various advanced scanning technologies may be suitable [69][70][71].The performance of decision tree may be improved and overfitting issues resolved by implementing random forest modelling [72] alongside with expanding the size of data.Clustering in addition to statistical analysis should be used on various types of wood science data to better understand the relationship between these methods.Finally, further studies are needed to evaluate the potential of initial moisture content to improve bending strength predictions of TMT.

Conclusions
Conclusions regarding possibilities to predict the degree of surface checking in weathered thermally modified timber: • Machine learning-based models failed to predict the total number and maximum depth of checks, because relationships between the investigated initial board properties and surface checking parameters were weak and/or quality and size of the data was insufficient; • The high relative importance of treatment confirmed the negative effect thermal modification has on the resistance of timber to check formation; • Variables important for accurate predictions were acoustic velocity followed by dynamic stiffness and density, but since models failed this conclusion should be taken with caution.
Conclusions regarding performance of models when predicting bending properties in weathered thermally modified timber: • Machine learning-based models performed better than conventional regression models used for timber grading.The achieved prediction accuracy was 80-90% for bending stiffness and 50-70% for bending strength; • Results from machine learning and variable analyses was in accordance with the literature on strength grading of timber and thermally modified timber, i.e.: dynamic stiffness was the most important property for accurate predictions; other important variables were (a) (b) acoustic velocity, density and lowest local bending modulus, and; the high relative importance of treatment for bending strength predictions emphasized the negative effect thermal modification has on timber's bending strength; • Initial moisture content was very important for bending strength predictions, because it provided a measure of treatment level.When predictions were made only on unmodified or thermally modified timber, this variable had some importance for predictions of both bending strength and stiffness.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 2 .
Fig. 2. Measurement of depth of surface checks as distance d [mm] in board's cross-section.

Fig. 3 .
Fig. 3. Cumulative frequency diagram of (a) C no and (b) C max .Note the lines show estimated normal distributions and 95% CIs.

Fig. 4 .
Fig. 4. Cumulative frequency diagram of (a) E m,g and (b) f m .Note the lines show estimated normal distributions and 95% CIs.

Fig. 5 .
Fig. 5. RI of variables in the decision tree model for prediction of (a) C no and (b) C max.

Fig. 7 .
Fig. 7. RI of parameters in the decision tree model for the prediction of (a) E m,g and (b) f m.

Fig. 8 .
Fig. 8. Clustering analyses between initial board properties and bending properties.Cluster 1 (blue): treatment, pith; Cluster 2 (purple): W yr ; Cluster 3 (grey): peak; Cluster 4 (red): orientation; Cluster 5 (green): MC i , f m , ρ OD , ρ, v a,res , v a,tof , E a,res , E a,tof , E m,g , E b,90,nom.(For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Table 1 shows also an overview of variables' mean values presented and discussed previously by van Blokland et al. [24,30,31].

Table 1
List of variables including mean values of 84 control and 84 TM boards for continues variables.
(continued on next page) J. van Blokland et al.

Table 1
(continued ) a Values in parenthesis: Board properties obtained after kiln drying

Table 2
Spearman's rank correlation coefficients of relationships between boards' initial properties, and surface checking and bending properties of 84 control and 84 TM boards.

Table 3
Performance of ML models for prediction of the degree of surface checking in control and TM boards using boards' initial properties * Model failed to predict the degree of surface checking because both C max and pith are correlated with treatment and not because pith affects C max

Table 4
Performance of ML models for prediction of bending properties in control and TM boards using boards' initial properties