Body weight prediction of Belgian Blue crossbred using random forest

Objective: The aim of this study was to predict the body weight (BW) of Belgian Blue X Friesian Holstein (BB X FH) crossbreds in Indonesia from morphometrics using random forest. Materials and Methods: A total of 26 BB X FH crossbreds were observed for BW, chest weight (CW), body length (BL), hip height (HH), wither height (WH), and chest girth (CG) at 0, 30, 60, 90, 120, 150, 180, 210, 240, 270, and 300 days of age. Stepwise regression and random forest were performed using R 3.6.1. Results: The random forest results show that CG is an important variable in estimating BW, with a variable importance value of 24.49%. Likewise, the stepwise regression results show that CG can be a selection indicator for the BB X FH crossbred. The R squared value obtained from the random forest (0.86) was greater than that obtained from the regression (0.83). Conclusion: Random forest produces a better model than stepwise regression. However, CG alone gives a simple and good equation for estimating BW.


Introduction
Belgian Blue (BB) is a double-muscled cattle breed originating from Belgium. The breed carries an 11-nucleotide deletion in the myostatin gene. In Indonesia, these cattle have been bred by crossbreeding with other Bos taurus or local Indonesian cattle [1,2]. The Belgian Blue cannot adapt well to warm regions; therefore, it is crossed with other cattle kept locally, one of which is the Friesian Holstein (FH), resulting in better performance than other crossbreds [3]. Preliminary research on Belgian Blue X Friesian Holstein (BB X FH) crossbreds for body weight (BW), morphology, and survivability has been conducted in Indonesia [4,5]. However, no research has been conducted to predict the growth rate of BB X FH.
Machine learning (ML) is a family of algorithms that is widely used to solve various problems. In animal science, ML is widely used for predicting live BW [6], for example in chickens [7] and cattle [8], as well as for predicting milk yield [9] and for precision nutrition [10]. ML algorithms are classified into three types of learning: reinforcement learning, unsupervised learning, and supervised learning [11]. Random forests in animal science are used to classify breeds, predict BW, and identify important single nucleotide polymorphisms (SNPs) [12-14].
Random forest is a popular ML algorithm that is widely used in classification and regression and has various advantages and high accuracy [15]. Therefore, this study aimed to predict the BW of a BB X FH crossbred in Indonesia based on morphometrics using random forest.

Ethical approval
The experimental procedures were carried out following the guidelines established by the Ministry of Agriculture of Indonesia. The Indonesian Agency for Agricultural Research and Development approved the procedures (Balitbangtan/Balitnak/Rm/06/2021).

Animals
A total of 26 BB X FH crossbreds were observed for BW, chest weight (CW), body length (BL), hip height (HH), wither height (WH), and chest girth (CG) at 0, 30, 60, 90, 120, 150, 180, 210, 240, 270, and 300 days of age. BW was measured in kg using an animal scale for cattle. Morphometric variables were measured using an animal tape (Rondo). The experiment was carried out at the Indonesian Research Institute for Animal Production in Bogor, Indonesia.

Feeding and management
The animals were kept in individual pens equipped with feeders and automatic drinking water. The feeders and drinking water tanks were cleaned every day. The cage was built from iron pipes on each side, with a concrete floor. The roof of the cage was made from asbestos. From 1 to 21 days of age, the animals were fed about 2 l of milk per head per day, which was then gradually increased to 5 l per head per day. Starting at 90 days of age, the animals were offered a mix of fresh chopped Napier grass and concentrate, up to 200 g per head per day, after they had received milk. At weaning (about 5 months of age), the animals were fed about 2 kg of a mixture of fresh chopped Napier grass and concentrate. Legume leaves (Gliricidia sepium) were also offered to each animal, up to 1 kg per head per day. After that, all the animals received a daily feed consisting of mixed fresh chopped Napier grass and legume leaves (20-25 kg per head per day) and concentrates (3 kg per head per day).

Data analysis
Descriptive statistics, stepwise regression, and random forest were performed using R 3.6.1 [16]. The goodness-of-fit of the regression models was assessed using the coefficient of determination (R²) and Akaike's information criterion (AIC). Stepwise regression was used to develop equations to predict BW from morphometric traits and age. Random forest was used as the machine-learning method for estimating the weight of crossbreds during the first 300 days of life from body measurements.
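The two goodness-of-fit criteria named above can be illustrated with a minimal sketch. The study itself used R 3.6.1; the Python code below is only an illustration, and the chest girth and body weight values are hypothetical, not data from this study. The function names (`fit_line`, `r_squared`, `gaussian_aic`) are likewise assumptions made for the example. AIC is computed here from the Gaussian log-likelihood, with the error variance counted as one extra parameter, which is assumed to match the convention used for linear models in R.

```python
import math

# Hypothetical records, NOT data from the study:
# chest girth (cm) and body weight (kg) of eight animals.
cg = [75.0, 88.0, 97.0, 108.0, 118.0, 127.0, 135.0, 142.0]
bw = [42.0, 68.0, 95.0, 128.0, 160.0, 195.0, 228.0, 255.0]


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residual_ss(y, y_hat):
    """Residual sum of squares between observed and fitted values."""
    return sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))


def r_squared(y, y_hat):
    """Coefficient of determination: 1 - RSS / total sum of squares."""
    mean_y = sum(y) / len(y)
    total_ss = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - residual_ss(y, y_hat) / total_ss


def gaussian_aic(y, y_hat, n_coef):
    """AIC = -2 * logLik + 2 * k, with k = n_coef + 1 (error variance)."""
    n = len(y)
    rss = residual_ss(y, y_hat)
    log_lik = -0.5 * n * (math.log(2.0 * math.pi) + math.log(rss / n) + 1.0)
    return -2.0 * log_lik + 2.0 * (n_coef + 1)


# Candidate 1: intercept-only model (predict the mean BW for every animal).
mean_bw = sum(bw) / len(bw)
null_pred = [mean_bw] * len(bw)

# Candidate 2: BW = intercept + slope * CG.
intercept, slope = fit_line(cg, bw)
cg_pred = [intercept + slope * x for x in cg]

r2_cg = r_squared(bw, cg_pred)
aic_null = gaussian_aic(bw, null_pred, n_coef=1)
aic_cg = gaussian_aic(bw, cg_pred, n_coef=2)

print(f"R2 (CG model): {r2_cg:.3f}")
print(f"AIC intercept-only: {aic_null:.1f}, AIC with CG: {aic_cg:.1f}")
```

A stepwise procedure repeats this comparison while adding or dropping predictors and keeps the candidate with the lowest AIC; the sketch shows only one such comparison.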

Results and Discussion
Based on the results of descriptive statistics, the BW and morphometric size of males are greater than those of females. In most animals, males are larger than females, a difference influenced by testicular secretions (testosterone and its metabolites) [17]. The BW and morphometric size are presented in Tables 1 and 2. The BW of BB X FH at 300 days was greater than that of Boran beef cattle from Kenya at 365 days of age [18]. This suggests that B. taurus has better BW performance than Bos indicus. Based on the results of stepwise regression, the model with BL and CG is the best model for estimating the BW of crossbreds (Table 3). However, CG alone is a simple and good variable for estimating BW. Therefore, CG and BL can be selection criteria for crossbreeding between Belgian Blue and FH. In other breeds and species, different traits have been reported as the most informative. Rump height has a high direct effect value in predicting BW in Nguni cattle [19]. In Hereford cows, metacarpus girth and backside half-girth are important variables in estimating live weight [20]. Head length contributed to 88% of the variation in male BW in South African Kalahari Red goats [21].
Based on the results of random forest, the variable importance values in this study were 21.88%, 20.88%, 24.49%, 23.28%, and 20.49% for WH, BL, CG, HH, and CW, respectively. These results support the previous stepwise regression analysis, where CG was an important variable for BB X FH cattle. An increase in node purity (IncNodePurity) denotes a change in the homogeneity of the groups formed by the trees. Based on the random forest results, the model with two variables gave the best results in predicting BW in crossbreds; this can be seen from its high R² value and from its Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) values, which are the smallest (Table 4). In this study, random forest produced better prediction performance than stepwise regression. However, the number of animals sampled is very limited, and more animals need to be used in future studies.
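To make the idea of node purity more concrete, the sketch below shows how a single regression-tree split is scored. It is only an illustration under stated assumptions: the data are hypothetical (not from this study), the `unrelated` variable is invented as a contrast, and the helper names `rss` and `best_split` are assumptions for the example. For regression forests, the increase in node purity for a variable is commonly defined as the reduction in residual sum of squares produced by splits on that variable, accumulated within each tree and averaged over trees; the code computes that reduction for one split only.

```python
# Hypothetical records, NOT data from the study.
bw = [42.0, 68.0, 95.0, 128.0, 160.0, 195.0, 228.0, 255.0]      # body weight (kg)
cg = [75.0, 88.0, 97.0, 108.0, 118.0, 127.0, 135.0, 142.0]      # chest girth (cm)
unrelated = [5.0, 2.0, 7.0, 1.0, 6.0, 3.0, 8.0, 4.0]            # invented variable with no link to BW


def rss(values):
    """Residual sum of squares around the mean (the impurity of a node)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, gain) for the split on x that most reduces the RSS of y."""
    parent = rss(y)
    pairs = sorted(zip(x, y))
    best_threshold, best_gain = None, 0.0
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate identical x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [yy for _, yy in pairs[:i]]
        right = [yy for _, yy in pairs[i:]]
        gain = parent - rss(left) - rss(right)
        if gain > best_gain:
            best_threshold, best_gain = threshold, gain
    return best_threshold, best_gain


thr_cg, gain_cg = best_split(cg, bw)
thr_unrelated, gain_unrelated = best_split(unrelated, bw)

print(f"CG: split at {thr_cg:.1f} cm reduces RSS by {gain_cg:.0f}")
print(f"Unrelated variable: best split reduces RSS by {gain_unrelated:.0f}")
```

In this toy example, the split on chest girth removes far more of the variation in BW than the split on the unrelated variable, which is the behaviour that a high importance value summarises across many trees.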

Conclusion
In conclusion, random forest produced a better model than stepwise regression and gave the best predictive performance for BW in this investigation. Stepwise regression analysis revealed that the model involving BL and CG is the most effective for estimating the BW of crossbred cattle. CG alone gives a simple equation that can be used to estimate BW.

Table 1. Means and standard errors of BW and morphometrics of males (n = 12 heads).

Table 2. Means and standard errors of BW and morphometrics of females (n = 14 heads).

Table 3. Stepwise regression of BW and morphometrics in BB X FH.

Table 4. Random forest summary for predicting BW using morphometric data.