Predicting Maize Theoretical Methane Yield in Combination with Ground and UAV Remote Data Using Machine Learning

The accurate, timely, and non-destructive estimation of maize total-above ground biomass (TAB) and theoretical biochemical methane potential (TBMP) under different phenological stages is a substantial part of agricultural remote sensing. The assimilation of UAV and machine learning (ML) data may be successfully applied in predicting maize TAB and TBMP; however, in the Nordic-Baltic region, these technologies are not fully exploited. Therefore, in this study, during the maize growing period, we tracked unmanned aerial vehicle (UAV) based multispectral bands (blue, red, green, red edge, and infrared) at the main phenological stages. In the next step, we calculated UAV-based vegetation indices, which were combined with field measurements and different ML models, including generalized linear, random forest, as well as support vector machines. The results showed that the best ML predictions were obtained during the maize blister (R2)–Dough (R4) growth period when the prediction models managed to explain 88–95% of TAB and 88–97% TBMP variation. However, for the practical usage of farmers, the earliest suitable timing for adequate TAB and TBMP prediction in the Nordic-Baltic area is stage V7–V10. We conclude that UAV techniques in combination with ML models were successfully applied for maize TAB and TBMP estimation, but similar research should be continued for further improvements.


Introduction
The growth of the global human population, being one of the major challenges for humanity, encourages and increases energy demand at the household, smallholder, and industrial scales. Anaerobic digestion of organic materials has been developed for more than 100 years and currently has great potential in contributing to sustainable energy production [1]. During the anaerobic process, it is possible to convert fermentable biomass, including the organic fraction of municipal solid waste, food waste, green wastes, aquatic biomass, and crops [2], into a multipurpose fuel (CH 4 ) and fertilizers that can be used in crop management systems while also being an effective way to reduce greenhouse gas emissions, as compared to traditional fossil fuels [3,4].
A study of digestion [5] highlighted that among the variety of substrates suitable for anaerobic digestion, agricultural crops are rather widely investigated. The most suitable crops are those that offer a high, dry biomass yield at a low cost, require low nutrient and energy inputs, and do not decrease biodiversity. Among them, maize can be identified as one of the most popular plants among energy crops. What is more, maize is a highly productive crop that can be cultivated in a wide range of climatic conditions, including in northern latitude regions [6]. Also, due to the high maize biomass yield potential, methane production varies from 7500 to 10,200 m 3 ha −1 , i.e., an amount that no other crop has achieved so far [7]. output is the crop's dry matter yield. The dynamic variations in maize's chemical composition throughout different maturity classes and their meaning for methane output, as well as the impact of harvest time for methane output, have been well studied. For instance, Amon et al. researcher's study [21] determined that the highest methane yields of 7500-10,200 m 3 /ha −1 were achieved via maize cultivars with a high FAO number, which starts from 300 (late variety) at harvest at the R3-Milk stage. A study in Denmark [21] highlighted remarkable differences in the harvest time and the impact of cultivars on methane yield. They also established that the highest methane yield was obtained at late harvest. Another study in Germany [22] found that when growing maize for methane production, late maize cultivars had a lower concentration of fat and protein, but a higher concentration of ash and lignin, as compared with medium and early maize hybrids. It was established that despite substantially different nutrient concentrations between the maize cultivars, no clear-cut association existed between chemical composition and specific methane yield.
Therefore, the objectives of our study were: (i) to track UAV-based multispectral bands at the main phenological stages of maize, and based on these remote data, establish the correlation between VIs, TAB, and TBMP; (ii) to estimate maize total above-ground biomass and theoretical biochemical methane potential yield, combining UAV techniques with ML methods; and (iii) identify the maize phenological stage during which total-above ground biomass (TAB) and theoretical biochemical methane potential (TBMP) predictions are the most specific.

Environmental Conditions and Maize Growth
The weather conditions during the maize growing season in 2021 were warm with optimal rainfall conditions, but there was a distinct contrast between different months, both in temperature and rainfall regime (Table 1). Table 1. Ten-day period weather conditions per maize vegetative (V) and reproductive (R) development stages in 2021. Maize growth and development stages: V5-Fifth leaf, V7-Seventh leaf, V10-Tenth leaf, VT-tasseling, R1-Silking, R2-Blister, R3-Milk, R4-Dough, R5-Dent, R6-Physiological maturity. During the 2021 maize sowing harvest, the mean air temperature was 17.3 • C, which was 1.5 • C above the 1991-2020 historical average. During the maize vegetative period (VE-VT), the average air temperature was 20.2 • C, which was 3.0 • C higher than the 1991-2020 historical average; however, during the reproductive period (R1-R6), the average temperature was lower compared to the vegetative period and reached 14.5 • C, which was equal to the historical average during the 1991-2020 range. The year 2021, in terms of rainfall regime, was near optimal during the entire maize growing season, and the sum of precipitation was 234.5 mm (86.2% of precipitation compared to the climate normal between 1991 and 2020). During the entire maize growing season, the number of days with heavy precipitation (above 10 mm) was 6 days. Despite the fact that the cumulative rainfall (234.5 mm) during the entire maize growing season was near optimal compared to the long-term average, contrasting differences were observed between the vegetative and reproductive development stages. For instance, during the maize vegetative development stage, the rainfall was 45.6 mm, which was only 41.1% of the precipitation norm between 1990 and 2020. According to the calculated SPI index, the maize vegetative development stage may be attributed to a moderately dry period. In contrast, a significantly higher amount of precipitation occured during the maize reproductive development stage-188.9 mm, which was 117.3% of the precipitation norm between 1990 and 2020. According to the calculated standardized precipitation index (SPI), the maize reproductive development stage may be attributed to relatively normal conditions. Due to different maize hybrids, total above-ground biomass (TAB) varied from 9.70 t/ha −1 to 15.00 t/ha −1 (dry weight). ANOVA analysis indicated the significant (p < 0.05) effect of maize cultivars on TAB variation in 2021. According to the received TAB yield, maize hybrids may be grouped as follow: DKC3201 ≈ DKC2684 ≈ DKC2891 (TAB: 13.3-15.0 t ha −1 ) > DKC2978 ≈ DKC3079 ≈ DKC3204 (TAB: 12.4-12.6 t ha −1 ) > DKC3218 (TAB: 9.7 t ha −1 ).

Chemical Composition of Maize Samples
At maize physiological maturity, protein concentration in the TAB yield ranged from 6.6% to 8.0% [ Table 2]. Performed statistical analysis indicated significant (p < 0.05) differences between the cultivar treatments. It has been observed that a statistically significant protein concentration was higher in the earlier maturity cultivars, i.e., DKC3218 (FAO190)-8.0% and DKC2978 (FAO190)-7.64%. Slightly lower protein concentration values were determined in the intermediate-maturity maize cultivars: DKC2684-7.56%, DKC3079-7.05%. In the remaining treatments of longer vegetation cultivars, protein concentration was below 7.0%. In all experimental treatments, lipid concentration varied within the range of 2.53-3.04%. Lipid variation was similar to protein concentration, i.e., higher lipids values were determined in earlier maturity cultivars (FAO190); however, these differences were not significant. Structural carbohydrates in the maize TAB yield varied within a rather narrow range, for instance, cellulose varied from 10.82 to 13.30%, hemicellulose from 15.78 to 16.78%, and lignin from 6.41 to 7.88%; these differences between treatments were not significant. Nonstructural carbohydrates at the harvest ranged from 35.12 to 43.58%, and these differences were not significant either. However, when analyzing the total amount of carbohydrates in maize TAB, there were significant differences (p < 0.05) between different cultivars, and the order of treatments was as follows: DKC3079 ≈ DKC2891 ≈ DKC3204 > DKC2684 ≈ DKC3201 ≈ DKC2978 > DKC3218. It was also determined that cultivars of different maturities did not have a significant effect on the ash concentration in maize TAB.

Maize Yield Prediction Accuracy Using UAV-Based Multispectral Data
The generalized linear model (GLM), a support vector machine (SVM), and random forest (RF) models were tested regarding maize TAB estimation. Figure 1 refers to the highest performing model in terms of RMSE for each growth stage. between different cultivars, and the order of treatments was as follows: DKC3079 ≈ DKC2891 ≈ DKC3204 > DKC2684 ≈ DKC3201 ≈ DKC2978 > DKC3218. It was also determined that cultivars of different maturities did not have a significant effect on the ash concentration in maize TAB.

Maize Yield Prediction Accuracy Using UAV-Based Multispectral Data
The generalized linear model (GLM), a support vector machine (SVM), and random forest (RF) models were tested regarding maize TAB estimation. Figure 1 refers to the highest performing model in terms of RMSE for each growth stage. Figure 1. Statistical values between predicted vs. estimated total above-ground biomass (TAB) data for rainfed maize at Akademija.
When comparing the main maize growth and development periods, i.e., vegetative (from VE-Emergence to VT-tasseling) and reproductive (from R1-Silking to R6-physiological maturity), it can be stated that slightly better TAB prediction results were obtained during the reproductive period. At the beginning of the maize vegetative period, i.e., during the leaf formation from stage V5 to V7, statistical parameters indicated a generally good level of agreement between the observed and predicted TAB with R 2 , which varied in the range of 0.60-0.65 (p < 0.01); RMSE, which varied in the range of 0.86-0.92 t ha −1 ; and BIAS, which varied from −0.07 to −0.25. During later vegetative periods, i.e., at the end of the leaf development (V10) and tasseling (VT), TAB prediction results have shown a better level of agreement with R 2 0.72-0.77 (p < 0.01), RMSE 0.86-1.03 t ha −1 , and BIAS −0.16 to −0.18. During the first maize growing season, the best TAB prediction results were demonstrated by GLM; however, the results of SVM and RF models were very similar while the differences in TAB prediction results between the different models did not exceed 9.1%, which indicates that the maize growth and development stage have a significantly greater impact on the prediction results than the selected model. During the maize Figure 1. Statistical values between predicted vs. estimated total above-ground biomass (TAB) data for rainfed maize at Akademija.
When comparing the main maize growth and development periods, i.e., vegetative (from VE-Emergence to VT-tasseling) and reproductive (from R1-Silking to R6-physiological maturity), it can be stated that slightly better TAB prediction results were obtained during the reproductive period. At the beginning of the maize vegetative period, i.e., during the leaf formation from stage V5 to V7, statistical parameters indicated a generally good level of agreement between the observed and predicted TAB with R 2 , which varied in the range of 0.60-0.65 (p < 0.01); RMSE, which varied in the range of 0.86-0.92 t ha −1 ; and BIAS, which varied from −0.07 to −0.25. During later vegetative periods, i.e., at the end of the leaf development (V10) and tasseling (VT), TAB prediction results have shown a better level of agreement with R 2 0.72-0.77 (p < 0.01), RMSE 0.86-1.03 t ha −1 , and BIAS −0.16 to −0.18. During the first maize growing season, the best TAB prediction results were demonstrated by GLM; however, the results of SVM and RF models were very similar while the differences in TAB prediction results between the different models did not exceed 9.1%, which indicates that the maize growth and development stage have a significantly greater impact on the prediction results than the selected model. During the maize reproductive stages, the model's prediction capability consistently increased from stage R1 to R3, when the best TAB prediction results were achieved over the entire maize growing season. Then, at the R3 stage, high R 2 -0.95 (p < 0.01), small RMSE-0.35 t ha −1 , and BIAS close to zero, i.e., −0.06, indicated perfect prediction results. During maize growth stage R4, TAB prediction results were slightly worse compared to R3; meanwhile, at stage R5, TAB prediction results were significantly worse than R3 or R4. During the second maize growing season, the best TAB prediction results had been determined using GLM and SVM models; however, the results of the RF model were also rather similar, and the difference between the best and the worst model in a specific maize growing period was no higher than 5.4%.

Maize Theoretical Biochemical Methane Potential Prediction Accuracy Using UAV-Based Multispectral Data
The performance results of the highest performing model in predicting maize TBMP values among GLM, SVM, and RF are shown in Figure 2 at nine growth and development stages.
reproductive stages, the model's prediction capability consistently increased from stage R1 to R3, when the best TAB prediction results were achieved over the entire maize growing season. Then, at the R3 stage, high R 2 -0.95 (p < 0.01), small RMSE-0.35 t ha −1 , and BIAS close to zero, i.e., −0.06, indicated perfect prediction results. During maize growth stage R4, TAB prediction results were slightly worse compared to R3; meanwhile, at stage R5, TAB prediction results were significantly worse than R3 or R4. During the second maize growing season, the best TAB prediction results had been determined using GLM and SVM models; however, the results of the RF model were also rather similar, and the difference between the best and the worst model in a specific maize growing period was no higher than 5.4%.

Maize Theoretical Biochemical Methane Potential Prediction Accuracy Using UAV-Based Multispectral Data
The performance results of the highest performing model in predicting maize TBMP values among GLM, SVM, and RF are shown in Figure 2 at nine growth and development stages. The same trend discovered regarding TAB was found regarding the TBMP prediction, i.e., better TBMP prediction results were obtained during the maize reproductive period, compared to the vegetative period. Statistical analysis obviously demonstrated that it is extremely important to choose a suitable prediction period for both maize TAB and TBMP predictions. Among four maize growth and development stages in the vegetative period, the best TBMP prediction performance was determined to be during stages V10 and VT when R 2   The same trend discovered regarding TAB was found regarding the TBMP prediction, i.e., better TBMP prediction results were obtained during the maize reproductive period, compared to the vegetative period. Statistical analysis obviously demonstrated that it is extremely important to choose a suitable prediction period for both maize TAB and TBMP predictions. Among four maize growth and development stages in the vegetative period, the best TBMP prediction performance was determined to be during stages V10 and VT when R 2 , varied in the range of 0.73-0.76 (p < 0.01); RMSE, varied in the range of 372.7-459.4 m 3 t ha −1 ; and BIAS values, varied in the range of −72.3 to −83.6, have shown a small TBMP underestimation. Meanwhile, at maize stages V5 and V7, statistical results were only reasonably good, with R 2 0.47-0.61 (p < 0.01), RMSE 371.3-644.7 m 3 t ha −1 , and BIAS −24.9 to −157. During the maize reproductive period, the statistical accuracy between predicted and observed TBMP values was the highest during growth stage R3; while R 2 was high-0.97, RMSE was rather low-104.3 m 3 t ha −1 , and BIAS was close to zero, i.e., 5.3. Slightly worse results when predicting TBMP values were received at maize stages R2 and R4, when R2 slightly varied from 0.88 to 0.90, while RMSE varied from 185.6 to 192.7 m 3 t ha −1 , and BIAS varied from −11.3 to −19.6. Having compared the models used in this study, it can be stated that GLM was the most accurate model for maize TBMP prediction; however, SVM and RF models have shown slightly poorer results, meaning they can also be used for maize TBMP prediction.

Maize Yield
The average weights of an attribute of each calculated VI for the maize TAB, at different growth and development stages, are shown in Figure 3, according to the models shown in Figure 1.
were only reasonably good, with R 2 0.47-0.61 (p < 0.01), RMSE 371.3-644.7 m 3 t ha −1 , and BIAS −24.9 to −157. During the maize reproductive period, the statistical accuracy between predicted and observed TBMP values was the highest during growth stage R3; while R 2 was high-0.97, RMSE was rather low-104.3 m 3 t ha −1 , and BIAS was close to zero, i.e., 5.3. Slightly worse results when predicting TBMP values were received at maize stages R2 and R4, when R2 slightly varied from 0.88 to 0.90, while RMSE varied from 185.6 to 192.7 m 3 t ha −1 , and BIAS varied from −11.3 to −19.6. Having compared the models used in this study, it can be stated that GLM was the most accurate model for maize TBMP prediction; however, SVM and RF models have shown slightly poorer results, meaning they can also be used for maize TBMP prediction.

Maize Yield
The average weights of an attribute of each calculated VI for the maize TAB, at different growth and development stages, are shown in Figure 3, according to the models shown in Figure 1. The analysis has demonstrated the importance of maize growth and development stages as covariates in the use of models for TAB prediction. No clear trends have been found when analyzing the distribution of different VI weights among different maize growth and development stages. For instance, GDVI VI showed very high weight values at stages V5 and VT, but at stages V10, R1, and R3, the weight values were 0. Similar results were obtained using other VIs, for example, NDVI weight values were very high during the reproductive period, i.e., stages R2, R4, and R5; however, during the vegetative period, the results were the opposite. In general, all of the weight values exhibited high variability across the nine growth stages. For maize TAB prediction, VIs may be grouped according to their importance as follow: NDVI ≈ TGI ≈ GDVI ≈ RGBVI > NDI ≈ VARI ≈ GLI ≈ RGRI > EVI ≈ RDVI ≈ SAVI ≈ GNDVI ≈ NDRE ≈ VEG. It should be noted that the VI of the last group (VEG) had almost no effect on the maize TAB prediction. To summarise The analysis has demonstrated the importance of maize growth and development stages as covariates in the use of models for TAB prediction. No clear trends have been found when analyzing the distribution of different VI weights among different maize growth and development stages. For instance, GDVI VI showed very high weight values at stages V5 and VT, but at stages V10, R1, and R3, the weight values were 0. Similar results were obtained using other VIs, for example, NDVI weight values were very high during the reproductive period, i.e., stages R2, R4, and R5; however, during the vegetative period, the results were the opposite. In general, all of the weight values exhibited high variability across the nine growth stages. For maize TAB prediction, VIs may be grouped according to their importance as follow: NDVI ≈ TGI ≈ GDVI ≈ RGBVI > NDI ≈ VARI ≈ GLI ≈ RGRI > EVI ≈ RDVI ≈ SAVI ≈ GNDVI ≈ NDRE ≈ VEG. It should be noted that the VI of the last group (VEG) had almost no effect on the maize TAB prediction. To summarise the results, the estimated and used UAV-based VIs as input data for ML models were different for each maize growth stage, indicating that the suitable UAV covariates for the TAB prediction were greatly affected by the maize growth stage.

Maize Theoretical Biochemical Methane Potential
The weights of the attributes which reflect the relative importance of the 14 VI used in this study for maize TBMP prediction are shown in Figure 4, according to the models shown in Figure 2. the results, the estimated and used UAV-based VIs as input data for ML models were different for each maize growth stage, indicating that the suitable UAV covariates for the TAB prediction were greatly affected by the maize growth stage.

Maize Theoretical Biochemical Methane Potential
The weights of the attributes which reflect the relative importance of the 14 VI used in this study for maize TBMP prediction are shown in Figure 4, according to the models shown in Figure 2. For maize TBMP estimation models, the greatest covariates were NDVI (especially R2 and R4-R6 stages) and TGI (especially V7, R2, and R4 stages); however, it should be noted that in some maize growth stages, the aforementioned VIs did not have a significant impact on maize TBMP prediction. On the contrary, the lowest weight values were obtained using NDRE and VEG indices. To summarise the results, the VIs can be arranged in the following order of importance: NDVI ≈ TGI ≈ VARI > RGBVI ≈ GDVI ≈ RGRI ≈ GLI ≈ NDI > EVI ≈ GNDVI ≈ RDVI ≈ SAVI ≈ VEG ≈ NDRE. When predicting maize TBMP as well as TAB values, a similar trend remained, i.e., the last group VI (VEG) had almost no effect. To summarise the results, it can be stated that the tendencies remained the same as they were during TAB prediction, i.e., UAV covariates for TBMP prediction were considerably affected by maize growth stages.

Maize Yield and Theoretical Biochemical Methane Potential Mapping
As shown in Sections 2.3. and 2.4, the models with the highest prediction accuracy were selected, and based on these results, the TAB and TBMP maps of the predicted maize conditions at nine growth and development stages were developed ( Figure 5). As shown For maize TBMP estimation models, the greatest covariates were NDVI (especially R2 and R4-R6 stages) and TGI (especially V7, R2, and R4 stages); however, it should be noted that in some maize growth stages, the aforementioned VIs did not have a significant impact on maize TBMP prediction. On the contrary, the lowest weight values were obtained using NDRE and VEG indices. To summarise the results, the VIs can be arranged in the following order of importance: NDVI ≈ TGI ≈ VARI > RGBVI ≈ GDVI ≈ RGRI ≈ GLI ≈ NDI > EVI ≈ GNDVI ≈ RDVI ≈ SAVI ≈ VEG ≈ NDRE. When predicting maize TBMP as well as TAB values, a similar trend remained, i.e., the last group VI (VEG) had almost no effect. To summarise the results, it can be stated that the tendencies remained the same as they were during TAB prediction, i.e., UAV covariates for TBMP prediction were considerably affected by maize growth stages.

Maize Yield and Theoretical Biochemical Methane Potential Mapping
As shown in Sections 2.3 and 2.4, the models with the highest prediction accuracy were selected, and based on these results, the TAB and TBMP maps of the predicted maize conditions at nine growth and development stages were developed ( Figure 5). As shown in Figure 5, clear differences in the spatial distribution of TAB and TBMP values are seen among maize hybrids across growth stages. Figure 5. The heat map of estimated maize total above-ground biomass as dry matter and theoretical biochemical methane potential (TBMP).

Discussion
Our study provides experimental evidence regarding different maize hybrids and their total above-ground biomass levels, which varied significantly from 9.70 t/ha −1 to 15.00 t/ha −1 (in dry weight), in a region in the Baltic area. Due to the simplicity, easy interpretability, and high acceptance of the RS approach, we obtain high-resolution UAVbased multispectral band data, which have been used for maize TAB and TBMP prediction. Although the VI-based TAB estimation method is considered to be very efficient, the remotely obtained maize TAB and TBMP data cannot effectively describe the mechanistic response of changes in maize growth and development in different environments or under different agronomic management practices. For this reason, 14 VIs derived from UAVbased multispectral images were employed. In addition, some studies demonstrated that the growth stage plays a key role in the sensitivity and performance of VIs when predicting crop biophysical parameters [23]; there is also a lack of comprehensive high-frequency

Discussion
Our study provides experimental evidence regarding different maize hybrids and their total above-ground biomass levels, which varied significantly from 9.70 t/ha −1 to 15.00 t/ha −1 (in dry weight), in a region in the Baltic area. Due to the simplicity, easy interpretability, and high acceptance of the RS approach, we obtain high-resolution UAVbased multispectral band data, which have been used for maize TAB and TBMP prediction. Although the VI-based TAB estimation method is considered to be very efficient, the remotely obtained maize TAB and TBMP data cannot effectively describe the mechanistic response of changes in maize growth and development in different environments or under different agronomic management practices. For this reason, 14 VIs derived from UAV-based multispectral images were employed. In addition, some studies demonstrated that the growth stage plays a key role in the sensitivity and performance of VIs when predicting crop biophysical parameters [23]; there is also a lack of comprehensive high-frequency UAV observation for maize. Therefore, in our study, the UAV campaign was carried out nine times during all main growth stages of maize. A detailed study of maize yield prediction at different physiological stages was carried out by Barzin et al. [24]; they discovered that VI-based prediction models at maize stages V10 and VT had the greatest accuracy with R 2 values of 0.90 and 0.93, respectively, while in our study, prediction accuracy during the same period was slightly lower, with the R 2 values being 0.73 at V10 and 0.76 at VT. However, the results of both studies confirmed that during the initial maize growth stages (V3-V7), TAB prediction accuracy was lower than it was during later growth periods (V10-VT). In contrast to that, a study by Zhang et al. [11] has shown that, through using the hyperspectral imageries of maize canopy and measuring maize height data, the stepwise regression model managed to explain 85-87% of TAB variation at the V6, R1, and R3 growth stages; this showed that the maize growth stages did not have such a significant impact on prediction accuracy as what is implied in the above-mentioned studies. Another study by Ji et al. [25] presented detailed research on maize yield prediction that used phenological information from RS; the researchers divided the maize growing season into four growth phases, i.e., 1-from V1 to V6, 2-from V6 to VT, 3-from VT to R4, and 4-from R4 to R6. It was found that the phenological phases have a significant effect on the prediction of maize yield, and the period from maize flowering (VT) to dough (R4) is the most optimal period for yield prediction. These results are in line with the results we obtained, i.e., the best TAB prediction accuracy was found during the R2-R4 growth stage period when R 2 varied from 0.88 to 0.95. The silking-dough (R1-R4) stages represent the nutrient (nitrogen, phosphorus, potassium) concentration growth peak of the maize parts of the leaf, stem, and storage organs (shank, cob), while nutrient concentration in the grain only begins to increase during this period [26]. During the R1-R4 period, maize has the highest LAI [27] while the leaves are still green, which is probably the reason why the best prediction results were obtained during the maize silking-dough period. However, these comparisons between the different maize yield prediction studies should be treated carefully, as, in many cases, different genotypes have been grown and different management practices have been used in different soil types and climatic conditions, which can possibly have a great impact on the accuracy of the prediction. According to one study [26], a rapid increase in maize kernel weight starts during period R2-R4, and the results of our study suggest that this period is the most suitable for maize yield prediction within the Nordic-Baltic region. Although stage R3 is a perfect period for yield prediction, for farmers, it may be a bit too late as it is close to the end of the maize life cycle, and it is therefore practically impossible to make any agronomic management changes to increase yield. Thus, we determined that the earliest suitable observing period for maize yield prediction in the region is V7-V10 because, at that time, the accuracy of the prediction is sufficiently high and the N demand is still low; therefore, the last N fertilization is still possible. At very early maize growth stages (V5), TAB prediction accuracy was the lowest (R 2 = 0.65) among all of the growth stages analyzed. The results were largely to be expected, as during the initial maize growth stages, the demand for water and nutrients was rather low; thus, the difference in maize hybrids was not so notable. As a result, ML models during those growth stages could not represent well enough the TAB differences caused by the various maize hybrids. On the contrary, maize nutrient demand starts to increase rapidly from around stage V7, and at the end of the vegetative period (V10-V14), the differences are clearly notable [28]. During a similar time, maize also reached its water demand peak, which happens during flowering and at the early grain-filling period [26]. In addition to that, it has been determined that relevant condition when developing data-driven prediction models for yield is when RS data is obtained during the crop growth stage, which is crucial in kernel growth [29]. It is important to note that even though different maize cultivars (FAO190-230) have been used in our study, they all had a "stay-green" effect, which resulted in the leaf senescence being almost the same in all of the cultivars. It is likely that mainly because of this, different cultivars did not have a significant effect on the prediction results.
Maize is a well-known crop used in methane production and can produce between 7500 and 10,200 m 3 ha −1 under suitable climate conditions [7]. However, the climatic conditions for maize production in the Nordic-Baltic region are still marginal as the crop is exposed to cold weather and water stresses, which mainly result in a large year-to-year variation [30]. As has been already determined, methane production output is primarily influenced by the dry matter yield of the crop, while the chemical composition of the crop has a significantly lesser influence on the amount of methane production [22]. As a result, during maize harvesting, the estimated TBMP levels in our experiment ranged from 4092 to 6626 m 3 ha −1 and the correlation between TAB and TBMP was R 2 -0.98, indicating that TAB is the main factor that mainly determines TBMP levels. This finding, which revealed a strong relationship between TAB and TBMP, was also responsible for the fact that the accuracy of TBMP prediction at different maize growth stages was very similar to that of TAB values. The most accurate TBMP values could be predicted at the R3 and R4 growth stages where R 2 was 0.90 and 0.97, respectively. These results nicely coincided with the findings of Amon et al. [7] who suggested that maize should be harvested at growth stages R3-R4 to achieve the highest methane yield. However, we could not find similar studies where UAV-based multispectral data were used for TBMP prediction.
To acquire the most suitable model for predicting maize TAB and TBMP, we used three ML methods, i.e., GLM, RF, and SVM models. When comparing the performances of these models during each maize growth stage, the accuracy of all of the used models' predictions was very similar; although, slightly better results regarding TAB and TBMP prediction were obtained with the GLM and SVM models. However, the differences in the different models' predictions were not so evident, and thus, all three models used in our study may be successfully employed when predicting maize TAB and TBMP values under Nordic-Baltic climate conditions. When predicting the TAB and TBMP values, it has been observed that when using only spectral bands (e.g., red, green, blue), the accuracy of the predictions was not sufficient. Thus, in the further steps of this study, only calculated VIs were used and their possible link with dependent TAB and TBMP variables was sought. However, even when using different VIs for maize prediction, no clear insights have been found. For example, during the maize vegetative period for TAB prediction, GDVI performed relatively better than the other VIs at the V5 and VT stages, but later, at the V10, R1, and R3 stages, there was almost no weight effect on the prediction. Similar trends remained when using the other VIs, for example, NDVI was very important at R2, R4, and R5 stages, but during the vegetative period, the results were the opposite. It is only important to note that at that time, i.e., from the R2 to R5 stages, the NDVI values were the highest during the whole maize growing season and varied within the range of 0.68 to 0.73, while the canopy was closed and the NDVI saturated. Another interesting observation is that at the R3 stage, i.e., the time when the accuracy of the TAB predictions was the best, the GLI, NDI, and VARI indices had the greatest impact on these predictions and all of them have a green band in their formulas, as well as not having an NIR band. It has also been noted that most of the VIs used were saturated at the R3 stage of maize growth. When analyzing the TBMP predictions, the trends were very similar to the TAB, i.e., different VIs at different maize growth stages had different effects on the accuracy of the predictions. Certainly, it was greatly affected by the fact that we used different ML models for the prediction; therefore, VI weights at different maize growth stages should be compared carefully.
RS technologies, more precisely, UAVs at low altitudes, have important application options for acquiring crop spectral information at the plot scale. One of the greatest advantages of using UAVs compared to satellites is that it is possible to compensate for the lack of spatial and temporal resolution of satellite data in precision agriculture. In this study, we tried an exploration in predicting maize TAB and TBMP, and this is an initial attempt, at least in regards to the Nordic-Baltic area. Our prediction results were rather good, but they could be further improved and more studies that cover different soil types and environmental conditions are required. Our research results are based on a dataset of one-year field experiments under typical weather conditions that prevailed in the area. Thus, this deficiency should be corrected by future multi-year field experiments, testing how TAB and TBMP change under contrasting weather conditions. What is more, the use of more data from field seasonal measurements (e.g., chlorophyll content, LAI, etc.), as well as climatic and management data, could provide a more detailed description of crop growth and allow for more favorable predictions of TAB and TBMP.

Study Site and Maize Experiment
According to the Köppen climate classification [31], the climate of Lithuania is humid continental (Dfb), with warm summers and rather severe winters. The territory of Lithuania is not homogeneous regarding air and soil temperature, precipitation distribution, and soils. The average annual air temperature ranges between 5.8 and 7.6 • C, and annual precipitation between 550 and 910 mm, of which 60 to 66% falls in the April-October period. The main soils are Luvisols, Cambisols, Gleysols, Arenosols, Retisols, and Histosols and they cover 28.5, 15.9, 14.6, 13.2, 9.4, and 8.5% of the area, respectively. Field experiments with maize were carried out in 2021 at the Lithuanian Research Centre for Agriculture and Forestry, located in Akademija (55 • 23 09 N 23 • 52 41 E). The field experiment locations fell into the agro-climatic zone IID (see Figure 6) of central Lithuania [32], which is warm but also the driest compared to other zones. one-year field experiments under typical weather conditions that prevailed in the area. Thus, this deficiency should be corrected by future multi-year field experiments, testing how TAB and TBMP change under contrasting weather conditions. What is more, the use of more data from field seasonal measurements (e.g., chlorophyll content, LAI, etc.), as well as climatic and management data, could provide a more detailed description of crop growth and allow for more favorable predictions of TAB and TBMP.

Study Site and Maize Experiment
According to the Köppen climate classification [31], the climate of Lithuania is humid continental (Dfb), with warm summers and rather severe winters. The territory of Lithuania is not homogeneous regarding air and soil temperature, precipitation distribution, and soils. The average annual air temperature ranges between 5.8 and 7.6 °C, and annual precipitation between 550 and 910 mm, of which 60 to 66% falls in the April-October period. The main soils are Luvisols, Cambisols, Gleysols, Arenosols, Retisols, and Histosols and they cover 28.5, 15.9, 14.6, 13.2, 9.4, and 8.5% of the area, respectively. Field experiments with maize were carried out in 2021 at the Lithuanian Research Centre for Agriculture and Forestry, located in Akademija (55°23′09″ N 23°52′41″ E). The field experiment locations fell into the agro-climatic zone IID (see Figure 6) of central Lithuania [32], which is warm but also the driest compared to other zones. The maize (Zea mays L.) field experiment under rainfed conditions was conducted during the 2021 season at the Lithuanian Research Centre for Agriculture and Forestry (55°23′50′′ N and 23°51′40′′ E) in the Central part of Lithuania. The main soil was Endocalcari-Epihypogleyic Cambisol (WRB, 2014) with loam texture class (49% Sand, 35.4% Silt, 15.6% Clay); it had a neutral pH of 6.8 and soil organic matter of 3.2%. In this study, seven maize hybrids with different maturity classes and different suitability levels (grain, silage, biogas) were investigated (Table 3). These maize varieties were selected based on the high amount of dry matter and their suitability to be grown in this region. ; it had a neutral pH of 6.8 and soil organic matter of 3.2%. In this study, seven maize hybrids with different maturity classes and different suitability levels (grain, silage, biogas) were investigated (Table 3). These maize varieties were selected based on the high amount of dry matter and their suitability to be grown in this region. The maize was grown after conventional tillage and was sown on 30 May 2021, when soil temperature had reached 10-12 • C, with a density of 90,000 plants ha −1 (0.75 m row and 0.15 m plant spacing). Weeds were controlled by the herbicide Arrat (tritosulphurone 250 g/kg −1 + dicamba 500 g/kg −1 ) and commercial formulation wettable granule (WG) at a rate of 0.2 kg/ha −1 was used at the maize V3 growth stage. During the maize growing season, no diseases were observed; thus, other pesticides were not used. All experimental treatments were fertilized according to local practices and followed the same rates as those of mineral fertilizers. Before maize sowing; nitrogen (N), at a rate of 120 kg N ha −1 ; superphosphate (P), at a rate of 90 kg P ha −1 ; and potassium chloride (K), at a rate of 170 kg K ha −1 were applied manually and incorporated into the soil. Additionally, at the growth stage of V5, maize was fertilized with N 80 kg N ha −1 to avoid N deficiencies. The mineral fertilizers were in the form of ammonium nitrate (AN) (34.4-0-0), while complex NPK fertilizers were in the form (6-18-34). Maize harvest was performed manually on 6 October 2021. The field experiments included seven treatments that were performed in randomized block design with four replicates. The area of each experimental plot was 27 m 2 .

Plant and Soil Measurements
During the maize vegetation period, plant development stages were recorded frequently. The maize vegetative and reproductive development stages were identified on the basis of the entire treatment when 50% or more of the plants were at a particular development stage. The leaf-collar method [26] was used for the development of vegetation stages, whereas reproductive stages were based on established visual indicators of kernel development. At physiological maturity, when more than 50% of the plants showed a visible black layer at the base of the kernel, four rows of each plot from an area of 8 × 3.0 = 24 m 2 were cut to identify the final total aboveground biomass and grain yield. The individual maize components were weighed (fresh mass) and dried at 65 ± 5 • C to constant weight (dry weight).
A modern, environmentally friendly analysis method, near-infrared (NIR) spectroscopy, which does not require chemical reagents, was used to examine the quality of the maize samples. Maize TAB samples were scanned with a NIRS-6500 device, with a spinning module using wavelengths between 400 and 2500 nm in reflectance, and the obtained spectra were processed with equations installed into the device for maize analysis (VDLUFA Laboratory, Speyer, Germany). For analysis, the samples were dried and ground with an ultra-centrifugal mill ZM 200 (Retsch, Haan, Germany) to pass a 1 mm screen. The precision of the used equations was sufficient and ranged within the following limits: standard error calibration (SEC)-0.16-0.64, coefficient of determination (RSQ)-0.77-0.94, and standard error cross-validation (SECV)-0.18-0.68. When using this method, maize quality parameters, including protein, lipid, and ash contents, have been determined. The number of water-soluble carbohydrates was determined by Anthron's method.
Soil samples were collected before the implementation of the experiment and they were later air-dried and passed through 2 mm and 0.25 mm size sieves in the laboratory. SOC content was determined by a photometric procedure at a wavelength of 590 nm using a UV-VIS spectrophotometer Cary (Varian) and using glucose as a standard [33]. The soil texture fraction was determined using the pipette method. All chemical analyses were conducted in the Chemical Research Laboratory, while texture analyses were carried out in the Department of Soil and Crop Management of the Lithuanian Research Centre for Agriculture and Forestry.

Theoretical Methane Yield
In this study, we calculated theoretical biochemical methane potential (TBMP) based on the maize sample's organic composition (expressed as TBMP org ). TBMP org was calculated by Equation (1)  where volatile fatty acid (VFA), lipids, protein carbohydrates, and lignin were expressed in terms of %. In the next step, TBMP values were recalculated into m 3 /ha −1 units.

Remote Data Acquisition
The UAV system for the image collection of the field experiment was a consumergrade quadrotor Phantom 4 Professional (SZ DJI Technology Co., Shenzhen, China) with an installed real-time kinematic (RTK) module that enhances the precision of position data, derived from satellite-based positioning. Additionally, this used UAV had a sunlight sensor that automatically adjusts radiation reflectance and obtains reflectance data directly. This UAV had a combined multispectral imaging system, including a visible light (RGB) sensor responsible for visible light imaging. There are five additional multispectral bands with 5.74 focal lenses, including blue light (B), green light (G), red light ® , red edge, and near-infrared (NIR), with center wavelengths of 450, 560, 650, 730, and 840 nm, respectively. The UAV campaign was carried out 9 times during the main maize growth stages (Table 4) at a height of 25 m above ground level. The UAV flights were conducted in clear sky and low wind speed conditions between 11:00 am and 13:00 pm local time. The UAV flights were controlled using the flight planner app (Pix4D SA, Lausanne, Switzerland), adapted for android OS and using the Huawei P20 smartphone. The average speed of each UAV flight was ≈2.2 m/s, with a camera looking downwards; the flight duration for the full experimental plot cover lasted approximately 20 min; and about 2100 images, with 85% vertical and 85% horizontal overlapping to obtain images with a 1.1 pixel size, were taken during each flight.

UAV Image Processing and Calculation of VIs
After each flight UAV, multispectral image data was preprocessed with Pix4D software, which uses the structure-from-motion technique. This technique was used to relate features between overlapping images and calculate the 3D position of the matched points, which are densified and textured with the corresponding images. The ortho-mosaic image was generated by projecting each texture point onto the 2D plane and was exported in a TIFF image format for further analysis. After the image stitching process, the generated orthomosaic ground sampling distance was 1.09 cm/pixel. Then, the densified point clouds were generated with 5,049,015 points. The RMSE values in the X, Y, and Z coordinates were 0.020 m., 0.017 m., and 0.052 m., respectively. In order to calculate the vegetation indices (VI) consisting of different combinations of wavelength-specific spectral reflectance, the prepared ortho-mosaic imagery was imported to freely available QGIS software (version 3.2.2.). In the next step, experimental plots (replicates) were marked out with polygons on the ortho-mosaic image, and the vegetation indices were calculated using the raster calculator tool. In this study, we computed 14 widely used VIs for predicting maize TAB and TBMP. The majority of the selected VIs had been used in previous studies for monitoring the growth and predicting the yields of agricultural crops. Based on this, we selected the most commonly used VIs and they were tested in our study (Table 5). Table 5. Vegetation indices (VI) calculated in the study in order to test the usefulness of multispectral aerial images to predict maize total above-ground biomass and theoretical biochemical methane potential.

Machine-Learning Methods for Predicting Biomass and Theoretical Methane Yield
In this study, we used three widely used machine-learning (ML) methods, i.e., generalized linear model (GLM), random forest (RF), and Support Vector Machine (SVM). In order to predict the maize total above-ground biomass (TAB) and theoretical biochemical methane potential (TBMP), a dataset regarding each of the vegetative (V5, V7, V10, VT) and reproductive (R1, R2, R3, R4, R5) period stages, where TAB and TBMP were assigned as dependent variables and VIs as independent variables (Table 5), was divided into two parts: 60% of the dataset was used for training while the remaining 40% of the dataset was used for testing. For dataset splitting, we used the split validation method, which randomly splits up the data into a training set and a testing set. The entire prediction analysis was done using a computer that has an Intel i9 processor and 64 GB memory with a 64-bit operating system. ML prediction algorithms were created and developed with open-source data software RapidMiner Studio 9.9 (RapidMiner Inc., Boston, MA, USA).
Generalized linear models (GLM) are an extension of traditional linear models [47]. This algorithm fits generalized linear models to the data by maximizing the likelihood of the training parameters. The model fitting computation is parallel, fast, and scales well for models with a limited number of predictors.
A random forest (RF) is an ensemble of a certain number of random trees and is more robust with respect to noise [48]. These trees are created/trained on bootstrapped sub-sets provided by the input data. Each node of a tree represents a splitting rule for one specific attribute. Only a sub-set of attributes, specified with the subset ratio criterion, is considered for the splitting rule selection. This rule separates values in an optimal way for the selected parameter criterion. For classification, the rule is separating values belonging to different classes, while for regression, it separates them in order to reduce the error made by the estimation. The building of new nodes is repeated until the stopping criteria are met.
The Support Vector Machine (SVM) is a form of supervised nonparametric modeling, which is defined by using kernels and operating on the margins [49]. The algorithm takes a set of input data and predicts, for each given input, which of the two possible classes comprises the input, making the SVM a non-probabilistic binary linear classifier. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other. An SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall on.
The agreement between the observed maize TAB, estimated theoretical methane yield, and predicted parameters was tested by the coefficient of determination (R 2 ), BIAS, and RMSE, computed as follows [50]: where n is the number of observed values, y i and x i are the simulated and observed values, and y and x are the average observed and simulated values for the ith data pair. R 2 describes the proportion of the variance in the observed data as explained by the prediction model, and it ranges from 0 to 1, with higher values indicating less error variance. The BIAS measures the average difference between the observed/estimated and predicted values. A positive BIAS value indicates an under-prediction and a negative BIAS indicates an over-prediction. The RMSE is the square root of the mean square error. The smaller the BIAS and RMSE values are, the better the performance of the prediction. Significant differences between the treatments were determined using Tukey's test at a 0.05 probability level. Statistical analyses were performed using proc GLM, SAS v9.4 (SAS Institute Inc., Cary, NC, USA).

Weather Data
The daily meteorological data, including mean temperature • C, minimum and maximum temperatures • C, mean humidity (%), and sunshine hours per day, were obtained from a meteorological station of the Lithuanian Hydrometeorological Service (Ministry of Environment), located near the maize experimental field.

Conclusions
In this study, we investigated the feasibility of using UAV-multispectral images in combination with field measurements and ML algorithms to estimate maize TAB and TBMP values at main phenological stages. Our study suggests that the accuracy of both TAB and TBMP prediction was greatly affected by different maize growth stages; however, different maize hybrids did not have a significant effect on prediction accuracy. The best ML prediction results were obtained during the R2-R4 maize growth period; the prediction models managed to explain 88-95% of TAB and 88-97% TBMP variation. We also found that, for the practical usage of farmers, the earliest suitable timing for TAB and TBMP prediction in the Nordic-Baltic area is within stages V7-V10. In this study, three ML algorithms were used when predicting TAB and TBMP values. The best maize TAB and TBMP prediction accuracy results were achieved using the GLM model.