Physics-driven digital twins to quantify the impact of pre- and postharvest variability on the end quality evolution of orange fruit Resources, Conservation & Recycling

Currently, there are differences in the quality loss between individual fruit upon arrival at retail. These differ- ences in fruit quality stem from pre-harvest biological variability between individual fruit at harvest and postharvest variations in hygrothermal conditions between refrigerated shipments. The impact of these pre-harvest biological and postharvest variability on the final quality of each fruit that reaches the consumers remains largely uncharted. In this study, we addressed this knowledge gap by developing physics-based digital twins of orange fruit to unveil how pre-harvest and postharvest variability affect the final fruit quality upon arrival at retail. Markov chain Monte Carlo method was used to generate a realistic ’ virtual ’ population of 1000 individual orange fruits at harvest. Afterwards, the impact of pre-harvest biological variability and variations in hygrothermal conditions between shipments on several orange quality metrics, including mass loss, fruit quality index (FQI), remaining shelf life (RSL), chilling injury severity (CI), total soluble solids (TSS), color, and Mediterranean fruit fly (MFF) mortality was quantified. Results showed that pre-harvest biological variability causes variations in mass loss of oranges at retail by up to 1.2%, FQI by up to 5% and RSL by more than 2 days. The postharvest variability between shipments causes high variations in mass loss of oranges at retail by up to 4%, FQI by more than 20%, RSL up to 3 days, and CI up to 5%. The study also revealed that compared to pre-harvest biological variability, postharvest variability between shipments could increase the variations in RSL of oranges at retail by 75%, FQI by 50%, and mass loss by ~10%. This work helps improve our understanding of the variability in the end fruit quality upon arrival at retail.


Introduction
Every year, about one-third of the world's fresh produce is lost within the food supply chain, from farm to consumer (Bellù, 2017;Gustavsson et al., 2011). Such a remarkable level of food loss amounts to an enormous loss in resources, including water, labor, and investment, and also contributes to 5 -10% of global greenhouse gas emissions (Cassou et al., 2020;Garnett, 2006;Lake et al., 2015). Reducing food loss, therefore, implies improving resource conversation, enhancing food security, increasing access to food, reducing the environmental impact of food systems, and a shift towards a sustainable food system (Thyberg and Tonjes, 2016;Shoji et al., 2022;Onwude et al., 2020). Still, it is not fully understood when or why food loss occurs within hundreds of fresh produce shipments in a supply chain, let alone the best way to reduce such loss. One reason is that each fruit has unique pre-harvest biological properties with which it starts its postharvest journey, depending on its growing conditions and harvest time. These fruit properties result from biochemical and physiological processes, such as pigment synthesis and carbohydrate accumulation influenced by growing conditions (Cronje et al., 2016;Lado et al., 2019). Due to varying growing conditions, the differences in the biological properties of individual fruits at harvest affect the quality of fresh produce in the subsequent stages of the postharvest supply chain, especially at the consumer (Sibomana et al., 2016;Di Vittori et al., 2018;Lufu et al., 2020).
The postharvest supply chain is often characterized by refrigerated storage and transport logistics during the entire journey of fresh produce (Onwude et al., 2020). These refrigerated shipments contain environmental air temperature and humidity sensors. Data from these sensors are often used as the first indicator to map the quality evolution in such shipments. For these, the low temperatures maintained decrease the rate of temperature-driven biochemical degradation reactions, thereby increasing the quality and remaining shelf life of fresh produce. However, every refrigerated shipment encounters a distinct hygrothermal journey. The reasons for this include variability in environmental air temperature and humidity, delays at ports, routing changes, or possibly cooling breakdowns (Mercier et al., 2017). Since the time-varying environmental air temperature and humidity profile is different for every shipment, each shipment has a peculiar food quality evolution. This variability is another reason why there are large variations between individual fruits upon arrival at the retailer; that is to say, some fruits will degrade sooner than others. Excessive decay could lead to the complete discard of the full shipment of fruits or require laborious sorting out of the spoiled products.
Research has been conducted on how biological variability of fresh produce at harvest (Hertog et al., 2009;Joshi et al., 2018;Joshi et al., 2019;Hertog et al., 2004;Duret et al., 2015;Gwanpua et al., 2014;Hertog et al., 2007) and postharvest variability between shipments (Shoji et al., 2022;Joshi et al., 2019) affect the quality of fruits. The authors mostly showed how pre-harvest variability affects cultivar, color, and maturity age of fruits and vegetables. Rarely, the impact of several pre-harvest biological properties of fruits at harvest and hygrothermal differences between shipments are accounted for. Additionally, in most cases, the targeted cold chain scenarios do not reflect the fluctuations in air temperature or the duration of actual transcontinental cold chains and their impact on storage life variability. To the best of our knowledge, information on why and when food loss occurs at the end of the fresh produce supply chain does not exist. To this end, the relevant question is what has the highest impact on the quality of fruits that a consumer receives. It is important to ascertain if (i) the variability in the initial quality at the onset of the cold chain or (ii) the variability in environmental conditions the fruit experiences between different shipments has the most significant impact on the quality of fruits the consumer receives.
To answer these questions, this study aims to quantify the impact of pre-harvest biological variability of fresh fruits at harvest and postharvest variability due to hygrothermal differences between shipments on the end quality evolution of orange fruit. A Markov Chain Monte Carlo (MCMC) sampling method and physics-based mechanistic digital twin were used to unveil the impact of the pre-harvest biological variability of Valencia oranges on fruit quality evolution in a single shipment. A mechanistic digital twin of fruit is a 'virtual' model linked to real-world processes via sensor data, containing all essential product characteristics and simulating relevant hygrothermal and metabolic processes of the fruit (Fig. 1) (Defraeye et al., 2021;Verboven et al., 2020;Ivanov and Dolgui, 2020). MCMC was used to generate a realistic 'virtual' population of orange fruits with different physical and geometrical properties. With this approach, digital twins of 1000 'Valencia' oranges (16 cartons-1/5 pallets of oranges) were developed to simulate ongoing quality evolution in a single shipment. These simulations indicate how much the biological variability in the shipment affects the end quality. The impact of postharvest variability due to hygrothermal differences between shipments on the quality evolution of oranges was then quantified. These simulations indicate how much the specific shipment of fruits affects the end quality. Finally, the impact of pre-and postharvest variability on the quality of oranges at the retail were compared to identify the value chain with the most impact.

Biological properties of fruit at harvest
The study was carried out for 'Valencia' orange over the 2019 season in the Citrusdal production area, Western Cape, South Africa. Except where stated otherwise, all fruit were harvested at commercial harvesting maturity determined by producers. Five fruits per tree were harvested from two trees in five different orchards. A total of fifty oranges were sampled and data were collected for different physicochemical properties.
The external fruit properties, such as fruit weight (g) and rind weight (g), were determined at harvest using an electronic scale (ADW, UWE Scales and Calibrations, Cape Town, South Africa). The fruit size (mm) and rind thickness (mm) were measured using a caliper (CD-6 ′ ' C, Mitutoyo Corp, Tokyo, Japan) after the seven-day shelf-life period. The rind color for each fruit replicate was obtained using the standard CRI color plate (CRI, 2004) for orange, where a visual color score is assigned to each fruit.
The internal properties of the fruit were assessed by cutting along the longitudinal plane of the fruit for juice extraction using a citrus juicer (8-SA10, Sunkist®, Chicago, USA). The total sugar content of the fruit pulp (measured as • Brix and expressed as% TSS in the pulp) was determined using a digital refractometer (PR-32 Palette, ATAGO CO, Tokyo, Japan).

Measurement of air temperature in actual cold chain
The air temperature on the cold chain of oranges was monitored and data from TempTale®4 (TT4) GEO Eagle Extended (SENSITECH, Beverly, MA, USA) sensors with an accuracy of ±0.5 • C were acquired. Data acquired are for 43 shipments from packhouse in Durban, South Africa, to a distribution center (DC) in Western Europe spanning an entire season (August 2018 to September 2019). This data provides realistic temperature data for overseas cold chain. The air temperature sensors were positioned in the second last row of pallets from the door on the left-hand side of the container at half the height of the pallet (see supplementary material for details). The position of the sensors (USDA3) in the pallet (say x) close to the fruits ensured that the air temperature collected represents that sensed by the fruit in pallet x (see supplementary material #1). As such, the digital fruit twin is representative of fruit near the location where the air temperature is sensed. For hygrothermal variability between shipments, each shipment has a unique temperature profile and length. The delivery air temperature, therefore, represents only the fruits in the pallets from the door on the left-hand side of the container for different shipments (pallet x). The targeted Fig. 1. A digital twin of a typical citrus supply chain from the farm to the consumer at the retail level. delivery air temperature for these shipments was between − 1 • C to 0 • C. Note that a sample size of 43 shipments of 'Valencia' oranges was found sufficient for this study using a modified Cochran's formula with 95% confidence level and ±5% precision (Shoji et al., 2022). This was also demonstrated by examining the minimum required sample size that stabilizes the average air temperature value. Specifically, for this study, the minimum amount of dataset with 95% confidence interval for the average air temperature values ( • C) is 5 and above. Details of the estimation procedure are provided in the supplementary material.
To further extend the monitored supply chain from the DC to retail stores, three days of retail sales conditions were assumed and added to the sensor data, with a daily time interval of 3600 s for each shipment. This was simulated stochastically based on the average air temperature ( • C) at several retail locations in Rotterdam, Netherlands, collected via NASA POWER (August 2018 to September 2019). The average air temperature for the locations was for specific retail storage days and times of the month based on each shipment time stamp at DC. In addition, the standard deviation of the measured cold chain air temperature ( • C) was also used for the stochastic simulation.

Markov chain Monte Carlo sampling
A population of 1000 realistic 'virtual' Valencia orange fruits produced in South Africa was generated, which was fed into a physics-based digital twin as input data for the stochastic simulation (see Sections 2.3 and 2.4). To do this, a prior probability distribution of 'Valencia' orange (i.e., the mean and standard deviation of different pre-harvest biological properties from literature) was considered (Goedhals-Gerber and Khumalo, 2020; Goulas and Manganaris, 2012;Bai et al., 2016;Khalid et al., 2012;Rehman et al., 2018). A population of 1000 oranges was generated in this study to reduce the computational time and enable fast post-processing when simulating the digital twins of oranges throughout the transcontinental citrus supply chain. Furthermore, the generated population, representing 1/5 orange pallets (16 cartoons), is already a very large sample size for quality measurement compared with the current standard in the industry. The representative visual and destructive sampling of citrus per consignment throughout the transcontinental supply chain is a minimum of 30 oranges and a maximum of 100 (USDA 2005;Adans, 2020). A field sample dataset of 50 'Valencia' oranges of different pre-harvest biological properties with 10 oranges each from five different orange orchards in Citrusdal, South Africa, was considered. The pre-harvest biological properties of fruit at harvest include fruit size (mm), fruit weight (g), TSS ( • Brix), fruit color (color chart 1-5 scale), rind thickness (mm), rind weight (g), initial quality (%), fruit density (kg m − 3 ), and rind density (kg m − 3 ). The MCMC algorithm was developed using Gibbs sampling with package NMixMCMC in Rstudio software (version 1.4.1106) (Team, 2020). More details about the MCMC method and steps are given in the supplementary material. Fig. 2 illustrates a flowchart of the methodology from data collection to MCMC sampling.

Digital twin configuration
A physics-based mechanistic model based on the finite element method was developed to simulate the quality evolution of 1000 'Valencia' oranges (Citrus sinensis (L.) Osbeck) in a refrigerated container. A single fruit was modeled as a two-dimensional axisymmetric geometry of a sphere (Fig. 3). The domain was divided into two sections of the fruitthe rind (base case thickness (rind thick ) = 5.9 mm) and the fruit pulp (base case radius (r pulp ) = 30.7 mm). The configuration was simplified, and the fruit-fruit interaction was ignored. This is because of the limited thermal interaction with other fruit, due to the few contact points between fruits, and the fact that the surrounding fruit is at a similar temperature. Based on experimental and literature data, the model was calibrated with the same geometrical and material properties as the real fruit (see Section 4, Table 1, and supplementary material). The model was then linked to a sensor and virtual orange data, thus forming digital twins of oranges in a supply chain.

Fig. 2.
A flowchart of the overall methodology implemented from data collection to Markov chain Monte Carlo sampling for a citrus supply chain (NB: SA = South Africa; EU = Europe).

Mechanistic multiphysics model
A mechanistic model was developed to calculate heat transfer inside the orange fruit and its convective exchange with the environment throughout the supply chain from farm to retail. In addition, physicsbased models for predicting mass loss, thermally-driven fruit quality index, pest mortality, and chilling injury were also coupled with the thermal model for heat transfer.

Thermal model
The energy conservation equation was solved in the fruit, with temperature (T, K) as the time-dependent variable (Eq.. (1)).
where ρ i is the density (kg•m − 3 ), C p,i is the specific heat capacity , with the subscript i corresponding to the rind and fruit pulp. Q resp (W•m − 3 ) is the volumetric heat of respiration, which is the product of heat of respiration (W kg − 1 ), multiplied by the pulp density (kg m − 3 ). The heat of respiration was estimated from a correlation between the carbon dioxide production rate of orange and the temperature (Becker et al., 1996) (see supplementary material for details). Thermal equilibrium between all components and phases was assumed in this model. The material properties are given in Table 1.
The convective boundary condition for heat transfer based on flux continuity is presented in Eq.. (2). The conductive flux within the fruit is balanced by the heat flux due to convection and evaporation at the surface.

n.(λ∇T)
where n is the unit vector normal to the surface (-), h c is the convective heat transfer coefficient (W•m − 2 •K − 1 ), T air is the delivery air temperature (K). Here j m is the moisture flux at the surface (kg•m − 2 •s − 1 ) derived from the moisture transport model (Section 2.4.2) and H vap is the latent heat of evaporation (H vap = − 2364.2T+3,147,175.2 J•kg − 1 ) (Ferrua and Singh, 2009). The radiation exchange between different fruit inside the ventilated boxes was considered limited compared with convective heat transfer. This is because of the small temperature difference between adjacent fruit during cooling in actual cold chains. Thus, radiation exchange was not modeled.
Since the airflow field around the fruit was not explicitly modeled, its influence on the hygrothermal behavior of the fruit was accounted for using the convective heat and mass transfer coefficients (h c , k air ). A representative airspeed in the porous stack of products, namely a pallet of orange fruit, was estimated based on the airflow rate inside a refrigerated container (Wu and Defraeye, 2018). Eq. (3) was employed to estimate the physical airspeed in the porous medium (Dehghannya et al., 2010)(see supplementary material for details). This physical airspeed is the actual speed around the orange fruit, which was 0.11 m•s − 1 in the present study.
where u physical , (m•s − 1 ) is the actual airspeed around the orange fruit, u superficial , (m•s − 1 ) is the superficial airspeed, ϕ (%) is the porosity in a pallet of orange, A cross , (m 2 ) is the cross-sectional area of the cargo bottom, and Q air (m 3 •h − 1 ) is the delivery air flow rate. In this study, spatially-constant heat transfer coefficient (h c , W•m − 2 •K − 1 ) around the fruit in a container was assumed over the entire fruit surface, as a simplified representation even though h c could spatially be distributed over the fruit surface (Tagliavini et al., 2019).

Table 1
Base case input and thermal parameters of orange fruit. Note: airspeed is accounted for using the Nusselt number (Nu) correlation for flow around a single sphere presented in Eq. (4) (Whitaker, 1972).
is the dimensionless Reynolds number as a function of the air speed (-), Pr is the Prandtl number for air (-), and ν air is the kinematic viscosity of air (m 2 s − 1 ). µ air and µ air, wall correspond respectively to the absolute viscosity of air (kg m − 1 s − 1 ) and the viscosity of air at the wall (kg m − 1 s − 1 ), which were considered to be equal in this study. The base case h c value for this study is 4.6 (W•m − 2 •K − 1 ).

Mass loss model
Mass loss, also called moisture loss, is a crucial metric in the cold citrus chains because it directly influences market value. As citrus is a product sold by weight, a loss in saleable weight implies a direct loss in profits. The mass flux (j m ), or moisture flux at the surface of the fruit (kg•m − 2 •s − 1 ), was calculated as the product of the convective mass transfer coefficient (k t ), and the difference between surface vapor pressure (P v,rind , Pa) and ambient vapor pressure (P v,air , Pa) Eq. (5)). These vapor pressures were estimated based on the surface temperature, ambient temperature, and ambient relative humidity (Eqs. (8)-(10). The mass transfer coefficient, k t was determined from the contribution of the resistance due to moisture migration through the rind (k rind , s•m − 1 ) and the resistance to mass transfer due to the air boundary layer (k air s•m − 1 ) (Eq. (6)) (Becker et al., 1996;Cogné et al., 2013).
The air film mass transfer coefficient (k air ) was estimated based on the airspeed (u air , m•s − 1 ) using the Sherwood correlation for a sphere, as presented in Eq. (7) (Becker et al., 1996;ASHRAE 2018).
where D fruit is the diameter of the citrus fruit (m), Re (=D fruit •u air •ν air − 1 ) is the dimensionless Reynolds number as a function of the airspeed. Here, ν air corresponds to the kinematic viscosity of air [m 2 s − 1 ], Sc (=ν air • δ wv, air − 1 ) is the Schmidt number (-), δ wv,air is the diffusion coefficient of water vapor in the air [m 2 s − 1 ]. The base case Sherwood number (Sh) for this study is 15.56 (see supplementary material for more explanation).
The vapor pressure was dynamically linked with temperature using the Antoine equation (Becker et al., 1996), expressed in Equation 8: Specifically, the vapor pressure just below the rind (P v,rind ) was computed using Eq. (9).
where a w is the water activity below the fruit surface [-] (Table 1). While the vapor pressure of the air around the fruit (P v,air , Pa) was estimated using the relative humidity of the air (RH air ,%), as shown in Eq. (10).
The relative humidity of the air within the refrigerated container (RH air ,%) was estimated using the principles of psychrometry by assuming a constant sensible heat ratio (SHR) within the refrigerated container. Details of the calculation steps are included in the supplementary material.
where A s is the surface area of the fruit (m 2 ) computed from the geometry and m ini is the initial mass of the orange (kg) which was measured as 0.21 kg. Note that the threshold for citrus moisture loss during shipment is between 7 -10% (Ladaniya, 2008).

Thermally-driven model for fruit quality attributes
The quality of the fruit, which often determines consumer acceptability, is affected by temperature conditions during the cold chain from farm to retail. Most of the temperature-induced underlying biochemical reactions responsible for quality changes of oranges can be adequately modeled. The evolution of multiple quality attributes such as total soluble solids (TSS [ • Brix]), color (Col [scale 1-5]), and fruit quality index [%] of the orange fruit can thereby be predicted.
For this purpose, kinetic rate law models were implemented in order to quantify the change with shipment time for each of the abovementioned specific quality attributes A i (Robertson, 2016;Van Boekel, 2008), as in Equation 12: where the subscript i indicates the specific attribute, k i is the corresponding rate constant (s − 1 ), and n i is the order of the reaction (-), which depends on the attribute's decay kinetics. The order of the reaction was chosen based on the best fit of the model with the data or the inherent order of the decay reaction (e.g., TSS is a first-order reaction, Table 2). However, little differences were often present between first-and zeroorder approximations. In Eq. (12), for a constant value of k, i.e. at a constant temperature, the quality attribute decreases linearly over time (for zero-order reactions, where indeed the magnitude of the slope equals to k, Eq. (13)), or shows an exponential decrease (for first-order reactions, Equation14): where A 0,i is the quality attribute at the start of the cooling process (t = 0 s) for a specific attribute i and C i is an integration constant. However, in reality, the rate constant k i is not constant, and so Eq. (12) needs to be explicitly solved over time. The temperature dependency of the quality attribute was therefore incorporated into the rate constant through an Arrhenius relationship (Robertson, 2016)

as in Equation 15
: where k 0,i is a constant (s − 1 ), E a,i is the activation energy (J mol − 1 ), R is the ideal gas constant (8.314 J mol − 1 K − 1 ), and T is the absolute temperature (K). For fruit quality index, color and total soluble solids of degree fruits, the constants k 0,i and E a,i , were calibrated from quality attribute data as a function of time at (at least) two different temperatures. More details are available in the supplementary material. The fruit quality index (I,%), which is linked to the remaining shelf life at the retail, serves as a general indicator of the marketability of the fruit. The quality threshold value of 10% was assumed as a point where the product has not lost its quality completely, but first visual damage occurs, below which the product is not acceptable anymore to the consumer. This quality metric was modeled using a first-order kinetic model, which quantifies the respiration-driven, temperature-dependent change in overall quality from the point of harvest until the point where the fruit is considered to be lost (Valentas et al., 1997). Details of the model calibration are presented in the supplementary material.
The remaining days of shelf life (RSL) for a shipment were predicted based on the same kinetic rate model, by storing the fruit in-silico at retail air condition of 20 • C. Here, typical dynamic conditions encountered during retail were also considered. RSL was computed until the remaining quality of the respiration-driven quality indicator (I) attained the threshold of acceptable quality (≥10%). The base case quality parameter values used for the physics-based simulation of orange fruit are presented in Table 2.

Lethality model for pest mortality
The efficacy of the cold disinfestation treatment against Mediterranean fruit fly (MFF) was modeled based on a lethality model (Zaragozá, 2019). This was done based on knowledge of the time-temperature history of the fruit at its most critical location, with the highest temperature during cooling. For citrus fruit, the most critical location corresponds to the core of the fruit. This model, described in the supplementary material, was calibrated based on the death kinetics of Mediterranean fruit fly (Animal and Plant Health Inspection Service 2014).

Thermal damage model for chilling injury
Chilling injury is a physiological disorder caused by suboptimal low storage temperature beyond a threshold duration that alters the tissues in the rind, leading to symptoms such as peel pitting or discolorations that render the fruit unmarketable (Biswas et al., 2016;Chalutz et al., 1985). The incidence of chilling injury on the surface of the fruit was computed similarly to the thermal damage model for the human skin during skin burn (Moritz and Henriques, Sep. 1947). The model, described in the supplementary material, quantifies thermal damage as a dimensionless damage integral (Ω) based on the combined effect of rind temperature (T rind, ) (K) and exposure time (t).

Numerical simulation
The physics-based digital twin was implemented in COMSOL Multiphysics (version 5.6), which is a finite element-based commercial software. The transient conductive heat transfer and thermal damage model in the fruit during convective air cooling was solved using the 'Bioheat Transfer' physics. Ordinary Differential Equations' and 'Differential Algebraic Equations' interfaces were used to solve for moisture transport, total soluble solids, color, fruit quality index, and mortality of fruit fly. Quadratic Lagrange elements were used together with a fullycoupled direct solver, relying on the MUMPS (MUltifrontal Massively Parallel sparse direct Solver) solver scheme. The solver tolerance was set to 10 − 5 based on sensitivity analysis. Adaptive time-stepping based on the Backward Differentiation Formula (BDF) was used for the simulation, with the maximum step set to automatic to maintain the desired relative tolerance. A grid sensitivity analysis was conducted to ensure that the results were grid-independent (see supplementary material for details). The grid consisted of triangular and quadrilateral finite elements, with a total element size of 6504. To stochastically simulate the quality evolution of 1000 'Valencia' oranges via the digital twins, a parametric sweep was performed over the wide range of pre-harvest biological properties of fruit generated from MCMC. The parametric sweep feature in COMSOL Multiphysics® runs calculations for several parameter cases in a single instance. Note that the various physical models of the digital twin (i.e., from Section 2.4.1 -2.4.5) are individually validated with experimental data or were empirically calibrated with experimental data on real fruit (as detailed in supplementary material). As there is no interaction between the sub-models and they have also been validated, the full digital twin can be considered validated . However, due to seasonal differences between the oranges in this study and those used for the calibration, there could be some differences between our validated digital twin and actual fruit.

Statistical data and sensitivity analysis
The actionable quality metrics from the digital twins were analyzed and presented as median (center line), 75 th upper and 25th lower quartiles (box limits) and 1.5 × the interquartile range (IQR, whiskers) with a 0.95 confidence level. Levene's test at p ≤ 0.05 significant level and 95% confidence interval was also used to assess the equality of variances at farm level, port in South Africa (SA), port in Europe (EU), and retail storage within a single shipment.
The combined pre-harvest biological and postharvest variability assessment was presented based on mean values of the different quality metrics at retail. A fitted probability distribution function and rug plot were applied to visually determine the statistical differences in quality evolution due to pre and postharvest variability at the end of the supply chain. Additionally, a two-sample t-test, assuming equal variances at p ≤ 0.05 significant level, was used to compare the mean significant difference of the quality evolutions.
Sensitivity analysis was carried out to assess the impact of each preharvest biological property on the fruit quality variability at the end of the cold chain (see supplementary material for details). This is important to identify the cultural practices that have the highest impact on different quality metrics as a basis for future study. The descriptive test, probability distribution function and sensitivity analysis were all conducted using ORIGIN 2020b (Government) (OriginLab, Northampton, Massachusetts, USA) and Microsoft Excel (2016).

Markov chain Monte Carlo (MCMC) analysis
To assess the variability of the oranges due to pre-harvest conditions at harvest, MCMC sampling method was first employed. MCMC used measured data to generate a statistically-representative set of fruit properties for individual fruits. MCMC was used to generate a realistic virtual population of orange fruit used for stochastic simulations via digital twins. Fig. 4 shows the correlation matrix between the different parameters of the sample data and also the generated virtual population. The point estimates, range, mean and standard deviation of the posterior distribution of the variance components were calculated from the MCMC samples. By comparing the correlation coefficient between parameters  [-], n = Order of reaction [-], E a = Activation energy [J mol − 1 ], k 0 = Pre-exponential factor [s − 1 ], Q 10 = Ratio of the rate constants at temperatures T and T + 10 K (=kT+10/kT) [-], k 0,quality = Pre-exponential factor for fruit quality index [s − 1 ], and E a,quality = Activation energy for fruit quality index [J.mol − 1 ].
of sample data and the generated virtual oranges, MCMC successfully captures the realistic relationship between the different sample data parameters and the corresponding parameters of the generated virtual oranges. These parameters are fruit size (mm), fruit weight (g), TSS ( • Brix), fruit color (color chart 1-5 scale), rind thickness (mm) and rind weight (g). MCMC has previously been used to generate clusters for genetic variability and trait correlation of fruits and vegetables (Kim et al., 2021;Mora et al., 2019, Kato)

Impact of pre-harvest biological variability on fruit quality evolution
The impact of pre-harvest biological variability on fruit quality evolution was evaluated. Therefore, 1000 orange fruits that are transported in a single shipment (i.e., − 1 • C targeted delivery cold air temperature and shipment length of 30 days) were simulated. For a real shipment of citrus fruit in a refrigerated container, typically about 100 ′ 000 orange fruit are shipped in 20 pallets. Although only 1% of this shipment was covered, this sample size is already much larger than what is experimentally monitored during quality control and a statistically relevant sample size of the shipment. Mass loss (%), fruit quality index (%), remaining shelf life (RSL) (days), chilling injury severity (%), MFF mortality (%), total soluble solids (TSS) ( o Brix) and color (1-5 scale) were quantified. Fig. 5 shows box plots of the different quality evolution at different supply chain stages from South Africa (SA) to retail shops in Europe (EU) using the digital twin. Fig. 5A shows the variability in fruit mass loss from farm to retail due to pre-harvest biological variability. The mass loss increased across the postharvest supply chain and the variability between individual fruit increased slightly (standard deviation (STDev) up to 1% and relative STDev from 21% to 19%) as the shipment progressed further throughout the chain. Indeed, the pre-harvest biological variability of oranges results in a variation in fruit transpiration, which is directly correlated with mass loss (Holcroft, 2015; J.F. Thompson et al., 2008). The largest mass loss occurs at retail (2% of initial fruit weight). The high amount of mass loss at retail is due to higher storage temperature, decreased humidity, and increased respiration heat production, as this is temperature dependent. Similar findings on increased respiration rates of fruits at higher temperatures have been reported (Joshi et al., 2018;Defraeye et al., 2019). The impact of pre-harvest biological variability of orange fruits on mass loss is largest at retail. Due to inherent biological variability in fruit properties after harvest, the mass loss of a shipment shows a variability up to 1.2% between individual fruit. This is substantial given that the average mass loss of the entire shipment is 3.4% upon arrival at the retailer. Fig. 5B shows the impact of pre-harvest biological variability on the temperature-driven fruit quality index evolution of orange fruit. The fruit quality index and variability of the fruit decrease across the postharvest supply chain, with STDev ranging from 9% (farm level) to 4% (at retail). This means that the impact of pre-harvest biological variability on the fruit quality index of oranges decreases across the postharvest supply chain. The impact of pre-harvest variability of oranges is lowest at retail storage compared to the other supply chain stages. However, this impact is still high, with a fruit quality variability between individual fruits of the same shipment at retail of up to ~5%. This means that ~20% of the entire shipment upon arrival at retail contains fruits of different fruit quality, as the average fruit quality index is about 30%. This implication can be further seen in the remaining shelf life days between individual fruits at retail.
Using the quality upon arrival at the retailer, the remaining shelf life of each of the 1000 fruits in the shipment after arrival at the retailer was quantified. The average remaining shelf life of the shipment was 6 days, and the standard deviation was 1.4 days. The min and max values were 1.2 and 9.4 days, respectively. As such, variability in the remaining shelf life of several days was found between individual fruit.   1-5 scale). The fruit quality of 80% was assumed when leaving the packhouse calibrated based on measured quality data. The boxplots within represent the median (center line), 75 th upper and 25 th lower quartiles (box limits) and 1.5 × the interquartile range (whiskers) (IQR). Letter N at the top of the plot indicates the number of samples. Significant differences between different stages along the supply chain, namely, farm, port-SA, port-EU, arrival at retail and end of retail storage, were determined using the Levene's test of equal variances and are indicated with letters from a to d for statistical significantly different groups at p ≤ 0.05. The red-colored horizontal line in the plots signifies the threshold value for the different quality metrics. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.) Next, the influence of pre-harvest biological variability on the chilling injury severity of oranges within a shipment was assessed (Fig. 5C). The chilling injury of oranges increased from 0% to 11.5% across the postharvest supply chain. Very high chilling injury occurred at the end of refrigerated shipment and during storage at retail, with values above the severe chilling injury threshold. A very low variation (<0.5%) in chilling injury was observed for the entire supply chain. This low variability is due to the minimal impact of fruit density and size (see sensitivity analysis in supplementary material). This implies that the pre-harvest biological variability does not have much impact on the chilling injury of oranges during shipment. Rather, the chilling injury was mainly driven by cold chain practices (e.g., low air temperature) (see Section 3.3). Nonetheless, there is a significant difference between the variability at the beginning of the postharvest journey (0%) and the end of the retail storage (0.3%) (p ≤ 0.05). Indeed, fruit sensitivity to low temperatures is also influenced by a variety of other pre-harvest biological factors such as biological age, harvesting time, production area, production season, pre-harvest temperature and humidity, and other cultural practices (Di Vittori et al., 2018;Siboza et al., 2017;Musacchi and Serra, 2018;Kashash et al., 2016). All these pre-harvest factors were not considered in this study due to insufficient field data. Therefore, it was not possible to assess their actual impact on chilling injury.
The impact of pre-harvest biological variability on MFF mortality of oranges for a single shipment is presented in Fig. 5D. In accordance with protocols dictated by citrus import regulations, maintaining a low delivery air temperature is essential to keep the fruit core temperature at or below 2 • C for 16.7 days or at 3 • C for 18 days. Thereby, at least 99.9968% Mediterranean fruit fly (MFF) mortality can be achieved (National Department of Agriculture 2018; De Lima et al., 2007;Grout et al., 2011). In this study, the average MFF mortality increased from 0% to 100% across the postharvest supply chain, whereas the variability is less than 0.1%. This implies that the pre-harvest biological variability does not impact MFF mortality at the end of the supply chain. That is, the entire shipment upon arrival at the retailer is devoid of pest infestation. This also means that there are not many differences in core temperature between fruits of different sizes, once they are properly cooled down.
The impact of pre-harvest variability on total soluble solids (TSS) (Fig. 5E) and color (Fig. 5F) evolution of oranges during shipment was evaluated. There was no significant change in the TSS (Fig. 5E) and color (Fig. 5F) across the postharvest supply chain. This shows that the preharvest biological variability has an equal impact on TSS and color of degreened oranges across the entire supply chain. This is expected because 'Valencia' orange is a non-climacteric fruit and, as such, does not increase in TSS, or show a change in color at low (>4 • C) shipping temperatures. Since TSS and color do not change for degreened citrus fruit after harvest, the variability between the different fruit after harvest will be the one that quality control experts upon arrival at the retailer will observe.

Impact of postharvest variability on fruit quality evolution
The impact of postharvest variability due to hygrothermal differences between shipments was quantified. Therefore, a single fruit going through 43 different shipments was simulated. At the end of the supply chain, the mass loss, fruit quality index (FQI), remaining shelf life (RSL) days, chilling injury and Mediterranean fruit fly (MFF) mortality of oranges were quantified via digital twin. The total soluble solids (TSS) and fruit color were not considered as they remain constant through the supply chain.
Before quantifying these metrics, the shipments were first analyzed. Fig. 6 shows the time-varying air temperature profile as input for the physics-based simulations. The shipments showed a considera ble variation in delivery air temperature and length of time (Fig. 6A), which spanned between 20 and 55 days (Fig. 6B), with more than 35% of the shipments above the median of 31 days (Fig. 6A, B). Similarly, the analysis of the delivery air temperature in a single shipment shows a variation above 2 • C for more than 30% of the shipments (Fig. 6C). The cumulative consequence of these factors resulted in a unique cooling history for each shipment. Similar variability in the air temperature and shipment time have been reported for citrus supply chain .
The impact of hygrothermal variability between shipments on the mass loss evolution of 'Valencia' oranges during their postharvest journey was quantified (Fig. 7A). The mass loss increased from 0% to 5.6% over shipment time, depending on the shipment air temperature (Fig. 7A(i)). This implies that delivery air temperature fluctuations have a major effect on mass loss, as also reported by Joshi et al. (Joshi et al., 2019). The further increase in mass loss after overseas shipment (from distribution center to retail stores) is due to the relatively high temperature and low prevailing ambient relative humidity at retail. Similar findings have been reported for different fruits and vegetables (Joshi et al., 2019;Habibi et al., 2021;Amwoka et al., 2021;Kelly et al., 2019). Fig. 7A(ii) shows the large variability of mass loss (2 -6%) between different shipments at retail, with more than 60% of the shipments having mass loss above 3%. This is significant as the average mass loss of all shipments is 3.3% upon arrival at retail. The impact of hygrothermal variability between shipments on the fruit quality index of 'Valencia' oranges during their postharvest journey was also analyzed (Fig. 7B). The fruit quality index decreased with shipment time for all shipments (Fig. 7B(i)). The fruit quality index between different shipments at retail varies between 20 and 43% ( Fig. 7B(ii)). The very high variability of over 20% signifies that more than 30% of the shipments contain fruits of significantly different quality upon arrival at the retail. This remarkable insight is echoed in the remaining shelf life days between fruits of different shipments at retail. The average remaining shelf life of the shipment was 6 days, and the standard deviation was 2.4 days. The min and max values were 0.0 and 8.9 days, respectively. This means that some shipments arrived at the retail with oranges that must be consumed immediately or within 9 days to avoid losses. Shrivastava et al. (Shrivastava et al., 2022) reported similar findings on the effect of the variability between shipments on fruit quality and remaining shelf life.
The impact of postharvest variability on the chilling injury of oranges during shipment is shown in Fig. 8A. Chilling injury severity increased from 0 to ~17% with increasing shipment time for different shipments (Fig. 8A (i)). The difference in chilling injury severity between shipments during transit is up to 10%. The temperature fluctuation or deviation from the target air temperature (− 1 • C or 0 • C) is responsible for the large variation in the chilling injury between different shipments. Increasing the temperatures above a threshold chilling inducing temperature (>4 • C) could increase the fruit tolerance due to gradual conditioning. Fig. 8A (ii) shows that the mean chilling injury at retail is 5%, and the variability between shipments is ~5%. This means that all shipments contain fruits with different levels of chilling injury severity upon arrival at retail. In fact, ~20% of the shipments contain fruits with chilling injury above the severe commercial threshold at retail.
The impact of postharvest variability on the evolution of MFF mortality during shipment was also evaluated (Fig. 8B). The mean effective lethality at 2 • C for all the shipments was found to be 14 days, which is less than the targeted 16.7 days (Fig. 8B (i)). This means that all shipments attained MFF mortality of more than 99.9968% during transit, which is above the required Probit-9 treatment recommended by The initial fruit quality of 80% was assumed when leaving the packhouse calibrated based on measured quality data. The boxplots represent the median (center line), 75 th upper and 25 th lower quartiles (box limits) and 1.5 × the interquartile range (IQR, whiskers). The mean and median connecting lines are also included. SC = supply chain. The dotted horizontal line in plots (i) signifies the threshold value for the different quality metrics. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.) phytosanitary protocols (FAO and IPPC 2017). Thus, variability between shipments does not have an impact on MFF mortality. This can be further observed in Fig. 8B(ii), with 41 shipments having 100% MFF mortality at the end of the supply chain.

Comparison between the impact of pre-and postharvest variability on the quality evolution of orange fruit
The variability in end quality upon arrival due to (1) pre-harvest biological variability in fruit properties and (2) postharvest variability in hygrothermal conditions in the supply chain of 'Valencia' orange fruits were compared. A probability distribution function and rug plot were used, as shown in Fig. 9. Only actionable metrics of which both preharvest and postharvest variability have a significant impact during shipment were considered. In Fig. 9A, it can be seen that the variability in the mass loss at the end of the supply chain is caused, to a similar extent, by the inherent variability in fruit properties after harvest and by the variability in hygrothermal storage conditions during transit. Both pre-harvest and postharvest variability significantly impact the fruit quality index and remaining shelf life days at retail (Figs. 8B and 8C). Nevertheless, postharvest variability induces a slightly larger spread in the final quality and remaining shelf life days.

Conclusions
This study used digital twins to unveil how pre-harvest biological variability and postharvest variability due to hygrothermal differences between shipments affect the quality of citrus that gets to the consumer. For the pre-harvest biological variability, 1000 virtual oranges of different physiochemical properties at harvest were generated using the Markov chain Monte Carlo method based on 50 oranges harvested from 10 trees. The 'virtual' oranges were then fed into a physics-based digital twin of the citrus supply chain from farms in Citrusdal, South Africa, to retail shops in Europe. Digital twins of 43 different orange shipments from farms in Citrusdal, South Africa, to retail shops in Europe were also developed. The output of these twins was used to quantify the impact of postharvest variability on the quality of oranges that gets to the consumer. These digital twins were coupled with the real-world environmental conditions via measured air temperature sensor data. The key conclusions derived from this study are as follows: • For a single shipment, variability in the mass loss between individual fruit at retail of up to 1.2% was observed. Results also showed that about 20% of the fruits in a shipment upon arrival at retail are significantly different in fruit quality. The variability in the remaining shelf life of several days exists between individual fruits. This means that the fruits the consumers buy could last for different days, which is a challenge for the retailers to further ensure consumer satisfaction. Our findings also show that a single shipment upon arrival at the retail is without pest infestation. • Concerning multiple shipments, more than 90% of shipments have high varying fruit mass loss upon arrival at retail. More than 30% of these shipments contain fruits of significantly different fruit quality at retail. The remaining shelf life of the fruits the consumer buys from the retailer differs by up to 3 days. This complicates supply and demand in the citrus supply chain. Our findings also showed that about 20% of shipments contain fruits with chilling injury above the severe commercial threshold at retail. This means that the retailer could throw away 20% of the fruits they receive, which also translates into a significant loss in income. • Both pre-harvest (STDev = 0.65) and postharvest variability in hygrothermal conditions (STDev = 0.74) causes high varying mass loss in oranges upon arrival at retail. Compared to pre-harvest biological variability in fruit properties (STDev = 3.94), the postharvest variability (STDEV = 6.05) resulted in more oranges with significantly different quality at retail. The postharvest variability leads to slightly more variations in the remaining shelf life (3 days) of oranges at retail compared to pre-harvest variability (2 days).
This simulation-based research has addressed a key issue in postharvest supply chains: Where does the variability in end quality upon arrival, which many stakeholders regularly observe, come from? This study unveiled the extent to which biological variability after harvest between different fruit and the variability in hygrothermal storage conditions between different shipments causes non-uniform end quality. With such valuable insight, different stakeholders in the citrus industry could make planning decisions that will reduce the impact of variability in pre-harvest cultural practices and hygrothermal differences between remaining shelf life (days) at the end of supply chain using a probability distribution function and rug plot. The rug plot shows the actual data set used for the STDev = standard deviation between groups. Two sample t-test with equal variance assumed at p ≤ 0.05 significant level was used to compare the mean significant difference of quality evolution due to pre-harvest biological variability and cold chain variability at the end of the supply chain. They are indicated with letters a and b for statistically significantly different groups and ns for not statistically significant groups.
shipments that drives food loss. Trade-offs between pre-harvest practices and supply chain logistics could also be established. Here, necessary control measures at pre-harvest and harvest stages could be targeted in order to minimize the emphasis on postharvest handling techniques, thereby improving profitability in the citrus industry. In this way, digital twins can help improve the current transcontinental citrus cold chain to reduce food losses, thereby improving resource conservation and food security, and increasing supply chain sustainability. A limitation of this study is the calibration of some models based on literature data for specific cultivars. As already observed, the unique physiological history of fruits means that their physicochemical properties change every harvest year, or location, so the model needs repeated recalibration. Additionally, the impact of pre-harvest biological fruit variability and hygrothermal variability were accessed individually, not altogether. This means the simultaneous impact of the variability between fruits of different pallets for different shipments was not evaluated. Looking ahead, the digital twin model should be enhanced to include a physics-based plant growth model to pinpoint exactly when and what cultural pre-harvest practices could drive food loss in a future production season. Increased monitoring of the entire postharvest life of citrus in a refrigerated container with more environmental parameters, such as relative humidity, should be a future critical step. A real-time digital twin of a full shipment of cirtus fruit in a refrigerated container, driven by real-time-measured air temperature and humidity data, should also be a future goal.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability
Data will be made available on request.