A new standard model for milk yield in dairy cows based on udder physiology at the milking-session level

Milk production in dairy cow udders is a complex and dynamic physiological process that has resisted explanatory modelling thus far. The current standard model, Wood’s model, is empirical in nature, represents yield in daily terms, and was published in 1967. Here, we have developed a dynamic and integrated explanatory model that describes milk yield at the scale of the milking session. Our approach allowed us to formally represent and mathematically relate biological features of known relevance while accounting for stochasticity and conditional elements in the form of explicit hypotheses, which could then be tested and validated using real-life data. Using an explanatory mathematical and biological model to explore a physiological process and pinpoint potential problems (i.e., “problem finding”), it is possible to filter out unimportant variables that can be ignored, retaining only those essential to generating the most realistic model possible. Such modelling efforts are multidisciplinary by necessity. It is also helpful downstream because model results can be compared with observed data, via parameter estimation using maximum likelihood and statistical testing using model residuals. The process in its entirety yields a coherent, robust, and thus repeatable, model.


Supplementary
. Session-specific milk yield from the simulation and the model. a,b, simulated yield (n=1,000; black line), observed yield (green line), and model-estimated yield±95% CI (solid and dashed red lines, respectively) for the morning (a) and evening (b) milking sessions. c, plot of the total session-specific variance in milk yield found using the model versus the simulation (dashed red best-fit line). Figure 3. Daily milk yield from the simulation and the model. a, simulated yield [n=1000] (black line), observed yield (green line), and model-estimated yield±95 CI (solid and dashed red lines, respectively). b, plot of mean daily milk yield found using the model versus the simulation (dashed red best-fit line). c, plot of the total daily variance in milk yield found using the model versus the simulation (dashed red best-fit line).

Supplementary Figure 5. Observed versus estimated milk yield full days when precise interval duration was unknown.
Observed yield is represented by a solid black line. Modelestimated yield and the 95% CI are represented by green lines (solid and dashed, respectively). The estimated yield from Wood's model and the 95% CI are represented by red lines (solid and dashed, respectively). Figure 6. Image illustrating the simulation used to estimate parameter influence in the milk-yield model.

Note: "Parameter estimation using the nlm function in R."
The R function nlm carries out the non-linear minimisation of target functions, and we used it to estimate all the model parameters at once. Based on the output, we can see that this process seemed to run smoothly.
1 -The output code was 1, which, according to the nlm documentation, is the best value you can obtain.
2 -The number of iterations was 78, which is not very high, given that we set the maximum number of iterations to 400 (via interlim); the default value is 100.
It is perhaps worth noting that nlm uses a Gauss-Newton-type algorithm, which is not necessarily optimal in complex non-linear situations. On the one hand, we would have preferred a Gauss-Marquart-type algorithm, which is more optimised for cases such as ours and more rapidly converges towards a solution (i.e., requires far fewer iterations). However, no such option exists for nlm. On the other hand, we did want to use a general non-linear mimisation function such as nlm to demonstrate that parameter estimation can be efficacious and robust when commonly available procedures are used.
3 -In nlm, numerical derivatives are used for the gradient vector and the Hessian matrix. Algorithm speed increases when the vector and the matrix are explicitly calculated. The process yields a Hessian matrix whose inverse is easy to obtain using the solve function. This shows that all the parameters could be estimated. From the Hessian matrix, we obtained a variance-covariance matrix for all of the parameters; the main diagonal was the linear vector of the variances.
The fact that we obtained reasonable and coherent values for these variance estimators is another sign that parameter estimation ran smoothly. We can thus feel confident that all of the parameters could be estimated.