Using a Bayesian Hierarchical Linear Mixing Model to Estimate Botanical Mixtures

Journal of Agricultural, Biological and Environmental Statistics

Abstract

In grazing systems, estimating the dietary choices of animals is challenging but can be achieved using plant-wax markers, natural compounds that provide a signature of individual plants. If sufficiently distinct, these signatures can be used to characterize the makeup of a botanical mixture or diet. Bayesian hierarchical models for linear unmixing (BHLU) have been widely used for hyperspectral image analysis and geochemistry, but not for diet mixtures. The aim of this study was to assess the efficiency of BHLU in estimating botanical mixtures. Plant-wax marker concentrations from eight forages found in Nebraska were used to simulate combinations of two, three, five and eight species. In addition, actual forage mixtures were constructed in the laboratory and evaluated. Analyses were performed using BHLU with two prior choices for forage proportions (uniform and Gaussian) and two covariance structures (independent and correlated markers), as well as stable isotope mixing models (SIMM) and nonnegative least squares (NNLS). Accounting for correlations between markers increased efficiency. Estimation error increased when Gaussian priors were used to model forage proportions. Performance of BHLU, SIMM, and NNLS was reduced with the more complex botanical mixtures and the limited number of markers. For simple mixtures, BHLU is a reliable alternative to NNLS for estimation of forage proportions. Supplementary materials accompanying this paper appear online.

References

  • Aitchison, J. (2003), The Statistical Analysis of Compositional Data, Chapman & Hall, London, UK.

  • Arthur, P. F., Archer, J. A., and Herd, R. M. (2004), “Feed intake and efficiency in beef cattle: overview of recent Australian research and challenges for the future,” Australian Journal of Experimental Agriculture, 44, 361–369.

  • Bouriga, M., and Féron, O. (2013), “Estimation of covariance matrices based on hierarchical inverse-Wishart priors,” Journal of Statistical Planning and Inference, 143, 795–808.

  • Brewer, M. J., Filipe, J. A. N., Elston, D. A., Dawson, L. A., Mayes, R. W., Soulsby, C., and Dunn, S. M. (2005), “A hierarchical model for compositional data analysis,” Journal of Agricultural, Biological, and Environmental Statistics, 10, 19–34.

  • Cottle, D. L. (2013), “The trials and tribulations of estimating the pasture intake of grazing animals,” Animal Production Science, 53, 1209–1220.

  • Dobigeon, N., and Tourneret, J.-Y. (2007), “Truncated multivariate Gaussian on a Simplex,” Technical Report, IRIT/ENSEEIHT/TeSA.

  • Dobigeon, N., Tourneret, J.-Y., and Hero, A. O. III (2008), “Bayesian linear unmixing of hyperspectral image analysis corrupted by colored Gaussian noise with unknown covariance matrix,” in IEEE International Conference on Acoustics, Speech and Signal Processing, 3433–3436, Las Vegas, NV, USA.

  • Dove, H., and Mayes, R. W. (2006), “Protocol for the analysis of n-alkanes and other plant-wax compounds and for their use as markers for quantifying the nutrient supply of large mammalian herbivores,” Nature Protocols, 1, 1680–1697.

  • Dove, H., and Moore, A. D. (1995), “Using a least squares optimization procedure to estimate botanical composition based on the alkanes of plant cuticular wax,” Australian Journal of Agricultural Research, 46, 1535–1544.

  • Eches, O., Dobigeon, N., Mailhes, C., and Tourneret, J.-Y. (2010), “Bayesian estimation of linear mixtures using the normal compositional model. Application to hyperspectral imagery,” IEEE Transactions on Image Processing, 19, 1403–1413.

  • Eddelbuettel, D., and Sanderson, C. (2014), “RcppArmadillo: Accelerating R with high-performance C++ linear algebra,” Computational Statistics and Data Analysis, 71, 1054–1063.

  • Egozcue, J., Pawlowsky-Glahn, V., Mateu-Figueras, G., and Barceló-Vidal, C. (2003), “Isometric log-ratio transformations for compositional data analysis,” Mathematical Geology, 35, 279–300.

  • Karsten, H. D., and Carlassare, M. (2002), “Describing the botanical composition of a mixed species northeastern U.S. pasture rotationally grazed by cattle,” Crop Science, 42, 882–889.

  • Kazianka, H., Mulyk, M., and Pilz, J. (2011), “A Bayesian approach to estimating linear mixtures with unknown covariance structure,” Journal of Applied Statistics, 38, 1801–1817.

  • Lawson, C. L., and Hanson, R. J. (1974), Solving least squares problems, Society for Industrial and Applied Mathematics, New Jersey, USA.

  • Lewis, R. M., Vargas Jurado, N., Hamilton, H. C., and Volesky, J. D. (2016), “Are plant waxes reliable dietary markers for cattle grazing western rangelands?” Journal of Animal Science, 94(S6), 93–102.

  • Montaño-Bermudez, M., Nielsen, M. K., and Deutscher, G. H. (1990), “Energy requirements for maintenance of crossbred beef cattle with different genetic potential for milk,” Journal of Animal Science, 68, 2279–2288.

  • Moore, J. W., and Semmens, B. X. (2008), “Incorporating uncertainty and prior information into stable isotope mixing models,” Ecology Letters, 11, 470–480.

  • Mullen, K. M., and van Stokkum, I. H. M. (2012), nnls: The Lawson-Hanson algorithm for non-negative least squares (NNLS). R package version 1.4. Available at https://CRAN.R-project.org/package=nnls.

  • Newman, J. A., Thompson, W. A., Penning, P. D., and Mayes, R. W. (1995), “Least-squares estimation of diet composition from n-alkanes in herbage and faeces using matrix mathematics,” Australian Journal of Agricultural Research, 46, 793–805.

  • Palarea-Albaladejo, J., and Martín-Fernández, J. A. (2008), “A modified EM alr-algorithm for replacing rounded zeros in compositional data sets,” Computers & Geosciences, 34, 902–917.

  • Parnell, A. C., Phillips, D. L., Bearhop, S., Semmens, B. X., Ward, E. J., Moore, J. W., Jackson, A. L., Grey, J., Kelly, D. J., and Inger, R. (2013), “Bayesian stable isotope mixing models,” Environmetrics, 24, 387–399.

  • R Core Team (2016), R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. Available at https://www.R-project.org/.

  • Scealy, J. L., and Welsh, A. H. (2011), “Regression for compositional data by using distributions defined on the hypersphere,” Journal of the Royal Statistical Society, 73, 351–375.

  • Stewart, C., and Field, C. (2011), “Managing the essential zeros in quantitative fatty acid signature analysis,” Journal of Agricultural, Biological, and Environmental Statistics, 16, 45–69.

  • Stock, B. C., and Semmens, B. X. (2013), MixSIAR GUI User Manual. Version 3.1. https://github.com/brianstock/MixSIAR. https://doi.org/10.5281/zenodo.56159.

  • Vargas Jurado, N., Tanner, A. E., Blevins, S. R., McNair, H. M., Mayes, R. W., and Lewis, R. M. (2015), “Long-chain alcohols did not improve predictions of the composition of tall fescue and red clover mixtures over n-alkanes alone,” Grass and Forage Science, 70, 499–506.

  • Whalley, R. D. B., and Hardy, M. B. (2002), “Measuring botanical composition of grasslands.” In Field and laboratory methods for grassland and animal production research, ’t Mannetje, L., and R. M. Jones (eds), 67–102. Wallingford: CABI.

  • Yu, S.-Y., Colman, S. M., and Li, L. (2016), “BEMMA: A hierarchical Bayesian end-member modeling analysis of sediment grain-size distributions,” Mathematical Geosciences, 48, 723–741.

Acknowledgements

The authors wish to thank Dr. Jerry Volesky and the technical staff at the West Central Research and Extension Center, North Platte, NE, for their assistance in collecting plant samples. We appreciate the laboratory assistance of Amy Tanner (Virginia Tech, Blacksburg, VA) and Hannah Hamilton (University of Nebraska, Lincoln, NE). We are thankful to Dr. Nicolas Dobigeon for sharing the MATLAB code for the generation of random samples from the multivariate Gaussian on a simplex. This material is based upon work that was supported by the U.S. Department of Agriculture, Agricultural Research Service (Award Number 1932-21630-003-06) and by the Nebraska Agricultural Experiment Station with funding from the Hatch Multistate Research capacity funding program (Accession Number 1006754) from the U.S. Department of Agriculture, National Institute of Food and Agriculture. Also, funding from Senescyt in Ecuador is greatly appreciated.

Author information

Corresponding author

Correspondence to Napoleón Vargas Jurado.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 40 KB)

Supplementary material 2 (gz 26 KB)

Appendix

1.1 Forage Proportions

Although the Dirichlet distribution is not conjugate to the Gaussian likelihood, due to the choice of \(\varvec{\upalpha }\) the posterior distributions resulting from combining (4) with (5), (6), or (7) have closed forms (Dobigeon et al. 2008) and can be efficiently sampled. Under (5) or (6) the full conditional distribution of the forage proportions is a truncated Gaussian on the simplex

$$\begin{aligned} \begin{aligned} f\big (\varvec{\uptheta }_{i,-p}|\varvec{\Sigma }, \mathbf {y}_{i}\big )&\propto |\varvec{\Psi }| \mathrm {exp}\bigg [-\dfrac{1}{2}\big (\varvec{\uptheta }_{i,-p} -\varvec{\upmu }_i \big )^{\prime } \varvec{\Psi }^{-1}\big (\varvec{\uptheta }_{i,-p} -\varvec{\upmu }_i \big )\bigg ]\\&\quad \times I\big (\varvec{\uptheta }_{i,-p}\in \mathbb {S}\big ),\\&\sim \phi _{\mathbb {S}}\big (\varvec{\upmu }_i,\; \varvec{\Psi }\big ), \end{aligned} \end{aligned}$$
(12)

where the mean vector is \(\varvec{\upmu }_i=\varvec{\Psi }\Big [\big (\mathbf {M}_{-p}-\mathbf {m}_{p} \mathbf {1}^{\prime }\big )^{\prime }\varvec{\Sigma }^{-1}\big (\mathbf {y}_i- \mathbf {m}_{p}\big )\Big ]\) and the variance matrix is \(\varvec{\Psi }=\Big [\big (\mathbf {M}_{-p}-\mathbf {m}_{p} \mathbf {1}^{\prime }\big )^{\prime }\varvec{\Sigma }^{-1}\big (\mathbf {M}_{-p}-\mathbf {m}_{p} \mathbf {1}^{\prime }\big )\Big ]^{-1}\). In addition, \(\mathbf {M}_{-p}=\big [\mathbf {m}_1,\ldots , \mathbf {m}_{p-1}\big ]\), \(\mathbf {m}_p\) is the \(p\mathrm{th}\) column of \(\mathbf {M}\), and \(\mathbf {1}=\big [1,\ldots ,1\big ]^{\prime }\) is a vector of ones with \(p-1\) elements. Under (7) the posterior distribution of forage proportions is also multivariate Gaussian truncated on a simplex,

$$\begin{aligned} \begin{aligned} f\big (\varvec{\uptheta }_{i,-p}|\varvec{\Sigma }, \mathbf {y}_{i}\big )&\propto |\varvec{\widetilde{\Psi }}| \mathrm {exp}\bigg [-\dfrac{1}{2}\big (\varvec{\uptheta }_{i,-p} -\varvec{\widetilde{\upmu }}_i \big )^{\prime } \varvec{\widetilde{\Psi }}^{-1}\big (\varvec{\uptheta }_{i,-p} -\varvec{\widetilde{\upmu }}_i \big )\bigg ]\\&\quad \times I\big (\varvec{\uptheta }_{i,-p}\in \mathbb {S}\big ),\\&\sim \phi _{\mathbb {S}}\big (\varvec{\widetilde{\upmu }}_{i},\; \varvec{\widetilde{\Psi }}\big ), \end{aligned} \end{aligned}$$
(13)

with updated mean vector \(\varvec{\widetilde{\upmu }}_i=\varvec{\widetilde{\Psi }}\Big (\varvec{\Psi }^{-1} \varvec{\upmu }_i+\mathbf {V}^{-1}\mathbf {q}\Big )\) and updated variance matrix \(\varvec{\widetilde{\Psi }} =\big (\varvec{\Psi }^{-1}+\mathbf {V}^{-1}\big )^{-1}\).
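The quantities in (12) and (13) can be sketched numerically as follows. This is an illustration only, not the authors' implementation: the marker matrix, the function names, and the example values are hypothetical, the simplex is assumed to be \(\{\varvec{\uptheta }_{i,-p}: \theta _{ij}\ge 0,\ \sum _j \theta _{ij}\le 1\}\), and a naive rejection step stands in for a dedicated sampler for the Gaussian truncated on a simplex.

```python
import numpy as np


def proportion_conditional(M, Sigma, y):
    """Mean and covariance of the untruncated Gaussian in (12).

    M     : (r, p) marker-by-forage matrix; column j is m_j
    Sigma : (r, r) marker covariance matrix
    y     : (r,) marker vector observed for one mixture
    """
    m_p = M[:, -1]                      # reference column m_p
    D = M[:, :-1] - m_p[:, None]        # M_{-p} - m_p 1'
    Sigma_inv = np.linalg.inv(Sigma)
    Psi = np.linalg.inv(D.T @ Sigma_inv @ D)
    mu = Psi @ (D.T @ Sigma_inv @ (y - m_p))
    return mu, Psi


def gaussian_prior_update(mu, Psi, q, V):
    """Combine (mu, Psi) with q and V as in the update following (13)."""
    Psi_inv = np.linalg.inv(Psi)
    V_inv = np.linalg.inv(V)
    Psi_tilde = np.linalg.inv(Psi_inv + V_inv)
    mu_tilde = Psi_tilde @ (Psi_inv @ mu + V_inv @ q)
    return mu_tilde, Psi_tilde


def sample_on_simplex(mu, Psi, rng, max_tries=100_000):
    """Naive rejection sampler (an assumption, not the paper's sampler):
    keep the first Gaussian draw with all entries >= 0 and sum <= 1."""
    for _ in range(max_tries):
        theta = rng.multivariate_normal(mu, Psi)
        if np.all(theta >= 0.0) and theta.sum() <= 1.0:
            return np.append(theta, 1.0 - theta.sum())  # recover theta_p
    raise RuntimeError("no draw fell inside the simplex")


# Hypothetical example: 4 markers, 3 forages, noise-free mixture.
rng = np.random.default_rng(1)
M = np.array([[10.0, 2.0, 5.0],
              [1.0, 8.0, 3.0],
              [4.0, 4.0, 9.0],
              [7.0, 1.0, 2.0]])
theta_true = np.array([0.5, 0.3, 0.2])
Sigma = 0.05 * np.eye(4)
y = M @ theta_true

mu, Psi = proportion_conditional(M, Sigma, y)
theta_draw = sample_on_simplex(mu, Psi, rng)
```

With noise-free data the mean \(\varvec{\upmu }_i\) recovers the first \(p-1\) true proportions exactly, which is a convenient sanity check. Rejection sampling becomes inefficient when much of the Gaussian mass lies outside the simplex, so it should be read as a placeholder.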

1.2 Variance Matrix \(\Sigma \)

With a Gaussian likelihood and the chosen priors, when \(\varvec{\Sigma }=\mathrm {diag}\big (\sigma _{1}^{2},\ldots ,\sigma _{r}^{2}\big )\) the posterior distribution of each marker variance \(\sigma _{k}^{2}\) is a scaled inverse Chi-squared, as shown by Kazianka et al. (2011):

$$\begin{aligned} \begin{aligned} f\big (\sigma _{k}^{2}|\varvec{\uptheta }_{i},\lambda ,\nu ,\mathbf {y}_i\big )&\propto \big (\sigma _{k}^{2}\big )^{-\left( \frac{n+\nu }{2}+1\right) }\\&\quad \times \mathrm {exp}\Bigg \lbrace -\dfrac{1}{2\sigma _{k}^{2}}\Bigg (\nu \lambda + \sum _{i=1}^{n} \big [y_{ik} - \mathbf {m}_k\varvec{\uptheta }_i\big ]^{2} \Bigg )\Bigg \rbrace ,\\&\sim \chi ^{-2}\Bigg (n+\nu ,\;\dfrac{\nu \lambda }{n+\nu }+\sum _{i=1}^{n} \dfrac{\big [y_{ik} - \mathbf {m}_k\varvec{\uptheta }_i\big ]^{2}}{n+\nu }\Bigg ), \end{aligned} \end{aligned}$$
(14)

where \(\mathbf {m}_k\) is the \(k\mathrm{th}\) row of \(\mathbf {M}\). Under the prior in (9), the posterior for \(\varvec{\Sigma }\) is

$$\begin{aligned} \begin{aligned} f\big (\varvec{\Sigma }|\varvec{\uptheta }_i,\nu ,\varvec{\Xi },\mathbf {y}_i\big ) \propto \,&|\varvec{\Sigma }|^{-\frac{n+\nu +r+1}{2}}\mathrm {exp}\Bigg \lbrace -\dfrac{1}{2}\mathrm {tr} \Bigg [\bigg (\sum _{i=1}^{n}\big (\mathbf {y}_i-\mathbf {M}\varvec{\uptheta }_i\big ) \big (\mathbf {y}_i-\mathbf {M}\varvec{\uptheta }_i\big )^{\prime } {+}\, \varvec{\Xi } \bigg )\varvec{\Sigma }^{-1}\Bigg ]\Bigg \rbrace ,\\ \quad \sim&\; \mathrm {Inv.Wishart} \bigg (n+\nu ,\; \sum _{i=1}^{n}\big (\mathbf {y}_i-\mathbf {M}\varvec{\uptheta }_i\big ) \big (\mathbf {y}_i-\mathbf {M}\varvec{\uptheta }_i\big )^{\prime } + \varvec{\Xi } \bigg ). \end{aligned} \end{aligned}$$
(15)
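A minimal sketch of how draws from (14) and (15) might be generated is given below. It is not the authors' code: the function names and the simulated data are hypothetical, the scaled inverse Chi-squared draw uses the standard identity \(\sigma ^{2}=\nu ^{*}s^{2}/X\) with \(X\sim \chi ^{2}_{\nu ^{*}}\), and the inverse Wishart draw assumes integer degrees of freedom \(n+\nu \ge r\) so that it can be built from Gaussian vectors.

```python
import numpy as np


def sample_marker_variances(Y, M, Theta, nu, lam, rng):
    """One draw of (sigma_1^2, ..., sigma_r^2) from (14).

    Y : (n, r) observed markers, M : (r, p), Theta : (n, p) proportions.
    """
    n = Y.shape[0]
    resid = Y - Theta @ M.T              # entries y_ik - m_k theta_i
    ss = (resid ** 2).sum(axis=0)        # residual sum of squares per marker
    # scaled inverse chi-squared: (nu*lam + ss) / chi2(n + nu)
    return (nu * lam + ss) / rng.chisquare(n + nu, size=ss.shape[0])


def sample_inverse_wishart(df, scale, rng):
    """Draw Sigma ~ Inv.Wishart(df, scale) for integer df >= dimension."""
    r = scale.shape[0]
    if df != int(df) or df < r:
        raise ValueError("this sketch needs integer df >= dimension")
    # If W ~ Wishart(df, scale^{-1}) then W^{-1} ~ Inv.Wishart(df, scale).
    Z = rng.multivariate_normal(np.zeros(r), np.linalg.inv(scale), size=int(df))
    return np.linalg.inv(Z.T @ Z)


def sample_marker_covariance(Y, M, Theta, nu, Xi, rng):
    """One draw of Sigma from (15)."""
    n = Y.shape[0]
    resid = Y - Theta @ M.T
    S = resid.T @ resid                  # sum_i (y_i - M theta_i)(y_i - M theta_i)'
    return sample_inverse_wishart(n + nu, S + Xi, rng)


# Hypothetical data: 20 mixtures, 4 markers, 3 forages.
rng = np.random.default_rng(2)
n, r, p = 20, 4, 3
M = rng.uniform(1.0, 10.0, size=(r, p))
Theta = rng.dirichlet(np.ones(p), size=n)
Y = Theta @ M.T + rng.normal(0.0, 0.1, size=(n, r))
nu, lam = 4, 0.1

sigma2 = sample_marker_variances(Y, M, Theta, nu, lam, rng)
Sigma = sample_marker_covariance(Y, M, Theta, nu, 0.1 * np.eye(r), rng)
```

In a Gibbs sampler, one of these two updates would be used depending on whether markers are modeled as independent or correlated, alternating with the update of the forage proportions.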

1.3 Hyperparameters

The posterior distribution for the scale parameter \(\lambda \) is

$$\begin{aligned} \begin{aligned} f\big (\lambda |\sigma _{k}^{2}, \mathbf {y}_i\big )&\propto \lambda ^{\frac{r\nu }{2}-1}\mathrm {exp}\bigg (-\dfrac{\lambda \nu }{2}\sum _{k=1}^{r} \dfrac{1}{\sigma _{k}^{2}}\bigg ),\\&\sim \mathrm {Gamma}\bigg (\dfrac{r\nu }{2},\;\dfrac{\nu }{2}\sum _{k=1}^{r} \dfrac{1}{\sigma _{k}^{2}} \bigg ). \end{aligned} \end{aligned}$$
(16)
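The update in (16) can be sketched as below, assuming the Gamma distribution is parameterized by shape and rate, as the kernel \(\lambda ^{a-1}\mathrm {exp}(-b\lambda )\) suggests. The function name and example values are hypothetical. NumPy's `Generator.gamma` takes a scale argument, so the rate has to be inverted.

```python
import numpy as np


def sample_lambda(sigma2, nu, rng, size=None):
    """Draw lambda from the Gamma full conditional in (16)."""
    sigma2 = np.asarray(sigma2, dtype=float)
    r = sigma2.size
    shape = r * nu / 2.0
    rate = nu / 2.0 * np.sum(1.0 / sigma2)
    return rng.gamma(shape, 1.0 / rate, size=size)   # scale = 1 / rate


# Hypothetical marker variances for r = 4 markers.
rng = np.random.default_rng(3)
sigma2 = np.array([0.5, 1.0, 2.0, 4.0])
nu = 4
lam_draw = sample_lambda(sigma2, nu, rng)
```

Under this parameterization the conditional mean is \(r/\sum _{k}\sigma _{k}^{-2}\), the harmonic mean of the current marker variances, which gives a quick check on the sign and scale conventions.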

Similarly, under the parameterization \(\varvec{\Omega } = (\nu -r-1)\varvec{\Xi }\) and (11), the posterior distribution of the hyperparameter \(\omega _{k}\) (Bouriga and Féron 2013) is

$$\begin{aligned} \begin{aligned} f\big (\omega _{k}|\nu ,\varvec{\Sigma },\mathbf {y}_i\big )&\propto \omega _{k}^{\frac{r\nu }{2}}\mathrm {exp}\bigg [-\dfrac{r\nu }{2}\mathrm {tr}\big ( \varvec{\Sigma }_{k,k}^{-1}\big )\bigg ],\\&\sim \mathrm {Gamma}\bigg (\dfrac{r\nu }{2},\dfrac{\varvec{\Sigma }_{k,k}^{-1}}{2}\bigg ). \end{aligned} \end{aligned}$$
(17)

Cite this article

Vargas Jurado, N., Eskridge, K.M., Kachman, S.D. et al. Using a Bayesian Hierarchical Linear Mixing Model to Estimate Botanical Mixtures. JABES 23, 190–207 (2018). https://doi.org/10.1007/s13253-018-0318-9

  • DOI: https://doi.org/10.1007/s13253-018-0318-9
