Simultaneous estimation of diet composition and calibration coefficients with fatty acid signature data

Abstract Knowledge of animal diets provides essential insights into their life history and ecology, although diet estimation is challenging and remains an active area of research. Quantitative fatty acid signature analysis (QFASA) has become a popular method of estimating diet composition, especially for marine species. A primary assumption of QFASA is that constants called calibration coefficients, which account for the differential metabolism of individual fatty acids, are known. In practice, however, calibration coefficients are not known, but rather have been estimated in feeding trials with captive animals of a limited number of model species. The impossibility of verifying the accuracy of feeding trial derived calibration coefficients to estimate the diets of wild animals is a foundational problem with QFASA that has generated considerable criticism. We present a new model that allows simultaneous estimation of diet composition and calibration coefficients based only on fatty acid signature samples from wild predators and potential prey. Our model performed almost flawlessly in four tests with constructed examples, estimating both diet proportions and calibration coefficients with essentially no error. We also applied the model to data from Chukchi Sea polar bears, obtaining diet estimates that were more diverse than estimates conditioned on feeding trial calibration coefficients. Our model avoids bias in diet estimates caused by conditioning on inaccurate calibration coefficients, invalidates the primary criticism of QFASA, eliminates the need to conduct feeding trials solely for diet estimation, and consequently expands the utility of fatty acid data to investigate aspects of ecology linked to animal diets.

The unverifiable assumption that calibration coefficients are known has received considerable attention in the literature. Several investigators have found diet estimates to be sensitive to the choice of calibration coefficients (e.g., Budge, Penney, & Lall, 2012;Haynes et al., 2015;Meynier, Morel, Chilvers, Mackenzie, & Duignan, 2010;Nordstrom, Wilson, Iverson, & Tollit, 2008;Wang, Hollmén, & Iverson, 2010), and simulation studies have demonstrated that errors in their values can bias diet estimation (Bromaghin, Budge, Thiemann, & Rode, 2016). In feeding trials, calibration coefficient estimates have been found to vary by the species of the consumer and various aspects of feeding trial design (e.g., Budge, Penny, & Lall, 2011;Rosen & Tollit, 2012;Thiemann, Iverson, & Stirling, 2008;Wang et al., 2010). Further, fatty acids may be deposited or turnover at different rates in different tissues (Mohan et al., 2016;Nordstrom et al., 2008), and diet composition, physiological state, and the age of an animal can affect fatty acid metabolism (Williams & Buck, 2010). Designing a feeding trial to develop calibration coefficients that are robust to so many factors is complex and perhaps not even feasible. Such difficulties with calibration coefficients have caused some investigators to question the utility of QFASA (e.g., Happel et al., 2016;Rosen & Tollit, 2012;Williams & Buck, 2010).
We present a new model that allows simultaneous estimation of both diets and calibration coefficients based only on fatty acid signature samples from predators and potential prey. The primary requirements are the availability of a suitable prey library and a predator sample that exceeds a minimum number of animals, which depends upon the number of prey types and fatty acids used. The performance of our model was explored using constructed test cases, in which the true values of diet proportions and calibration coefficients were known, based on two prey libraries previously used to investigate the performance properties of QFASA diet estimators (e.g., Bromaghin, Budge, Thiemann, & Rode, 2016). Diet composition and calibration coefficients were also estimated for a sample of Chukchi Sea polar bears ( Figure. 1; Ursus maritimus) whose diets were previously estimated  using calibration coefficients derived from a mink (Neovison vison) feeding trial (Thiemann et al., 2008).

| The model
Our notation is a minor extension of that of Iverson et al. (2004). Let x ik = the proportion for fatty acid k in the mean signature of prey type i; i = 1, 2, …, I; k = 1, 2, …, K; y jk = the proportion for fatty acid k in the F I G U R E 1 A polar bear (Ursus maritimus) family feeding on a ringed seal (Phoca hispida). Photograph credit: U.S. Geological Survey, Alaska Science Center. Previously published by Ecology and Evolution 5:1249-1262 signature of predator j; j = 1, 2, …, J; c k = the calibration coefficient for fatty acid k, common to all predators; and π ji = the proportion of prey type i in the diet of predator j.
Calibration coefficients are used to adjust signature proportions for the effects of fatty acid metabolism, providing a one-to-one mapping between the prey and predator spaces ( Figure. 2). Diet estimation can occur in either space because metabolic effects have been accounted for and the signatures made comparable, although estimates obtained in the two spaces may differ (Bromaghin, Rode, Budge, & Thiemann, 2015). For example, Iverson et al. (2004) divided predator signatures by the calibration coefficients to transform the signatures to the prey space, while Bromaghin et al. (2013) multiplied prey signatures by the calibration coefficients to transform the signatures to the predator space.
We performed estimation in the predator space, so calibration coefficients (c k ) were used to transform mean prey signatures (x ik ) to the predator space, Predator signatures were modeled as a mixture of the transformed prey signatures ( ), with diet proportions (π ji ) as the weights, and diet proportions and calibration coefficients were estimated by minimizing the Aitchison distance (Aitchison, 1986) between the observed and modeled signatures summed over all predators.
where gm(s) is the geometric mean of the K fatty acid proportions in signature s. The key differences between this model and prior QFASA models are that the calibration coefficients are unknown parameters to be estimated, rather than known constants, and the distance between observed and modeled signatures is summed over all predators in the sample.
There are J*(K−1) degrees of freedom in the predator signature data, losing one degree of freedom for each predator because the proportions in each signature must sum to 1. The diet proportions for each predator must also sum to 1, so there are J*(I−1) unknown diet proportions. Only the relative magnitudes of the calibration coefficients are informative for diet estimation, that is, multiplying their values by any constant produces an identical mapping between the prey and predator spaces, so one identifiability constraint must be placed on them and there are therefore K−1 free calibration coefficients. The which can be expressed as J ≥ (K−1)/(K−I), for K > I. The number of prey types will always exceed 1, so this minimum threshold exceeds 1 and will increase as the difference between the number of fatty acids and the number of prey types decreases.
All data processing was performed using MATLAB (version 2016b, www.mathworks.com/), and the objective function Q was minimized using TOMLAB SNOPT software (version 8.0, www.tomopt.com/ tomlab/). Initial values for all diet proportions were set to the inverse of the number of prey types, 1/I, and initial values for all calibration coefficients were set to 1. For the identifiability constraint on the calibration coefficients, we arbitrarily chose to constrain their sum to equal the number of fatty acids, K. Calibration coefficients were additionally constrained to be at least 0.02 to bound them from zero and F I G U R E 2 An example with six fatty acids (FA) illustrating how calibration coefficients are used to transform signatures between the predator and prey spaces avoid potential computational problems during parameter estimation.
Finally, the diet proportions of each predator were constrained to be non-negative and sum to 1.

| Prey libraries
Our analyses were based on two prey libraries with quite different characteristics that have previously been used to investigate the performance of QFASA diet estimators (e.g., Bromaghin, Budge, Thiemann, & Rode, 2016). The marine mammal (hereafter mammal) library (Bromaghin et al., 2015a,b) was comprised of 357 signatures from seven species that have been used to estimate the diets of Chukchi Sea polar bears . Several prey types in this library have reasonably distinct signatures, although there is some confounding between the ice seal species, especially ribbon seal Histriophoca fasciata and spotted seal Phoca vitulina (Bromaghin, 2017). For the mammal library, we used the 31 fatty acids previously used by Thiemann et al. (2008) to estimate polar bear diets. The second library was the Scotian Shelf fish and shellfish (hereafter fish) library, comprised of 954 signatures from 28 species (Bromaghin et al., 2015b;Budge, Iverson, Bowen, & Ackman, 2002). The fish library is considerably more complex because of the larger number of prey types and the confounding that exists among the signatures of several prey types (Bromaghin, Rode, et al., 2015). With the fish library, we used the extended dietary suite of 41 fatty acids (Iverson et al., 2004), which is nearly identical to the suite of 39 fatty acids that have been used with expanded versions of this library (e.g., Beck, Iverson, Bowen, & Blanchard, 2007).
With both libraries, fatty acid proportions that were missing or equal to zero were replaced by a small constant (0.005), a common strategy in QFASA because distance measures for compositional data often involve logarithms and so are not defined if any proportions equal zero. The sum of the proportions for the fatty acids used in diet estimation was computed for each signature, and each signature was then augmented with an additional proportion equal to one minus that sums so that the proportions in each augmented signature summed to one, which has been found to reduce bias in some circumstances . Consequently, signatures were comprised of 32 and 42 proportions with the mammal and fish libraries, respectively. We therefore needed a minimum of J = 2 predators for the mammal library and J = 3 predators for the fish library for all parameters to be estimable.

| Example diets
We expected the model to perform optimally when the number of predator signatures was well above the minimum sample size threshold and predator diets were highly diverse. Consequently, we established a large set of diverse predator diets for each library by selecting a grid of diets regularly spaced throughout the range of all diets possible with each library . As an example, a diet grid for three prey types having diet proportions equally spaced by an increment of 0.10 is illustrated in Figure. 3. The diet grids used in our analyses were generated using the make_diet_grid function of the R package qfasar (version 1.1.0, Bromaghin, 2017) with a diet increment of 0.25, which resulted in grids of 210 and 31,465 diets with the mammal and fish libraries, respectively. We randomly selected a subset of 210 diets from the fish library grid to reduce the number of diets to a manageable number. For each library and its suite of fatty acids, calibration coefficients were established by drawing a random sample from a chi-square density with one degree of freedom and scaling them to sum to the number of fatty acids used with each library. Each of the example diets was then used to compute a predator signature in the prey space as a mixture of the mean prey signatures weighted by the diet proportions (Iverson et al., 2004), and the calibration coefficients were used to transform each signature from the prey space to the predator space. The resulting predator signatures and the mean prey signatures were then used as data inputs to the new model described above, and diet proportions and calibration coefficients were simultaneously estimated. There were a total of 1,291 and 5,711 diet proportion and calibration coefficient parameters in the models based on the mammal and fish libraries, respectively.
Test cases that we expected to be more challenging for the model were based on the realistic diets of adult female and male polar bears (mammal library) and spring-and fall-sampled, female and male gray seals (Halichoerus grypus; fish library) used as test cases by Bromaghin, Rode et al. (2015). Estimated diets for subadult female and male polar bears  were added so that we had four realistic diets for each library. These test cases were expected to be more difficult because the number of diets (four) was much closer to the minimum sample size threshold of each library (either two or three), and the diets were considerably less diverse than the gridded diet test cases. Using F I G U R E 3 A ternary plot illustrating a grid of diet proportions regularly spaced throughout the range of all possible diets comprised of up to three prey types, with an increment of 0.1 between proportions. Similar diet grids with the larger mammal and fish prey libraries were used to establish example diets to test the performance of the model the four realistic diets for each library, a process identical to that previously described for the gridded diets was used to generate predator signatures in the prey space, map them to the predator space using the same calibration coefficients, and estimate both diets and calibration coefficients. There were 55 and 149 parameters in the models based on the mammal and fish libraries, respectively.
For both gridded and realistic diet test cases, the true values of the diet proportions and calibration coefficients were known. Given the large number of diet proportions in the diet grid analyses, we computed differences between the estimated and true proportions (error or statistical bias) and graphically summarized their distribution.
Because of the smaller number of diet proportions in the analyses based on realistic diets, we graphically compared the true and estimated proportions for each prey component of the four diets. In both cases, we graphically compared estimated calibration coefficients with the true values.

| Chukchi Sea polar bears
We estimated the diets and calibration coefficients for a sample of 154 polar bears from the Chukchi Sea  using the mammal library. The diets of these bears were previously estimated  using the original QFASA model, the mammal library, and calibration coefficients derived from a mink feeding trial (Thiemann et al., 2008). The bear signatures were prepared for analysis using the same methods of zero replacement and signature augmentation previously described for the prey libraries.
Individual bear diets and calibration coefficients were estimated using the new model, and the mean diet was computed from the individual estimates for each of four age-sex classes: adult females, adult males, subadult females, and subadult males. In this case, the true diets and calibration coefficients were unknown. Consequently, we compared our estimated calibration coefficients with the values derived from the mink feeding trial (Thiemann et al., 2008), scaled to a common sum to make the comparison meaningful. Our estimates of diet composition were compared to a second set of estimates also obtained using our new model, but with the calibration coefficients constrained to equal the marine-fed values derived from the mink feeding trial.

| RESULTS
In the diet grid analysis with the mammal library, the minimized value of the objective function was effectively zero (<1.0e −18 ), so the model came very close to fitting the predator signature data perfectly. The estimated diet proportions and calibration coefficients ( Figure.  With the fish library, the minimized objective function was slightly greater than zero (2.3 × 10 −12 ) and the optimization routine returned a warning that an improved solution could not be found, which jointly implied that a good solution was found but the termination criteria were not fully satisfied. Errors in the estimated diet proportions were slightly larger than in the other cases, but still inconsequential from a practical perspective, ranging from −3.5e −5 to 4.1e −5 ( Figure. 7), and the calibration coefficient estimates had errors of a similar magnitude ( Figure. 5b).
With the Chukchi Sea polar bear data and the marine mammal library, the estimated calibration coefficients for many fatty acids were somewhat similar to the values derived from the mink feeding trial (Thiemann et al., 2008;Figure. 8). The most notable exception was fatty acid 20:1n-11, for which our estimated calibration coefficient was substantially smaller than either of the feeding trial estimates. In addition, our estimates for the 22-carbon polyunsaturated fatty acids tended to be somewhat larger than the corresponding feeding trial estimates. Our unconditional diet estimates were more diverse than estimates conditioned on the marine-fed mink calibration coefficients ( Figure. 9) and tended to have larger contributions from beluga whale (Delphinapterus leucas), ribbon seal, spotted seal, and walrus (Odobenus

rosmarus), smaller contributions from bearded seal (Erignathus barbatus)
and ringed seal (Pusa hispida), and similar contributions from bowhead whale (Balaena mysticetus). Our second set of estimates conditioned on the marine-fed mink calibration coefficients were close to previous estimates obtained using similar methods (Bromaghin, Rode et al., 2015).

| DISCUSSION
Our analyses of example diets, constructed under known conditions with two prey libraries having substantially different characteristics, demonstrate that simultaneous estimation of diet composition and calibration coefficients based only on signature samples from wild predator and prey is not only feasible, but also highly accurate. In all four cases in which the true values of the diets and calibration coefficients were known, both sets of parameters were estimated with essentially no error. Single predator models must therefore condition on specific values for the calibration coefficients in order for diet proportions to be estimable.
As an aside, we note that analysis of feeding trial data essentially inverts that process, conditioning on a known diet to estimate calibration coefficients.
In our analyses based on realistic diets, all parameters were successfully estimated with the signatures of only four predators.
However, in the analysis with the fish library, the warning received from the optimization routine, a minimized objective function somewhat greater than zero, and slightly greater, though inconsequential, bias likely indicated that the model was challenged in that case. The sample of four predators was only marginally greater than the theoretical minimum of three predators for the fish library, which may have caused some difficulty. In practice, having a larger number of predators is recommended and doing so can be expected to increase estimation accuracy, because each predator contributes some information about the calibration coefficients, as described above. It is also important to realize that some diversity in the predator data is necessary for both diet proportions and calibration coefficients to be estimable. The ability to estimate diets without calibration coefficients derived from a feeding trial is a major breakthrough with profound benefits. Conducting a feeding trial requires a facility with the capability to properly house and care for an adequate number of animals. Feeding trials must be conducted over lengthy periods of time, as the relationship between consumer diet and lipid reserves takes time to develop and stabilize (e.g., Budge et al., 2012). Consequently, feeding trials are often time-consuming and expensive. In addition, a number of animal welfare concerns could arise from holding animals, feeding a controlled diet, or sample acquisition. Such issues can preclude working with rare species held in zoos and similar facilities that prioritize animal welfare, even though data from such species might be extremely valuable to support field investigations.
Although our model avoids the need to conduct a feeding trial to estimate calibration coefficients, the prey library remains a critically important data input. For accurate estimation of diets, the prey library must contain representatives of all prey potentially consumed by predators. To the degree possible, prey types should be defined to minimize differences among signatures within prey types and maximize differences between prey types. In addition, our model assumes that the predators share a common set of calibration coefficients. If there is reason to suspect that calibration coefficients differ by sex, age, or similar factors, predator data can be partitioned into subsamples, and estimation can be performed separately for each.
The similarity between our estimates of calibration coefficients for the Chukchi Sea polar bears and the estimates derived from the mink feeding trial (Thiemann et al., 2008) may provide some assurance that both sets of estimates reflect related metabolic processes. The greatest difference between the calibration coefficients occurred with fatty acid 20:1n-11, for which our estimate was substantially smaller than either of the feeding trial estimates, although other less striking differences were also found. The cause of these differences is unknown, but likely originates from some aspect of the feeding trial design, such as characteristics of the diets fed, or differences in the physiology of mink and polar bears. In a prior analysis of the polar bear data, Bromaghin, Rode et al. (2015) reported that many polar bear fatty acid proportions were outside the range of the proportions in the transformed prey library.
This finding is conceptually impossible for the QFASA mixing model if assumptions are met, and strongly suggests that the mink-derived calibration coefficients are not wholly suitable for Chukchi Sea polar bears.
F I G U R E 9 Mean estimated diet of Chukchi Sea polar bears obtained with unconditional estimation and by conditioning on the marine-fed mink calibration coefficients for (a) adult females, (b) adult males, (c) subadult females, and (d) subadult males. Error bars represent one standard error of the mean