Hormone comparison between right and left baleen whale earplugs

Abstract
Marine animals experience additional stressors as humans continue to industrialize the oceans and as the climate continues to change rapidly. To examine how the environment or humans affect animal stress, many researchers analyse hormones from biological matrices. Scientists have begun to examine hormones in continuously growing biological matrices, such as baleen whale earwax plugs, baleen and pinniped vibrissae. Few of these studies have determined whether hormone values from the same tissue at different locations on the body are interchangeable. Here, hormone values in the right and left earplugs from the same individual were compared for two reasons: (i) to determine whether right and left earplug hormone values can be used interchangeably and (ii) to assess methods of standardizing hormones in right and left earplugs to control for individuals' naturally varying hormone expression. We analysed how absolute, baseline-corrected and Z-score-normalized hormones performed in reaching these goals. Absolute hormone concentrations in the right and left earplugs displayed a positive relationship, but Z-score normalization was necessary to standardize the variance in hormone expression. After Z-score normalization, the 95% confidence intervals of the differences between corresponding laminae of the right and left earplugs included zero for both cortisol and progesterone. This indicates that the mean differences between corresponding laminae of right and left earplugs are not distinguishable from zero. The results of this study reveal that both right and left earplugs from the same baleen whale can be used in hormone analyses after Z-score normalization. This study also shows the importance of Z-score normalization to the interpretation of results and to methodologies for analysing long-term trends using whale earplugs.

Recently, researchers have established that several biological matrices have the capacity to archive sustained longitudinal hormone data at the level of the individual spanning months to lifetimes, including baleen whale earwax, baleen plates and pinniped vibrissae (Trumble et al., 2013; Hunt et al., 2014; Karpovich et al., 2018). In identifying potential sources of variability, the association between hormone concentrations within or between matrices should be examined to increase our understanding of how to interpret these data among individuals and through time.
Previous studies used baseline correction or Z-score normalization to correct for natural variation in hormone expression as a means to compare across multiple individuals (Fanson et al., 2017; Trumble et al., 2018). The steroid hormones targeted in this study are glucocorticoids (i.e. cortisol) and gonadocorticoids (i.e. progesterone) because of their ubiquitous use in studies of stress and reproduction. Cortisol is widely used as an indicator of stress in animal studies (Kellar et al., 2015; Hunt et al., 2017a; Trumble et al., 2018), partially as a result of the rapid response of the hypothalamic-pituitary-adrenocortical system during stressful events. Progesterone, a key pregnancy hormone, is often used as an indicator of reproductive events such as age at sexual maturity, oestrus and pregnancy in a variety of mammalian species (Rolland et al., 2005; Kellar et al., 2006; Clark et al., 2016; Hunt et al., 2016; Robeck et al., 2017; Pallin et al., 2018).
In this study, extremely rare paired right and left earplugs from four individual baleen whales were examined to determine whether corresponding laminae exhibit similar cortisol and progesterone concentrations and are therefore interchangeable for endocrinology research. Determining whether differences exist between corresponding baleen whale earplug laminae allows for improved analysis and interpretation of longitudinal studies spanning years to lifetimes, including establishing life events and providing insights into causal mechanisms and processes. Further, individual-level longitudinal biological and behavioural data are becoming increasingly available from a variety of biological matrices. Thus, two data transformation methods were evaluated for their utility in merging data by age or calendar year from multiple individuals: baseline correction and Z-score normalization. To date, no studies have investigated potential hormone concentration differences within corresponding sets of right and left earplugs.

Methods
Corresponding sets of right and left earplugs from four baleen whales were used in this study: two sets from individual baleen whales (ID #1120 and #1121) of unknown species, archived at the Smithsonian Museum of Natural History, and two sets from recent strandings, one fin whale (Balaenoptera physalus, ID #1019) and one humpback whale (Megaptera novaeangliae, ID #1025; Table 1). Because 'right' and 'left' earplugs could not be distinguished, designations were assigned at random for each baleen whale.

Aging and delamination
Aging and delamination methods for baleen whale earplugs have been described previously. Briefly, whales were aged to the nearest year by counting individual earplug layers (laminae), assuming that one dark and one light lamina together equate to 1 year (a growth layer group; Gabriele et al., 2010). After aging, laminae were separated, weighed (± 0.001 g), homogenized and stored at 4 °C in nitrogen-filled amber vials until lipid extraction. Lipids were extracted from individual earplug laminae using a Soxtec 2043 extraction system (FOSS) with 2:1 chloroform to methanol for 60 min at 160 °C. After extraction, extracts were dried under nitrogen and stored at −80 °C until analysis. Using enzyme-linked immunoassays (ELISA; Enzo Life Sciences, cortisol: ADI-901-071, progesterone: ADI-901-011), extracted lipid aliquots were analysed for cortisol and progesterone in duplicate. Optical density values (Beckman Coulter DTX 880 Multimode Detector) were converted to hormone concentration in pg/g lipid. Cortisol in baleen whale earplugs has been assessed for linearity and accuracy (Trumble et al., 2013). Pooled samples from three male and three female fin whales were serially diluted, and the resultant values were compared to the ELISA kit progesterone standard to ascertain the binding affinity of progesterone to the assay antibodies (Hunt et al., 2017b). Pooled samples from three male and three female fin whales were then combined and spiked with a known serial dilution of the standards, and observed concentrations were compared to known standard concentrations to determine the accuracy of the assay for progesterone (Grotjan and Keel, 1996; Hunt et al., 2017b).

Absolute hormone values in right and left earplugs
Table 1: Means all refer to the mean lamina hormone concentration over the entirety of the earplug. For both cortisol and progesterone, mean concentrations and standard deviation used for Z-score normalization and the baseline used for baseline correction for each whale for the right and left earplugs are provided. Baselines are calculated by averaging the three lowest hormone concentrations, except in the case of juvenile whales ID# 1019 and 1025, where only the lowest hormone concentration was used.
Absolute hormone concentrations (pg/g lipid) for corresponding right and left earplug laminae were compared using a mixed-model framework as a regression analysis to account for non-independence of data, with individual whale ID as the random effect (Bates et al., 2015; Bartón, 2019): lmer(left earplug absolute hormone ∼ right earplug absolute hormone + (1|WhaleID), data = data). Cortisol concentrations from right and left earplugs were compared in one model, and progesterone concentrations from right and left earplugs were compared in another. Conditional r² is reported using the MuMIn R package for the variance explained by fixed and random effects (Nakagawa and Schielzeth, 2012; Bartón, 2019).
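As a minimal illustration of what a conditional r² summarizes for a Gaussian random-intercept model, the Python sketch below computes the marginal and conditional values from three variance components, following the definitions of Nakagawa and Schielzeth (2012). It is not the authors' R code; the function name and the numeric variance components are hypothetical and chosen only for illustration.

```python
from typing import Tuple


def r_squared_mixed(var_fixed: float, var_random: float, var_residual: float) -> Tuple[float, float]:
    """Return (marginal, conditional) R^2 for a Gaussian random-intercept model.

    var_fixed    : variance of the fixed-effect predictions
    var_random   : variance of the random intercept (here, whale ID)
    var_residual : residual variance
    """
    total = var_fixed + var_random + var_residual
    if total <= 0:
        raise ValueError("variance components must sum to a positive value")
    marginal = var_fixed / total                    # fixed effects only
    conditional = (var_fixed + var_random) / total  # fixed plus random effects
    return marginal, conditional


# Hypothetical variance components, for illustration only (not values from this study).
marginal, conditional = r_squared_mixed(var_fixed=6.0, var_random=3.0, var_residual=1.0)
```

The conditional value is never smaller than the marginal value, because it adds the variance attributed to the random effect to the numerator.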
Furthermore, the difference in absolute hormone concentration between corresponding laminae of the right and left earplugs was calculated, from which a 95% confidence interval (CI) was derived (Table 2). For example, the hormone concentration of lamina A in the right earplug is subtracted from that of lamina A in the left earplug and recorded as the difference between the two laminae. This is repeated for all laminae in the right and left earplugs of an individual; these differences are then used to calculate the 95% CI around the mean of the differences. If the 95% CI of the differences between the right and left earplugs includes zero, the mean difference between corresponding laminae of the right and left earplugs cannot be distinguished from zero.
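The paired-difference procedure described above can be sketched in Python as follows. This is a hedged illustration rather than the authors' implementation: the text does not state how the interval was derived, so the sketch assumes a symmetric interval of the form mean ± critical value × standard error, defaulting to the normal critical value (about 1.96). For short series a Student's t critical value can be passed in instead. The function names and the lamina values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def paired_difference_ci(left, right, critical_value=None):
    """95% CI around the mean of (left - right) differences for corresponding laminae."""
    if len(left) != len(right):
        raise ValueError("left and right must contain the same number of laminae")
    if len(left) < 2:
        raise ValueError("at least two lamina pairs are needed")
    diffs = [l - r for l, r in zip(left, right)]
    if critical_value is None:
        # Assumption: normal approximation; pass a t critical value for small samples.
        critical_value = NormalDist().inv_cdf(0.975)
    centre = mean(diffs)
    half_width = critical_value * stdev(diffs) / sqrt(len(diffs))
    return centre - half_width, centre + half_width


def includes_zero(ci):
    """True when the interval spans zero."""
    low, high = ci
    return low <= 0.0 <= high


# Hypothetical lamina concentrations (pg/g lipid), for illustration only.
left_plug = [11.0, 11.0, 15.0, 17.0]
right_plug = [10.0, 12.0, 14.0, 16.0]
ci = paired_difference_ci(left_plug, right_plug)
similar = includes_zero(ci)
```

With these made-up values the differences are 1, −1, 1 and 1, so the interval is centred on 0.5 and spans zero.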

Controlling for differing lifetime mean hormone concentration in individual subjects
Mean lifetime hormone concentration was compared between corresponding earplugs, in absolute (pg/g lipid), baseline-corrected (%) and Z-score-normalized values (Table 1). Mean lifetime hormone values were calculated as the mean hormone value per lamina for each right and left earplug (sum of hormone values divided by total number of laminae for each earplug). For each earplug, hormone concentrations were baseline-corrected and Z-score-normalized (i.e. calculated independently for each right and left earplug from each individual). To calculate baseline-corrected hormone values, 'baseline' levels were defined as the mean of the three lowest hormone values for each earplug (Table 1; Trumble et al., 2018). Due to the age of the juvenile whales (<2 years), baselines for ID# 1019 and 1025 were assigned as the single lowest hormone concentration for each individual earplug (Table 1). Baseline-corrected values, as percent change from the baseline for each lamina, were calculated as ((lamina hormone concentration − baseline)/baseline) × 100%, which is derived from Trumble et al. (2018). Z-scores for each lamina in a whale's earplug were calculated as ((lamina absolute hormone concentration − mean of earplug laminae)/standard deviation of earplug laminae). The resulting data have a mean of zero and are expressed in units of standard deviation. A Z-score of one for cortisol in a single baleen whale earplug lamina would indicate that, over the 6 months represented by that lamina, the individual's cortisol was one standard deviation above its lifetime mean.
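The two transformations defined above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the function names and lamina values are hypothetical, and the sketch uses the sample standard deviation, which the text does not specify.

```python
from statistics import mean, stdev


def baseline(values, n_lowest=3):
    """Mean of the n lowest lamina values (n_lowest=1 for the juvenile case)."""
    return mean(sorted(values)[:n_lowest])


def baseline_corrected(values, n_lowest=3):
    """Percent change from baseline: ((value - baseline) / baseline) * 100."""
    base = baseline(values, n_lowest)
    return [(v - base) / base * 100.0 for v in values]


def z_scores(values):
    """(value - earplug mean) / earplug standard deviation, per lamina."""
    centre = mean(values)
    spread = stdev(values)  # assumption: sample SD (n - 1 denominator)
    return [(v - centre) / spread for v in values]


# Hypothetical lamina concentrations (pg/g lipid) from one earplug, for illustration only.
laminae = [2.0, 4.0, 6.0, 8.0, 10.0]
percent_change = baseline_corrected(laminae)          # baseline is mean(2, 4, 6) = 4
juvenile_change = baseline_corrected(laminae, n_lowest=1)
normalized = z_scores(laminae)
```

Both functions are applied to one earplug at a time, matching the statement that values were calculated independently for each right and left earplug.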
To assess differences in mean lifetime hormone concentrations, a Wilcoxon signed-rank test was used due to the non-normal distribution of the differences between the pairs of corresponding laminae. Baseline-corrected and Z-score-normalized hormones were used to control for differing lifetime mean hormone concentration in individual earplugs. Specifically, the difference between each pair of corresponding laminae in an individual's right and left earplugs was calculated for baseline-corrected and Z-score-normalized hormones, from which a 95% CI around the mean was calculated for each individual (Table 2, Fig. 1).
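For readers unfamiliar with the Wilcoxon signed-rank test, the Python sketch below computes only its rank sums for paired lamina values. It is an illustration, not the authors' analysis: zero differences are dropped and tied absolute differences share an average rank (one common convention), no P value is computed, and the function name and data are hypothetical.

```python
def signed_rank_sums(left, right):
    """Return (W_plus, W_minus): rank sums of positive and negative paired differences."""
    if len(left) != len(right):
        raise ValueError("left and right must contain the same number of laminae")
    # Zero differences carry no sign information and are dropped.
    diffs = [l - r for l, r in zip(left, right) if l != r]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Find the run of tied absolute differences starting at position i.
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = average_rank
        i = j
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus


# Hypothetical paired lamina values, for illustration only.
w_plus, w_minus = signed_rank_sums([2.0, 1.0, 5.0, 8.0, 4.0], [1.0, 2.0, 3.0, 4.0, 5.0])
```

The two sums always add up to n(n + 1)/2 for the n non-zero differences; a strong imbalance between them is what the test treats as evidence of a systematic right-left difference.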

Assays
Baleen whale earplug cortisol and progesterone intra-assay coefficient of variation (CV) for all ELISAs were 7.7 ± 6.6 and

Controlling for differing lifetime mean hormone concentration in individual subjects
Cortisol and progesterone were both baseline-corrected and Z-score-normalized to compare mean lifetime hormone values. The 95% CI for baseline-corrected cortisol always included zero, while the 95% CI for baseline-corrected progesterone did not include zero for whale ID# 1120 and 1121. The 95% CI for Z-score-normalized cortisol and progesterone always included zero (Table 2, Fig. 1).
The lifetime mean lamina hormone value for each individual's right and left earplug was compared (Table 1). Mean absolute cortisol concentrations for the right and left earplugs for two individuals were significantly different (ID# 1120 and 1121, Wilcoxon signed-rank test, P = 0.0001 and P = 0.001, respectively), whereas mean baseline-corrected and Z-score-normalized cortisol did not differ (Fig. S3, Wilcoxon signed-rank test, P > 0.2). Mean absolute progesterone was significantly different for whale ID# 1120 (Wilcoxon signed-rank test, P = 0.008), and baseline-corrected progesterone was significantly different for both whale ID# 1120 and 1121 (Wilcoxon signed-rank test, P = 0.004 and P = 0.01, respectively), whereas mean Z-score-normalized progesterone values did not differ (Table 1, Fig. S4, Wilcoxon signed-rank test, P > 0.5).

Discussion
Results from this study demonstrate significant positive relationships in absolute cortisol and progesterone concentrations in corresponding laminae between earplugs within individual whales. This supports the use of either the right or left earplug for endocrinology research, which is valuable since museums may only have one earplug from an individual, or deceased, stranded individuals may be positioned in such a way that only one earplug is accessible. However, of the two data transformation techniques analysed, only for Z-score-normalized hormones did the 95% confidence interval around the mean difference between the right and left earplugs always include zero. Furthermore, Z-score normalization provided an identical lifetime mean hormone value from which to compare across individuals (Fanson et al., 2017). This allows researchers to use Z-score-normalized hormones to merge individual hormone trends. Therefore, we conclude that Z-score transformation adequately corrects for variance between earplugs and is appropriate for use in studying stress and reproduction by age and calendar year. Z-score normalization does remove individual variability from hormone data, so that individuals with significantly higher lifetime hormone means are combined with individuals with significantly lower lifetime hormone means. This should be taken into consideration when deciding whether to Z-score-normalize hormone data. These findings indicate that right and left earplugs can be used interchangeably after Z-score normalization to reconstruct hormone profiles and examine long-term trends in baleen whales.

Assays
Parallelism and accuracy tests require a great deal of sample mass from extremely rare samples and therefore could not be performed for all species (Hunt et al., 2017b). However, given the validations performed for progesterone and cortisol during this and a previous study, we assume these validations for earplug hormones extend to all Mysticetes (Trumble et al., 2013; Hunt et al., 2017b).

Aging and delamination
Assuming formation of lamina within earplugs is biannual (Gabriele et al., 2010), difficulties arise when aging earplugs from older individuals (Purves, 1955; Lockyer, 1972). As the earwax accrues, the lamina becomes increasingly compacted as age and size of the earplug advances (Purves, 1955; Lockyer, 1972). Furthermore, many species of cetaceans, including baleen whales, show behavioural and physiological laterality and asymmetry, which may indicate that external ear canals are slightly different shapes and sizes (Galatius and Jespersen, 2005; MacLeod et al., 2007; Canning et al., 2011; Pyenson et al., 2012). Therefore, compaction could influence earplug and lamina shape and distinction, leading to variability in delamination, aging and ultimately longitudinal comparisons. However, while errors in aging may lead to an offset in the time between right and left earplugs (Fig. S3, Fig. S4), overall lifetime trends remain consistent. Errors associated with aging and delamination described in this study did not introduce data bias, though techniques for earplug delamination continue to be improved to reduce any possible differences.
Figure 2: Here, we demonstrate the differences that can arise when assessing hormone trends across multiple individuals when using absolute cortisol concentration as compared to Z-score-normalized cortisol. A. All whales combined by age for absolute cortisol concentration and B. all whales combined by age for cortisol Z-score. Vertical bars represent standard error. Any points with no visible error bars are present, but very small.

Absolute hormone values in right and left earplugs
Previous studies have shown similarities in hormone concentrations between different tissues from within individuals (Kellar et al., 2013; Charapata et al., 2018; Mingramm et al., 2019). In addition, a few studies have investigated hormone similarity from the same tissue in different locations within the body (Kellar et al., 2006; Cattet et al., 2017; Mello et al., 2017; Charapata et al., 2018). Kellar et al. (2006) showed that blubber depth had no effect on progesterone concentration in a northern right whale dolphin (Lissodelphis borealis), while blubber sampled at different locations on the body differed significantly, though the differences were relatively small. Mello et al. (2017) reported that different blubber sampling locations in humpback whales (Megaptera novaeangliae) yielded similar hormone concentrations, except for testosterone from the dorsal fin. Charapata et al. (2018) showed extensive variability in individual walrus (Odobenus rosmarus) bone concentrations but showed that hormones extracted from cortical bone across the walrus skeleton were similar. Hair from brown bears (Ursus arctos) also revealed similar hormone concentrations in hair taken from different locations on the individual's pelt (Cattet et al., 2017). The results of the mixed model from the present study demonstrate that as cortisol or progesterone concentration increases in the left earplug, it also increases in the right earplug. These similar trends in hormone data (i.e. between right and left earplugs) indicate a consistent excretion pathway for hormone deposition into the developing earplug. Therefore, we surmise that baleen whale earplugs are suitable for assessing longitudinal endocrinological trends in Mysticetes.

Utility of Z-score-normalizing longitudinal hormone values
The concept of hormone 'baselines' is not straightforward across and within disciplines (Bortolotti et al., 2008; Houser et al., 2011; Fair et al., 2014; Rolland et al., 2017). Colloquially, a baseline in animal studies is a minimum or fixed reference point which can be used in comparing between or among morphological, behavioural or physiological data. Many studies involving avian species define a baseline value as the measured concentration of a particular analyte during an unstressed period, usually taken immediately after capture (Bortolotti et al., 2008). Similarly, studies in free-ranging bottlenose dolphins, Tursiops truncatus, establish baseline analyte values after capture (Fair et al., 2014), whereas studies in managed dolphins use baseline values from voluntary blood draws (Houser et al., 2011). Marine mammal research involving free-ranging animals must rely on innovative techniques or opportune sampling to determine baselines. For example, Rolland et al. (2017) defined baseline glucocorticoid concentration for North Atlantic right whales as the mean faecal glucocorticoid concentration of healthy individuals. These techniques for assessing hormone baselines are designed for the time scales over which change in hormone concentration is being examined: minutes to hours in most cases.
Here, we suggest calculating a lifetime hormone baseline by averaging the hormone values of each lamina per earplug to obtain one consistent value across the timeline, which, when transformed, would be a Z-score of zero (Table 1). When comparing the corresponding laminae from the right and left earplugs within the same individual, only baseline-corrected cortisol and Z-score-normalized cortisol and progesterone values resulted in similar lifetime hormone means, where the 95% CI of the mean of the differences included zero. If zero is included in the 95% CI for the mean of the differences, corresponding laminae are considered similar. However, if zero is not included in the 95% CI for the mean of the differences, then right and left earplug hormone values per lamina are different. Hormone values from juvenile whales (ID# 1019, 1025) do create wider 95% CIs than those from the older animals (ID# 1120, 1121), though these confidence intervals still include zero. Differences in how the right and left earplugs for ID# 1120 and 1121 were aged and delaminated could explain why the 95% CI for absolute hormone concentration did not include zero. Therefore, baseline correction and Z-score normalization were used to control both for natural hormone variability (Fanson et al., 2017; Charapata et al., 2018; Trumble et al., 2018) and for variation in aging and delamination of laminae, in order to merge and examine trends of multiple individuals by age or calendar year.
Z-score-normalizing the individual laminae of baleen whale earplugs (Table 1): (i) provides a means with which to compare changes in hormone trends across individuals, populations (Clark et al., 2017; Trumble et al., 2018) or species, (ii) creates an equal lifetime hormone mean from which to compare across individuals with natural variability in hormone expression over their lifetimes (Jenkins et al., 2014; Charapata et al., 2018; Trumble et al., 2018) and (iii) standardizes the variance, so that one animal does not overly influence the trends (Fanson et al., 2017). We acknowledge that using Z-scores reduces individual variability in hormone expression as well as differences between individuals. Therefore, researchers must be mindful of these limitations, particularly when interpreting transformed data.

Combining hormone profiles from multiple individuals
To illustrate the utility of calculating a lifetime hormone baseline that allows comparison among individuals, mean cortisol concentrations (pg/g lipid) as well as Z-scores were compared across the four individuals in this study as a function of age (Fig. 2A and B). These graphs demonstrate that interpretation of results may differ depending on whether absolute hormone concentrations or Z-score-normalized hormone values are used. Absolute cortisol values plotted over 15 years reveal that 6-month-old calves have higher cortisol concentrations than later in life, with a peak at 1.5 years of age (Fig. 2A). However, while Z-score normalization reveals a similar cortisol increase at 1.5 years of age, corresponding to weaning age (Chittleborough, 1958; Lockyer, 1984), 6-month-old calves have lower cortisol Z-scores, with a peak in cortisol occurring at 6-9 years of age (Fig. 2B). Previous studies have suggested that this peak between 6 and 9 years of age may be associated with onset of sexual maturity (Trumble et al., 2013). There is a notable difference of interpretation when comparing Z-score-normalized data across lifetimes of multiple whales as compared to using absolute cortisol concentrations in the same manner. Variability in lifetime hormone baselines between individuals must be considered when assessing trends across multiple individuals (Fig. 2; Trumble et al., 2018).
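A minimal Python sketch of merging per-individual profiles by age, in the spirit of Fig. 2, is given below. It is not the authors' plotting code: the data structure, function name, whale labels and Z-score values are all hypothetical. It computes the mean and standard error of the values available at each age.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def mean_and_se_by_age(profiles):
    """Summarize values by age across individuals.

    profiles: dict mapping a whale ID to a list of (age_years, value) pairs.
    Returns a dict mapping age to (mean, standard_error); the standard error
    is None when only one individual contributes a value at that age.
    """
    by_age = defaultdict(list)
    for records in profiles.values():
        for age, value in records:
            by_age[age].append(value)
    summary = {}
    for age in sorted(by_age):
        values = by_age[age]
        se = stdev(values) / sqrt(len(values)) if len(values) > 1 else None
        summary[age] = (mean(values), se)
    return summary


# Hypothetical cortisol Z-scores by age (years), for illustration only.
profiles = {
    "whale_A": [(0.5, -1.0), (1.0, 0.0), (1.5, 1.0)],
    "whale_B": [(0.5, -0.5), (1.0, 0.5)],
}
summary = mean_and_se_by_age(profiles)
```

Feeding the same function absolute concentrations instead of Z-scores lets individuals with high lifetime means dominate the age-specific averages, which is the contrast the two panels of Fig. 2 are meant to show.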

Conclusion
In this study, we have shown that hormones in the left and right earplugs of baleen whales exhibit similar trends and are therefore interchangeable. In addition, we have demonstrated that hormones must be Z-score-normalized to merge data and analyse decades-long trends. Analysing the effectiveness of baseline-correcting and Z-score-normalizing hormones across an individual's lifetime is vital for researchers using continuously growing biological matrices with the intention of comparison across individuals.