Summary cortisol reactivity indicators: Interrelations and meaning

Research on the hypothalamic-pituitary-adrenal (HPA) axis has involved a proliferation of cortisol indices. We surveyed recently published HPA-related articles and identified 15 such indices. We sought to clarify their biometric properties, specifically, how they interrelate and what they mean, because such information is rarely offered in the articles themselves. In the present article, the primary samples consist of community mothers and their infants (N = 297), who participated in two challenges, the Toy Frustration Paradigm and the Strange Situation Procedure. We sought to cross-validate findings from each of these samples against the other, and also against a clinically depressed sample (N = 48) and a sample of healthy older adults (N = 51) who participated in the Trier Social Stress Test. Cortisol was collected from all participants once before and twice after the challenges. These heterogeneous samples were chosen to obtain the greatest possible range in cortisol levels and stress response regulation. Using these data, we computed the 15 summary cortisol indices identified in our literature survey. We assessed interrelations amongst indices and determined their underlying dimensions via principal component analysis (PCA). The PCAs consistently extracted two components, accounting for 79%–93% of the variance. These components represent “total cortisol production” and “change in cortisol levels.” The components were highly congruent across challenge, time, and sample. High variable loadings and explained factor variance suggest that all indices represent their underlying dimensions very well. Thus, the abundance of summary cortisol indices currently represented in the literature appears superfluous.

measures in conjunction. Despite this notion, AUC I is rarely used in the literature (Fekedulegn et al., 2007), a fact attributed to widespread misunderstanding of what AUC I captures (cortisol change, i.e., increase or decrease, rather than increase alone) and a failure to appreciate the index's biometric properties, including how it relates to other cortisol indices (Fekedulegn et al., 2007). Fekedulegn et al. used principal component analyses to better delineate the biometric properties of AUC I in the context of diurnal cortisol change and response to awakening (Fekedulegn et al., 2007).
Researchers have also noted that the baseline cortisol level is often incorrectly used as a nonspecific proxy for the anticipatory stress response (Engert et al., 2013). Other researchers continue to emphasize the need for consistency in cortisol research, in order to draw comparisons across studies (Dickerson and Kemeny, 2004). For instance, some authors question the appropriateness of point estimates in assessing mother-infant cortisol attunement, suggesting that trajectories may be better suited for capturing dyadic fluctuations in cortisol (Laurent et al., 2011). Furthermore, others point to difficulties in drawing unequivocal conclusions across a proliferation of cortisol indices (Atkinson et al., 2013), particularly in light of the many sources of variation in acute stress responses (e.g., individual differences, features of the stressor, aspects of the collection procedure) (Dickerson and Kemeny, 2004; Foley and Kirschbaum, 2010). Rather than focusing on the association of various cortisol indices with behavioral or health variables, in the current study we focus on the interrelations amongst cortisol indices and their underlying structure.
For that purpose, we identified several of the cortisol indices that are commonly used in the literature, in order to derive a reasonable sampling of indices for use in the current study. We restricted ourselves to articles published between January 2011 and June 2013 in six journals (Developmental Psychobiology, Early Human Development, Journal of Biological Psychiatry, Psychoneuroendocrinology, Psychosomatic Medicine, and Stress). We chose those journals as they regularly feature research on relations between HPA activity and behavioral/psychiatric function. The aim was not to conduct a systematic review but to generate a comprehensive list of commonly used point indices of cortisol. We reviewed 219 articles that measured cortisol in the context of some form of challenge.
In total, the articles reviewed thus incorporated 15 unique indices of cortisol (see Table 1 for list of indices with definitions). From the original literature search, the majority of the studies did not provide a rationale for use of a particular index, and only 16 provided a citation for the chosen index. Moreover, among these studies, there were instances when the cortisol index selected was not entirely supported by the citation provided. For example, one study used AUC G but inaccurately defined it as assessing cortisol reactivity (Pruessner et al., 2003). Overall, only three studies provided justification for why a specific index was chosen over other indices.
Several studies used different formulas to calculate identically labeled indices. For example, thirteen studies measured "cortisol reactivity" by calculating the difference between the baseline and either the middle, last, or peak sample. We consider this to be problematic, as the use of multiple, varying formulas to calculate apparently identical indices of "cortisol reactivity" confounds the meaning of results, undermines the possibility of comparison across studies, and makes it difficult to draw valid conclusions for the field as a whole. Finally, some studies use different nomenclature but equivalent formulas to assess identical constructs (e.g., AUC I and AUC AB, see Table 1).
The purpose of the present study is thus to clarify the meaning of the point indices identified in the aforementioned survey (Table 1) by (a) formally assessing the intercorrelations amongst them and (b) characterizing their underlying dimensions via principal components analysis. We assessed congruence across two primary samples, consisting of mothers and their infants, as well as two validation samples, consisting of clinically depressed adults and healthy older adults. These samples were selected to generalize findings across development (i.e., infants and older adults), as well as across healthy (i.e., mothers, older adults) and clinical (i.e., depressed) samples. In addition, we assessed congruence of findings for these samples across time using three challenges known to have differential impact on the adrenocortical function of participants.

Ethics statement
This study was approved by the Ryerson University Research Ethics Board, the Centre for Addiction and Mental Health Research Ethics Board, and the Institutional Review Board of the Faculty of Medicine of McGill University. The samples of mothers, older adults, and depressed participants provided written, informed consent to participate in the study. We obtained written, informed consent from mothers with respect to their infants' participation. These consent procedures were approved by the respective Research Ethics Boards.

Samples of mothers and infants
A community sample of 297 mother-infant dyads (52.2% female) participated in home and laboratory visits (Atkinson et al., 2013). During the home visit, mothers were a mean of 33.43 years (SD = 4.54) and infants were a mean of 15.98 months (SD = 1.37). During the laboratory visit, approximately one month later, mothers were a mean age of 33.75 years (SD = 4.42) and infants were a mean of 17.26 months (SD = 1.92). Participants were primarily Caucasian (76.0%) and in spousal relationships (82.0%). Almost half (46.2%) were university educated and median family income was between $114,000 and $150,000 Canadian. This is a demographically low-risk sample.

Sample of depressed adults
A sample of 48 depressed adults participated in a laboratory visit during which they were exposed to the Trier Social Stress Test (Chopra et al., 2009). All patients met criteria for chronic major depressive disorder (defined as experiencing a non-remitting major depressive episode for at least a 2-year period), based on the Structured Clinical Interview for DSM-IV TR criteria, and had a score of at least 18 on the Hamilton Depression Rating Scale (Williams et al., 1988). This sample was also screened for cognitive impairment. The depressed sample ranged from 23 to 55 years old (M = 41.62, SD = 1.12), with 41.5% males.

Sample of older adults
A sample of 51 older adults was also exposed to the Trier Social Stress Test. Older adults did not have a history of Axis I disorders (according to the criteria established by the DSM-IV TR), presence of cognitive impairment (Mini Mental State Exam scores ≥ 29), or dementia (assessed using the Blessed Information-Memory-Concentration Test, Blessed, 1996). These individuals ranged in age from 60 to 75 years (M = 66.92, SD = 5.02). Approximately half (51%) of this sample was male.

Sample of mothers and infants
Both the home and laboratory visits occurred between 0900 and 1000 h. Infant afternoon cortisol levels are confounded by variations in daytime routine (Goldberg et al., 2003; Gunnar and White, 2001). Thus, the design used here is more sensitive to increases in infant than in maternal cortisol levels. It is important to note that it is unlikely that these samples are confounded by the cortisol awakening response (CAR), as the study occurred well after participants awoke. During the home visit, mothers and infants were observed in a Toy Frustration Task (TFT; Braungart-Rieker and Stifter, 1996). Approximately one month later, the dyads participated in the laboratory Strange Situation Procedure (SSP; Ainsworth et al., 1978). Saliva samples were collected from mothers and infants once before and twice after each task. Both procedures were discontinued if the infant cried continuously for 20 s.

Sample of depressed adults
Individuals participated in two study sessions, the first consisting of a Structured Clinical Interview for DSM-IV, the second consisting of the Trier Social Stress Test (TSST; Kirschbaum et al., 1993). All TSST visits occurred between 1400 and 1840 h. Saliva samples were collected three times before and four times after the TSST. A complete description of study procedures is provided elsewhere (Chopra et al., 2009).

Sample of older adults
The older adult sample was exposed to the TSST, which occurred between 1300 and 1800 h. Saliva samples were collected twice before and six times after the TSST. A complete description of similar study procedures can be found elsewhere (Sindi et al., 2013).

Challenges for mothers and infants
The TFT (Braungart-Rieker and Stifter, 1996) consisted of four 90-s episodes: 1) Mother engages the infant with an attractive toy; 2) mother places the toy in a clear container with lid on but not sealed, while disengaging from the infant; 3) mother returns the toy to the infant; 4) mother places the toy in the clear container with the lid sealed and disengages again.
The SSP (Ainsworth et al., 1978) consisted of seven 3-min episodes, designed to induce increasing distress in the infant. During these episodes, 1) the infant and mother interact in an unfamiliar but child-friendly room, 2) a female stranger enters and engages the infant, 3) the mother leaves, 4) the mother returns and the female stranger leaves, 5) the mother leaves the infant alone, 6) the stranger returns, and 7) the mother returns and the stranger leaves. In this study, the SSP was used only as a stressor and was not coded for attachment classification (Ainsworth et al., 1978).
Due to ethical constraints, infant challenge paradigms are not highly stressful. Even so, among laboratory stressors, frustration paradigms (e.g., the TFT) are less potent than separation paradigms (e.g., the SSP). Based on meta-analytic data (Jansen et al., 2010), infant cortisol response to frustration paradigms corresponds to d (standardized difference between pre-stressor and post-stressor cortisol concentrations) = .19, whereas separation paradigms have an average effect of .34. Similar findings are reported in subsequent primary studies (Atkinson et al., 2013; Laurent et al., 2012). In addition, the TFT and SSP are differentially challenging for mothers and infants (Atkinson et al., 2013). The variable strength of these challenges is an important consideration, given the present focus on the meaning of cortisol indices, independent of stressor.

Challenge for the depressed sample
Individuals completed the Trier Social Stress Task (Kirschbaum et al., 1993) in a standard interview room, which consisted of a video camera, microphone and stand, as well as a three-person "expert committee" who sat behind a table. Participants were given 10 min to prepare a 5-min speech (as if they were in a job interview); this preparation occurred in a separate room. After completing the speech, participants completed a serial subtraction mathematical problem. Salivary cortisol samples were obtained seven times throughout the study session.
Table 1. Definitions of cortisol indices as identified in the literature search and calculated in the current samples.

Challenge for the older adult sample
Similar to the sample of depressed participants, older adults completed the standard TSST in an interview room with a two-person (mixed gender) expert committee. Participants were given 10 min to prepare their speech. Participants were then introduced to the panel of experts. The experimenter sat behind a one-way mirror to observe participants complete the 5-min speech and 5-min mental arithmetic task. In total, eight saliva samples were collected before and after the TSST.

Sample of mothers and infants
Salivary cortisol was obtained from infants and mothers 5 min before each challenge and 20 and 40 min post-challenge. Two Sorbettes (Salimetrics, State College, PA) were collected for each participant at each time point. Saliva samples were centrifuged for 10 min at 3000 rpm to extract the saliva and then stored in a freezer at −70 °C. Salivettes were thawed and centrifuged for 10 min at 3000 rpm at 4 °C. All samples were assayed twice using a salivary cortisol enzyme immunoassay kit (Salimetrics, State College, PA), and average values were used in analyses. The interassay variability was 10.6%; the intraassay variation was 8.3% for samples with low values and 6.9% for samples with high values.

Sample of depressed adults
Salivary cortisol was obtained from participants two times before the TSST began (−35 min, −20 min), just before beginning the task (0 min), and four times after the task (+20, +40, +60, and +80 min). Saliva samples were collected using cotton wool salivettes (Sarstedt, Montreal, Quebec). All samples were assayed twice using radioimmunoassay kits (ICN Biomedical Inc, Costa Mesa, CA). Both intra- and inter-assay variability were less than 10%. For the purposes of this study, we used the cortisol samples collected at 0, +20, and +40 min.

Sample of older adults
Salivary cortisol was obtained from participants 15 min before the beginning of the TSST (−15 min), immediately before beginning the task (−1 min), and six times after the task (+1, +10, +20, +30, +45, and +60 min). Saliva samples were collected using salivettes (Sarstedt, Quebec City, Quebec, Canada). All samples were analyzed with a fluorescence immunoassay. Both intra- and inter-assay variability were less than 10%. For the purposes of this study, we used the cortisol samples collected at 0, +20, and +45 min, given that these time points closely resemble those of the primary samples.

Cortisol indices
We used the baseline, +20, and +40 min time points (see footnote 1) in all samples to compute the fifteen cortisol indices identified in our literature survey. These time points were selected because they are consistent across the samples used here and are typically used in the literature. The fifteen cortisol indices include cortisol concentrations at each time point (i.e., baseline, 20, and 40 min), mean cortisol concentration, AUC G, AUC I, peak cortisol, minimum cortisol, cortisol reactivity, cortisol slope, percent change in cortisol, and the intercept and slope based on a regression of raw cortisol values. A description of all indices, as well as the formulas used to derive them, is included in Table 1.
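As an illustration of how such indices can be derived from three samples, the following minimal sketch computes several of them for one participant. It is not the Table 1 formulas themselves, which are not reproduced in this text. It assumes sampling times labeled 0, 20, and 40 min, the trapezoidal AUC formulas of Pruessner et al. (2003), reactivity as the 40-min value minus baseline, and peak reactivity as peak minus baseline. The function name, the dictionary keys, and the per-minute scaling of the slope are hypothetical choices made for the example.

```python
from statistics import mean


def cortisol_indices(baseline, c20, c40, times=(0.0, 20.0, 40.0)):
    """Summary indices from three cortisol samples (all in the same units).

    The time labels and several definitions are assumptions for illustration.
    """
    y = [baseline, c20, c40]
    t = list(times)

    # Area under the curve with respect to ground: sum of trapezoids.
    auc_g = sum((y[i] + y[i + 1]) / 2 * (t[i + 1] - t[i]) for i in range(2))
    # Area with respect to increase: subtract the rectangle under baseline.
    auc_i = auc_g - baseline * (t[-1] - t[0])

    # Ordinary least-squares line fitted through the raw values.
    t_bar, y_bar = mean(t), mean(y)
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    slope_raw = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y)) / sxx
    intercept = y_bar - slope_raw * t_bar

    return {
        "mean": y_bar,
        "peak": max(y),
        "min": min(y),
        "max_increase": max(y) - min(y),        # peak minus minimum
        "auc_g": auc_g,
        "auc_i": auc_i,
        "reactivity": c40 - baseline,           # 40-min minus baseline
        "peak_reactivity": max(y) - baseline,   # assumed definition
        "slope": (c40 - baseline) / (t[-1] - t[0]),  # assumed per-minute scaling
        "intercept": intercept,
        "slope_raw": slope_raw,
        "pct_change": 100 * (c40 - baseline) / baseline,
    }


# Made-up values for one participant (not study data).
example = cortisol_indices(4.0, 6.0, 5.0)
print(example["auc_g"], example["auc_i"], example["reactivity"])
```

Because AUC I is AUC G minus a baseline rectangle, it can be negative when cortisol falls below baseline, which is consistent with its description as an index of change rather than of increase alone.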

Statistical analyses
Data analysis involved two phases. 1) We constructed correlation matrices for each sample, incorporating all cortisol indices and eliminating multicollinear indices. We then assessed the matrices for biometric adequacy. 2) We subjected the reduced matrices to principal components analysis.

Correlation analyses
We conducted Pearson product-moment correlations, incorporating all 15 indices into the matrices, for each sample (mother, infant, and depressed) and challenge (TFT, SSP, TSST). We examined correlation matrices for adequacy. Variables were excluded from further analyses if they correlated above .90 with at least two other variables across both participant (mothers and infants) and visit (home and lab). A correlation of .90 or above indicates multicollinearity (Tabachnick and Fidell, 2007). The primary sample of mothers and infants was used to determine issues of multicollinearity, though similar multicollinear associations were also found in the depressed sample. Once all such variables were removed, we assessed each reduced matrix using Bartlett's test of sphericity (Bartlett, 1950), to ensure that variables were sufficiently intercorrelated, and the Kaiser-Meyer-Olkin measure of sampling adequacy (KMO), which assesses the degree to which each variable is explained by the others (Kaiser, 1974). These indices ensure that the matrices are sufficiently integrated to permit valid factor analysis.
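The screening rule can be sketched for a single correlation matrix as follows. This is a simplified illustration, not the procedure as actually run: it flags variables within one matrix only, whereas the rule described above required the pattern to hold across participants and visits. The function names, the use of absolute correlations, and the example data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def flag_multicollinear(data, threshold=0.90, min_partners=2):
    """Return variables with |r| >= threshold with at least min_partners others.

    `data` maps a variable name to its list of values across participants.
    """
    names = list(data)
    flagged = []
    for a in names:
        partners = [
            b for b in names
            if b != a and abs(pearson(data[a], data[b])) >= threshold
        ]
        if len(partners) >= min_partners:
            flagged.append(a)
    return flagged


# Made-up values for five participants (not study data).
toy = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [1.1, 2.0, 2.9, 4.2, 5.0],
    "c": [2.0, 4.1, 6.0, 8.2, 10.0],
    "d": [5.0, 1.0, 4.0, 2.0, 3.0],
}
print(flag_multicollinear(toy))
```

A flag identifies candidates only. Dropping every flagged variable would remove entire clusters of related indices, so which member of a cluster to retain remains a substantive decision.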

Principal component analyses (PCA)
Once matrix adequacy was established, six separate PCAs were conducted across participant and challenge. We used an oblique rotation (oblimin) to permit component dependence within solutions. Component loadings > .40 were interpreted (Field, 2009). Component retention was based on eigenvalue >1 (Guadagnoli and Velicer, 1988) and scree plot (Cattell, 1966) criteria. Tucker's coefficients of congruence (Tucker, 1951) were calculated to assess component congruence across participant (mothers, infants, depressed, and older adults) and time/challenge. Coefficients above .85 indicate high congruence, whereas coefficients above .95 indicate equivalent components (Lorenzo-Seva and Ten Berge, 2006).
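For reference, Tucker's coefficient of congruence between two loading vectors x and y is the sum of their cross-products divided by the square root of the product of their sums of squares. The sketch below implements that definition and applies the thresholds stated above; the function names, the label for coefficients at or below .85, and the example loadings are hypothetical and are not values from the present study.

```python
from math import sqrt


def tucker_congruence(x, y):
    """Tucker's congruence coefficient between two component loading vectors."""
    num = sum(a * b for a, b in zip(x, y))
    den = sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den


def congruence_label(phi):
    """Interpret phi using the cut-offs cited from Lorenzo-Seva and Ten Berge."""
    if phi > 0.95:
        return "equivalent"
    if phi > 0.85:
        return "high congruence"
    return "below the .85 threshold"


# Made-up loadings of three indices on one component in two samples.
sample_1 = [0.90, 0.80, 0.10]
sample_2 = [0.85, 0.75, 0.20]
phi = tucker_congruence(sample_1, sample_2)
print(round(phi, 3), congruence_label(phi))
```

Unlike a Pearson correlation, the coefficient does not center the loadings, so it is sensitive to the overall level of the loadings as well as to their pattern.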
An important strategy in labeling components involves a priori identification of marker variables whose meaning is known (Nunnally and Bernstein, 1994). We used AUC G and AUC I for this purpose, as they represent the two most-often used indices that capture cortisol levels across repeated measures. AUC G measures total cortisol output, capturing both intensity (overall distance of cortisol samples from the ground) and sensitivity (difference between individual cortisol samples), whereas AUC I measures change in cortisol over repeated samples, regardless of pre-challenge cortisol concentrations (Fekedulegn et al., 2007; Pruessner et al., 2003). Given the well-defined nature of these indices, as well as their popular use in the literature, we used AUC G and AUC I as our anchor points for defining the latent variables captured by the principal component analysis. This a priori expectation does not bias analyses but assists in the valid interpretation of extracted components.
Footnote 1: The +45 min sample was used for the older adult sample, as saliva was not collected at +40 min. Throughout the remainder of the paper, reference to the 40 min samples includes the 45 min sample for older adults.

Descriptive statistics
Raw baseline, 20-min, and 40-min cortisol values were positively skewed (Table 2). Given that all cortisol indices were based on these values, all indices, except for the raw intercept and slope, were calculated and then log transformed to minimize skew. Table 2 displays the means and standard deviations for each index prior to transformation. Several indices have particularly large standard deviations, likely a product of intra- and interindividual variability in cortisol responses to challenge, a recognized phenomenon in the HPA literature (Atkinson et al., 2013; Kudielka et al., 2009). Several indices (i.e., maximum increase, slope, reactivity, peak reactivity, AUC I) remained non-normally distributed after transformation; however, log transformation minimized this skew and factor analysis is robust to violations of normality (Atkinson, 1988).

Correlation analyses
Correlation matrices (Tables 3–5) revealed several variables with intercorrelations ≥ .90. This is not surprising, given that all variables are based on the same three values (baseline, 20-, and 40-min). This multicollinearity indicates that several variables do not explain unique variance beyond error. To render the matrix amenable to PCA, the following variables were excluded according to the aforementioned multicollinearity criteria: mean, maximum increase, and peak cortisol values. The reactivity and raw slope variables correlated perfectly (r = 1.00) across all matrices, indicating that they explain identical variance. The reactivity variable was calculated by subtracting baseline from the 40-min value, whereas the slope variable was calculated using all three samples (baseline, 20, and 40 min). Given that the 20-min value correlates very highly with both the baseline and 40-min values (median r = .54) across all samples, it appears that the inclusion of 20-min values does not provide additional variance. Given that reactivity is more frequently used in the literature, the raw slope variable was excluded from subsequent analyses. Although all 15 variables are displayed in our correlation matrices (Tables 3–5), these issues of multicollinearity resulted in the inclusion of 11 variables (baseline, 20-min, 40-min, minimum value, AUC G, AUC I, reactivity, peak reactivity, slope, intercept, and percent change) in subsequent analyses.
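One arithmetic property may bear on the perfect correlation between reactivity and raw slope. It is offered here as an observation under an assumption, not as the explanation given in the text. If the three samples are treated as equally spaced in time, the middle sample receives zero weight in the least-squares slope, so the slope equals reactivity divided by the total time span whatever the 20-min value is. The sketch below checks this numerically with made-up values; the function name and the time labels are hypothetical.

```python
def ols_slope(t, y):
    """Least-squares slope of y regressed on t."""
    n = len(t)
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    return sxy / sxx


# Assumed equally spaced time labels (minutes).
equal = (0.0, 20.0, 40.0)

# Made-up cortisol triplets; the second differs from the first only at 20 min.
for y in [(4.0, 6.0, 5.0), (4.0, 60.0, 5.0), (10.0, 3.0, 2.0)]:
    reactivity = y[2] - y[0]
    print(ols_slope(equal, y), reactivity / 40.0)

# With unequal spacing the middle value does influence the slope.
unequal = (-5.0, 20.0, 40.0)
print(ols_slope(unequal, (4.0, 6.0, 5.0)), ols_slope(unequal, (4.0, 60.0, 5.0)))
```

Under equal spacing the two printed columns agree for every triplet, so the two indices are exact linear functions of one another and must correlate at 1.00. With unequally spaced times the identity no longer holds exactly.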

Mother and infant samples
Four PCAs were conducted (for mothers and infants separately, show low intercorrelations (maternal Components 1 and 2 correlate −.14 and −.082, for the TFT and SSP respectively; infant Components 1 and 2 correlate −.003 and .091 for the TFT and SSP respectively), indicating that components 1 and 2 are largely independent. Table 9 shows that component structure is stable across time for mothers and infants and congruent across mothers and infants during both the TFT and SSP. The factor analytic findings are robust.

Clinically depressed sample
The component matrix for participants with depression is shown in Table 8. Similar to the mother and infant results, component 1 reflects total cortisol production, whereas component 2 reflects change in cortisol over time. Again, components 1 and 2 account for a high percent of the variance (93%). Components 1 and 2 are also largely independent, with an intercorrelation of .13. Three indices (i.e., 20 min, 40 min, and AUC G) have cross loadings in this validation sample, but not in the original sample. It is not entirely clear whether this is a substantive issue based on sample or challenge differences, or a statistical issue based on small sample size in this validation sample. As discussed below, however, these minor divergences have little influence on component congruence.
Table 3. Correlation matrix of maternal cortisol variables during the toy frustration and strange situation procedures.
Note: Correlations above the diagonal (shaded grey) refer to the Toy Frustration task; correlations below the diagonal refer to the Strange Situation procedure. Diagonal values (italicized, underlined) are test-retest correlations (across stressors). Correlations above .18 and .13 are significant at p < .01 and p < .05, respectively. Baseline = baseline cortisol value; 20-min = cortisol value at 20 min post-challenge; 40-min = cortisol value at 40 min post-challenge; mean = average cortisol value across samples; Pk = peak cortisol value; Min = minimum cortisol value; Max = maximum increase (peak minus minimum value); AUC G = area under the curve with respect to ground; AUC I = area under the curve with respect to increase; RT = reactivity; PkRT = peak reactivity; Slp = slope from baseline to 40 min value; Int = intercept of the regression line fitted through the raw cortisol data; Slp Raw = slope of the regression line fitted through the raw cortisol data; % change = percent increase/decrease from 0 to 40 min.

Older adult sample
A PCA was also conducted for the sample of older adults (Table 8). Similar to the other PCA results, component 1 reflects total cortisol output and component 2 reflects change in cortisol. These two components account for 92% of the variance. Similar to the other samples, the intercorrelation between the two components was not significant at r = .13. Table 9 shows that component structure is stable across participants and challenge. That is, coefficients of congruence ranged from .83 to .99 across participants during different challenges (e.g., infants and clinically depressed adults during the TFT and TSST, respectively). This is particularly notable given the small sample size of depressed and older adult samples.

Discussion
Cortisol is a commonly used marker of stress responsivity in both the developmental and adult literatures. These literatures incorporate an array of cortisol indices, often without clear definition or justification for use. Several investigators have suggested that this state of affairs is counterproductive (Atkinson et al., 2013; Dickerson and Kemeny, 2004). The present study examined associations amongst 15 cortisol indices (selected from a survey of recent HPA research), using primary samples of mother and infant cortisol collected during two challenges at two time points, as well as two validation samples of clinically depressed and healthy older adults. We conducted correlation and principal component analyses with these indices to explore their interrelationships and underlying dimensions and assist in bringing explicit biometric consensus to the literature.
To integrate and successfully interpret existing literature, a necessary first step is to understand how various cortisol indices are associated. Several variables were multicollinear, and there was wide variability in the correlation matrices of the remaining variables. The lower correlations were the first indication that these variables capture more than one dimension. At the same time, the higher correlations suggest that some variables share substantial variance; these results are a first indication that the plethora of indices utilized in the cortisol challenge literature is likely superfluous. Our next step was to examine what underlying dimensions are measured by the indices identified as nonredundant. All PCAs extracted two components, representing total cortisol production and change in cortisol over time (see Tables 6–8). The components explained a very high percent of the variance (between 79% and 93%). In addition (and corollary to the high variance explained by the components themselves), the relevant component loadings are consistently very high. This indicates that each index is an effective marker of its underlying dimension. It also indicates, however, that many of these indices are redundant. Congruence coefficients showed that components are consistent across time, challenge, and participant sample. Overall, the robust congruence coefficients attest to the reliability of the current findings.
Of the indices included in the PCA (keeping in mind that some indices were excluded prior to conducting the PCA), the AUC G and AUC I consistently loaded highly on component 1 and component 2, respectively, and each index loaded only weakly on the other component, across all samples. In addition, the reactivity and slope variables loaded very highly on component 2, and weakly on component 1, suggesting that they too could serve as potential markers for cortisol change. Overall, there is a need for consistent use of indices in this literature, with justification for index choice; the use of numerous indices measuring the same underlying construct (total cortisol or cortisol change) may thus be unnecessary and potentially confusing.
The demonstration of two largely independent components also underscores the need to determine whether these statistically derived constructs are related to physiologically distinct processes. To our knowledge, researchers have yet to directly explore whether there is a biological basis for the differentiation between total cortisol output and change in cortisol. Variability in cortisol reactivity is influenced by both genetic and environmental factors (Laurent et al., 2011;Young and Nolen-Hoeksema, 2001). However, research has not yet distinguished aspects of this cortisol responsivity. Similar questions have been explored in the diurnal cortisol literature. For example, the cortisol awakening response (CAR) is a discrete component of the cortisol diurnal cycle, with notable differences between the awakening response (measured as AUC) and mean cortisol levels throughout the day (Clow et al., 2004;Edwards et al., 2001). Further research demonstrates unique biological associations, such as genetics (Wust et al., 2000), neurobiology (Sage et al., 2001) and physiology (Pruessner et al., 1997) with CAR, but not daily circadian cortisol levels. These associations have yet to be explored in regards to differentiating cortisol responsivity to challenge. Our findings of two independent responsivity dimensions present a beginning point for investigation of differential physiological underpinnings.
In the present study we used diverse samples across different challenges. Interestingly, although the chronically depressed sample and the older adult sample both participated in the TSST, the chronically depressed sample had higher cortisol production than the healthy older adults. In the literature, there is great variability in cortisol responses to different laboratory stressors (including the TSST) for depressed and older adult samples. The cortisol values reported in the literature for depressed and remitted depressed samples are sometimes lower than those presented here (e.g., Ahrens et al., 2008; Morris and Rao, 2013; Stewart et al., 2013); however, some studies report values comparable to ours. For instance, one study showed similar cortisol responses to the TSST in individuals with remitted depression (Bagley et al., 2011). Other studies reported similar baseline cortisol values in depressed inpatients (Croes et al., 1993) and undergraduate samples with high depression (Scarpa and Luscher, 2002), in the context of non-TSST stressors. Other studies examining older adults with and without depression showed variable cortisol responses to the TSST, with some groups exhibiting cortisol levels comparable to those found here (Armbruster et al., 2011; Taylor et al., 2006). Despite the cortisol production differences between the depressed and older adult samples, the factor structures were replicated between all samples.
The presented analyses, and the studies they are based on, are not without limitations. First, although the cortisol collection time points (i.e., baseline, 20, and 40/45 min) likely allowed for the capture of baseline and peak values (Schwartz et al., 1998; Stroud et al., 2009), they likely did not permit examination of cortisol recovery (Goldberg et al., 2003; Stroud et al., 2009). Second and related, these results are limited by the cortisol collection time points (baseline, 20, and 40/45 min) typically used in the literature and may not generalize to various sampling times reported in other studies. Research demonstrates that examining a high frequency of cortisol samples allows for a more nuanced understanding of the stress response (e.g., Engert et al., 2013). Thus, further research should include a wider range of samples when examining the biometric properties of the cortisol response. Third, some of the indices remained non-normally distributed after transformation, though factor analysis and PCA are robust to non-normality (Atkinson, 1988). Fourth, the current analyses did not include cortisol trajectories (Davis and Sandman, 2010; Laurent et al., 2011), so we cannot comment on their relation to the composite indices assessed here.
A major strength of this study, on the other hand, involves the use of diverse samples, which vary across developmental stage, mental health status, and challenge, offering some assurance of generalizability.
Note. Only component loadings greater than 0.4 are bolded. Baseline = baseline cortisol value; 20 min = cortisol value at 20 min post-challenge; 40 min = cortisol value at 40 min post-challenge; Min = minimum cortisol value; AUC G = area under the curve with respect to ground; AUC I = area under the curve with respect to increase; RT = reactivity; PkRT = peak reactivity; Slp = slope from baseline to 40 min value; Int = intercept of the regression line fitted through the raw cortisol data; % change = percent increase/decrease from 0 to 40 min.

Conclusion
In summary, PCA of 15 cortisol indices revealed a consistent two-component structure, representing total cortisol output and cortisol change. This component structure was reliable across time, challenge and participant. This study provides an early step in gaining biometric clarification of the cortisol response literature.

Declaration of interests
The authors do not have any financial or personal interests to disclose.