Associations between hair and salivary cortisol, salivary alpha-amylase, and temperament dimensions among 3–6-year-olds

Associations between hair cortisol concentration (HCC), diurnal salivary cortisol (sCort) and alpha-amylase (sAA), and temperament dimensions were examined among 3-6-year-old Finnish children (n = 833). Children's hair samples were collected at preschool, while parents collected five saliva samples from children during one weekend day and completed a questionnaire assessing the child's temperament dimensions, i.e., surgency, negative affectivity, and effortful control (HCC, n = 677; AUCg of sAA, n = 380; AUCg of sCort, n = 302; temperament dimensions, n = 751). In linear regression analysis, diurnal sCort was positively associated with HCC, and the association persisted after adjustments (β 0.31, 95% CI 0.20-0.42). In logistic regression analysis, increasing scores in effortful control were associated with a higher likelihood of having high HCC (OR 1.47, 95% CI 1.07-2.03), the association slightly attenuating to non-significant after adjustments. Otherwise, no clear indication of associations between temperament and stress-related biomarkers was found.


Introduction
Balanced stress regulation is important for promoting both physical and mental health and wellbeing (Turner et al., 2020). Two biological stress systems are activated in response to stressors: the sympathetic (autonomic) nervous system (SNS) and the hypothalamic-pituitary-adrenal (HPA) system (Chrousos, 2009). HPA activity results in secretion of cortisol from the adrenal glands (Chrousos, 2009). Cortisol secretion shows a strong diurnal rhythm: it peaks shortly after awakening in the morning and declines during the day (Hucklebridge et al., 2005). Because blood sampling is impractical in many study designs, cortisol is often measured in saliva as a reliable marker of the HPA axis stress response (Kirschbaum and Hellhammer, 1989). Salivary alpha-amylase (sAA) also serves as an easily obtained surrogate marker of autonomic nervous system activity in adults and children (Nater and Rohleder, 2009), as it increases immediately after acute stress and returns to baseline level quickly (Engert et al., 2011).
For chronic stress, interpretation of the diurnal salivary cortisol (sCort) level is challenging because many factors, such as time of day, sleep, eating, and physical activity, influence the cortisol level (Adam and Kumari, 2009). It is not completely clear how chronically increased cortisol exposure is reflected in the diurnal cortisol excretion rhythm, and severe stress might sometimes even lead to a blunted (i.e., lower) diurnal cortisol level (Bunea et al., 2017). Therefore, the measurement of hair cortisol concentration (HCC) has emerged as a relatively new method for assessing long-term cortisol exposure, and it has also been proposed as a biomarker for chronic stress (Liu and Doan, 2019; Russell et al., 2012; Stalder and Kirschbaum, 2012). Overall, these indicators (sCort, sAA, HCC) reflect the functioning of the stress response systems, but it should be kept in mind that they do not reflect stress exposure in and of themselves, although they are called stress-related biomarkers. Furthermore, factors other than stress might also affect these indicators. Among children, higher HCC is associated with male sex and increased body mass index and waist circumference, and potentially with lower socio-economic status, particularly with reference to caregiver education and income (Gray et al., 2018). HCC has been increasingly used to assess children's cumulative cortisol exposure, especially because the collection of hair samples is easy and non-invasive (Gray et al., 2018; Liu and Doan, 2019; Vliegenthart et al., 2016). Among adults, significant positive correlations of varying magnitudes between salivary cortisol and HCC have been found (D'Anna-Hernandez et al., 2011; Short et al., 2016; Zhang et al., 2018), but findings have been contradictory among children (Golub et al., 2019; Papafotiou et al., 2017).
Furthermore, an asymmetric association between saliva biomarkers sCort and sAA has been suggested (Ali and Pruessner, 2012), but since using the ratio of them to assess dysregulations of the stress systems has also been criticized (Sollberger and Ehlert, 2016), the associations between the different stress-related biomarkers are of interest.
Although physiological stress regulation has been increasingly studied in recent decades as its measurement has become more feasible, more knowledge is needed on the functioning of children's regulatory system and its relation with self-reports and psychological traits. A person's way of reacting to stressful situations depends on several factors, possibly including temperament (Kudielka et al., 2009). Strelau (2001) suggests that certain temperamental characteristics could make individuals more vulnerable to experiencing stress: individuals scoring high in emotional reactivity and low in sensation seeking could, for example, experience the same situation as more stressful than an individual with more extroversion or high sensation seeking. Moving further from trait theories, Rothbart and Bates (2006) have described temperament as differences in individuals' reactivity, self-control, affect, and attention. It is relatively stable and biologically based yet environmentally affected. Among 3-7-year-old children, Rothbart et al. (2001) have defined three temperament dimensions: surgency (including extraversion, enjoyment of high intensity activities, high activity level, and impulsivity), negative affectivity (including fear, discomfort, anger or frustration, and sadness), and effortful control (including self-regulatory skills such as inhibitory control, attention focus, and gaining pleasure from low intensity activities).
Previous studies examining the relations between child temperament and physiological stress-related biomarkers have presented somewhat mixed results. In a recent cross-sectional study among toddlers, higher total cortisol production was associated with temperamental surgency, but not with negative affectivity or effortful control (Tervahartiala et al., 2020). On the other hand, effortful control has been linked to low salivary alpha-amylase (sAA), the association with salivary cortisol (sCort) being non-significant among toddlers in a cross-sectional setting (Laurent et al., 2012). The same pattern was noted when effortful control was assessed among 30-month-olds and sAA and sCort were measured at the age of 6 years (Taylor et al., 2013). In the predictive models in which intrusive or overcontrolling parenting was included, however, no associations between effortful control and salivary stress-related biomarkers were present (Taylor et al., 2013). On the other hand, in a study of stressful situations, effortful control has been associated positively with the reactivity of both sAA and sCort among preschoolers (Spinrad et al., 2009). In a preschool setting, the combination of high surgency and low effortful control was associated with low sCort but, through mediation by aggressive behavior and peer rejection, with a high sCort level (Gunnar et al., 2003). Thus, there seem to be diverse and context-specific associations between temperament and stress-related biomarkers. Identification of the associations between temperament dimensions and stress-related biomarkers is needed to better understand the pathways between stress and health outcomes. Furthermore, as early childhood is an important developmental period for calibration of the stress response, this knowledge could help guide parents and other caregivers in supporting the development of children with different temperaments through potentially stressful situations.
In the present study, using data on 3-6-year-old children, we aimed to examine the associations between stress-related biomarkers, i.e., diurnal sCort and sAA, and HCC. Moreover, we aimed to study how children's temperament dimensions associate with these stress-related biomarkers. Our study is exploratory and can be seen as hypothesis-generating.

Participants and study design
The Increased Health and Wellbeing in Preschools (DAGIS) study (www.dagis.fi) aimed to gain more knowledge on the socioeconomic differences in children's energy balance-related behaviors and stress. The current research uses the data collected in the cross-sectional phase of the DAGIS study between autumn 2015 and spring 2016 in southern and western Finland. The study was carried out in accordance with the Declaration of Helsinki and approved by the University of Helsinki Ethical Review Board in the Humanities and Social and Behavioral Sciences in February 2015 (#6/2015). Previous papers have described in detail the design and rationale of the study (Määttä et al., 2015) as well as the recruitment process, participants, and data collection (Lehto et al., 2018). In short, the study included 66 public early childhood education and care (ECEC) centers with a participation rate of 56%. Parents of 3592 3-6-year-old children were invited to the study via the ECEC centers. In total, 864 (24%) children participated. The flow chart showing the participation and exclusion has been presented previously (Lehto et al., 2018). Mean age of the children was 4.7 years and 49% of the participants were girls (Table 1). The general population in Finland is relatively ethnically homogeneous, and information on ethnicity or race is not registered anywhere. Thus, we followed the common practice of not requiring it in the survey. In our study, only 3.6% of participating parents reported speaking a language other than Finnish or Swedish (the two official languages) with their child (the majority of them speaking Russian or Estonian). Thus, we assume the participating children to be predominantly of Finnish origin.

Table 1
Descriptive characteristics among the participants (n = 833).

Stress-related biomarkers
Children's cumulative measure of cortisol production was assessed using HCC with single sampling. The ECEC personnel were trained in person to collect hair samples (approximately 40 hairs tied together) from the posterior vertex of the scalp of the children, cutting as close to the scalp as possible. The ECEC personnel marked the scalp end of the hair sample, packed the sample in foil, and put it in a small plastic bag. If the hair of a child was too short to be tied up or the child refused to have their hair cut, no hair sample was taken. Furthermore, absence from the ECEC center on the sample collection day, due to sickness or some other reason, resulted in missing hair samples even if the child otherwise participated in the DAGIS study. Thus, we received hair samples from 677 children (78% of all DAGIS study participants). The samples were sent to a laboratory where the strands were lined up and cut into two separate 2-cm segments. The laboratory washed the hair samples and extracted steroids according to the protocol of Davenport et al. (2006). HCC was measured from hair samples using a chemiluminescence immunoassay (IBL, Hamburg, Germany). The intra-assay and inter-assay coefficients of variation (CV) were both below 12%. We report the HCC (pg/mg) of the first 2-cm segment nearest the scalp, which indicates HCC of approximately the previous two months.
To measure diurnal stress-related biomarkers, sCort and sAA, parents collected five saliva samples from children during one weekend day (immediately after waking up, 30 min after waking up, one hour after waking up, before lunch, and before bedtime) with Salivette® swabs. Written instructions for the collection of saliva were given and video instructions could be found on the DAGIS webpage. We instructed parents that children should refrain from eating, drinking, and/or brushing teeth for at least 15 min before sampling, and children should chew on the swab for at least one minute until it was completely wet. Parents recorded sampling times along with any notes regarding the sampling in a diary. We asked parents to store the samples in a refrigerator until they were brought to ECEC centers and collected by the research staff. In total, the samples were kept in a refrigerator for 1-3 weeks until they were delivered to the laboratory and stored in a freezer (-20 °C) at the Department of Food and Nutrition, University of Helsinki. The sAA analyses were conducted first. Samples were spun at 2700 g for 15 min, and 10 μl of saliva was taken for the assay. The rest of the saliva was transferred to Eppendorf tubes and stored again at -20 °C. For sCort analyses, the saliva samples were spun at 2000 g for 5 min, and 25 μl was taken for the assay. sCort concentrations were analyzed using an enzyme immunoassay and sAA activity using a kinetic enzyme assay (Salimetrics, State College, Pennsylvania, USA). The intra-assay CV was below 4.8% and inter-assay CV below 9.5% for sCort. The corresponding numbers were below 3.6% and below 8.7% for sAA.
Altogether, 779 children (92% of all DAGIS study participants) gave saliva samples from which at least one analysis value could be successfully assayed. Not all children, however, gave five saliva samples, and there was some variance in the compliance with the protocol. We decided to exclude those children whose samples were not collected in accordance with the sampling guidelines suggested by Stalder et al. (2016) and Strahler et al. (2017). Thus, we excluded children whose samples were collected during two different days (n = 32) or whose sampling diary included remarks indicating a reason for exclusion (e.g., the sample was contaminated) (n = 6) (see Table 2 for final n of the samples).
To obtain the total diurnal response of the sCort concentration and sAA activity, the area under the curve with respect to the ground (AUCg) was calculated (Pruessner et al., 2003). Before calculating AUCg, the variables on sCort and sAA were winsorized, as suggested by Schlotz (2011), by giving outliers the value that was three standard deviations from the mean, with the lowest value being zero (n = 1-7 samples per sample collection time winsorized). Furthermore, when calculating AUCg, we excluded those children who did not have an analysis value or sampling time available for all five saliva samples, leaving 519 children for sCort and 660 for sAA analyses. Children's awakening time (available for 89% of children) and sample collection times were reported by parents, and based on that information we excluded those first morning samples that had been taken more than 10 min after awakening. We decided to include those children who did not have awakening time available (57 and 73 children for sCort and sAA, respectively) because excluding them from the analytical sample did not change the results (based on sensitivity analyses; see Chapter 2.5). For the second and third samples, we allowed five minutes of deviation from the protocol, and for the fourth sample collected before lunch we allowed times recorded between 11 a.m. and 3:30 p.m. No time restriction was set for the fifth sample collected before going to bed. Samples not complying with the sample collection schedule were excluded, leaving 302 children (35% of all DAGIS study participants) with eligible values for sCort and 380 (44% of all DAGIS study participants) for sAA analyses. In addition, further exclusion criteria were applied in the sensitivity analyses to examine whether food compliance and other remarks related to sample collection would affect the results (see Chapter 2.5).
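The winsorizing and AUCg steps described above can be illustrated with a minimal Python sketch. It is not the code used in the study: the function names are hypothetical, the use of the sample standard deviation is an assumption (the text only states "three standard deviations from the mean"), and AUCg is computed with the trapezoidal rule over the recorded sampling times, which is how the formula of Pruessner et al. (2003) is commonly implemented.

```python
from statistics import mean, stdev


def winsorize_3sd(values):
    """Clip values to mean +/- 3 SD, never letting the lower bound fall below zero.

    Intended to be applied separately to each sample collection time
    (e.g., all children's awakening samples). Uses the sample SD (assumption).
    """
    m, sd = mean(values), stdev(values)
    lower = max(0.0, m - 3 * sd)
    upper = m + 3 * sd
    return [min(max(v, lower), upper) for v in values]


def auc_ground(times_min, concentrations):
    """Area under the curve with respect to ground (zero), trapezoidal rule.

    times_min: sampling times in minutes, strictly increasing.
    concentrations: biomarker value at each sampling time.
    """
    if len(times_min) != len(concentrations) or len(times_min) < 2:
        raise ValueError("need at least two matching time/value pairs")
    area = 0.0
    for i in range(len(times_min) - 1):
        dt = times_min[i + 1] - times_min[i]
        if dt <= 0:
            raise ValueError("sampling times must be strictly increasing")
        # Each trapezoid: mean of the two adjacent values times the elapsed time.
        area += (concentrations[i] + concentrations[i + 1]) * dt / 2
    return area


# Hypothetical child: awakening, +30 min, +60 min, before lunch, before bedtime.
example_times = [0, 30, 60, 300, 780]
example_values = [10.0, 14.0, 12.0, 6.0, 2.0]
example_auc = auc_ground(example_times, example_values)
```

Because the area depends on the elapsed time between the first and last sample, two children with identical concentrations but different day lengths obtain different AUCg values, which is one reason to adjust saliva analyses for the time difference between the last and the first sample.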
AUCg: Area under curve with respect to the ground; HCC: Hair cortisol concentration; sAA: Salivary alpha-amylase; sCort: Salivary cortisol.

Temperament
One parent in each family completed the Very Short Form of The Children's Behavior Questionnaire (n = 751), assessing surgency, effortful control, and negative affectivity, as established by the instrument developers (Putnam and Rothbart, 2006). Parents indicated their opinion of their child's behavior during the past weeks in the 36-item questionnaire using a scale ranging from 1 (extremely untrue) to 7 (extremely true). Each dimension consists of 12 items presented in Supplement Table 1. The mean of the 12 items for each dimension represents a child's score for that dimension. Higher scores represent a higher level of the corresponding temperament characteristics. The questionnaire has been shown to demonstrate acceptable internal consistency and criterion validity (Putnam and Rothbart, 2006). In the DAGIS data, the Cronbach's alpha values for surgency, effortful control, and negative affectivity were 0.80, 0.74, and 0.76, respectively.
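As an illustration of the scoring and reliability figures reported above, the following Python sketch computes a dimension score as the mean of its items and Cronbach's alpha from an items-by-respondents matrix using the standard formula. The function names and the toy data are hypothetical, and the sketch assumes that item responses are already coded in the scored direction and that there are no missing responses.

```python
from statistics import mean, variance


def dimension_score(item_responses):
    """Mean of the item responses (1-7 scale) belonging to one temperament dimension."""
    return mean(item_responses)


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("every respondent needs the same number (>= 2) of items")
    item_columns = list(zip(*rows))
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Toy data: three respondents, three items that move together perfectly.
toy_rows = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
toy_alpha = cronbach_alpha(toy_rows)
```

In the toy data the items are perfectly consistent, so alpha reaches its maximum; real questionnaire data yield lower values, such as those reported for the DAGIS sample.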

Covariates
Covariates in the analyses were chosen based on previous studies (Strahler et al., 2017). Parents provided information on children's date of birth and gender (girl or boy). For the analyses, we calculated children's age as the date of conducting the research minus the birthdate. The educational level of both parents was inquired, and the highest level in the household was used in the analyses, categorized as high school or lower (including comprehensive, vocational, or high school), bachelor's degree or similar (including bachelor's degree or college), or master's degree or higher (including master's degree or licentiate/doctorate). Furthermore, parents provided information on household net income level using 10 pre-defined answer options ranging from less than 500 euros to more than 10,000 euros per month. This variable was transformed into household relative income by dividing monthly household income (the mean of the chosen answer option) by household size, where members of the household were summed using the following coefficients: first adult 1, second adult 0.5, and children aged 18 or younger 0.3. Weight and height of the children were measured by trained researchers at ECEC centers, and body mass index (BMI) was calculated as kg/m². Age- and sex-standardized BMI (ISO-BMI) was calculated using the Finnish references (Saari et al., 2011). We also adjusted the analysis for the season of collecting the data (September-October, November-December, or January-April) to take into account the possible seasonal variation in the stress-related biomarkers. The analyses with saliva biomarkers were additionally adjusted for the time difference between the last and the first saliva sample collection.
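The relative income transformation can be sketched as follows. The function names and the example household are hypothetical; the coefficients are those given above, and because the text specifies weights only for a first and a second adult, the sketch deliberately rejects other household compositions rather than guessing a weight.

```python
def weighted_household_size(n_adults, n_children):
    """Weighted household size: first adult 1, second adult 0.5, each child (<= 18 y) 0.3."""
    if n_adults not in (1, 2):
        # The text defines weights for one or two adults only.
        raise ValueError("weights are defined for one or two adults only")
    if n_children < 0:
        raise ValueError("number of children cannot be negative")
    adult_weight = 1.0 if n_adults == 1 else 1.5
    return adult_weight + 0.3 * n_children


def relative_income(monthly_income_eur, n_adults, n_children):
    """Monthly household net income divided by the weighted household size.

    monthly_income_eur is the mean of the chosen answer option (assumed
    to be supplied by the caller).
    """
    return monthly_income_eur / weighted_household_size(n_adults, n_children)


# Hypothetical household: two adults, two children, answer option mean of 3500 euros.
example_weight = weighted_household_size(2, 2)
example_income = relative_income(3500, 2, 2)
```

For this hypothetical household the weighted size is 1 + 0.5 + 2 × 0.3 = 2.1, so the relative income is 3500 / 2.1 euros per month.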

Statistical analysis
We used Student's t- and Chi-squared tests to compare the basic characteristics of the included and excluded children. Spearman's rank-order correlation analysis was performed to examine bivariate correlations between the different stress-related biomarkers, and between stress-related biomarkers and temperament variables. In addition, in accordance with the normality of the variables, Pearson's correlation analysis was performed when including temperament variables only.
Furthermore, we used linear regression analysis to examine whether AUCg of sCort or AUCg of sAA (first one at a time as the independent variable, and then, in addition, both simultaneously in the model adjusting for each other) were associated with HCC (the dependent variable). For this analysis, the continuous HCC variable was log-transformed to deal with outliers and skewed distribution. For both sCort and sAA as independent variables, three separate models were used to adjust for the covariates. Model 1 was adjusted for age and gender, Model 2 was additionally adjusted for season of data collection and the time difference between the last and the first saliva sample collection, and Model 3 was additionally adjusted for highest educational level in the household, household relative income level, and the child's BMI. In addition, to examine the independent association of the stress-related biomarkers as suggested by Sollberger and Ehlert (2016), the same three models as described above were further adjusted for sCort in the analyses on sAA and for sAA in the analyses on sCort.
Furthermore, we examined the association between temperament dimensions (surgency, effortful control, and negative affectivity as independent variables one at a time) and HCC, sCort, and sAA (the dependent variables one at a time) using logistic regression analysis. To deal with outliers and skewed distribution, as well as due to our interest in examining both high and low values of the biomarkers, we decided to divide HCC, AUCg of sCort, and AUCg of sAA into quintiles. Temperament dimensions were used in the analyses as continuous variables. As we wanted to consider that stress can possibly result in low cortisol levels in saliva (Bunea et al., 2017) or hair (Khoury et al., 2019), and this might apply to sAA as well, we examined the high and low HCC, sCort, and sAA values separately, thus calculating the odds ratios (OR) and 95% confidence intervals (CI) in two different analyses. First, we modeled the odds of having high values (i.e., belonging to the 5th quintile), using quintiles 2-4 as the reference category and excluding the 1st quintile from the analyses. Second, we modeled the odds of having low values (i.e., belonging to the 1st quintile), using quintiles 2-4 as the reference category and excluding the 5th quintile from the analyses. Confounding factors were adjusted for by using the following models for HCC: Model 1 was adjusted for age and gender; Model 2 was additionally adjusted for the two other temperament dimensions; Model 3 additionally included season of data collection, highest educational level in the household, household relative income level, and the child's BMI. For analyses regarding sCort and sAA, the models were the same as for HCC, except for additionally adjusting for the time difference between the last and the first saliva sample collection in all models.
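The construction of the two binary outcomes can be sketched in Python as below. This only illustrates the coding scheme, not the SPSS procedure used in the study: the function names are hypothetical, and the quintile cut points come from `statistics.quantiles` with its default method, which may differ slightly from the cut points SPSS produces.

```python
from statistics import quantiles


def quintile_of(values):
    """Assign each value to a quintile (1-5) using four cut points from the data."""
    cuts = quantiles(values, n=5)  # four cut points separating five groups
    return [1 + sum(v > c for c in cuts) for v in values]


def high_outcome(values):
    """1 = 5th quintile, 0 = quintiles 2-4 (reference), None = 1st quintile (excluded)."""
    coding = {1: None, 2: 0, 3: 0, 4: 0, 5: 1}
    return [coding[q] for q in quintile_of(values)]


def low_outcome(values):
    """1 = 1st quintile, 0 = quintiles 2-4 (reference), None = 5th quintile (excluded)."""
    coding = {1: 1, 2: 0, 3: 0, 4: 0, 5: None}
    return [coding[q] for q in quintile_of(values)]


# Toy biomarker values for ten hypothetical children.
toy_values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_high = high_outcome(toy_values)
toy_low = low_outcome(toy_values)
```

Children coded `None` are dropped from the corresponding logistic regression, so each of the two analyses uses roughly four fifths of the children with a valid biomarker value.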
Finally, regarding analyses on the association between stress markers and temperament, possible effect modification by gender was studied by entering the interaction term with the independent variable into the models, as previous studies have found potential gender differences in physiological stress responses (Oyola and Handa, 2017). One potential gender interaction was found (p < .1), and in this case the analysis was conducted separately for girls and boys. In addition, regarding the log-transformed HCC variable, a linear regression analysis was applied to study the linear association of HCC with temperament dimensions as a supplementary analysis.
Furthermore, for saliva biomarkers, we performed sensitivity analyses by re-running the above-mentioned analyses with filters excluding non-compliant participants according to stricter criteria. Thus, we excluded those first morning samples that were taken more than 5 min after awakening as well as those children who did not have awakening time available. Furthermore, based on reports in the sample collection diary, acute current illness, eating or brushing teeth before sample collection, and non-compliance with other guidelines (e.g., not keeping samples in a refrigerator, or possible contamination) were reasons for exclusion. The results of the sensitivity analyses did not notably differ from the results of the main sample (data not shown). Thus, we present the results of the sample that complied with the less strict rules to maximize the statistical power.
All the statistical analyses were conducted using the two-sided 5% level of statistical significance and performed using SPSS Statistics 25 (IBM, Armonk, NY, USA). Table 1 shows the characteristics of the participants, defined as those children from whom there were data on at least one of the following variables: HCC, AUCg of sCort, AUCg of sAA, surgency, effortful control, or negative affectivity (n = 833). Table 2 presents the descriptives of stress-related biomarkers and temperament dimensions. Because of missing information in the above-mentioned variables as well as in covariates, the analyzed samples were smaller than the total DAGIS data. As the number of children with complete AUCg of sCort data (n = 302) was substantially lower than the number of all participants in this study (n = 833), we compared whether the groups of children with and without AUCg of sCort data differed from each other according to demographic factors, HCC, or temperament dimensions. There were no differences in these groups according to participant's age, gender, BMI, HCC, highest educational level in the household, relative household income or scores in surgency, effortful control, and negative affectivity (Supplement Table 2).

Results
The Spearman's r for the correlation between AUCg of sCort and HCC was 0.32 (p < .001). Surgency correlated negatively with effortful control (Pearson's r = -0.27, p < .001) and negative affectivity (Pearson's r = -0.19, p < .001). No other statistically significant correlations were found between dependent and independent variables. A correlation table of the Spearman's r for the examined variables is shown as Supplement Table 3.
In the linear regression analysis, AUCg of sCort had a positive association with HCC in the age and gender adjusted Model 1 (Table 3). Further adjustments with season of data collection, the child's BMI, highest educational level in the household, and household relative income did not change the results. The result remained statistically significant even after adjustment for AUCg of sAA. On the other hand, we found no association between AUCg of sAA and HCC in any of the statistical models examined (Table 3).
In the logistic regression analysis, a positive association was found between effortful control and high HCC in Model 1. The OR for belonging to the highest quintile of HCC was 1.47 for a one-point increment in effortful control (95% CI 1.07-2.03, p = .02) (Table 4). With further adjustments in Models 2 and 3, the association was slightly weaker and close to reaching the significance level (p = .06 in Model 2; p = .08 in Model 3). Furthermore, there was a suggestive inverse association (close to reaching the significance level) between surgency and high HCC (p = .07 in Model 1; p = .17 in Model 2; p = .08 in Model 3) (Table 4). The associations between temperament dimensions and low HCC level (quintile 1 vs. quintiles 2-4) were also tested, but no statistically significant associations were found (Table 4). However, a suggestive inverse association between negative affectivity and low HCC, close to reaching the significance level, appeared in Model 3 (p = .06) (Table 4). Furthermore, as a supplementary analysis, results on the linear association of HCC with temperament dimensions are shown in Supplement Table 4, but there were no statistically significant associations.
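As a reading aid for the odds ratio above, the following Python sketch shows how a reported OR and its 95% CI relate to the underlying log-odds coefficient, its approximate standard error, and a two-sided Wald p-value. This is a back-of-the-envelope consistency check under the assumption of a symmetric Wald interval on the log scale, not the original SPSS output, and the function name is hypothetical.

```python
import math


def wald_summary(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover the log-odds coefficient, its SE, the z statistic, and a two-sided p-value.

    Assumes the 95% CI was computed as exp(beta +/- z_crit * SE).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p_value


# Effortful control and high HCC, Model 1 (values as reported in the text).
beta, se, z, p_value = wald_summary(1.47, 1.07, 2.03)

# On the multiplicative scale, a two-point increment corresponds to OR squared.
or_two_points = 1.47 ** 2
```

Applied to the Model 1 estimate, the recovered p-value should land close to the reported p = .02, and the odds ratio for a two-point difference in effortful control is the square of the one-point odds ratio, provided the log-odds are linear in the score.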
The temperament dimensions did not associate with diurnal sCort (Table 5) or sAA (Table 6). However, a potential gender interaction was found for the association between surgency and high sAA (p for interaction .053). Still, the results were statistically non-significant in all models in the gender-stratified analysis on the association between surgency and high sAA (data not shown; all p-values > .05). Although the point estimate was above 1 for boys and below 1 for girls, indicating some degree of gender interaction, confidence intervals were wide and overlapping (Model 1: OR 1.36, 95% CI 0.83-2.23 among boys; OR 0.74, 95% CI 0.48-1.14 among girls).

Discussion
In this study, we examined the associations between stress-related biomarkers (i.e., HCC as well as diurnal sCort and sAA) among 3-6-year-old children. In addition, we examined whether a child's temperament dimensions were associated with the above-mentioned biomarkers. The diurnal sCort level was positively associated with HCC independently of sAA, whereas the diurnal sAA was not associated with HCC. The association between the diurnal sCort level and HCC has been examined in many studies among adults, and although significant positive correlations have been found, the effect sizes have often been modest (Stalder et al., 2017). Among the few studies on school children, the results on HCC have been rather contrasting, with one study showing a moderate positive correlation with AUCg of sCort (Papafotiou et al., 2017), and another showing no correlation with almost any of the examined salivary cortisol measures (Golub et al., 2019). Although hair and saliva samples were taken in our study at the same time, they reflect cortisol concentrations across different durations, as the HCC of 2-cm hair reflects cumulative cortisol levels up to 2 months prior to the data collection, and the diurnal AUCg of sCort reflects one-day cortisol output. A positive correlation between AUCg of sCort and HCC, as in our study, was also found among adults by Short et al. (2016) when they examined the correlation between a prior 30-day mean sCort AUCg score and 1-cm HCC. Furthermore, the same study showed no associations between HCC and the monthly average of other sCort measures, such as the cortisol awakening response (CAR) or diurnal slope (Short et al., 2016). Thus, our results together with previous studies on HCC support the use of HCC as a biomarker of cumulative long-term cortisol levels (Short et al., 2016; Stalder and Kirschbaum, 2012), which provides a good basis for future studies examining children's stress-related biomarkers and, for example, development and well-being.
The lack of association between sAA and HCC is a plausible result, as sAA levels increase immediately after acute stress and return to baseline level quickly. Moreover, the SNS is characterized by instantaneous fluctuations in physiological markers of the stress response, whereas the HPA axis is slower to respond (Ali and Pruessner, 2012). Thus, activation of different stress-related systems follows different time courses, and comparing single output values or long-term exposure does not take into account the distinct temporal dynamics between cortisol and alpha-amylase secretion (Engert et al., 2011). There are very few studies on stress biomarkers and temperament among children in the same age group as in our study; thus, studies conducted in other age groups and with different study designs are also considered in this discussion. We noted only one statistically significant association regarding temperament dimensions and stress-related biomarkers. Higher scores in effortful control were associated with increased odds of being in the highest HCC quintile, although the association attenuated in the fully adjusted model. This is a somewhat unexpected result, as a previous study suggested that higher chronic stress could associate with lower scores in effortful control during children's transition to school (Hall and Lindorff, 2017). Although the cross-sectional design of our study does not allow conclusions on the direction of the association, the present result could be interpreted to mean that children scoring high in effortful control have higher HCC levels.
Speculatively, this could be due to caregivers' high expectations that these children can cope more independently than other children, leaving them with less caregiver comfort and support than children who express their feelings more prominently. Moreover, the assumption that children scoring high in effortful control could be experiencing less stress due to their potentially better self-regulation skills was not supported in our study.

Table 3
Associations of salivary cortisol and alpha-amylase to hair cortisol concentration, linear regression, standardized B coefficient, and its 95% confidence interval.

Table 4
Odds ratios (OR) and 95% confidence intervals (CI) for having high (a) or low (b) hair cortisol concentrations (HCC) by the three temperament dimensions (surgency, effortful control, and negative affectivity). Model 3: Adjusted as Model 2 + season of data collection, highest educational level in the family, household relative income level, and the child's BMI. (a) The 5th quintile vs. quintiles 2-4 (the 1st quintile excluded from analyses). (b) The 1st quintile vs. quintiles 2-4 (the 5th quintile excluded from analyses).

Table 5
Odds ratios (OR) and 95% confidence intervals (CI) for having high (a) or low (b) AUCg of sCort values by the three temperament dimensions (surgency, effortful control, and negative affectivity). AUCg: Area under the curve with respect to the ground; BMI: Body mass index; sCort: Salivary cortisol. Model 1: Adjusted for age, gender, and time difference between the saliva samples 5 and 1. Model 2: Adjusted as Model 1 + the other temperament dimensions. Model 3: Adjusted as Model 2 + season of data collection, highest educational level in the family, household relative income level, and the child's BMI. (a) The 5th quintile vs. quintiles 2-4 (the 1st quintile excluded from analyses). (b) The 1st quintile vs. quintiles 2-4 (the 5th quintile excluded from analyses).

Table 6
Odds ratios (OR) and 95% confidence intervals (CI) for having high (a) or low (b) AUCg of sAA values by the three temperament dimensions (surgency, effortful control, and negative affectivity). Model 3: Adjusted as Model 2 + season of data collection, highest educational level in the family, household relative income level, and the child's BMI. (a) The 5th quintile vs. quintiles 2-4 (the 1st quintile excluded from analyses). (b) The 1st quintile vs. quintiles 2-4 (the 5th quintile excluded from analyses).
No relation between effortful control and sCort was found, which is in line with previous studies among toddlers (Laurent et al., 2012; Tervahartiala et al., 2020) and preschoolers (Taylor et al., 2013). The difference in results between HCC and sCort is understandable because both HCC and temperament reflect a long-term state of affairs, whereas the AUCg of sCort in this study described the cortisol secretion of one day. Similarly, when examining emotional and behavioral symptoms among schoolchildren, Golub et al. (2019) found them to be more strongly associated with HCC than with several one-day sCort measures.
In the present study, we found a suggestive tendency for children with high surgency to be less likely to be in the highest HCC quintile. Although this result did not reach statistical significance, it seems that these children could show lower long-term cortisol levels despite their impulsive temperament and high activity level. This is somewhat in contrast with a recent finding of a positive association between surgency and diurnal salivary cortisol production (Tervahartiala et al., 2020). Speculatively, these findings could be interpreted to mean that, in the long term, temperamental surgency would not lead to an increased stress load even if it is associated with increased cortisol production during children's daily activities. Perhaps enjoyment of high-intensity activities gives children outlets to cope with potentially stressful situations, or children with high surgency scores are allowed to express their temperament in an approving environment.
Furthermore, we found a suggestive association, close to reaching the significance level, between higher negative affectivity and lower odds of being in the lowest HCC quintile, implying that children with low scores in negative affectivity might have lower HCC values. As there is no established threshold for blunted hair cortisol levels in young children, no conclusions or implications can be drawn from this result. However, it seems plausible that children without a tendency towards fear, discomfort, anger, frustration, and sadness could be those with lower long-term HCC levels. Interestingly, the opposite phenomenon was not seen, as negative affectivity was not associated with belonging to the highest HCC quintile. Nor was negative affectivity associated with the saliva biomarkers in this study. A previous study found a modest positive association between fearfulness, a sub-dimension of negative affectivity, and cortisol reactivity in laboratory tasks (Talge et al., 2008). The difference between the findings could arise because our study included only indicators of accumulation of the biomarkers, not reactivity to acute stressors, but more research is needed to clarify the associations between negative affectivity and long-term cortisol exposure.
Furthermore, when interpreting the above-mentioned results, the subjective nature of the parental assessment of child temperament should be considered. Although we used a widely accepted method, the Very Short Form of the Children's Behavior Questionnaire (Putnam and Rothbart, 2006), some studies have shown that parental reports of children's temperament can differ depending on, for example, the child's gender, the parent's gender, and parental psychological characteristics, such as depressive symptoms (Clark et al., 2017; Kitamura et al., 2015; Olino et al., 2013).
We did not find many associations between temperament dimensions and stress-related biomarkers. It is possible that the first and fifth quintiles of the stress-related biomarkers as an outcome do not reflect the biological thresholds for being "stressed". Among children as young as in the present study, the range of HCC has been found to be wide and age-dependent (Karlen et al., 2013), which poses a challenge for determining these thresholds. However, our analytical strategy was reasonable because no cut-off points or risk limits for too high or too low levels of stress-related biomarkers have been established. Previous meta-analyses have reported, among individuals who had experienced major stress such as childhood maltreatment, both blunted sCort secretion in response to an acute stressor (Bunea et al., 2017) and low HCC (Khoury et al., 2019). Our sample, in contrast, consisted of a general Finnish child population (i.e. the sample was not selected on the basis of possible stress experiences), and we examined the levels of stress-related biomarkers in a daily life setting.
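The quintile-based outcome coding described in the table footnotes (the 5th quintile vs. quintiles 2-4 with the 1st quintile excluded, and vice versa) can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name `quintile_outcomes`, the choice of Python's `statistics.quantiles` with the "inclusive" method, and the handling of values lying exactly on a cut point are all assumptions.

```python
from statistics import quantiles


def quintile_outcomes(values):
    """Code 'high' and 'low' binary outcomes from a list of biomarker values.

    high[i]: 1 if value i is in the 5th quintile, 0 if in quintiles 2-4,
             None (excluded) if in the 1st quintile.
    low[i]:  1 if value i is in the 1st quintile, 0 if in quintiles 2-4,
             None (excluded) if in the 5th quintile.
    """
    # Four cut points split the data into five groups; only the first and
    # last are needed. The interpolation method is an assumption.
    q20, _, _, q80 = quantiles(values, n=5, method="inclusive")
    high, low = [], []
    for v in values:
        in_first = v <= q20   # hypothetical tie rule: cut point goes to 1st
        in_fifth = v > q80
        high.append(None if in_first else int(in_fifth))
        low.append(None if in_fifth else int(in_first))
    return high, low


# Toy data only; these are not study values.
high, low = quintile_outcomes(list(range(1, 11)))
```

Each resulting outcome could then serve as the dependent variable in a logistic regression, with the excluded (`None`) observations dropped from that particular model.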
Furthermore, the associations between temperament and stress-related biomarkers are not necessarily straightforward. In path analyses examining the association between preschoolers' high surgency/low effortful control and higher sCort, for example, the association could partly be explained by more aggressive behavior and peer rejection among these children (Gunnar et al., 2003). Future studies might therefore benefit from studying different mediators or moderators in the associations between temperament and stress-related biomarkers. For example, in the present study, a potential gender interaction was found for the association between surgency and high sAA, but the analysis might have been underpowered for gender-stratified examination.
The main strengths of this study are its relatively large sample size for the HCC analyses and the age group studied, as few studies have examined relations of stress-related biomarkers in 3-6-year-old children. In addition, the study is among the few to report associations between temperament dimensions and stress-related biomarkers in children in everyday life settings. Yet another strength is the broad perspective on indicators reflecting the functioning of stress response systems, as we examined the associations of temperament with high and low HCC, sCort, and sAA.
A limitation of the study was the cross-sectional design, which does not allow conclusions on the direction of the associations. Furthermore, there could be potentially important mediators explaining how and why temperament is related to stress processes but they could not be examined in cross-sectional models as they require temporal precedence to be established. Another limitation of the study was that the salivary biomarkers (sCort and sAA) included only one day of assessment. Furthermore, although hair and saliva samples were taken in our study at the same time, they reflect cortisol concentrations across different durations, as the HCC of 2-cm hair reflects cumulative cortisol levels up to 2 months prior to the data collection, and the diurnal AUCg of sCort reflects one-day cortisol output. However, HCC had a statistically significant positive correlation with sCort in this study, indicating aligned results for different cortisol measures and supporting the use of these biomarkers in future studies from our DAGIS data.
In addition, the analytical sample size of the saliva biomarkers was substantially diminished by poor compliance with the sample collection instructions. Especially in the gender-stratified analyses, the number of children in the groups was low, which resulted in wide confidence intervals. In large non-controlled studies, cortisol assessments using multiple saliva sample collections are challenging. Participant compliance can be a serious problem because following strict guidelines and sample scheduling is demanding, especially among families with young children. However, based on parent-reported information during the sample collection, we cleaned the data carefully as described in the methods section. When additionally excluding those children who had not fully complied with the saliva collection instructions (e.g., regarding food compliance), the results remained the same. In future studies, emphasizing the importance of complying with the sample collection protocol and using time-detected sample collection methods is warranted.
Overall, associations between temperament and children's stress-related biomarkers should be further studied with different study designs, and, especially concerning the saliva biomarkers, with larger sample sizes. With larger sample sizes it would be possible to examine, for example, children in extreme groups of the biomarkers or temperament, or stratify the analysis by potential moderators. However, due to the above-mentioned compliance problems and burden for the participants, HCC seems to be a better option in large multicomponent studies.
Another limitation is the low participation rate among families, probably because of the high respondent burden, as we examined several energy balance-related behaviors and stress simultaneously (the study protocol included multiple questionnaires and instructions for participants and their families). This weakens the generalizability of the findings but should not cause biased results. However, if the sample is rather homogeneous due to the low participation rate, this might have led to narrower distributions and even to conservative estimates of the associations. At the same time, the findings between temperament and stress-related biomarkers were few and relatively weak, so they should be interpreted with caution. This might be due to how the variables were quantified. Caution when interpreting the results is also warranted because the supplementary linear regression analysis showed no association at all between HCC and temperament dimensions.
We did not use measures of sCort or sAA other than AUCg, such as CAR, the alpha-amylase over cortisol (AOC) ratio, or the cortisol over alpha-amylase (COA) ratio, in our analyses. When using CAR, the right timing of the sample collection in the morning is of crucial importance, and we were doubtful of parents' compliance with the sample collection schedule, especially concerning the first samples in the morning. Furthermore, although the AOC ratio has been associated with several aspects of chronic stress better than AUCg of sAA or sCort alone (Ali and Pruessner, 2012), it is also a debated measure with acknowledged challenges in statistical methods and interpretation of the results (Sollberger and Ehlert, 2016). Thus, instead of using AOC in our analyses, we adjusted for sCort when analyzing the association between HCC and sAA and for sAA when analyzing the association between HCC and sCort (presented in Table 2) to examine the independent association of one saliva biomarker, as proposed by Sollberger and Ehlert (2016). Because the adjustment did not affect the results at all in the above-mentioned analyses, or in the further analyses examining the associations of sCort or sAA with temperament (data not shown), we decided not to include the other saliva biomarker as a covariate in the final models presented in Tables 4-6.
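To make the measures discussed above concrete, the sketch below computes an area under the curve with respect to the ground using the trapezoid rule, and an AOC-type ratio as the AUCg of sAA divided by the AUCg of sCort. It assumes that AUCg is obtained by trapezoidal integration over the sampling times and that the ratio is formed from the two AUCg values; the function names `auc_ground` and `aoc_ratio`, the time unit, and the example numbers are hypothetical and not taken from the study.

```python
def auc_ground(times_h, concentrations):
    """Area under the curve with respect to ground (zero), trapezoid rule.

    times_h: sampling times in hours, in increasing order (assumed unit).
    concentrations: biomarker concentration measured at each time point.
    """
    if len(times_h) != len(concentrations) or len(times_h) < 2:
        raise ValueError("need at least two matching time/concentration pairs")
    total = 0.0
    for i in range(len(times_h) - 1):
        dt = times_h[i + 1] - times_h[i]
        if dt <= 0:
            raise ValueError("sampling times must be strictly increasing")
        # Area of one trapezoid: mean of adjacent values times elapsed time.
        total += (concentrations[i] + concentrations[i + 1]) / 2 * dt
    return total


def aoc_ratio(auc_saa, auc_scort):
    """Alpha-amylase over cortisol ratio from two AUCg values (assumed form)."""
    return auc_saa / auc_scort


# Toy values only: three samples taken at 0, 1, and 3 hours.
example_auc = auc_ground([0, 1, 3], [10.0, 6.0, 2.0])
```

Because the area depends on the time elapsed between the first and last sample, adjusting for that time difference, as in Model 1 of Table 5, is one way to keep AUCg values comparable across children.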
Unfortunately, we did not collect information on glucocorticoid or other kinds of medication that could have influenced the level of the biomarkers of interest in the saliva or hair. Furthermore, no information on hair color, type, washing frequency, or use of hair products or treatments was collected, although, for example, the use of hair dye could be associated with lower HCC (Abell et al., 2016). However, most previous studies have found inconclusive or no effects of the above-mentioned factors on HCC (Gray et al., 2018). Furthermore, it is unlikely that young children would have used heavy hair treatment or coloring. Although information on illness, eating, and brushing teeth, for example, was collected with the sampling diary, we could not rule out the possibility that some of the parents did not report these.

Conclusions
This study found a positive association between one-day diurnal sCort output and HCC, which represents long-term cortisol levels, in young children, providing further support for the use of these biomarkers in epidemiological studies. Children's temperament dimensions showed few associations with HCC. Effortful control was positively associated with high HCC, a finding contradicting the suggestion that children with high scores in effortful control could be experiencing less stress. Furthermore, suggestive (close to reaching the significance level) inverse associations between surgency and high HCC, and between negative affectivity and low HCC, were observed. No associations between temperament and saliva biomarkers were detected. As our findings can be seen as hypothesis-generating, they require further investigation. Future studies should elaborate on how children's temperament, assessed as temperament subdimensions and different combinations of temperament dimensions, is established on a behavioral level and associated with stress-related biomarkers.

Declaration of competing interest
All authors declare that they have no competing interests.