Testosterone and estradiol affect adolescent reinforcement learning

Sina Kohne; Esther K. Diekhof

doi:10.7717/peerj.12653

Faculty of Mathematics, Informatics and Natural Sciences, Department of Biology, Institute of Animal Cell and Systems Biology, Neuroendocrinology and Human Biology Unit, Universität Hamburg, Hamburg, Germany

Published: 2022-02-03
Accepted: 2021-11-29
Received: 2021-04-28

Academic Editor: Charles Okpala

Subject Areas: Developmental Biology, Neuroscience, Pediatrics, Psychiatry and Psychology
Keywords: Adolescence, Learning, Reward, Estradiol, Testosterone

Copyright: © 2022 Kohne and Diekhof
Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.

Cite this article: Kohne S, Diekhof EK. 2022. Testosterone and estradiol affect adolescent reinforcement learning. PeerJ 10:e12653 https://doi.org/10.7717/peerj.12653

The authors have chosen to make the review history of this article public.

Abstract

During adolescence, gonadal hormones influence brain maturation and behavior. The impact of 17β-estradiol and testosterone on reinforcement learning was previously investigated in adults, but studies with adolescents are rare. We tested 89 German male and female adolescents (mean age ± SD = 14.7 ± 1.9 years) to determine the extent to which 17β-estradiol and testosterone influenced reinforcement learning capacity in a response time adjustment task. Our data showed that 17β-estradiol correlated with an enhanced ability to speed up responses for reward in both sexes, while the ability to wait for higher reward correlated with testosterone, primarily in males. This suggests that individual differences in reinforcement learning may be associated with variations in these hormones during adolescence, which may shift the balance between a more reward- and an avoidance-oriented learning style.

Introduction

Sex hormones have a great impact on adolescent (neuro-)physiological maturation. With the onset of puberty at 9 to 10 years in girls and 10 to 12 years in boys, sex hormone levels increase rapidly (Peper & Dahl, 2013). Sex hormone levels are regulated via the reproductive hypothalamic-pituitary-gonadal axis, which is initiated by the secretion of hypothalamic gonadotropin-releasing hormone (GnRH). GnRH stimulates the synthesis and secretion of luteinizing hormone and follicle-stimulating hormone in the pituitary, which in turn contribute to the maturation of the gonads and to sex hormone secretion (Sisk & Foster, 2004).

The rising sex hormone level during adolescence significantly contributes to pubertal development. With attainment of sexual maturity, sex hormones maintain reproductive function (Sisk & Foster, 2004). Neurophysiological investigations demonstrated a different impact of testosterone and 17β-estradiol (E₂) on brain maturation. Testosterone is related to an increase of global white and gray matter volume in male adolescents (Peper et al., 2009; Peper et al., 2011), whereas in female adolescents E₂ may be negatively associated with gray matter volume (Peper et al., 2009). Further, E₂ seems to predict white matter growth across the entire brain in both sexes (Herting et al., 2014). Moreover, neurophysiological developmental changes during adolescence could be better explained by hormonal and pubertal development (measured by the Pubertal Development Scale or Tanner Stages) than by chronological age (Herting et al., 2014; Wierenga et al., 2018).

Sex hormones also shape behavior and cognitive function in animals and humans. Besides the impact of E₂ and testosterone on adolescent reward-related risk-taking (e.g., Op De Macks et al., 2016), an influence on reward-related learning and cognition has been assumed as well (Diekhof, 2018; Hamson, Roes & Galea, 2016). In adult women, E₂ may promote verbal memory and fluency (Hamson, Roes & Galea, 2016). In gonadectomized male and female rats, E₂ was found to improve learning and memory even after physiological or psychological stressors (Hamson, Roes & Galea, 2016; Khaleghi et al., 2021). Moreover, studies with castrated male rats suggested that learning may be improved by testosterone treatment (Spritzer et al., 2011). In healthy older men, short-term testosterone administration improved cognitive performance significantly (Cherrier et al., 2001). Findings from children (6 to 9 years) further showed a relationship between moderate testosterone levels and average intelligence (IQ between 70 and 130), whereas enhanced testosterone concentrations were related to high (IQ > 130), but also low intelligence (IQ < 70) (Ostatníková et al., 2007). Other studies also reported enhanced testosterone concentrations in children and young adolescents (6 to 13 years) with learning disabilities compared to peers without impairments (Kirkpatrick et al., 1993). Given this evidence, one may assume that during early adolescence balanced testosterone concentrations may be important for efficient cognitive processing.

One way for sex hormones to modulate aspects of reward processing and reinforcement learning is through the neurotransmitter dopamine. Both estradiol and testosterone can act as natural dopamine agonists, which promote dopamine release and dopaminergic transmission through various physiological mechanisms (Becker, 1990; Castner, Xiao & Becker, 1993; Pasqualini et al., 1995; Sinclair et al., 2014). This is important because dopamine plays a crucial role in reinforcement learning and determines how proficiently individuals learn from positive or negative action outcomes. It has been assumed that changes in dopamine following so-called reward prediction errors act via two anatomically distinct pathways in the mesocorticolimbic dopamine system (Maia & Frank, 2011). The activation of the Go pathway after the dopamine burst that follows unexpected reward leads to a repetition of the same action. In turn, activation of the NoGo pathway results from a dip in the tonic dopamine level, which facilitates learning from unexpected reward reduction, omission, or even punishment. Together, the two pathways promote an adaptation of action choice that maximizes overall reward (Frank, Seeberger & O’Reilly, 2004).

A study using a response time (RT) adaption task, the so-called “clock task”, demonstrated this relation between dopamine and reinforcement learning. Patients with Parkinson’s disease whose dopamine concentration had been pharmacologically normalized were better in the Go learning aspect of the task. These medicated patients showed an enhanced ability to speed up for a reward (i.e., a better ability to acquire a higher reward by responding quickly after trial onset). In comparison, in an unmedicated state and thus with pathologically lowered dopamine, the same patients demonstrated a better NoGo learning ability. This was indicated by an increased capacity to slow down responding for reward maximization (i.e., an enhanced capacity to wait for higher reward) (Moustafa et al., 2008).

With the same task, Diekhof and colleagues characterized the impact of periodically fluctuating sex hormones in women on Go as opposed to NoGo learning ability. They compared the RT adaption during three different menstrual cycle phases: the late follicular phase, the luteal phase and the early follicular phase. During the late follicular phase E₂ is high and progesterone still remains low. In the luteal phase progesterone nears its maximum (Reimers, Büchel & Diekhof, 2014), whereas in the early follicular phase E₂ and progesterone are at their nadir (Diekhof, 2015). Reimers, Büchel & Diekhof (2014) concluded that heightened E₂ during the late follicular phase impaired the ability to slow down for reward maximization (NoGo learning ability), as opposed to the ability to speed up for higher reward (Go learning capacity). Diekhof (2015) extended these findings by showing a positive correlation between E₂ and the ability to speed up for reward during the early follicular phase. This latter study indicated a better Go vs. NoGo learning ability during the early follicular phase and assumed that the boosting influence of the still increasing, yet intermediate E₂ on dopamine probably optimally promotes Go learning ability.

Regarding the impact of testosterone on reward processing and reinforcement learning, data from humans are currently sparse. Also, rodent studies provide inconsistent findings about the influence of testosterone on reward processing. It has been observed that testosterone administration enhanced tyrosine hydroxylase (the rate-limiting enzyme catalyzing dopamine synthesis) in the substantia nigra of gonadectomized adolescent male rats (Purves-Tyson et al., 2012). Yet, testosterone may reduce tyrosine hydroxylase in gonadally intact adolescent male rats in the caudate putamen (Wood et al., 2013). Further, testosterone administration in gonadectomized adolescent male rats enhances mRNA of the dopamine degrading enzymes catechol-O-methyltransferase and monoamine oxidase in the substantia nigra (Purves-Tyson et al., 2012). In contrast, testosterone led to a significant increase of dopamine in the nucleus accumbens and dorsal striatum of gonadally intact male rats. Finally, in humans testosterone has been found to enhance striatal activity in the context of reward processing, while it decreased activation of the striatum during punishment processing (Morris et al., 2015).

Previous studies with early adolescents and young adults could not find a relation between testosterone and performance in cognitive or reward-related tasks (Halari et al., 2005; Ladouceur et al., 2019; White et al., 2020). Therefore, no clear assumptions can be made regarding the influence of testosterone on Go and NoGo learning. However, in light of its physiological significance for dopaminergic processing, a positive influence on reward processing and Go learning may be assumed.

Current study

In the present study, we assessed response time adjustments and learning behavior in the context of reward maximization in an adolescent sample. Salivary E₂ and testosterone concentrations were measured on the test day, which enabled us to examine the effect of the two sex hormones on Go and NoGo learning capacity. The adolescents performed an RT adjustment task, the so-called clock task (created by Moustafa et al., 2008; modified by Diekhof, 2015). In line with findings from adult research, we predicted that Go learning, associated with a better capability to speed up responding to maximize reward, would be related to a higher E₂ concentration (e.g., Diekhof, 2015; Reimers, Büchel & Diekhof, 2014). Studies reporting a behavioral influence of testosterone on reward-related processing and especially reward learning are scarce. Whether higher testosterone would positively influence Go learning as well could therefore not be hypothesized with confidence. Instead, we examined the relation of testosterone and reinforcement learning capacity with the same analysis that was used to consider the impact of E₂. Finally, we hypothesized that the effects of sex hormones on reinforcement learning would differ between female and male adolescents, mostly due to higher E₂ concentrations in females and enhanced testosterone in males.

Materials & Methods

Participants

In total, 106 healthy German adolescents between 11 and 18 years old participated in this study. None of the participants had a history of psychiatric or neurological disorders, and all reported no regular medication intake. Fifteen adolescents were excluded from the analysis because they showed a random response pattern throughout the task, which suggested that the task instructions had not been properly understood or that the respective participant lacked the motivation to perform the task properly. Another two participants were excluded because of technical problems that left the task unfinished. In sum, the data of 89 adolescents (mean age ± SD = 14.74 ± 1.9 years; 52 females) were analyzed.

Every participant had to sign a written declaration of informed consent before participation. In the case of minority, a legal guardian (parent) also had to sign a written declaration of informed consent before the testing. The adolescents were recruited in sports and other leisure clubs. The study protocol was approved by the local ethics committee of the Ärztekammer Hamburg (Ref: PV3948) and the study was conducted in accordance with “The Code of Ethics of the World Medical Association” (Declaration of Helsinki).

On the test day, participants were screened for depressive symptoms with the validated German Depression Inventory for Children and Adolescents (Stiensmeier-Pelster et al., 2014). Individual cognitive capacity was tested via the Digit-Span Test from the German version of the Wechsler Intelligence Scale for Children (Wechsler, 2014), measuring both forward and backward span by counting the numbers that were correctly recalled. Self-reported trait impulsivity was examined with the German version of the Barratt Impulsiveness Scale (BIS-11) for adolescents (Hartmann, Rief & Hilbert, 2011). Finally, every participant and the corresponding legal guardian filled out a translated version of the Pubertal Development Scale (PDS) (Petersen et al., 1988). We then calculated the mean of both scores and used it as an indicator of the degree of physical pubertal development of the given participant.

Experimental task

A modified version of the clock task (see Diekhof, 2015), which had been introduced by Moustafa et al. (2008), was used. In the task, three differently colored clock faces were presented. A full rotation of the clock arm lasted 5 s. Each clock face was assigned to one of three conditions, namely the fast, the random, and the slow condition. Each of the three clock conditions was shown 50 times in three sessions of 50 trials each, resulting in a total of 150 trials. The sequence of clock faces was pseudo-randomized and balanced for trial-type transitions (see also Diekhof, 2015 for further details on the clock task). The fast clock condition required a fast reaction once the clock arm started to move in order to maximize reward outcome. The slow clock condition, in contrast, required the participant to postpone responding, so that slower RTs yielded higher reward. The random condition served as a control variant with no contingency between RT and reward outcome. It was used as an indicator of baseline response preference (see Fig. 1).

Figure 1: Task design.
(A) Reward was calculated using cosine functions for the fast and slow clock. A time-independent function for the random clock was applied as control condition. (B) Clock faces were presented pseudo-randomly for 5,000 ms. Once a button press was made, the clock arm stopped, and immediate feedback was given. After that, a blank screen was shown for the remaining time that the clock arm would have needed to complete the 5,000 ms. Therefore, the blank screen ensures a constant time duration of a trial. A trial ended with the achieved points presented for 1,000 ms.

DOI: 10.7717/peerj.12653/fig-1

The participants had to adapt to the optimal response speed in each condition to maximize their overall reward. The exact reward value of each trial in the fast and slow condition was calculated with a cosine function, ranging between a minimum of 15 and a maximum of 60 points. The random reward value was calculated by multiplying the difference between the maximum and minimum reward by a random number and adding the result to the minimum reward value (see Fig. 1). In every condition, a random noise parameter (ranging from −5 to +4 points) was added to the reward. This was done to disguise the relation of a specific reward outcome with a specific RT. Immediately after the response, the reward outcome was shown to the participant. For the remaining time of a full clock arm turn, a blank screen was shown. Thus, each trial had the same length. If the participant did not respond within 5 s, no reward was presented, and the participant had to wait another 5 s before the next trial started.
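The reward scheme described above can be sketched as follows. This is an illustration under stated assumptions, not the task code used in the study: the text gives only the reward bounds (15 and 60 points), the 5 s rotation, the random-clock rule and the noise range. The half-cosine shape, the integer noise and all function names are hypothetical.

```python
import math
import random

MIN_REWARD = 15   # minimum points (from the text)
MAX_REWARD = 60   # maximum points (from the text)
TRIAL_MS = 5000   # one full rotation of the clock arm (from the text)


def cosine_reward(rt_ms, condition):
    """Noise-free reward for the fast or slow clock.

    ASSUMPTION: a half-cosine over the 0-5000 ms window. The article only
    states that a cosine function bounded by 15 and 60 points was used.
    """
    if condition not in ("fast", "slow"):
        raise ValueError("condition must be 'fast' or 'slow'")
    phase = math.pi * rt_ms / TRIAL_MS       # 0 at trial onset, pi at 5000 ms
    falling = (math.cos(phase) + 1) / 2      # 1 at 0 ms, 0 at 5000 ms
    share = falling if condition == "fast" else 1 - falling
    return MIN_REWARD + (MAX_REWARD - MIN_REWARD) * share


def random_reward(rng):
    """Random clock: minimum + (maximum - minimum) * random number in [0, 1)."""
    return MIN_REWARD + (MAX_REWARD - MIN_REWARD) * rng.random()


def add_noise(reward, rng):
    """ASSUMPTION: integer noise drawn uniformly from -5..+4 points."""
    return reward + rng.randint(-5, 4)


rng = random.Random(0)  # fixed seed only to make the sketch reproducible
early_fast = add_noise(cosine_reward(300, "fast"), rng)
late_slow = add_noise(cosine_reward(4700, "slow"), rng)
```

Under these assumptions, an early response to the fast clock and a late response to the slow clock both earn close to the maximum, while the random clock pays out independently of RT.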

Saliva collection and analyses

In the morning, three saliva samples were collected by the participant in 2 mL microcentrifuge tubes at home. Sample collection took place over the course of one hour (half-hourly samples) and started directly after awakening. The participants were allowed to drink water after the first sample up until 5 min before the second and third sample. They had to refrain from intake of food and beverages other than water during the sampling hour. Saliva samples were stored at −20 °C until further use. Before analysis, samples were thawed and centrifuged at room temperature at RCF 604 ×g (i.e., 3,000 rpm in a common Eppendorf MiniSpin centrifuge) for 5 min to separate the saliva from mucins. For the E₂ analysis, a 17-β-Estradiol Saliva ELISA was used (Limit of Detection: 2.1 pg/mL), coated with anti-17-β-Estradiol antibody (monoclonal) with antibodies derived from donkey and sheep. For the testosterone analysis, a Testosterone Luminescence Immunoassay (both assays from Tecan/IBL International) was utilized (Limit of Detection: 1.8 pg/mL), coated with anti-mouse antibody. Intra-assay precision showed a mean CV of 8.8% (17-β-Estradiol Saliva ELISA) and 7.3% (Testosterone Luminescence Immunoassay). Inter-assay precision showed a mean CV of 11.8% (17-β-Estradiol Saliva ELISA) and 7.3% (Testosterone Luminescence Immunoassay).

The three morning samples were combined in an aliquot sample that consisted of an equal amount of saliva from every tube (100 μL). The analysis was done as described in the respective manual in our in-house laboratory. From the aliquot, two samples were assayed (n = 2). In addition, a high and a low control were analyzed. Subsequent behavioral analyses were done with standardized z-transformed values ( $z_{i} = \frac{X_{i} - \bar{X}}{S_{x}}$ ) for each ELISA plate to standardize measurement inaccuracy of the plates.
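The plate-wise z-transformation can be illustrated with a short sketch. The function name and the data layout are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not specify which estimator of S_x was used.

```python
from collections import defaultdict
from statistics import mean, stdev


def z_by_plate(samples):
    """Standardize concentrations separately within each ELISA plate.

    samples: list of (participant_id, plate_id, concentration) tuples.
    Returns {participant_id: z_score}.
    ASSUMPTION: S_x is the sample standard deviation (n - 1 denominator).
    """
    by_plate = defaultdict(list)
    for _, plate, value in samples:
        by_plate[plate].append(value)

    # Mean and SD per plate; each plate needs at least two values.
    plate_stats = {p: (mean(v), stdev(v)) for p, v in by_plate.items()}

    return {
        pid: (value - plate_stats[plate][0]) / plate_stats[plate][1]
        for pid, plate, value in samples
    }


# Made-up concentrations (pg/mL) on two plates, for illustration only.
example = [
    ("p1", "A", 4.0), ("p2", "A", 6.0), ("p3", "A", 8.0),
    ("p4", "B", 10.0), ("p5", "B", 20.0), ("p6", "B", 30.0),
]
z_scores = z_by_plate(example)
```

Because each plate is centered and scaled on its own, a systematic offset between plates no longer shifts the values that enter the behavioral analyses.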

Data preprocessing

For each subject, we calculated the mean RTs of each clock type. RTs under 200 ms were discarded, since they were very unlikely to reflect voluntary movements. In all, 125 trials (mean ± sd: 70 ± 72 ms) under 200 ms were excluded. We also calculated the mean RT of the initial 12 trials (called first block) and of the optimized last 12 trials (called last block) for each condition and participant (see Diekhof, 2015; Kohne et al., 2021; Moustafa et al., 2008; Reimers, Büchel & Diekhof, 2014 for a similar procedure). At the beginning of the experiment (in the first block), the participant did not know which clock face was associated with faster or slower responses for higher reward. Hence, the participant had to try to achieve the optimal outcome via various reactions exploring the task structure. Conversely, at the end of the clock task (in the last block), the participant should have been well adapted and was expected to show optimal RTs that led to the highest reward outcome in relation to individual clock faces.
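A minimal sketch of this preprocessing step is given below. It assumes that trials under 200 ms are discarded before the first and last 12 trials are selected; the text does not state the order of these two operations, and the function name is hypothetical.

```python
from statistics import mean

MIN_RT_MS = 200   # RTs below this are treated as non-voluntary (from the text)
BLOCK_SIZE = 12   # trials per "first" and "last" block (from the text)


def block_means(rts_ms):
    """Return (first_block_mean, last_block_mean) for one participant and clock.

    rts_ms: response times in trial order for a single clock condition.
    ASSUMPTION: fast outliers are removed first, then blocks are taken
    from the remaining trials.
    """
    valid = [rt for rt in rts_ms if rt >= MIN_RT_MS]
    if len(valid) < BLOCK_SIZE:
        raise ValueError("not enough valid trials to form a block")
    return mean(valid[:BLOCK_SIZE]), mean(valid[-BLOCK_SIZE:])


# Made-up RTs: one anticipatory response followed by 50 regular trials.
rts = [100] + list(range(1000, 1050))
first_mean, last_mean = block_means(rts)
```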

Apart from the mean RT for the three clock types, the actual learning preferences that reflected individual Go and NoGo learning ability, respectively, were calculated from the last block. They reflected the adaption to the optimal response speed to the slow and fast clock, respectively, and allowed us to test the functional opponency of Go versus NoGo learning. For this, the RT of the slow and the fast clock were calculated in relation to the random clock, which provided information on the individual baseline response speed of a given participant. In order to calculate the optimized responses to the slow clock condition, we first subtracted the mean RT of the last 12 trials of the random clock condition from the mean slow clock RT of the last block. For standardization, this difference was then divided by the mean RT of the last 12 trials from the random clock. The resulting standardized relative RT reflects “optimized relative slowing”. Correspondingly, the subtraction of the mean fast clock RT from the mean random RT and its division by the mean random RT was used as the “optimized relative speeding” value.

The individual learning-related change in RT for each clock condition was calculated by subtracting the RT of the first block from the RT of the last block.
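The three derived measures can be written out as a short sketch. The function names are hypothetical. The example inputs are the group mean RTs from Table 2 and serve only to show the arithmetic; in the study the measures were computed per participant.

```python
def relative_slowing(slow_last, random_last):
    """Optimized relative slowing: (slow - random) / random, last block."""
    return (slow_last - random_last) / random_last


def relative_speeding(fast_last, random_last):
    """Optimized relative speeding: (random - fast) / random, last block."""
    return (random_last - fast_last) / random_last


def learning_change(first_block, last_block):
    """Learning-related change: last block RT minus first block RT."""
    return last_block - first_block


# Group mean RTs in ms (Table 2), used purely as illustrative inputs.
slowing = relative_slowing(slow_last=3972, random_last=2190)
speeding = relative_speeding(fast_last=919, random_last=2190)
fast_change = learning_change(first_block=1610, last_block=919)
```

With these inputs, relative slowing is about 0.81, relative speeding about 0.58, and the learning-related change for the fast clock is −691 ms. Larger positive values of the two relative measures indicate better adaptation, whereas a more negative change for the fast clock indicates stronger speeding over the course of the task.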

Data analyses

The behavioral data were analyzed with IBM SPSS Statistics 25. First, we performed a repeated measures General Linear Model (GLM) with the factors “clock condition” (fast, random, slow), “block” (first, last), “sex” (female, male) and “age” to test for possible effects of these factors on the RT. In another two GLMs the factor “age” was replaced by either the covariate “pubertal development” (PDS-score) or the z-standardized sex hormone concentrations of E₂ (zE₂) and testosterone (zT) (see Results section below). This was done to assess the impact of pubertal maturation and sex hormone levels on reinforcement learning. Post hoc tests used paired and independent t-tests, which were Bonferroni-corrected for multiple testing. If Levene’s test was significant, Welch’s t-test was used instead of Student’s t-test. The learning preference and effects of covariates were examined with two-sided Pearson correlations. All effects and differences were considered significant below a p-value of .05, two-tailed.
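The analyses were run in SPSS; the following sketch only illustrates how Welch’s t-test and the Bonferroni correction work, which also explains the fractional degrees of freedom reported in the table notes. The function names are hypothetical, and the degrees of freedom follow the standard Welch-Satterthwaite approximation.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite df from summary statistics."""
    v1 = sd1 ** 2 / n1   # squared standard error, group 1
    v2 = sd2 ** 2 / n2   # squared standard error, group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def bonferroni(p_value, n_comparisons):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_comparisons)


# zT summary statistics for females and males as listed in Table 1.
t_zt, df_zt = welch_t(-0.53, 0.42, 52, 0.74, 1.07, 37)
```

With the rounded zT values from Table 1 this yields t of roughly −6.85 on about 43.95 degrees of freedom, close to the reported t_43.95 = −6.82; the small difference in t presumably reflects rounding of the tabulated means and standard deviations.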

Results

Learning preference

Studies with adults revealed an inverse relation between the capability for adaptive speeding and adaptive slowing of responses in the clock task (Diekhof, 2015; Reimers, Büchel & Diekhof, 2014). Our data demonstrate that this inverse relation in adjustment preferences to either the slow or the fast clock may also exist in adolescents. We found that optimized relative speeding and slowing were negatively correlated in both sexes (females: r = −.48, p < .001; males: r = −.67, p < .001) (see Fig. 2). Adolescents who were better adjusted to the slow clock in the last block had difficulty speeding up for reward. In turn, participants who responded faster to the fast clock in the last block were impaired in the ability to slow down for reward.

Figure 2: Reverse relation of slowing and speeding.
Optimized relative speeding and slowing were negatively correlated in females and males (p < .001).

DOI: 10.7717/peerj.12653/fig-2

General group characteristics

Independent t-tests showed that the female and male adolescents did not differ in age, impulsivity (BIS-11), or zE₂ concentration (see Table 1). The only significant differences between the two groups were a higher zT level in males compared to females (t_43.95 = −6.82, p < .001, d = −1.56) and a more advanced pubertal development of females compared to males (mean_PDSfemales ± se: 3.03 ± .07; mean_PDSmales ± se: 2.72 ± .09, t₈₇ = 2.67, p = .009, d = .57).

Table 1: Group differences by sex.

	Females		Males		Females vs. males
	Mean ± SD	n	Mean ± SD	n	t	p	95% CI
							lower	upper
Age (years)	14.67 ± 1.96	52	14.84 ± 1.83	37	−.4^a	.689	−.98	.65
zE₂	.14 ± 1.11	49	−.2 ± .56	35	1.59^b	.177	−.09	.77
E₂	5.89 ± 2.63 pg/mL	49	5.27 ± 2.08 pg/mL	35	.80	.425	−.64	1.49
zT	−.53 ± .42	52	.74 ± 1.07	37	−6.82^c	<.001	−1.64	−.89
T	21.58 ± 14.1 pg/mL	52	89.61 ± 63.28 pg/mL	37	−6.43^g	<.001	−89.45	−46.61
BIS-11	63 ± 6.45	52	63.83 ± 9.57	36	−.46^d	.65	−4.5	2.83
PDS	3.03 ± .53	52	2.72 ± .56	37	2.67^a	.009	.08	.55
DICA	11.58 ± 6.37	52	9.39 ± 3.94	36	1.99^e	.05	−.01	4.39
Digit span forward	6.31 ± .9	52	6.31 ± .79	36	.01^f	.991	−.37	.37
Digit span backward	4.85 ± 1.29	52	4.89 ± 1.13	37	−.17^a	.862	−.57	.47

DOI: 10.7717/peerj.12653/table-1

Notes:

a t₈₇.

b t₈₂.

c t_43.95.

d t_56.62.

e t_85.09.

f t_81.25.

g t_38.55.

Influence of age and sex on response time adjustments

In an initial step, we assessed the influence of “chronological age” and “sex” of the participant on learning performance. For this, we used a repeated measures GLM including the covariate “age”, the between-subjects factor “sex” and the within-subject factors “clock condition” (fast, random, slow) and “block” (first, last). Only the two-way interaction of “clock condition” x “block” was significant (F_2,172 = 4.41, p = .014, η²_p = .05). This was reflected by a change in the RT from the initial to the optimized last block in the fast (t₈₈ = 11.08, p < .001, d = 1.17, Bonferroni corrected for three comparisons) and in the slow condition (t₈₈ = −13.79, p < .001, d = −1.46, Bonferroni corrected for three comparisons), but not in the random condition (t₈₈ = .14, p = 1, d = .02, Bonferroni corrected for three comparisons) (Table 2).
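The reported effect sizes can be related to the test statistics with two standard conversions, sketched here for orientation. The function names are hypothetical, and it is an assumption that the article's effect sizes were obtained in this way: partial eta squared follows from F and its degrees of freedom, and Cohen's d for a paired comparison follows from t and the number of pairs.

```python
import math


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic: F*df1 / (F*df1 + df2)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def paired_cohens_d(t_value, n_pairs):
    """Cohen's d for a paired t-test: d = t / sqrt(n)."""
    return t_value / math.sqrt(n_pairs)


eta_interaction = partial_eta_squared(4.41, 2, 172)   # clock condition x block
d_fast = paired_cohens_d(11.08, 89)                   # fast clock, first vs. last
d_slow = paired_cohens_d(-13.79, 89)                  # slow clock, first vs. last
```

Rounded to two decimals, these conversions give .05, 1.17 and −1.46, matching the values reported for the interaction and the two post hoc comparisons.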

Table 2: Comprehensive summary of RTs and post-hoc results.

		Mean RT ± SE			Females vs. males					Correlations of all participants
Block	Clock	Females & males	Females	Males	t (df = 87)	p	95% CI		zT		zE₂		PDS
							lower	upper	r	p	r	p	r	p
first & last	FAST	1264 ± 37 ms^a,^b,^***	1302 ± 54 ms	1212 ± 293 ms	1.2	.234	−60 ms	241 ms	−.12	.327	−.08	.497	−.12	.274
	RANDOM	2196 ± 65 ms^a,^c,^***	2157 ± 562 ms	2253 ± 695 ms	−.71	.477	−360 ms	170 ms	.23	.032^**	−.02	.843	.19	.068^*
	SLOW	3458 ± 67 ms^b,^c,^***	3346 ± 653 ms^d,^**	3617 ± 593 ms^d,^**	−2	.048^**	−539 ms	−2 ms	.28	.009^**	−.04	.731	.1	.359
	ALL CLOCKS	2307 ± 333 ms	2269 ± 311 ms	2360 ± 360 ms	−1.28	.203	−234 ms	50 ms	.29	.007^**	−.09	.412	.14	.185
first	FAST	1610 ± 58 ms^d,^***	1655 ± 571 ms	1547 ± 524 ms	.92	.363	−127 ms	345 ms	−.08	.441	−.24	.03^**	−.12	.249
	RANDOM	2203 ± 78 ms	2104 ± 685 ms	2343 ± 794 ms	−1.52	.132	−552 ms	73 ms	.18	.084^*	.08	.469	.1	.354
	SLOW	2945 ± 87 ms^e,^***	2791 ± 807 ms^**	3163 ± 803 ms^**	−2.15	.034^**	−717 ms	−29 ms	.3	.004^**	−.06	.572	.1	.366
last	FAST	919 ± 36 ms^d,^***	949 ± 382 ms	877 ± 275 ms	1.04	.3	−74 ms	219 ms	−.04	.718	.1	.383	−.04	.692
	RANDOM	2190 ± 80 ms	2211 ± 703 ms	2162 ± 834 ms	0.3	.765	−276 ms	374 ms	−.14	.195	−.12	.275	.22	.038^**
	SLOW	3972 ± 66 ms^e,^***	3902 ± 694 ms	4071 ± 503 ms	−1.33	.188	−435 ms	97 ms	.23	.03^**	<.01	.991	.07	.491

DOI: 10.7717/peerj.12653/table-2

Notes:

Identical letters mark pairs of values that differed significantly in a paired t-test.

***p < .001.

**p < .05.

*p < .1.

a t₈₈ = −12.51; 95% CI −1080 ms, −784 ms.

b t₈₈ = −25.93; 95% CI −2362 ms, −2026 ms.

c t₈₈ = 15.2; 95% CI 1097 ms, 1427 ms.

d t₈₈ = −11.08; 95% CI −815 ms, −567 ms.

e t₈₈ = 13.79; 95% CI 879 ms, 1175 ms.

Influence of pubertal development and sex on response time adjustments

The first GLM was repeated with the factor “pubertal development” (measured with the PDS) replacing the factor “age”. A significant main effect of “clock condition” (F_2,172 = 7.28, p = .001, η²_p = .08) and significant two-way interactions of “clock condition” x “pubertal development” (F₂ = 3.4, p = .036, η²_p = .04) and “clock condition” x “sex” (F₂ = 3.81, p = .024, η²_p = .04) emerged. Further, the interaction between “clock condition” and “block” remained significant (F_2,172 = 8.04, p < .001, η²_p = .09).

Post hoc t-tests showed significant RT differences between the three clock conditions (fast vs. random: p < .001, d = −1.33; fast vs. slow: p < .001, d = −2.75; slow vs. random: p < .001, d = 1.61, Bonferroni corrected for three comparisons) (see Table 2). Consequently, an adjustment to the varying clock conditions could be observed. Concerning the interaction between “clock condition” and “sex”, a significant difference only arose in the slow clock condition. Males generally responded significantly more slowly, and thereby better, to the slow clock than females did (p = .048, d = −.43) (see Table 2). The interaction of “pubertal development” and “clock condition” was reflected by a trend-wise positive correlation between the PDS and the RT of the random condition only (r = .19, p = .068) (see Table 2).

Influence of sex hormones and sex on response time adjustments

In a third GLM we investigated the modulatory influence of zE₂ and zT as a function of the participants’ sex on RTs in the three clock conditions (fast, random, slow) and the two blocks (first, last). The main effect of “clock condition” (F_2,160 = 114.83, p < .001, η²_p = .81) and the interaction of “clock condition” and “block” (F_2,160 = 7.28, p < .001, η²_p = .59) remained significant. Furthermore, an interaction of “block x clock condition x zE₂ concentration” (F₂ = 4.9, p = .009, η²_p = .06) and a main effect of “block” (F_1,80 = 5.29, p = .024, η²_p = .06) and of “zT” (F₁ = 5.28, p = .024, η²_p = .06) occurred.

The interaction of “block x clock condition x zE₂” was reflected by a negative correlation between zE₂ and the initial RT in the fast clock condition (r = −.24, p = .03) (see Fig. 3). In addition, we also examined the individual learning-related change in the RTs between first and last block, which demonstrated the adjustment from the initial to the optimized block (RT last block − RT first block). The learning-related change showed a significant positive correlation with zE₂ in the fast clock condition (r = .28, p = .01) (see Fig. 4). No correlation emerged with the slow (r = .08, p = .497) or random condition (r = −.18, p = .096).

Figure 3: Negative correlation between zE₂ and the initial fast clock.
Subjects who had higher zE₂ concentrations responded faster during the initial fast clock condition (r = −.24, p = .03).

DOI: 10.7717/peerj.12653/fig-3

Figure 4: Positive correlation between zE₂ and the learning-related change of the fast clock.
Subjects who had lower zE₂ concentrations showed a better adjustment from the initial to the optimized block in the fast clock condition and became relatively faster in the last block, as indicated by a more negative delta value of “last − first block” (r = .28, p = .01).

DOI: 10.7717/peerj.12653/fig-4

A post-hoc comparison of the blocks showed a faster response speed in the initial block compared to the last block (t₈₈ = −2.67, p = .009, d = −.28). Further, zT was positively correlated with RT independent of clock condition or block (r = .29, p = .007) (see Fig. 5). Since we found a significant difference in the zT of females and males, with higher concentrations in males (see Table 1), we additionally explored the zT effect separately for both sexes. This showed that the correlation was probably driven by the male adolescents. Accordingly, the mean of both blocks across all clocks was positively correlated with zT in males (r = .48, p = .002), but not in females (r = −.15, p = .298). In males, a general slowing could also be observed with increasing zT in both blocks of all conditions (first: r = .37, p = .025; last: r = .5, p = .002) and especially in the slow (r = .42, p = .01) and the random (r = .35, p = .032), but not in the fast condition (r = .09, p = .579). Additionally, positive correlations emerged in the initial (r = .35, p = .036) and optimized block (r = .44, p = .007) of the slow clock. Again, these correlations could not be found in females.

Figure 5: Positive correlation between zT and the response time of all clocks and both blocks.
Subjects who had higher zT concentrations generally responded more slowly (r = .29, p = .007).

DOI: 10.7717/peerj.12653/fig-5

Discussion

This study examined the effects of adolescent E₂ and testosterone concentrations on RT adjustments in the clock task. Results indicate individual differences in the preference for either Go or NoGo learning (see Fig. 2) and an adaption to the different clock conditions from the initial to the optimized block. Both findings have already been demonstrated previously in studies with adults (Kohne et al., 2021; Moustafa et al., 2008; Reimers, Büchel & Diekhof, 2014). In addition, we also found that testosterone levels were significantly higher in males than in females, while age, impulsivity and E₂ concentrations did not differ between the sexes. We also did not observe an age-dependent influence on the RT, and there was no association between individual pubertal development and Go or NoGo learning. Only a tendency towards a slower baseline response speed with increasing pubertal development emerged. Apart from that, we found a sex difference in the slow clock condition. Male adolescents responded significantly more slowly to the slow clock condition than females and were thus better adapted to it. E₂ and testosterone further appeared to modulate learning ability in different ways. Whereas E₂ apparently enhanced initial Go learning (see Figs. 3 and 4), testosterone presumably promoted NoGo learning ability (see Fig. 5), yet primarily in males.

Similar to studies with adults, our data showed a preference for either Go or NoGo learning, with a presumably supporting effect of E₂ on Go learning (Diekhof, 2018; Moustafa et al., 2008; Reimers, Büchel & Diekhof, 2014). Furthermore, we observed a relation between habitual testosterone and the ability to slow down for reward, which was especially evident in male adolescents. The observed divergence of females and males in learning capability in the slow condition could probably be ascribed to a hormonal sex difference: testosterone concentrations differed significantly between the sexes, with higher concentrations in males. The varying increase of gonadal hormones during puberty could thus be one reason for the different RT adjustments in the slow clock. Accordingly, testosterone was associated with a slower RT and enhanced NoGo learning in adolescents. An explorative analysis showed that this result could be traced back to the male adolescents, most likely because testosterone is the main acting gonadal hormone during male pubertal development and is far more variable in pubertal males than in females. In line with adult research, E₂ seemed to stimulate the initially faster responses and therefore Go learning in all adolescents. We speculate that the effect of E₂ could have been mediated by its modulatory impact on dopaminergic transmission, which has been assumed for similar findings in adult women (see, among others, Diekhof, 2015; Reimers, Büchel & Diekhof, 2014). Estrogen receptors are found in the brain of both sexes, and through them E₂ presumably has modulating effects on neurotransmission and plasticity (Gillies & McArthur, 2010).

The correlation between Go learning and E₂ occurred exclusively in the initial block during which participants were still naïve regarding the temporal reward associations of the different clocks. This might indicate that E₂ has only a subtle effect on behavioral responding in the clock task. Once the RT had been optimized in later phases of the task, this correlation was no longer behaviorally measurable (see also Reimers, Büchel & Diekhof, 2014).

Alternatively, E₂ may also support learning by promoting signal transduction. E₂ administration in young and aged ovariectomized rhesus monkeys led to an increase in spine density in the dorsolateral prefrontal cortex (Hao et al., 2003). An increased spine density on pyramidal neurons is connected to a greater number of excitatory synapses per neuron, which in turn might improve learning performance in general (Mahmmoud et al., 2015). Moreover, in ovariectomized rats, E₂ administration provoked cell proliferation and an increase in dendritic spine density in the hippocampus (Adams et al., 2002; Tanapat, Hastings & Gould, 2005). Davidow and colleagues demonstrated a positive impact of hippocampal activity and its connectivity to the striatum on reinforcement learning in adolescents (Davidow et al., 2016). Therefore, the potentiating influence of E₂ on the hippocampus may improve reward learning as well. Besides E₂, androgens also positively affect prefrontal and hippocampal processing, but rat studies indicate a greater impact of androgens in males (Hamson, Roes & Galea, 2016).

Similar to E₂, testosterone can modulate dopaminergic transmission and may also affect transmission in other neurotransmitter systems (De Souza Silva et al., 2009; Sinclair et al., 2014). The enhancing effect of testosterone on slowing ability may additionally be explained by an interaction of testosterone with serotoninergic processing in males. In male rats, testosterone administration leads to an increase in cerebral serotonin and its metabolites (De Souza Silva et al., 2009; Thiblin et al., 1999). Moreover, a negative correlation between plasma testosterone and serotonin receptor 4 level emerged, leading to the suggestion that higher testosterone is accompanied by a higher cerebral serotonin tonus (Perfalk et al., 2017). Therapeutic approaches include selective serotonin reuptake inhibitors, which increase synaptic serotonin levels and modulate neuroplasticity (Kraus et al., 2017). Synaptic plasticity is exceedingly important for learning and memory formation. The serotoninergic impact on human behavior and neurophysiological processes is commonly investigated through depletion of the serotonin precursor tryptophan. Studies with healthy humans using tryptophan depletion demonstrate a slowing of responses by pharmacologically increased serotonin (e.g., Murphy et al., 2002). We observed a better slowing ability with habitually increased testosterone, which might reflect an indirect effect of testosterone on serotoninergic transmission. This would also be in line with other studies, which found that behavioral slowing in punishment contexts, especially under high incentive motivation, disappeared if serotonin was pharmacologically lowered (e.g., Crockett et al., 2012). Lowered serotonin concentrations after depletion have further been associated with decreased neural sensitivity to punishment (Helmbold, Zvyagintsev & Dahmen, 2015). Hence, enhanced testosterone concentration might have driven NoGo learning and enabled a better slowing down for reward through its interaction with the serotoninergic system.

Like a recent study (Chahal et al., 2021), we did not observe a relation between reward or punishment sensitivity and pubertal stage. A generally lower response speed in more developed adolescents could be a consequence of reduced impulsivity, which may be an indicator of neurophysiological and cognitive maturation. Similar to others, we did not find an association with chronological age (Wierenga et al., 2018). Our results thus support the assumption that pubertal development is a better indicator of cognitive performance than chronological age.

To date, a non-invasive direct measurement of neurotransmitter processes like dopamine binding or synthesis in the adolescent human brain is not feasible. We used non-invasive measurements to determine steroid hormone concentrations and assessed the individual ability for Go and NoGo learning. By combining both parameters, we tried to apply them as indirect indicators of dopaminergic transmission. Besides E₂ and testosterone, other steroid hormones are presumably attractive targets for future studies. For instance, the influence of progesterone as a counterpart to E₂ on dopaminergic action may be of particular interest. Whereas E₂ is assumed to have an agonistic effect on dopaminergic transmission, progesterone supposedly reduces E₂ receptor density (Selcer & Leavitt, 1988) and apparently upregulates monoamine oxidase when it is administered together with E₂, which mimics the luteal phase of a natural menstrual cycle (Luine & Hearns, 1990; Luine & Rhodes, 1983). Additionally, progesterone enhances gamma-aminobutyric acid induced inhibition of dopaminergic neurons (Majewska et al., 1986). Thus, an antagonistic and reducing effect of progesterone on dopaminergic transmission has been suggested (Diekhof, 2018). In future studies, tracking the developing menstrual cycle of female adolescents could contribute to a better interpretation of the opposing effects of E₂ and progesterone.

Finally, genetic predisposition as such has already been observed to affect reward sensitivity (Richards et al., 2016), and may further interact with steroid hormone level as demonstrated previously (Jakob et al., 2018; Veselic et al., 2021). In addition to previous findings on receptor and transporter polymorphisms of dopamine, serotonin and sex hormones, future studies could examine genetic interactions via genome-wide associations.

Conclusion

Sex hormones modulate neurophysiological processes and behavior in the context of reward processing in both adult animals and humans. However, evidence from adolescent populations is sparse. The present study assessed the impact of E₂ and testosterone on adolescents’ reinforcement learning. As in adult women (e.g., Diekhof, 2015), E₂ promoted initial Go learning in both sexes in our adolescent sample. Testosterone, in turn, enhanced NoGo learning in males. It could be speculated that individual differences in reinforcement learning are associated with variations in these hormones during adolescence, which shift the balance between a reward-oriented and an avoidance-oriented learning style.

Future investigations should consider further steroid hormones (e.g., cortisol, progesterone) and neurophysiological processing to specify the impact of hormonal differences on the dopaminergic mechanisms of reinforcement learning.

Supplemental Information

Prepared data

DOI: 10.7717/peerj.12653/supp-1

Raw data

DOI: 10.7717/peerj.12653/supp-2

[1] Adams MM, Fink SE, Shah RA, Janssen WGM, Hayashi S, McEwen BS, Milner TA, Morrison JH. 2002. Estrogen and aging affect the subcellular distribution of estrogen receptor-α in the hippocampus of female rats. Journal of Neuroscience 22(9):3608-3614

[2] Becker JB. 1990. Direct effect of 17β-estradiol on striatum: sex differences in dopamine release. Synapse 5(2):157-164

[3] Castner SA, Xiao L, Becker JB. 1993. Sex differences in striatal dopamine: in vivo microdialysis and behavioral studies. Brain Research 610(1):127-134

[4] Chahal R, Delevich K, Kirshenbaum JS, Borchers LR, Ho TC, Gotlib IH. 2021. Sex differences in pubertal associations with fronto-accumbal white matter morphometry: implications for understanding sensitivity to reward and punishment. NeuroImage 226(2020):117598

[5] Cherrier MM, Asthana S, Plymate S, Baker L, Matsumoto AM, Peskind E, Raskind MA, Brodkin K, Bremner W, Petrova A, LaTendresse S, Craft S. 2001. Testosterone supplementation improves spatial and verbal memory in healthy older men. Neurology 57(1):80-88

[6] Crockett MJ, Clark L, Apergis-Schoute AM, Morein-Zamir S, Robbins TW. 2012. Serotonin modulates the effects of pavlovian aversive predictions on response vigor. Neuropsychopharmacology 37(10):2244-2252

[7] Davidow JY, Foerde K, Galván A, Shohamy D. 2016. An upside to reward sensitivity: the hippocampus supports enhanced reinforcement learning in adolescence. Neuron 92(1):93-99

[8] De Souza Silva MA, Mattern C, Topic B, Buddenberg TE, Huston JP. 2009. Dopaminergic and serotonergic activity in neostriatum and nucleus accumbens enhanced by intranasal administration of testosterone. European Neuropsychopharmacology 19(1):53-63

[9] Diekhof EK. 2015. Be quick about it. Endogenous estradiol level, menstrual cycle phase and trait impulsiveness predict impulsive choice in the context of reward acquisition. Hormones and Behavior 74:186-193

[10] Diekhof EK. 2018. Estradiol and the reward system in humans. Current Opinion in Behavioral Sciences 23:58-64

[11] Frank MJ, Seeberger LC, O’Reilly RC. 2004. By carrot or by stick: cognitive reinforcement learning in parkinsonism. Science 306(5703):1940-1943

[12] Gillies GE, McArthur S. 2010. Estrogen actions in the brain and the basis for differential action in men and women: a case for sex-specific medicines. Pharmacological Reviews 62(2):155-198

[13] Halari R, Mines M, Kumari V, Mehrotra R, Wheeler M, Ng V, Sharma T. 2005. Sex differences and individual differences in cognitive performance and their relationship to endogenous gonadal hormones and gonadotropins. Behavioral Neuroscience 119(1):104-117

[14] Hamson DK, Roes MM, Galea LAM. 2016. Sex hormones and cognition: neuroendocrine influences on memory and learning. Comprehensive Physiology 6(3):1295-1337

[15] Hao J, Janssen WGM, Tang Y, Roberts JA, McKay H, Lasley B, Allen PB, Greengard P, Rapp PR, Kordower JH, Hof PR, Morrison JH. 2003. Estrogen increases the number of spinophilin-immunoreactive spines in the hippocampus of young and aged female rhesus monkeys. Journal of Comparative Neurology 465(4):540-550

[16] Hartmann AS, Rief W, Hilbert A. 2011. Psychometric properties of the German version of the Barratt Impulsiveness Scale, Version 11 (BIS–11) for adolescents. Perceptual and Motor Skills 112(2):353-368

[17] Helmbold K, Zvyagintsev M, Dahmen B. 2015. Effects of serotonin depletion on punishment processing in the orbitofrontal and anterior cingulate cortices of healthy women. European Neuropsychopharmacology 25(6):846-856

[18] Herting MM, Gautam P, Spielberg JM, Kan E, Dahl RE, Sowell ER. 2014. The role of testosterone and estradiol in brain volume changes across adolescence: a longitudinal structural MRI study. Human Brain Mapping 35(11):5633-5645

[19] Jakob K, Ehrentreich H, Holtfrerich SKC, Reimers L, Diekhof EK. 2018. DAT1-genotype and menstrual cycle, but not hormonal contraception, modulate reinforcement learning: preliminary evidence. Frontiers in Endocrinology 9(60):1-13

[20] Khaleghi M, Rajizadeh MA, Bashiri H, Kohlmeier KA, Mohammadi F, Khaksari M, Shabani M. 2021. Estrogen attenuates physical and psychological stress-induced cognitive impairments in ovariectomized rats. Brain and Behavior 11(5):1-15

[21] Kirkpatrick SW, Campbell PS, Wharry RE, Robinson SL. 1993. Salivary testosterone in children with and without learning disabilities. Physiology and Behavior 53(3):583-586

[22] Kohne S, Reimers L, Müller M, Diekhof EK. 2021. Daytime and season do not affect reinforcement learning capacity in a response time adjustment task. Chronobiology International 00(00):1-7

[23] Kraus C, Castrén E, Kasper S, Lanzenberger R. 2017. Serotonin and neuroplasticity – Links between molecular, functional and structural pathophysiology in depression. Neuroscience and Biobehavioral Reviews 77:317-326

[24] Ladouceur CD, Kerestes R, Schlund MW, Shirtcliff EA, Lee Y, Dahl RE. 2019. Neural systems underlying reward cue processing in early adolescence: the role of puberty and pubertal hormones. Psychoneuroendocrinology 102(2018):281-291

[25] Luine V, Hearns M. 1990. Relationship of gonadal hormone administration, sex, reproductive status and age to monoamine oxidase activity within the hypothalamus. Journal of Neuroendocrinology 2(4):423-428

[26] Luine V, Rhodes J. 1983. Gonadal hormone regulation of MAO and other enzymes in hypothalamic areas. Neuroendocrinology 36(3):235-241

[27] Mahmmoud RR, Sase S, Aher YD, Sase A, Gröger M, Mokhtar M, Höger H, Lubec G. 2015. Spatial and working memory is linked to spine density and mushroom spines. PLOS ONE 10(10):1-15

[28] Maia TV, Frank MJ. 2011. From reinforcement learning models to psychiatric and neurological disorders. Nature Neuroscience 14(2):154-162

[29] Majewska MD, Harrison NL, Schwartz RD, Barker JL, Paul SM. 1986. Steroid hormone metabolites are barbiturate-like modulators of the GABA receptor. Science 232(4753):1004-1007

[30] Morris RW, Purves-Tyson TD, Weickert CS, Rothmond D, Lenroot R, Weickert TW. 2015. Testosterone and reward prediction-errors in healthy men and men with schizophrenia. Schizophrenia Research 168(3):649-660

[31] Moustafa AA, Cohen MX, Sherman SJ, Frank MJ. 2008. A role for dopamine in temporal decision making and reward maximization in parkinsonism. Journal of Neuroscience 28(47):12294-12304

[32] Murphy F, Smith K, Cowen P, Robbins T, Sahakian B. 2002. The effects of tryptophan depletion on cognitive and affective processing in healthy volunteers. Psychopharmacology 163(1):42-53

[33] Op de Macks ZA, Bunge SA, Bell ON, Wilbrecht L, Kriegsfeld LJ, Kayser AS, Dahl RE. 2016. Risky decision-making in adolescent girls: the role of pubertal hormones and reward circuitry. Psychoneuroendocrinology 74:77-91

[34] Ostatníková D, Celec P, Putz Z, Hodosy J, Schmidt F, Laznibatová J, Kúdela M. 2007. Intelligence and salivary testosterone levels in prepubertal children. Neuropsychologia 45(7):1378-1385

[35] Pasqualini C, Olivier V, Guibert B, Frain O, Leviel V. 1995. Acute stimulatory effect of estradiol on striatal dopamine synthesis. Journal of Neurochemistry 65(4):1651-1657

[36] Peper JS, Brouwer RM, Schnack HG, Van Baal GC, Van Leeuwen M, Van den Berg SM, Delemarre-van de Waal HA, Boomsma DI, Kahn RS, Hulshoff Pol HE. 2009. Sex steroids and brain structure in pubertal boys and girls. Psychoneuroendocrinology 34(3):332-342

[37] Peper JS, Dahl RE. 2013. The teenage brain: surging hormones-brain-behavior interactions during puberty. In: Current Directions in Psychological Science.

[38] Peper JS, Hulshoff Pol HE, Crone EA, Van Honk J. 2011. Sex steroids and brain structure in pubertal boys and girls: a mini-review of neuroimaging studies. Neuroscience 191:28-37

[39] Perfalk E, Cunha-Bang S da, Holst KK, Keller S, Svarer C, Knudsen GM, Frokjaer VG. 2017. Testosterone levels in healthy men correlate negatively with serotonin 4 receptor binding. Psychoneuroendocrinology 81:22-28

[40] Petersen AC, Crockett L, Richards M, Boxer A. 1988. A self-report measure of pubertal status: reliability, validity, and initial norms. Journal of Youth and Adolescence 17(2):117-133

[41] Purves-Tyson TD, Handelsman DJ, Double KL, Owens SJ, Bustamante S, Weickert CS. 2012. Testosterone regulation of sex steroid-related mRNAs and dopamine-related mRNAs in adolescent male rat substantia nigra. BMC Neuroscience 131

[42] Reimers L, Büchel C, Diekhof EK. 2014. How to be patient. The ability to wait for a reward depends on menstrual cycle phase and feedback-related activity. Frontiers in Neuroscience 8(DEC):1-12

[43] Richards JS, Arias Vásquez A, von Rhein D, Van Der Meer D, Franke B, Hoekstra PJ, Heslenfeld DJ, Oosterlaan J, Faraone SV, Buitelaar JK, Hartman CA. 2016. Adolescent behavioral and neural reward sensitivity: a test of the differential susceptibility theory. Translational Psychiatry 6(4):e771

[44] Selcer KW, Leavitt WW. 1988. Progesterone down-regulation of nuclear estrogen receptor: a fundamental mechanism in birds and mammals. General and Comparative Endocrinology 72(3):443-452

[45] Sinclair D, Purves-Tyson TD, Allen KM, Weickert CS. 2014. Impacts of stress and sex hormones on dopamine neurotransmission in the adolescent brain. Psychopharmacology 231(8):1581-1599

[46] Sisk CL, Foster DL. 2004. The neural basis of puberty and adolescence. Nature Neuroscience 7(10):1040-1047

[47] Spritzer MD, Daviau ED, Coneeny MK, Engelman SM, Prince WT, Rodriguez-Wisdom KN. 2011. Effects of testosterone on spatial learning and memory in adult male rats. Hormones and Behavior 59(4):484-496

[48] Stiensmeier-Pelster J, Braune-Krickau M, Schürmann M, Duda K. 2014. Depressionsinventar für Kinder und Jugendliche (DIKJ) Dorsch –Lexikon Der Psychologie. Verlag Hogrefe Verlag 18:366-361

[49] Tanapat P, Hastings NB, Gould E. 2005. Ovarian steroids influence cell proliferation in the dentate gyrus of the adult female rat in a dose- and time-dependent manner. Journal of Comparative Neurology 481(3):252-265

[50] Thiblin I, Finn A, Ross SB, Stenfors C. 1999. Increased dopaminergic and 5-hydroxytryptaminergic activities in male rat brain following long-term treatment with anabolic androgenic steroids. British Journal of Pharmacology 126(6):1301-1306

[51] Veselic S, Jocham G, Gausterer C, Wagner B, Ernhoefer-Reßler M, Lanzenberger R, Eisenegger C, Lamm C, Losecaat Vermeer A. 2021. A causal role of estradiol in human reinforcement learning. Hormones and Behavior 134(August 2020)