Dynamics of spontaneous thoughts: Exploration, attentional profile and the segmentation of the stream of thoughts

For a long time, clinical knowledge and first-person reports have pointed to individual differences in the dynamics of spontaneous thoughts, in particular in the extreme case of psychiatric conditions (e.g. racing thoughts in Attention Deficit / Hyperactivity Disorder, ADHD; rumination in depression). We used a novel procedure to investigate this individual variability by combining verbal fluency tasks and introspective reports of thought content. Our goal was twofold. First, we tested the hypothesis that a greater segmentation of the stream of thoughts would be associated with trait inattention, in line with subjective reports of ADHD patients. Second, we tested whether the segmentation of the stream of thoughts increased with an increased tendency for exploratory behavior, following recent theoretical claims on the mechanisms underpinning the generation of spontaneous thoughts. Our results support both hypotheses, shedding light on the factors contributing to the individual variability in the dynamics of the stream of thought.


Introduction
In his Principles of Psychology (1890), William James describes subjective experience as a flow: thoughts seem to unfold in a continuous stream. While James notes the absence of precise delimitations within the stream of consciousness, he also insists on the notion that this stream flows at varying speeds: sometimes it is slow and stable (substantive states), while at other times it is fast (transitive states), leading to a reconfiguration of the content of our consciousness. Because of its subjective and private nature, the study of the stream of thought has long been limited to purely introspective methods such as meditation and phenomenology. Despite their merits, these approaches do not allow for a quantitative comparison of individual experiences. In particular, it has been extremely difficult to objectify the distinction between substantive and transitive states, even though the distinction seems important in the clinical description of subjective experience.
In the present study, we focused on one aspect of this flow of subjective experience that we will refer to as the "stream of spontaneous thoughts" (SST), that is, the undirected thoughts (images, ideas, memories) that come to mind in the absence of external stimulation. Spontaneous thoughts are ubiquitous in our daily lives (Killingsworth & Gilbert, 2010), and have been extensively studied by experimental psychologists in the last two decades under the label of mind-wandering (Smallwood & Schooler, 2015), following seminal work from the late 1960s. Most research in this domain has focused on identifying categories of mind-wandering and studying its consequences on task performance. Directly relevant to our own investigation, the more challenging question of how thoughts unfold in time has recently been an object of renewed interest, from both a theoretical and an empirical perspective (e.g. Andrews-Hanna et al., 2021; Mildner & Tamir, 2019; Sripada & Taxali, 2020). This question can be addressed along several axes. For instance, some authors have investigated the emotional valence of thoughts and its link to the speed of their unfolding (Raffaelli et al., 2021), while others have tried to characterize the kinds of transitions between thoughts (Mills et al., 2021).
Yet, the dynamics of the SST (speed, organization in time, etc.), as well as inter-individual variability in the general population, remain understudied to this day. We chose to study one aspect of dynamics, namely the level of segmentation (i.e. its organization in groups or sections). We set out to study this with reference to the notion that individuals vary along an attentional profile continuum, a notion that we generalize from the literature on disorders of attention in psychiatry.
Anomalies in the dynamics of spontaneous thoughts have been associated with several psychiatric disorders, like ruminative thoughts in depression (McLaughlin & Nolen-Hoeksema, 2011), racing thoughts during hypomanic episodes in bipolar disorders (Keizer et al., 2014; Piguet et al., 2010), or distracted and restless mind for ADHD patients (Asherson, 2005; Martz et al., 2022). Those anomalies have long been identified based on subjective reports and clinical observation of speech. More quantitative evaluations have involved word associations (e.g. Levine et al., 1996), and more recently verbal fluency tasks (Weiner et al., 2019). For example, Martz et al. (2022) observed that ADHD patients tended to form more and smaller semantic clusters in a free verbal fluency task, where no category was imposed, which suggests that they had more thought transitions, i.e. a more segmented stream of thought.
In the present study, we focused on characterizing spontaneous thoughts in the general population, extending the attentional deficit that is at the core of the diagnosis of Attention Deficit Hyperactivity Disorder (ADHD) to the notion of trait inattention. ADHD is a neurodevelopmental disorder characterized by a tendency for inattention, hyperactivity and impulsivity that affects about 3 % of adults (Asherson, 2005). It is documented that ADHD symptomatology is present to some extent as a continuum in the general population (Asherson & Trzaskowski, 2015; Larsson et al., 2012; McLennan, 2016). In the following, we shall refer to the individual traits captured by this symptomatology as the attentional profile, and we operationalize its measurement by means of the Adult Self-Report Scale (ASRS, Adler et al., 2006; Caci et al., 2014). The impact of sub-clinical variations of this attentional profile on spontaneous thoughts has not been evaluated, with few exceptions: Seli et al. (2015) and Franklin et al. (2014) showed that ADHD self-reported clinical diagnosis and sub-clinical ADHD symptomatology were both associated with increased mind-wandering, while Van den Driessche et al. (2017) and Beikmohamadi & Meier (2022) described an increase of mind-blanking episodes without increase of mind-wandering in clinically diagnosed children and sub-clinical adults with higher ADHD symptomatology.
Previous research provides some indication as to the generative mechanisms for spontaneous thoughts. Indeed, it has been proposed that the content of spontaneous thought originates from a recombination of episodic memory elements (Christoff et al., 2016), structured by semantic memory (Mildner & Tamir, 2019). This claim is supported by behavioral, neuropsychological, and brain imaging data (see Christoff et al., 2016 for a review). Regarding the cognitive mechanisms involved, Mildner & Tamir (2019) proposed that the generation of spontaneous thoughts could be described as foraging in a semantic space. According to Mildner & Tamir, the mind moves along a random walk over episodic, semantically structured memories and a foraging mechanism is used to decide when to jump to another place of the semantic environment. Other accounts of mind-wandering have highlighted its potential links with the exploration / exploitation trade-off, like Mittner et al.'s (2016) framework that posits the existence of an "exploratory" off-focus state that would mediate the transitions between the on-task and mind-wandering "exploitative" states. Importantly, the balance between exploration and exploitation is also at the core of theoretical models of ADHD (Hauser et al., 2016) and its evolution (Williams & Taylor, 2006). In this article, we test this suggested link between a tendency for exploration and the dynamics of spontaneous thoughts. We predict that the frequency of spontaneous thought transitions, that is the level of segmentation of the SST, should covary, at the population level, with the tendency for exploration.
Tendency for exploration can be seen as an individual trait (Cloninger et al., 1993; Williams & Taylor, 2006) that has recently been shown to predict behavior in decision making and foraging tasks. In a recent investigation of the links between impulsivity and exploratory behavior in the general population, Dubois & Hauser (2022) showed that impulsivity was associated with a tendency for value-free random exploration only, i.e. choices that do not take into account the information gathered in the task. Addicott et al. (2021) observed more exploratory decisions for ADHD participants compared to controls in a multi-armed bandit task. Barack et al. (2024) found similar results with a foraging task. Van den Driessche et al. (2019) reported more semantic and visuo-spatial exploration in non-clinical children with a higher score on an attentional scale.
We operationalized the SST as a sequence of spontaneously produced words. We built on the hypothesis that word production tasks provide a direct and quantifiable access to the organization and dynamics of the SST as well as its inter-individual variations. We asked the following questions: Can we identify a segmentation in episodes of the stream of spontaneous thoughts? If so, is this segmentation shaped by individual differences of attentional profile? Is this segmentation related to a general tendency for exploration? Is the tendency for exploration modulated by participants' attentional profile?
Traditionally, the experimental study of mind-wandering has employed experience sampling methods, which involve real-time probing to assess participants' subjective mental states at specific moments (Hurlburt, 1997; Smallwood & Schooler, 2015). These methods can be fruitful for studying the frequency of mind-wandering episodes or their interference with a concurrent task, but they are not suitable for investigating the continuous unfolding in time of the SST. A promising way to accomplish the latter is to have participants continuously generate verbal content. This can be seen as a way to track the content of participants' stream of thoughts as closely as possible, as it changes in time, since this content is largely semantic in nature (Bar et al., 2007; Heavey et al., 2019).
Three types of tasks have been used to attempt to track the SST: in think-aloud paradigms (e.g., Raffaelli et al., 2021; Sripada & Taxali, 2020) participants are instructed to vocalize their thoughts in real time, as was pioneered in the 1980s by Ericsson & Simon's protocol analysis (Ericsson & Simon, 1993); in a chained associations task (e.g., Andrews-Hanna et al., 2021; Gray et al., 2019) participants are instructed to generate series of words so that each new word must be the first that comes to mind in relation to the previous one; finally, in a free verbal fluency task (Joanette et al., 2004; Martz et al., 2022) participants must produce as many words as possible within a limited time period. We designed a slightly different spontaneous word generation task, asking participants to freely enunciate the words that come to their mind in real time. Thus, our task is a form of single word think-aloud, or self-paced content probing. This constitutes our core free condition and we assume that the words produced in this task are real time proxies for the succession of thoughts that participants experience. This free condition builds on and extends the paradigms cited above, but it stands out in that it combines longer series, a free rhythm of production and the instructions to "freely say the words that come to mind". Since we sought to study spontaneous thoughts, we chose to ask for the words that came spontaneously to mind, rather than putting emphasis on "associations" between consecutive words, or on productivity.
In addition, in order to gain a deeper understanding of the subjective unfolding of thoughts during the free word generation task, we asked participants to provide retrospective annotations of the series of words they produced. This approach aims to complement our objective metrics (semantic similarities and response times) with subjective indications that might include personal knowledge. Further, we engaged participants in an explicit semantic search control condition, building on the hypotheses reviewed above that search mechanisms contribute to the generation of the stream of spontaneous thoughts. To do so, we used a dual verbal semantic fluency task.
In semantic fluency tasks, participants are given one category and asked to generate exemplars from it. Searching behavior can then be quantified by counting the number of "switches", where the participants move from one sub-category to another. The identification is often done manually, either using criteria that were standardized for the "animals" category (Troyer et al., 1997), or using the intuition of external annotators (Martz et al., 2022). We chose to use a double fluency task (hereafter f2 condition), in which participants were instructed to generate words from two possible categories. This allowed us to identify the switches between categories objectively, and hence the foraging strategies. Our fluency task enabled us to characterize the behavior of participants in a semantic search context, and to assess their tendency for exploration in a semantic environment close to our free condition. Our prediction was that participants with a more segmented stream of thoughts, as revealed in the free condition, and participants with higher trait inattention would display a more exploratory behavior in the semantic fluency task.
In sum, the free condition is a proxy for the unfolding of spontaneous thoughts in time, whereas the double fluency (f2) condition puts the participants under semantic search constraints. Our results show that participants higher in trait inattention have a more segmented stream of consciousness and display more exploratory behavior in their semantic searches. We also observe a direct relationship between the segmentation of thought as it is quantified in the free word fluency condition and the strategies exhibited in the semantic search task.

Participants
Fifty-nine participants completed the experiment. They were recruited via a mailing list dedicated to relaying information about cognitive sciences experiments. The inclusion criteria were having normal vision or vision corrected to normal with contact lenses, being a native French speaker, and being 18 or older. The mean age was 26.2 (+/-5.7), ranging from 18 to 35, and the sample included 38 women. This study was approved by the ethics committee of Université Paris Cité (CER-U-Paris Cité). Four participants misunderstood the instructions of one of the tasks and were removed from the analysis of the task concerned. Six participants had incomplete data because of technical problems but were kept for some analyses. Our sample size was chosen to be of the same order of magnitude as in the few published studies using comparable methods (Andrews-Hanna et al., 2021; Martz et al., 2022; Raffaelli et al., 2021; Sripada & Taxali, 2020; Troyer et al., 1997).

Tasks
Our participants completed three 180 s word-generation tasks with an increasing level of constraint, in a within-subject design (see Supp. Figure 1 for a task schematic). In the free condition, the instruction was to say the words that came spontaneously to mind, following an initial neutral seed word displayed on the screen (see the list of seed words in Supp. Table 1). In the double fluency condition (f2), they had to name items belonging to one or the other of two target categories (e.g. "sports & animals"), with no instruction as to the starting category or the switches. Participants also completed a standard one-category fluency task ("f1") in between these two conditions. This additional task was included as part of another project centered on semantic searches (Kérébel et al., in preparation). In this article we focus on the free and f2 conditions, which are at the core of our argument.
Each participant completed 4 series of each condition in a row, with different target categories and seed word every time.Before each condition the participants read the instructions for the following task under the supervision of the experimenter.

Set up
Participants were seated 70 cm away from a 52.3 cm × 29.5 cm monitor, with their head stabilized by a forehead-rest and a chin-rest that were adjusted so as not to impede speech production. The stimulus words were displayed in a white font on a gray background. Participants were instructed to fixate a white hourglass shape (size: 1.5 cm) that was displayed at the center of the screen. The level of the "sand" in this hourglass indicated the remaining time in the current trial. The experimenter left the room during the task phases of the experiment. Pupillometric data were recorded with an EyeLink 2000 eye-tracker (SR Research). However, due to an excessive amount of blink artifacts, they were not analyzed in this study. The audio was captured with an M-AUDIO MicroTrack II recorder, and the words produced were transcribed using the Vosk Speech Recognition Toolkit (Shmyrev and Cephei, 2020) in Python.

Procedure and design
All participants completed the conditions in the same order: free, f1, f2. The seed words for the free condition were chosen from a list of words that we had filtered for frequency, context availability and arousal, based respectively on the Lexique database (New et al., 2004) and on the data gathered by Bonin et al. (2013). We wanted to make sure that every participant knew the words, that the words were tangible enough to trigger a chain of associations, and that the words were neutral in arousal level. For the target categories of the f2 condition, we chose two sets of category pairs that were both delimited enough and rich enough, and we validated our choice in a pilot experiment (see Supp. Table 1).

Annotation procedure
At the end of the experiment, participants were presented with a printed list of the words they had produced in the free condition, and they were asked to group the words that "belonged together" at the moment when they were produced, i.e. words that corresponded to the same thought. As a control, we also asked them to group the words of a series produced by another participant in the pilot experiment. This control series was the same for every participant.

Questionnaires
Between the word production tasks and the annotation procedure, we also had participants fill out the ADHD Self-Report Scale (ASRS, Kessler et al., 2005, French translation on behalf of the WHO by Caci et al., 2014) and the State Trait Anxiety Inventory (STAI T, Spielberger et al., 1983, French translation by Gauthier & Bouchard, 1993).

Words data preprocessing
The onset time of each word was obtained automatically via our transcription tool, then manually checked and corrected when needed. Spelling, plural forms and capital letters were homogenized as much as possible across participants, and for the fluency conditions, out-of-category words were identified. For the f2 condition, words were manually assigned to their category by the experimenters. The preprocessed data file is available on the Open Science Framework (OSF, https://osf.io/7xkrd/?view_only=3b2ae3e5c17d48c9ab6b0a9a78eb3c80).

Statistical analyses
We ran mixed effect models (or generalized mixed effect models) every time there were several data points per participant, using the lme4 R package (Bates et al., 2015). We included a random intercept of the form (1|subject id./series nb.) for the models fitted on the full data, or (1|subject id.) for the models fitted on data averaged by condition. When modeling the Word Onset Asynchrony (WOA), we used a log transformation to avoid violating some of the assumptions of the model (homoscedasticity and normality of the residuals). Unless specified otherwise, we z-scored the variables fed to our models. Note that the semantic similarities in the fluency conditions were z-scored per category. We report p-values as estimated using Satterthwaite's degrees of freedom method with the lmerTest R package (Kuznetsova et al., 2017). In the supplementary material we report the tables for every model, with 95 % confidence intervals and effect sizes in the form of R-squared for linear models (MuMIn R package, implementing the methods proposed by Nakagawa & Schielzeth, 2013) and odds ratios for logistic models. In order to assess evidence for null effects we computed Bayes factors, or an approximation thereof when needed. For mixed-effect models we approximated the Bayes factor from the Bayesian Information Criterion: $\mathrm{BF}_{10} = e^{(\mathrm{BIC}_0 - \mathrm{BIC}_1)/2}$ (Wagenmakers, 2007). For correlations we used the BayesFactor R package (Morey et al., 2012) with default priors to compare a regression with and without the predictor of interest. The scripts used to produce the analyses, tables and figures reported here are available on OSF, along with the supplementary materials and the data.
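As an illustration of the BIC-based approximation described above, the following minimal Python sketch computes BF10 from the BIC values of a null model (without the predictor of interest) and an alternative model (with it). The function name and the BIC values are hypothetical and only serve the example; the analyses reported here were run in R.

```python
import math


def bf10_from_bic(bic_null: float, bic_alt: float) -> float:
    """Approximate BF10 as exp((BIC0 - BIC1) / 2), following Wagenmakers (2007)."""
    return math.exp((bic_null - bic_alt) / 2.0)


# Hypothetical BIC values, for illustration only.
# Here the alternative model has the larger BIC, so BF10 falls below 1
# (evidence in favor of the null model).
bf10 = bf10_from_bic(bic_null=1000.0, bic_alt=1007.0)
print(f"BF10 = {bf10:.3f}")
```

A value below 1 favors the model without the predictor, and a value above 1 favors the model with it.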

Metrics
We analyzed variations of behavior in time using both semantic measures and time measures. For the semantic measures we used a pre-trained word embedding model of French (fastText, Grave et al., 2018) that associates each word with a vector in a 300-dimensional space. We chose this model for its ability to handle out-of-vocabulary words, which were likely to be encountered in the free condition. We computed the consecutive semantic similarity (hereafter consSim) between consecutive words, i.e. the cosine similarity of the vectors of those words in the embedding model. The consSim tells us how far each word is from the previous one in an objective semantic space. As reviewed above, semantic relationships are likely to constrain and structure thought processes (Mildner & Tamir, 2019; Bar et al., 2007), and we assume that the vector space of the embedding model partially captures this underlying semantic structure. Our temporal metrics were the WOA, the word onset time and the rank of each word in its series.
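The consSim metric can be sketched as follows. This is a minimal illustration in plain Python, assuming that each word has already been mapped to its embedding vector; the toy 3-dimensional vectors below are hypothetical stand-ins for the 300-dimensional fastText vectors.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def consecutive_similarities(vectors):
    """consSim of each word with the previous one (the first word has none)."""
    return [cosine_similarity(prev, cur) for prev, cur in zip(vectors, vectors[1:])]


# Hypothetical toy embeddings for a series of three words.
series_vectors = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
cons_sim = consecutive_similarities(series_vectors)
print(cons_sim)
```

A series of n words yields n - 1 consSim values, one per transition.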

Questionnaires
Twenty-one participants filled out a short version of the ASRS (Part A) instead of the full one (Part A + Part B). Since the short version has been shown to be more reliable than the full one (Kessler et al., 2005), we chose to use Part A only for all participants. Data from the STAI are not reported here but can be found on the OSF repository.

Words production
Participants produced a mean number of 88.1 (+/-7.8) and 54.9 (+/-15.3) words in the free and double fluency (f2) conditions, respectively. The means and the standard deviations differed between the two conditions (p < 0.001 and p < 0.05, respectively). The number of words generated varied less within participants in the free condition than in the other condition, which suggests that the seed words had little impact on free production. No instruction was given concerning repetitions and they remained very low in proportion (free: 0.04 +/-0.05, f2: 0.03 +/-0.03).
The number of words produced correlated across participants between the two conditions (r = 0.65, p < 0.001, BF10 > 30). However, it did not correlate with the ASRS score (free: p = 0.25, BF10 = 0.47; f2: p = 0.95, BF10 = 0.27), and visual inspection of the data did not suggest any non-linear relationship. The mean numbers of words produced by category pair (f2) are reported in the supplementary material (Supp. Table 2).

Questionnaires
Answers on the ASRS A questionnaire showed substantial variability: the mean raw score was 17.7 (+/-3.66), ranging from 10 to 26 (the maximum score being 30). 40 % of participants reached the threshold corresponding to ADHD detection according to the clinical criterion. It is known that ADHD symptoms are often recognized by people even if they are not associated with life impairments (Asherson, 2005), which may explain why a high proportion of our participants were over the clinical threshold.

Spontaneous thoughts and the attentional profile
Thoughts segmentation

Subjective clustering in the free series.
On average, participants identified 11 (+/-3.5) clusters containing 7.6 words (+/-2.5) per series. The mean duration of said clusters was 13.4 s (+/-6.0). Note that only 84 % (+/-14) of the words were subjectively classified in clusters (max = 100 %, min = 42 %). Because of that, 35 % (+/-19) of the cluster transitions were not cluster-cluster transitions.

Figure caption: Example series of words presented in a 2D semantic space (t-SNE reduction of the 300D word embedding model we used for the analyses). The first word of each series is written in a bold and italic font. The insert plots represent the word onset in seconds and the distribution of consecutive semantic similarities. Words were translated from French. Left: free condition; each color corresponds to a subjective cluster. The words in gray were not assigned to any cluster. The distribution of consecutive semantic similarities of the words in gray is superimposed on the distribution of the colored words. Solid line: cluster to cluster transition; dashed line: transition between cluster and non-cluster; dotted line: transition between non-clustered words. We can see that the consecutive semantic similarities are smaller for the non-clustered words (the gray distribution is more to the left than the colored distribution in the top right insert plot). Right: f2 condition; each color represents a category (blue: countries, red: jobs). Solid line: switch; dotted line: non-switch transition. Note the bump of very small consecutive semantic similarities on the top left insert plot, which corresponds to the switches. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Validation of the subjective clustering of the free series.
We then checked whether the subjective clustering was congruent with objective temporal and semantic measures. We compared the Word Onset Asynchronies (WOAs) and consecutive similarities (consSim) for words within and outside subjective clusters. To do so, we ran a mixed effect logistic regression predicting for each word whether it was part of a cluster or not, based on WOA and consSim. We observed a positive effect of consSim (est. = 0.40, p < 0.001) and a negative effect of WOA (est. = -0.22, p < 0.001), meaning that clustered words are closer temporally and semantically than non-clustered words (see Supp. Table 3).
As an additional validation, an external annotator blind to our research questions annotated the series of words of the free condition as if they had produced it themselves.For each series, we computed the mean distance between each cluster boundary identified by the annotator and the closest participant's subjective cluster boundary, which gave us a mean 'disagreement score' per series.We assessed how much smaller than chance this score was by shuffling the positions of cluster boundaries identified by the annotator and computing the disagreement score 500 times.No permutation yielded a smaller score than the true disagreement score, which shows that first and third person fragmentations aligned better than chance with p < 1/500, i.e. p < 0.002.
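The disagreement score and its permutation test can be sketched for a single series as follows. This is a hedged illustration only: boundaries are represented as word positions, and "shuffling" is implemented by redrawing the same number of annotator boundaries at random positions in the series, which is an assumption about the exact procedure. The function names and example boundaries are hypothetical.

```python
import random


def disagreement(annotator_bounds, participant_bounds):
    """Mean distance from each annotator boundary to the closest participant boundary."""
    distances = [min(abs(a - p) for p in participant_bounds) for a in annotator_bounds]
    return sum(distances) / len(distances)


def permutation_test(annotator_bounds, participant_bounds, n_words, n_perm=500, seed=0):
    """Return the observed score and the share of permutations with a smaller score."""
    rng = random.Random(seed)
    observed = disagreement(annotator_bounds, participant_bounds)
    n_smaller = 0
    for _ in range(n_perm):
        # Assumption: redraw the annotator boundaries at random word positions.
        shuffled = rng.sample(range(1, n_words), len(annotator_bounds))
        if disagreement(shuffled, participant_bounds) < observed:
            n_smaller += 1
    return observed, n_smaller / n_perm


# Hypothetical boundaries (word ranks at which a new cluster starts).
score = disagreement([5, 10], [5, 12])
observed, share_smaller = permutation_test([5, 10, 20], [5, 10, 20], n_words=30)
print(score, observed, share_smaller)
```

When no permutation yields a smaller score, the p-value can only be bounded by 1 / n_perm, as in the text above.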

Impact of the attentional profile on thought segmentation in the free series
Thoughts segmentation and ASRS score.
In the free condition, we predicted that participants with a higher ASRS score would tend to switch more often from one thought to another. We first assessed whether the attentional trait created a bias in the annotation strategy. To do so, we checked whether the number and size of the clusters correlated with the ASRS score in the common series that participants annotated without having produced it. They did not (respectively p > 0.5, BF10 = 0.36 and p > 0.8, BF10 = 0.32). Note that we only had 38 data points for that test since we implemented this "common series" control after having already run some participants (see Fig. 1).
Next, we quantified the segmentation of the series of words. For each free series, we computed the number of clusters, the cluster ratio (number of clusters/number of words), the mean size and duration of the clusters, and the proportion of clustered words, which we then averaged by participant. We observed (see Fig. 2) that participants higher on the ASRS had more clusters (r = 0.47, p < 0.001, BF10 > 30), a higher cluster ratio (r = 0.40, p = 0.003, BF10 = 14), and smaller and shorter clusters (r = -0.34, p = 0.01, BF10 = 4.7; r = -0.37, p = 0.005, BF10 = 7.8). There was no correlation between the ASRS score and the proportion of clustered words (p = 0.61, BF10 = 0.30).
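The per-series segmentation metrics listed above can be sketched as follows. This minimal Python example assumes that each word of a series carries a cluster label, with None for words left unclustered by the participant; this data layout, the function name and the example labels are hypothetical. Cluster duration is omitted because it additionally requires word onset times.

```python
def segmentation_metrics(labels):
    """Compute cluster count, cluster ratio, mean cluster size and share of clustered words."""
    clustered = [lab for lab in labels if lab is not None]
    sizes = {}
    for lab in clustered:
        sizes[lab] = sizes.get(lab, 0) + 1
    n_clusters = len(sizes)
    return {
        "n_clusters": n_clusters,
        # Cluster ratio: number of clusters / number of words in the series.
        "cluster_ratio": n_clusters / len(labels),
        "mean_size": len(clustered) / n_clusters if n_clusters else 0.0,
        "prop_clustered": len(clustered) / len(labels),
    }


# Hypothetical annotation of a 10-word series: three clusters, two unclustered words.
labels = ["a", "a", "a", None, "b", "b", None, "c", "c", "c"]
metrics = segmentation_metrics(labels)
print(metrics)
```

These per-series values would then be averaged by participant before being correlated with the ASRS score.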

ASRS and consecutive semantic similarity.
We found evidence for an increased thought segmentation at the level of the series of words, based on participants' subjective clustering (global scale). But modifications of thought dynamics could happen at various scales. We then tested whether there was an effect of the ASRS at the lowest scale accessible in our study: individual word transitions (local scale). We found no indication of a relation of the attentional profile with local word production processes in the free condition: mean consSim did not correlate with the ASRS score (p = 0.93, BF10 = 0.26). We ran a linear mixed-effect model predicting each word's consSim with predictors of WOA, ASRS score and their interaction (Supp. Table 4). We found a main effect of the WOA (est. = -0.27, p < 0.001) but no main effect or interaction of the ASRS (all ps > 0.5, BF10 < 0.03).

General tendency for exploration
In f2, participants switched on average 4.15 times (+/-2.56) per series between categories. The mean number of switches and the switch ratio (number of switches/number of words produced) did not correlate with the ASRS score (p = 0.09, BF10 = 0.96; p = 0.28, BF10 = 0.44, respectively). We ran a mixed-effect logistic regression model predicting whether each word was a switch or not, based on the WOA, the consSim, and the rank of the word in the series as a control (see Supp. Table 5). We found that the switches were associated with smaller consSim (est. = -3.14, p < 0.001), longer WOAs (est. = 0.56, p < 0.001), and were less likely to occur later in the series (word rank main effect: est. = -0.58, p < 0.001).
We then computed the imbalance in the use of the two categories as the absolute difference of the proportion of words drawn from each category, in each series. A large score indicates that participants exploited one category at the expense of the other, whereas a low score indicates a balanced use of the two categories. We refer to this score as the asymmetry between categories. We observed a negative correlation between the ASRS score and the mean asymmetry between categories across series (r = -0.36, p = 0.005, BF10 = 8.8), and a negative correlation between the ASRS score and the standard deviation of the asymmetry between categories across the four series (r = -0.41, p = 0.001, BF10 = 27), see Fig. 3.
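The switch count and the asymmetry between categories can be sketched for one f2 series as follows. This is a minimal illustration, assuming that out-of-category words have already been removed and that each remaining word is tagged with one of the two target categories; the function names and the example series are hypothetical.

```python
def count_switches(categories):
    """Number of transitions where the category differs from that of the previous word."""
    return sum(1 for prev, cur in zip(categories, categories[1:]) if prev != cur)


def category_asymmetry(categories, cat_a, cat_b):
    """Absolute difference between the proportions of words drawn from each category."""
    n_a = categories.count(cat_a)
    n_b = categories.count(cat_b)
    return abs(n_a - n_b) / (n_a + n_b)


# Hypothetical series: category label of each successive word.
series = ["country", "country", "job", "job", "job", "country", "job", "job"]
n_switches = count_switches(series)
asymmetry = category_asymmetry(series, "country", "job")
print(n_switches, asymmetry)
```

An asymmetry of 0 corresponds to a perfectly balanced use of the two categories, and 1 to a series drawn from a single category.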

Impulsive exploration
We then quantified impulsivity in switching between the categories. We reasoned that, given four consecutive words A, B, C, D, a switch between words B and C could be considered locally optimal if the WOA between A and B was greater than the WOA between C and D, i.e. if the switch was accompanied by a local speed up of word production. Conversely, the switches that did not meet this criterion could be seen as impulsive. On average, 59 % (+/-19) of the switches were locally optimal according to this definition. This proportion was significantly greater than 50 % (one-sided t-test, t(54) = 3.76, lower confidence bound = 0.55, p < 0.001, d = 0.51). We then ran a linear mixed-effect regression predicting for each series the number of switches of each type (locally optimal/impulsive) based on the ASRS score, the switch type and their interaction (see Supp. Table 6). We observed a positive effect of the ASRS score (est. = 0.22, p = 0.028) as well as an interaction with the switch type (est. = -0.20, p = 0.024): the effect of the ASRS score was stronger for "impulsive" switches than for "locally optimal" switches.
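The classification of switches can be sketched as follows. This minimal Python example takes the WOA between two consecutive words to be the difference between their onset times, and skips switches that lack a preceding word A or a following word D; both choices are assumptions made for the illustration, as are the function name and the example data.

```python
def classify_switches(onsets, categories):
    """Label each switch (B -> C) as 'locally optimal' or 'impulsive'.

    For consecutive words A, B, C, D, the switch between B and C is locally
    optimal if WOA(A, B) > WOA(C, D), i.e. production speeds up after the switch.
    """
    results = []
    for i in range(2, len(onsets) - 1):
        if categories[i] != categories[i - 1]:  # word C starts a new category
            woa_ab = onsets[i - 1] - onsets[i - 2]
            woa_cd = onsets[i + 1] - onsets[i]
            label = "locally optimal" if woa_ab > woa_cd else "impulsive"
            results.append((i, label))
    return results


# Hypothetical onset times (in seconds) and category labels for one series.
onsets = [0.0, 1.0, 4.0, 5.0, 5.5, 6.0, 9.0, 11.0]
categories = ["a", "a", "a", "b", "b", "b", "a", "a"]
switch_types = classify_switches(onsets, categories)
print(switch_types)
```

In this toy series, the first switch follows a long pause and precedes a fast transition, whereas the second follows a fast transition and precedes a slow one.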
Then, we investigated the relationship between the ASRS score and the duration of switches, under the assumption that switching rapidly can also be seen as a sign of impulsivity. We ran a linear mixed-effect model predicting the log-transformed WOA of each word based on whether it was a switch or not and on the ASRS score. We found a main effect of the word type (switch | no switch) (est. = 0.83, p < 0.001), and a negative interaction between the two predictors (est. = -0.09, p < 0.001). There was a negative correlation between the mean duration of switches and the ASRS: r = -0.37 (p = 0.005, BF10 = 8.1), while the mean WOA during non-switches did not correlate with the ASRS (p = 0.56, BF10 = 0.31). The interaction was preserved when we added the word rank in the series as a predictor in the model (see Supp. Table 7), which suggests that this interaction is not simply due to participants higher on the ASRS scale switching earlier in the series, when WOAs are smaller.

Convergence of behavior between the free and f2 conditions
We investigated whether participants exhibited similar behavior at the beginnings of the subjective word clusters in the free condition as at switches in the f2 condition. To do so, we ran mixed-effect logistic regressions on the free condition data predicting whether each word was the first of a cluster, based on the WOA, consSim, and word rank in the series (Supp. Table 8). As in the f2 condition, we found that the first words in a cluster had a larger WOA (est. = 0.33, p < 0.001) and a lower consSim (est. = -0.80, p < 0.001). However, there was no effect of the word rank.
In both conditions, we observed that semantically closer words had shorter WOA (free: est. = -0.16, p < 0.001; f2: est. = -0.20, p < 0.001, see Supp. Tables 9 and 10), even when controlling for the position in the series. There was no significant interaction with the ASRS score (in free and f2: all ps > 0.1, BF10 < 0.03). In order to better test this absence of interaction, we compared the regression models with and without the main effect of the ASRS score and its interaction with consSim: in both conditions we found an approximated Bayes factor BF10 < 0.03, suggesting that the WOA~consSim relationship was indeed not modulated by the inattention score. In the free condition, this relationship points to the idea that within a given thought, semantics constrains the thought process. We were also able to test whether this held at a larger scale, i.e. whether there was also a relationship between temporal proximity and semantic similarity between thoughts. To do so, we ran a similar analysis at the level of the clusters of words. We defined the embedding of each cluster as the average of the embeddings of its component words, and we computed the semantic similarity between consecutive clusters. We then ran a linear mixed-effect model on this cluster consSim, with the cluster delay, the ASRS score and the position of the cluster in the series as predictors (see Supp. Table 11). We found a negative effect of the cluster delay (est. = -0.10, p < 0.001), and an interaction with the ASRS score, such that the effect of delay on semantic similarity was larger for higher ASRS scores (cluster delay:ASRS interaction: est. = -0.06, p = 0.011). These effects mirror, at the level of word clusters, the effects that we found at the level of individual words.
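The cluster-level similarity described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: cosine similarity is assumed as the similarity measure (the text only says "semantic similarity"), the function names are hypothetical, and the 3-dimensional vectors are toy stand-ins for the 300-dimensional word embeddings.

```python
import math


def mean_vector(vectors):
    """Average a list of equal-length embedding vectors component-wise."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def consecutive_cluster_similarity(clusters):
    """Similarity between each cluster embedding and the next one.

    Each cluster is a list of word embeddings; its embedding is their mean.
    """
    centroids = [mean_vector(cluster) for cluster in clusters]
    return [cosine_similarity(a, b) for a, b in zip(centroids, centroids[1:])]


# Toy 3-dimensional embeddings (hypothetical values, for illustration only).
clusters = [
    [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]],
    [[0.9, 0.1, 0.0]],
    [[0.0, 0.0, 1.0], [0.0, 0.2, 0.8]],
]
sims = consecutive_cluster_similarity(clusters)
```

Here the first two clusters share the same mean embedding and are therefore maximally similar, while the third points in a nearly orthogonal direction.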

Thoughts segmentation and tendency for exploration
To what extent does exploration, as revealed in the f2 condition, correlate with the segmentation of thought that we measured in our free condition? Correlations between the different metrics are reported in Table 1. As can be seen, thought segmentation in the free condition and exploration in the f2 condition are associated. Across participants, a more segmented thought organization in the free condition is accompanied by a more exploratory search process in f2. We also found that the non-cluster ratio (number of non-clusters / number of words) was a good predictor of the switch ratio (r = 0.33, p = 0.018, BF10 = 3.1), even though it did not correlate with the previously used cluster ratio (p = 0.13, BF10 = 0.71).

Thought segmentation, exploratory behavior, and the ASRS score
As reviewed above, ADHD symptomatology is associated with both increased exploratory behavior and a higher rate of thought change. We found evidence for both, within participants and in very similar tasks. Given the domain-general property of exploratory behavior and the conceptualization of mind-wandering as foraging in one's thoughts, we propose that a low-level tendency for exploration could be at the root of the segmentation of the stream of thoughts. According to this view, one of the reasons why people with a high ASRS score have a more segmented stream of thought (for instance, change topics more often) is that they explore their thoughts more, just as they tend to explore their environment more (in our tasks, their semantic environment). As a first attempt to move beyond correlations, we ran a mediation analysis that summarizes this proposed mechanism. We tested whether the influence of the ASRS score on the cluster ratio (free condition) was mediated by exploratory behavior; we used the asymmetry between the categories (f2 condition) since it was the measure correlating the most with the cluster ratio. Results showed that it was indeed partially mediated (24 % mediated, p = 0.024, see Supp. Table 20 and Fig. 4).
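To make the notion of "proportion mediated" concrete, the following sketch decomposes a total effect into direct and indirect parts using the classical product-of-coefficients approach with ordinary least squares. This is an assumption chosen for illustration: the text does not specify the estimation procedure that was used, the function and variable names are hypothetical, and the data are invented toy values (x standing for the ASRS score, m for category asymmetry, y for the cluster ratio).

```python
def _cov(xs, ys):
    """Sample covariance of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / (len(xs) - 1)


def mediation(x, m, y):
    """Product-of-coefficients mediation with OLS slopes.

    a: slope of m on x; b: slope of y on m controlling for x;
    direct: slope of y on x controlling for m; total: slope of y on x alone.
    """
    vx, vm = _cov(x, x), _cov(m, m)
    cxm, cxy, cmy = _cov(x, m), _cov(x, y), _cov(m, y)
    det = vx * vm - cxm ** 2            # non-zero unless x and m are collinear
    a = cxm / vx
    b = (cmy * vx - cxy * cxm) / det
    direct = (cxy * vm - cmy * cxm) / det
    total = cxy / vx
    indirect = a * b
    return {
        "total": total,
        "direct": direct,
        "indirect": indirect,
        "proportion_mediated": indirect / total,
    }


# Invented toy data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
m = [1.0, 1.5, 3.5, 3.0, 5.5, 5.0]
y = [2.0, 2.5, 4.5, 5.0, 7.5, 7.0]
result = mediation(x, m, y)
```

With OLS the total effect equals the sum of the direct and indirect effects, so the proportion mediated is simply the indirect share of the total. Significance of the indirect effect is usually assessed separately, for example by bootstrapping, which this sketch does not attempt.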

Discussion
Using a semantic fluency task and an original subjective annotation procedure in a free word generation task, we proposed a way to capture and characterize the dynamics of the stream of spontaneous thoughts (SST) and to link it to exploration mechanisms in semantic search tasks and to attentional profile. In a condition where participants were free to generate words as they spontaneously came to mind, we observed that participants with ADHD-like attentional profiles tended to produce more and smaller clusters of words, which we interpreted as a more segmented SST. Furthermore, increased exploratory behaviors in semantic search conditions were also associated with a more segmented SST. These results aim at overcoming the limitations of purely subjective descriptions of thought rhythms and constitute one of the first objective assessments of the rate of thought change. Our results converge on patterns that have long been described in the clinical literature on the basis of subjective reports. They also point to a possible explanation for the cognitive basis of those patterns.
Our study assumes that when participants utter single words, they indirectly track the unfolding of their SST. It is worth clarifying that even though we used word generation to study the unfolding of thoughts, we do not claim that all participants' mental content was always verbal in nature. More generally, we do not posit an equivalence between the stream of thought and inner speech. It has been estimated that 26 % (Heavey & Hurlburt, 2008, with experience sampling) or 66 % (Heavey et al., 2019, with a survey) of our inner experience includes inner speech, while the same authors acknowledge at least four other kinds of inner experience: visual imagery, unsymbolized thinking, feeling and sensory awareness. Crucially, we did not ask our participants to express the "voice in their head". We used the words produced in the free condition as elementary thought probes. For example, if at a given moment a participant had the image of an object in mind, they likely produced the word corresponding to this object. We chose a word generation task rather than free speech for several reasons. First, there is an intrinsic dynamic in a sentence (partly constrained by the syntax of the language) that could mask the internal dynamics of thoughts. Second, word generation is likely less disruptive for the stream of thoughts than free speech. Obviously, further studies would be necessary to systematically assess the validity of free word production tasks as a reflection of the unfolding of spontaneous thoughts. We shall now discuss in turn the four aspects of our results: Do we observe a segmentation of the stream of spontaneous thoughts in distinct episodes? Is this segmentation shaped by individual differences in attentional profile? Is the tendency for exploration assessed in our f2 condition modulated by participants' attentional profile? Can we relate the inter-individual variability of spontaneous thought segmentation to the tendency for exploration?

Do we observe a segmentation of the stream of spontaneous thoughts in distinct episodes?
We found converging subjective and objective markers of segmentation in our free word generation task. First, at an operational level, our subjective annotation task was readily performed by participants. Only three participants did not understand what they were asked to do, or commented on its difficulty or artificiality. We therefore argue that, from a first-person perspective, the stream of spontaneous thoughts as captured by our free condition seems to exhibit identifiable coherent segments. This observation is not trivial: one could have expected thought content to drift continuously from one topic to the next without any identifiable boundary. The presence of non-clustered words further vindicates this separation of coherent segments. The validity of the segmentation of the series of words was further supported by a correspondence between first-person and third-person annotation.
Second, the subjective clustering of words was confirmed by objective semantic and temporal metrics. When predicting whether each word was part of a subjective cluster, the positive effect of semantic similarity tells us that participants' subjective thought segmentation followed a logic that can be captured to some extent by a word embedding model. The effects of temporal and semantic distance in the same model suggest that participants' free series followed an exploration/exploitation dynamic: inside a cluster they tended to produce close words quickly (as if they were exploiting a thought or idea), and outside a cluster they tended to slowly produce words that were far from one another (as if they were exploring different potential topics). Importantly, only 85 % of the words were included in a subjective cluster, although the instructions did not mention the possibility of leaving some words non-clustered. In that sense, our data show an alternation between episodes of coherent thought and episodes of diminished focus during which no coherent thought emerges. This fits with a conceptual framework of exploration and exploitation of thought content (Mildner & Tamir, 2019; Mittner et al., 2016): the clusters of words would correspond to episodes of exploitation, and the consecutive words that did not belong to any cluster would correspond to episodes of exploration, or unfocused thinking. This interpretation aligns with what the participants reported in the debriefing phase: the words that did not belong to any cluster usually corresponded to episodes when the participants did not have any thought or topic in mind, or when they felt they were generating random words so as to fulfill the demand of the task (producing words at a reasonable pace). It is important to stress that, contrary to what has been done by other authors (e.g., Martz et al., 2022), we did not consider each word that had not been assigned to a cluster as an individual cluster. This crucial methodological decision stems from the framing of the task: participants were just told to provide groups of words "that belonged together at the moment of their production". While the cluster ratio (the number of clusters divided by the number of words) increased with increased inattention scores, an analogous metric where isolated words are considered as clusters (see Martz et al., 2022) did not (p = 0.91), suggesting that non-clusters correspond to a distinct mode of processing.
Table 1. Correlations between exploration and thought segmentation metrics. r: Pearson's correlation coefficient.
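The difference between the cluster ratio and the analogous metric that treats every non-clustered word as its own cluster can be shown on a toy series. The sketch below is illustrative only: the annotation format (one cluster label per word, `None` for non-clustered words) and the metric names other than "cluster ratio" are assumptions, not the coding scheme used in the study.

```python
def segmentation_metrics(labels):
    """Compute segmentation metrics for one annotated series.

    labels[i] is the subjective cluster label of word i, or None if the word
    was not assigned to any cluster (assumed annotation format).
    """
    n_words = len(labels)
    n_clusters = len({label for label in labels if label is not None})
    n_unclustered = sum(label is None for label in labels)
    return {
        # Metric used in the text: number of clusters / number of words.
        "cluster_ratio": n_clusters / n_words,
        # Analogous metric in which each non-clustered word counts as a cluster.
        "singleton_inclusive_ratio": (n_clusters + n_unclustered) / n_words,
        # Share of words that belong to some subjective cluster.
        "clustered_fraction": (n_words - n_unclustered) / n_words,
    }


# Hypothetical series of 10 words: three clusters and three non-clustered words.
labels = ["a", "a", "a", None, "b", "b", None, None, "c", "c"]
metrics = segmentation_metrics(labels)
```

In this example the cluster ratio is 0.3, whereas counting the three isolated words as clusters doubles the value to 0.6, which shows how the two metrics can diverge when non-clustered words are frequent.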
All the elements listed above converge on showing that the stream of spontaneous thoughts is characterized by its segmentation into distinct episodes. This connects with the literature on event segmentation, which shows that our experience and memories are segmented into discrete units that are shaped by the perception of changes in the environment (Ross & Easton, 2022). Most of this literature is based either on the segmentation of events in memory (Zacks et al., 2007), or on the online segmentation of a stream of external events in a movie (e.g. Newtson, 1973). Our approach suggests that the inner stream of thought might also display such general characteristics. In fact, our study is at the intersection of the mind-wandering literature and the event-segmentation literature, filling a gap identified by Ross & Easton (2022): our word production tasks were like the "internal movie" that our participants were asked to segment. In contrast with the event segmentation tasks we are aware of, we did not require that all events be contiguous in time, and we observed a significant proportion of non-contiguous segments. Note that this methodology could be transferred to external stream (movie) segmentation tasks: one could allow participants to leave some portions of the movie outside of any coherent segment.
The duration of thought segments we obtained can be compared to what has been reported in the literature so far. In our free condition, participants identified on average 11 clusters per series, that is to say 3.7 per minute. That is about the same order of magnitude as the rate of mental state transitions estimated by Tseng & Poppenk (2020) very indirectly using fMRI (6.5 per minute), the rate of topic changes subjectively reported by participants in Sripada & Taxali's (2020) study (3 per minute), or the slow (<0.1 Hz, i.e. < 6 per minute) BOLD signal fluctuations reviewed by Fox & Raichle (2007). The mean duration of a cluster of words was 13 s, which is comparable to the 10 to 20 s estimated by Bastian & Sackur (2013) using RT variability, and the 5 s reported by Klinger from subjective estimations (Klinger, 1978). To our knowledge, our free condition combined with the subjective annotation procedure constitutes the first attempt to directly quantify this thought "speed".
Going back to the event segmentation literature, note that those durations are not far from the average shot durations that are commonly found in movies (10 s in movies from the 1940s, about 4 s after 2000), which likely try to mirror viewers' internal dynamics (Cutting et al., 2011).
One could see in the alternation between clusters and non-clusters of our data the reflection of the "flights and perches" structure of the mental flux hypothesized by James (1890).With that interpretation, the clusters would correspond to the substantive parts (i.e. the "perches", segments with a clearly identifiable thought content), and the non-clusters would be the transitive parts (i.e. the "flights", segments with a quickly changing or indistinct content).However, note that during the so-called transitive parts, our participants do not necessarily know "where" their chain of thoughts was going, supporting our description of it in terms of exploration episodes.Our view that even in the stream of spontaneous thoughts one can identify an alternation of episodes of exploration and exploitation complements other accounts that consider spontaneous thoughts to be exploratory as opposed to goal-directed thinking that is exploitative in nature (Sripada, 2018).
Crucially, following James' intuitions, our data suggest a thought structure where transitions from one thought to another are not instantaneous. In this regard, they nuance previous accounts of thought dynamics that have considered that all mental content was part of a "thought" and have focused on identifying points of thought transition in think-aloud paradigms. In contrast to Sripada & Taxali's (2020) "clump & jump" structure of spontaneous thought, we propose a "clump, jump or scatter" organization. Interestingly, a handful of participants pointed out that there was a gradual change in meaning in certain clusters of words. That suggests there might be several possible modes of exploration or thought transition: disorganized unfocused content without a clear direction, progressive variations of content, or instantaneous thought transitions akin to "mental saccades". Further studies using subjective annotation procedures might provide participants with more complex instructions, so that they could also report gradual transitions.

Is this thought segmentation shaped by individual differences in attentional profile?
We found that participants with higher ASRS scores tended to display more numerous and smaller subjective clusters of words, independently of the number of words produced. Yet, the temporal and semantic metrics inside clusters did not differ depending on the ASRS score. This suggests that the dynamics of the chain of thoughts, or the distribution of the thoughts in time (but not the thoughts themselves), varied with the attentional profile. Further studies are clearly needed to validate the differential impact of attentional traits on the local and global structure of the stream of thought. In addition, readers have to keep in mind that since we did not impose the rate of word production, the "thought sampling rate", so to speak, varied between participants. This is likely to have had an impact on our local metrics such as the consecutive semantic similarity. For example, someone whose thoughts move very fast semantically might, in our design, produce many words both close in time and close semantically, but would produce words further apart semantically if words had to be more distant in time. This prediction is supported by the fact that the number of words in a series positively predicts the mean consecutive semantic similarity in this series.
Importantly, these differences were found in a non-clinical population, but they mirror results from the clinical literature. For example, Martz et al. (2022) found a conceptually similar result of smaller and more numerous semantic clusters of words for ADHD patients compared to healthy controls in a free semantic fluency task. Our results provide an experimental observation of the phenomenon of racing thoughts, and in that regard, we predict that similar associations would be found with questionnaires such as the Mind Excessively Wandering Scale (MEWS, Mowlem et al., 2019) or with clinical conditions associated with racing thoughts, such as hypomania.
Our experiment complements the few previous studies that started investigating individual differences in thought dynamics, focusing on different personality traits and different metrics of dynamics. Raffaelli et al. (2021) and Andrews-Hanna et al. (2021) assessed thought dynamics through the lens of emotional valence and trait rumination, using a think-aloud paradigm and a chained associations task, respectively. The former found that high brooding scores were associated with longer negative and shorter positive thoughts; the latter estimated that participants with high rumination scores had a higher chance of transitioning to a negative thought after a positive thought. Kim et al. (2022) extended their approach to other dimensions such as time, safety-threat, self-relevance and vividness.
Our results might also shed light on the reason why some studies found that participants with ADHD symptoms experience more episodes of mind-blanking, i.e. mental states with no reportable mental content (Beikmohamadi & Meier, 2022; Van den Driessche et al., 2017). In addition to cases where blanking would result from local sleep (Andrillon et al., 2019), mind-blanking reports could also be caused either by the probe arriving in a moment of transition between different thoughts, or by a frequently changing mental content that would make it difficult to identify one current topic of thought at the moment of the probe. In both cases, the higher rate of thought change that we observed for participants with higher ASRS scores would point in that direction.
Note that in ADHD, unfocused thinking can also alternate with hyper-focusing on a task or a thought that is found to be particularly interesting (Hupfeld et al., 2019; Ozel-Kizil et al., 2016, but see Groen et al., 2020), which, in the context of the free word production task, might have led to long clusters. However, we were not expecting big clusters of words in the free condition nor low numbers of switches in the double fluency condition for participants with high ASRS scores, because we did not expect participants to find our laboratory task very exciting, a condition necessary for hyper-focus to occur.

Is the tendency for exploration assessed in our double fluency condition modulated by participants' attentional profile?
In the double fluency condition, we directly observed a greater tendency for exploration associated with non-clinical ADHD-like symptomatology: participants with a higher score on the ASRS scale relied more on both categories. This is in line with the existing empirical (e.g. Addicott et al., 2021) and theoretical (e.g. Hauser et al., 2016) literature on ADHD and exploratory behavior, which further confirms that our "double fluency" task is suitable for assessing a transfer of exploratory behavior in the semantic domain.
We acknowledge that we would have predicted the number of switches to increase with the ASRS score too. However, as reviewed in the introduction, exploration can take various forms, only some of which might be modulated by ADHD-like traits in our particular task. Thus, the crucial point for us is that there was no evidence in the opposite direction. In addition, when focusing more specifically on impulsive exploration, we found that participants with a high ASRS score had faster switches and that more of their switches could be categorized as impulsive. This could be interpreted as showing that the cost of switching is lower for participants higher on the inattention trait. On the other hand, it could also be seen as a signature of impulsivity, in the sense that the decision to switch is taken quickly. Those results are congruent with the finding of Dubois & Hauser (2022) that high ASRS scores were associated with more value-free random exploration, arguably as a result of impulsivity.
To summarize, we were able to quantify exploratory behavior in a semantic search task, and our results suggest that the attention profile affects it at a global level (between categories).

Can we relate the inter-individual variability of spontaneous thoughts segmentation to a general tendency for exploration?
Ultimately, our goal was to relate thought segmentation as revealed in our free condition and exploratory processes as found in the double fluency condition. We found evidence that the two tasks appeal to similar underlying processes. First, we found a positive correlation between the mean consecutive semantic similarity in the free condition and in the double fluency condition. Interestingly, an analogous effect was also observed at the level of the clusters in the free condition, which suggests that the dynamics of the SST at a larger scale are also constrained by semantics. Second, we found similar patterns of behavior around the switches in double fluency and at the beginnings of the subjective clusters in the free condition. Namely, the first word of each cluster was associated with a large WOA and a small semantic similarity with the previous word, as were the first words in a category in the double fluency condition. The temporal effect is particularly interesting because the free clusters were identified only retrospectively. It suggests that participants' thought process followed a form of category search, constrained by objective semantic relationships, as exemplified by the negative relationships we found between temporal distance and semantic similarity, both at the level of the individual words and at the level of the subjective clusters of words. These last results are in line with current frameworks (Christoff et al., 2016; Mildner & Tamir, 2019), according to which spontaneous thought is the result of an unconstrained memory process. We should add to this idea that categorical organization provides a form of minimal constraint on this process of search within one's semantic space. Yet it is worth noting that other sources of segmentation of the SST are also plausible, namely emotional and goal states (Wang et al., 2023).
We also found converging behavior between the free and double fluency conditions based on the metrics that directly quantify the tendency for exploration (switch ratio and asymmetry in the exploitation of categories in f2).Since the number of switches in the f2 condition and the number of subjective cluster boundaries in the free condition are analogous metrics, the positive correlation that we observed between the two (when controlling for the number of words produced) suggests that those two conditions reveal similar cognitive differences between participants.Interestingly, we found that the non-cluster ratio in the free condition was also a good predictor of the switch ratio (f2 condition), revealing that the more a participant had unfocused thought episodes in the free condition (non-clustered words), the more they were displaying exploratory behavior in another task.This reinforces the notion that moments of unfocused thoughts correspond to moments of exploration.
Given the subjective nature of our novel clustering method, and the moderate number of data points per task per participant, it is not so surprising that some exploration and segmentation metrics correlated with each other and with the ASRS score, while others did not. Our predictions were formulated at a rather abstract, high level, so it is possible that not all operationalizations of those high-level concepts are sensitive enough for our design. However, we find it encouraging to see that all results, even when marginally significant, unambiguously go in the expected direction. One should also keep in mind that the relationships we observed with the ASRS score in the two tasks might be the manifestation of different constructs: we might have observed clearer relationships if we had used two questionnaires separately targeting the constructs of "racing thoughts" and of "tendency for exploration".
The mediation of the association between the ASRS score and thought segmentation by the intensity of exploratory behavior further suggests that the cognitive mechanisms that produce the succession of spontaneous thoughts are partially driven by low-level general exploration mechanisms. This interpretation is backed by several results from the literature showing that semantic search relies on domain-general search mechanisms (Hills et al., 2008), and that ADHD symptomatology is associated with both increased exploratory behavior (Addicott et al., 2021; Dubois & Hauser, 2022; Van den Driessche et al., 2019) and a high rate of thought change (Martz et al., 2021).

Limits of the current study and directions for future research
There are some limitations to this study. First, we considered mostly objective semantic relationships, as quantified by word embedding models, ignoring the idiosyncratic relationships shaped by individual experience and linked to episodic memory. Yet, it has been shown that such relationships impact the stream of thoughts both in the lab (Andrews-Hanna et al., 2021; Kim Lux et al., 2022) and in our daily life (D'Argembeau, 2018), along with semantic relationships (see Jordão and St. Jacques (2022) for a review of the interplay of semantic and episodic memory in spontaneous thoughts). Our best attempt to take them into consideration to some extent was to include the subjective annotation procedure in the free condition. A more systematic (but time-consuming) method could be to have participants rate the semantic relatedness of all pairs of consecutive words. Second, the very nature of the experimental setting may have affected the spontaneity and validity of our free condition data: participants might have felt observed and they had to generate words. While this is a valid concern, it seems to be a necessary cost to pay for a direct appraisal of thoughts, and we believe that our instruction to verbalize single words (instead of fully explicit thoughts, for example) reduced the issue. Third, out of all mental conditions that are associated with an alteration of spontaneous thoughts, we focused only on ADHD, and at a non-clinical level. It would be interesting to use the methodology we developed to better characterize pathological spontaneous thought dynamics in various conditions. For example, we could predict that patients with depression or rumination would display larger subjective clusters of words in our free condition. A natural next step for our research would also be to test whether our results extend to a comparison between clinical ADHD patients and a control group, or between patients with and without treatment. Fourth, we used the ASRS as a measure of a single construct, namely the attentional profile. Yet ADHD encompasses at least two dimensions: inattention and hyperactivity/impulsivity. It is unclear whether one of those dimensions drives the effects we found on thought segmentation and exploratory behavior, or if they both contribute. Besides, this question may be misleading, for example if inattention and hyperactivity/impulsivity were two different manifestations of the same dysfunction, be it a deficit in arousal (Bellato et al., 2020) or a preference for novelty over gain and performance (Van den Driessche, 2022). Fifth, one should keep in mind that our study had an exploratory aspect, given that we used novel tasks. We can distinguish three types of results: the most obvious analyses given our hypotheses had been planned precisely in advance (e.g. the correlation between the ASRS score and the number of subjective clusters), other analyses had been conceptually planned but without knowing what would be possible to do with the data we would obtain (e.g. the idea of quantifying impulsive switches), and finally some analyses emerged as we better understood how participants were behaving in our tasks (e.g. the analyses related to non-clusters).

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 1. Visualization of Two Representative Series of Words. Example series of words presented in a 2D semantic space (t-SNE reduction of the 300D word embedding model we used for the analyses). The first word of each series is written in a bold and italic font. The insert plots represent the word onset in seconds and the distribution of consecutive semantic similarities. Words were translated from French. Left. Free condition: each color corresponds to a subjective cluster. The words in gray were not assigned to any cluster. The distribution of consecutive semantic similarities of the words in gray is superimposed on the distribution of the colored words. Solid line: cluster to cluster transition, dashed line: transition between cluster and non-cluster, dotted line: transition between non-clustered words. We can see that the consecutive semantic similarities are smaller for the non-clustered words (the gray distribution is more to the left than the colored distribution in the top right insert plot). Right. f2 condition: each color represents a category (blue: countries, red: jobs). Solid line: switch, dotted line: non-switch transition. Note the bump of very small consecutive semantic similarities on the top left insert plot, which corresponds to the switches. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 2. Scatter Plots Illustrating the Relationships Between the ASRS Score and the Thought Segmentation Metrics. Each point represents a participant. The color associated with each participant is arbitrary but is consistent across figures. The metrics were averaged per series first, and then per participant. Cluster ratio = number of clusters/number of words.

Fig. 3. Relationship Between ASRS Score and Exploration. Scatter plot illustrating the relationship between the ASRS score and the mean asymmetry between categories in the f2 condition. Each dot represents a participant. The color associated with each participant is arbitrary but is consistent across figures.

Fig. 4. Visualization of the Mediation Analysis. Representation of the relationships investigated in the last analysis (Section 3.4.3). The regression coefficients between each pair of measures are reported for illustration purposes.