Introduction

Since the publication of the seminal paper by Daneman and Carpenter (1980), it has been well established that individual differences in working memory capacity, as measured by complex span tasks, are related to individual differences in a wide range of cognitive tasks (see, e.g., Conway et al., 2003; Engle, 2002). To account for this relationship, the executive attention theory of working memory has become the dominant view (Engle & Kane, 2004). Within this framework, working memory comprises domain-specific components responsible for the maintenance of information and a domain-general component responsible for the allocation of attention. In 2018, Engle updated his theory, postulating that working memory capacity and general fluid intelligence play converse but complementary roles in the maintenance of information: working memory capacity “reflects an ability to maintain information in the maelstrom of divergent thought” (p. 192), whereas fluid intelligence represents “the ability to think of something that may be important at the moment, but when it shortly proves to be unimportant or wrong, to disengage or unbind that information and to functionally forget it” (p. 192). Importantly, he positions attentional control as the ability to flexibly allocate either resource. According to this view, the relationship between working memory capacity and performance on various cognitive tasks would mainly be due to the allocation of attentional control. In this context, Kane et al. (2006) argued that since executive attention plays a role in most tasks involving attention control, individual differences in working memory capacity should be related to individual differences in search tasks. However, so far, this view has received limited support (Poole & Kane, 2009).

While many attention tasks have been investigated in the context of working memory capacity, search tasks have received relatively little attention and the impact of working memory capacity appears limited (but see Sobel et al., 2007). Kane et al. (2006) conducted the first series of experiments addressing this issue. In their first experiment, high- and low-working memory capacity participants searched visual displays for the target letter F presented among distractors composed of either Os or Es. As is usually found, participants were faster at detecting the letter F among Os than among Es. Participants were also faster when the search set was smaller and the slope relating the search set and response time was steeper for Es than for Os. Importantly, there was no difference between high- and low-working memory capacity participants and working memory capacity did not interact with any factor. The only impact of working memory capacity was on omissions: Low-span participants missed slightly more Fs than high-span participants. These results were replicated in two further experiments. Kane et al. concluded that if there is an effect of working memory capacity on visual search efficiency it must be quite small, and it is certainly much smaller than the effects of working memory capacity seen in many other tasks requiring attentional control. This interpretation was further qualified by Sobel et al. (2007), who instead concluded that rather than the relationship between search and working memory capacity being small, it might only manifest under specific conditions. Sobel et al. (2007) used a conjunction search task similar to the one used by Kane et al. (2006) to explore this question. However, building upon ideas from Bacon and Egeth (1994), Kane and Engle (2003), and Sobel and Cave (2002), they modified their task in several ways to increase the likelihood of and the benefit from top-down executive control. 
Only in the condition with the highest requirement for top-down control did they find a significant search time difference between high- and low-working memory capacity groups.

In 2009, in a further attempt to uncover a relationship between working memory capacity and visual search, Poole and Kane conducted a series of experiments in which participants were required to selectively constrain their attention to some potential target locations. More specifically, a 5 × 5 invisible grid was used. Each trial began with the presentation of a fixation display in which all 25 possible locations were occupied by a dot. The target locations were indicated by surrounding the critical dots with a square. The fixation display was presented for 300, 1,500, or 1,550 ms and was immediately replaced by the search display, which remained on the screen until the participants made their response. Participants were asked to decide whether an F or a backward F was presented in one of those critical locations. Other locations could be populated by distractors or by dots. Results revealed that high-working memory capacity participants identified the target faster than low-working memory capacity participants only when the non-target locations were populated with distractors and when the fixation display was presented for 1,500 or 1,550 ms. Also exploring the relationship between search and working memory capacity, Luria and Vogel (2011) reported that the electrophysiological indices associated with working memory capacity allocation – specifically the Contralateral Delay Activity (CDA; see also Emrich et al., 2009) – predicted search behavior (but see Williams & Drew, 2021). However, working memory capacity only significantly correlated with search behavior in one of three experiments: specifically, when comparing Easy with Medium search, but not when comparing Easy with Difficult search, nor when all three conditions were intermixed.

Although interesting, Poole and Kane’s (2009) and Luria and Vogel’s (2011) results based on individual differences only revealed a limited relationship between working memory capacity and visual search. This situation is in sharp contrast with studies that used an experimental manipulation (dual task) to impose a load on visual working memory that regularly impairs the efficiency of the search process. For instance, Woodman and Luck (2004) asked their participants to maintain two locations in memory for a location change detection task. In the dual task condition, during the retention interval of the location change detection task, participants performed a visual search task in which they viewed arrays of four, eight, or 12 squares each with a gap on one side. Participants were informed that only one square had a gap at either the top or the bottom and they had to decide as fast as possible which one was presented. Results showed that the search function was steeper in the dual task condition and that performance at the location change detection task decreased as a function of set size (but see Emrich et al., 2010). Kane et al. (2006) shone light on these distinctions while comparing their results with those from dual task experiments, and concluded their paper by mentioning that:

It thus remains a mystery why dual-task studies suggest WMC [Working Memory Capacity] to be important to search efficiency, whereas the individual differences studies we report suggest WMC to be largely irrelevant to prototypical laboratory tests of inefficient search. Perhaps future work that combines experimental manipulations of WMC with naturally occurring individual differences in WMC will help to unravel the mystery. (p.773)

In the present experiment, we accepted the challenge by relating working memory capacity to performance at a letter search task performed while reading for comprehension. Searching for letters in a text is well suited to achieve this goal as targets are surrounded by distractors and the difficulty of the search task naturally varies as a function of reading materials (Klein & Saint-Aubin, 2016). More specifically, it is well known that performance on the search task is influenced by reading processes. In effect, when participants search for a target letter in a text written in an unfamiliar language (viz., with little to no concurrent reading demands), their detection accuracy is high and it does not vary as a function of the words’ characteristics (see, e.g., Tao & Healy, 2002). However, when participants can read in the language of the text, they miss more target letters when targets are embedded in frequent function words than in less frequent content words (see Klein & Saint-Aubin, 2016, for an overview, and Roy-Charland et al., 2007, for an explanation wherein the timing of attentional disengagement during the dual task of reading while searching plays a critical role). This well-replicated phenomenon, known as the missing-letter effect, has been extensively investigated over the last five decades as a window on the cognitive processes involved in reading. Similarly, it has been found that the efficiency of the search process varies as a function of the target characteristics. For instance, the letter o is easier to find than the letter r (Saint-Aubin & Poirier, 1997), letters are easier to detect when they are embedded in a word located at the end of a line (Smith & Groat, 1979), and the initial letter of a word is easier to detect than the other word letters (Guérard et al., 2012).

The letter search task in reading is a particularly well suited dual task paradigm for investigating the involvement of executive attentional control, because both tasks are performed concurrently. In the auditory modality, Colflesh and Conway (2007) nicely demonstrated the involvement of executive attention in an auditory search task. Participants performed a modified version of the cocktail party procedure in which they were told their name would be presented in the irrelevant channel and were asked to detect it while shadowing the other message. Results revealed that 67% of participants with a high working memory span detected their name, while only 35% of those with a low working memory span detected it. Interestingly, the reverse pattern occurred when participants were not instructed to attend the irrelevant channel, whereby low-working memory capacity individuals reported hearing their name more often than high-working memory capacity individuals (Conway et al., 2001), suggesting instead that complex span tasks may be indexing the efficiency of the distribution of task-relevant attention. If visual search performance is similar to auditory search, then in the current study high-span participants should be more accurate than low-span participants when searching while reading.

In two experiments, we presented the texts on a computer screen using a rapid serial visual presentation (RSVP) procedure in which words or letter strings appeared one at a time at the center of the screen for a fixed duration (250 ms). For present purposes, this procedure has a variety of benefits. When used for a reading task, it controls for individual differences in reading speed and eye movement patterns. Moreover, RSVP allows good comprehension of prose and the presence of the typical missing-letter effect (see, e.g., Saint-Aubin & Klein, 2004). Indeed, it has been shown in this paradigm that readers fixate the words for their entire presentation duration (Saint-Aubin et al., 2010). This is an important control, because working memory capacity influences eye movements in normal reading. For instance, participants with higher working memory scores produce longer saccades (Luke et al., 2018). In addition, because the appearance of each letter string is controlled by the experimenter, reaction times (RTs) can be measured relative to the onset of any individual stimulus of interest, rather than relative to the onset of the entire array, as is the case with a traditional text. Finally, unlike typical visual search tasks in which a display remains present until the target is found (or not), in RSVP each presented array, which may or may not contain a target, is replaced by the next one. This minimizes the possible contribution of endogenous disengagement of attention because the subsequent item is likely to do this exogenously. To anticipate our main finding, a robust effect of working memory capacity upon search performance was found, but only when this RSVP paradigm entailed the dual-task demands of searching for target letters while reading for comprehension. This finding converges with the proposal of Sobel et al. (2007) that the requirement for top-down control mechanisms mediates (or may be a prerequisite for observing) effects of working memory capacity upon search performance.

Experiment 1

In Experiment 1 we manipulated the syntactic structure of the RSVP stimuli in order to vary the degree to which reading could play a dual-tasking role in tandem with searching. There were four conditions: Des text in full prose, the same passage with word order scrambled, the same “words” but with the letters in each word scrambled to create non-words, and PourCour text in full prose. In the full-prose conditions participants could read for comprehension while performing the letter search task. Two distinct full-prose passages were used to increase the generalizability of our results, as the two texts (Des and PourCour) differed in (a) their likeness to natural speech/text, and (b) their relative frequencies of specific target word types (see Method below for further explanation). In the scrambled word condition, each word might be read, but syntactic processing and normal comprehension of the passage would be obviated by the random order of the words. Finally, in the scrambled letter condition, the only task would be searching for the target letters.

If working memory capacity is simply influencing search performance, Omissions and RTs should decrease as OSPAN increases, regardless of text condition. However, if working memory capacity is instead influencing the efficiency of attentional control required when participants are performing the dual task of reading while searching, then the relationship between our dependent variables and OSPAN should decrease as the syntactical structure of the search stream is reduced. As per the guidance of Simmons et al. (2012), we declare that we have reported how we determined our sample size, all data exclusions, all manipulations, and all measures in the study. This study was not preregistered.

Method

Participants

A total of 100 students from Université de Moncton volunteered to take part in the experiment. Sample size was not determined in advance; rather, as many participants as possible were run given the constraints of the academic term. All participants were native French speakers. In considering statistical power, we followed the recommendation of Brysbaert and Stevens (2018) to include at least 1,600 observations per repeated-measures condition in a linear mixed model combining subject and stimulus analyses. With 100 participants × 200 critical word observations in each of the three versions of the Des text, we obtain 60,000 total observations. With 100 participants × 32 critical word observations in the PourCour text, we obtain 3,200 total observations.
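
The observation counts above are simple products, and the sketch below reproduces them. The per-cell breakdown (participants × occurrences of one word type in one text version) is only an illustrative reading of "per repeated-measures condition" and is not a computation stated in the text. The word counts are those given in the description of the reading materials (100 des and 100 control content words per Des version; 16 pour and 16 cour in PourCour). All names in the code are hypothetical.

```python
# Recommended minimum from Brysbaert and Stevens (2018), as cited in the text.
MIN_OBS_PER_CONDITION = 1_600


def observation_counts(n_participants, versions, n_function, n_content):
    """Return (total observations, observations per word-type cell).

    A "cell" here is one word type within one text version; treating it as
    the repeated-measures condition is an assumption made for illustration.
    """
    per_cell = {
        "function": n_participants * n_function,
        "content": n_participants * n_content,
    }
    total = n_participants * (n_function + n_content) * versions
    return total, per_cell


# Design values taken from the Method section.
des_total, des_cells = observation_counts(100, versions=3, n_function=100, n_content=100)
pc_total, pc_cells = observation_counts(100, versions=1, n_function=16, n_content=16)

for label, total, cells in [("Des", des_total, des_cells), ("PourCour", pc_total, pc_cells)]:
    meets = all(n >= MIN_OBS_PER_CONDITION for n in cells.values())
    print(label, total, cells, "meets minimum per cell:", meets)
```

Under this reading, each Des cell holds 10,000 observations, whereas each PourCour cell holds 1,600, exactly the recommended minimum.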

Material and stimuli

Operation span task

The operation span task used in this study was adapted from that used by Colflesh and Conway (2007). In this task, participants completed 15 trials, each consisting of two to six elements. In total, there were three trials at each number of elements. An element was composed of a mathematical operation followed by a word (e.g., IS (7 * 3) + 4 = 26? DANGER). Participants were asked to read aloud the equation, say if it was true or false, and then say the word aloud. Words were all two syllables long with an average frequency of 116 occurrences per million (Lexique 3: New et al., 2001). Furthermore, care was taken to ensure that words within a trial never rhymed with one another. Once the word was said, the experimenter wrote down the given answer to the mathematical equation, and initiated the presentation of the next element. Following the presentation of the last element, three question marks appeared at the center of the screen and the participant was required to write down the words that had been presented in their presentation order, beginning with the first word. The number of lines on the answer sheet matched the number of presented elements. The participant was not allowed to turn the page to see the next answer sheet because the number of elements varied randomly from trial to trial.

Reading task

Two different texts were used, one longer than the other. The first (longer) text – hereafter the Des text – was 2,620 words long and contained 100 occurrences of the French plural indefinite article des, with a frequency count of 10,625 occurrences per million, and 100 instances of three-letter control content words beginning with the target letter d, with an average frequency count of 644 occurrences per million (New et al., 2001). The 100 occurrences of control content words were composed of eight different words (don [donation or gift], dit [says], dis [say], dos [back], duo [duet], dur [hard], due [due to] and duc [duke]) for which there were between six and 24 occurrences in the text. In addition, the target letter d was embedded in 100 noncritical words varying from two to 14 letters in length. The noncritical words could assume a function role like the French prepositions de and dans or a content role like the nouns médecin [physician], fraude [fraud] and divertissement [entertainment]. In the noncritical words, the target letter could be located at any position and was not necessarily the first letter. From the participants’ viewpoint, all ds are the target and they were not aware that some words were critical while others were not. The second (shorter) text was the PourCour text used by Saint-Aubin et al. (2003) in their fifth experiment. The text comprised 808 words and contained 16 instances each of the preposition pour [for] and of the noun cour [yard], with frequency counts of 6,198 and 150 occurrences per million, respectively. The target letter r was also embedded in 55 noncritical words varying in length from three to 12 letters. The noncritical words could assume a function role like the French prepositions sur [on] or a content role like dire [to say], rose [pink], and services [services]. 
Both texts were constructed using the following constraints: (a) each word containing the target letter, whether it was critical or not, was separated from the previous and the following target-containing words by at least four filler words without the target letter; (b) the critical words were not included in the first or last sentences of the text; and (c) they were never adjacent to a punctuation mark. The Des text contained multiple instances of the same function word (des, a plural indefinite article) and fewer presentations of several different content words. This situation is typical of normal speech and normal text in which there is a high rate of repetition of a few function words and a low rate of repetition of multiple content words. However, ultimately this confounded word function with frequency of occurrences within the text. To address this confound, the PourCour text in contrast contained only one function word (pour, a preposition meaning “for”) and only one content word (cour, a noun meaning “yard”), each repeated multiple times. The PourCour text was less natural than the Des text, but this compromise was overcome by having the target function and the target content words repeat equally often. Moreover, using two different types of passage served to increase the generalizability of the results.

From the Des text described above, a Word Scramble and a Letter Scramble version were also derived. In the Word Scramble condition, the same words from the Des text were displayed using the same RSVP procedure, except the words from each sentence were in a random order. Words containing target letters were separated by at least four words not containing target letters. In the Letter Scramble condition, words of the Des text were first scrambled as in the Word Scramble condition. Then, letters within the words were scrambled semi-randomly to form non-words; semi-randomly in that the target letter always appeared as the first letter in the character string (see Footnote 1). This ensured consistency with the letter positioning for critical words in the Des Prose and Word Scramble conditions. Letters in non-critical words were scrambled randomly, with capitalization preserved from the Des Prose/Word Scramble conditions (i.e., capital letters could appear in the middle of non-target words, but target words were never affected because they were never capitalized in the prose passages). These character strings were then presented using the same RSVP procedure as in all other conditions.
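
The letter-scrambling rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used to build the stimuli: the function name and its parameters are hypothetical, and the only "non-word" safeguard is that the output differs from the input whenever that is possible (no lexicon check is performed).

```python
import random


def scramble_letters(word, target, critical, rng):
    """Scramble the letters of `word`.

    For critical words, the target letter stays in first position and only
    the remaining letters are shuffled. For all other words, every letter
    (including any capital) is shuffled freely.
    """
    if critical:
        if not word.startswith(target):
            raise ValueError("critical words are assumed to start with the target letter")
        head, tail = word[0], list(word[1:])
    else:
        head, tail = "", list(word)

    original_tail = list(tail)
    rng.shuffle(tail)
    # If the shuffle happened to return the original order, rotate by one
    # position; this changes the string whenever the tail has two or more
    # distinct letters.
    if tail == original_tail and len(set(tail)) > 1:
        tail = tail[1:] + tail[:1]
    return head + "".join(tail)


rng = random.Random(2024)  # seeded only to make the illustration repeatable
print(scramble_letters("duo", target="d", critical=True, rng=rng))
print(scramble_letters("fraude", target="d", critical=False, rng=rng))
```

For a three-letter critical word such as duo, the only admissible outcome under this rule is dou, because the d is pinned to the first position and the string must change.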

Procedure

Participants took part individually in one session lasting approximately 45 min. The experiment was run with E-Prime (Schneider, Eschman, & Zuccolotto, 2003) on a PC with a resolution of 1,024 × 768 pixels. For all participants, the operation span task was administered first, followed by the letter-scrambled Des passage, the random word order Des passage, the normal order Des Prose passage, and the PourCour text (see Fig. 1). A fixed presentation order was used because variability due to individual differences in the order of administration is undesirable in a correlational study aimed at assessing the relationship between individual differences in working memory capacity and the missing-letter effect. The conditions were presented in ascending order of syntactical structure for the first three of four conditions (with conditions three and four being of equivalent syntactical structure) to prevent participants from trying to understand meaningless passages based on their knowledge of the original text. An RSVP procedure was used. With this procedure, each word or letter string appeared at the center of the screen in Times New Roman 22 pt for 250 ms. The first word of a sentence was presented with a capital letter. The presentation of each of the three conditions of the Des text was followed by four multiple-choice comprehension questions and the PourCour text was followed by five multiple-choice questions, wherein each comprehension question contained four possible answers. The comprehension questions were presented on paper. During the reading task, participants pressed the spacebar to signal the presence of a target letter. They were asked to press the spacebar as quickly as possible each time they detected a target letter and to respond carefully, because both their speed and their accuracy would be scored. They were further instructed to read for comprehension because they would be required to answer multiple-choice questions after each text. 
They were warned that it was equally important to read for comprehension, even if it was difficult in some conditions, and to look for the target letter.

Fig. 1

To illustrate the rapid serial visual presentation (RSVP) methodology used in both experiments, nine consecutive stimuli have been excerpted from each of four lists. Each stimulus (letter string) from each of the four lists was presented at fixation for 250 ms with no gap between stimuli. Target stimuli are bolded and slightly enlarged here for illustrative purposes, but were presented to participants in normal type. In Experiment 1 participants experienced the four conditions illustrated here in the order from left to right: scrambled letters, scrambled words, prose passage Des, and prose passage PourCour. In Experiment 2 only the prose passages were used (see text for further explanation).

Results

The partial-credit unit scoring criterion was used to compute the working memory span of each participant (Colflesh & Conway, 2007; Conway et al., 2005). With this criterion, for each item, the score represents the ratio of the number of correctly remembered stimuli to the total number of stimuli presented. Thus, if the participant correctly remembered two of four words, a score of .5 is attributed to that item. The total score is simply the mean of the proportional scores for the 15 items. Overall, participants had a mean partial-credit unit score of .63 (SD = .14), and both the skewness (0.274) and the kurtosis (2.033) of the distribution were consistent with normality according to a Jarque-Bera normality test: JB = 5.148, p = 0.08.
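
Both computations in this paragraph can be sketched in a few lines. The example is illustrative only and rests on assumptions not stated in the text: the function names are hypothetical, the count of correctly remembered words per trial is taken as given (the rule for deciding that a word is correct is not modeled), the reported kurtosis is assumed to be raw rather than excess kurtosis, and the sample size entering the test is assumed to be the 100 participants.

```python
import math


def partial_credit_unit_score(trials):
    """Mean of per-trial proportions; `trials` holds (n_correct, n_presented) pairs."""
    ratios = [n_correct / n_presented for n_correct, n_presented in trials]
    return sum(ratios) / len(ratios)


def jarque_bera(n, skewness, kurtosis):
    """Jarque-Bera statistic from sample skewness and raw (non-excess) kurtosis.

    JB = n/6 * (S^2 + (K - 3)^2 / 4). Under normality JB is asymptotically
    chi-square with 2 degrees of freedom, whose upper-tail probability is
    exp(-JB / 2).
    """
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


# The worked case from the text: two of four words remembered on one item.
print(partial_credit_unit_score([(2, 4)]))

# Plugging in the rounded descriptive statistics reported above.
jb, p = jarque_bera(n=100, skewness=0.274, kurtosis=2.033)
print(round(jb, 2), round(p, 2))
```

Because the skewness and kurtosis are entered as rounded values, the recomputed statistic can differ from the reported JB in the third decimal place.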

Text comprehension

Mean comprehension scores for the four text conditions are represented in Fig. 2. As expected, performance in the two scrambled conditions did not differ from chance. However, when there was syntactical structure to the text, participants demonstrated a reasonable degree of comprehension.

Fig. 2

Mean proportion correct for comprehension questions pertaining to each text condition in Experiment 1 (dashed line = chance performance). Error bars represent 95% confidence intervals, calculated from the overall standard error across text conditions

During peer review (and in a foreshadowing of our results to come), one reviewer suggested that participants with high OSPAN scores may be balancing the reading and searching tasks differently, hypothesizing that high OSPAN individuals may be trading off performance on the reading task in order to prioritize the letter search task. We ran a post hoc analysis to test the prediction that there could be a negative relationship between comprehension score and OSPAN. We constrained this analysis to each of the two full-prose texts (Des Prose and PourCour; see Fig. 3), since responding was at chance levels in the two other text conditions. A generalized linear mixed effects model (GLMER; lme4 R package; Bates et al., 2015) was run to predict correct comprehension responses from individual OSPAN scores. OSPAN was treated as a fixed effect, with a correlated-multivariate random effect of subject on the intercept. AICs (Akaike Information Criterion) were computed via the drop1 method in the “stats” package in R (Version 4.0.2). Effect sizes for parameter estimates are reported as bootstrapped 95% confidence intervals (CIs), generated via confint. For the Des Prose condition, there was no support for the trading hypothesis, as there was a positive relationship between OSPAN and Comprehension, b = 1.154, 95% CI = [-0.15, 2.85], albeit with equivalent support for the model with the effect term included (AIC = 531) and the model with the term dropped (AIC = 531). For the PourCour text, there was also no support for the trading hypothesis, as again there was a positive relationship between OSPAN and Comprehension, b = 1.386, 95% CI = [0.00, 3.31], with more support for the model with the effect term included (AIC = 511) than when the term was dropped (AIC = 513).

Fig. 3

Performance on comprehension questions as a function of OSPAN in the two full-prose conditions in Experiment 1 (chance performance = 0.25). Shaded areas represent 95% confidence intervals for the slopes of the fitted lines

Letter search task

Errors are presented first, followed by RTs. A target letter was considered detected if a response occurred within 1,250 ms of the onset of the target-containing word. This criterion (which we have used successfully before, e.g., see Saint-Aubin et al., 2010) was selected because with a presentation rate of 250 ms per word and a criterion of at least four words without the target letter between two words with the target letter, it is the longest possible interval that can apply to all words. As in previous studies, only omissions were analyzed because with an RSVP procedure, it is impossible to attribute a false alarm to a specific non-critical word or to a specific word class (see, e.g., Saint-Aubin & Klein, 2001; Saint-Aubin et al., 2003). For example, as shown in Fig. 1, a response occurring 120 ms after the onset of the word “noble” in the fourth panel could have been a false alarm of 370 ms to the preceding word “cette” or a false alarm of 620 ms to the preceding word “dans” or a false alarm of 870 ms to the preceding word “employés” or even a false alarm of 1,120 ms to the preceding word “ses” because none of these words contain the target letter “r,” but one could not discern this given the nature of the stimulus presentation. As such, we categorized any response within our 1,250-ms criterion of a target-containing word as a hit, and categorized any target-containing word that did not have a corresponding response within that window as an omission.
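
The scoring rule and the false-alarm ambiguity described above can be sketched as follows. This is a minimal illustration rather than the analysis script: the function names are hypothetical, the window is treated as half-open (a response exactly 1,250 ms after onset is not counted), and the first response inside the window is taken as the hit; none of these details is specified in the text.

```python
from bisect import bisect_left

SOA_MS = 250       # each word is shown for 250 ms with no gap (from the text)
MIN_FILLERS = 4    # at least four non-target words between target words
WINDOW_MS = SOA_MS * (MIN_FILLERS + 1)  # 5 x 250 = 1,250 ms


def score_targets(target_onsets, response_times, window=WINDOW_MS):
    """Return one (hit, rt) pair per target onset; rt is None for omissions.

    All times are in ms from the start of the stream.
    """
    responses = sorted(response_times)
    scored = []
    for onset in target_onsets:
        i = bisect_left(responses, onset)  # first response at or after onset
        if i < len(responses) and responses[i] - onset < window:
            scored.append((True, responses[i] - onset))
        else:
            scored.append((False, None))
    return scored


def candidate_false_alarm_latencies(rt_from_current_onset, n_back=4, soa=SOA_MS):
    """Latencies the same keypress would have relative to each preceding word."""
    return [rt_from_current_onset + k * soa for k in range(1, n_back + 1)]


# Three target words at the minimum spacing; the second one is missed.
print(score_targets([0, 1250, 2500], [480, 2600]))

# The example from the text: a keypress 120 ms after the onset of "noble".
print(candidate_false_alarm_latencies(120))
```

The second call returns the four latencies listed in the text (370, 620, 870, and 1,120 ms), which is why a stray keypress cannot be assigned to a single non-target word.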

Following the recommendations of Jaeger (2008) and Dixon (2008), generalized linear mixed effects models were used to examine the relationship between predictor variables – Text, Word Type, and OSPAN – and outcome variables – Omission Rate and Correct RT. Each predictor was treated as a fixed effect, with a correlated-multivariate random effect of subject on the intercept and each predictor variable. The most complex model was run first, with AICs computed via the drop1 method in the “stats” package in R (Version 4.0.2). Effect sizes for parameter estimates are reported as bootstrapped 95% CIs, generated via confint. We have shared all data files generated by our experimental paradigm and the R-scripts we produced for analysis on a project page hosted by the Open Science Framework under the project name of “Individual Differences in Working Memory Capacity and Visual Search While Reading” (https://osf.io/4ctn3/).

We analyzed the main manipulation assessing the influence of reading for comprehension on the key outcome measures, including only the conditions that were generated from the original Des text: Des Prose, Word Scramble, and Letter Scramble. Models included the following predictors: Text (Des Prose, Word Scramble, Letter Scramble), Word Type (Function, Content), and OSPAN (Continuous). Performance on the PourCour text was tested separately, and only examined for effects on Omissions due to the lower trial count relative to the Des text conditions. Models included the following predictors: Word Type (Function, Content), and OSPAN (Continuous).

Omissions

As shown in Fig. 4, omissions increased as a function of syntactic structure. The fewest omissions were found in the Letter Scramble text, followed by a slight increase in the Word Scramble text, and the most omissions were found in the Des Prose condition. The effect of Word Type was severely reduced in the Word Scramble condition as most of the syntactic structure of the text was disturbed, and the effect completely vanished in the Letter Scramble condition in which reading was impossible. Additionally, OSPAN did not influence omission rate in either of the disturbed-syntax text conditions.

Fig. 4

Omissions of target letters for each word type (Function and Content) in the three conditions rendered from the Des text (Letter Scramble, Word Scramble, and Des Prose) in Experiment 1, plotted as a function of OSPAN. Shaded areas represent 95% confidence intervals for the slopes of the fitted lines

When examining the influence of the predictors on omissions (Fig. 4), there was no evidence to support the three-way interaction, OSPAN × Word Type × (Des Prose - Letter Scramble): b = -0.209, 95% CI = [-0.637, 0.177], OSPAN × Word Type × (Word Scramble - Letter Scramble): b = 0.013, 95% CI = [-0.361, 0.432], with stronger support for the model with the interaction term dropped (AIC = 71293) than when the term was included (AIC = 71296).

To evaluate the two-way interactions, we contrasted the model with all two-way interaction terms included (AIC = 71293) with models where each term was dropped. The model performed worse (ΔAIC = +171) when dropping the two-way interaction term between Text and Word Type (Fig. 5), (Des Prose - Letter Scramble) × Word Type: b = 0.327, 95% CI = [0.235, 0.435], (Word Scramble - Letter Scramble) × Word Type: b = 0.599, 95% CI = [0.498, 0.685]. In addition, the model performed worse (ΔAIC = +9) when dropping the two-way interaction term between Text and OSPAN, (Des Prose - Letter Scramble) × OSPAN, b = -1.230, 95% CI = [-1.822, -0.576], (Word Scramble - Letter Scramble) × OSPAN, b = 0.307, 95% CI = [-0.188, 0.712]. The model performed slightly better (ΔAIC = -1) when dropping the two-way interaction term between Word Type and OSPAN, b = 0.246, 95% CI = [-0.125, 0.664].
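
The drop-one comparisons reported in this section reduce to a difference between two AIC values. The sketch below makes the sign convention explicit; it is inferred from the wording ("performed worse (ΔAIC = +171) when dropping"), so ΔAIC is taken here as the AIC of the reduced model minus the AIC of the model containing the term. The function names are hypothetical, and no threshold for a meaningful difference is assumed because none is stated in the text. The AIC pairs are those reported above for the three-way interaction and for the comprehension models.

```python
def delta_aic(aic_with_term, aic_term_dropped):
    """Positive values mean the model gets worse when the term is dropped."""
    return aic_term_dropped - aic_with_term


def preferred_model(aic_with_term, aic_term_dropped):
    """Lower AIC is preferred; equal AICs are reported as a tie."""
    d = delta_aic(aic_with_term, aic_term_dropped)
    if d > 0:
        return "keep term"
    if d < 0:
        return "drop term"
    return "tie"


# (AIC with term, AIC with term dropped), as reported in the text.
comparisons = {
    "OSPAN x Word Type x Text (omissions)": (71296, 71293),
    "OSPAN -> comprehension (Des Prose)": (531, 531),
    "OSPAN -> comprehension (PourCour)": (511, 513),
}

for label, (with_term, dropped) in comparisons.items():
    print(label, delta_aic(with_term, dropped), preferred_model(with_term, dropped))
```

Read this way, the three-way interaction yields ΔAIC = -3 (the simpler model is preferred), the Des Prose comprehension model is a tie, and the PourCour comprehension model yields ΔAIC = +2 in favor of keeping OSPAN.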

Fig. 5. Missing letter effect in Omissions (Function – Content) in the three text conditions rendered from the Des text (Letter Scramble, Word Scramble, and Des Prose) in Experiment 1. Error bars represent FLSDs, where the mean point-estimate differs from any value not captured within the error bars.

To evaluate the main effects, we contrasted the model with all main effect terms included (AIC = 71473) with models where each term was dropped. The model performed worse (ΔAIC = +115) when dropping the main effect of Text, (Des Prose - Letter Scramble), b = 0.991, 95% CI = [0.824, 1.15], (Word Scramble - Letter Scramble), b = 0.396, 95% CI = [0.318, 0.474]. The model also performed worse (ΔAIC = +54) when dropping the main effect of Word Type, b = 0.206, 95% CI = [0.152, 0.258]. The model performed slightly better (ΔAIC = -1) when dropping the main effect of OSPAN, b = -0.467, 95% CI = [-0.971, 0.131].

In the PourCour text, when examining the influence of the predictors on omissions (Fig. 6, top left), there was ambiguous evidence in support of the two-way interaction, OSPAN × Word Type: b = 0.900, 95% CI = [-0.279, 2.157], with equivalent support for the model with the interaction term dropped (AIC = 4090) as when the term was included (AIC = 4090).

Fig. 6. Omissions of target letters for each word type (Function and Content) in the full-prose texts (PourCour and Des Prose) plotted as a function of OSPAN (Top = Experiment 1; Bottom = Experiment 2). Shaded areas represent 95% confidence intervals for the slopes of the fitted lines.

To evaluate the main effects, we contrasted the model with both main effect terms included (AIC = 4090) with models where each term was dropped. The model performed worse (ΔAIC = +17) when dropping the main effect of Word Type, b = 0.402, 95% CI = [0.249, 0.567]. The model also performed worse (ΔAIC = +2) when dropping the main effect of OSPAN, b = -1.390, 95% CI = [-2.633, -0.008].

Reaction time

As shown in Fig. 7, RT increased as a function of syntactic structure. The fastest RTs were found in the Letter Scramble text, followed by a slight increase in the Word Scramble text, and the slowest responses were found in the Des Prose condition. Moreover, the interaction between Word Type and OSPAN was present in the Des Prose condition (also presented in Fig. 8, top), but was not present in either of the syntactically disturbed text conditions. Statistical modelling of performance is presented next.

Fig. 7. Reaction time (RT) to target letters for each word type (Function and Content) in the three conditions rendered from the Des text (Letter Scramble, Word Scramble, and Des Prose Passage) in Experiment 1, plotted as a function of OSPAN. Shaded areas represent 95% confidence intervals for the slopes of the fitted lines.

Fig. 8. Reaction time (RT) to target letters for each word type (Function and Content) in the Des Prose text plotted as a function of OSPAN (Top = Experiment 1; Bottom = Experiment 2). Shaded areas represent 95% confidence intervals for the slopes of the fitted lines.

Models were fit to log(Correct RT) based on subjective inspection of normality of the RT distributions; however, the untransformed data are presented in the figures. All subsequent RT analyses are treated in the same manner. When examining the influence of the predictors on log(Correct RT), there was no evidence to support the three-way interaction, OSPAN × Word Type × (Des Prose - Letter Scramble): b = 0.078, 95% CI = [0.006, 0.182], OSPAN × Word Type × (Word Scramble - Letter Scramble): b = 0.021, 95% CI = [-0.054, 0.104], with equivalent support for the model with the interaction term dropped (AIC = -6956) as when the term was included (AIC = -6956).

To evaluate the two-way interactions, we contrasted the model with all two-way interaction terms included (AIC = -6956) with models where each term was dropped. The model performed worse (ΔAIC = +27) when dropping the two-way interaction term for Text and Word Type, (Des Prose - Letter Scramble) × Word Type: b = 0.027, 95% CI = [0.014, 0.041], (Word Scramble - Letter Scramble) × Word Type: b = -0.002, 95% CI = [-0.013, 0.008]. The model also performed worse (ΔAIC = +2) when dropping the interaction term for Word Type and OSPAN, b = 0.034, 95% CI = [0.011, 0.071]. The model performed better (ΔAIC = -3) when dropping the two-way interaction between Text and OSPAN, (Des Prose - Letter Scramble) × OSPAN, b = -0.054, 95% CI = [-0.174, 0.069], (Word Scramble - Letter Scramble) × OSPAN, b = 0.003, 95% CI = [-0.062, 0.095].

To evaluate the main effects, we contrasted the model with all main effect terms included (AIC = -6930) with models where each term was dropped. The model performed worse (ΔAIC = +141) when dropping the main effect of Text, (Des Prose - Letter Scramble), b = 0.139, 95% CI = [0.123, 0.158], (Word Scramble - Letter Scramble), b = 0.048, 95% CI = [0.036, 0.061]. The model also performed worse (ΔAIC = +9) when dropping the main effect of Word Type, b = -0.008, 95% CI = [-0.014, -0.004]. The model performed better (ΔAIC = -2) when dropping the main effect of OSPAN, b = -0.020, 95% CI = [-0.127, 0.104].

Discussion

The outcomes in Experiment 1 support the hypothesis that working memory capacity influences the efficiency of searching in a dual-task context but not searching per se. The results show that OSPAN strongly predicts performance when there are high dual-task demands (i.e., reading full prose while searching for targets), more weakly predicts performance when there are reduced dual-task demands (i.e., reading random words while searching for targets), and does not predict performance at all when there are no dual-task demands (i.e., scanning non-words while searching for targets). If working memory capacity were influencing search itself, we would have observed a relationship between OSPAN and search performance regardless of the degree of dual-task demands. Moreover, the missing-letter effect was also attenuated as a function of syntactical structure, supporting the Attentional Disengagement model of the missing-letter effect (see General discussion for further explanation).

The effect of working memory capacity on visual search efficiency in a dual-task context has important theoretical implications. However, before addressing the implications for executive attention theory of working memory, it is essential to establish that this phenomenon is indeed reproducible (Simons, 2014). On this point, the presence of a large missing-letter effect with an RSVP procedure and prose passages extends previous findings (see, e.g., Newman et al., 2013; Saint-Aubin & Klein, 2001; Saint-Aubin et al., 2003, 2010) and further shows that the missing-letter effect is not a by-product of eye movements as assumed by some models (Corcoran, 1966; Hadley & Healy, 1991). In addition, our patterns are consistent with numerous findings reported previously in the literature demonstrating the impact of text structure on omission rate. Healy (1976) first reported this pattern, finding the most target letter omissions on full prose, fewer omissions in scrambled words, and the fewest omissions in scrambled letters. Drewnowski and Healy (1977) and Read (1983) both reported fewer omissions for scrambled words relative to full prose. Assink et al. (2003) found more target letter omissions for full prose relative to a less coherent text with specific function and content words swapped – a manipulation that also attenuated the missing-letter effect. Additionally, Newman et al. (2013) found higher omission rates for full prose than for scrambled letter strings using an RSVP procedure, and also found the same abolishment of the missing-letter effect when the syntactical structure of text was removed.

Experiment 2

The results from Experiment 1 support the hypothesis that working memory capacity is associated with individual differences in the ability to manage dual-task demands. However, there are empirical and theoretical motivations to replicate the dual-task conditions in a more powerful sample. Replicating the full prose passage conditions is empirically motivated due to the minor inconsistency in the relationship between OSPAN and the missing-letter effect across the short (PourCour) and long (Des) full-prose conditions. In the short prose passage, there was ambiguous statistical evidence supporting the interaction between Word Type and OSPAN; however, there was no indication that these factors were interacting in the longer passage. Theoretically, resolving this discrepancy is important in the development of the Attentional Disengagement model for the missing-letter effect – in particular advancing our understanding of how individual differences that implicate search may in turn affect the MLE. Again, as per the guidance of Simmons et al. (2012), we declare that we have reported how we determined our sample size, all data exclusions, all manipulations, and all measures in the study.

Method

Participants

A total of 172 students from Université de Moncton volunteered to take part in the experiment. Sample size was not declared in advance; rather, as many participants as possible were run given the constraints of the academic term. All participants were native French speakers. In considering statistical power, we again followed the recommendation of Brysbaert and Stevens (2018) to include at least 1,600 observations per repeated-measures condition in a linear mixed model combining subject and stimulus analyses. With 172 participants × 200 critical word observations in the Des text, we get 34,400 total observations. With 172 participants × 32 critical word observations in the PourCour text, we get 5,504 total observations.

Materials, stimuli, and procedure

As in Experiment 1, we began this experiment by administering the Operation Span task. This was followed by the two prose passages (Des Prose, PourCour, in that order) that were administered in Experiment 1. The Des Prose text and the PourCour text were followed by ten and five multiple-choice questions, respectively, each with four possible answers. Aside from not administering the scrambled word and scrambled letter conditions, all other methods were the same as for Experiment 1.

Results

Overall, participants had a mean partial credit unit score of .61 (SD = .14). Both the skewness (0.250) and the kurtosis (2.498) of the distribution were consistent with normality according to a Jarque-Bera normality test: JB = 3.603, p = 0.17.
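As a rough check on the normality test, the Jarque-Bera statistic can be recomputed from the reported skewness and kurtosis using the standard formula JB = n/6 × (S² + (K − 3)²/4), with the p value taken from a chi-square distribution with 2 degrees of freedom. The sketch below assumes that all 172 participants contributed an OSPAN score and that the reported kurtosis is the raw (non-excess) kurtosis. Because the inputs are rounded, the result should only approximate the reported JB = 3.603 and p = 0.17.

```python
import math


def jarque_bera(n, skewness, kurtosis):
    """Return (JB, p) with JB = n/6 * (S^2 + (K - 3)^2 / 4).

    Under normality, JB is asymptotically chi-square distributed with
    2 degrees of freedom, whose upper-tail probability is exp(-JB / 2).
    """
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


# n = 172 is an assumption (the full Experiment 2 sample); skewness and
# kurtosis are the rounded values reported in the text.
jb, p = jarque_bera(n=172, skewness=0.250, kurtosis=2.498)
print(f"JB = {jb:.3f}, p = {p:.3f}")
```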

Text comprehension

As in Experiment 1, we again ran a post hoc analysis to test the prediction that there should be a negative relationship between comprehension score and OSPAN. Generalized linear mixed effects models were run to predict correct comprehension responses from individual OSPAN scores. OSPAN was treated as a fixed effect, with a correlated-multivariate random effect of subject on the intercept. AICs were computed via the drop1 method in the “stats” package in R (Version 4.0.2). Effect sizes for parameter estimates are reported as bootstrapped 95% CIs, generated via confint. For the Des Prose text (Fig. 9, left), there was no support for the trading hypothesis, as there was a positive relationship between OSPAN and Comprehension, b = 0.942, 95% CI = [0.11, 1.71], with more support for the model with the effect term included (AIC = 2379) than when the term was dropped (AIC = 2383). For the PourCour text (Fig. 9, right), there was ambiguous support for the trading hypothesis. There was a slight negative relationship between OSPAN and Comprehension, b = -0.140, 95% CI = [-1.10, 1.03], but less support for the model with the effect term included (AIC = 1154) than when the term was dropped (AIC = 1152; see Footnote 2).

Fig. 9. Performance on comprehension questions as a function of OSPAN in the two full-prose conditions in Experiment 2 (chance performance = 0.25). Shaded areas represent 95% confidence intervals for the slopes of the fitted lines.

Letter search task

As in Experiment 1, generalized linear mixed effects models were used to examine the relationship between the predictor variables and outcome variables. Models assessing omissions included the following predictors: Text (Des Text, PourCour), Word Type (Function, Content), and OSPAN (Continuous), whereas Correct RT was only assessed in the Des Text (as in Experiment 1).

Omissions

As shown in Fig. 6, the patterns from Experiment 1 were replicated in this new sample, where the effects of Word Type and OSPAN were present in the full-prose conditions. For both texts, participants miss the target letter more frequently when it is embedded in a function than in a content word. Furthermore, omission rate decreases as OSPAN scores increase. However, the size of the missing-letter effect – that is the difference between omission rate for function and content words – does not vary as a function of OSPAN scores. These observations are supported by the following statistical results.

When examining the influence of the predictors on Omissions, the model included the following predictors: Text (Des Text, PourCour), Word Type (Function, Content), and OSPAN (Continuous). There was no evidence to support the three-way interaction, OSPAN × Word Type × Text, b = 0.537, 95% CI = [-0.087, 1.263], with equivalent support for the model with the interaction term dropped (AIC = 49017) as when the term was included (AIC = 49017).

To evaluate the two-way interactions, we contrasted the model with all two-way interaction terms included (AIC = 49017) with models where each term was dropped. The model performed worse (ΔAIC = +21) when dropping the two-way interaction term between Text and Word Type, b = 0.308, 95% CI = [0.145, 0.439], wherein the MLE was slightly larger in the Des text (Fig. 10). The model performed slightly better (ΔAIC = -2) when dropping the two-way interaction term between Text and OSPAN, b = 0.050, 95% CI = [-0.723, 0.842], and slightly better (ΔAIC = -1) when dropping the two-way interaction term between Word Type and OSPAN, b = 0.283, 95% CI = [-0.222, 0.813].

Fig. 10. Missing letter effect in Omissions (Function - Content) in the two full-prose texts (Des Prose and PourCour) in Experiment 1 (top) and Experiment 2 (bottom). Error bars represent FLSDs, where the mean point-estimate differs from any value not captured within the error bars.

The model including all main effects (AIC = 49035) outperformed any model with a main effect term excluded: Text, b = -0.264, 95% CI = [-0.392, -0.129], ΔAIC = +14, Word Type, b = 0.619, 95% CI = [0.538, 0.691], ΔAIC = +144, and OSPAN, b = -1.186, 95% CI = [-2.01, -0.630], ΔAIC = +7.
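To give a sense of scale for these coefficients, the sketch below converts them to odds ratios. It assumes that the omission models are binomial with a logit link, so that each b is a change in the log-odds of an omission. The link function and the reference levels of the factors are assumptions here, and the OSPAN coefficient is assumed to be per unit of the partial credit score. Exponentiating the endpoints of a bootstrapped interval gives the matching interval on the odds scale.

```python
import math


def odds_ratio(b, ci):
    """Convert a log-odds coefficient and its CI endpoints to the odds-ratio scale."""
    low, high = ci
    return math.exp(b), (math.exp(low), math.exp(high))


# Main effects on omissions in Experiment 2, as reported in the text.
# Interpreting them as log-odds assumes a logit link (not confirmed here).
effects = {
    "Text": (-0.264, (-0.392, -0.129)),
    "Word Type": (0.619, (0.538, 0.691)),
    "OSPAN": (-1.186, (-2.01, -0.630)),
}

for name, (b, ci) in effects.items():
    ratio, (ratio_low, ratio_high) = odds_ratio(b, ci)
    print(f"{name}: OR = {ratio:.2f}, 95% CI = [{ratio_low:.2f}, {ratio_high:.2f}]")
```

Under these assumptions, the Word Type coefficient corresponds to nearly doubled odds of an omission between the two word types, and the negative OSPAN coefficient corresponds to an odds ratio below 1.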

Reaction time

RT analysis (Fig. 8, bottom) was again constrained to only the Des text, due to the small number of trials in the PourCour text. Speed of detection for Function words was largely unaffected by OSPAN, whereas those higher in OSPAN were faster than those lower in OSPAN to detect Content words. Interpreted differently, there was a missing-letter effect in RT for higher OSPAN individuals that was not present for lower OSPAN individuals.

When examining the influence of the predictors on log(Correct RT), the model included the following predictors: Word Type (Function, Content) and OSPAN (Continuous). There was evidence to support the two-way interaction between OSPAN and Word Type, b = 0.058, 95% CI = [-0.007, 0.106], with more support for the model with the interaction term included (AIC = -2698) than for the model without it (AIC = -2696; see Footnote 3).

To evaluate the main effects, we contrasted the model with all main effect terms included (AIC = -2696) with models where each term was dropped. The model performed worse (ΔAIC = +23) when dropping the main effect of Word Type, b = 0.022, 95% CI = [0.015, 0.029]. Model performance was unchanged (ΔAIC = 0) when dropping the main effect of OSPAN, b = -0.071, 95% CI = [-0.174, 0.013].

Discussion

Two important outcomes emerged from the current experiment. First, it is promising that the effect of working memory capacity on detection accuracy with the reading task has been successfully replicated in two experiments. In addition, we extended previous results by showing that, as found in the auditory domain by Colflesh and Conway (2007), working memory capacity can be related to the efficiency of the visual search beyond the conditions found earlier (Poole & Kane, 2009).

General discussion

Current results have theoretical implications for both theories of the missing-letter effect and theories of individual differences in attentional control. According to the Attentional Disengagement (AD) model of the missing-letter effect, performance in the visual search task would reflect the deployment of attention in the reading task (Roy-Charland et al., 2007). More specifically, it is assumed that reading is under the control of an attentional beam serially attending each word. When attention is engaged on a word in which the target letter is embedded, information about the presence of the visual target accumulates. Importantly, as soon as attention is disengaged from the word, this information begins to decay. Normally, an isolable search system repeatedly and regularly checks the accumulating information for the presence of the target and the probability of detecting the target on each of these checks depends on the momentary strength of the representation of the target. The missing-letter effect, and the standard pattern of target RT – with longer RTs for function than for content words – can be explained by this model if it is assumed that readers disengage their attention more rapidly from function than content words. The most commonly accepted reasons for this early disengagement are that function words are more predictable, more frequent and provide less information about the meaning of the text (Koriat & Greenberg, 1994). In the current study, function words were more frequent than content words to reproduce the typical situation in reading and in the missing-letter effect literature, and to maximize the size of the missing-letter effect. However, it is worth mentioning that although the most common procedure is to contrast frequent function words with less frequent content words, the contribution of word frequency and word function has been isolated in some studies (e.g., Roy-Charland et al., 2007; Roy-Charland & Saint-Aubin, 2006; Saint-Aubin & Poirier, 1997).

Although the AD model was not developed to account for individual differences, we believe that the model can easily account for current findings. Within the model it is assumed that participants share their resources between the reading and the search task, in a way that allows minimal interference with reading comprehension. Previous findings in the auditory domain have shown that low-span participants are less efficient at sharing their resources between a search and a shadowing task (Colflesh & Conway, 2007). The same hypothesis can be made in the current context and would explain why low-span participants made more omissions. Moreover, if OSPAN is indexing a general capacity for sharing resources across tasks (rather than the maintenance of task-specific information), this would apply in our situation, with sharing across reading and search tasks. This could explain why there is no influence of OSPAN on performance when sharing is not needed due to the reduction or elimination of syntactical structure. Additionally, the AD model serves to explain the smaller missing-letter effect with scrambled words, and elimination of the missing-letter effect with scrambled letters. As mentioned above, Koriat and Greenberg (1994) proposed that attention is more rapidly disengaged from function words because they are more predictable, and provide less information about meaning. Since there is no meaning inherent in either the scrambled words or scrambled letters conditions, there is no relative informativeness advantage for content words as compared to function words. However, in the scrambled words condition, function words are still more frequent than content words given their relative frequencies in French (~10,625 vs. ~644 occurrences per million), whereas random letter strings ought to be equivalently (un)predictable.

In the missing-letter effect literature, it is clearly established that higher omission rates should translate into slower RTs (Roy-Charland et al., 2009; Saint-Aubin et al., 2003). Results with high-span participants are exactly as predicted with longer RTs for the function than the content words. However, low-span participants did not show the expected pattern. In order to account for results of low-span participants, we assumed that those participants were unable to share their resources between reading and searching. For low-span participants, a straightforward solution to this sharing difficulty might be to forego the aforementioned repeated checks and instead make a single check of the accumulated target strength and, importantly, to synchronize this check with the disappearance of the word on the screen. If they made a single check, they would be less likely to find the target letter than high-span participants who checked continuously. In addition, because low-span participants would have made a single check, their response latencies would be similar for function and content words.

As implied by the AD model of the missing-letter effect, current results nicely fit with the executive-attention view according to which the observed relationship between working memory capacity and visual search efficiency would largely be due to attentional control mechanisms (e.g., Engle, 2002; Engle, 2018; Poole & Kane, 2009). The AD model accounts for the attenuation of the missing-letter effect with reduced syntactical structure by presuming the relative difference in both predictability and meaning between function and content words is reduced as syntactical structure is reduced. The present results address the mystery (quoted in the Introduction) first highlighted by Kane et al. (2006), and support the conclusion that in the context of the executive attention theory of working memory, individual differences in working memory capacity would afford better attentional control between the dual tasks of reading and searching. This accounts for why working memory capacity predicts letter search performance in the context of, but not in the absence of, syntactical structure. A similar conclusion was reached by Sobel et al. (2007) based on their finding that an influence of working memory capacity upon search performance depended on a high level of top-down control.

Future work exploring these relationships in the context of Engle’s (2018) model may want to consider individual differences in general fluid intelligence. If in fact fluid intelligence reflects the efficiency with which one assesses dynamic information, then perhaps those high in Gf would also search more efficiently. Moreover, as reported by Shipstead et al. (2016) and Martin et al. (2020), individuals low in Gf re-retrieve and recheck previously tested hypotheses, such as repeating items in a free recall list. If these consequences extend to search behavior, such as an increased proclivity to re-inspect previously rejected targets/locations, they may have adverse effects on the efficiency of the search process when context affords re-inspection (i.e., not in RSVP).

It is worth mentioning that current results also fit well within the time-based resource-sharing theory (Barrouillet et al., 2008). According to this theory, information decays when attention is switched away and frequent refreshes by attentional focus are needed. It is beyond dispute that the dual task of reading and searching for a target letter requires attentional switches between reading and searching. Future work is needed to test models accounting for the influence of working memory capacity on visual search. The current study clearly established that working memory capacity is related to visual search beyond the conditions found in previous studies: the present findings support the conclusion that working memory capacity is associated with attentional control, specifically the ability to manage multiple tasks concurrently. These findings resolve much ambiguity in our understanding of the relationship between working memory and search behavior, where previous work has suggested working memory may (Colflesh & Conway, 2007; Woodman & Luck, 2004) or may not (Kane et al., 2006; Luria & Vogel, 2011; Poole & Kane, 2009) be closely related to search.