Real-time Functional Architecture of Visual Word Recognition

Abstract Despite a century of research into visual word recognition, basic questions remain unresolved about the functional architecture of the process that maps visual inputs from orthographic analysis onto lexical form and meaning and about the units of analysis in terms of which these processes are conducted. Here we use magnetoencephalography, supported by a masked priming behavioral study, to address these questions using contrasting sets of simple (walk), complex (swimmer), and pseudo-complex (corner) forms. Early analyses of orthographic structure, detectable in bilateral posterior temporal regions within a 150–230 msec time frame, are shown to segment the visual input into linguistic substrings (words and morphemes) that trigger lexical access in left middle temporal locations from 300 msec. These are primarily feedforward processes and are not initially constrained by lexical-level variables. Lexical constraints become significant from 390 msec, in both simple and complex words, with increased processing of pseudowords and pseudo-complex forms. These results, consistent with morpho-orthographic models based on masked priming data, map out the real-time functional architecture of visual word recognition, establishing basic feedforward processing relationships between orthographic form, morphological structure, and lexical meaning.


INTRODUCTION
A neurocognitive account of visual word recognition, the core process underpinning human reading, needs to address two basic questions: What is the functional architecture of the recognition process, whereby visual inputs are mapped via orthographic analysis onto representations of lexical form and meaning, and what are the units of analysis (lexical or sublexical) in terms of which these processes are conducted? Despite an enormous research effort over the last 100 years, involving behavioral, neuropsychological, and neuroimaging techniques, there is no agreed answer to these questions (Frost, 2012). Although it is generally accepted that the initial analysis of visual form and orthography engages occipitotemporal cortex, most strongly on the left (e.g., Vinckier et al., 2007; Cornelissen, Tarkiainen, Helenius, & Salmelin, 2003; Cohen et al., 2000; Bentin, Mouchetant-Rostaing, Giard, Echallier, & Pernier, 1999), and that later stages of lexical access and interpretation involve middle temporal and frontotemporal regions, also primarily on the left (e.g., Lau, Phillips, & Poeppel, 2008; Halgren et al., 2002; Bentin et al., 1999), the central properties of this process remain unclear.
Here we use magnetoencephalography (MEG), in combination with MRI-based source reconstruction techniques, to delineate the specific spatiotemporal patterns of neural activity elicited by a psycholinguistically rich set of simple and complex written words and pseudowords. We aim to determine (1) under what description the outputs of orthographic analysis are mapped onto lexical-level representations and (2) what is the balance between feedforward and feedback processes in the processing relationship between orthographic and lexical analysis. In doing so, we will integrate behavioral data about the performance characteristics of the system with direct MEG-based evidence about its underlying neural dynamics.

Background
An important clue to the organization of visual word recognition comes from masked priming research over the past decade, demonstrating equally strong priming between related pairs like hunter/hunt and lexically unrelated pairs like corner/corn (e.g., Marslen-Wilson, Bozic, & Randall, 2008; Longtin & Meunier, 2005; Rastle, Davis, & New, 2004; Longtin, Segui, & Halle, 2003; Rastle, Davis, Marslen-Wilson, & Tyler, 2000). A masked prime like hunter is assumed to prime hunt because it is decomposed into the stem morpheme {hunt} and the grammatical morpheme {-er}, reflecting the meaning of the whole form hunter. The fact that significant priming is also seen for corner, where a decompositional reading as {corn} + {-er} has no relation to the meaning of the word, points to a process of automatic decomposition for any word form that contains potential morphological structure, regardless of the lexical properties of the whole form. The failure of pairs like scandal/scan to show priming highlights the morphemic basis for these effects. Although scan is a potential stem morpheme, dal is not a grammatical morpheme, and this seems to block the decomposition of scandal into {scan} + {-dal}.
This pattern of results suggests a recognition process that is dominated in its early stages by an analysis of the orthographic input into sublexical morphemic units and where a representation of the visual input in these terms is projected onto the lexical level in a strongly bottom-up manner, blind to lexical constraints (Marslen-Wilson et al., 2008; Rastle & Davis, 2008). This morpho-orthographic approach is not, however, fully supported either behaviorally (e.g., Diependaele, Sandra, & Grainger, 2009; Feldman, OʼConnor, & del Prado Martín, 2009) or in neuroimaging studies of visual word recognition, where support can be found for contrasting morphosemantic (or interactive) approaches, which claim that early orthographic analysis is modulated by top-down lexical and semantic constraints. Within the neuroimaging domain, we focus on studies using EEG or MEG, because it is only these time-sensitive methods that can resolve the specific temporal ordering of different types of analysis during visual word recognition and thus discriminate directly between different proposals for the real-time functional architecture of the recognition system. Recent research based on these techniques falls broadly into two main classes. Several studies, stimulated by the masked priming results, ask whether there is electrophysiological evidence for early sensitivity to the morphological content of visual word forms, independent of lexical constraints. Working primarily with sets of morphologically complex and pseudo-complex word forms, masked priming has been combined with both EEG (e.g., Morris, Grainger, & Holcomb, 2008; Lavric, Clapp, & Rastle, 2007) and MEG (Lehtonen, Monahan, & Poeppel, 2011), whereas a further set of studies have used unprimed lexical decision tasks (e.g., Lavric, Elchlepp, & Rastle, 2012; Lewis, Solomyak, & Marantz, 2011; Zwieg & Pylkkänen, 2009).
Taken as a whole, these and similar studies provide evidence for sensitivity to potential morphological structure, where complex and pseudo-complex forms like farmer and corner initially group together relative to orthographic controls like scandal, consistent with a morpho-orthographic view where these processes are not lexically driven. The spatiotemporal distribution of these effects is quite diverse, both in terms of hemispheric involvement (right and/or left) and posterior/anterior location and in terms of timing, with early effects (150-250 msec) seen in some studies (e.g., Lavric et al., 2012; Zwieg & Pylkkänen, 2009) and later effects (350-500 msec) in others (e.g., Lavric et al., 2007; Dominguez, de Vega, & Barber, 2004).
A different set of MEG and EEG studies focus instead on the earliness with which lexical and semantic effects can be detected. These studies use unprimed lexical decision tasks and contrast morphologically simple words (nouns and verbs like help and gold) with matched pseudowords (e.g., Hauk, Coutout, Holden, & Chen, 2012; Hauk, Davis, Ford, Pulvermüller, & Marslen-Wilson, 2006; Assadollahi & Pulvermüller, 2003). Early lexical effects, although small relative to later N400 time frames, have been reported in a range of posterior and middle temporal sites. Hauk et al. (2012), for example, report word-pseudoword differences for the time period 180-220 msec in left anterior middle and inferior temporal lobes, whereas Shtyrov, Goryainova, Tugin, Ossadtchi, and Shestakova (2013) observe even earlier lexicality effects (at around 100 msec) in an EEG study using MMN techniques.
It is hard to determine, however, what the implications of these results are for the functional organization of the word recognition process. This is partly because the stimulus materials used rarely overlap across the two research strands, with the lexically oriented work, for example, generally not including morphologically complex material. This means that there is little direct evidence, under conditions where early lexical effects are detected, as to whether these serve to modulate candidate morphological decompositions, so that, for example, the segmentation of corner into {corn} + {-er} is inhibited.
A further issue is the use of overt response tasks in combination with EEG and MEG, which all the studies cited above have in common. Behavioral research into the dynamics of language function requires the use of these tasks to provide information about underlying cognitive processes. There are several concerns, however, that argue against their use in the neuroimaging context. The most salient of these is the evidence that such tasks can modulate the actual process under investigation, through attentional tuning of neuronal computations relevant to the task requirements, even at early stages of the cortical analysis of sensory inputs (e.g., Zanto, Rubens, Thangavel, & Gazzaley, 2011; van Atteveldt, Formisano, Goebel, & Blomert, 2007). This raises the possibility that early effects seen in the EEG or MEG studies are induced by the ubiquitous experimental task. Such concerns are compounded when a priming task is used, especially in masked priming, where prime and target overlap closely in time. Under such conditions, it is hard to assign neural effects separately to the properties of the prime, the target, or to interactions between them at different levels of visual analysis.
We address these issues in the current study by (a) ensuring that evidence about the timing of lexical and morphological effects can be linked within the same experiment to evidence about the spatiotemporal organization of word recognition more generally; (b) presenting the materials in a simple viewing paradigm, reducing the likelihood that the experimental situation will induce attentional tuning of specific aspects of the input analysis process; and (c) conducting a separate behavioral masked priming study so that the functional properties of critical stimulus materials can inform the analysis of the MEG responses evoked by a parallel set of stimuli.

Experimental Considerations
This experiment explores the dynamic roles of morphological, lexical, and semantic variables in the mapping between prelexical orthographic processing and semantically sensitive lexical analysis. To define the spatiotemporal coordinates of these twin poles of the word recognition process (for these stimulus sets and these participants in this specific experimental context), we contrast morphologically simple words (e.g., corn), pseudowords (e.g., frum), and length-matched consonant strings (e.g., wvkp). These simple forms, derived from the complex words and pseudowords (e.g., corner, frumish) used elsewhere in the experiment, establish the anchor points of the recognition process using items that come from orthographic neighborhoods matched across the main experimental conditions. Contrasts between words and pseudowords versus consonant strings (e.g., Cohen et al., 2000) should capture early orthographic effects in occipitotemporal cortex, differentiating word-like forms from random letter strings. The same sets of simple words and pseudowords allow us to locate the other pole of the processing continuum, testing for lexicality effects in a word versus pseudoword contrast. These are likely to be seen later in the access process, possibly in the N400 time frame (e.g., Lau et al., 2008), with differential responses in left-lateralized middle and anterior temporal regions.
To evaluate the properties and timing of the intervening processes that link orthographic analysis to lexical representation, we present complex and pseudo-complex stimuli that vary in morphological and lexical status. The morphological dimension, varying the presence or absence of stems and affixes in potentially complex forms, asks whether the mapping from orthographic analysis onto lexical form and meaning is in terms of morphemic (or pseudo-morphemic) units (cf. Vinckier et al., 2007). Because simple words in English are always also morphemes, this can only be tested by using complex forms that can pull apart the lexical and morphemic properties of a given word form-whether they are made up of potential stems and affixes, as in farmer or brother, or whether they combine an existing affix (e.g., {-ish}) with a pseudo-stem, as in blemish, or an existing stem (e.g., {scan}) with a pseudoaffix, as in scandal. From a lexical point of view, forms like brother, blemish, and scandal are monomorphemic and nondecompositional and should be treated differently from genuinely complex forms like farmer. From a morpho-orthographic perspective, the form brother, analyzable into the potential stem and affix pair {broth} + {-er}, should behave differently from blemish and scandal in the early stages of lexical access but similarly to farmer.
It is necessary here to treat derivational morphology (the main focus of masked priming research) separately from inflectional morphology, which involves word forms like played that contain a stem and an inflectional affix (the past tense {-ed}). Regular inflectional morphology is systematic and transparent and does not change the meaning of the stem. Inflected forms are argued to be processed and represented decompositionally, relying on a left-lateralized frontotemporal network (Marslen-Wilson & Tyler, 2007). In contrast, derivational morphemes change the meaning and often the grammatical category of the stem, with a much less predictable relationship between stem and whole form and where emerging neuroimaging evidence suggests that these may not be represented decompositionally (Bozic, Tyler, Su, Wingfield, & Marslen-Wilson, 2013). Morphological structure of both types should be parsed before lexical access based on the presence of a stem and affix, but with potentially different lexical outcomes.
For derivational morphology, we contrast a set of potentially complex words (see Table 1) that have a stem and affix (e.g., farmer), a stem but no affix (e.g., scandal), an affix but no stem (e.g., blemish), and neither stem nor affix (e.g., biscuit). These contrasts test whether initial morphological decomposition depends on the presence of both a stem and an affix. Against the same backdrop of morphologically simple forms (biscuit), we can also evaluate the pattern of effects for inflected words like blinked. All forms containing a potential stem and an affix should trigger morphological segmentation, in contrast to stem-only forms like scandal (which do not elicit a masked priming effect) and simple forms like biscuit. The effectiveness of an embedded affix (blemish) in triggering morphological decomposition has not been tested in masked priming, although research with spoken words shows that the presence of a potential inflectional affix does trigger decompositional processes (Tyler, Stamatakis, Post, Randall, & Marslen-Wilson, 2005). An inflectional pseudoword condition, where the affix is attached to a nonexistent stem (e.g., bected), tests for this possibility in the visual domain, matched by a further pseudoword set with derivational affixes (e.g., frumish).
The second critical dimension of lexical status tests competing claims for the degree of autonomy of the early stages of visual analysis and lexical access. This dimension contrasts semantically transparent forms like farmer with opaque pseudo-complex pairs like corner. On a morpho-orthographic account, no difference should be found between these forms before lexical access, but they should begin to diverge once access to the meaning of the whole form is in progress. To mirror the derivational contrasts, we also include semantically transparent and opaque inflectional conditions. Because the inflectional equivalents of corner words are rare in English (i.e., ending in -ed without being an adjective or past-tense form), we use inflected pseudowords analogous to those tested by Longtin and Meunier (2005) in French. Nouns that do not function as verbs (e.g., ash) were used as the stem, resulting in an interpretable but nonexistent pseudoword like ashed. Both semantically transparent inflected forms (blinked) and inflected pseudowords (ashed) should generate early decompositional processing, as well as significant masked priming.
In summary, this study aims to specify the functional architecture of visual word recognition by tracking the patterns of neural activity that underlie processing of morphologically simple and complex words in English. It asks three main questions: (i) is the early output of orthographic analysis structured into morphemic units, (ii) is there a distinct processing phase at which potential morphological structure is identified independent of lexical constraints, and (iii) what is the timing with which these processes are influenced by lexical-level constraints? To relate the MEG results directly to the behavioral evidence for morpho-orthographic processing, we will run a separate masked priming study on parallel sets of complex and pseudo-complex materials. Finally, as noted earlier, participants are tested in a simple word viewing situation, accompanied by an occasional recognition task to reinforce sustained attention to the stimuli.

MEG Participants
Sixteen participants (nine women) took part in the MEG experiment. All were right-handed native British English speakers between the ages of 18 and 35 (mean age of 25) with normal hearing, normal or corrected-to-normal vision, and no history of neurological disease, who gave written consent to take part and were paid for their time.

Stimuli and Design
In each of the nine MEG test conditions (see Table 1), 50 words were selected, which contrasted the presence of different morphological features. Four conditions contained a potential derivational affix. Three of these were real word conditions: semantically transparent (farmer), pseudo-derived (corner), and pseudo-affix (blemish), plus a pseudoword condition (frumish), where the stem was not a real word. Three conditions contained a potential past-tense inflectional {-ed} affix: semantically transparent (blinked) and pseudo-inflected (ashed) words, paired with a pseudoword condition (bected) where the stem is not a real word. The pseudo-inflected items (ashed) contained an embedded word that is only used as a noun in English, creating a pseudoword that could be segmented into an existing stem and an existing affix but which was not itself an existing word. The stems chosen for this condition appeared in the Celex English database (Baayen, Piepenbrock, & Gulikers, 1995) only as a noun, and no instances (or a single instance only) of use as a verb were found in the British National Corpus (www.natcorp.ox.ac.uk/). Two baseline conditions were included that did not contain a potential affix: pseudostem (scandal) and no stem/no affix (biscuit).
Participants also saw the 450 embedded stems and pseudostems (or first syllables for words without embedded stems) extracted from these complex forms. These were accompanied by 160 strings of random consonants, matched to the length of the target items (both stems and whole forms), and varying in length from three to nine letters. These were included both as a general length-matched baseline and to allow specific contrasts with word and pseudoword stimuli to select out regions sensitive to orthographic structure. For use as test items in the recognition task, a set of 50 filler items (words, pseudowords, and consonant strings) were also presented. An additional 10 filler items were included as dummy items at the beginning of each block. The total number of stimuli in the study was 1120 items.
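As a quick arithmetic check, the item counts given above can be tallied as follows. This is only a sketch: the dictionary keys are illustrative labels, and it assumes that the 10 dummy items are counted once across the whole experiment, which is the reading that reproduces the stated total.

```python
# Tally of MEG stimulus counts as stated in the text.
# Key names are illustrative labels, not terms from the study.
counts = {
    "complex_and_pseudo_complex_forms": 9 * 50,  # nine conditions, 50 items each
    "stems_and_pseudostems": 450,                # extracted from the complex forms
    "consonant_strings": 160,
    "recognition_fillers": 50,
    "dummy_items": 10,                           # assumed to be 10 in total
}

total = sum(counts.values())
print(total)  # should match the 1120 items reported in the text
```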
For all conditions where an embedded stem was present, pairs of items were presented to native English speakers who rated the semantic relatedness between the two words (e.g., corner/corn) on a scale of 1-7 (unrelated to highly related). Test items selected for the morphologically transparent condition were rated as 6.5 or above. For the pseudo-derived and pseudostem conditions, test items were rated as 3.5 or below (weakly related). Items for the nine conditions were selected using the Celex database and the conditions were matched (Table 2) on whole form and stem length, percentage of orthographic overlap between stem and whole form (where applicable), and frequency of the whole form and embedded stem (where applicable).

Behavioral Study
To provide a bridge between masked priming research and the current study, we ran an initial set of stimuli (40 words per condition) in a conventional masked priming task, using a prime-target SOA of 40 msec with whole forms as primes (farmer) and stem forms as targets (farm). This study was conducted to determine what pattern of priming effects we would see for the particular combination of derivations, inflections, real words, and pseudowords chosen for this research (as listed in Table 1). Previous studies had not included all of these conditions in a single stimulus set, and no study in English (as far as we are aware) has used pseudowords like "ashed" and "bected" as primes. Behavioral evidence about which combinations of real and pseudo-stems and affixes do or do not show priming is an essential input to the MEG study and its analysis.
We tested 29 new participants (none of whom took part in the MEG study), all right-handed native British English speakers between the ages of 18 and 34 (mean age of 24). Each trial began with a set of hash marks as a premask, which appeared in the center of the screen for 500 msec. This was followed by the prime in the same location in lowercase letters for 40 msec, which was itself immediately followed by the target in uppercase letters. The experiment was run in a sound-proof, dimly lit room on a PC-compatible microcomputer running DMDX software (Forster & Forster, 2003). Trial order was pseudorandomized online by DMDX, with two items from each condition appearing in each scrambling block (one related prime and one unrelated prime from each condition). Outliers (RTs over 1500 msec) were discarded, accounting for 0.8% of the data.

MEG Procedure
For the MEG study, the stimulus materials (see Tables 1 and 2) were based on those used in the masked priming study. To enhance the signal-to-noise ratio in the MEG environment, 10 additional stimuli were added to each condition, increasing the number of items to 50 per condition (listed in Appendix 1). The stimuli were randomly assigned to one of two blocks, each further divided into five subblocks with the constraint that each subblock contained five items from each condition and 16 consonant strings (for a total of 106 items per subblock). Whole forms and their equivalent stem forms (e.g., farmer and farm) were placed in separate blocks and always appeared in corresponding subblocks across the experiment (i.e., farmer in Subblock 1 of Block 1 and farm in Subblock 1 of Block 2). The order of the two blocks was alternated for each participant, so that presentation of the whole form and equivalent stem (farmer and farm) were alternated, with the stem appearing first for half of the participants. The order of the five subblocks was rotated across participants in a cyclical order (e.g., subblock order 1-2-3-4-5 for Participant 1, and 2-3-4-5-1 for Participant 2). This preserved the order of the subblocks so that the repeated stem and whole form were always separated by the same number of subblocks, with a mean distance of 559 trials (range of 448-670 trials). Trial order was randomized within each subblock using E-Prime 1.0 software (Psychology Software Tools, Inc.). Each trial began with a fixation cross in the middle of the screen for 500 msec to direct the attention of the participant to the appropriate location on the screen. This was followed immediately by presentation of the stimulus for 100 msec, centered at the same location. The short presentation prevented participants from making saccades.
A blank screen was then presented for 1.4-1.6 sec, jittered randomly for each trial, before the next stimulus appeared. At the end of a subblock, a screen appeared asking if the letter string indicated had been seen in that subblock. Participants were instructed to make a response within 3000 msec using the button boxes. Ten items were used in each recognition task over the 10 subblocks for a total of 100 items (50 old/50 new). Each subblock was separated by a break at the completion of the recognition task, and participants could control the length of each break.
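The cyclical subblock ordering described in the procedure can be illustrated with a short sketch. The function name and the 0-based participant index are assumptions made for illustration; the text specifies only the example orders for Participants 1 and 2.

```python
def subblock_order(participant_index, n_subblocks=5):
    """Return the 1-based subblock order for a 0-based participant index.

    Each successive participant starts one subblock later, and the
    sequence wraps around, so the relative order of subblocks is preserved.
    """
    shift = participant_index % n_subblocks
    base = list(range(1, n_subblocks + 1))
    return base[shift:] + base[:shift]


# Participant 1 (index 0) and Participant 2 (index 1), as in the text.
print(subblock_order(0))
print(subblock_order(1))
```

Because the rotation never changes which subblock follows which, a whole form and its stem in corresponding subblocks of the two blocks stay the same number of subblocks apart for every participant.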
Participants sat in a dimly lit magnetically shielded room (IMEDCO AG, Switzerland), viewing items as they were presented on a screen at eye level. All stimuli were displayed in bold Arial font in black letters on a light gray background. Participants received spoken and written instructions about the task and were given 10 practice items. They were instructed to read the items silently but not to articulate or make any movements. Because each subblock contained approximately 100 items, participants were instructed not to attempt to memorize the items but to simply attend to them. Participants did not make button presses during blocks of trials but used two button boxes (one in each hand) to perform the recognition task at the end of each subblock. The experiment was run using E-Prime 1.0 and lasted approximately 45 min.

MEG Acquisition
MEG data were continuously acquired at a sampling rate of 1000 Hz (passband 0.01-300 Hz), with triggers placed at the onset of each stimulus. Neuromagnetic signals were recorded continuously with a 306-channel Vectorview MEG system (Elekta Neuromag, Helsinki, Finland). Before recording, four electromagnetic coils were positioned on the head and digitized using the Polhemus Isotrak digital tracker system (Polhemus, Colchester, VT) with respect to three standard anatomical landmarks (nasion, left and right preauricular points). During the recording, the position of the magnetic coils was tracked using continuous head position identification, providing information on the exact head position within the MEG dewar for later movement correction. Four EOG electrodes were placed laterally to each eye and above and below the left eye to monitor horizontal and vertical eye movements.

MEG Preprocessing
Continuous raw data were preprocessed offline with the MaxFilter (Elekta Neuromag) implementation of the signal-space separation technique with a temporal extension (Taulu & Simola, 2006). Averaging was performed using the MNE Suite (Athinoula A. Martinos Center for Biomedical Imaging). Epochs containing gradiometer, magnetometer, or EOG peak-to-peak amplitudes larger than 3000 fT/cm, 6500 fT, or 200 μV, respectively, were rejected. Trials were averaged by condition with epochs generated from −100 to 500 msec from onset of the target word. Averaged data were baseline corrected using −100 to 0 msec interval and low-pass filtered at 45 Hz. For sensor-level analyses, MEG data were transformed to the head position coordinates of the participant with the median head position within the helmet to minimize transformation distance.
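The rejection and baseline steps described above can be sketched in plain Python. This is not the MNE Suite implementation: the data layout, function names, and the choice to exclude the 0 msec sample from the baseline are all assumptions made for illustration. Only the three thresholds and the −100 to 0 msec baseline interval come from the text.

```python
# Peak-to-peak rejection thresholds quoted in the text.
THRESHOLDS = {
    "grad": 3000.0,  # gradiometers, fT/cm
    "mag": 6500.0,   # magnetometers, fT
    "eog": 200.0,    # EOG, microvolts
}


def peak_to_peak(samples):
    """Peak-to-peak amplitude of one channel within one epoch."""
    return max(samples) - min(samples)


def is_rejected(epoch):
    """Return True if any channel exceeds the threshold for its type.

    `epoch` is a hypothetical layout: a dict mapping channel type
    ("grad", "mag", "eog") to a list of per-channel sample lists.
    """
    return any(
        peak_to_peak(channel) > THRESHOLDS[ch_type]
        for ch_type, channels in epoch.items()
        for channel in channels
    )


def baseline_correct(samples, times_ms, start=-100, stop=0):
    """Subtract the mean of the prestimulus interval from every sample.

    Assumes at least one sample falls in [start, stop); whether the
    0 msec sample belongs to the baseline is an assumption here.
    """
    baseline = [s for s, t in zip(samples, times_ms) if start <= t < stop]
    offset = sum(baseline) / len(baseline)
    return [s - offset for s in samples]


# Toy values, for illustration only.
epoch = {"grad": [[0.0, 3500.0]], "mag": [[0.0, 100.0]], "eog": [[0.0, 50.0]]}
print(is_rejected(epoch))
print(baseline_correct([1.0, 3.0, 10.0, 12.0], [-100, -50, 0, 50]))
```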

Sensor-level Analyses
These analyses were conducted on gradiometers and magnetometers separately using the SensorSPM analysis method implemented in SPM5 (www.fil.ion.ucl.ac.uk/spm/). Magnetometer data were used as such, whereas for each pair of gradiometer channels, a vector sum was calculated that reconstructed the field gradient from its two orthogonal components and its amplitude (computed as a square root of the sum of squared amplitudes in the two channels). For each participant and condition, a series of F tests were performed on a three-dimensional topography (2-D sensors by time image), which extended through 601 samples (1 msec each), allowing for the application of random field theory as in fMRI analysis (Kiebel & Friston, 2004). The 3-D images were thresholded at a voxel level of p < .005 and corrected for cluster size at p < .05. These clusters could extend in space (distributed across the topography) and in time. This made it possible to compare conditions across every sensor over the entire time window while still correcting on a whole-brain basis for multiple comparisons. This procedure eschews any preselection of time windows of interest and provides a data-driven selection process, which is not restricted to specific peaks found through visual inspection of the data.
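The gradiometer amplitude described above (the square root of the sum of squared amplitudes in the two channels of a pair) can be written as a minimal sketch. The function name and the list-based input are illustrative assumptions, not part of the SPM5 pipeline.

```python
import math


def gradiometer_pair_amplitude(grad_a, grad_b):
    """Combine two orthogonal planar gradiometer traces sample by sample.

    Returns sqrt(a**2 + b**2) for each time point, i.e., the amplitude
    of the vector sum of the two orthogonal gradient components.
    """
    return [math.hypot(a, b) for a, b in zip(grad_a, grad_b)]


# Toy values, for illustration only.
print(gradiometer_pair_amplitude([3.0, 0.0], [4.0, 2.0]))
```

Note that the combined amplitude is non-negative by construction, so it discards the sign of the underlying field gradient.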

Source Estimation
MP-RAGE T1-weighted structural images with a 1 × 1 × 1 mm voxel size were acquired on a 3-T Trio Siemens scanner for each participant, which were used for reconstruction of the cortical surface using FreeSurfer (Athinoula A. Martinos Center for Biomedical Imaging). The L2 minimum-norm estimation (Hämäläinen & Ilmoniemi, 1994) technique was applied for source reconstruction as implemented in the MNE Suite. An individual MRI-based one-layer boundary element model (BEM) was created for each participant and was used to compute the forward solutions. An average cortical solution, containing 10,242 dipoles per hemisphere, was created from the 16 participants, and data from individual participants were morphed to this cortical surface in 10-msec time steps. ROIs were defined from FreeSurfer anatomical ROIs, with the exception of the large temporal and fusiform ROIs, which were subdivided into anterior, middle, and posterior regions. ROIs were defined on the average cortical surface, and for each participant, the mean value for all dipoles within each region was extracted for statistical analysis. The source-level analyses, using repeated-measures ANOVAs on the participant means within a given ROI, were restricted to the time windows where significant effects (after correction for multiple comparisons) were found in the sensor analyses. The results are visualized on the inflated cortical surface of the average participant.

Recognition Task Results
For the recognition task, mean accuracy was at 65% and did not vary significantly between words, pseudowords, and consonant strings, with accuracy at 66%, 62%, and 66%, respectively. Performance was assessed statistically using signal detection theory to test the discriminability index (d′) against 0 using a paired t test, where d′ = 0 would represent no difference between signal and noise. Discriminability was significantly greater than 0 (d′ = .98, p < .0001), suggesting participants were reliably attending to the items.
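For readers unfamiliar with the discriminability index, the sketch below uses the standard signal detection formula, d′ = z(hit rate) − z(false alarm rate). The text does not say how d′ was computed for each participant, so the formula choice, the function name, and the example rates are assumptions for illustration rather than the study's procedure or data.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Standard d-prime: z(hit rate) minus z(false alarm rate).

    Both rates must lie strictly between 0 and 1; rates of exactly
    0 or 1 need a correction that is not applied in this sketch.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


# Hypothetical rates, not taken from the study.
print(d_prime(0.5, 0.5))  # equal rates: no discrimination between old and new
print(d_prime(0.7, 0.3))  # more hits than false alarms: positive d-prime
```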

MEG Results
In the results, we combine sensor-based and source-space analyses in each section. Sensor-level results are presented separately for gradiometers and magnetometers, followed by source-space analyses. The results are organized into two analysis streams, relating basic stages in the visual word recognition process to the processes that map between them. One stream focuses on the morphologically simple word and pseudoword stems, together with matched consonant strings, and the other on the morphologically complex and pseudo-complex forms. These statistically rigorous analyses, on sets of matched simple and complex materials, provide a well-controlled backdrop for evaluating how lexical, morphological, and semantic variables relate to different stages of the visual word recognition process. The Detecting Orthographic Structure and Emergence of Neural Sensitivity to Morphological Structure sections focus on the relationship between orthographic analyses and the early stages of lexical access. The Processing Lexical Identity and Lexical Effects for Morphologically Complex Words sections address the role of lexical constraints in the analysis of orthographic inputs.

Detecting Orthographic Structure
The first set of analyses contrasted words and pseudowords with consonant strings to establish spatiotemporal coordinates for effects associated with processing readable letter strings. To conduct these analyses, we used 100 pseudoword stems (frum, bect) from the (frumish) and (bected) conditions, excluding the pseudoword stems from the (blemish) and (biscuit) conditions, because these could be interpreted as the initial portion of an existing word. One hundred word stems (farm, corn) were also selected, together with 100 consonant strings, all matched in length. At the sensor level (see Figure 1A), the SensorSPM contrast of words and pseudowords against consonant strings showed significant effects emerging between 155 and 230 msec in both gradiometers and magnetometers bilaterally. In the gradiometers, the cluster was significant from 155 to 230 msec within posterior right sensors with the peak at 195 msec. In the magnetometers, significant bilateral clusters appeared in left hemisphere (LH) sensors from 170 to 230 msec, peaking at 190 msec, and in right hemisphere (RH) sensors from 175 to 220 msec, peaking at 200 msec. All of these clusters reflect a stronger response to consonant strings than to words and pseudowords. Figure 1B plots early orthographic effects for LH and RH gradiometers and magnetometers at the peak of the significant sensor-level cluster in each hemisphere. In both hemispheres, there is an initial common response to all three stimulus types, peaking at 140 msec, followed by a second peak, at around 190 msec, which differentiates consonant strings from words and pseudowords. This indicates the presence of processes that are sensitive to orthographic structure but not to the lexical properties of the strings being analyzed.
The location and timing of these orthographically sensitive processes is consistent with earlier research. Previous fMRI studies have shown increased activation for consonant strings in posterior occipital regions (e.g., Vinckier et al., 2007), indicating that visual word forms are differentially processed on the basis of their orthographic structure. Processing between 150 and 200 msec has been shown to be specific to letter strings but not yet to word-like strings (Cornelissen et al., 2003), although some studies have found effects associated with orthographic typicality as early as 100 msec (Hauk et al., 2006). Here we find that the initial component peaking at 140 msec did not differentiate between stimulus types (although we did not explicitly test for typicality).

Emergence of Neural Sensitivity to Morphological Structure
Here we examine the timing and distribution of analysis processes sensitive to the presence of cues to morphological structure. If potential stems and grammatical affixes are present in an orthographic input string, when do they start to trigger differential neural responses? We were guided here by the masked priming results. The four conditions containing complex forms with a stem and an affix (+S+A) all showed significant priming (see Table 3). These were two derivational sets ( farmer, corner) and two inflectional sets (blinked, ashed). We contrasted these with two noncomplex conditions (the scandal (+S−A) and biscuit (−S−A) sets), neither of which elicited priming. The presence of the pseudo-derived corner forms and the non-existing ashed forms make this a test for morphological effects that are blind to lexical-level variables. In both cases, the significant masked priming effect is direct behavioral evidence that stimuli of this type elicit morphologically driven decomposition that is not blocked by lexical criteria.
There was an increase in processing activity for the combined derived and inflected forms compared with noncomplex forms in anterior left magnetometers, extending from 325 to 350 msec with a peak at 335 msec (see Figure 2A). There was no evidence in these brain-wide (and globally corrected) analyses for earlier or more posterior effects of morphological structure. At the peak magnetometer sensor from the SensorSPM analysis, there was no difference between the four +S+A conditions (F < 1) nor between complex and pseudo-complex forms within complexity type (farmer vs. corner: t(15) = 1.49, p = .16; blinked vs. ashed: t(15) < 1). Analyzing derived and inflected forms separately, the derived forms show the same magnetometer cluster, from 320 to 365 msec in left anterior sensors with a peak at 335 msec. The effects for the inflected forms fall short of significance (but see source-level analyses below).
The topography of these sensor-level effects is more anterior, left-lateralized, and later in time than the orthographic effects displayed in Figure 1. Evidence from the magnetometers (Figure 2B) showed a stronger response to the derivational and inflectional forms at the peak left temporal sensors, sustained over the period 300-450 msec, with peak effects at around 320-330 msec.
Turning to the contribution of stems and affixes to these morphologically driven processes, the results confirm that both these elements need to be present, whether to elicit masked priming or to generate different distributions of neural activity. In a further analysis, the (−S, + A) blemish set patterned with the noncomplex scandal and biscuit conditions, consistent with the view that both a potential stem and a potential affix are needed to trigger early segmentation. At the source level ( Figure 2C), focusing on the time window during which significant sensor-level effects were found, activation has shifted more anteriorly and now includes inferior frontal areas, most strongly on the left. Specific contrasts between conditions within ROIs showed that the derivational/noncomplex contrast was significant in a 330-340 msec time window in left middle MTG (F(1, 15) = 4.69, p < .05), at the peak of the effect found at the sensor level. For the inflectional/noncomplex contrast, we see a more complex pattern, with effects in left posterior MTG from 300 to 320 msec (F(1, 15)  The left frontotemporal patterning of these inflectional effects, which is identical for both the (+S+A) inflectional conditions (blinked, ashed), is consistent with extensive research using spoken words (e.g., Marslen-Wilson & Tyler, 2007) locating morphosyntactic effects in exactly these left peri-sylvian locations. The pseudoword bected, in contrast, which contains a potential inflectional affix but no stem, did not elicit significant effects in these ROIs compared with the noncomplex forms. This suggests that the requirement for both a stem and an affix to be present extends to inflectional as well as derivational morphology in visual word recognition.

Processing Lexical Identity
A complementary set of analyses focused on the word and pseudoword stems to test for effects linked to successful lexical access. The same two sets of 100 morphologically simple words and pseudowords were used as before. At the sensor level ( Figure 3A), the gradiometers revealed one cluster at 390-450 msec in left temporal sensors and a smaller cluster from 410 to 440 msec in right temporal sensors, peaking at 430 msec in both hemispheres. In the magnetometers, one cluster emerged at 425-500 msec within anterior left sensors with a peak at 470 msec. All clusters showed increased processing of pseudowords over words. Figure 3B plots the gradiometer and magnetometer response amplitudes for words and pseudowords at the peak of the significant LH cluster, with the two conditions starting to separate at 350 msec and peaking at 430 msec.
Source-level analyses ( Figure 3C) focused on the 390-500 msec time window where significant lexicality effects were found in the sensor-level analyses. The overall distribution of activation has shifted anteriorly and frontally, especially on the left, where there is strong activity in temporal and inferior frontal regions for both words and pseudowords. Differences between these conditions emerge more posteriorly, with stronger responses to pseudowords in left posterior STG from 390 to 500 msec (F(1, 15) = 5.12, p < .05), left middle MTG from 410 to 440 msec (F(1, 15) = 5.48, p < .05), and left middle ITG from 430 to 440 msec (F(1, 15) = 4.50, p = .05). These lexicality effects overlap spatially with the morphological effects ( Figure 2) but emerge around 100 msec later.
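For contrasts between two conditions, a repeated-measures ANOVA on participant means reduces to a paired comparison, with F(1, n − 1) equal to the square of the paired t statistic. The sketch below illustrates this relationship; the function name and the toy values are hypothetical, and with 16 participants the degrees of freedom would be (1, 15), as in the statistics reported above.

```python
from math import sqrt
from statistics import mean, stdev


def rm_anova_two_levels(cond_a, cond_b):
    """F statistic for a two-level within-participant factor.

    cond_a, cond_b: per-participant means for each condition, in the same
    participant order. Returns (F, (df1, df2)), where F is the squared
    paired t statistic.
    """
    if len(cond_a) != len(cond_b) or len(cond_a) < 2:
        raise ValueError("need matched values from at least two participants")
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t * t, (1, n - 1)


# Hypothetical ROI means for three participants (not data from the study).
f_value, dof = rm_anova_two_levels([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
```

This equivalence holds only for a single factor with two levels; designs with more levels or more factors require the full ANOVA decomposition.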
These spatiotemporal and functional patterns are consistent with the standard N400-like effects seen in MEG in terms of timing as well as location (Pylkkänen & Marantz, 2003;Halgren et al., 2002) and with evidence from fMRI showing the involvement of left posterior temporal regions in semantic processing (Hickok & Poeppel, 2007). Lexicality effects on the N400 have been interpreted as reflecting access to lexical representations (e.g., Lau et al., 2008;Kutas & Federmeier, 2000), whereas Dominguez et al. (2004) report prolonged N400 effects at the level of meaning selection because of incorrect morphological decomposition.

Lexical Effects for Morphologically Complex Words
The four +S+A (farmer, corner, blinked, ashed) conditions were used to examine potential interactions of lexical-level effects with the processing of morphologically complex letter strings. For the derivational pair, both farmer and corner are existing words, but corner has a potential interpretation as a pseudo-stem plus a pseudo-affix. For the inflectional pair, ashed is not an existing word, although it is potentially interpretable as a real stem plus a real affix. In each case, if an early segmentation process identifies these forms as potentially morphologically complex real words, a later process sensitive to lexical-level information will need to rescue the perceptual system from these potential garden paths.
A significant cluster emerged from 400 to 500 msec within left anterior temporal magnetometers, showing increased activation of both corner and ashed forms relative to their corresponding semantically transparent conditions, peaking at 450 msec (Figure 4A). The amplitude plots indicate comparable effects for the two contrasts. Consistent with this, the corresponding magnetometer and gradiometer responses (Figure 4B) show similar trajectories over time, although differences between conditions emerge earlier for the inflectional pairs. Notably, the timing and distribution of these effects are very similar to those seen for the lexicality effects reported in the Processing Lexical Identity section for morphologically simple words and pseudowords.
At the source level (Figure 4C), we see substantial bilateral activation in temporal and inferior frontal regions for both complex and noncomplex conditions, but significant differences between them only emerge in more posterior and inferior temporal sites. In all cases these reflect increased processing for pseudo-complex over complex forms and are likely to be the result of top-down feedback processes. The combined contrast of complex versus pseudo-complex is significant from 400 to 470 msec in left posterior fusiform (F(1, 15) = 7.92, p < .05) and approaches significance from 400 to 410 msec in left middle ITG (F(1, 15) = 4.20, p = .058) and left middle fusiform (F(1, 15) = 3.85, p = .068). Broken down by morphological type, the 400-470 msec effect in left posterior fusiform gyrus is significant only for the derivational farmer/corner contrast (F(1, 15) = 8.66, p < .01), whereas the brief effect in left middle ITG from 400 to 410 msec is found only for the inflectional blinked/ashed contrast (F(1, 15) = 4.94, p < .05).

DISCUSSION
This research shows that we can unify the functional characteristics of real-time neural analysis with the functional properties of visual word recognition as revealed in the masked priming data. We can use this linkage across methodological domains to determine the functional architecture of the neurobiological system that generates these properties. In doing so, we benefit in particular from the spatiotemporally specific constraints provided by MEG. Unlike other imaging methodologies, MEG data mapped into neuroanatomically constrained source space allow us to specify not only when processes of different types begin and end, but also (within the limits of MEG source reconstruction) where these processes take place.
The proposed architecture based on these results conceptualizes visual word recognition as a two-phase process, where primarily feedforward orthographically driven analyses segment the visual input into potentially meaningful linguistic substrings (words and morphemes) and where these substrings initiate lexical access in middle and frontotemporal locations from about 300 msec after stimulus onset. The initial stages of this access process are dominated by morpho-orthographic factors, with lexical constraints becoming detectable around 100 msec later, as reflected in effects of lexical identity and in increased processing for pseudo-complex strings like corner. We review below the evidence for these claims and their implications.

Morphemically Driven Lexical Access
The joint behavioral and neuroimaging false segmentation effects for ashed- and corner-type stimuli are compelling evidence that the output of orthographic analysis is not in terms of lexical words per se, but in terms of morpheme-like linguistically relevant substrings. The critical MEG contrast is between materials that show decompositional effects in masked priming (the derivationally and inflectionally complex farmer, corner, blinked, and ashed conditions) and materials (scandal, biscuit) that do not (Figure 2). Consistent with the masked priming results, this contrast reveals a time period, between 300 and 370 msec from stimulus onset, where both complex sets diverge from the noncomplex sets, but where there are no significant differences within each set as a function of their lexical properties. At the source level, the derivational set differs from the noncomplex set in left middle MTG at 330-340 msec, but the farmer/corner subsets do not differ. Similarly, the inflectional set differs from the noncomplex set between 300 and 370 msec in left posterior MTG and LIFG, but the blinked/ashed subsets do not differ. This pattern of results closely parallels the morpho-orthographic process hypothesized on behavioral grounds, where the input is analyzed in terms of its morphological properties, but is blind to the lexical properties of the words involved. This is particularly clear for the inflected forms, where both blinked and ashed deviate from the noncomplex forms at around 300 msec and where the presence of the inflectional morpheme activates classic LIFG regions (BA 44 and BA 45) irrespective of the lexical status of the whole form.
The finding that these early processes do not discriminate between genuinely complex and pseudo-complex strings demonstrates that the processes generating candidates for lexical access and recognition are blind to the lexical properties of the strings they are generating. The results for ashed and corner also demonstrate that the manner in which these output processes interface with lexical representations is morphologically compositional. For the inflectional morphology, strings like ashed cannot be accessed as stored forms, because they are not existing words. Instead, they must be compositionally constructed, combining the potential stem (ash) and affix (-ed). More striking still, a monomorphemic form like corner, no different at the lexical level from a simple form like biscuit, seems to be temporarily reconstructed as the nonexistent complex form {corn} + {-er}. This patterns in the relevant neural time window, as well as in masked priming, with genuinely complex forms like farmer and blinked and not with noncomplex forms like scandal and biscuit. These effects require the output of orthographic analysis to be morphemically decomposed.
More evidence for morphemic constraints on early orthographically driven string segmentation and identification comes from the sensitivity of this process to the morphemic status of both elements of a potential complex form. The masked priming data-here and in earlier studies-together with the MEG analyses involving the partially complex scandal, blemish, frumish, and bected sets, suggest that pseudo-complex forms are not treated as complex forms in the 300-370 msec time window unless they contain both a potential stem and a potential affix-as in the corner and ashed conditions. This points to a segmentation process that is not only sensitive to the presence of linguistically relevant subunits but also to the contexts in which they co-occur.
The view that orthographic analysis results in a morphemic output is consistent with the proposals of Dehaene and colleagues (Vinckier et al., 2007; Dehaene, Cohen, Sigman, & Vinckier, 2005), where the endpoint of orthographic analysis in posterior temporal cortex is seen as the identification of "small words and recurring substrings (e.g., morphemes)." It is also consistent with earlier MEG evidence that orthographic processing in inferior and posterior temporal regions, over early 150-250 msec time windows, is sensitive to the presence of potential stems or affixes (e.g., Lehtonen et al., 2011). More generally, these can be seen as aspects of a ventral stream object recognition process tuned to orthographic analysis over decades of intensive experience with written text.

Visual Word Recognition as a Two-phase Process
The evidence for the salience of morphemic factors in the visual word recognition process, together with the demonstration of a short period during which structural morphological factors seem to dominate, raises the question of whether this indicates a separate, specifically morphological processing stage, intervening between orthographic analysis and access to lexical representations. The existence of such a stage is both a frequent postulate in cognitive theories of visual word recognition and a major source of disagreement between competing theories. The evidence here is that there is no such separable processing stage and that what we see instead are two intersecting phases of neurocognitive activity. The first, as described above, is located in posterior and inferior temporal and occipitotemporal regions and is concerned with the analysis of the visual input into higher-order linguistically relevant orthographic units. These processes are in themselves neither lexical nor semantic in nature and form a spatiotemporally distinct phase in the recognition process. This can be viewed as a modality-specific input system that projects onto a more distributed morpholexical system, more anterior and frontotemporal, that is sensitive to the morphological structure of complex forms and to lexically represented variables more generally-and which is likely to be largely in common with the target systems accessed from auditory inputs.
The separation into two phases is reflected in the spatiotemporal distribution of processes sensitive to orthographic variables but not to morphological structure or lexical identity. For orthographic structure, there is increased activation for consonant strings (relative to words and pseudowords) in the time period 150-230 msec, seen bilaterally in posterior brain regions (Figure 1). Analyses sensitive to morphological structure emerge at around 300 msec, peaking 100-150 msec later than the orthographic effects, whereas the spatial center of gravity shifts anteriorly to more dorsal left frontotemporal sites (Figure 2). None of the inferior temporal and fusiform ROIs that were significant in the orthographic analyses are active in the contrasts sensitive to morphological structure. Although there continues to be RH activation for both complex and pseudo-complex conditions, the only effects that differentiate between conditions are seen in LH middle temporal and inferior frontal sites. A similar though spatiotemporally more clear-cut separation from early orthographic processing is seen for the lexically sensitive effects, with increased processing for pseudoword stems from 390 to 500 msec, chiefly in left middle temporal regions (Figure 3). In complementary analyses contrasting complex pseudowords (frumish, bected) with complex real words (farmer, blinked), we found comparable effects, with increased processing for pseudowords emerging in left temporal sensors from 425 to 465 msec.
Taken together, these data show that there is a clear separation in neural space and time between orthographically centered analyses and those sensitive to morphological structure and to lexical variables. There is little evidence for a similar separation between the latter types of process, on the basis of the contrasts plotted across Figures 2-4. Although different phases of analysis peak at different points in time, with the earliest morphological structure effects emerging around 100 msec earlier than the effects of lexicality, there is no evidence that these activities are spatially distinct-especially where core middle temporal locations are concerned. Given these timing and location constraints-which are consistent, as noted earlier, with earlier EEG and MEG studies-the most plausible account is that, although the engagement and interpretation of lexical constraints have a sequential time course, these processes involve the same set of frontotemporal brain regions as those implicated in the analysis of morphological structure.
On this account, morphological effects in visual word recognition will emerge as an interaction between the outputs of orthographic analysis and the properties of morpholexical representation and analysis. These in turn will depend on the properties of simple and complex words in the language and how they are lexically represented. Research in the auditory domain suggests that inflectionally complex words in English are decompositionally represented and analyzed in the neural language system (e.g., Marslen-Wilson & Tyler, 2007), whereas derivationally complex words are accessed as whole forms (Bozic et al., 2013), although with some preservation of internal morphological structure (cf. Marslen-Wilson, 2007). There are signs of this differentiation here, with the analysis of inflected and pseudo-inflected forms (blinked, ashed) closely paralleling the decompositional neural patterns seen in the auditory domain (Bozic, Tyler, Ives, Randall, & Marslen-Wilson, 2010;Marslen-Wilson & Tyler, 2007). Derivationally complex forms, in contrast, primarily activate middle temporal sites, although further research is needed here.

Feed-forward Processing and Recurrence
The third defining feature of the proposed functional architecture concerns the processing relationship between the orthographic analysis of the input and the broader lexical and contextual context in which this analysis occurs: Does this context modulate early orthographic analyses, as interactionist accounts would require, or do these analyses operate in a primarily feedforward (or bottom-up) manner? There are two aspects to this: whether lexical constraints are directly coded into the orthographic analysis process and whether this process is dynamically modulated by top-down predictive or recurrent processes.
On an encoding account, the orthographic mapping process is tuned to the specific lexical context of the language, so that it would not generate (or would disprefer) outputs that were not lexically valid. The results here argue against this, with the first-pass analysis of letter strings into potential stems and affixes being conducted without reference to the lexical identities of these strings (cf. Marslen-Wilson et al., 2008). Otherwise the misanalysis of corner would be blocked, along with the rejection of nonwords like ashed. The results are also inconsistent with a weaker encoding account, where lexical variables modulate but do not determine early segmentational hypotheses. Significant lexical effects on the analysis of potential complex forms-indexed by increased processing for corner over farmer and for ashed over blinked-are not seen until the 400-470 msec time period, substantially later than the initial emergence of morphological structure effects. On an encoding account, these effects should be seen at earlier time points as well.
This leaves open the possibility that early analyses can be modulated by externally generated constraints-for example, by predictive constraints generated in a sentence context or by recurrent constraints generated top-down as a letter string is being processed. In the current study, we only see candidate effects of this type in late time windows (400 msec from word onset), with increased processing for pseudo-complex forms like corner at posterior temporal sites (Figure 4). If this is a top-down feedback effect, then it occurs too late to be evidence for early morphosemantic interactions. Note that in the context of an fMRI study, where the BOLD response sums over a multisecond time window of neural activity, this critical temporal separation is lost, making it possible to misinterpret such an effect as evidence for early interactions between form and meaning. Similar caveats apply to conventional RT tasks, where the temporal ordering of the multiple processes contributing to overall RT is also opaque.
However, although the current experiment allows us to evaluate the role of system-internal constraint, it does not allow us to evaluate the effects of contextual variables more generally. The letter strings were presented in isolation, and we took care to minimize task-based effects. The natural habitat of the visual word recognition system is of course reading in context, with potential constraints generated at syntactic, semantic, and pragmatic levels as words are being read. Without properly time-resolved processing measures, however, it is not possible to determine how these contexts operate-whether they operate primarily in the second phase domain of morpholexical interpretation, or whether they directly modulate the operations of the orthographic input system.
Finally, we note that some findings from EEG and MEG suggest that the initial sweep through the visual word recognition system to the level of lexical access occurs within 200 msec of word onset (e.g., Shtyrov et al., 2013;Lavric et al., 2012;Hauk et al., 2006;Pammer et al., 2004). These timings contrast with this study, where morphological effects emerge at 300 msec and lexicality effects appear later still. This divergence could be attributed to a number of sources.
One of these, as we proposed earlier, may be differences in task demands. Task-driven top-down effects can modulate sequential feed-forward processes in the visual system at early stages of sensory analysis (e.g., Twomey, Kawabata Duncan, , and similar effects may also modulate early processes relevant to the performance of tasks such as lexical decision. Direct evidence for such effects in visual word recognition comes from a further study by Whiting (2011), parallel to the research reported here, which ran the same set of stimuli but under different task conditions. This study replaced the end-of-block recognition test with an occasional lexical decision task, occurring on 10% of trials. The results show both commonalities and divergences relative to the current study. The timing and pattern of early orthographic effects (comparing words and pseudowords with consonant strings) were essentially unchanged. Morphological decompositional effects are detected earlier, but with a similar spatiotemporal distribution-for example, activity is seen in BA 44 for inflectional morphemes from 260 msec, rather than the 320 msec onset seen in the current study. Lexical effects (comparing words and pseudowords) emerged around 100 msec earlier, starting at 310 msec and showing a similar spatial distribution to this study. These selective effects on the timing of different processes leave their relative ordering intact (and consistent with a morpho-orthographic account) but suggest that task demands can indeed shift the timing with which neural processes can be detected.
An additional source of divergences between studies may be differences in statistical methods. In the majority of published EEG and MEG studies of visual word recognition, the dominant analysis strategy is to identify potential temporal or spatiotemporal ROIs on the basis of visual inspection of the global energy profile and to focus subsequent analyses around the visible peaks in this profile. In the current study, we avoided any preselection of areas of interest in favor of a brain-wide analysis process (SensorSPM), conducted in sensor space, where the significance of any contrast is corrected on a brain-wide basis for multiple comparisons. We then used the outcome of these analyses to select the time windows of interest within which we conducted the source space analyses. This approach, which is a more conservative-because globally corrected-procedure for selecting time windows of interest, will be less likely to pick up effects that are weak and transient, and this may disfavor some very early effects. This is a possibility, however, that will need careful evaluation in further research.
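The two-step logic described above, in which sensor-level results define the time windows for later source analyses, can be sketched in simplified form. The sketch assumes a single time course of test statistics and a fixed threshold standing in for a corrected significance level; it does not implement the SensorSPM correction itself, and the function names and values are hypothetical.

```python
def _summarize(run):
    """Return (onset, offset, peak time) for one run of (time, stat) pairs."""
    peak_time, _ = max(run, key=lambda pair: pair[1])
    return (run[0][0], run[-1][0], peak_time)


def suprathreshold_windows(times_ms, stats, threshold):
    """Find contiguous time windows where the statistic exceeds a threshold.

    The threshold is a placeholder for a value corrected for multiple
    comparisons; the correction procedure is not modeled here.
    """
    windows, run = [], []
    for t, s in zip(times_ms, stats):
        if s > threshold:
            run.append((t, s))
        elif run:
            windows.append(_summarize(run))
            run = []
    if run:  # close a window that extends to the final sample
        windows.append(_summarize(run))
    return windows


# Hypothetical statistic time course in 10-msec steps (not data from the study).
times = list(range(140, 250, 10))
stats = [0.5, 1.0, 2.5, 3.0, 4.0, 5.5, 4.2, 3.1, 2.6, 2.2, 1.0]
windows = suprathreshold_windows(times, stats, threshold=2.0)
```

Each returned window would then restrict the samples entering the source-level ROI analyses, which is why a conservative threshold at the first step can exclude weak or transient early effects from the second.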
We suggest, in conclusion, that our findings are a robust reflection of the basic underlying structure of the processing system supporting visual word recognition, as revealed in the context of reading words in the absence of a lexical decision task. Top-down effects may well be at work in more predictive natural perceptual contexts, such as the reading of continuous text, to maximize the speed and efficiency of the reading process. Nonetheless, such effects would serve to modulate the performance of the basic feedforward process we have described, not to replace it.

APPENDIX 1. MEG