The effect of convolving word length, word frequency, function word predictability and first pass reading time in the analysis of a fixation-related fMRI dataset

The data presented in this document were created to explore the effect of including or excluding word length, word frequency, the lexical predictability of function words and first pass reading time (the duration of the first fixation on a word) as either baseline regressors or duration modulators on the final analysis of a fixation-related fMRI investigation of linguistic processing. The effect of these regressors was a central question raised during the review of "Linguistic networks associated with lexical, semantic and syntactic predictability in reading: A fixation-related fMRI study" [1]. Three datasets were created and compared to the original dataset to determine their effect. The first examines the effect of adding word length and word frequency as baseline regressors. The second examines the effect of removing first pass reading time as a duration modulator. The third examines the inclusion of function word predictability in the baseline hemodynamic response function. Statistical maps were created for each dataset and compared to the primary dataset (published in [1]) across the linguistic conditions of the initial dataset (lexical, semantic and syntactic predictability).


Data
depicts the linguistic and eye tracking regressors used in each analysis. Figs. 1–3 contain conjunction maps created to compare the effect of convolving word length and frequency as baseline regressors to the primary dataset (found in Ref. [1]). Word length and frequency were added as baseline regressors because there is some evidence that features such as word length have an independent influence on the oculomotor profile [6]. Incorporating these values into the secondary dataset produced statistical maps similar to the primary dataset, with a few differences noted in the semantic and syntax conditions. Incorporating word frequency and length into a baseline function may therefore be of little utility. Figs. 4–6 contain conjunction maps demonstrating the effect of removing first pass reading time as a duration modulator from the primary dataset. Removing first pass reading time created a loosely fitted hemodynamic response function relative to the primary analysis and resulted in distinctly different statistical maps for all conditions. The most dramatic difference can be seen in the semantic condition, in which the default mode network is now highly associated with this hemodynamic response function. This demonstrates the necessity of a tightly fitted hemodynamic response function that includes a duration modulation when using the oculomotor profile to study reading. Figs. 7–9 demonstrate the effect of including the lexical predictability of function words in the baseline function, relative to the primary dataset. The inclusion of function words in the baseline response function is theoretically interesting because they are often skipped by the reader [7]. Including fixations on function words in the baseline response function resulted in statistical maps comparable to the primary dataset in both the lexical and syntax conditions. There were, however, differences in the semantic condition, with the right and left anterior insula being associated.
This deserves deeper investigation. Overall, this dataset demonstrates that focusing the analysis on content words is the best approach. Figs. 10–12 depict statistical maps of functional activity for each dataset that was compared to the primary dataset. Accompanying statistics are given in Tables 2–10. These include volumetric data (the size of activated brain regions in microliters), max z-scores (the magnitude and direction of association with the hemodynamic profiles), MNI coordinates for the maximal intensities within each region (to allow for comparison with other data), and anatomical and functional designations. Additional data can be accessed via GitHub (https://github.com/btcarter/LinguisticPrediction). Sample data from nine study participants are provided for the purpose of testing the scripts. These include DICOM files from one structural image and three functional images per participant. Group statistical maps and conjunction maps for each dataset are also provided. The complete dataset can be found on the Open Science Framework (osf.io/7csxr).

Participants
Forty-three participants were recruited from the student body at Brigham Young University. All were right-handed, literate and native English speakers with 20/20 uncorrected or corrected vision without a history of reading disorders. Two participants were excluded due to eye tracking problems or excess motion in the scanner, resulting in a total of 41 participants included in the final analysis. Informed consent was obtained from all individuals prior to participation. The study was approved by the Brigham Young University Institutional Review Board ethics committee to ensure it conformed with the recognized ethical standards of the Declaration of Helsinki [8].

Materials
Fifty-four paragraphs were presented to participants during three functional scans (18 paragraphs per scan). These paragraphs were a subset of those created for the Provo Corpus [9] and their linguistic predictability characteristics were previously characterized via cloze procedure [9–11] and latent semantic analysis [12]. Linguistic predictability refers to the probability that a word may be accurately predicted given the preceding text and can be computed in terms of lexical (whole word form), semantic (word meaning) and syntactic (word class) values.
The cloze procedure is a simple method of computing the predictability of a word, that is, how expected it is given its preceding context. Participants are presented with the first word of a sentence and asked what the following word will be. Their response is recorded and then the word is revealed. At this point they are asked what the third word in the sentence will be, and so on until responses have been gathered for each word in the text. Responses are then scored according to whether they match the word class (syntax) and the whole word form (lexical) of the target word. The fraction of correct responses for each characteristic gives the predictability score for that characteristic. For example, if participants were asked what word might follow the phrase "I want to drive the" and 50% responded "car", 30% responded "truck", 15% responded "train" and 5% responded "forklift" (the correct response being "car"), then this word would be scored as having a lexical predictability of 0.5 (only 50% of respondents answered "car") and a syntactic predictability of 1.0 (all respondents answered with a noun).
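The scoring rule described above can be written out as a minimal Python sketch. The function name `cloze_scores` and the `word_class` lookup are hypothetical and are not taken from the authors' scripts; the responses reproduce the "I want to drive the" example.

```python
def cloze_scores(responses, target_word, word_class):
    """Return (lexical, syntactic) predictability for one target word.

    responses   -- words offered by the cloze participants
    target_word -- the word that actually appears in the text
    word_class  -- hypothetical lookup mapping each word to its word class
    """
    n = len(responses)
    if n == 0:
        raise ValueError("at least one response is required")
    target_class = word_class[target_word]
    # Lexical: fraction of responses matching the whole word form.
    lexical = sum(r == target_word for r in responses) / n
    # Syntactic: fraction of responses sharing the target's word class.
    syntactic = sum(word_class.get(r) == target_class for r in responses) / n
    return lexical, syntactic


# 100 illustrative responses in the proportions given in the example.
word_class = {"car": "noun", "truck": "noun", "train": "noun", "forklift": "noun"}
responses = ["car"] * 50 + ["truck"] * 30 + ["train"] * 15 + ["forklift"] * 5

lexical, syntactic = cloze_scores(responses, "car", word_class)
print(lexical, syntactic)  # 0.5 1.0
```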

Apparatus
Paragraphs were presented to participants via a Cambridge Systems MRI-safe LCD monitor located at the end of the scanner bore and viewed via a mirror attached to the head coil. Screen resolution was set to 1600 × 1200. Text was displayed in Courier New font at 26 pt, resulting in approximately 4 letters per degree of visual angle. Eye movements were recorded via an SR Research Eyelink 1000 Plus long-range MRI eye tracker sampling at 1000 Hz (Eyelink 1000, SR Research, Mississauga, Canada). A Siemens 3T Tim Trio with a 12-channel receive-only head coil was used for this study. The software version was syngo MR B17.

Eye-movement data acquisition
Only movements of the right eye were recorded, though viewing was binocular. Prior to the beginning of each scan, participants completed a nine-point calibration and validation exercise. An average error of 0.49° and a maximum error of 0.99° of visual angle were required to pass. A single trial consisted of viewing a fixation cross for 6 seconds, followed by a paragraph, which was viewed for 12 seconds. Stimulus presentation and eye position were controlled and recorded via SR Research software. Eye movements were co-registered with the scanner sequence. The experiment was programmed to begin once an onset signal had been received from the scanner control computer. All fixation times were computed relative to this signal.
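As a rough illustration of this timing scheme, the Python sketch below expresses fixation timestamps relative to the scanner onset signal and locates the paragraph window of a trial. The function names, the millisecond timestamps and the assumption that trials run back to back from the onset signal are illustrative only; the authors built their timing series in R.

```python
def fixation_onsets_seconds(fixation_starts_ms, scanner_onset_ms):
    """Convert eye-tracker timestamps (ms) to seconds since the scanner onset signal."""
    return [(t - scanner_onset_ms) / 1000.0 for t in fixation_starts_ms]


def paragraph_window(trial_index, cross_s=6.0, paragraph_s=12.0):
    """Return (start, end) in seconds of the paragraph display for a 0-based trial.

    Assumes trials follow one another without gaps, starting at the onset signal.
    """
    trial_s = cross_s + paragraph_s
    start = trial_index * trial_s + cross_s
    return start, start + paragraph_s


# Hypothetical timestamps: scanner signal at 100000 ms on the eye-tracker clock.
print(fixation_onsets_seconds([106120, 106380], 100000))  # [6.12, 6.38]
print(paragraph_window(0))  # (6.0, 18.0)
```

Under these assumptions a scan of 18 trials covers 18 × 18 s = 324 s of stimulus time, which fits within the 5.66 minute functional scans.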

Scan Sequence
The following scans were performed, listed in order: a localizer, three consecutive 5.66-minute functional scans and a structural scan.
Structural scan parameters. A T1-weighted, magnetization prepared rapid gradient-echo (MPRAGE) protocol: orientation = sagittal, anterior to posterior phase encoding, FOV = 218 × 250, matrix = 256 × 256, slice thickness = 1 mm, TR = 1900 ms, TE = 2.26 ms and flip angle = 9°.
[14] were used. DICOM images were converted to BRIK and HEADER files via to3d. The structural scan was then co-registered to the third functional scan via 3dWarp. 3dTshift was used for slice time correction. Functional scans were corrected for low-frequency motion by aligning all volumes to the middle acquisition volume. Blocks were aligned to the same functional space via 3dvolreg. A skull-stripped mask was created for each subject using 3dSkullStrip and used to restrict the analysis to brain matter. Input matrices were constructed and decoded via 3dDeconvolve. Each analysis had six polynomial regressors for motion: pitch, roll, yaw, superior-inferior translation, left-right translation and anterior-posterior translation. Additional regressors were added for each dataset. Timing series coding these regressors were constructed from the eye tracking data via R [15], version 3.3.2.
Analysis 1. Three parametric regressors were added encoding lexical predictability, semantic predictability and syntactic predictability. First pass reading time (the amount of time spent with the fovea fixed upon a word when the word is first encountered) was used as a duration modulator for each regressor. Log transformations were applied to the lexical and syntactic predictability measures. Semantic predictability was not log transformed (see Ref. [1] for an explanation).
Analysis 2. This included all the regressors used for Analysis 1 with additional regressors coding for word length and word frequency. These regressors were added to the baseline as amplitude-modulated hemodynamic response functions. Each regressor was fitted using fixation onset to mark the beginning of each event, with word length or word frequency acting as the amplitude of the function (word frequency was log transformed).
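The sketch below shows one way such amplitude-modulated timing entries might be written. It assumes AFNI's "married" timing notation, in which an event is written as `onset*amplitude`, optionally followed by `:duration`. The helper name `married_event`, the fixation values and the choice of the natural logarithm are assumptions; the authors generated their timing files in R and the exact layout of those files is not described here.

```python
import math


def married_event(onset_s, amplitude, duration_s=None):
    """Format one event as 'onset*amplitude' or 'onset*amplitude:duration'."""
    text = f"{onset_s:.3f}*{amplitude:.4f}"
    if duration_s is not None:
        text += f":{duration_s:.3f}"
    return text


# Hypothetical fixations: (fixation onset in s, word length, word frequency).
fixations = [(6.120, 5, 1200.0), (6.380, 3, 85000.0), (6.610, 9, 40.0)]

# Word length is used as the amplitude directly.
length_row = " ".join(married_event(on, length) for on, length, _ in fixations)
# Word frequency is log transformed first (the log base is an assumption).
frequency_row = " ".join(married_event(on, math.log(freq)) for on, _, freq in fixations)

print(length_row)
print(frequency_row)
```

The optional duration argument covers the case where an event also carries a duration modulator, for example `married_event(6.12, 0.5, 0.21)`.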
Analysis 3. This omitted first pass reading time as a duration modulator. All other regressors were the same as those in Analysis 1.
Analysis 4. This incorporated the lexical predictability of function words into the baseline hemodynamic response function, in addition to the regressors incorporated in Analysis 1.
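To illustrate what a duration modulator changes, the NumPy sketch below builds one regressor twice: once with each event lasting for its first pass reading time, and once with every event treated as a brief impulse, which is one hypothetical way to represent the omission of the duration modulator. The gamma-shaped response and its parameters are generic illustrative choices and are not claimed to be the response model used by 3dDeconvolve; all onsets, amplitudes and durations are made-up values.

```python
import numpy as np


def gamma_hrf(t, p=8.6, q=0.547):
    """Gamma-variate response, scaled so that its peak (at t = p*q seconds) is 1."""
    t = np.asarray(t, dtype=float)
    h = np.zeros_like(t)
    pos = t > 0
    h[pos] = (t[pos] / (p * q)) ** p * np.exp(p - t[pos] / q)
    return h


def build_regressor(onsets, amplitudes, durations, n_seconds, dt=0.01):
    """Convolve amplitude-scaled boxcars of the given durations with the response."""
    n = int(round(n_seconds / dt))
    stimulus = np.zeros(n)
    for onset, amp, dur in zip(onsets, amplitudes, durations):
        start = int(round(onset / dt))
        # A zero duration still occupies one sample, i.e. a brief impulse.
        stop = max(start + 1, int(round((onset + dur) / dt)))
        stimulus[start:stop] += amp
    kernel = gamma_hrf(np.arange(0.0, 20.0, dt))
    return np.convolve(stimulus, kernel)[:n] * dt


# Hypothetical fixations: onsets (s), predictability amplitudes, reading times (s).
onsets = [2.0, 2.4, 2.9]
predictability = [0.5, 0.1, 0.9]
first_pass_s = [0.25, 0.40, 0.18]

with_duration = build_regressor(onsets, predictability, first_pass_s, n_seconds=30)
impulse_only = build_regressor(onsets, predictability, [0.0] * 3, n_seconds=30)
```

In this toy case the duration-modulated regressor weights each fixation by how long the word was actually viewed, whereas the impulse version treats all fixations as equally brief.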
Deconvolution was performed via 3dDeconvolve. A 5 mm blur was applied to the output via 3dmerge, and individual anatomical and statistical maps were projected into MNI_ICBM152 space [16,17] via ants.sh [14]. A binary group map was then constructed and used to exclude white matter. 3dttest++ was used to apply a random effects analysis and compute cluster thresholds via the option "-Clustsim". A voxel-wise threshold of p < 0.001 and a cluster threshold of 38 voxels were used to achieve α < 0.05 [18]. 3dclust was used to compute descriptive statistics and coordinates of peak activity.
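The combination of a voxel-wise threshold and a cluster-extent threshold can be sketched in Python as follows. This is not the AFNI implementation: the function name is hypothetical, clusters are assumed to be joined through shared voxel faces only, the statistic is thresholded on its absolute value, and the cutoff passed in is simply whatever statistic value corresponds to the chosen voxel-wise p.

```python
from collections import deque

import numpy as np


def cluster_threshold(stat_map, stat_cutoff, min_voxels):
    """Keep supra-threshold voxels belonging to clusters of at least min_voxels."""
    above = np.abs(stat_map) > stat_cutoff
    keep = np.zeros(above.shape, dtype=bool)
    visited = np.zeros(above.shape, dtype=bool)
    # Face-sharing neighbours only (an assumption about connectivity).
    neighbours = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    for seed in zip(*np.nonzero(above)):
        if visited[seed]:
            continue
        visited[seed] = True
        cluster, queue = [seed], deque([seed])
        # Breadth-first search over connected supra-threshold voxels.
        while queue:
            x, y, z = queue.popleft()
            for dx, dy, dz in neighbours:
                nb = (x + dx, y + dy, z + dz)
                inside = all(0 <= c < s for c, s in zip(nb, above.shape))
                if inside and above[nb] and not visited[nb]:
                    visited[nb] = True
                    cluster.append(nb)
                    queue.append(nb)
        if len(cluster) >= min_voxels:
            for voxel in cluster:
                keep[voxel] = True
    return keep


# Toy volume: one 48-voxel block and one isolated voxel.
toy = np.zeros((10, 10, 10))
toy[0:4, 0:4, 0:3] = 5.0
toy[8, 8, 8] = 5.0
surviving = cluster_threshold(toy, stat_cutoff=3.291, min_voxels=38)
print(int(surviving.sum()))  # 48
```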

Conjunction map construction
Masks were created from the statistical maps produced during the random effects analysis and overlaid via 3dcalc [13] to visualize cluster overlap. Regions significant in the first analysis were given a value of 1 and those significant in the second analysis a value of 2, so that regions common to both received a value of 3. A t-statistic threshold of 3.291 was used.
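The coding scheme can be illustrated with a short NumPy sketch. It mirrors the 1/2/3 labelling described above but is not the authors' 3dcalc expression; the function name is hypothetical, and thresholding on the absolute value of the statistic is an assumption.

```python
import numpy as np


def conjunction_map(stat_a, stat_b, threshold=3.291):
    """Label voxels: 0 = neither, 1 = first analysis only, 2 = second only, 3 = both."""
    mask_a = np.abs(stat_a) > threshold
    mask_b = np.abs(stat_b) > threshold
    return mask_a.astype(np.int8) + 2 * mask_b.astype(np.int8)


# Four toy voxels covering each possible outcome.
first = np.array([4.0, 0.0, 4.0, 0.0])
second = np.array([0.0, 4.0, 4.0, 0.0])
print(conjunction_map(first, second).tolist())  # [1, 2, 3, 0]
```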

Scripts
All analysis scripts are available at: https://github.com/btcarter/LinguisticPrediction/. Additional information concerning script implementation, execution and sample data can be found in the same repository.