The Influence of Mental Imagery Instructions and Personality Characteristics on Reading Experiences

It is well established that readers form mental images when reading a narrative. However, the consequences of mental imagery (i.e. the influence of mental imagery on the way people experience stories) are still unclear. Here we manipulated the amount of mental imagery that participants engaged in while reading short literary stories in two experiments. Participants received pre-reading instructions aimed at encouraging or discouraging mental imagery. After reading, participants answered questions about their reading experiences. We also measured individual trait differences that are relevant for literary reading experiences. The results from the first experiment suggest an important role of mental imagery in determining reading experiences. However, the results from the second experiment show that mental imagery is only a weak predictor of reading experiences compared to individual (trait) differences in how imaginative participants were. Moreover, the influence of mental imagery instructions did not extend to reading experiences unrelated to mental imagery. The implications of these results for the relationship between mental imagery and reading experiences are discussed.


Introduction
It is well established that readers perceive mental images during reading (Green & Brock, 2000; Jacobs, 2015). For instance, an eye tracking study showed that people are responsive to mental simulation-eliciting content in stories (Mak & Willems, 2019). It was found that when participants were reading action descriptions (assumed to elicit motor simulation) they sped up, whereas they slowed down when reading perceptual descriptions or mental event descriptions (assumed to elicit perceptual simulation or mentalizing, respectively).
Additionally, there is a relationship between the amount of imagery and subjective experiences during reading. Mak and Willems (2019) found that individual differences in the responsiveness to simulation-eliciting content were related to participants' subjective experiences (such as absorption and appreciation). This is only one example of work showing that mental simulation during reading is associated with absorption in and appreciation of stories (see also Green, 2004; Green & Brock, 2002; Kuijpers, Hakemulder, Tan, & Doicaru, 2014; Mol & Jolles, 2014; Weibel, Wissmath, & Mast, 2011).
Next to individual variation in amount of imagery perceived during reading, there are a number of stable (personality) characteristics that are associated with reading experiences. In the experiments we report here, we decided to study both the role of instructed mental imagery and the role of individual (trait) differences in literary reading. Below, we will discuss the relationships between (1) mental imagery and reading experiences and between (2) individual (trait) differences and reading experiences, before we (3) introduce the set-up and hypotheses of our experiment.

Mental Imagery and Reading Experiences
As mentioned above, people engage in mental imagery 1 when reading stories, and mental imagery is an important driver of absorption: it has been found that visualizing the story world will strengthen people's experience of absorption in a story (Green & Brock, 2002; Kuijpers et al., 2014; Kuiken & Douglas, 2017). 2 Absorption has been defined by Kuijpers and colleagues (2014) as "the subjective experience of being absorbed in the story world of a narrative text" (p. 90, emphasis in the original). It describes the feeling we may have when reading a good story or book where we go beyond comprehending the meaning of the words on a page, to a captivating experience that can help us become completely involved in the stories we read. This experience has been reported on in widely diverging disciplines, and has also been defined as transportation (Gerrig, 1993; Green & Brock, 2000), narrative engagement (Busselle & Bilandzic, 2008), narrative presence (Kuzmičová, 2012), immersion (Ryan, 2001; see also Jacobs, 2015) or flow (Csikszentmihalyi, 1990). For the sake of clarity, we will refer to this experience as absorption for the remainder of this article.
Mak, M., et al. (2020). The Influence of Mental Imagery Instructions and Personality Characteristics on Reading Experiences. Collabra: Psychology, 6(1): 43. DOI: https://doi.org/10.1525/collabra.281
Absorption is a multifaceted construct that tries to capture the entirety of the subjective experience of being captivated by a good story (Kuijpers et al., 2014). It has been proposed that absorption consists of four dimensions: mental imagery, emotional engagement, attention, and transportation (Kuijpers et al., 2014). Mental imagery (as a dimension of absorption) is defined as a visualization of the story world, whereas emotional engagement could be seen as the emotional counterpart of mental imagery: the sympathetic and empathic feelings for the characters in the story. Attention is characterized as a heightened focus or concentration of the reader towards the story world and, as a consequence, a lower concentration towards the here and now. Transportation is seen as the feeling a reader can have of being part of the story world as opposed to the real world. Together these four dimensions can result in an experience of complete absorption in a narrative or story world. 3 Another connection between absorption and mental imagery is that they have both been found to be associated with another important aspect of people's reading experiences: the enjoyment (or appreciation) of stories (Busselle & Bilandzic, 2009; Green, 2004; Green, Brock & Kaufman, 2004; Kuijpers et al., 2014; Kuiken & Douglas, 2017; Mol & Jolles, 2014; Weibel et al., 2011). As there does not seem to be strong consensus with regard to the definition of appreciation (especially between different disciplines, e.g., communication research and literature studies), in the current experiment we have considered both the overall enjoyment of narratives and other facets of aesthetic experiences that we believe could play a role in our enjoyment and appreciation of stories (e.g., whether a reader is emotionally moved by a story, or finds it amusing).
To test this multitude of facets of aesthetic experiences, we used adjectives that are often used to describe the multiple dimensions of aesthetic experiences (Knoop, Wagner, Jacobsen, & Menninghaus, 2016). These adjectives were taken from a list of adjectives that were most often used by people to describe their experience while reading literature (Knoop et al., 2016). Because this list of adjectives was compiled in a "bottom-up" fashion (i.e. the adjectives are derived directly from the experiences of readers), it was assumed that these adjectives could successfully tap into multiple facets of aesthetic experiences of the readers in our experiments. We found it important to look at appreciation apart from absorption, because we wanted to consider a measure of reading experience that was more distantly related (but not unrelated) to mental imagery than absorption (recall that mental imagery is considered to be one of the four subcomponents of absorption).

Reading instructions and mental imagery
In order to test the relationship between mental imagery and reading experiences, we wanted to make sure that some readers in our experiments would engage in mental imagery more than others. To this effect, we employed a method from a related but separate line of research, where it was found that instructing students to create images of what they had read impacts text comprehension (see De Koning & van der Schoot, 2013 for an extensive overview). Apart from text comprehension, some studies have also found direct links between pre-reading instructions and reading experiences. Green and Brock (2000) found that instructing readers to judge the difficulty of a story to establish the suitability for fourth-grade readers led to lower experienced transportation (which is conceptually comparable to absorption) in comparison to readers who were instructed to pay specific attention to the story plot. The rationale of the study was that instructing participants to pay attention to the suitability of the text for fourth-graders would lead to less absorption than instructing them to focus on the story plot. In a follow-up experiment, Green (2004) subsequently instructed participants to use relaxation strategies during reading to increase transportation. This manipulation did not, however, lead to more experienced transportation in these participants when compared to participants who had received a neutral instruction. Perhaps this instruction was not associated with higher transportation because relaxation is relatively unrelated to the process of transportation. Therefore, the advice for future research was to use a pre-reading instruction focusing on a specific component of transportation (e.g., imagery; Green, 2004, p. 261). Johnson, Cushman, Borden and McCune (2013) made a successful first attempt in this direction.
Instead of pre-reading instructions, they gave participants an imagery generation training, which subsequently resulted in increased transportation when reading a narrative. In the current study, we take these findings as the starting point for our investigation of the influence of explicit imagery instructions on subjective reading experiences. Participants in the experiments described in the current paper were either encouraged or discouraged to engage in mental imagery through pre-reading instructions, after which their subjective experiences while reading literary short stories were measured. Even though pre-reading instructions have not been successful in all studies attempting to influence reading experiences, Tukachinsky (2014) noted in a review of these studies that the effect of pre-reading instructions seemed quite reliable. Moreover, we used an instruction specifically targeting mental imagery, which was suggested by Green (2004) as possibly more powerful in manipulating reading experiences than instructions aimed at more general processes. The purpose of this specific mental imagery instruction was to manipulate the amount of mental imagery between participants in order to establish what role mental imagery plays in subjective reading experiences.

Individual (Trait) Differences in Reading Experiences
As mentioned above, mental imagery is not the only factor that plays a role in reading experiences. Variation in experienced absorption and appreciation can also be due to individual differences in situational factors (e.g., stress, mood, distractions, level of energy), or more stable characteristics. For example, amount of print exposure is negatively associated with reading difficulties (Stanovich & West, 1989), positively associated with reading skills (Acheson, Wells, & MacDonald, 2008) and positively associated with language ability, school success, Theory of Mind and empathy (Brysbaert, Sui, Dirix, & Hintz, 2020). Additionally, reading habits in daily life are closely related to reading experiences: more habitual readers experience more absorption (Kuijpers, Douglas, & Kuiken, 2018) and enjoy reading more (Mol & Jolles, 2014).
Furthermore, a range of personality characteristics have been found to be associated with reading experiences. Individuals reporting more need for affect (Appel & Richter, 2010), as well as more transportable individuals (Bilandzic & Busselle, 2011), reported experiencing more transportation while reading a story or watching a film. Similarly, Need for Cognition was found to be a predictor of transportation experienced while reading stories (Green et al., 2008) or watching films (Hall & Zwarun, 2012). Openness was positively associated with reported interest for stories (which is related to enjoyment/appreciation of stories; Fayn, Tiliopoulos, & MacCann, 2015), absorption (although indirectly, via reading habits; Kuijpers et al., 2018), and the overall likelihood that people read literature for leisure (Kraaykamp & Eijck, 2005; see also Schutte & Malouff, 2004). Interestingly, Malanchini and colleagues (2017) link differences in reading enjoyment and motivation to genetic differences.
Because of the important role of the abovementioned individual (trait) differences in reading experiences, we took these into account in the experiments reported here. Because we controlled for the role of these individual (trait) differences in reading experiences when studying the role of guided mental imagery instructions in reading experiences, we will be able to draw conclusions about the role of mental imagery instructions over and above individual (trait) differences from the results of our experiments.

Current Study
In the current study we investigate the respective roles of mental imagery and individual differences on subjective experiences during reading. We test in two experimental studies whether guided mental imagery instructions influenced reading experiences. In keeping with the suggestion made by Green (2004), we tested the effect of pre-reading instructions specifically focusing on mental imagery in our experiments.
In our first experiment participants read a literary short story and subsequently rated their reading experiences on several questionnaires. One group of participants was instructed to use mental imagery while reading the stories, whereas another group of participants was asked to read the stories for leisure. In the second experiment a third instruction was added, designed to distract participants from the plot of the story. This third instruction was added to control for a task confound in the first experiment, where the imagery instruction was more effortful to follow than the leisure instruction. The task in the third instruction was as effortful to complete as the task in the imagery instruction, but it was designed to distract participants from the plot of the story, and therefore to make their mental imagery less vivid. With this experiment we wanted to test whether overall reading experiences can be modified using reading instructions focusing on one specific facet of these reading experiences and whether these reading instructions can "overrule" the influence of individual (trait) differences (e.g., reading habits, personality characteristics) on reading experiences. Based on the literature we hypothesized that mental imagery instructions would result in more mental imagery compared to the control group (leisure readers) and therefore in more absorption and appreciation. In contrast, we hypothesized that the distracting instruction added in the second experiment would result in less mental imagery and therefore in reduced absorption and appreciation.
If specific reading instructions would indeed prove powerful in altering reading experiences, these could be used to promote reading in people who do not read for leisure, which could have positive consequences for, among others, school success (Chiu & McBride-Chang, 2006; Mol & Jolles, 2014; Retelsdorf, Köller, & Möller, 2011), second language learning (Lao & Krashen, 2000; Lee, Schallert, & Kim, 2015; Yamashita, 2008), social cognition and empathy (e.g., Fong, Mullin, & Mar, 2013; Johnson et al., 2013; Mar & Oatley, 2008; Oatley, 2016), or persuasion (e.g., Dal Cin, Zanna, & Fong, 2004; Green & Brock, 2000). If, however, other factors (e.g., individual trait differences) are found to be a stronger driver of absorption and appreciation than imagery instructions, this would indicate that using such instructions in, for instance, educational settings is not an optimal intervention to increase reading pleasure. A third option would be that individual trait differences and imagery instructions interact as drivers of absorption and appreciation. In that case, this could indicate that using imagery instructions would only be useful for some individuals and that it would be necessary to find a way to determine which individuals would or would not benefit from such instructions.

Experiment 1 Methods
This first experiment was conducted in the context of Bachelor's theses, for which five students of Communication and Information Studies at the Radboud University in Nijmegen, The Netherlands, worked together under the supervision of the first author to conduct an experiment testing the influence of mental imagery-inducing reading instructions on reading experiences.

Participants
A total of 120 participants took part in this first experiment. To ensure that participants understood their instructions for the experiment they were asked to repeat what they had been instructed to do while reading (i.e. mental imagery versus reading for leisure) after reading the story. Due to an error during data collection, the experimental condition of 20 participants was not registered correctly. Data from these 20 participants were excluded from analysis. It was double checked that the data from the remaining 100 participants were registered correctly before moving on with the analyses. The remaining participants consisted of an experimental group (n = 45; 25 females; Age: M(SD) = 32 (15) years old; age range = 19-71) and a control group (n = 55; 33 females; Age: M(SD) = 33 (16) years old; age range = 17-82). Chi-square tests indicated that there were no significant differences between the participants in the two groups with respect to biological gender (χ²(1) = 0.06; p = .81) and educational level (χ²(1) = 10.06; p = .07). However, to control for any possible individual differences in age, biological gender or educational level (measured as the highest completed education, ranging from 1 to 6, where 6 was the highest possible level of education in the Netherlands), we considered these factors as covariates in our analyses.
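As an illustration of the group comparison on biological gender, the sketch below recomputes a chi-square statistic from the reported group sizes. It rests on two assumptions that are not stated in the text: that all participants not counted as female were male, and that Yates' continuity correction was applied (the default for 2 × 2 tables in R's chisq.test). The function name is hypothetical. Under these assumptions the result is consistent with the reported χ²(1) = 0.06, p = .81.

```python
import math


def chi_square_2x2_yates(table):
    """Pearson chi-square for a 2x2 table with Yates' continuity correction."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            # Yates: shrink each |observed - expected| by 0.5 before squaring
            chi2 += (abs(observed - expected) - 0.5) ** 2 / expected
    # Upper tail of a chi-square distribution with 1 df: erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: imagery group, control group.
# Columns: females (reported), males (ASSUMED to be n minus females).
counts = [[25, 45 - 25], [33, 55 - 33]]
chi2, p = chi_square_2x2_yates(counts)
print(f"chi2(1) = {chi2:.2f}, p = {p:.2f}")
```

Without the continuity correction the same counts give a statistic of about 0.20, which is why the correction is assumed here.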
Prior to the experiment, participants were informed about the procedure of the experiment. It was made clear that participation was voluntary and that it was allowed to withdraw from the experiment at any time without need for explanation. All participants gave written informed consent in accordance with the Declaration of Helsinki. The study was approved by the local ethics committee.

Materials
Materials used in Experiment 1 consisted of the story that was read, the instructions participants received before reading, and the questionnaires participants filled in after reading. We will now discuss these materials in more detail.

Story
The story used in the first experiment was an existing literary short story by the acclaimed Dutch writer Rob van Essen (2014), called De mensen die alles lieten bezorgen (The people who had everything delivered). The story was 2988 words long and took participants about 10-15 minutes to read. The story recounts the experiences of a man who lives in an apartment building in Amsterdam. His neighbors rent out their apartment while they are on holiday for the Christmas days, and a morbidly obese British couple stays there. When the wife has a heart attack she has to be lifted out of the apartment by a firetruck, as there is no elevator in the building. The events in the story are described using very colorful language and are easy to visualize.

Instructions
Before reading the stories, participants received a reading instruction. The experimental group was instructed to "Pay close attention to the story and read the story as you would normally read a story. Use your imagination while reading, by visualizing the surroundings described in the story and envisioning the actions of the characters. Imagine the main character standing in front of you, imagine what happens and pay close attention to what all of the characters are doing". The control group was simply instructed to "Pay close attention to the story and read the story as you would normally read a story". This way both groups were instructed to read the story attentively, but the experimental group was encouraged to form a vivid mental image of the events described in the story. When participants had finished reading, they were asked to repeat their reading instruction to the experimenter. If they were unable to repeat the instruction correctly or if their answer indicated they did not use the instruction while reading the story, data for these participants were excluded from the analysis.

Questionnaires
After reading, participants had to fill in some questionnaires measuring their reading experiences (i.e. Story World Absorption, see Kuijpers et al., 2014), reading habits in daily life, and other more general information (i.e. age, level of education, and biological gender). Since this experiment was conducted in the context of Bachelor's theses, a couple of questionnaires were devised by the Communication and Information Studies students regarding topics not under study here (e.g., attitudes towards fast food, behavioral intentions regarding healthy eating). Since we do not have specific theoretical assumptions regarding these topics, we will not discuss these measures in the current paper.
Story world absorption was measured using the Story World Absorption Scale (SWAS; Kuijpers et al., 2014). The SWAS is a validated scale consisting of 18 items with high internal validity (Kuijpers et al., 2014) which measures 4 aspects of story world absorption on the four subscales Attention, Transportation, Emotional Engagement and Mental Imagery (e.g., When I finished the story I was surprised to see that time had gone by so fast; I could imagine what the world in which the story took place looked like). Participants rated each question on a 7-point scale (1 = disagree, 7 = agree).
Reading habits were measured using five multiple choice questions about reading habits in everyday life, with 4 or 5 optional answers (Hartung, Burke, Hagoort, & Willems, 2016; Mak & Willems, 2019; e.g., How often do you read fiction; How many books do you read each year). Additionally, participants were asked for their genre preference in an open-ended question, where they could list up to three preferred genres.

Procedure
Participants were recruited and tested in a quiet room at the university campus or at home. Participants were informed about the procedure and were asked for written informed consent. At the start of the experiment, participants were given one of the two possible instructions on paper (as described above). If necessary, the instruction was clarified by the experimenter. After that, participants read the story at their own pace. Both the instructions and the story were read from paper. After reading, participants were asked to fill in the questionnaires (SWAS, reading habits, general information) on the experimenter's laptop. The entire procedure took about 20-25 minutes.

Data-analysis
Data-analysis was done using the 'stats' package in R version 3.6.1 (R Core Team, 2018). We constructed a linear regression model that predicted average scores on the SWAS based on experimental group (imagery instruction contrasted with control). Biological gender (male contrasted with female), age, level of education, and reading habits were added to the model as general variables expected to explain additional variance. As a result, any effects of the experimental group would represent variance explained by the given instruction over and above variance explained by these demographic variables. Similar models were constructed to predict scores on the four subscales of the SWAS separately (i.e. Attention, Transportation, Emotional Engagement and Mental Imagery). All continuous predictors were centered and scaled. Variance Inflation Factors (VIFs) were calculated for all models, to check for multicollinearity between predictors. All VIFs for all models were between 1 and 2, indicating that multicollinearity was not problematic in our models and all planned predictors could be entered into the models.
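The modelling steps described above (centering and scaling continuous predictors, fitting a linear model, and checking VIFs) can be sketched as follows. The reported analyses were run in R; this is a pure-Python stand-in, not the authors' code. The data for eight participants, the reduced predictor set, and all function names are hypothetical and serve only to show the mechanics.

```python
from statistics import mean, stdev


def scale(xs):
    """Center on the mean and divide by the sample standard deviation."""
    m, s = mean(xs), stdev(xs)
    return [(x - m) / s for x in xs]


def ols(X, y):
    """Least-squares coefficients via the normal equations (Gauss-Jordan)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


def r_squared(X, y):
    b = ols(X, y)
    fitted = [sum(bi * xi for bi, xi in zip(b, row)) for row in X]
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean(y)) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def vif(columns, j):
    """VIF of predictor j: 1 / (1 - R^2) from regressing it on the others."""
    others = [c for i, c in enumerate(columns) if i != j]
    X = [[1.0] + [c[i] for c in others] for i in range(len(columns[j]))]
    return 1.0 / (1.0 - r_squared(X, columns[j]))


# HYPOTHETICAL data: group (1 = imagery, 0 = control), age, reading habits, SWAS
group = [1, 0, 1, 0, 1, 0, 1, 0]
age = scale([21, 34, 45, 28, 52, 19, 39, 61])
habits = scale([0.5, -1.2, 1.1, 0.3, -0.4, -0.9, 1.4, -0.8])
swas = [5.1, 3.4, 5.8, 4.2, 4.6, 3.1, 5.9, 3.7]

predictors = [group, age, habits]
design = [[1.0] + [col[i] for col in predictors] for i in range(len(swas))]
coefficients = ols(design, swas)  # intercept, group, age, habits
vifs = [vif(predictors, j) for j in range(len(predictors))]
print("coefficients:", [round(b, 3) for b in coefficients])
print("VIFs:", [round(v, 2) for v in vifs])
```

A VIF of 1 means a predictor is uncorrelated with the others; values between 1 and 2, as reported for all models here, indicate little shared variance among predictors.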

Questionnaires
To test the reliability of the used scales and subscales, Ωₜ was calculated. We decided to calculate Ω as opposed to Cronbach's α, since it has been argued by several researchers that Ω is a more appropriate measure of reliability (e.g., Dunn, Baguley, & Brunsden, 2014; Peters, 2014; Revelle, 2014). We decided to report Ωₜ as opposed to Ωₕ because we assumed the constructs measured with our scale to be multidimensional (see Revelle, 2014). The average scores on the Story World Absorption Scale as well as the scores on the four subscales of the SWAS all showed sufficient to excellent reliability; total SWAS (18 items), Ωₜ = .95; SWAS Attention (5 items), Ωₜ = .87; Transportation (5 items), Ωₜ = .84; Emotional Engagement (5 items), Ωₜ = .92; Mental Imagery (3 items), Ωₜ = .79. Descriptive statistics for the questionnaire scores per subscale and per group are visualized in Figure 1.
The answers on the reading habits questionnaire were measured on a scale ranging from 1 to 5 on four of the five multiple choice questions, but from 1 to 4 on the final question. Therefore, z-scores were calculated for all questions on this questionnaire (with higher scores for more habitual readers). Overall reliability was sufficient, Ω t = .78.
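The standardization step can be illustrated with a short sketch. The answers below are hypothetical, and averaging the z-scores into one composite per participant is an assumption made for illustration; the text only states that z-scores were calculated per question. The point is that z-scoring puts the 1 to 5 and 1 to 4 questions on a common scale before they are combined.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize answers to one question: mean 0, sample SD 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# HYPOTHETICAL answers from five participants (question names are made up)
responses = {
    "fiction_frequency": [1, 3, 5, 2, 4],   # answered on a 1-5 scale
    "books_per_year": [2, 3, 5, 1, 4],      # answered on a 1-5 scale
    "final_question": [1, 2, 4, 3, 2],      # answered on a 1-4 scale
}

standardized = {q: z_scores(v) for q, v in responses.items()}

# ASSUMPTION: one reading-habits score per participant as the mean z-score
n_participants = len(next(iter(responses.values())))
composite = [mean(standardized[q][i] for q in responses)
             for i in range(n_participants)]
print([round(c, 2) for c in composite])
```

Higher composite values then correspond to more habitual readers, regardless of how many answer options each question had.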

Main analysis
The model predicting average scores on the SWAS based on biological gender, age, level of education, reading habits and experimental group (imagery instruction versus control; model adjusted R² = 0.185) showed that participants in the experimental group were more absorbed than participants in the control group (see Table 1.1 for all results of this model; see Figure 1A). Additionally, participants who were more habitual readers also reported more story world absorption. Finally, males reported less story world absorption than females and participants with a higher level of education were more absorbed.
To find out which aspects of story world absorption this difference between the experimental and control group stems from, similar models were built to predict scores on the four subscales of the SWAS separately. The model predicting scores on the Attention subscale of the SWAS (see Table 1.1a) showed that participants in the experimental group read the story more attentively (see Figure 1B). More habitual readers in daily life also reported more Attention to the story and males reported less Attention than females.
For the Transportation subscale of the SWAS (model adjusted R² = 0.170; see Table 1.1b), it was found that participants in the experimental group experienced more transportation than participants in the control group (see Figure 1C). Participants with a higher level of education also reported experiencing more transportation into the story.
The model predicting scores on the Emotional Engagement subscale of the SWAS (model adjusted R² = 0.118; see Table 1.1c) revealed no differences between the experimental and control group with regard to emotional engagement (see Figure 1D). Males were less emotionally engaged than females. Additionally, more habitual readers reported to be more emotionally engaged with the story.
In support of the effectiveness of our manipulation, participants in the experimental group reported more mental imagery than participants in the control group (see Table 1.1d; see Figure 1E) on the Mental Imagery subscale of the SWAS (model adjusted R² = 0.084). None of the other tested predictors (biological gender, age, level of education, reading habits) were significantly related to scores on the Mental Imagery subscale of the SWAS.
Table 1: Coefficients of the models predicting absorption based on type of instruction (mental imagery instruction contrasted with the control instruction), taking into account biological gender (male contrasted with female), age, self-reported reading habits, and level of education. Significant predictors are marked (* p < .05, ** p < .01, *** p < .001).

Discussion
The results from this first experiment show that mental imagery-inducing reading instructions were associated with a stronger absorption experience, in particular a stronger attention towards the story, a stronger experience of transportation into the story world and more reported use of mental imagery (confirming the effectiveness of our manipulation). This suggests that, indeed, pre-reading instructions focusing on specific aspects of reading (such as mental imagery) can influence the way readers experience stories. Apart from the influence of reading instructions, we found substantial individual differences in reading experiences. For instance, females reported to be more absorbed by the story. When looking at the subcomponents of absorption, it became clear that this difference between males and females was most prominent in the emotional engagement component of absorption: women were more emotionally engaged when reading the story than men. Level of education also appeared to explain some of the variation in absorption. This was mostly visible in the transportation subcomponent of absorption: participants with a higher level of education, reported experiencing more transportation into the story world. Habitual readers were also more absorbed than participants who did not read much in their own time. This was visible in their scores on the attention and emotional engagement subcomponents of absorption: more habitual readers reported more attention to the story and more emotional engagement with the story. Age was not significantly related to the absorption experience, nor to any of its subcomponents, as has also been found in previous work (Hartung, Withers, Hagoort, & Willems, 2017).
From this first experiment it could be concluded that there are indeed individual differences in reading experiences, which are related to both mental imagery-inducing reading instructions and stable individual differences (i.e. biological gender, level of education, reading habits). However, a few questions remain after this experiment. First and foremost, a difference in reading experiences between the experimental group and the control group could mean two things. Although it is very well possible that absorption was enhanced through induced mental imagery as a result of the mental imagery reading instruction in the experimental group, it could also be the case that elaborate reading instructions in general promote more intensive reading and as a result a stronger experience of absorption (in both reading instructions used in this experiment, participants were told to read the story attentively, but only in the imagery instruction were participants asked to vividly imagine the events happening in the story on top of reading the story attentively). That is, in this experiment, the imagery group got more elaborate (longer) instructions than the control group and this could have led to the differences that we observed (see De Koning & van der Schoot, 2013 for a similar argument). Secondly, the reading instruction was aimed at mental imagery, which is part of the absorption experience. Although the effect of the reading instruction did translate to other subcomponents of absorption (attention, transportation), it would be interesting to find out if a mental imagery reading instruction could also influence other reading experiences, most notably story appreciation. Third, we did look at some general individual differences that could influence reading experiences, such as biological gender, age and reading experience, but we did not look at any personality characteristics that have been associated with reading experiences before.
In the next experiment we have therefore considered the personality characteristics fantasy (how imaginative a person is) and perspective taking (the extent to which someone takes other people's perspectives in daily life).
To be able to answer these remaining questions, we ran a second experiment, where we made the mental imagery instruction more clear (to improve compliance with the instruction) and included a third instruction, which was as elaborate as the mental imagery instruction, but its content was aimed at diverting the reader from the narrative's plot (but still encouraging thorough reading of the text; adapted from an instruction successfully used to lower experienced transportation; Green & Brock, 2000). This way, we could rule out that the effect of instruction was simply the result of more intensive reading instead of being related to the actual content of the instruction. To elaborate, we added this third instruction to test the alternative explanation for the results from experiment 1: that elaborate, detailed reading instructions in general promote more intensive reading and as a result a stronger experience of absorption. If both detailed instructions would increase absorption, this would be evidence that this increase in absorption is the result of more thorough reading due to the length or details in the instructions. However, if only the imagery instruction would increase absorption and the other detailed instruction would not change or even decrease absorption, we would have stronger evidence for our hypothesis that the increase in absorption after the imagery instruction is the result of the content of the instruction. For this third instruction (that was as detailed as the imagery instruction), we chose an instruction aimed at decreasing absorption since using a second detailed instruction that was aimed at increasing absorption would not have been helpful: if in this case both detailed instructions would increase absorption, we would still not know whether this was due to the content of the instructions, and not simply to the fact that they were detailed instructions that encouraged thorough reading.
We also included measures for story appreciation, a more thorough measure of mental imagery, and measures of personality traits we thought might play a role in reading experiences. Furthermore, to ensure more experimental control, we ran this second study in a more controlled environment. Finally, we used two new stories to find out whether the effect of reading instructions would also extend to different stories.

Methods
In this experiment participants were divided into three groups. Apart from the group receiving an elaborate mental imagery instruction and the control group, we also included a group that received the instruction to judge whether the writing style of the story (sentence construction, word use) was suitable for teenagers of about 14 or 15 years old, who were in the lower grades of Dutch secondary education (henceforth called secondary school suitability instruction; cf. Green & Brock, 2000). Such "distraction manipulations" have in previous studies been particularly useful in manipulating transportation (for a review, see Tukachinsky, 2014). If the length of the imagery instruction was the reason people became more transported in the first experiment, this secondary school suitability instruction should also result in increased absorption and there should be no difference between the results for the imagery instruction and the suitability instruction. However, if the effect of the instruction was due to the content of the instruction, the imagery instruction should increase absorption but the suitability instruction should decrease absorption. This enabled us to test if an effect of instruction was the result of the actual content of the instruction or, alternatively, was simply the result of more intensive reading.

Participants
To ensure sufficient statistical power, the sample size of this study was calculated in G*Power (Faul, Erdfelder, Buchner, & Lang, 2009) using the effect size from experiment 1. This resulted in a required sample size of approximately 120 for an estimated power of .85, divided over 3 groups. A total of 125 participants (102 females) participated in the second experiment. The data of 7 participants had to be discarded because of procedural errors (4), too much missing data (2), or because they did not have enough time to finish the experiment (1). The remaining 118 participants (99 females) were divided into a group receiving a mental imagery instruction (n = 39; 32 females; M age = 24 years old), a group receiving a secondary school suitability instruction (n = 39; 35 females; M age = 23 years old) and a control group (n = 40; 32 females; M age = 24 years old). There were no differences between the participants in the three groups with respect to biological gender (χ²(2) = 3.07, p = .22), nor were there differences between groups in age (F(2, 233) = 0.77, p = .46), reading habits (self-report: F(2, 233) = 0.17, p = .84; ART-score: F(2, 233) = 0.21, p = .81) or personality characteristics (IRI Fantasy: F(2, 233) = 2.81, p = .06; IRI Perspective Taking: F(2, 233) = 0.43, p = .65; see below for an extensive description of all questionnaires used). Participants were all healthy native speakers of Dutch without dyslexia.
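The gender comparison can be retraced from the group counts given above. The sketch below computes a Pearson chi-square statistic from those counts. It rests on one assumption that is not stated in the text: the denominator degrees of freedom of 233 in the accompanying F-tests would fit 236 rows (118 participants, one row per story), so the statistic is computed both with one row and with two rows per participant. Under the two-row reading it comes out at roughly 3.07, and at roughly 1.53 with one row per participant. The function and variable names are illustrative only.

```python
# Gender counts per instruction group, taken from the participant description.
group_n = {"imagery": 39, "suitability": 39, "control": 40}
females = {"imagery": 32, "suitability": 35, "control": 32}


def pearson_chi_square(observed):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (count - expected) ** 2 / expected
    return statistic


# One row per group: [females, males].
table = [[females[g], group_n[g] - females[g]] for g in group_n]
df = (len(table) - 1) * (len(table[0]) - 1)

# One row per participant (118 observations).
chi2_per_participant = pearson_chi_square(table)

# ASSUMPTION: long-format data with two rows per participant (236 observations),
# which simply doubles every cell count.
table_long = [[2 * count for count in row] for row in table]
chi2_long_format = pearson_chi_square(table_long)

print(df, round(chi2_per_participant, 2), round(chi2_long_format, 2))
```

Doubling every cell doubles the statistic while leaving the degrees of freedom unchanged, which is why the two readings differ by exactly a factor of two.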
Participants were recruited from the participant database of Radboud University and received €10 or course credit for participation in this study. Prior to the experiment, participants were informed about the procedure of the experiment. It was made clear that participation was voluntary and that they were allowed to withdraw from the experiment at any time without need for explanation. All participants gave written informed consent in accordance with the Declaration of Helsinki and the study was approved by the local ethics committee.

Materials
Materials used in Experiment 2 consisted of the two stories that were read, the instructions participants received before reading, and the questionnaires participants filled in after reading. We will now discuss these materials in more detail.

Stories
Instead of just one story, participants read two stories in the second experiment. Both were literary short stories written by acclaimed Dutch writers. The first story, Brommer op zee (Moped on sea), was written by Maarten Biesheuvel (1972) and was 1827 words long. It is a surrealistic story about a boy on a boat and his encounter with a man riding a moped at sea in the middle of the night. The second story, God en de gekkenrechter (God and the judge of the insane), was written by Adriaan van Dis (1986) and was 2026 words long. In this story, the author narrates the story of a mentally unstable man who is convinced that he is God, and believes that therefore all his excrement is holy and should not be thrown away. Apart from that, he terrorizes the neighborhood, leading to his institutionalization later on in the story, after which he finally seems to realize that he was mistaken in thinking that he was God. Both stories contain many descriptions that could guide mental imagery of the stories. It took participants about 15 minutes to read each story. In the remainder of the section about Experiment 2, the story Moped on sea will be referred to as Story A, and God and the judge of the insane will be referred to as Story B. Note that the stories were read in counterbalanced order: half of the participants started with Story A and half of the participants started with Story B.
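As an illustration of the counterbalancing, the sketch below alternates the two story orders over a list of participants. The text only states that half of the participants started with each story; the alternating assignment scheme and all names in the code are assumptions made for the example.

```python
from itertools import cycle

# The two possible reading orders of Story A and Story B.
STORY_ORDERS = (("A", "B"), ("B", "A"))


def assign_story_orders(participant_ids):
    """Alternate the two story orders over participants (hypothetical scheme)."""
    return dict(zip(participant_ids, cycle(STORY_ORDERS)))


# Example: a group of 40 participants, such as the control group.
orders = assign_story_orders(range(1, 41))
starts_with_a = sum(1 for order in orders.values() if order[0] == "A")
print(starts_with_a, len(orders) - starts_with_a)
```

Applying such a scheme within each instruction group, rather than over the whole sample, would keep story order balanced across the three instructions as well.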

Instructions
Before reading the stories, participants were given a reading instruction. Every participant received either a mental imagery instruction, a secondary school suitability instruction, or a neutral control instruction. After participants had read the first story, they received a short reminder of their reading instruction to urge them to also keep the instruction in mind while reading the second story.
The group receiving the mental imagery instruction was told "In a short while, you will be reading a short story. During reading, try to vividly imagine the events happening in the story. Vividly imagine what you see, hear, feel or smell. For example, envision the characters and places described in the story, imagine what the conversations and environmental sounds sound like, what the odors smell like, how the physical experiences of the characters feel".
The group receiving the secondary school suitability instruction was told "In a short while, you will be reading a short story. Your job is to make sure the text is suitable for students in the lower grades of secondary school, of about 14 or 15 years old. The content of the story is not important, please pay attention to the writing style: the sentence constructions and the word use of the author of the story. Try to focus on these two aspects while reading the story. Determine whether the word use and sentence constructions are of a suitable level for students in the lower grades of secondary school".
The control group, who received a short, neutral instruction, was told "In a short while, you will be reading a short story. Please read this story the way you would usually read a story for your leisure".
To make sure all participants understood their reading instruction, they were presented with a short excerpt from a different story (which was stylistically comparable to the two experimental stories). Participants had to apply the reading instruction while reading this fragment and were afterwards asked to check with themselves whether they indeed applied the reading instructions while reading. After participants had practiced the instruction on the example fragment the instructions were repeated and participants were told to start reading the stories. Participants were reminded that they were allowed to read at their own pace and did not have to hurry. As mentioned previously, participants were given another reminder of the reading instruction just before they started reading the second story. The motivation behind this reminder was that participants had to fill in some questionnaires about their reading experience after reading the first story. Therefore, we wanted to make sure they would still remember the instruction while reading the second story. Although participants received different instructions, they all practiced their instruction on the same fragment and received reminders of their instruction at the same moments. This way, we tried to make sure that all three groups would read the story equally attentively.

Questionnaires
Just as in experiment 1, we asked participants to fill in questionnaires (discussed in detail below) regarding their reading experience after reading each story. However, this time we did not only use the Story World Absorption Scale to measure reading experience, but also a questionnaire measuring story appreciation and a questionnaire measuring the vividness of the imagery experienced during reading (for an overview of all questionnaires used in this experiment, see Figure 2). After reading both stories and filling in the story-related questionnaires, participants were asked to fill in some additional questionnaires measuring reading habits in daily life, personality characteristics, story comprehension and more general information (i.e. age and biological gender). Level of education was not considered in this experiment as nearly all of the participants were university students and therefore no claims could be made about the role of level of education in reading experiences from this experiment. We will discuss the questionnaires that were not used in experiment 1 in more detail below (see the materials section under experiment 1 for a detailed discussion of the other questionnaires).
As mentioned in the introduction of this paper, we looked at appreciation of the stories. We measured this with the Appreciation Questionnaire, which was previously described in Mak & Willems (2019) and consists of a general score of story liking (How did you like the story; 1 = It was very bad, 7 = It was very good) and twelve adjectives (e.g., [did you find the story] Entertaining, … Ominous) that could be used to describe the stories (adapted from Knoop et al., 2016). These adjectives are taken from a list of adjectives that were most often used by people to describe their opinion of poetry and which are also used to describe aesthetic appeal in the domain of literature (Knoop et al., 2016). In the original scale by Mak & Willems (2019), a thirteenth adjective ([did you find the story] Entertaining) was used, but the Dutch word used for "entertaining" (in Dutch, "onderhoudend") appeared to be unknown to some of the participants, and was therefore removed from the questionnaire in the present study. Finally, six questions were asked regarding the enjoyment of the story (from Kuijpers et al., 2014; e.g., I was constantly curious about how the story would end; I thought the story was written well). Participants rated both the adjectives and the questions regarding enjoyment on a 7-point scale (1 = disagree, 7 = agree).
Vividness of experienced imagery was measured using a slightly adapted version of the Imaginal Vividness Scale (IVS; Fialho, personal communication), which is partly based on the Literary Response Questionnaire (Miall & Kuiken, 1995) and partly on a series of in-depth interviews with readers. This questionnaire was used because it is a more elaborate measure of imagery than the imagery subscale of the SWAS (which consists of only three items; and which is mainly focused on visual aspects of mental simulation, instead of the multisensory mental simulation we wanted to investigate, as explained in the introduction). The IVS as used in this experiment consisted of a total of 15 items divided over two subscales: Character (7 items, e.g., While reading this story I could see the events happening in the story through the eyes of the main character; While reading the story I could almost feel the physical experiences of the characters in my own body) and Setting (8 items, e.g., While reading this story I often saw the described places so clearly, that it almost was as if I was there; I sometimes had auditory experiences (for example, hearing sounds) as if I was present in the world of the story). This allowed us to capture the quality of the imagery experienced by the participants in more detail.
After reading both stories and finishing the story-related questionnaires, all participants answered a suitability questionnaire asking about the suitability of the text for 14-15-year-olds and a comprehension check. The questions on the suitability questionnaire were asked in such a way that the questionnaire would not feel "out of the blue" for participants in the control or imagery groups. The suitability questionnaire was simply a follow-up on the secondary school suitability instruction, where participants had to answer three questions about the suitability of the word use, sentence constructions, and the general suitability of the story for students in the lower grades of secondary school, of about 14 or 15 years old. The comprehension check consisted of three multiple choice questions per story (3 possible answers per question), which should have been possible to answer correctly for people who had read the stories with normal attention. Participants who answered two or more questions of the comprehension check incorrectly for one or both stories were excluded from analysis.
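The exclusion rule for the comprehension check can be written down compactly. The following is a minimal sketch of that rule as described above (three questions per story, exclusion after two or more errors on at least one story); the function name and the input format are hypothetical.

```python
QUESTIONS_PER_STORY = 3
MAX_ERRORS_PER_STORY = 1  # two or more errors on any story leads to exclusion


def is_excluded(correct_per_story):
    """Return True if a participant fails the comprehension check.

    correct_per_story maps a story label to the number of correctly
    answered questions (0 to 3) for that story.
    """
    return any(
        QUESTIONS_PER_STORY - n_correct > MAX_ERRORS_PER_STORY
        for n_correct in correct_per_story.values()
    )


print(is_excluded({"A": 3, "B": 2}))  # one error on Story B: retained
print(is_excluded({"A": 1, "B": 3}))  # two errors on Story A: excluded
```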
Finally, participants were asked to report their reading habits and some personality characteristics. Reading habits were measured using the same questionnaire as used in the first experiment. Additionally, as an implicit measure of reading habits, participants completed the Author Recognition Test (ART; Stanovich & West, 1989; Dutch adaptation reported in Koopman, 2015), consisting of 42 names (30 real authors and 12 foils), where they had to indicate who they thought were genuine authors.
Personality characteristics were measured using the Fantasy and Perspective Taking subscales of the Interpersonal Reactivity Index (IRI; Davis, 1980; Dutch translation adapted from De Corte et al., 2007) on a 7-point scale (e.g., Becoming extremely involved in a good book or movie is somewhat rare for me; When I'm upset at someone, I usually try to "put myself in his shoes" for a while). The Fantasy subscale (which is conceptually related but not identical to the Transportability Scale; Dal Cin et al., 2004) measures the extent to which someone gets mentally very involved in the stories they encounter, to the point at which they imagine themselves being part of the story. The Perspective Taking subscale measures the extent to which someone is able to take another person's perspective in daily life.

Procedure
Participants were tested in small groups in lecture rooms at the university campus. One or two experimenters were always present to make sure the participants did not disturb each other and (if necessary) to answer questions. Before the start of the experiment, participants were informed about the procedure and were asked for written informed consent. At the start of the experiment, participants were given one of the three instructions (as described above; see Figure 2 for a schematic overview of the procedure of this experiment). After having read the reading instruction and having practiced the instruction on an excerpt from an unrelated story not used in the remainder of the experiment, they started reading the first story. After reading, participants filled in the Story World Absorption Scale, the Appreciation Questionnaire, and the Imaginal Vividness scale. When they had finished, the reading instruction was repeated and participants read the second story. After finishing reading the second story participants completed the SWAS, Appreciation Questionnaire and IVS again, followed by the remaining questionnaires. The stories were read in counterbalanced order. Both the instructions and the stories were read from paper and the questionnaires were completed as paper and pencil tests. Participants were allowed to read the stories and fill in the questionnaires at their own pace. The entire procedure took about 40 minutes.

Data-analysis
Data-analysis was done using the package 'lme4' (Bates, Mächler, Bolker, & Walker, 2015) in R version 3.5.1 (R Core Team, 2018). We constructed a linear mixed effects regression model that predicted average scores on the SWAS based on experimental group (mental imagery instruction and secondary school suitability instruction contrasted with control). Random intercepts were allowed per participant. Story (Story B contrasted with Story A), biological gender (male contrasted with female), age, self-reported reading habits, ART-score, and the Fantasy and Perspective Taking subscales of the IRI were added to the model as general variables expected to explain additional variance. As a result, any effects of the experimental group would represent variance explained by the given instruction over and above variance explained by any story effects, demographic variables and other important individual difference measures. P-values were estimated using the "lmerTest" package (Kuznetsova, Brockhoff, & Christensen, 2017).
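Written out, the model described above has roughly the following form. This is a sketch of the structure only: the symbol names are chosen here for illustration, with i indexing participants and j indexing the two stories, and the dummy coding follows the contrasts named in the text.

```latex
\begin{aligned}
\mathrm{SWAS}_{ij} ={}& \beta_0
  + \beta_1\,\mathrm{Imagery}_i
  + \beta_2\,\mathrm{Suitability}_i
  + \beta_3\,\mathrm{StoryB}_{ij}
  + \beta_4\,\mathrm{Male}_i
  + \beta_5\,\mathrm{Age}_i \\
 &+ \beta_6\,\mathrm{ReadingHabits}_i
  + \beta_7\,\mathrm{ART}_i
  + \beta_8\,\mathrm{Fantasy}_i
  + \beta_9\,\mathrm{PerspectiveTaking}_i
  + u_i + \varepsilon_{ij}, \\
 & u_i \sim \mathcal{N}(0, \sigma_u^2), \qquad
   \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\end{aligned}
```

Here u_i is the random intercept per participant, and the control instruction, Story A, and female participants serve as the reference levels.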
An initial effect of Experimental Group was calculated by comparing a base model (a model containing all predictors except for Experimental Group) with the full model, using an ANOVA. If this indicated a significant effect of experimental group, post-hoc pairwise comparisons (with Tukey HSD adjustment for multiple comparisons) between all three groups were made using the 'emmeans' package in R (Lenth, 2018). All continuous predictors were centered and scaled. Variance Inflation Factors (VIFs) were calculated for all models to check for multicollinearity. All VIFs for all models were between 1 and 2, indicating that multicollinearity was not problematic in our models and all planned predictors could be entered into the models.
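To make the multicollinearity check concrete: for a predictor, the VIF equals 1 / (1 − R²), where R² comes from regressing that predictor on all other predictors. The sketch below covers only the simplest case of two predictors, where R² reduces to the squared Pearson correlation. It is not the procedure used in the study, which involved more predictors, and the scores in the example are made up for illustration.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def vif_two_predictors(x, y):
    """VIF of either predictor when the model contains only x and y."""
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r ** 2)


# HYPOTHETICAL scores on two trait measures for six participants.
fantasy = [3.1, 4.5, 5.2, 2.8, 6.0, 4.1]
perspective_taking = [4.0, 4.2, 5.5, 3.9, 4.8, 5.1]
print(round(vif_two_predictors(fantasy, perspective_taking), 2))
```

A VIF of 1 means the predictor is uncorrelated with the others; values between 1 and 2, as reported above, indicate only modest overlap.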
Similar models were constructed to predict scores on our other variables of main interest (i.e. the average scores on the IVS as a more elaborate measure of mental imagery, scores on the different components of appreciation; see Table 2). Additionally, we also ran analyses for the four subscales of the SWAS separately (i.e. Attention, Transportation, Emotional Engagement and Mental Imagery), and the subscales of the IVS. We analyzed the data at the level of these subscales for two reasons. Firstly, the SWAS and IVS are arguably built up from subscales measuring diverging sub-constructs. Therefore, analyzing the subscales separately may give us additional information with regard to the processes underlying possible effects. Our second reason for analyzing the data at the level of subscales in addition to the average scores was to find out whether the null-effect for Instruction we found when looking at the variables of main interest (i.e. the average scores on the SWAS and the IVS) would also be visible within all of the subscales.

Questionnaires
The overall scores on the Story World Absorption Scale as well as the scores on the four subscales of the SWAS all showed good to excellent reliability; total SWAS (18 items), Ω t = .95; Attention (5 items), Ω t = .91; Transportation (5 items), Ω t = .89; Emotional Engagement (5 items), Ω t = .92; Mental Imagery (3 items), Ω t = .77. Descriptive statistics for the questionnaire scores per subscale and per group are visualized in Figure 3.
The Appreciation Questionnaire was divided into two parts for the analysis. The first part, consisting of twelve adjectives that could be used to describe the stories, was analysed using a principal components analysis (PCA) with oblique rotation (direct oblimin). The Kaiser-Meyer-Olkin measure indicated that the sampling adequacy for this analysis was good, KMO = .86 (all KMO values for individual items > .72). Bartlett's test of sphericity showed that there was sufficient correlation between items, χ²(66) = 1558.99, p < .001. An initial analysis showed that three components had eigenvalues over 1 (Kaiser's criterion), but a model with three components did not fit the data well enough (fit = .93). Therefore, in the final model four components were retained (fit = .95). The structure and pattern matrices for the factor loadings after rotation can be seen in Table 3. Factor scores per participant and story were used in the subsequent analyses for the constructs Evoked Interest, Emotional Response, Suspense, and Amusement. The second part of the questionnaire consisted of a general score of story liking and 6 questions regarding the enjoyment of the story, Ω t = .95 (7 items). The answers on these questions were collapsed into a mean score for General Appreciation.
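The degrees of freedom of Bartlett's test follow directly from the number of items entered into the PCA. As a worked check, with p the number of adjectives, n the number of observations, and R the item correlation matrix (symbols introduced here for illustration), the standard form of the test is:

```latex
\chi^2 = -\left(n - 1 - \frac{2p + 5}{6}\right)\ln\lvert R\rvert,
\qquad
df = \frac{p(p-1)}{2} = \frac{12 \cdot 11}{2} = 66
```

With twelve adjectives this gives the 66 degrees of freedom reported above.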
Reading experience was measured both directly using a reading habits questionnaire, and indirectly using the Author Recognition Test (ART). Because answers on different items of the reading habits questionnaire were measured on different scales, z scores were calculated for all questions on this questionnaire (higher values indicating more habitual readers). Overall reliability was sufficient, Ω t = .82. The scores on the ART were slightly positively skewed (M = 7.46, SD = 4.03, median = 7.00, IQR = 5.00-10.00) with higher values indicating more (literary) reading experience. Reliability of both subscales of the Interpersonal Reactivity Index was sufficient; Fantasy subscale (7 items), Ω t = .84, and Perspective Taking subscale (7 items), Ω t = .84.
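The standardization of the reading habits items can be sketched as follows. Each item is converted to z scores across participants so that items answered on different scales become comparable. Averaging the z scores into one score per participant is an assumption made for this example, since the text does not spell out how the items were combined, and the item names and answers are made up.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize a list of raw answers to mean 0 and SD 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def composite_scores(items):
    """Average the per-item z scores into one score per participant.

    items maps an item name to a list of raw answers, one per participant,
    with participants in the same order for every item.
    """
    standardized = [z_scores(answers) for answers in items.values()]
    return [mean(per_participant) for per_participant in zip(*standardized)]


# HYPOTHETICAL items on different scales, answered by four participants.
items = {
    "books_per_year": [2, 10, 25, 5],
    "reading_frequency_1_to_5": [1, 3, 5, 2],
}
scores = composite_scores(items)
print([round(score, 2) for score in scores])
```

Higher composite values then indicate more habitual readers, in line with the direction of scoring described above.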

Main Analysis
The model predicting average scores on the SWAS based on story, biological gender, age, ART-score, reading habits, IRI Fantasy, IRI Perspective Taking and experimental group showed no differences in SWAS scores between the three experimental groups (see Figure 3A; see Table 4.1 for all results of this model). Interestingly, participants with higher scores on the Fantasy subscale of the IRI reported more story world absorption (see Figure 4A). To find out which aspects of story world absorption this relationship between IRI fantasy and scores on the SWAS stems from, similar models were constructed to predict the scores on the four subscales of the SWAS (see Figure 4 for a visual representation of the relationship between scores on the Fantasy subscale of the IRI and the tested reading experiences).
The positive relationship between scores on the Fantasy subscale of the IRI and scores on the Story World Absorption Scale was visible on all subscales of the SWAS (see Figure 4B-E; see Table 4.1a-d). On the Attention and Mental Imagery subscales, story effects became visible; after reading Story B participants reported higher attention to the story world and higher Mental Imagery. In contrast, participants reported lower emotional engagement with Story B (see Table 4.1a-d). No other significant associations between the predictors and any of the subscales of the SWAS were found.
To investigate the effect of mental imagery instructions on reported mental imagery more thoroughly, we also tested differences between groups in mental imagery as reported on the IVS (see Figure 3K; see Table 4.2). The results on this questionnaire also indicate whether participants complied with our mental imagery instructions. A comparison between our base model and the full model suggested a significant effect of group on scores on the IVS (F(2, 118) = 6.49, p = .002), but post-hoc comparison of the two experimental groups and the control group showed no notable differences (Mental Imagery instruction vs. Control: B = -0.35, SE = 0.19, df = 128, t = -1.90, p = .14; Secondary School Suitability instruction vs. Control: B = 0.29, SE = 0.18, df = 128, t = 1.56, p = .27). However, the group receiving the Mental Imagery instruction did report significantly more vivid imagery than the group receiving the Secondary School Suitability instruction (B = 0.64, SE = 0.19, df = 128, t = 3.46, p = .002). IRI Fantasy was positively related to the vividness of mental imagery (see Figure 4K). After reading Story B, participants reported more vivid imagery. To find out if the effect of instruction was perhaps only visible on one of the two subscales and to find out which aspects of imaginal vividness the relationship between IRI Fantasy and imaginal vividness stems from, these analyses were repeated for the individual subscales of the IVS.
The Character subscale of the IVS showed similar results for the relationship between instructions and imaginal vividness as were found on the overall scale: An initial comparison between our base model and the full model suggested a significant effect of type of instruction (F(2, 118) = 3.61, p = .03; see Figure 3L), but post-hoc comparison of the two experimental groups and the control group did not reveal any statistically significant differences (Mental Imagery instruction vs. Control: B = -0.30, SE = 0.20, df = 128, t = -1.47, p = .31; Secondary School Suitability instruction vs. Control: B = 0.22, SE = 0.20, df = 128, t = 1.10, p = .51). The group receiving the Mental Imagery instruction did score higher on the Character subscale of the IVS than the group receiving the Secondary School Suitability instruction (B = 0.51, SE = 0.20, df = 128, t = 2.58, p = .03). The same pattern was revealed for the Setting subscale of the IVS: an initial comparison between our base model and the full model suggested a significant effect of type of instruction (F(2, 118) = 8.35, p < .001; see Figure 3M), but post-hoc comparisons of the two experimental groups and the control group did not reveal any statistically significant differences (Mental Imagery instruction vs. Control: B = -0.40, SE = 0.19, df = 128, t = -2.07, p = .10; Secondary School Suitability instruction vs. Control: B = 0.35, SE = 0.19, df = 128, t = 1.86, p = .16). Again, the group receiving the Mental Imagery instruction scored higher on the Setting subscale of the IVS than the group receiving the Secondary School Suitability instruction (B = 0.75, SE = 0.19, df = 128, t = 3.93, p < .001).
Table 4: Coefficients of the models predicting reading experiences (absorption, vividness of mental imagery, story appreciation) based on type of instruction (mental imagery instruction and secondary school suitability instruction contrasted with the control instruction), taking into account Story (Story B contrasted with Story A), biological gender (male contrasted with female), age, self-reported reading habits, ART-score, and the Fantasy and Perspective Taking subscales of the IRI. Significant predictors are marked (* p < .05, ** p < .01, *** p < .001).
On both subscales of the Imaginal Vividness Scale, we found a positive relationship between scores on the Fantasy subscale of the IRI and imaginal vividness (see Figure 4L-M; see Table 4.2a-b). Similarly, differences between stories were found for both subscales of the IVS. After reading Story B, participants reported more vivid imagery of the characters in the story and of the settings described in the story.
The results on the Imaginal Vividness Scale suggest that our reading instructions indeed influenced the experienced vividness of mental imagery, with respect to both the characters in the story and the environment described in the stories. The imagery instruction was associated with more vivid mental imagery than the secondary school suitability instruction. This suggests that participants are able to follow these instructions while reading, and that they indeed target mental imagery, as intended. Apart from instructed mental imagery, we also found a significant role for the personality trait Fantasy in the experienced vividness of mental imagery.
To test whether mental imagery instructions would also have an impact on the appreciation of stories, we tested differences between groups in general appreciation and story appreciation as reported on the four components of the Appreciation Questionnaire (see Table 4.3a-e). Initial comparisons between the base models and the full models showed that the reading instructions influenced only the experienced Suspense (F(2, 117.90) = 3.95, p = .02; see Figure 3J [...] 057; note, however, that, although not statistically significant, this suggests that the participants receiving the secondary school suitability instructions experienced the stories they read as being somewhat more ominous and suspenseful, opposite to our expectations). The reading instructions did not have an effect on the four other aspects of appreciation.
Comparable to our findings for the SWAS, participants scoring higher on the Fantasy subscale of the IRI appreciated the stories they read more (General Appreciation; see Figure 4F; see Table 4.3a). Similarly, a positive association was found between IRI Fantasy and factor scores for Evoked Interest (see Figure 4G; see Table 4.3b) and the factor scores for Amusement (see Figure 4I; see Table 4.3d). Additionally, there was a negative association between age and factor scores for Amusement: older participants reported finding the stories less funny, witty or special (see Table 4.3d). Differences between stories were found for General Appreciation, Evoked Interest, Emotional Response, Amusement, and Suspense (see Table 4.3a-e). Story B was generally appreciated more, evoked more interest, elicited a stronger emotional response, was considered more Amusing, and more Suspenseful.
To test whether IRI Fantasy moderated the effect of instruction condition, we performed exploratory analyses in which we included an interaction term between IRI Fantasy and instruction in our models. There was a significant interaction between IRI Fantasy and the Secondary School Suitability instruction for three of the thirteen tested dependent variables ([...] SE = 0.14, df = 118.00, p = .004; [...] SE = 0.16, df = 116.5, p = .004; Suspense: B = 0.38, SE = 0.17, df = 117.40, t = 2.28, p = .02). The visualization of these interactions in Figure 5 shows that for Mental Imagery and Emotional Response, the relationship between IRI Fantasy and the dependent variable is attenuated when participants have to read with a reading instruction in mind (and mostly so if this is the Secondary School Suitability instruction). Conversely, the relationship between IRI Fantasy and Suspense seems only present in participants who received the Secondary School Suitability instruction, but not in the other groups. Although these results are interesting in themselves, we have to be careful with interpreting them, as this moderation only appears for a few of our dependent variables and does not follow a highly consistent pattern.

Discussion
The aim of this second experiment was to replicate the findings of the first experiment in a more controlled setting, with additional stories, and while considering an extra set of control variables (most importantly aiming at personality characteristics that might influence reading experiences). As can be seen from the results of this experiment, reading instructions only played a very minor role in defining reading experiences. Although we could see that, in particular, our Mental Imagery instruction did influence the reading experiences directly involving mental imagery, suggesting that our instruction was indeed successful in influencing mental imagery, the effect of this instruction did not translate to other reading experiences. The most notable statistically significant finding in this study was that the Fantasy subscale of the Interpersonal Reactivity Index appeared to be positively associated with all aspects of participants' reading experiences. Even though it may be possible to influence certain reading experiences through reading instructions, personality characteristics appear to be much more important in determining people's reading experiences. As was described above, the Fantasy subscale measures the extent to which someone has the tendency to get mentally involved in the stories they encounter by imagining themselves being part of the story or by trying to empathize with characters in the story. Together, the questions on this subscale of the IRI give an impression of the amount of "fantasy" with which participants experience fiction on a day-to-day basis.
Because of the theoretical relationship between this personality characteristic and reading experiences, it is interesting to find that this personality characteristic is indeed positively associated with reading experiences across the board (note that it is not just associated with absorption or mental imagery, but also with several aspects of how participants appreciated the stories). A question of causality remains, however: future studies will have to determine whether those who engage more in imagery during reading like the stories they read more as a consequence, or whether people who enjoy stories gradually become more imaginative as a result of reading (comparable to the question of causality in the study of the relationship between reading and theory of mind: do more empathic people read more, or does reading result in more empathy? See Panero et al., 2016; Samur, Tops, & Koole, 2017).
Finally, it is interesting to find that biological gender and reading habits, which influenced reading experiences in the first experiment reported in this paper, were not significantly associated with reading experiences in the second experiment. Just like the influence of reading instructions, the effects of biological gender and reading habits do not seem to be as important as personality characteristics in determining reading experiences. Note, however, that biological gender was far from balanced in this second experiment: only ~20% of participants were male, compared to ~50% in experiment 1. Therefore, no strong conclusions about the effects of biological gender can be drawn from the results of experiment 2. As opposed to the results in our previous study, age was negatively associated with amusement: older participants found the stories less funny, witty or special. Although this is interesting in itself, it should be noted that the majority of participants in this study were university students of about 21 or 22 years of age, and this effect might be due to a couple of outliers (only 7 participants were older than 30 years of age, with 3 of them being 55 years or older). That said, a previous study with a larger variation in age between participants (and a large number of participants being between 50 and 75 years old) also showed that older participants rated the stories they read as less literary and less beautiful than younger participants did (Hartung et al., 2017). Our results showed a few differences between the two stories with respect to the reading experiences they elicited. As we did not have any hypotheses regarding how the stories would differ, we will not interpret the results regarding the differences between the stories.

General Discussion
In this study we investigated the relationship between mental imagery and reading experiences. In particular, we were interested in the act of mental imagery during reading and whether differences between people in the extent to which they engaged in mental imagery were related to their reading experiences. To make sure that participants differed in the extent to which they engaged in mental imagery, they received reading instructions in which they were instructed to envision the stories as much as possible, to read as if they were reading for leisure, or to focus on surface characteristics of the stories (word use and sentence construction; only in experiment 2). Apart from mental imagery, we were interested in the role of stable or trait-like individual differences (such as reading habits, biological gender, age, education, and personality characteristics) in determining reading experiences.
Although experiment 1 suggested that mental imagery instructions, as well as level of education, biological gender, and reading habits, played a significant role in determining reading experiences, experiment 2 showed that after controlling for personality characteristics (in particular "fantasy") and adding an extra control condition, the association of mental imagery instructions, biological gender, and reading habits with reading experiences largely disappeared. This suggests that, among all the aspects involved in reading experiences, these experiences are most strongly influenced by personality characteristics, such as readers' proneness to "fantasy". Fantasy has been suggested to be one of the aspects underlying the "Openness" personality characteristic (Fayn et al., 2015), a characteristic that has also been found to be associated with reading experiences and reading habits in other studies (Kraaykamp & Eijck, 2005; Kuijpers et al., 2018; Schutte & Malouff, 2004). Mental imagery was mainly found to be related to mental imagery-related reading experiences, and not as strongly to other reading experiences. However, in a previous study mental simulation was found to be related to aspects of absorption and appreciation (Mak & Willems, 2019). The reason that this relationship was not found to be very strong in the current study could be that there is a difference between (explicit) mental imagery and (implicit) mental simulation (see Jacobs & Willems, 2018). Perhaps the explicit mental imagery the participants were instructed to perform in this study was too different from the implicit mental simulation elicited by stories during naturalistic reading, and was therefore relatively unrelated to reading experiences. This could also explain why submitting participants to a more implicit mental imagery training before reading did prove effective in increasing experienced transportation (Johnson et al., 2013).
Moreover, the interactions between the effects of fantasy and reading instructions on some of the tested reading experiences in experiment 2 even suggest that pre-reading instructions might in fact negatively influence naturalistic processes during reading. For both mental imagery as reported on the SWAS and the emotional response to the story, the positive relationship between fantasy and these reading experiences found in readers in the control group was attenuated in readers who received pre-reading instructions (regardless of the content of the instructions). It is therefore possible that having to remember and execute instructions during reading interferes with reading experiences as they would normally occur. However, the interactions that were found were not present for all aspects of reading experiences, and these analyses were highly exploratory, so further research should indicate whether this is indeed the case. Nonetheless, when studying subjective reading experiences, it seems wise to study only naturalistic reading instead of trying to influence reading experiences using pre-reading instructions.
Another explanation for the weak association between mental imagery and reading experiences in this study could be that readers differ greatly in the type of mental imagery they prefer during reading. Kuzmičová (2014) suggests four possible types of mental imagery during literary reading: enactment-imagery (where readers imagine themselves executing the actions described in the story), description-imagery (where readers visually imagine the objects and scenes described in the story), speech-imagery (where readers imagine hearing the narrator tell the story) and rehearsal-imagery (where readers imagine reading the story out loud). Kuzmičová suggests that readers differ in the type of imagery they perform during reading (and this can also differ from one story to the next within a given reader). Perhaps the instructions given in this experiment did not match the preferred type of imagery of some (or all) of the readers, resulting in weak effects of our mental imagery instruction on reading experiences.
A different possibility would be that mental imagery simply does not play a role in people's ability to become involved in a story. However, previous research has shown relationships between imagery and absorption, transportation, and appreciation or enjoyment of narratives, which does not fit with the proposal that imagery is unimportant in story involvement (Green, 2004; Green & Brock, 2002; Kuijpers et al., 2014; Mol & Jolles, 2014; Weibel et al., 2011). Another possibility would be that people are unable to perform mental imagery "on command". However, the fact that reading instructions were successful at inducing or reducing mental imagery in our participants contradicts this claim.
Overall, it seems that the use of mental imagery-inducing reading instructions does have a small influence on people's reading experiences. However, this effect pales into insignificance compared to the effect of personality characteristics. Perhaps a single reading instruction is insufficient for altering reading experiences: to really enhance reading experiences, readers may have to be trained intensively to read in a different way. For instance, Janssen, Braaksma and Couzijn (2009) found that students' appreciation for stories increased after an intervention where they were allowed to come up with their own questions about the stories (as opposed to answering a teacher's questions). Perhaps a comparable intervention, but instead aimed at mental imagery, might have a stronger influence on reading experiences than a single instruction, and may perhaps prove more powerful in overcoming personality characteristics (but see De Koning & van der Schoot, 2013).

Data Accessibility Statement
The participant data and analysis scripts can be found on this paper's project page on the Open Science Framework (osf.io/98ntg/). The stories used as materials in this experiment are from professional authors and are copyrighted.

Notes
1 Mental imagery during reading is also sometimes referred to as mental simulation. Theoretically, mental simulation is a somewhat more subconscious process than mental imagery. For the sake of clarity, we will call the process of imagining events or perceptible elements of the story world described in a story (more or less consciously) mental imagery throughout this article.
2 Note that when looking at the literature regarding mental imagery during reading it becomes clear that people seem not just to form mental images of descriptions of visual elements of story worlds, but of a much more extensive range of perceptual descriptions (i.e. auditory, olfactory, proprioceptive) and motor descriptions (Kuzmičová, 2012; Kuzmičová, 2014; Mak & Willems, 2019; Nijhof & Willems, 2015). Although the exact nature of mental imagery during reading is still debated, in this paper we do not want to focus on the content of mental imagery during reading, but instead on the act of mental imagery itself.
3 Note, however, that Kuiken and Douglas (2017) proposed that absorption is even more multidimensional, suggesting multiple types of absorption, with multiple different outcomes.
4 Some participants still found these questions "out of the blue". However, the majority of the participants did not report being surprised by these questions. Furthermore, this questionnaire was only presented at the end of the experiment, after all the questionnaires regarding reading experiences had already been filled in, and will therefore not have affected answers on the questionnaires that were of interest for our analyses.
5 We did not have enough observations in our dataset to support any random slopes (let alone a maximal model with random slopes for all predictor variables). Therefore, we decided that it was most appropriate to use a random intercept only model.
6 Note that the items "deeply moving" and "suspenseful" load strongly (above .40) on more than one component. This is due to the nature of PCA as an unsupervised dimension reduction method. Items loading strongly on a component are considered "typical" items for the component, and can be used for the interpretation of the components. However, every item will load on every component, and the loadings of all items (both items strongly associated with the components and items weakly associated with the components) are taken into account when calculating the component or factor scores of all components. The strong association of the items "deeply moving" and "suspenseful" with more than one component therefore does not have important consequences for the calculation of the component scores, but only for the theoretical interpretation of the components.
7 A possible concern with our analyses would be that the lack of effect of reading instructions is due to the other predictors (the Fantasy and Perspective Taking subscales of the IRI, scores on the Author Recognition Test) absorbing so much variance that effects of reading instructions would not become visible. To check this, we ran reduced models that are comparable to the models used in Experiment 1 (without extra predictors, but including a random intercept for Participant and a predictor for Story, to control for participant and story effects). Although there were some minor changes in the effect of reading instruction on some of the dependent variables (i.e. SWAS Mental Imagery, Mean Imaginal Vividness, the Setting subscale of the Imaginal Vividness scale, and the Suspense component of the Appreciation Questionnaire), this concerned more pronounced effects rather than effects we failed to find with the models that did include the additional measures. As the latter models were significantly better than the reduced models (based on AIC, BIC and LogLikelihood), we chose to report only the results for the complete models.
8 Note that it is possible that the findings with regard to the vividness of mental imagery are due to experimental demand, as the mental imagery instruction specifically asked participants to increase their mental imagery. However, we think this is not the most likely explanation of our results. If the imagery findings were entirely due to experimental demand, we would have expected that both the secondary school instruction group and the control group would differ from the mental imagery instruction group with respect to reported vividness of mental imagery (since neither the secondary school suitability instruction nor the control instruction mentioned mental imagery). Instead, the secondary school suitability instruction lowered reported mental imagery somewhat with respect to the control group, and the mental imagery instruction increased reported mental imagery somewhat, resulting in a significant difference between the secondary school suitability group and the mental imagery instruction group (with the control group being somewhere in between). Although it is possible that the decrease in mental imagery in the secondary school suitability group is coincidental, and the increase in mental imagery in the imagery instruction group is due to experimental demand, it seems more likely that both instructions had a (small) influence on mental imagery, but in opposite directions. Importantly, our general conclusions remain the same, regardless of which explanation of the results is true.
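As an illustration of the model comparison mentioned in Note 7, the sketch below shows how AIC and BIC are derived from a model's log-likelihood, using the standard definitions AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L. The log-likelihoods, parameter counts, and number of observations are hypothetical and are not taken from our models; lower values indicate the preferred model.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian Information Criterion: k ln(n) - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical fits, for illustration only (not values from this study).
N_OBS = 240
reduced = {"loglik": -350.0, "k": 6}   # instruction + story + random intercept
complete = {"loglik": -320.0, "k": 9}  # adds the trait predictors

if __name__ == "__main__":
    for name, fit in [("reduced", reduced), ("complete", complete)]:
        print(
            f"{name}: AIC = {aic(fit['loglik'], fit['k']):.1f}, "
            f"BIC = {bic(fit['loglik'], fit['k'], N_OBS):.1f}"
        )
```

With these assumed values the complete model has the lower AIC and BIC, despite the penalty for its additional parameters.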