Do questionnaires reflect their purported cognitive functions?

Highlights • Imagery and navigation questionnaires reflect their purported cognitive functions.• Memory questionnaires reflect autobiographical memory vividness.• Episodic details and memory questionnaires measure different aspects of memory.• Imagery questionnaires also correlated with memory vividness and future thinking.• Single questions modelled performance comparably to established questionnaires.


Introduction
Questionnaires are used widely across psychology. They offer the opportunity to gain valuable insights into a person's subjective thoughts, beliefs and meta-cognition which are not easy to derive from task performance measures alone. Even where performance on cognitive tasks or measures of brain activity are the main focus of investigation, questionnaires nevertheless often also play a role (Arnold, Iaria, & Ekstrom, 2016;Auger, Zeidman, & Maguire, 2017;Belardinelli et al., 2009;Monaghan et al., 2015;Sheldon, Farb, Palombo, & Levine, 2016;Silani et al., 2008). Moreover, in some branches of psychology, questionnaire studies outnumber those involving the direct measurement of cognition and behaviour (e.g. Baumeister, Vohs, & Funder, 2007).
Given the importance of capturing subjective data and the widespread use of questionnaires, it is vital that such instruments map onto the cognitive functions they purport to reflect. The construction and validation of an effective questionnaire often includes consideration of the questions themselves via factor analyses and principal component loadings, examination of internal validity, test-retest reliability and, where possible, confirmatory analyses against existing measures, be they from relevant cognitive tests or from previously-validated questionnaires (Andrade, May, Deeprose, Baugh, & Ganis, 2014;Blazhenkova & Kozhevnikov, 2009;Hegarty, Richardson, Montello, Lovelace, & Subbiah, 2002).
However, when the focus is on naturalistic aspects of cognition, it can be significantly challenging, and not always possible, to test a questionnaire against performance measures. For example, our interest here is in visual imagination (the process of forming and visualising images in the absence of visual input), autobiographical memory (the recall of past events from one's life), future thinking (imagining future experiences) and spatial navigation (the process of ascertaining one's position, and planning and following a route). These cognitive functions can be time consuming and resource-heavy to examine and score (e.g. Hassabis, Kumaran, Vann, & Maguire, 2007;Levine, Svoboda, Hay, Winocur, & Moscovitch, 2002;Woollett & Maguire, 2010). This is especially true when large sample sizes are involved. Aspects of these cognitive functions can be assessed quickly and easily in the laboratory using simplified proxy tests, such as memory for a word list in place of autobiographical memory. However, the exact relationship between laboratory-based and naturalistic tasks is often unknown and, in the case of memory, for example, the two task types engage distinct brain networks (McDermott, Szpunar, & Christ, 2009). Thus, if the relationship between the tasks themselves is not clear, then generalising from questionnaires based on proxy tasks to more complex real-world tasks could be problematic.
Given the nature of mental imagery and imagination, questionnaires have been a mainstay of investigation. Across the literature, the classic measure of mental imagery ability is that of vividness. A particularly popular questionnaire is the Vividness of Visual Imagery Questionnaire (VVIQ; Marks, 1973Marks, , 1995, which has been correlated with many measures including neural activity in visual areas of the brain during imagination (Cui, Jeter, Yang, Montague, & Eagleman, 2007;D'Argembeau & Van der Linden, 2006;Finke & Kosslyn, 1980;Hanggi, 1989;Miller et al., 1987). An updated and extended version of the VVIQ is the Plymouth Sensory Imagery Questionnaire (PSIQ; Andrade et al., 2014). As well as measuring the vividness of visual imagery, the PSIQ also examines the vividness of imagery outside of the visual domain. The PSIQ has high reliability and is strongly correlated with the VVIQ (Andrade et al., 2014).
In addition to vividness, an individual's general use of visual imagery is also often of interest, and can be captured using the Spontaneous Use of Imagery Scale (SUIS; Reisberg, Pearson, & Kosslyn, 2003). The SUIS has good reliability and convergent validity against other imagery questionnaires (e.g. Nelis, Holmes, Griffith, & Raes, 2014), and has predominately been used to study the relationship between imagery and psychopathology (Pearson, Deeprose, Wallace-Hadrill, Heyes, & Holmes, 2013). It has also been argued that there are multiple types of imagery, and this is believed to be captured in the Object-Spatial Imagery and Verbal Questionnaire (OSIVQ; Blazhenkova & Kozhevnikov, 2009). Indeed, when associated with laboratory visual imagery tasks, the subscales of the OSIVQ show different patterns of associations (Blazhenkova & Kozhevnikov, 2009;Sheldon, Amaral, & Levine, 2017;Vannucci, Mazzoni, Chiorri, & Cioli, 2008).
There are, therefore, multiple visual imagery questionnaires, which correlate highly with each other and with laboratory measures of visual imagery ability. However, to the best of our knowledge, for the widely used questionnaires highlighted above, it is currently unknown if they are associated with performance measures of real world imagination (see also McAvinue & Robertson, 2007).
Considering next autobiographical memory, multiple questionnaires have been developed in an attempt to capture memory ability. However, despite their widespread use, the utility of memory questionnaires for assessing memory performance has been questioned. A review of 14 memory questionnaires (one of which we also use herethe Subjective Memory Questionnaire; Bennett-Levy & Powell, 1980) found that while responses to the questionnaires were reliable and stable, there were limited relationships between any of the memory questionnaires and laboratory measures of memory performance (Herrmann, 1982). No investigations, however, were performed using naturalistic tests of autobiographical memory, which may show different associations, given their recruitment of distinct brain regions (McDermott et al., 2009).
Since the Herrmann (1982) review, new questionnaires have been developed which may better reflect memory ability. The Memory Experience Questionnaire (Sutin & Robins, 2007) focuses on how memories can differ on a number of phenomenological dimensions. These dimensions have been used to examine, for example, how visual perspective relates to other memory phenomenology (Sutin & Robins, 2010), and to assess correlates of memory and positive self-esteem (D'Argembeau & Van der Linden, 2008). However, to the best of our knowledge, there has been no study examining how the dimensions of this questionnaire relate to autobiographical memory recall.
In contrast, the Survey of Autobiographical Memory (SAM; Palombo, Williams, Abdi, & Levine, 2013), contains two subscales relating to memory recall that purport to reflect autobiographical episodic and semantic memory. In support of these subscale definitions, the SAM Episodic subscale (and not the Semantic) was found to correlate with the recollection of scene pictures, and mean scores on this subscale were lower in participants with a history of depression, in line with the literature. However, of key relevance to the current study, in a sub-analysis of 52 participants reported by Palombo et al. (2013), the SAM Episodic subscale was not associated with the number of episodic ('internal') details generated during the Autobiographical Interview, one of the most commonly used measures of real-world autobiographical memory recall (Levine et al., 2002). How well the SAM Episodic subscale reflects autobiographical memory recall is, therefore, unclear. Overall, the link between memory questionnaires and autobiographical memory recall is questionable, and requires additional exploration.
While imagery and memory questionnaires are common across the literature, to the best of our knowledge, there are no questionnaires designed specifically to examine future thinking. Instead, questionnaires have, for example, looked at whether individuals are more biased towards thinking about the past, present or future (Zimbardo & Boyd, 1999) or the consequences of future actions (Strathman, Gleicher, Boninger, & Edwards, 1994). However, the SAM questionnaire mentioned above (Palombo et al., 2013) also contains a "Future subscale", with items believed to represent future thinking ability and phenomenology. However, while these items were separated from that of the other subscales in the SAM by a multiple correspondence analysis (a form of factor analysis), the Future subscale is yet to be associated with any measure of future thinking.
In contrast to imagination, autobiographical memory and future thinking, navigation questionnaires have been well validated against real world navigation tasks. One of the most popular is the Santa Barbara Sense of Direction Scale (Hegarty et al., 2002). This questionnaire was initially validated in a large sample of over 200 participants against various measures of their sense of direction in environments learned from direct exposure, virtual environments on a computer screen, and videotaped real world environments (Hegarty, Montello, Richardson, Ishikawa, & Lovelace, 2006;Hegarty et al., 2002). A strong relationship between the Santa Barbara Sense of Direction Scale and navigation ability tested in real and virtual worlds has also been replicated across multiple other studies, albeit often in smaller samples (e.g. Hund & Padgitt, 2010;Weisberg, Schinazi, Newcombe, Shipley, & Epstein, 2014).
The SAM questionnaire, mentioned above (Palombo et al., 2013), also has a fourth subscale designed to represent spatial ability. In an online study of over 300 participants familiar with downtown Toronto, scores on the SAM Spatial subscale significantly correlated with the ability to judge distances and spatial relationships between downtown Toronto landmarks, while the Episodic, Future and Semantic subscales did not (Selarka, Rosenbaum, Lapp, & Levine, 2019). Both the Santa Barbara Sense of Direction Scale and the Spatial subscale of the SAM seem, therefore, to reflect their purported cognitive function of navigation.
In summary, when considering imagination, autobiographical memory and future thinking, there is a surprising lack of validation of questionnaire data against naturalistic tasks. This is despite the questionnaires being widely used in the literature as a proxy for real-world behaviour. By contrast, navigation questionnaires, in particular the Santa Barbara Sense of Direction Scale, have been widely validated against real-world navigation, showing a strong relationship.
The focus of the current study is the relationship between performance on naturalistic cognitive tasks and questionnaire measures purporting to reflect the cognitive functions being assessed in those tasks. In a recent study, involving the same participants that we report on here, we examined the relationships between imagination, autobiographical memory, future thinking and navigation in terms of task performance (Clark et al., 2019). We found that scores across the four tasks were related, although navigation was more closely aligned with other (laboratory-based) spatial tasks. In that study we also tested whether or not there was a common cognitive process underpinning performance on these tasks. It has been suggested that autobiographical memory provides the building blocks for thinking about the future and imagining fictitious atemporal events (e.g. Moscovitch, Cabeza, Winocur, & Nadel, 2016;Schacter et al., 2012). An alternative proposition is that imagination, recalling the past, thinking about the future and navigation require the mental construction of scene imagery Maguire & Mullally, 2013). In this context, a scene is a naturalistic three dimensional spatially coherent representation of the world typically populated by objects and viewed from an egocentric perspective. Using mediation analyses, we identified that scene construction, and not autobiographical memory, was a key process linking performance on the four tasks (Clark et al., 2019). This association was stronger between imagination, autobiographical memory and future thinking than with navigation. However, the sole focus of Clark et al. (2019) was on the task performance data. The question arises, therefore, as to whether a similar influence of visual (scene) imagery would also be evident in questionnaires purporting to reflect the cognitive functions underpinning these tasks.
Several previous studies have alluded to a crossover between imagery questionnaires and autobiographical memory, future thinking and navigation. Comparing high and low imagery-reporting participants, Vannucci, Pelagatti, Chiorri, and Mazzoni (2016) found that high imagery participants generated a greater number of autobiographical memories, and scored these memories more highly on subjective ratings of vividness and richness of detail. Additionally, D'Argembeau and Van der Linden (2006) observed that self-reported imagery correlated with subjective reports of visual and sensory details when thinking about both past and future events. Interestingly, two participants who claimed to have a complete lack of visual imagery also reported a low sense of reliving when recalling autobiographical memories (Greenberg & Knowlton, 2014). For navigation, a more complex picture has emerged, where benefits have been apparent for participants reporting higher visual over verbal abilities, but also for those with the opposite pattern (Kraemer et al., 2017;Pazzaglia & Moè, 2013).
In summary, how questionnaires designed for one of our cognitive functions of interest relate to the other cognitive functions is unknown, and further investigation is required to identify whether there are, hitherto hidden, links between them.
A separate potential issue with questionnaires is their scope. Researchers often want to obtain a broad profile of a participant. However, questionnaires are typically designed to probe a specific aspect of cognition (e.g. imagery vividness: Andrade et al., 2014;navigation: Hegarty et al., 2002). As such, questionnaires ask multiple questions about the same cognitive function or behaviour. A standard questionnaire typically takes around 5-10 min to complete. Therefore, if seeking a wide-ranging profile over multiple aspects of cognition (and thus multiple questionnaires), a battery of questionnaires can easily take an hour. Yet, if a general overview is what is required, then indepth information on each topic is not necessary. Instead, could a set of single questions on each topic (e.g. "Please rate your ability to construct a mental image"; "Please rate your navigational ability") provide sufficient information to broadly model behaviour? If so, an individual profile could be collected using a few questions, taking only minutes. To the best of our knowledge, no such questionnaire exists.

The current study
The current study asked three main questions. First, do questionnaires reflect their purported cognitive functions? Second, is it the case that a visual imagery or autobiographical memory process is actually what is being assessed in questionnaires despite their purporting to be related to distinct cognitive functions? Third, how does a single question probing a cognitive function perform relative to a whole questionnaire? To address these questions we collected questionnaire and naturalistic task performance measures of imagination, autobiographical memory, future thinking and navigation from a large sample of participants (n = 217). The questionnaires were well established and provided detailed information about each cognitive domain of interest. We also included a bespoke, short, One Sentence Questionnaire designed for the current study. This questionnaire contained just 15 questions covering imagination, autobiographical memory, future thinking and navigation. For the task performance measures we used established, published tasks designed to assess each cognitive function in a naturalistic manner.
We considered the four cognitive functions in turn -imagination, autobiographical memory, future thinking, and navigation. For each, we started with the questionnaires most directly relevant to a cognitive function (i.e. the questionnaires on imagery for the imagination task; the questionnaires on memory for the autobiographical memory task, and so on) and examined how they related to task performance. We then investigated whether questionnaires not associated with a task (e.g. the questionnaires on imagery for the autobiographical memory task) were correlated with task performance. This allowed us to investigate whether an imagery or autobiographical memory process might be what is being tapped by the questionnaires rather than them being specific to a particular cognitive function. Finally, we assessed the ability of the questions of the One Sentence Questionnaire to predict performance on each task and how this compared to the established questionnaires, as before looking first at the questions most directly relevant, and then at those supposedly not associated with a task.
Considering our first question, we expected questionnaires to be related to their purported cognitive functions that would be evidenced by significant correlations between questionnaire and task performance data. For the second question, in our previous study (Clark et al., 2019) we reported that the construction of scene imagery explained the relationships between the performance data from tasks assessing imagination, autobiographical memory, and future thinking, with navigation being partly related. Consequently, we hypothesised that the imagery questionnaires would be significantly correlated with imagination, autobiographical memory and future thinking performance, and perhaps partly with navigation performance. Furthermore, we hypothesised that the imagery questionnaires might even correlate with the task performance data to a greater extent than the questionnaires that actually purport to closely model a particular cognitive function. In relation to the novel One Sentence Questionnaire, as this was an exploratory aspect of the experiment, we had an open mind about whether or not it would reflect the task performance data as effectively as the longer questionnaires.

Participants
Two hundred and seventeen people took part in the study. They were aged between 20 and 41 years, had English as their first language and reported no psychological, psychiatric, neurological or behavioural health conditions. The age range was restricted to 20-41 to limit the possible effects of ageing. The mean age of the sample was 29.0 years old (95% CI; 20, 38) and included 109 females and 108 males.
Participants were reimbursed £10 per hour for taking part which was paid at study completion. All participants gave written informed consent and the study was approved by the University College London Research Ethics Committee.
The guidelines described by Cohen (1992) were used to establish sample size during study design. A sample of 216 participants was determined to be robust to employing different statistical approaches when answering multiple questions of interest. For example, it allowed for sufficient power to identify medium effect sizes of correlations at alpha levels of 0.01, medium effect sizes when comparing two correlations at alpha levels of 0.05 and medium effect sizes with eight variables in multiple regressions at alpha levels of 0.01 -the statistical tests used in the current paper. In addition, although not relevant here, the sample size was large enough to identify medium effect sizes across multiple groups using ANOVAs at alpha levels of 0.01 and to conduct mediation analyses and structural equation modelling (Anderson & Gerbing, 1988) as used in Clark et al. (2019). A final sample of 217 was obtained due to over recruitment.

Procedure
Participants completed the questionnaires before any of the cognitive tasks. The cognitive tasks were conducted over multiple visits. The order of the tasks within each visit was the same for all participants (see Clark et al., 2019). Task order was arranged so as to avoid task interference, for example, not having a verbal test followed by another verbal test, and to provide sessions of approximately equal length (∼3-3.5 h, including breaks). All participants completed all parts of the study.

Questionnaires
Participants filled out a range of well-established, published questionnaires that purported to relate to one or more of the cognitive functions of interest in the current study. We briefly describe each questionnaire in alphabetical order and, where relevant, highlight particular subscales of interest. Sutin & Robins, 2007) This assesses the phenomenology of autobiographical memory across different dimensions. The original questionnaire asks participants to focus on a specific past event. For our purposes, this was adapted to concern the recall of autobiographical memories in general.

Memory Experiences Questionnaire (MEQ;
The full questionnaire examines ten dimensions (Accessibility, Coherence, Distancing, Emotional Intensity, Sharing, Sensory (non-visual) details, Time Perspective, Valence, Visual Perspective and Vividness). Here, we focussed the analyses on the Accessibility, Coherence, Sharing and Vividness subscales as they were most relevant for the current study. These subscales consist of varying numbers of statements which participants rate on a 5 point scale from 1 (strongly disagree) to 5 (strongly agree). Scores are calculated for each subscale by totalling the responses to each statement within the subscale. Questions from all subscales are intermixed throughout the questionnaire.
2.3.1.1. MEQ Accessibility. This subscale consists of five statements. Two are positively scored ("Memories are easy for me to recall") and three are reverse scored ("It is difficult for me to think of past events"). The total score is out of 25.

MEQ Coherence.
This subscale comprises eight statements. Four are positively scored ("I recognize the setting in which my memories take place") and four are reverse scored ("I have a difficult time remembering events in a coherent manner"). The total score is out of 40.
2.3.1.3. MEQ Sharing. This subscale consists of six statements. Three are positively scored ("I frequently think about or talk about past event with others") and three are reverse scored ("I rarely tell others about my memories"). The total score is out of 30.

MEQ Vividness.
This subscale consists of six statements. Three items are positively scored ("My memory for events is very vivid"), and three are reverse scored ("My memory for events is dim"). The total score is out of 30.

2.3.2.
Object-Spatial Imagery and Verbal Questionnaire (OSIVQ; Blazhenkova & Kozhevnikov, 2009) This questionnaire is designed to distinguish between different types of imagery users and has three subscales, two related to imagery and one to verbal processing. The Object subscale measures the ability to imagine vivid and detailed images of objects and scenes. The Spatial subscale measures the ability to process locations, movement and transformations, often represented by more technical and schematic imagery. The Verbal subscale measures the use of verbal strategies.
Here, for complete clarity, we renamed the Object subscale "Object-Scene" in order to better represent what the scale is designed to measure. Object imagery typically suggests an image of an object devoid of any background. However, as stated by Blazhenkova and Kozhevnikov (2009), the Object subscale is not limited to individual objects but can also refer to imagery of patterns and scenes, characterising their colour, vividness, shape and details.
Each of the three subscales of the OSIVQ contains 15 statements. The participant is asked to rate each statement on a five point scale from 1 (totally disagree) to 5 (totally agree). The final score on each subscale is the average of the responses over the 15 items.

OSIVQ Object-Scene.
For this subscale, statements include: "When reading fiction, I usually form a clear and detailed mental picture of a scene or room that has been described" and "I can close my eyes and easily picture a scene that I have experienced". No statements are reverse scored.

OSIVQ Spatial.
For this subscale, statements include: "My images are more like schematic representations for things and events rather than detailed pictures" and "I can easily sketch a blueprint for a building that I am familiar with". One statement in the subscale is reverse scored ("I find it difficult to imagine how a three-dimensional geometric figure would exactly look like when rotated").

OSVIQ Verbal.
For this subscale, statements include: "When remembering a scene, I use verbal description rather than mental pictures" and "I am always aware of sentence structure". Three statements in the subscale are reverse scored (e.g. "I have difficulty expressing myself in writing").

Plymouth Sensory Imagery
Questionnaire, Appearance subscale (PSIQ; Andrade et al., 2014) The PSIQ is a modernised version of the commonly used Vividness of Visual Imagery Questionnaire (VVIQ; Marks, 1973Marks, , 1995, with high correlations between the two questionnaires (Total score on the PISQ with VVIQ, r = 0.66, p < 0.001; PSIQ Appearance subscale with the VVIQ, r = 0.51, p < 0.001; Andrade et al., 2014). The PSIQ has an advantage over the VVIQ in also measuring imagery across multiple sensory modalities. Here, however, we focus solely on the Appearance subscale as this pertains specifically to the vividness of visual imageryour area of interest. The subscale requires participants to imagine three scenarios: a bonfire, a sunset and a cat climbing a tree. They then rate the visual image they generated on an 11 point scale from 0 (no image at all) to 10 (vivid as real life). Scores on the three scenarios are summed to create a total score out of 30. (Hegarty et al., 2002) This questionnaire assesses spatial and navigational abilities, preferences and experiences. Fifteen statements are presented, with participants indicating their level of agreement with each statement. Ratings are made on a 7 point scale from 1 (strongly agree) to 7 (strongly disagree). Seven statements are positively coded ("I am very good at giving directions") and eight are reverse scored ("I don't have a very good "mental map" of my environment"). Scores are summed across the 15 statements and are then reversed to create a total score out of 105 (so that a high score reflects good navigation ability).

Santa Barbara Sense of Direction Scale
2.3.5. Spontaneous Use of Imagery Scale (SUIS; Reisberg et al., 2003) This questionnaire, consisting of 12 statements, measures how frequently an individual uses visual imagery. Participants read each statement and indicate the degree to which the statement is appropriate to them. Each statement is rated on a 5 point scale from 1 (never) to 5 (always). Example statements include: "If I am looking for new furniture in a store, I always visualise what the furniture would look like in particular places in my home" and "When I hear a radio announcer or DJ I've never actually seen, I usually find myself picturing what they might look like." Scores are summed across the 12 items to give a final score out of 60.
2.3.6. Subjective Memory Questionnaire (SMQ; Bennett-Levy & Powell, 1980) This questionnaire probes memory for things people often try to remember. It is split into two sections. First, is a list of 36 items ("telephone numbers"; "jokes"; "birthdays") which participants rate in response to the question "How good is your memory for…?" Answers are given on a 5 point scale from 1 (very bad) to 5 (very good). The second section asks the question "How often do you…?" in relation to seven experiences (e.g. "Set off to do something, then can't remember what"; "Forget whether or not you have locked up the house"). Answers are provided on a 5 point scale from 1 (very rarely) to 5 (often). The total score is the sum of all responses (out of 215). Palombo et al., 2013) This questionnaire assesses episodic and semantic aspects of autobiographical memory, as well as future thinking and spatial memory. There are 26 items in total. Participants respond on a five point scale regarding the extent to which they agree with each statement, from 1 (completely disagree) to 5 (completely agree). Scoring is determined via a weighting system. Responses are weighted and calculated together to provide an average score that centres around 100, like an IQ. Full details of the weighting and scoring procedure are available from the SAM authors.

Survey of Autobiographical Memory (SAM;
2.3.7.1. SAM Episodic. This subscale assesses autobiographical memory recall. It contains 8 statements ("I am highly confident in my ability to remember past events"), two of which are reversed scored ("Specific events are difficult for me to recall").
2.3.7.2. SAM Future. This subscale examines a participant's ability to imagine future events. It contains 6 statements ("When I imagine an event in the future, the event generates vivid mental images that are specific in time and place"), one of which is reverse scored ("I have a difficult time imagining specific events in the future").
2.3.7.3. SAM Spatial. This subscale assesses navigation ability. It contains 6 statements ("In general, my ability to navigate is better than most of my family/friends"), two of which are reverse scored ("I get lost easily, even in familiar areas").

SAM Semantic.
This subscale probes the ability to recall facts and information. It contains 6 statements ("I can learn and repeat facts easily, even if I don't remember where I learned them"), two of which are reverse scored ("I have a hard time remembering information I have learned at school or work"). (Kirby, Moore, & Schofield, 1988) This questionnaire assesses an individual's preference for visual or verbal learning styles. Twenty statements are provided, 10 corresponding to visual items and 10 to verbal items. The participant indicates for each item whether, for them, the statement is true or false. Half of the statements are phrased positively, in that an answer of "true" reflects a visual or verbal learning preference (visual learning style: "The old saying 'A picture is worth a thousand words' is certainly true for me"; verbal learning style: "I have better than average fluency in using words"). The other half are phrased negatively where an answer of "false" reflects a visual or verbal learning preference (visual learning style: "I seldom use diagrams to explain things"; verbal learning style: "I dislike word games like crossword puzzles"). The two scales are scored separately. The final score (out of 10 for each scale) is the number of responses reflecting the stated learning style.

Questionnaire groups
We divided the established questionnaires into five groups, four of which related to our areas of interest: imagination, autobiographical memory, future thinking and navigation. The fifth group was composed of questionnaires relating to verbal and semantic processing. This was included as a control to ensure that questionnaires (regardless of subject matter) were not associated with the tasks for spurious reasons. The questionnaires in each group are shown in Table 1.

Experimental 'One Sentence Questionnaire'
This questionnaire is new, and in developing and testing it we aimed to gain a broad profile of an individual in a short time frame. It consists of 15 questions covering the four areas of cognition that were of interest in this study -imagination, autobiographical memory, future thinking and navigation (see the Supplementary Materials for the questionnaire).

Imagery Ability
"Please rate your ability to construct a mental image". Answers are on a 7 point scale from 1 (very high) to 7 (very low). This is reverse scored so that a high ability is a high score.

Imagery Use
"In everyday life, how much do you think in images (e.g. thinking in pictures in your mind)?" Answers are on a 7 point scale from 1 (not at all) to 7 (all the time).

Imagery as a Scene
"If you think in images, to what extent does this involve spatially coherent scenes (e.g. scenes that you could step into or operate within) compared to single objects?" Answers are on a 7 point scale from 1 (single objects) to 7 (coherent scenes).

Memory Ability
"Please rate your ability to remember your personal past". Answers are on a 7 point scale from 1 (very high) to 7 (very low). This is reverse scored so that a high ability is a high score.

Memory in Imagery
"When recalling the past, to what extent do you think in images?" Answers are on a 7 point scale from 1 (not at all) to 7 (all the time).

Memory in Scene Imagery
"If you think in images when recalling the past, to what extent do you evoke spatially coherent scenes in your mind's eye, compared to imagining single objects?" Answers are on a 7 point scale from 1 (single objects) to 7 (coherent scenes).

Memory in Words
"When recalling the past, how much do you think verbally (e.g. thinking in words and sentences)?" Answers are on a 7 point scale from 1 (not at all) to 7 (all the time).

Future Thinking Ability
"Please rate your ability to imagine future events". Answers are on a 7 point scale from 1 (very high) to 7 (very low). This is reverse scored so that a high ability is a high score.

Future Thinking in Imagery
"When imagining the future, to what extent do you think in images?" Answers are on a 7 point scale from 1 (not at all) to 7 (all the time).

Future Thinking in Scene Imagery
"If you think in images when imagining the future, to what extent do you evoke spatially coherent scenes in your mind's eye, compared to imagining single objects?" Answers are on a 7 point scale from 1 (single objects) to 7 (coherent scenes).

Future Thinking in Words
"When imagining the future, how much do you think verbally (e.g. thinking in words and sentences)?" Answers are on a 7 point scale from 1 (not at all) to 7 (all the time).

Navigation Ability
"Please rate your navigational ability". Answers are on a 7 point scale from 1 (very high) to 7 (very low). This is reverse scored so that a high ability is a high score.

Navigation in Imagery
"When you navigate, to what extent do you think in images?" Answers are on a 7 point scale from 1 (not at all) to 7 (all the time).

Navigation in Scene Imagery
"If you think in images when navigating, to what extent do you evoke spatially coherent scenes in your mind's eye, compared to imagining single objects?" Answers are on a 7 point scale from 1 (single objects) to 7 (coherent scenes).

Navigation in Words
"When navigating, how much do you think verbally (e.g. thinking in words and sentences)?" Answers are on a 7 point scale from 1 (not at all) to 7 (all the time).
2.6. Cognitive tasks 2.6.1. Imagination -the scene construction test  This test measures a participant's ability to mentally construct an atemporal visual scene, meaning that the scene is not grounded in the past or the future. Participants construct seven different scenes of commonplace settings. For each scene, a short cue is provided ("Imagine lying on a beach in a beautiful tropical bay"), which the participant is asked to imagine and then describe out loud in as much detail as possible. All descriptions are audio recorded and transcribed for scoring. Participants are explicitly told not to describe a memory, but to create a new atemporal scene that they have never experienced before.
The main overall outcome measure is an "experiential index" which is calculated for each scene and then averaged. In brief, it is composed of four elements: numerical scoring of the content, participant ratings of their sense of presence (how much they felt like they were really there) and perceived vividness, participant ratings of the spatial coherence of the scene, and an experimenter rating of overall quality of the scene (see Supplementary Materials for full details). Double scoring was performed on 20% of the data. We took the most stringent approach to identifying across-experimenter agreement. Inter-class correlation coefficients, with a two-way random effects model looking for absolute agreement, indicated excellent agreement among the experimenter ratings (minimum score of 0.90; see Supplementary Materials, Table  S1). For reference, a score of 0.8 or above is considered excellent agreement beyond chance. Levine et al., 2002) The AI asks participants to recall and describe autobiographical memories from a specific time and place over four time periods -early childhood (up to age 11), teenage years (from 11 to 17 years of age), adulthood (from aged 18 up to 12 months prior to the interview; two memories were requested) and the last year (a memory from the last 12 months).

Autobiographical memory -the Autobiographical Interview (AI;
Memories provided from the AI are scored to collect "internal" and "external" details of the event (see Supplementary Materials for details). Internal details are those describing the event in question (i.e., episodic details). External details describe semantic information concerning the event, or non-event information. The main outcome measure is the number of internal details as a percentage of the total number of utterances (i.e., combined internal and external details). This provides a measure of episodic memory, while taking into account a participant's verbosity. Overall AI scores were obtained by averaging performance across the five autobiographical memories. Double scoring found excellent agreement across the experimenters (minimum score of 0.81; see Supplementary Materials, Table S2).
As a secondary measure, we also examined the vividness of the memories generated. This was a participant rating performed for each memory. Participants answered the question "How clearly can you visualise this event?" on a 6 point scale from 1 (vague memory, no recollection) to 6 (extremely clear as if it's happening now). Responses were averaged across the five memories.
2.6.3. Future thinking  The future thinking task follows the same procedure as the scene construction task, but requires participants to imagine three plausible future scenes involving themselves (an event at the weekend; next Christmas; the next time they meet a friend). There are two main differences between the future thinking task and the scene construction task. First, unlike scenes in the scene construction task, scenes in the future thinking task involve 'mental time travel' to the future, so they have a clear temporal dimension. Second, the cues for the scene construction tasks are somewhat more specific than those employed in the future thinking task (see Hurley, Maguire, & Vargha-Khadem, 2011 for more on this issue). Scoring procedures are the same as for the scene construction task. Double scoring identified excellent agreement across the experimenters (minimum score of 0.88; see Supplementary Materials, Table S3).
2.6.4. Navigation (Woollett & Maguire, 2010) In this test, navigation ability is assessed using movies of navigation through an unfamiliar town. Movie clips of two overlapping routes through this real town (Blackrock, in Dublin, Ireland) are shown to a participant four times.
Five tasks are used to assess navigational ability. First, following each viewing of the route movies, participants are shown four short clips -two from the actual routes and two distractors. Participants indicate whether they had seen that clip or not. Second, after all four route viewings are completed, recognition memory for landmarks is tested. A third test involves assessing knowledge of the spatial relationships between landmarks in the town. Fourth, route knowledge is examined by having participants place sets of photographs from the routes in the correct order as if travelling through the town. Finally, participants draw a sketch map of the two routes including as many landmarks as they can remember. Sketch maps are scored in terms of the number of road segments, road junctions, correct landmarks, landmark positions, the orientation of the routes and an overall map quality score from the experimenters (see Supplementary Materials for full details). Double scoring was performed on 20% of the sketch maps finding excellent agreement (minimum of 0.89; see Supplementary Materials, Table S4). An overall navigation score was calculated by combining the scores from the five tasks.

Statistical analyses
Questionnaire and task data were first summarised using means and 95% confidence intervals, calculated in SPSS v22. Bivariate Person correlations were performed between the questionnaires and tasks in SPSS v22. Correlations were compared using the technique described by Meng, Rosenthal, and Rubin (1992), which extends the Fisher z transformation, allowing for more accurate testing and comparison of two related correlations. The analysis was performed using the R cocor package, v1.1.3 (Diedenhofen & Musch, 2015) in R v3.4 (R Core Team, 2017). Multiple regressions were performed in SPSS v22. Effect sizes are reported as R 2 values. There were no missing data in any of the analyses.
As we performed 32 correlations for each cognitive task (17 correlations pertaining to the established questionnaires and 15 correlations for the exploratory One Sentence Questionnaire), the alpha level for the initial correlations was set at p < 0.001 to avoid false positives. We note that this is slightly more stringent than using Bonferroni correction which would suggest p < 0.0016, but we felt it was prudent to adopt a more cautious approach given our reasonably large sample size. For the analyses following the correlations (i.e. the comparisons of the correlations and the multiple regressions) we only included those variables already determined to be significantly correlated at p < 0.001. Therefore, when comparing the correlations and identifying significant variables in the regressions, as we had already been selective, the alpha level was set at p < 0.05.

Results
Performance data from the cognitive tasks have been presented in a previous paper that addressed a completely different research question to that under consideration here (Clark et al., 2019). As detailed in the introduction, the previous research question involved assessing the relationships between imagination, autobiographical memory, future thinking and navigation in terms of task performance. By contrast, in the current study we investigated how questionnaire data were related to cognitive task performance on each of the aforementioned tasks. None of the questionnaire data has been reported previously. A summary of the outcome measures for the questionnaires and cognitive tasks is presented in Table 2. A wide range of scores was obtained for all variables.
The results are reported in the following manner: (1) For each cognitive function of interest, the scores on the associated Note. The cognitive task performance data have been reported previously in Clark et al. (2019) where a different research question was addressed. The questionnaire data have not been reported before. Questionnaire order is alphabetized for ease of display; they were actually completed in the following order: One Sentence Questionnaire, Santa Barbara Sense of Direction Scale, Spontaneous Use of Imagery Scale, Subjective Memory Questionnaire, Visualizer-Verbalizer, Object Spatial Imagery and Verbal Questionnaire, Survey of Autobiographical Memory, Plymouth Sensory Imagery Questionnaire, and Memory Experience Questionnaire. The cognitive tasks were performed in this order: navigation, and then (on a separate visit) scene construction, future thinking, and autobiographical memory. Additional tests were performed between the future thinking and autobiographical memory tasks that are not part of the current study; see Clark et al. (2019). MEQ = Memory Experience Questionnaire; OSIVQ = Object Spatial Imagery and Verbal Questionnaire; PSIQ = Plymouth Sensory Imagery Questionnaire; Appearance subscale; AI = Autobiographical Interview.
cognitive task were correlated with the scores from the questionnaires in the relevant group (see Table 1), i.e. those that purport to be closely related to this cognitive function.
(2) For those results that were significant (at p < 0.001) at the first step, we next examined whether there were any differences in correlations between the questionnaires, to see which questionnaire(s) best represented the cognitive task in question.
(3) We then conducted a multiple regression analysis to examine whether the questionnaires tapped into the same cognitive construct, or if the different questionnaires were representing distinct aspects of the cognitive function. (4) We next sought to ascertain if questionnaires which were related to other cognitive functions (i.e. the other questionnaire groups in Table 1), were associated with performance on the cognitive task in question (at p < 0.001). (5) The next step was to investigate whether the questionnaires associated with the cognitive task outperformed those from the other questionnaire groups, or whether all the significantly correlated questionnaires correlated with the cognitive task to a similar extent. (6) We then performed multiple regression to see if the questionnaires were correlating with the cognitive task via the same cognitive construct, or if there were multiple distinct contributions from the different types of questionnaires. (7) We next moved to our exploratory One Sentence Questionnaire, to test whether the relevant questions from this questionnaire correlated with task performance (e.g. did the imagination questions correlate with scene construction performance). (8) We then compared the correlations between the relevant questions of the One Sentence Questionnaire with the significantly correlating established questionnaires. (9) Our final analyses then examined whether the questions from the One Sentence Questionnaire that were relevant to other cognitive functions were associated with performance on the task in question, and how this compared to the relevant questions of the One Sentence Questionnaire. (10) Finally, we relate the questionnaire results for a cognitive function back to the three research questions.
Of note, some of the steps above were not performed for a cognitive function if the main correlations between questionnaires and the cognitive task (steps 1, 4, 7) were not significant at p < 0.001.

Questionnaires from the imagery group
We first assessed whether or not the imagery questionnaires were correlated with performance on a test of imagination -the scene construction test. As is evident in Table 3, this was predominantly the case. Of the established questionnaires, the PSIQ, OSIVQ Object-Scene subscale and SUIS all correlated with performance on the scene construction test. On the other hand, the Visualizer questionnaire and the OSIVQ Spatial subscale were not correlated with scene construction, suggesting a possible divide in the imagery type being measured.
Next, we wanted to know if there were any differences in correlations between the questionnaires. To investigate this, we compared the correlation coefficients of the questionnaires that were significantly associated with scene construction, and the results are reported in Table 4. The only significant difference was that the PSIQ had a greater correlation with scene construction performance than the SUIS, while the OSIVQ Object-Scene subscale had a correlational value in between the PSIQ and SUIS, but was not significantly different to either.
Next, we wanted to examine whether the questionnaires tapped into the same cognitive construct, or if the different questionnaires are representing distinct aspects of imagery. To do this, we performed a multiple regression analysis using the imagery questionnaires that were significantly correlated with scene construction performance (at the p < 0.001 level). The results are shown in Table 5. Only the PSIQ was associated with scene construction performance (regression model statistics: F(3,213 = 18.62, p < 0.001, R 2 = 0.21, Adj. R 2 = 0.20).
Why might the PSIQ be key? The PSIQ is the questionnaire most similar to the scene construction test. The PSIQ asks participants to rate how vividly they can imagine three specific scenes (a bonfire, a sunset, a cat climbing a tree). The scene construction test asks participants to imagine and describe specific scenes (e.g. a busy fishing harbour). It, therefore, follows that a participant's subjective belief about their ability to perform this task correlated particularly well with their score on essentially the same task.

Questionnaires from the other groups
We next sought to ascertain if questionnaires which are typically associated with the other cognitive functions (namely, autobiographical memory, future thinking and navigation), were associated with performance on the scene construction task. To check that questionnaires were not simply associated with task performance for spurious reasons (e.g. the product of a reasonably large sample size and performing multiple correlations), we also included questionnaires which examined less related cognitive constructs (verbal processing and semantic memory) to act as controls.
As can be seen in Table 6, the majority of questionnaires pertaining to autobiographical memory and future thinking were also significantly (p < 0.001) correlated with performance on the scene construction test. This is in direct contrast to the control questionnaires (assessing verbal and semantic processing) that were not correlated with scene construction performance. Of note, the significant correlations seemed to occur when the questionnaire also assessed visual imagery, for example, when asking about the vividness of memory recall (the MEQ Vividness subscale). The questionnaires assessing navigation, however, told a different story, with neither questionnaire being correlated with scene construction performance (at the p < 0.001 level). Navigation questionnaires may, therefore, be assessing a different cognitive construct than the imagery questionnaires.
The next step was to investigate whether the imagery questionnaires were outperforming the questionnaires from the other groups, or whether all the significantly correlated questionnaires were correlating with scene construction performance to the same extent. To examine this, we compared the correlations of the significantly associated questionnaires from the other groups with the significantly correlated imagery questionnaires.
As can be seen in Table 7, the PSIQ had a higher correlation with scene construction performance than questionnaires from the other groups. As such, directly asking about imagery led to a greater correlation than asking about indirectly-related cognitive constructs. On the other hand, the OSIVQ Object-Scene subscale and the SUIS were similarly correlated with scene construction as the indirectly-related questionnaires. This suggests that the memory and future thinking questionnaires, although designed to examine different cognitive constructs, may in fact have a substantial overlap with imagery. Given the correlations of questionnaires from the other groups (in addition to the imagery questionnaires) with scene construction performance, we next performed a multiple regression to ascertain if the questionnaires were correlating with scene construction via the same construct, or if there were multiple distinct contributions from the different types of questionnaires.
The results of the multiple regression are presented in Table 8. Two questionnaires were found to contribute to scene construction performance -the PSIQ and the MEQ Coherence subscale [regression model statistics: F(8,208) = 8.57, p < 0.001, R 2 = 0.25, Adj. R 2 = 0.22], with the standardised Betas suggesting that the PSIQ had twice the weighting of the MEQ Coherence.
The multiple regression, therefore, suggests that the majority of correlations observed between the memory and future thinking questionnaires and scene construction ability could be explained by visual imagery. If the memory or future thinking questionnaires represented different constructs outside of imagery, then they would have been identified as separate variables by the multiple regression -as was the case with the MEQ Coherence subscale. So what does the MEQ Coherence subscale represent? This subscale probes how well a memory is placed together in a logical fashion. Hence it seems to add information to the PSIQ, which is concerned only with the vividness of an image. Overall, therefore, while many questionnaires probing imagery, memory and future thinking correlated with scene construction performance, all of these questionnaires seem to be, in some way, tapping into the same constructs as those measured by the PSIQ and MEQ Coherence.

One Sentence Questionnaire
Our final area of interest in relation to scene construction performance was the exploratory One Sentence Questionnaire. First, we assessed whether or not the imagery questions from the One Sentence Questionnaire correlated with scene construction performance, finding Table 4 Comparisons between the significant correlation coefficients of the imagery questionnaires with scene construction performance. The difference in correlation is the column questionnaires less the row questionnaires. Note. ** p < 0.01. PSIQ = Plymouth Sensory Imagery Questionnaire; Appearance subscale; OSIVQ = Object Spatial Imagery and Verbal Questionnaire; SUIS = Spontaneous Use of Imagery Scale.

Table 7
Comparisons between the correlation coefficients of the imagery questionnaires and the other group questionnaires significantly correlated with scene construction performance. The difference in correlation is the imagery questionnaires less the other group questionnaires (a negative value shows higher correlation for the row questionnaire). that they all did (One Sentence Imagery Use: r = 0.32, p < 0.001; One Sentence Imagery as a Scene: r = 0.32, p < 0.001; One Sentence Imagery Ability: r = 0.31, p < 0.001).
We then tested how the One Sentence Questionnaire imagery questions performed in comparison to the established imagery questionnaires that also correlated with scene construction performance (the PSIQ, OSIVQ Object-Scene subscale and the SUIS). The results are shown in Table 9. The only significant differences were between the PSIQ and One Sentence: Imagery as a Scene and One Sentence: Imagery Ability. That is, the questions from the One Sentence Questionnaire were correlated with scene construction to a similar degree as most of the longer, and more time consuming, established questionnaires.
Next, we sought to ascertain if, like the established questionnaires, the One Sentence Questionnaire questions designed to look at other cognitive functions (namely, autobiographical memory, future thinking and navigation), were also associated with performance on the scene construction task. The results can be seen in Table 10. Mirroring the established questionnaires, significant correlations seemed to occur when the questions also involved visual imagery.
Finally, we investigated whether the imagery questions from the One Sentence Questionnaire outperformed the One Sentence Questionnaire questions from the other groups, or whether all the significantly correlated questions were correlating with scene construction performance to the same extent. To examine this, we compared the correlations of the significantly associated One Sentence Questionnaire questions from the other groups with the One Sentence Questionnaire imagery questions.
As can be seen in Table 11, the imagery based One Sentence Questionnaire questions were similarly correlated with scene construction as the other significantly correlated One Sentence Questionnaire questions. This suggests two potential conclusions. First, it could be that the specificity of the One Sentence Questionnaire questions is poor. Or, second, and in line with the established questionnaires, it may be that, while the other questions are designed to examine different cognitive constructs, due to the inclusion of visual imagery in these questions, associations with scene construction performance remain.

Imagination summary
The main associations between the questionnaires and performance on the scene construction test can be seen in Fig. 1.
Considering our three research questions, first, imagery questionnaires that focus on concrete visual imagery measure the construct they are designed to reflect, namely imagery ability. Second, for the questionnaires from the other groups, overlap between imagination, autobiographical memory and future thinking questionnaires (but not navigation) was clearly evident with regard to scene construction performance. Notably, this relationship seemed to be based around the use of visual imagery. Third, the imagery questions from the One Sentence Questionnaire performed at the same level as the established, and more time consuming to complete, questionnaires. The only exception to this being the PSIQ, the questionnaire which also produced higher Note. PSIQ = Plymouth Sensory Imagery Questionnaire; Appearance subscale; MEQ = Memory Experience Questionnaire; SAM = Survey of Autobiographical Memory; SMQ = Subjective Memory Questionnaire; OSIVQ = Object Spatial Imagery and Verbal Questionnaire; SUIS = Spontaneous Use of Imagery Scale.

Table 9
Comparisons between the correlation coefficients of the established imagery questionnaires that correlated with scene construction performance and the One Sentence Questionnaire imagery questions. The difference in correlation is the established questionnaires less the One Sentence Questionnaire (a negative value shows higher correlation for the row questionnaire). Note. * = p < 0.05. PSIQ = Plymouth Sensory Imagery Questionnaire; Appearance subscale; OSIVQ = Object Spatial Imagery and Verbal Questionnaire; SUIS = Spontaneous Use of Imagery Scale.  Clark and E.A. Maguire Cognition 195 (2020) 104114 correlations with scene construction performance when compared to the established imagery questionnaires.

Questionnaires from the autobiographical memory group
Considering next the memory questionnaires, the task performance data here were the number of AI internal (episodic) details as a percentage of total utterances (internal and external details combined), to provide a measure of autobiographical memory ability with verbosity taken into account. The expectation based on the literature was that responses on the memory questionnaires would be correlated with AI internal details. However, as is shown in Table 12, none of the memory questionnaires were correlated with AI internal details at the p < 0.001 level. Reducing this threshold to p < 0.05 identified a weak negative correlation with MEQ Coherence. The established memory questionnaires, therefore, seemed not to correlate with memory performance as measured by internal details. Given that internal details is generally taken as the 'gold standard' for measuring autobiographical memory ability, and the memory questionnaires employed here are widely used, this result is surprising.

Questionnaires from the other groups
As the memory questionnaires showed limited correlations with AI internal details, we next examined whether any of the questionnaires from the other groups were correlated with AI internal details, with the results shown in Table 13. In short, none of the questionnaires were correlated with AI internal details at the p < 0.001 level. Even reducing this threshold to p < 0.05 identified only a weak negative correlation with the Visualizer questionnaire (r = −0.14). This is in stark contrast to the correlations of the other group questionnaires with the scene construction imagination task noted previously (Table 6), which showed multiple significant correlations.

Table 11
Comparisons between the correlation coefficients of imagery questions from the One Sentence Questionnaire and the other questions from the One Sentence Questionnaire that were significantly correlated with scene construction performance. The difference in correlation is the imagery questions less the other group questions.

One Sentence Questionnaire
In line with the established questionnaires, none of the questions on the One Sentence Questionnaire were significantly related to AI internal details (Table 14).

Additional investigations into AI internal details
The analyses reported above were performed with AI internal details averaged across all of the memory ages (childhood, teenage, adulthood, last year), and combining all of the sub-categories of internal details (events, place, time, perceptual, emotion). It could be argued that use of this combined data introduced additional noise, blurring potential associations.
We, therefore, performed all of the correlations again, but separately for each of the four time points (childhood, teenage, adulthood, last year) as well as combining the two older memory categories (childhood, teenage) to create an approximation of "remote" memory and the two more recent memory categories (adulthood, last year) to create an approximation of "recent" memory. In none of these additional 192 correlations was any relationship between questionnaires and AI internal details found (at our p < 0.001 statistical threshold; see full details in Supplementary Materials, Tables S5-S7).
We also separately examined the five sub-categories of internal details (events, place, time, perceptual, emotion, each averaged across all of the time points, and represented as a percentage of total utterances). From among these additional 160 correlations, only one significant relationship was identified -a negative correlation between the internal events sub-category and the Verbalizer questionnaire (r = −0.24, Supplementary Materials, Tables S8-S10).
Moreover, we conducted all of these analyses again, but using only the raw number of internal details (instead of the number of internal details as a percentage of total utterances). Doing so identified only four significant relationships at our p < 0.001 statistical threshold, all for the MEQ Sharing subscale (Supplementary Materials, Tables S11-S16). This subscale was positively associated with the total number of internal details when averaged across all the time points (r = 0.26), and was specifically associated with the adulthood memory category (r = 0.25) and the "recent" memory category (r = 0.26). Investigation into the internal details sub-categories found that this positive correlation was specific to the internal events sub-category (r = 0.25). Given that the MEQ Sharing subscale asks about how often an individual talks about their memories, it is likely that these relationships are a reflection of participants' verbosity.
It seems, therefore, that combining across time points and internal detail sub-categories did not adversely affect the sensitivity of the results.

Autobiographical memory summary
Neither the established questionnaires nor the questions from our experimental One Sentence Questionnaire correlated with autobiographical memory internal details.

Questionnaires from the autobiographical memory group
While the memory questionnaires (and indeed questionnaires in general) were not correlated with the main outcome measure of the AI, in contrast, the memory questionnaires were correlated with performance on the scene construction imagination task (see Table 6). As noted above, this could be because the memory questionnaires actually relate to the imagery experience of memories, rather than the number of details generated. Another measure collected from the AI is a participant rating of the vividness of memory recall. A correlation analysis revealed that memory vividness was measuring a different cognitive construct than the number of internal details generated, as there was no correlation between these two measures (r = 0.067, p = 0.33). Consequently, we examined whether the memory questionnaires were instead correlated with the vividness of memory recall.
As can be seen in Table 15, nearly all of the memory questionnaires   Note. OSIVQ = Object Spatial Imagery and Verbal Questionnaire; PSIQ = Plymouth Sensory Imagery Questionnaire.
significantly correlated with AI vividness at p < 0.001. Given that nearly all of the memory questionnaires were correlated with AI vividness, we next wanted to establish if there were any differences between these correlations. No differences were found between any of the correlations (Table 16).
Next, we sought to ascertain whether the memory questionnaires were all tapping into the same construct, or if different questionnaires were measuring distinct aspects of AI vividness. To do this, we performed a multiple regression using the memory questionnaires that were significantly correlated with AI vividness (at p < 0.001). The results are shown in Table 17. While the full regression model was significant [F(6,210) = 8.71,p < 0.001, R 2 = 0.20, Adj. R 2 = 0.18], only the SMQ questionnaire was significantly related to AI vividness.

Questionnaires from the other groups
Memory vividness is potentially more of an imagery construct rather than being specific to memory. Therefore, we wanted to know if the questionnaires from the other groups (namely, imagination, future thinking and navigation), were also correlated with AI vividness, with a particular interest in the imagery questionnaires. To check that responses to questionnaires in general were not correlating with AI vividness, the control questionnaires were again included.
As is evident in Table 18, the majority of questionnaires from the other groups correlated with AI vividness. Importantly, the control questionnaires did not correlate with AI vividness. This, therefore, supports the idea that the significant correlations are not simply due to artificial constructs, but arise from aspects of the questionnaires. Of additional note, two of the imagery questionnaires -the Visualizer and OSIVQ Spatial subscale also did not correlate with AI vividness. This is particularly interesting because these are the two imagery questionnaires that also did not correlate with the scene construction imagination task. Overall, therefore, the correlations of the memory questionnaires with AI vividness seemed to extend beyond memory, with instead correlations being apparent whenever there was a visual imagery element to the questionnaire.
On the one hand, this may be expected -the vividness of an autobiographical memory is not a pure memory measure, and shares substantial overlap with that of imagination/imagery. Nevertheless, our findings suggest that widely used memory questionnaires may not only be reflecting an individual's memory vividness, but also acting as an indirect measure of imagery ability.
To examine this in more detail, we next compared the significant correlations of the questionnaires from the other groups and AI vividness with the significantly correlated memory questionnaires. This allowed us to assess whether either type of questionnaire was outperforming the other. As shown in Table 19, the OSIVQ Object-Scene subscale and the PSIQ had higher correlations than all the established memory questionnaires (significant for all the comparisons with the OSIVQ Object-Scene subscale, but only for the MEQ Accessibility and MEQ Coherence subscales for the PSIQ). On the other hand, all the other questionnaires from the other groups performed at around the same level as the main memory questionnaires. That is, none of the memory questionnaires had greater correlations with AI vividness than any of the other questionnaires. Thus, the specificity of the memory questionnaires seems to be in question.
Given the high levels of correlations of the questionnaires in the other groups (in addition to the memory questionnaires) with AI vividness, we next performed a multiple regression to ascertain if the questionnaires were correlating with AI vividness via the same construct or if there were distinct contributions from the different types of questionnaires.
The results of the multiple regression can be seen in Table 20. The full regression model was found to be significant [F(11,205) = 10.39, p < 0.001, R 2 = 0.36, Adj. R 2 = 0.32], with two questionnaires contributing to AI vividness; the PSIQ and the OSIVQ Object-Scene Table 16 Comparisons between the correlation coefficients of the memory questionnaires with AI memory vividness. The difference in correlation is the questionnaire in the column less the questionnaires in the row.   Note. OSIVQ = Object Spatial Imagery and Verbal Questionnaire. Clark and E.A. Maguire Cognition 195 (2020) 104114 Table 19 Comparisons between the correlation coefficients of the significantly correlating memory questionnaires with the questionnaires from the other groups which correlated with AI vividness (at p < 0.001). The difference in correlation is the memory questionnaires less the questionnaires from the other groups (a negative value shows higher correlation for the non-memory questionnaire). Note. * = p < 0.05. MEQ = Memory Experience Questionnaire; SMQ = Subjective Memory Questionnaire; SAM = Survey of Autobiographical Memory; OSIVQ = Object Spatial Imagery and Verbal Questionnaire; PSIQ = Plymouth Sensory Imagery Questionnaire; Appearance subscale; SUIS = Spontaneous Use of Imagery Scale; Santa Barbara = Santa Barbara Sense of Direction Scale.
subscale. This suggests that for AI vividness, the PSIQ and the OSIVQ Object-Scene subscale were tapping into different constructs. The multiple regression, therefore, suggests that if one wants to examine AI vividness then memory questionnaires are not required. Instead, imagery questionnaires have the best association with AI vividness. Overall, the main cognitive construct associated with AI vividness seemed to be subjective reports of visual imagery ability.

One Sentence Questionnaire
We next considered AI vividness in relation to our exploratory One Sentence Questionnaire. First, we assessed whether or not the memory questions from the One Sentence Questionnaire correlated with AI vividness, finding that they did, with the exception of the Memory in Words question (One Sentence Memory in Imagery: r = 0.36, p < 0.001; One Sentence Memory in Scene Imagery: r = 0.36, p < 0.001; One Sentence Memory Ability: r = 0.28, p < 0.001; One Sentence Memory in Words: r = −0.062, p = 0.36).
We then tested how the significantly associated One Sentence Questionnaire memory questions performed in comparison to the established memory questionnaires that correlated with AI vividness. The results are shown in Table 21. No differences in correlations were found. That is, the memory questions from the One Sentence Questionnaire were correlated with AI vividness to a similar degree as the longer, and more time consuming, established questionnaires.
Next, we sought to ascertain if, like the established questionnaires, the One Sentence Questionnaire questions designed to look at other cognitive functions (namely, imagination, future thinking and navigation), were also associated with AI vividness. The results can be seen in Table 22. Nearly all the questions correlated with AI vividness. The only ones that did not were those pertaining to verbal processing.
Finally, we investigated whether the One Sentence Questionnaire questions from the other groups were correlated with AI vividness to the same extent as the memory questions from the One Sentence Questionnaire. No differences were identified (Table 23). In other words, unlike the established imagery questionnaires, the One Sentence Questionnaire imagery questions did not predict AI vividness better than the memory questions, but they were no worse than the memory questions.

Autobiographical memory vividness summary
The main associations between the questionnaires and AI vividness can be seen in Fig. 2.
Considering our three research questions, first, memory questionnaires seem to relate to autobiographical memory vividness, rather than internal details. Second, imagination, future thinking and navigation questionnaires were also associated with autobiographical memory vividness. Furthermore, the imagery questionnaires had greater associations with autobiographical memory vividness than the other questionnaires. Imagery seems, therefore, to underlie memory vividness, and it may be that memory questionnaires are in fact only indirect measures of imagery ability; an important point to note when deploying questionnaires in future memory research. Third, the memory questions from our One Sentence Questionnaire preformed similarly to the established, and more time consuming, memory questionnaires. They also performed comparably with the other (imagery based) questions pertaining to other cognitive functions on the One Sentence Questionnaire.

Future thinking questionnaire
We next looked at the relationship between the future thinking questionnaire and the future thinking task. As a brief reminder, we have already noted that the future thinking questionnaire correlated with both the scene construction imagination task and AI vividness. Of note, as future thinking has become a topic of heightened interest more recently than imagery and memory, we have only one questionnaire available with which to directly assess it.
We first examined whether or not the future thinking questionnaire was correlated with performance on the future thinking task, finding only a weak correlation (r = 0.21, p = 0.002) that was not significant at our p < 0.001 level.

Questionnaires from the other groups
Given the previous overlaps in questionnaire and task correlations, we next wanted to see if the questionnaires from the other groups (namely, imagery, memory and navigation), were correlated with future thinking performance. In particular this allowed us to test whether an imagery and/or an autobiographical memory process was associated with future thinking ability. As before, to check that questionnaires in general were not simply correlating with future thinking, the control questionnaires were included.
As shown in Table 24, the PSIQ, the OSIVQ Object-Scene subscale and the MEQ Coherence subscale all correlated with future thinking performance at p < 0.001. These questionnaires were also associated with scene construction performance and AI vividness, in particular the PSIQ and OSIVQ Object-Scene subscale. In addition, a number of the other imagery-based questionnaires were also correlated with future thinking at a weaker level, similar to that of the future thinking questionnaire itself. Navigation questionnaires, on the other hand, were not correlated with performance on the future thinking task. Overall, therefore, as with AI vividness, the imagery questionnaires may be particularly associated with future thinking performance.
Given the multiple significant correlations, we performed a multiple regression to test whether the different questionnaires were tapping into the same cognitive construct or if they represented different aspects of future thinking. As can be seen in Table 25, two questionnaires were identified as being associated with future thinking [regression model statistics: F(3,213) = 10.26, p < 0.001, R 2 = 0.12, Adj. R 2 = 0.11]; the PSIQ and the MEQ Coherence subscale. Overall, therefore, the multiple regression showed two constructs associated with future thinking; visual imagery vividness and coherence. Of note, the PSIQ was also strongly associated with both the scene construction task and AI vividness.

One Sentence Questionnaire
We first assessed whether the future thinking questions from the One Sentence Questionnaire were correlated with future thinking performance. Mirroring the established SAM Future subscale, none of the future thinking One Sentence Questionnaire questions correlated with future thinking performance (One Sentence Future Thinking in Scene Imagery: r = 0.17, p = 0.012; One Sentence Future Thinking in Imagery: r = 0.17, p = 0.013; One Sentence Future Thinking Ability: r = 0.16, p = 0.017; One Sentence Future Thinking in Words: r = 0.026, p = 0.70).
Next, we sought to ascertain if, like the established questionnaires, the One Sentence Questionnaire questions designed to look at other cognitive functions were associated with performance on the future thinking task. The results can be seen in Table 26. The only significant correlation (at p < 0.001) was one of the imagery questions -One Sentence Imagery as a Scene. This is in line with the established questionnaires, finding that the PSIQ (which specifically measures scene imagery) significantly correlated with future thinking performance.

Future thinking summary
The main associations between the questionnaires and future thinking performance are shown in Fig. 3.
Considering our three research questions, first, the established future thinking questionnaire was only weakly linked to future thinking performance. Second, questionnaires from the other groups and their relationship with future thinking illustrated the importance of imagery for reflecting future thinking performance. Overall, and once again, imagery questionnaires seem to be most strongly related to imagination, autobiographical memory and future thinking ability. Third, the future thinking questions from our One Sentence Questionnaire were also only weakly associated with future thinking performance. Instead, it was the scene imagery question from the One Sentence Questionnaire that was significantly associated with future thinking performancemirroring the established questionnaires.

Table 21
Comparisons between the correlation coefficients of the established memory questionnaires and the memory questions from the One Sentence Questionnaire that were associated with AI memory vividness. The difference in correlation is the questionnaire in the column less the questionnaires in the row (a negative value shows higher correlation for the questions of the One Sentence Questionnaire).   Clark and E.A. Maguire Cognition 195 (2020) 104114 3.5. Navigation

Questionnaires from the navigation group
Our final cognitive function of interest was navigation. Thus far, navigation questionnaires have been notable in not correlating with performance on scene construction, AI internal details or future thinking, although they did correlate with AI vividness. It may be,

Table 23
Comparisons between the correlation coefficients of the One Sentence Questionnaire memory questions and the One Sentence Questionnaire questions from the other groups which correlated with AI vividness (at p < 0.001). The difference in correlation is the memory questions less the questions from the other groups (a negative value shows higher correlation for the non-memory questions).  Note. OSIVQ = Object Spatial Imagery and Verbal Questionnaire.

Table 25
Multiple regression of all the questionnaires significantly correlated with future thinking performance. therefore, that navigation questionnaires are more specific in their line of questioning.
We first correlated the two established navigation questionnaires with performance on the navigation task. Both questionnaires correlated with navigation performance (SAM Spatial subscale: r = 0.33, p < 0.001; the Santa Barbara Sense of Direction Scale: r = 0.32, p < 0.001).
Second, we tested whether there was a difference in the correlations of the two questionnaires, finding no difference between them [mean r difference = 0.001 (95% CI = −0.079, 0.10), z = 0.24, p = 0.81].
Finally, we wanted to establish whether or not the two questionnaires were tapping into the same cognitive construct, or if they represented distinct aspects of navigation. We performed a multiple regression analysis using the two navigation questionnaires, finding that while the overall model was significant [F(2,214 = 13.92, p < 0.001, R 2 = 0.12, Adj. R 2 = 0.11], neither questionnaire was significantly associated with navigation performance [SAM Spatial: Beta value (95% CI) = 0.49 (−0.016, 1.0), Standardised beta value = 0.20, t = 1.91, p = 0.058; Santa Barbara Sense of Direction Scale: Beta value (95% CI) = 0.36 (−0.12, 0.84), Standardised beta value = 0.16, t = 1.47, p = 0.14]. This, therefore, suggests that both navigation questionnaires were tapping into the same cognitive construct.

Questionnaires from the other groups
We next examined whether or not the other group questionnaires (namely, imagination, autobiographical memory and future thinking), were correlated with navigation performance. This was especially interesting given that navigation questionnaires were the least associated with these other tasks. As before, to check that questionnaires in general were not simply correlating with navigation performance, the control questionnaires were included.
The results are shown in Table 27. The only questionnaire from the other groups that correlated with navigation task performance was the OSIVQ Spatial subscale. These results are in line with the findings we described earlier, showing limited relationships between navigation questionnaires and imagination, autobiographical memory and future thinking. Navigation seems, therefore, to be a separate construct from the other three cognitive functions.
Given the significant correlation of the OSIVQ Spatial subscale with navigation performance, we next wanted to know whether the navigation questionnaires outperformed the OSIVQ Spatial subscale, or whether they correlated with navigation task performance to the same extent. No differences between the questionnaires were found [OSIVQ Spatial vs SAM Spatial: mean r difference = −0.087 (95% CI = −0.24, 0.049), z = −1.29, p = 0.20; OSIVQ Spatial vs Santa Barbara Sense of Direction Scale: mean r difference = −0.077 (95% CI = −0.23, 0.062), z = −1.13, p = 0.26].

One Sentence Questionnaire
We next investigated the relationship between navigation performance and our exploratory One Sentence Questionnaire. First, we assessed whether or not the navigation questions from the One Sentence Questionnaire correlated with navigation task performance, finding that only the Navigation Ability question was significantly associated with navigation task performance at p < 0.001 (One Sentence: Navigation Ability: r = 0.30, p < 0.001; One Sentence: Navigation in Scene Imagery: r = 0.20, p = 0.003; One Sentence: Navigation in Imagery: r = 0.19, p = 0.004; One Sentence: Navigation in Words: r = −0.073, p = 0.28).
We then wanted to establish whether or not there were any differences in correlations between the established navigation questionnaires and the significant navigation question from the One Sentence Questionnaire. No differences between the correlations were identified [One Sentence Navigation Ability vs SAM Spatial: mean r difference (95% CI) = 0.027 (-0.065, 0.12), z = 0.62, p = 0.54; One Sentence Navigation Ability vs Santa Barbara Sense of Direction Scale: mean r difference (95% CI) = 0.017 (−0.071, 0.11), z = 0.41, p = 0.68].
Next, we sought to ascertain if the One Sentence Questionnaire questions designed to look at other cognitive functions were also associated with performance on the navigation task. As can be seen in Table 28, none of the other questions on the One Sentence Questionnaire were associated with navigation task performance.

Navigation summary
The main associations between the questionnaires and navigation task performance are shown in Fig. 4.
Considering our three research questions, first, navigation questionnaires reflect navigation performance. Second, associations with the navigation task seem to be specific to when individuals were asked about their navigation ability. This is in contrast to imagination, autobiographical memory vividness and future thinking, which seemed to be more related to the imagery questionnaires. Third, the One Sentence: Navigation Ability question performed at a comparable level as the established navigation questionnaires. Note. OSIVQ = Object Spatial Imagery and Verbal Questionnaire.

Fig. 4.
Graphs showing the correlations between the SAM Spatial subscale, the Santa Barbara Sense of Direction Scale and the One Sentence: Navigation Ability question with performance on the navigation task.

Discussion
Information about an individual's thoughts and beliefs is a valuable resource in psychology, contributing unique data to help us understand cognition and behaviour. Questionnaires are one of the best methodologies for accessing this information. However, there is a surprising lack of empirical work examining whether questionnaires actually reflect their purported cognitive functions where performance on naturalistic tasks is concerned. Here, we showed that widely used visual imagery (imagination) and navigation questionnaires reflected their cognitive functions, but that autobiographical memory and future thinking questionnaires did not. Instead, we found that autobiographical memory questionnaires were associated with ratings of autobiographical memory vividness. Furthermore, we showed that as well as imagery questionnaires being most associated with the scene construction imagination task, they also had the greatest association with autobiographical memory vividness and future thinking. Finally, we highlighted the potential of our exploratory One Sentence Questionnaire for using single questions to capture a broad profile of a participant, with these questions performing comparably to the established in-depth questionnaires in this sample of healthy, young adults.
Three of the five questionnaires that purported to measure visual imagery were associated with performance on the scene construction task (our naturalistic measure of imagination): the PSIQ Appearance subscale (Andrade et al., 2014), the OSIVQ Object-Scene subscale (Blazhenkova & Kozhevnikov, 2009) and the SUIS (Reisberg et al., 2003). Of these, the PSIQ had the greatest association, likely due to its similarity to the imagination task in question. On the other hand, the Visualizer questionnaire (Kirby et al., 1988) and the OSIVQ Spatial subscale were not associated with scene construction task performance. Comparison of the questionnaires reveals why this might be the case. The PSIQ, OSIVQ Object-Scene subscale and SUIS all focus on imagery of pictures and scenes, while the Visualizer and OSIVQ Spatial subscale focus on more technical, abstract and schematic visual imagery representing locations, movement and transformations. Imagination performance, at least as measured by scene construction, seems, therefore, to only be associated with imagery questionnaires that focus on concrete visual imagery.
The two questionnaires designed to measure navigation ability, the Santa Barbara Sense of Direction Scale (Hegarty et al., 2002) and the SAM Spatial subscale (Palombo et al., 2013), were both associated with performance on the navigation task, in line with previous studies (Hegarty et al., 2006;Selarka et al., 2019;Weisberg et al., 2014). In addition, these questionnaires were associated with navigation performance to the same extent, and both seemed to be tapping into the same construct. Overall, therefore, these results permit the conclusion that these widely used navigation questionnaires reflect their purported cognitive function.
By contrast, the memory questionnaires that we examined were not associated with autobiographical memory ability as measured by AI internal (episodic) details. Six commonly used memory questionnaires were included; four subscales of the MEQ (Sutin & Robins, 2007), the SMQ (Bennett-Levy & Powell, 1980), and the SAM Episodic subscale (Palombo et al., 2013). We are not the first to note the lack of correlation between memory questionnaires and performance measures of memory. A review of 14 memory questionnaires as far back as 1982 found limited relationships between any of the memory questionnaires and laboratory measures of memory performance (Herrmann, 1982). In addition, a more recent study also failed to find a relationship between the SAM Episodic subscale and AI internal details (Palombo et al., 2013).
However, memory questionnaires were not completely divorced from autobiographical memory measures. Instead, with the exception of the MEQ Sharing subscale, all of the memory questionnaires correlated with autobiographical memory vividness. It may, therefore, be that questionnaire measures of memory actually reflect an individual's ability to visualise their memories, rather than the number of details they can explicitly produce.
Future thinking is a relatively new area of formal study, and we were only able to identify and include one future thinking questionnaire -the SAM Future subscale (Palombo et al., 2013). However, here, the SAM Future subscale was not associated with performance on the future thinking task. To the best of our knowledge, this is the first study to compare responses on the SAM Future subscale with a measure of future thinking. Further work is, therefore, required to understand the relationship between questionnaire measures of future thinking and future thinking task performance.
As well as examining whether questionnaires reflected their purported cognitive functions, we also investigated if a visual imagery or autobiographical memory process is actually what is being assessed by questionnaires, despite their purporting to be related to distinct cognitive functions. To do this, we looked at the associations between performance on each task and the questionnaires associated with the other cognitive function groups. This identified an overall role for imagery in imagination, autobiographical memory vividness and future thinking. The imagery questionnaires were associated with all three tasks, and best reflected performance, over and above that of the memory and future thinking questionnaires. This result aligns with previous findings of a link between subjective reports of imagery and elements of autobiographical memory and future thinking (D'Argembeau & Van der Linden, 2006;Vannucci et al., 2016). It also mirrors our finding of a key role for scene imagery in performance on imagination, autobiographical memory and future thinking tasks (Clark et al., 2019).
By contrast, there was a different pattern of associations between questionnaires and navigation scores. Only one non-navigation questionnaire was associated with navigation task performance -the OSIVQ Spatial subscale. While nominally an imagery questionnaire, the OSIVQ Spatial subscale was not associated with performance on imagination, autobiographical memory (internal details or vividness) or future thinking tasks, and instead was only linked to navigation. In other words, the OSIVQ Spatial subscale followed the same pattern of results as the navigation questionnaires. Of note, scores on the OSIVQ Spatial subscale have previously been associated with performance on "spatial" laboratory tasks such as paper folding and mental rotation (Blazhenkova & Kozhevnikov, 2009), tasks which have also been linked to navigation performance (Clark et al., 2019). In line with the aims of the OSIVQ, the Spatial subscale is, therefore, reflecting a different cognitive process to that of the OSIVQ Object-Scene subscale, with the OSVIQ Spatial subscale related to navigation, while the OSIVQ Object-Scene subscale reflects imagination, autobiographical memory vividness and future thinking.
The lack of an association between the memory questionnaires and AI internal details was a surprise, particularly as memory questionnaires and AI internal details are such widely used measures of autobiographical memory ability (and indeed are often used interchangeably), and so we might expect them to be related. Moving forward, it will be important to tease apart exactly what each of these test instruments is in fact measuring.
According to the scoring procedure of the AI, internal details are those that could reasonably reflect episodic re-experiencing (Levine et al., 2002;Palombo, Alain, Söderlund, Khuu, & Levine, 2015; but see also Strikwerda-Brown, Mothakunnel, Hodges, Piguet, & Irish, 2019). The idea behind this is to remove any experimenter subjectivity when scoring, and so provide an unbiased estimate of the amount of episodic information in the memory as described by an individual. This methodology has been widely used and has been shown to be sensitive to individual differences in healthy compared to pathological aging (e.g. Grilli, Wank, Bercel, & Ryan, 2018;Murphy, Troyer, Levine, & Moscovitch, 2008) and damage to the medial temporal lobe, in particular the hippocampus (e.g. McCormick, Rosenthal, Miller, & Maguire, 2018;Rosenbaum et al., 2008;Steinvorth, Levine, & Corkin, 2005). In addition, in young, healthy participants, AI internal details has been associated with combined CA2/3 dentate gyrus volume in the left hippocampus (Palombo, Bacopulos et al., 2018) and measures of fornix white matter (Hodgetts et al., 2017). However, while AI internal details may reflect some aspects of individual differences in autobiographical memory recall, the lack of an association between the memory questionnaires and AI internal details suggests that they might not reflect an individual's subjective experience of memory recall.
Instead, the memory questionnaires were associated with AI vividness, suggesting that the subjective experience of memory is particularly related to the vividness with which a memory is recalled. This has an interesting parallel with findings in people with "severely deficient autobiographical memory", where there was no difference compared to matched controls in terms of the number of internal details for autobiographical memories aged 1 week, 1 month, 1 year and 10 years old, but instead there were lower vividness ratings (Palombo et al., 2015;Palombo, Sheldon, & Levine, 2018). Furthermore, comparisons of true and false memories have found that the latter are less vivid (Heaps & Nash, 2001), whereas emotionally salient memories are typically highly vivid events (Rubin & Kozin, 1984;Talarico, LaBar, & Rubin, 2004). Vividness, therefore, seems to be central to the autobiographical memory recall experience.
Overall, memory questionnaires and AI internal details seem to be measuring different things. On the one hand, this may reflect the multifaceted nature of autobiographical memory recall. However, going forward, it will be important to no longer use memory questionnaires and AI internal detail measures interchangeably. Individual differences identified when using memory questionnaires (e.g. Sheldon et al., 2016) are unlikely to reflect "episodic memory" in the same manner as individual differences identified by AI internal details. Indeed, the results presented here suggest that memory questionnaire findings are related to memory vividness and, consequently, that when an individual is commenting upon their memory, it may be that they are actually referring to the extent to which they can visualise their memories, rather than the amount of detail they can recall. Which measure is most appropriate will likely depend upon the research question being asked.
Finally, we were also interested in whether a broad profile of a participant could be obtained by using a quick-to-administer set of single questions on each cognitive domain of interest. We compared the performance of our exploratory One Sentence Questionnaire with that of the established questionnaires. This identified a number of interesting parallels. For imagery, the three one sentence questions performed at a similar level to the OSIVQ Object-Scene and SUIS questionnaires. The PSIQ, however, outperformed the One Sentence Questionnaire, as it did the other established questionnaires. For autobiographical memory, in line with the established questionnaires, none of the memory questions from the One Sentence Questionnaire were associated with AI internal details. However, the imagery based memory questions were associated with autobiographical memory vividness, and this was to the same extent as the established memory questionnaires. For future thinking, as with the established questionnaire, the future thinking One Sentence Questionnaire questions did not correlate with performance. Finally, for navigation, the navigation ability question of the One Sentence Questionnaire correlated with navigation performance, and did so to the same extent as the established questionnaires. We observed, therefore, a parallel in the patterns of correlations between our exploratory One Sentence Questionnaire and the established questionnaires.
Overall, our exploratory One Sentence Questionnaire suggests that a broad profile of information about a person, that is comparable to established questionnaires, can be obtained quickly and easily using only single questions on a specific topic. Moreover, it seems that even our One Sentence Questionnaire can be further shortened, with only two questions required to predict performance on imagination, autobiographical memory (vividness), future thinking and navigation; one on imagery ability (reflecting imagination, autobiographical memory vividness and future thinking) and one on navigation ability. We emphasise, however, that this was an exploratory aspect of the study and additional work is needed in relation to this questionnaire, for example, replication in a separate sample of young, healthy participants and in other groups (e.g. clinical groups, older adults), and tests of validity (e.g., test-retest data) are also required. However, this initial exploration suggests that, while in-depth questionnaires certainly have their place, if a broad overview is required in a large sample of young, healthy volunteers, then asking a small number of single questions may be an efficient and accurate methodology in some circumstances.
In this study we were not able to include every published questionnaire relating to our cognitive functions of interest, nor every naturalistic cognitive task. However, those we examined here are widely used and, consequently, we believe they are representative of the extant literature.
In summary, the current study conveys four messages. First, imagination and navigation questionnaires reflect performance on naturalistic task measures of their purported cognitive functions. Second, memory questionnaires are associated with autobiographical memory vividness and not AI internal (episodic) details. Third, imagery questionnaires are better associated with autobiographical memory vividness and future thinking than the questionnaires purporting to reflect these functions. Finally, initial exploratory analyses (requiring replication) suggest that a broad profile of information may be obtained efficiently using a small number of simple single questions, and these model task performance comparably to the established questionnaires in the current sample of young, healthy adults. Overall, while some questionnaires can act as proxies for behaviour, the relationships between memory and future thinking tasks and questionnaires are more complex and require further elucidation.

Funding
The authors were supported by a Wellcome Principal Research Fellowship to E.A. Maguire (101759/Z/13/Z) and the Centre by a Strategic Award from Wellcome (203147/Z/16/Z).

Data availability
The data used in the analyses presented here are available in the Supplementary Materials.