A Visual Remote Associates Test and Its Validation

The Remote Associates Test (RAT) is a widely used test for measuring creativity, specifically the ability to make associations. The Remote Associates Test normally takes a linguistic form: given three words, the participant is asked to come up with a fourth word associated with all three of them. While visual creativity tests do exist, no creativity test to date can be given in both a visual and linguistic form. Such a test would allow the study of differences between various modalities, in the context of the same creative process. In this paper, a visual version of the well-known Remote Associates Test is constructed. This visual RAT is validated in relation to its linguistic counterpart.

Complex creativity tasks, like the solving of practical insight problems, might elicit both linguistic and visual creativity. Creativity test batteries which include both visual and linguistic tests do exist, like the Torrance Tests of Creative Thinking (TTCT), which contains both verbal and figural tests (Kim, 2006). However, no creativity evaluation task or test exists which can be given separately in both linguistic and visual forms, thus affording cross-domain comparison of a particular set of creative processes. Such a test would make it possible to: (i) check whether the same creative processes act across the visual and linguistic domains; (ii) compare performance results in various domains; and (iii) posit domain-relevant differences.
Aiming to fill this gap, this paper takes a well-established creativity test, the Remote Associates Test (Mednick and Mednick, 1971), and describes an approach toward developing a visual derivative of this test.
A computational linguistic solver for this test, comRAT-C (Olteţeanu and Falomir, 2015), was previously implemented under a theoretical creative problem-solving framework (CreaCogs; Olteţeanu, 2014; Olteţeanu, 2016). Part of the formalization for comRAT-C is used to inform the creation of a visual form of the Remote Associates Test.
The rest of the paper is organized as follows. The Remote Associates Test and the construction of its visual counterpart (vRAT) are discussed in the next section. Two studies with human participants who were given vRAT queries are described in the Studies section. Results of these studies are presented in the Results section. A discussion of the visual RAT items and normative data follows the Results section, where further work is also proposed.

AN APPROACH FOR CREATING THE VISUAL REMOTE ASSOCIATES TEST
The Remote Associates Test (RAT), originally devised by Mednick and Mednick (Mednick and Mednick, 1971), aims to reflect the creative ability of the participant through measuring their skill at remote compound linguistic association. In the RAT, participants are given three words (for example, CREAM, SKATE and WATER) and asked to come up with a fourth word which relates to all of them. A good answer to this particular query is ICE.
The Remote Associates Test has been widely used in the literature (Dorfman et al., 1996; Ansburg, 2000; Ansburg and Hill, 2003; Ward et al., 2008; Cai et al., 2009; Cunningham et al., 2009). Stimuli for this test exist not just in English (Bowden and Jung-Beeman, 2003), but also in German (Landmann et al., 2014), Chinese (Shen et al., 2016; Wu and Chen, 2017), Italian (Salvi et al., 2016), and Romanian (Olteţeanu et al., 2019b), among other languages. An approach toward generating functional RAT queries has also been proposed (Olteţeanu et al., 2019a), enhancing the repository of available RAT queries. Furthermore, a computational solver exists that solves the compound RAT (Olteţeanu and Falomir, 2015) and correlates in performance (both Accuracy and Response Times) with existing normative data (Bowden and Jung-Beeman, 2003). Also, a computational generator of RAT queries was implemented and shown to be useful in designing empirical explorations with a high degree of control.
Bowden and Jung-Beeman have proposed normative data on 144 compound RAT problems (Bowden and Jung-Beeman, 2003). Besides the compound (or structural) form of the Remote Associates Test, in which the relationship between query words and answer words is linguistic, Worthen and Clark (Worthen and Clark, 1971) argued that some of the items proposed by Mednick and Mednick are functional, that is, the relationship between the items is a functional one (e.g., items like "bird" and "egg") rather than just a structural one (e.g., items like "black" and "magic"). The functional items proposed by Worthen and Clark were lost; however, a computational implementation aimed to recover their concept by generating functional versions of the RAT (Olteţeanu et al., 2019a).
In a previous formalization which aimed at computationally solving this test (Olteţeanu and Falomir, 2015), the Remote Associates Test was described as follows: three words $w_a$, $w_b$, $w_c$ are given to the participant; a word $w_x$ which relates to all these words needs to be found. In solving the compound RAT, $(w_a, w_x)$, $(w_b, w_x)$, and $(w_c, w_x)$, or the reverse-ordered terms $(w_x, w_a)$, $(w_x, w_b)$, $(w_x, w_c)$, have to be composed words in the language in which the RAT is given. In the case of composed words, $w_z$ might be a word composed of a query word and a solution word, $(w_x w_a)$ or $(w_a w_x)$. For example, for the query AID, RUBBER and WAGON, the answer term BAND constructs joint composed words with some of the query terms (BAND-AID, BANDWAGON), but not with others (RUBBER BAND). Note that the answer word is also not in the same position in the three linguistic structures.
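The compound RAT condition above can be illustrated with a minimal sketch. It is not the comRAT-C implementation: the function name, the normalization step and the toy lexicon are assumptions made purely for illustration. The check treats hyphenated, spaced and joined compounds alike and accepts the answer word in either position.

```python
def solves_compound_rat(query_words, candidate, compounds):
    """Return True if `candidate` forms a known compound with every query word.

    The compound may place the candidate first or second, and may be written
    joined, hyphenated or spaced (all three are normalized to one form).
    """
    def normalize(text):
        return text.lower().replace("-", "").replace(" ", "")

    known = {normalize(c) for c in compounds}
    cand = normalize(candidate)
    return all(
        normalize(w) + cand in known or cand + normalize(w) in known
        for w in query_words
    )


# Hypothetical toy lexicon; a real solver would need a large compound corpus.
TOY_COMPOUNDS = {
    "band-aid", "rubber band", "bandwagon",
    "ice cream", "ice skate", "ice water",
}

band_ok = solves_compound_rat(["AID", "RUBBER", "WAGON"], "BAND", TOY_COMPOUNDS)
ice_ok = solves_compound_rat(["CREAM", "SKATE", "WATER"], "ICE", TOY_COMPOUNDS)
band_wrong = solves_compound_rat(["CREAM", "SKATE", "WATER"], "BAND", TOY_COMPOUNDS)
```

In this sketch BAND is accepted for AID, RUBBER and WAGON even though it appears first in two compounds and second in the third, which mirrors the positional remark made in the text.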
In order to devise a visual RAT, the following approach extends this formalization from the linguistic to the visual domain. Thus, if the terms $w_a$, $w_b$, $w_c$, and $w_x$ stood for words in the linguistic RAT, in this visual approach they stand for visual representations of objects and scenes. The visual RAT can thus be described as follows: given visual entities $w_a$, $w_b$, $w_c$, there exists an entity $w_x$ which generally co-occurs visually with the other shown entities $w_a$, $w_b$ and $w_c$.
Applying this approach, visual queries can be created. For example, Figure 1 provides visual representations of the objects GLOVE, HANDLE and PEN. An appropriate answer to this query is HAND, because hands are visual entities that co-occur with each of the given three objects.
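A minimal sketch of this co-occurrence view follows, assuming that each visual entity is stored with a set of entities it has been seen with. The data structure, the function name and the listed associates (other than HAND) are hypothetical and only serve to illustrate the idea; no visual solver is described in the text.

```python
def visual_rat_candidates(query_entities, associates):
    """Return the entities that co-occur with every query entity.

    `associates` maps an entity name to the set of entities it has been
    observed with; unknown entities are treated as having no associates.
    """
    sets = [associates.get(entity, set()) for entity in query_entities]
    if not sets:
        return set()
    return set.intersection(*sets)


# Hypothetical co-occurrence knowledge, invented for illustration only.
TOY_ASSOCIATES = {
    "GLOVE": {"HAND", "SNOW", "COAT"},
    "HANDLE": {"HAND", "DOOR", "CUP"},
    "PEN": {"HAND", "PAPER", "DESK"},
}

answers = visual_rat_candidates(["GLOVE", "HANDLE", "PEN"], TOY_ASSOCIATES)
```

With this toy knowledge the only shared associate is HAND. A realistic version would presumably need graded association strengths rather than plain sets.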
The visual entity HAND can be considered a visual associate of each of the initial objects GLOVE, HANDLE and PEN. This notion of a visual associate is intended to play in this approach a role analogous to that of a linguistic associate in the linguistic RAT. If compound associates are possible in the linguistic compound RAT (band and aid) and functional associates are possible in the linguistic functional RAT (bird and egg) (Olteţeanu et al., 2019a), visual associates are meant to encompass both categories. A further differentiation could be made between objects that co-occur visually (compound-like) and objects that afford interactions between them (functional-like). This work will focus on establishing the visual RAT, without delving deeper into this differentiation. Each initial object is thus considered to have a variety of visual associates. Visual associate pairs which co-occur in a previously encountered visual scene or experience play the role that composed words or linguistic structures in which $w_a$ and $w_x$ co-occur play in the linguistic RAT. Thus, for a natural or artificial cognitive system to be able to solve queries like the one in Figure 1, it needs to be acquainted with visual experiences containing [...]
Visual queries do not have to involve body parts of the solver that interact with the given object; they could also represent objects in the environment and scenes. For example, Figure 2 shows participants a BATHTUB, GLASS and BEACH. A visual item co-occurring with each of them, and thus a potential answer, is WATER. The visual representations were chosen or crafted so that they do not show the answer: an empty bathtub and an empty glass are presented, and only a part of a beach that does not depict the sea is displayed.
This approach can be summarized as follows: (a) Visual objects or scenes replace words and expressions; (b) Visual relationships between objects take the place of linguistic relationships, whether these are relationships of co-occurrence or functional relationships; (c) The solver is expected to rely on principles of visual association more often than on principles of linguistic association.
Using this approach, a set of visual RAT queries was manually created. In the following section, this test is evaluated in comparison to the linguistic RAT, validated and considered in relation to other measures.

STUDIES
In order to evaluate the performance of humans in the visual RAT items designed, to construct an initial set of normative data, and to assess the potential relationships between performance in the visual RAT and performance in the linguistic compound RAT, two studies were conducted. The first study was completed by 42 participants who had previously solved the compound RAT. In order to validate the results, a second study was set up, for which power was calculated based on the result of the first study. In the second study, participants solved both visual and linguistic queries, and measures of verbal fluency were also applied.

Study 1
Method
Using the approach described in the previous section, a set of 46 visual RAT queries was manually created. These queries were administered to the participants of the study for validation.
The participants were first provided with instructions and example vRAT queries. They were then presented with the 46 vRAT queries in randomized order.

Participants
The participants from a previous study on compound RAT queries created with comRAT-G were invited to solve the visual RAT queries. Of the solvers of the previous compound RAT queries, 42 people (28 females and 14 males) participated; of these, four participants left more than 20% of the queries of either the visual or the linguistic RAT unattempted, and hence the data analysis was done for N = 38. Figure 3 shows the demographic distribution of the Study-1 participants. The majority of the participants belonged to the age bracket of 30-40 years (19), had finished their undergraduate degree (17) and had rated their creativity (14) and problem-solving skills (18) as "above-average."
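The exclusion rule above can be sketched as follows. This is an illustrative reading, not the analysis script used in the study: the function name and the representation of an unattempted query (None or an empty string) are assumptions.

```python
def should_exclude(answers_by_test, threshold=0.20):
    """Return True if more than `threshold` of the queries in any test
    were left unattempted.

    `answers_by_test` maps a test name to the list of a participant's raw
    answers; None or a blank string is counted as unattempted (assumption).
    """
    for answers in answers_by_test.values():
        if not answers:
            continue
        unattempted = sum(1 for a in answers if a is None or not a.strip())
        if unattempted / len(answers) > threshold:
            return True
    return False


# Toy participant: 10 of 46 vRAT queries skipped (about 21.7%), so excluded.
skipped_many = {"vRAT": [None] * 10 + ["hand"] * 36, "linguistic": ["ice"] * 48}
# Toy participant: 9 of 46 skipped (about 19.6%), so retained.
skipped_few = {"vRAT": [None] * 9 + ["hand"] * 37, "linguistic": ["ice"] * 48}

exclude_first = should_exclude(skipped_many)
exclude_second = should_exclude(skipped_few)
```

The comparison is strict, so a participant who skipped exactly 20% of a test is retained under this reading of "more than 20%".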

Creativity Metrics
Accuracy was the first creativity metric used for this study, in the form of vRAT accuracy, comRAT-G accuracy, B-JB accuracy and linguistic RAT accuracy. These stood for:
- vRAT accuracy: the number of correctly answered vRAT queries;
- comRAT-G accuracy: the number of correctly answered queries from the corpus created with comRAT-G; and
- B-JB accuracy: the number of correctly answered Bowden & Jung-Beeman queries.
The comRAT-G accuracy and B-JB accuracy were taken from the participants' previous performance in the comRAT-G study.
This was compared to their performance in the visual RAT. The linguistic RAT accuracy is the sum of comRAT-G accuracy and B-JB accuracy, denoting the total number of linguistic RAT queries answered correctly. Response Times were the second metric used. vRAT RT, comRAT-G RT and B-JB RT recorded the mean response times of the participants when correctly answering the corresponding vRAT, comRAT-G, and B-JB queries.
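As an illustration of these two metrics, the sketch below computes accuracy as a count of correct answers and the response-time metric as the mean over correct answers only. The input format (a list of correctness and response-time pairs) and the function name are assumptions, and the numbers are invented.

```python
from statistics import mean


def accuracy_and_mean_rt(responses):
    """Return (accuracy, mean RT of correct answers) for one participant.

    `responses` is a list of (is_correct, rt_seconds) pairs. The mean RT is
    None when no query was answered correctly.
    """
    correct_rts = [rt for is_correct, rt in responses if is_correct]
    accuracy = len(correct_rts)
    mean_rt = mean(correct_rts) if correct_rts else None
    return accuracy, mean_rt


# Invented toy data for a single participant.
comrat_g = [(True, 10.0), (False, 30.0), (True, 14.0)]
b_jb = [(True, 8.0), (False, 25.0)]

comrat_g_acc, comrat_g_rt = accuracy_and_mean_rt(comrat_g)
b_jb_acc, b_jb_rt = accuracy_and_mean_rt(b_jb)

# Linguistic RAT accuracy is the sum of the two linguistic accuracies.
linguistic_acc = comrat_g_acc + b_jb_acc
```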

Study 2
Method
For further validation, the same set of vRAT queries used in the first study was also used in the second study. To study the potential relationships between performance in the linguistic RAT and the visual RAT, participants were presented with 46 visual RAT queries to solve, and then with 48 linguistic RAT queries. The linguistic RAT queries were a randomized mix of 24 comRAT-G queries and 24 queries from the Bowden & Jung-Beeman dataset (Bowden and Jung-Beeman, 2003). The 48 linguistic queries were the same for all participants but were presented to each of them in a different order.

Procedure
The participants for the second study were recruited using two platforms: Figure-Eight (F8) and Mechanical Turk (MTurk). After enrolling for the test on either of the platforms, they were redirected to our website, where the study was set up using jsPsych.
Each participant was first asked a set of questions about their gender, age group, education, creativity and problem-solving skills. After the demographic questions, the participants were administered two verbal fluency tests: a phonemic test (with "F," "A," and "S" as stimulus letters) and a semantic test (with "animal," "fruit" and "furniture" as categories). In these verbal fluency tests, participants were asked to list as many words as they could in one minute for each of the verbal fluency stimuli.
Then, the participants were presented with the instructions for solving the visual RAT, two example queries and their answers. After this, they were presented with visual RAT queries in random order. The response time for every response was recorded.
After solving the visual RAT queries, participants were presented with the instructions for solving the linguistic RAT. They were then shown two example queries, one at a time, and asked to try to answer each before the correct answer was revealed. After this, the participants were asked to solve 48 linguistic RAT queries.

Participants
26 people (15 female and 11 male) participated from F8 and 144 people (67 female and 77 male) participated from MTurk. Figure 4 shows the demographic distribution of the 170 participants in Study-2. The majority of participants belonged to the age bracket of 30-40 years (66), had finished an undergraduate degree (78) and had rated their creativity (84) and problem-solving skills (71) as "average." 3 participants from F8 and 6 participants from MTurk had left more than 20% of the queries unanswered and hence were not included in the data analysis.

Creativity Metrics
The F-A-S Test (Patterson, 2011) is a verbal fluency test in which a participant lists as many words as they can think of starting with the letters "F," "A," and "S" within a specified timeframe, usually 1 min for each of those letters. The F score, A score and S score recorded the number of words participants produced starting with the corresponding letters. The FAS score was calculated as the sum of these three scores. The Category score recorded the

RESULTS
The following section presents the results of the two studies.

Results - Study-1

Descriptive Data
Each vRAT query was answered correctly on average by 18.8 participants (SD = 11.98). The query "BOTTLE-GRAPE-CELLAR" (answer: "WINE") was found to be the easiest query, with 40 participants answering it correctly. "HAND MIRROR-PURSE-RED MARK" (answer: "LIPSTICK") was the most difficult query, with only 1 participant answering it correctly. Of the 46 vRAT queries, 21 (45.65%) were answered correctly by more than half of the participants.
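Per-query descriptive statistics of this kind can be derived from a participant-by-query correctness matrix. The sketch below is only an illustration under that assumed data layout; the function name, the query labels and the toy matrix are hypothetical and do not reproduce the study data.

```python
from statistics import mean, stdev


def query_difficulty(correct_matrix, query_names):
    """Summarize how many participants solved each query.

    `correct_matrix[p][q]` is True if participant p solved query q.
    """
    n_participants = len(correct_matrix)
    solved_by = {
        name: sum(1 for row in correct_matrix if row[q])
        for q, name in enumerate(query_names)
    }
    counts = list(solved_by.values())
    # Queries solved by strictly more than half of the participants.
    majority = [name for name, c in solved_by.items() if c > n_participants / 2]
    return {
        "solved_by": solved_by,
        "mean": mean(counts),
        "sd": stdev(counts),
        "easiest": max(solved_by, key=solved_by.get),
        "hardest": min(solved_by, key=solved_by.get),
        "majority_share": len(majority) / len(query_names),
    }


# Invented toy data: 4 participants, 3 queries.
toy_matrix = [
    [True, True, False],
    [True, False, False],
    [True, True, False],
    [True, False, True],
]
summary = query_difficulty(toy_matrix, ["Q1", "Q2", "Q3"])
```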

Correlations
The correlation between vRAT accuracy and comRAT-G accuracy was observed to be 0.431 (p < 0.01), as shown in Table 1. For calculating the correlations between response times, outliers were found using the Inter-Quartile Range method and removed. A significant correlation between response times of correct responses for vRAT and linguistic RAT queries was observed (n = 38, r = 0.477, p < 0.002).
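The response-time analysis described here (Inter-Quartile Range outlier removal followed by a correlation) can be sketched as below. Several details are assumptions, since the text does not state them: the conventional 1.5 x IQR fences, the quartile method (Python's default "exclusive" method), the removal of a participant when either of their two values is an outlier, and the use of Pearson's r.

```python
from math import sqrt
from statistics import mean, quantiles


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences at k * IQR beyond the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outlier_pairs(xs, ys, k=1.5):
    """Keep only the pairs whose x and y both lie within their IQR fences."""
    lo_x, hi_x = iqr_bounds(xs, k)
    lo_y, hi_y = iqr_bounds(ys, k)
    kept = [
        (x, y) for x, y in zip(xs, ys)
        if lo_x <= x <= hi_x and lo_y <= y <= hi_y
    ]
    return [x for x, _ in kept], [y for _, y in kept]


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Invented toy response times (seconds); the last vRAT value is an outlier.
vrat_rt = [1.0, 2.0, 3.0, 4.0, 5.0, 100.0]
ling_rt = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]

clean_v, clean_l = drop_outlier_pairs(vrat_rt, ling_rt)
r = pearson_r(clean_v, clean_l)
```

Significance testing is left out of the sketch; in practice a statistics package would report the p value alongside r.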

Validity
As a reliability metric, Cronbach's alpha was calculated for the vRAT data gathered. Cronbach's alpha is a measure of internal consistency, that is, how closely related a set of items are as a group. It is considered to be a measure of scale reliability. A Cronbach's alpha above 0.75 is considered to show high internal consistency. An alpha value of 0.751 was observed for the accuracy of the participants in vRAT queries.
Table 1 note: correlation significance levels are indicated as follows: 0.05, "*"; 0.01, "**".
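For reference, Cronbach's alpha for k items is commonly computed as k/(k-1) multiplied by one minus the ratio of the summed item variances to the variance of the total scores. The sketch below applies this standard formula to a participant-by-item score matrix; the function name and the toy data are assumptions, and this is not the analysis code used in the studies.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a participant-by-item score matrix.

    `scores` is a list of rows, one per participant, each holding one score
    per item (for example 1 for a correct answer and 0 otherwise).
    """
    k = len(scores[0])
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Invented toy data: items that always agree give an alpha of 1.
consistent = [[0, 0, 0], [1, 1, 1], [0, 0, 0], [1, 1, 1]]
# Invented toy data: two unrelated items give an alpha of 0.
unrelated = [[1, 1], [1, 0], [0, 1], [0, 0]]

alpha_consistent = cronbach_alpha(consistent)
alpha_unrelated = cronbach_alpha(unrelated)
```

The function assumes at least two items and non-zero variance in the total scores; it does not guard against those degenerate cases.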

Results -Study-2
For the analysis, the responses of the participants from both platforms were grouped into three samples: data from F8 participants, data from MTurk participants, and combined data from all F8 and MTurk participants.

Descriptive Data
Each vRAT query was solved on average by 15.62 participants (SD = 6.01). Each participant spent on average 13.0 s (SD = 7.22) on the vRAT queries. For the total phonemic verbal fluency metric (FAS score), a mean of 46.96 words (SD = 11.71) was observed. For the total semantic verbal fluency metric (Category score), a mean of 41.64 words (SD = 11.33) was observed. Table 2 shows detailed statistics on the verbal fluency scores and RAT accuracy metrics. Table 3 shows the descriptive statistics on response time metrics.
For calculating the correlations between response times, outliers were found using the Inter-Quartile Range method and removed. A strong, significant correlation was observed between the response times for vRAT and linguistic RAT queries (n = 170, r = 0.70, p < 0.001); this was a consequence of correlations between performance in the visual RAT and the comRAT-G items, and also between the visual RAT and B-JB items. A significant correlation was also observed between the response times for the correct responses for comRAT-G queries and the Bowden & Jung-Beeman queries (n = 170, r = 0.49, p < 0.001). Table 5 shows the correlations between the response times for the RAT metrics.
The correlations for the individual samples of F8 and MTurk participants can be found in the Supporting Information section at the end. No significant correlations were found between participants' self-ratings of creativity or problem-solving skills and their performance in the vRAT or the linguistic RAT.

Validity
To check the reliability of the data, Cronbach's alpha was calculated for all RAT accuracy metrics of all the samples, as shown in Table 6. The Cronbach's alpha for response time metrics is shown in Table 7. All alpha values for the visual RAT are above 0.75, showing that the results have high internal consistency.

DISCUSSION
This paper focused on the creation and validation of a set of visual RAT queries. As can be seen from the results, the validation of the visual queries was successful. A significant and positive correlation was observed between the performance of the visual RAT solvers and their performance in linguistic queries given in a previous session (Study 1) or in the same session (Study 2). This suggests that the visual RAT queries may be a way to capture the associative factor of creativity in the visual domain, and that future versions of the RAT could be given, using these stimuli, in two sensory modalities.
A few participants presented an interesting pattern, performing very well on one of the RATs and only averagely or very poorly on the other. For example, participant 42802 from Study-2 performed exceptionally well in vRAT queries (vRAT accuracy = 37, vRAT mean = 15.62) but had an average linguistic RAT accuracy (=24, linguistic RAT mean = 20.71). In Study-2, participant 69355 had an extraordinary linguistic RAT accuracy (=45, linguistic RAT mean = 20.71) but only an above-average performance in vRAT (vRAT accuracy = 20, vRAT mean = 15.62). Participant 28941 from Study-2 performed very well in both vRAT (vRAT accuracy = 25) and linguistic RAT queries (linguistic RAT accuracy = 33). In Study-1, participant 97828 had a very high vRAT accuracy of 38 (vRAT mean = 20.94) but an average linguistic RAT accuracy of 54 (linguistic RAT mean = 51.26) in our previous linguistic RAT study from which they were recruited. These few outlier cases open the door to interesting questions regarding how much such outliers rely on visual versus linguistic skill (rather than some pure form of association skill), or whether their association skill in one modal domain is much stronger than in the other. Whether and how association skill can be analyzed separately from modal ability is a question for further work.
One of the advantages of being able to give the RAT in a different modality is that such a version can be administered to participants with different native languages. In parallel to [...]
These empirical studies show that the visual RAT is a highly reliable tool, related to the linguistic RAT. What they do not yet show is to what extent the queries are processed visually. Future experimentation with fMRI or EEG equipment would be necessary to make any statements on this matter. The contribution or collaboration of neuroscientists in this direction is most welcome.
As pointed out by our reviewers, the extent to which linguistic associations are used by participants solving the visual RAT, and visual associations by participants solving the linguistic RAT, is currently hard to assess. One way to do this would be to establish a set of queries which have both linguistic and visual associations, and observe which solving route participants are most likely to take (and whether this is dependent on their skill in those particular domains).
As future work, we plan to implement the mechanisms for comRAT-C (Olteţeanu and Falomir, 2015) in a computational solver for the visual Remote Associates Test. This will allow us to check whether the correlation between the comRAT-C probability and human performance in linguistic queries (r = 0.49, p < 0.002) is maintained in the visual domain between a computational solver and human performance. A computational implementation of a visual RAT solver would require the gathering of data on association strength in the visual domain.
A very interesting potential future step would be the application of the RAT formalization to create RAT stimuli in a third modality. Currently, the auditory and smell modalities are considered as possibilities. The initial difficulties encountered with the smell modality are related to the stimuli themselves. Attempting to obtain the stimuli from perfumers and consulting with them on the matter has made us aware that some of them are reluctant to describe their craft as a representational art (that is, one in which smells indicate or stand in for objects), and rather see it as a trigger for other sensations and emotions (e.g., various smell combinations standing in for "freshness"). Thus artists of the smell modality already operate, to a certain extent, with associations, except that these associations may not point to a distinct object, but to a quality or sensation which may be, at times, hard to describe linguistically.
If comRAT-V could solve the visual RAT, it would be interesting from a computational perspective to obtain a secondary measurement for the associative process in CreaCogs, in the form of a different task. Our initial work in the direction of a second task is the Codenames board game (Zunjani and Olteteanu, 2019).
In summary, a good-sized set of visual RAT stimuli has been proposed and validated as part of this paper. The results show that our approach to creating a visual RAT is successful. The visual stimuli will be made available to other researchers for further validation, and for scholarly pursuit of a deeper understanding of the associative factor in creativity. This work opens the path to multi-modal exploration of the associative creativity process.

DATA AVAILABILITY STATEMENT
The datasets generated for this study are available on request to the corresponding author.

ETHICS STATEMENT
Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
A-MO conceptualized the visual RAT and designed the experiments for its validation. FZ contributed to the programming and online deployment of the experiments, and did the ratings and the analysis of the data collected from the experiments (except the first study). All authors wrote sections of the manuscript, contributed to its revision, and read and approved the submitted version.

FUNDING
A-MO was the recipient of the grant (OL 518/1-1) by the German Research Foundation (Deutsche Forschungsgemeinschaft) for the project Creative Cognitive Systems (CreaCogs). This grant also supported the research stay of FZ at the Freie Universität Berlin of which the submitted work was a part.