Predicting Context-dependent Cross-modal Associations with Dimension-specific Polarity Attributions. Part 2: Red and Valence

Studies convincingly show that the task-specific context can influence the associative meaning of a concept in a task (e.g., Blair, 2002; Elliot & Maier, 2014; Frühholz, Trautmann-Lengsfeld, & Herrmann, 2011; Lakens, Fockenberg, Lemmens, Ham, & Midden, 2013). For example, studies suggest that in an achievement context, red is associated with failure and negativity (e.g., Moller, Elliot, & Maier, 2009), while in a romantic context, red is associated with attraction and excitement (e.g., Elliot & Pazda, 2012). Although context effects have repeatedly been demonstrated, it remains difficult to predict how salient features in the context influence the meaning of specific concepts.
In a first series of studies (see Schietecat, Lakens, IJsselsteijn, & de Kort, 2018), we introduced the dimension-specificity hypothesis to understand and predict the context-dependency of cross-modal associations between concrete concepts (e.g., bright vs. dark) and affective abstract concepts (e.g., aggression vs. calm). Building on the affective theory of meaning (Osgood, Suci, & Tannenbaum, 1957; see Part 1, Schietecat et al., 2018, for a more detailed discussion) and the polarity correspondence principle (Proctor & Cho, 2006), the dimension-specificity hypothesis predicts that cross-modal associations emerge depending on which affective dimension of meaning (i.e., the evaluation, activity, or potency dimension) is most salient in a specific context. The salience of dimensions of meaning depends, in part, on the relative conceptual distances between bipolar opposed concept pairs (e.g., good vs. bad). More specifically, we suggested that when a concept pair (i.e., bipolar opposed concepts) is present in a cognitive task, the dimension(s) on which the bipolar opposed concepts differ most will become salient. For example, when people process the concepts 'good' and 'bad', which are highly opposed on the evaluation dimension, but less strongly opposed on the activity dimension (Osgood et al., 1957), the evaluation dimension will become more salient than the activity dimension. As a consequence, the evaluation dimension will underlie cross-modal associations between concrete and affective abstract concepts that emerge in the cognitive task. People will attribute positive polarity to one of the polar opposites, and negative polarity to the opposite pole (i.e., polarity attribution, Osgood et al., 1957).
Based on the polarity correspondence principle (Proctor & Cho, 2006) and the affective theory of meaning (i.e., parallel polarity, see Osgood et al., 1957), we expected that associations between concepts emerge when both are attributed 'plus' or both are attributed 'minus' on the bipolar dimension.
When multiple concept pairs are present in a cognitive task (e.g., in an affective priming paradigm), the dimension-specificity hypothesis predicts that a weighting process will determine which of the activated dimensions becomes most salient in the task-specific context. For example, when categorizing red-, green-, positive-, and negative-related stimuli in a cognitive task, both concept pairs (i.e., red vs. green, and positive vs. negative) have the highest conceptual distance on the evaluation dimension, which will therefore receive the most weight, and thus should become the salient dimension of meaning that underlies cross-modal associations in the cognitive task. Based on the salient dimension in the task, plus and minus polarities will be attributed to the bipolar concepts (e.g., green and positive will both be assigned a plus polarity on the evaluation dimension, whereas red and negative will both be assigned a minus polarity). Following the idea of polarity correspondence (Proctor & Cho, 2006) and parallel polarity (Osgood et al., 1957), we expected that associations between concrete and affective abstract concepts that share plus or minus polarities will become activated (e.g., green and positive, red and negative). For a more extensive discussion of the affective theory of meaning and the dimension-specificity hypothesis, see Part 1 (Schietecat et al., 2018). Figure 1 illustrates the proposed process for predicting context-dependent cross-modal associations with dimension-specific polarity attributions.
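The weighting step described above can be made concrete with a small sketch. It is an illustration only: the numeric distances, the rule of summing distances per dimension, the tie tolerance, and the names `DISTANCES` and `salient_dimensions` are all assumptions introduced here, not values or procedures reported in these studies.

```python
# Illustrative sketch only. The distances below are invented for the example;
# they are NOT measured values from Osgood et al. (1957) or from this paper.
DISTANCES = {
    ("red", "green"): {"evaluation": 0.9, "activity": 0.2, "potency": 0.1},
    ("red", "blue"): {"evaluation": 0.1, "activity": 0.9, "potency": 0.1},
    ("positive", "negative"): {"evaluation": 0.9, "activity": 0.1, "potency": 0.1},
}


def salient_dimensions(pairs, tol=0.05):
    """Return the dimension(s) that receive the most weight in a task.

    Assumed weighting rule: sum each dimension's conceptual distance over
    all concept pairs present in the task. Dimensions whose total lies
    within `tol` of the maximum are treated as tied.
    """
    totals = {}
    for pair in pairs:
        for dim, dist in DISTANCES[pair].items():
            totals[dim] = totals.get(dim, 0.0) + dist
    top = max(totals.values())
    return sorted(dim for dim, weight in totals.items() if top - weight <= tol)


if __name__ == "__main__":
    # Red-green colors with positive-negative words
    print(salient_dimensions([("red", "green"), ("positive", "negative")]))
    # Red-blue colors with positive-negative words
    print(salient_dimensions([("red", "blue"), ("positive", "negative")]))
```

With these made-up values, the red-green task yields a single salient dimension (evaluation), whereas the red-blue task yields a tie between activity and evaluation, which is the kind of situation in which the hypothesis predicts no clear polarity attribution.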
In a first series of experiments (see Part 1, Schietecat et al., 2018), we tested the dimension-specificity hypothesis by predicting context-dependent associations between aggression-related concepts and colors, saturation, and brightness. We manipulated the context by adjusting the opposing color stimuli (Experiment 1), the presentation of the brightness stimuli (Experiment 2), and the labels of the task (Experiment 3). The results showed preliminary support for the emergence of dimension-specific polarity attributions. In the current manuscript we aim to test whether the dimension-specificity hypothesis can also explain and predict the context-dependency of associations people have with one of the most studied colors in psychology, namely the color red. If the dimension-specificity hypothesis can explain well-known context effects in the literature investigating associations between red and valence, this would further strengthen the idea that the dimension-specificity hypothesis is a useful theory to predict cross-modal associations between concrete and affective abstract concepts. In addition, we aim to test a novel prediction of the dimension-specificity hypothesis, namely the idea that when a concept pair is not characterized by a meaningful opposition on the salient dimension (e.g., positive-negative on the activity dimension), the polar opposites of that concept pair will not form a plus or a minus pole on that dimension, and therefore no strong association will emerge based on parallel polarity.

Context-dependency of the color red
Research on the influence of color on psychological functioning has become more popular in the last decade, in fields such as color psychology and consumer behavior (e.g., Elliot, 2015). The color red has been of particular interest, and studies have shown that the meaning of the color red is context-dependent (Elliot, 2015). For example, red has been proposed to elicit feelings of attractiveness (e.g., Elliot, Greitemeyer, & Pazda, 2013; Young, 2015; but see also Lehmann & Calin-Jageman, 2017; Peperkoorn, Roberts, & Pollet, 2016, for studies that do not support this specific effect of the color red). Red also shows a strong association with danger (Pravossoudovitch, Cury, Young, & Elliot, 2014). In addition, red is suggested to be associated with negativity (e.g., Gil & Le Bigot, 2015; Moller, Elliot, & Maier, 2009) and anger (e.g., Fetterman, Robinson, & Meier, 2012; Young, Elliot, Feltman, & Ambady, 2013), but also with warmth and excitement (e.g., Bennett & Rey, 1972).
Polar oppositions in the paradigms used to study color associations seem to influence the associations people have with the color red. In daily life, a red-green opposition is commonly used to communicate the difference between a problematic, dangerous situation (e.g., a 'stop' traffic light, wrong answers on a test, a low battery level on an electronic device) and a situation where everything is ok and safe (e.g., a 'go' traffic light, correct answers on a test, a full battery on an electronic device). Not surprisingly, red in the context of green is often associated with negativity and danger (e.g., Moller et al., 2009; Pravossoudovitch et al., 2014). However, a red-blue opposition is more commonly used to express temperature differences (e.g., the hot vs. cold tap in a kitchen). Therefore, red in the context of blue is generally associated with warmth (e.g., Bennett & Rey, 1972; Ho, van Doorn, Kawabe, Watanabe, & Spence, 2014).

Current Experiment
In the current set of studies, we aim to investigate whether associations between red and valence can be predicted based on the dimensions of meaning (i.e., the evaluation or activity dimension) that are activated by the opposing concepts in the task. We manipulated the color opposition (i.e., either red vs. blue or red vs. green stimulus figure categorizations) and the word pairs (i.e., positive vs. negative, aggressive vs. calm, and enthusiastic vs. relaxed attribute categorizations) across a set of five implicit association tests (IATs; Greenwald, McGhee, & Schwartz, 1998). Based on earlier studies that have examined how polar opposites activate conceptual dimensions (e.g., Lakens, 2012; Lakens, Semin, & Foroni, 2012; Paradis & Willners, 2011), we expected that red in opposition to green would activate the evaluation dimension based on its association with 'dangerous' versus 'safe', whereas red in opposition to blue would activate the activity dimension based on its association with 'hot' versus 'cold' (Osgood et al., 1957). We relied on these context-dependent associations to test a prediction from the dimension-specificity hypothesis that red as opposed to blue would be associated with highly active (negative) concepts such as aggression (Experiment 4), but also with highly active (positive) concepts such as enthusiasm (Experiment 5).

Sample size justification and statistical analyses
The sample size for Experiment 1 was determined a priori based on a target of .9 power with at least medium effect sizes expected, using statistical power analysis (G*Power; Faul, Erdfelder, Lang, & Buchner, 2007). Experiments 2, 3, 4, and 5 were part of educational courses, and therefore the sample consisted of all the students participating in the course. Across all experiments, extremely fast and slow responses were removed by excluding values that were more than 1.5 times the interquartile range above the third quartile, or more than 1.5 times the interquartile range below the first quartile, within participants (Tukey, 1977). The outliers and errors are reported (in percentages) for each experiment in the results section. For each study, we report frequentist and Bayesian statistics, confidence intervals around Hedges' g based on d_av, and the Bayes Factor (JZS) with an r-scale of .707 (Rouder, Speckman, Sun, Morey, & Iverson, 2009). When we expected no or a weak IAT effect to emerge, we additionally report the results of equivalence tests (i.e., the 'two one-sided t-tests approach', see Lakens, Scheel, & Isager, 2018). For completeness, we report the robust statistics using the Yuen-Welch method for comparing 20% trimmed means (Wilcox, 2012; Wilcox & Tian, 2011) in Appendix A of the supplementary materials.
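As a minimal sketch of the within-participant outlier rule, the following applies Tukey's fences to each participant's response times. The function names, the data layout (a dictionary of response-time lists keyed by participant), and the quartile interpolation method are assumptions made for illustration; statistical packages differ in how they compute quartiles, so exact cut-offs may differ slightly from those used in the reported analyses.

```python
import statistics


def tukey_filter(rts, k=1.5):
    """Keep response times inside Tukey's fences [Q1 - k*IQR, Q3 + k*IQR].

    Quartiles use the 'inclusive' interpolation method of the standard
    library; this choice is an assumption, not the authors' documented one.
    """
    q1, _, q3 = statistics.quantiles(rts, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [rt for rt in rts if low <= rt <= high]


def filter_by_participant(data):
    """Apply the fences separately to each participant's response times."""
    return {pid: tukey_filter(rts) for pid, rts in data.items()}


if __name__ == "__main__":
    # Hypothetical response times (ms) for one participant
    example = {"p01": [500, 520, 510, 530, 540, 505, 2000, 100]}
    print(filter_by_participant(example))
```

Computing the fences per participant, rather than over the pooled data, keeps a generally slow or fast participant from having most of their trials flagged as outliers.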

Experiment 1: Red, Green, Blue, and Valence
In Experiment 1, we first tested whether red as part of a red-green opposition would be more strongly associated with negativity than red as part of a red-blue opposition by asking participants to perform two IATs. We expected that the red-green opposition (as presented in stimulus colors and target labels) would have the highest conceptual distance on the evaluation dimension based on the association with 'danger' versus 'safe' (Osgood et al., 1957). Positivity and negativity are more highly opposed on the evaluation dimension than on the activity dimension (Osgood et al., 1957). Therefore, the positive-negative opposition (as presented in stimulus words and attribute labels) should also activate the evaluation dimension. Weighting the dimension distances of all the concept pairs, the evaluation dimension was expected to receive the most weight, and to become salient in this task. Based on the 'danger' versus 'safe' association (Pravossoudovitch et al., 2014), we expected that red would form the minus pole on the evaluation dimension, whereas green would form the plus pole on the evaluation dimension. Similarly, research has shown that positive forms the plus pole on the evaluation dimension, whereas negative forms the minus pole on the evaluation dimension (e.g., Lakens, 2012; Osgood & Richards, 1973). Based on these polarity attributions, we expected associations to emerge between red and negative, and green and positive.
In contrast, we expected that the red-blue opposition (as presented in stimulus colors and target labels) would have the highest conceptual distance on the activity dimension based on its association with hot versus cold. At the same time, the concept pair positive-negative (as presented in stimulus words and attribute labels) is highly opposed on the evaluation dimension (Osgood et al., 1957). When weighting the dimension distances of all the concept pairs, the evaluation and activity dimensions were expected to receive approximately equal weights. Therefore, there would be an equal chance that one of the two dimensions would become salient. Since red and blue are not expected to form an opposition on the evaluation dimension, and positive and negative are not expected to form an opposition in terms of activity, we expected no polarity attribution to occur, and therefore no, or at best a very weak, association to emerge between red and negative and blue and positive.

Participants
Forty-two participants (16 females, M age = 21.07 years), all students of Eindhoven University of Technology, volunteered to participate and received a monetary compensation of 5 euro.

Design
All participants performed two IATs, one with red and green figures, and one with red and blue figures, with the order of the IATs counterbalanced between participants. Each IAT consisted of practice blocks and two critical experimental blocks, where the color-valence pairings of the response keys were manipulated. In one block, red figures and negative words shared a response key (and green/blue figures and positive words shared the other response key), whereas in the other block red figures and positive words shared a response key (and green/blue figures and negative words shared the other response key). The order of the experimental blocks, as well as the response key assignment (whether red figures were assigned to the left or right response key), was counterbalanced between participants.

Procedure
Participants completed the two implicit association tasks in isolation in individual cubicles in front of computers. They were instructed to perform a categorization task where they would have to categorize words and figures. Following the typical procedure in the IAT, participants first received two practice blocks, one with the figures and the red-blue (or red-green) categorization task, and one with the word stimuli and the positive-negative categorization task. The practice blocks consisted each of 24 trials, in which every stimulus was presented 3 times. In a third block, words and colored figures had to be categorized simultaneously. During this block, each stimulus was presented 3 times, yielding a total of 48 trials. In a fourth block, participants received a practice block to learn the inverted response assignments for the word stimuli. In the last block, a combined classification task of 48 trials was presented, in which responding was inverted for one category compared to block three. After finishing the first IAT, the second IAT started, which followed the same procedure. After the second IAT, participants filled out a short demographic questionnaire.
The presentation of the stimuli and registration of the responses were controlled by E-prime software. All word stimuli were presented in black uppercase letters in the middle of a grey (RGB: 128, 128, 128) computer screen. Category labels (good, bad, red, blue; red, green) were shown at the top right and top left corners of the display, referring to the assignment of the categories to responses. Two keys on a QWERTY keyboard were used as response keys (A: left, L: right). In each trial, the stimulus remained on the screen until a response was registered. When participants entered an incorrect response, the stimulus remained on the screen, and an error message was displayed in black beneath the stimulus ("ERROR"). The intertrial interval was 500 milliseconds.

Results and Discussion
Erroneous responses (5%) and extremely fast or slow responses (6%) were excluded from the analysis. We calculated the mean response times in milliseconds for each participant for the two critical blocks of each of the two IATs. As expected, response times in the block where red figures and negative words were assigned to one response key and green figures and positive words were assigned to the other response key were faster (M = 528 ms, SD = 57 ms), compared to when one key had to be pressed for red figures and positive words, and the other key was assigned [...]
To investigate whether the order of the two IATs influenced this effect, we conducted a repeated measures ANOVA with order of the IAT as between-subjects factor. The results showed no significant interaction effect between order of the IAT and congruency, F(1, 39) = 0.345, p = .56, ηp² = .009. However, the study was not well-powered to observe such an interaction, so we continued to explore whether the order in which the IATs were performed might have affected reaction times of the red-blue IAT, which motivated a replication study discussed below. Results showed that the participants who performed the red-blue IAT second responded faster when red [...]
Therefore, we expected that the fact that participants sequentially performed the two implicit association tests with red presented in a different color opposition might have influenced their responses in the two IATs. Although the effect of block order on the salience of affective dimensions might be in itself an example of context effects in cross-modal associations, examining it falls outside of the scope of the current set of studies. To prevent possible context effects caused by performing more than one IAT in sequence, and to examine whether the prediction of no, or at best a weak, IAT effect from the dimension-specificity hypothesis would hold when participants performed only one IAT, we replicated only the red-blue, positive-negative IAT of Experiment 1.
Because red-blue is not expected to form an opposition on the evaluation dimension, and positive-negative does not form a strong opposition in terms of activity, we expected no, or at best a weak, association to emerge when only one IAT was performed.
The stimuli consisted of words and figures. The positive and negative words were identical to Experiment 1. The figures consisted of red (RGB: 255, 0, 0; xyY: .64, .33, 23.74) and blue (RGB: 0, 0, 255; xyY: .14, .07, 7.84) squares. Participants completed the test at home. The design and procedure of the IAT were similar to those of Experiment 1, except for the software that controlled the presentation of the stimuli and the registration of the responses (i.e., Inquisit), and the number of trials (i.e., 60 trials in each critical block, and 20 trials in the practice blocks). In addition, the intertrial interval was 250 milliseconds.

Results and Discussion
Five participants were excluded from the analysis because their mean response times were faster than 50 milliseconds, either due to technical issues or because they intentionally pressed the response keys as quickly as possible. Erroneous responses (7%) and extremely fast and slow responses (6%) were excluded from analysis. We calculated the mean response times in milliseconds for each participant for the two critical blocks. As expected, negative words and red figures (M = 632 ms, SD = 162 ms) were not categorized significantly faster when pressing the same key compared to positive words and red figures (M = 643 ms, SD = 166 ms), 95% CI [-63.17, 42.39], JZS BF01 = 4.76, t(29) = 0.4, p = .69, Hedges' g = 0.06, 95% CI [-0.37, 0.25]. These data provide stronger evidence for a null effect than for a relatively large IAT effect (such as observed in the red-green, positive-negative IAT). To test whether the observed IAT effect was smaller than the smallest effect size of interest (SESOI; Lakens et al., 2018), we conducted an equivalence test. We used a SESOI of Cohen's d = 0.45, based on the smallest IAT effect found in the first set of studies of Part 1 (i.e., Experiment 2B). The results indicated that the observed effect size (d = 0.07) was significantly within the equivalence bounds of d = -0.45 and d = 0.45, t(29) = 2.06, p = .024. These results suggest that we can reject an effect larger than d = 0.45, or smaller than d = -0.45. To investigate whether this observed IAT effect is significantly smaller than the effect observed in the red-green, positive-negative IAT of Experiment 1, we conducted a repeated measures ANOVA with color opposition (i.e., red-green vs. red-blue) as between-subjects factor. The results showed a significant interaction between color opposition and congruency, F(1, 70) = 8.63, p = .004, ηp² = 0.11. The results suggest that the observed IAT effect in Experiment 2 is significantly smaller than the observed IAT effect in the red-green, positive-negative IAT of Experiment 1.
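The equivalence test above can be approximately reconstructed from the reported summary statistics. The sketch below is not the authors' analysis script: it assumes a paired design in which the equivalence bounds are expressed in units of the standardized mean difference of the paired scores (d_z), it works from the rounded t value reported in the text, and the function names `t_sf` and `tost_paired` are introduced here for illustration. The t distribution's tail probability is obtained by numerical integration so that the example needs only the standard library.

```python
import math


def t_sf(t, df, steps=20000):
    """Upper-tail probability P(T > t) for Student's t (Simpson's rule)."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    h = t / steps  # steps must be even for Simpson's rule
    area = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    # Integral from 0 to t, subtracted from the upper half of the distribution
    return 0.5 - area * h / 3


def tost_paired(t_obs, n, bound):
    """Two one-sided tests with bounds of -bound and +bound (assumed d_z units).

    Returns the smaller of the two t statistics and its one-sided p value;
    equivalence is claimed when this p value falls below alpha.
    """
    shift = bound * math.sqrt(n)
    t_lower = t_obs + shift  # test against the lower bound (-bound)
    t_upper = shift - t_obs  # test against the upper bound (+bound)
    t_min = min(t_lower, t_upper)
    return t_min, t_sf(t_min, n - 1)


if __name__ == "__main__":
    # Reported values for Experiment 2: t(29) = 0.4, so n = 30; SESOI d = 0.45
    t_min, p = tost_paired(t_obs=0.4, n=30, bound=0.45)
    print(round(t_min, 2), round(p, 3))
```

Under these assumptions the sketch gives a t value of about 2.06 for the test against the upper bound, in line with the reported t(29) = 2.06.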
To confirm that red as opposed to blue forms no, or at best a weak, association with valence as compared to red as opposed to green, we conducted a second replication of the red-blue, positive-negative IAT. Whereas the lightness and shape of the stimuli of Experiment 2 were slightly different from the stimuli used in the red-blue, positive-negative IAT of Experiment 1, Experiment 3 was identical to Experiment 1.

Participants
Forty-one participants who did not participate in other experiments reported here, all students at Eindhoven University of Technology, volunteered to participate for partial course credit. Due to the setup of the course, 25 students participated in February, and 16 students participated in March. The design and the procedure were identical to Experiment 1, with the exception that participants only completed the red-blue, positive-negative IAT in Experiment 3.

Results and Discussion
Erroneous responses (7%) and extremely fast or slow responses (7%) were excluded from the analysis. We calculated the mean response times in milliseconds for each participant for the two critical blocks. As expected, negative words and red figures (M = 584 ms, SD = 132 ms) were not categorized significantly faster when pressing the same key compared to positive words and red figures (M = 592 ms, SD = 106 ms), 95% CI [-32.92, 48.58], JZS BF01 = 5.52, t(40) = 0.39, p = .70, Hedges' g = 0.06, 95% CI [-0.27, 0.4]. These data provide stronger evidence for a null effect than for a relatively large IAT effect. To test whether the observed IAT effect was smaller than the smallest effect size of interest, we conducted an equivalence test. As in Experiment 2, we used a SESOI of Cohen's d = 0.45. The results indicated that the observed effect size (d = 0.06) was significantly within the equivalence bounds of d = -0.45 and d = 0.45, t(40) = 2.49, p = .008. To investigate whether this observed IAT effect is significantly smaller than the effect observed in the red-green, positive-negative IAT of Experiment 1, we conducted a repeated measures ANOVA with color opposition (i.e., red-green vs. red-blue) as between-subjects factor. The results showed a significant interaction between color opposition and congruency, F(1, 81) = 11.63, p = .001, ηp² = 0.13. The results suggest that the observed IAT effect in Experiment 3 is significantly smaller than the observed IAT effect in the red-green, positive-negative IAT of Experiment 1. See Figure 2 for the mean reaction times for the critical blocks of the IATs of Experiments 1 to 3.
Together, the results of Experiments 2 and 3 suggest that the IAT effect was substantially smaller than a medium effect size and the relatively large effect observed in the red-green, positive-negative IAT conducted in Experiment 1. In addition, whereas red as compared to blue was associated with valence when participants first learned the red-negativity association in the red-green IAT (Experiment 1), no statistically detectable association emerged when participants completed only the red-blue IAT (Experiments 2 and 3). These data are in line with the dimension-specificity hypothesis.

Experiment 4: Red-blue versus aggressive-calm
Based on Experiments 1-3, we expected that red as opposed to blue would be associated with highly active concepts, regardless of whether these were negative such as aggression, or positive such as enthusiasm. In Experiment 4, we first tested the associations between red and blue colored figures, and aggression and calmness related concepts. As in Experiments 2 and 3, we expected that the concept pair red-blue (presented in stimulus colors and target labels) would have the highest conceptual distance on the activity dimension, and thus activate the activity dimension. Aggression as opposed to calmness (as presented in stimulus words and attribute labels) was expected to activate both the evaluation and activity dimension, as aggression-calmness is conceptually distant on the activity as well as the evaluation dimension (Russell & Mehrabian, 1977; pilot Study 2, see Appendix B in the supplementary materials of Part 1, Schietecat et al., 2018). Therefore, after weighting all dimensions, we expected that the activity dimension would receive most weight and would become salient, and determine the cross-modal associations in this task. Aggression and red were expected to form the plus pole on the activity dimension, whereas calmness and blue would form the minus pole. Following the principle of polarity correspondence, aggression and red, and calmness and blue should become associated.

Participants
Thirty-three participants (16 females, M age = 19.67 years) who did not participate in other experiments reported here, all students at Eindhoven University of Technology, volunteered to participate for partial course credits.

Design and Procedure
The stimuli consisted of words and figures. Words were either aggressive (i.e., furious, murder, enraged, destroy) or calm (i.e., zen, calm, relaxed, peaceful). The figures consisted of blue and red squares, identical to Experiment 2. Explicit aggression ratings for all words were collected in pilot Study 2 (see Appendix B in the supplementary materials of Part 1, Schietecat et al., 2018, for a full description of pilot Study 2). All stimuli were presented in random order on a computer screen, and participants were asked to indicate how aggressive/calm, active/passive, and positive/negative the words were on a 9-point scale (1 = calm, 9 = aggressive; 1 = passive, 9 = active; 1 = negative, 9 = positive). A manipulation check confirmed that aggression-related words were clearly judged as more aggressive (M = 7.71, SD = 1.12) than the calm-related words (M = 1.89, SD = 1.05), t(34) = 17.44, p < .001, Hedges' g = 5.51. In addition, aggressive words (M = 7.69, SD = .96) were evaluated as more active compared to calm words (M = 2.75, SD = 1.07), t(38) = -20.49, p < .001, Hedges' g = 4.84. Lastly, aggressive words (M = 7.26, SD = 1.13) were evaluated as more negative compared to calm words (M = 2.06, SD = 1.18), t(38) = 14.82, p < .001, Hedges' g = 4.45. The design and procedure were identical to Experiment 2. Participants completed the test at home.

Results
Erroneous responses (6%) and extremely fast and slow responses (6%) were excluded from the analysis. We calculated the mean response times in milliseconds for each participant for the two critical blocks. In line with our predictions, participants categorized aggressive words and red figures faster (M = 560 ms, SD = 50 ms) when red figures and aggressive words (and blue figures and calm words) were assigned to the same response key compared to when red figures and calm words (and blue figures and aggressive words) were assigned to the same response key (M = 659 ms, SD = 149 ms), 95% CI [49.65, 146.7], JZS BF10 = 111.61, t(32) = 4.12, p < .001, Hedges' g = 0.86. Thus, when people had to press the same key for red figures and aggressive words, and blue figures and calm words, they responded more quickly than when people pressed the same key for red figures and calm words, and blue figures and aggressive words.
Experiment 4 shows that red as compared to blue is associated with a highly active negative concept based on a salient activity dimension. To provide a final test of the dimension-specificity hypothesis, we conducted Experiment 5 to investigate whether red as compared to blue would be associated with a highly active positive concept (enthusiasm) as compared to a less active positive concept (relaxation).

Experiment 5: Red-blue versus enthusiastic-relaxed
In Experiment 5, we tested the associations between red and blue, and enthusiasm and relaxation related concepts. Enthusiasm and relaxation (presented in stimulus words and the attribute labels) were expected to activate the activity dimension, as they were likely to have the highest conceptual distance on the activity dimension as compared to the evaluation dimension (Russell & Mehrabian, 1977). As in previous IATs, we expected that the concept pair red-blue (presented in stimulus colors and target labels) would have the highest conceptual distance on the activity dimension. Therefore, the activity dimension would receive the highest weight, and thus would become salient. Enthusiastic and red would form the plus pole on the activity dimension, whereas relaxed and blue would form the minus pole. Hence, enthusiastic and red, and relaxed and blue should become associated. Note that where red was expected to be associated with negativity in the context of Experiment 1, it was expected to be associated with positive words (enthusiastic) in Experiment 5.

Participants
Thirty-eight participants who did not participate in other experiments reported here, all students at Eindhoven University of Technology, volunteered to participate for partial course credits.

Materials
The stimuli consisted of words and figures. Words were either enthusiastic (i.e., eager, lively, excited, thrilled) or relaxed (i.e., calm, peaceful, restful, quiet). The figures consisted of blue and red squares, identical to Experiment 2. Explicit evaluations for the words were collected post hoc. All stimuli were presented in random order on a computer screen, and evaluative ratings in terms of valence and activity were given on a 9-point scale (1 = negative, 9 = positive; 1 = passive, 9 = active). The mean valence ratings for relaxed (M = 7.09, SD = 1.44) and enthusiastic (M = 7.26, SD = 1.06) words did not differ significantly, t(34) = 0.57, p = .573, Hedges' g = 0.13. The mean activity ratings for enthusiastic words (M = 7.74, SD = .91) were significantly higher than the mean activity ratings for relaxed words (M = 3.15, SD = 1.67), t(34) = 11.70, p < .001, Hedges' g = 3.53. The design and procedure were identical to Experiment 2. Participants completed the test at home.

Results
Erroneous responses (6%) and extremely fast and slow responses (6%) were excluded from the analysis. We calculated the mean response times in milliseconds for each participant for the two critical blocks. As expected, enthusiastic words and red figures, and relaxed words and blue figures were categorized faster when pressing the same key (M = 656 ms, SD = 109 ms), compared to when the same key had to be pressed for enthusiastic words and blue figures, and relaxed words and red figures (M = 754 ms, SD = 169 ms), 95% CI [47.27, 149.7], JZS BF10 = 71.14, t(37) = 3.9, p < .001, Hedges' g = 0.68. Thus, when participants had to press the same key for red figures and enthusiastic words, and blue figures and relaxed words, they responded faster than when they pressed the same key for red figures and relaxed words, and blue figures and enthusiastic words. See Figure 3 for the mean reaction times for the critical blocks of the IATs of Experiments 4 and 5.

General Discussion
Research has shown clear context-dependency in color associations (e.g., Elliot, 2015). Our study confirms that red can be associated with negativity in one context, whereas in another context the same color can be associated with a highly active (and positive) concept. Most importantly, our data suggest that these contextual variations can be quite accurately predicted based on the dimension-specificity hypothesis (see also Part 1, Schietecat et al., 2018).
Our results support the idea that polar oppositions in experimental tasks can activate basic underlying dimensions, two of which can be valence and activity. The salient dimension within each of the IATs depended on the task-specific context. Red and green were consistently treated as polar opposites on the evaluation dimension. As a consequence, whenever the evaluation dimension was salient, we observed an association between red and negativity, and green and positivity. On the other hand, red and blue seemed to be polar opposites mostly on the activity dimension, and these two concepts were less strongly opposed on the evaluation dimension. Therefore, in IATs where the activity dimension was salient, red (compared to blue) was associated with active negative concepts such as aggression (Experiment 4), but also with active positive concepts such as enthusiasm (Experiment 5).
Building on the first series of experiments (see Part 1, Schietecat et al., 2018), and in line with findings by Osgood and Richards (1973), our results supported the dimension-specificity hypothesis. Cross-modal associations between concrete and affective abstract concepts seem most likely to emerge when they are structurally similar on a specific dimension (i.e., the evaluation or activity dimension). These results support the notion that stronger associations emerge when concepts are structured along relevant dimensions as compared to irrelevant dimensions (Kornblum & Lee, 1995).
With the present studies, we aimed to test the prediction that when a concept pair is not characterized by a meaningful opposition on the salient conceptual dimension, the polar opposites of that concept pair will not be attributed a plus and minus pole on the salient dimension. If so, no cross-modal association based on the parallel polarity principle should emerge. The absence of a strong association between red (as compared to blue) and valence supported this prediction (Experiments 1-3). Additional task-specific features might influence the attribution of plus and minus polarities, and therefore the strength of a cross-modal association in a cognitive task. For example, based on earlier research (Everaert, Spruyt, & De Houwer, 2011; Lakens et al., 2012), we might expect that having not two, but three colors (e.g., red, green, and blue) in a single task removes the presence of a clear bipolar opposition, and therefore the strength of cross-modal associations in such a task might be much smaller, or maybe even zero. Similarly, the presentation of bipolar stimuli in separate blocks in a cognitive task might reduce the strength of cross-modal associations (e.g., Gallace & Spence, 2006). Future research could investigate how task-specific features that influence the salience of polar opposition in a task impact the cross-modal associations that emerge.
To predict context-dependent cross-modal associations, we believe that a weighting process determines which of the activated dimensions becomes most salient in the task-specific context. In the current manuscript, we aimed for a conceptual verification of the proposed mechanism, but an important question is how the weighting process might be verified quantitatively. At the core of the weighting process lies the idea that when the distance between opposing concepts is large on a specific dimension, the probability that this dimension becomes salient within a task increases. Literature from the field of linguistics suggests that concepts that are strong opposites are likely to produce opposite responses in word association tasks (e.g., Paradis, Willners, & Jones, 2009). For example, most people mention white when they see the word black, whereas the word green does not elicit one strong opposite (Jenkins, 1970). This suggests that black versus white could form a stronger opposition than red versus green. Free association tasks might be one way to determine the strength of the conceptual distance between opposing concepts, and thus might be a first step to a more quantitative verification of the weighting process that determines which dimensions become most salient in a given context.
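One way to make the weighting idea concrete is sketched below. It is only an illustration under strong assumptions: the concept coordinates are invented for the example (they are not measured values), and the rule that turns per-dimension distances into weights by simple normalization is a hypothetical choice, since the text only states that larger distances make salience more probable. The names `dimension_weights` and `most_salient` are likewise introduced here.

```python
# Affective dimensions of meaning (Osgood et al., 1957).
DIMENSIONS = ("evaluation", "activity", "potency")


def dimension_weights(concept_a, concept_b):
    """Turn per-dimension distances between two opposed concepts into weights.

    Assumption: weights are proportional to the absolute distance on each
    dimension. This normalization is a hypothetical modelling choice.
    """
    distances = [abs(a - b) for a, b in zip(concept_a, concept_b)]
    total = sum(distances)
    if total == 0:
        # No opposition on any dimension: treat all dimensions as equal.
        return {dim: 1 / len(DIMENSIONS) for dim in DIMENSIONS}
    return {dim: d / total for dim, d in zip(DIMENSIONS, distances)}


def most_salient(concept_a, concept_b):
    """Return the dimension on which the two concepts differ most."""
    weights = dimension_weights(concept_a, concept_b)
    return max(weights, key=weights.get)


# Invented coordinates (evaluation, activity, potency) on an arbitrary scale.
good, bad = (2.5, 0.2, 0.3), (-2.5, -0.1, -0.2)
print(most_salient(good, bad))  # evaluation
```

In this toy example 'good' and 'bad' differ far more on evaluation than on activity or potency, so evaluation receives the largest weight, mirroring the verbal prediction. Empirical distances (for example, derived from ratings or free association data) would be needed before such a rule could be tested.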
Color stimuli vary on multiple perceptual dimensions (e.g., lightness, chroma, and hue; see for example Fairchild, 2005). Color researchers have emphasized the importance of controlling for irrelevant dimensions when developing color manipulations (e.g., Elliot & Maier, 2012). In our studies, the luminance levels of our stimuli were not matched. In addition, we conducted Experiments 1 and 3 in a laboratory environment, but Experiments 2, 4, and 5 were conducted in participants' homes with an internet-based method. Therefore, we could not be completely sure that the presentation of the stimuli on their personal computers was identical for every participant in those experiments. However, we expect that the impact of these variations on our results is not substantial. The labels of the IAT (i.e., red vs blue) increased the salience of the relevant hue dimension, and we expect our results to hold not just for very specific hues, but for colors that are broadly categorized as red, blue, and green. The similar associative patterns across Experiments 2 and 3 seem to support this expectation.
Error bars represent standard errors of the mean.
Our studies suggest that the evaluation and activity dimensions might underlie cross-modal associations between red, enthusiasm, and aggression, as measured with an IAT. However, alternative dimensions could also explain these associations. For example, Albertazzi et al. (2013) reported a compatibility effect between specific hues and shapes. In their studies, participants were presented with several shapes (e.g., cone, triangle, square), and were asked to choose the color from a Hue Circle that they thought was most closely related to the shape. Relevant to our paper, the results showed that participants related the color red to a square shape. Possibly, in our studies, polarities were attributed based on this compatibility effect, rather than the evaluation and activity dimensions. These attributions could explain associational patterns found in Experiments 4 and 5, where red squares were attributed plus polarity and blue squares were attributed minus polarity. However, based on these attributions we would also expect an association between red squares and positivity and blue squares and negativity, a result which is not supported by our data (see Experiments 1-3). Looking at the overall pattern across experiments, we believe that polarity attribution based on the evaluation or activity dimension explains our specific results (obtained with an IAT) better than a compatibility effect between hue and shape. However, future research could investigate the influence of shape on associations between color and emotion.
To conclude, our results suggest that context-dependent cross-modal associations between red and valence can be quite accurately predicted based on the dimension-specificity hypothesis. As opposed to blue, red is associated with excitement and aggression (Bennett & Rey, 1972; Young et al., 2013), but as opposed to green, red is associated with negativity (Gil & Le Bigot, 2015; Moller et al., 2009). Together with the studies reported in Part 1 (Schietecat et al., 2018), the current set of studies provides the first evidence that context-dependent associations between concrete and affective abstract concepts might be predicted by dimension-specific polarity attributions.

Data Accessibility Statement
All the stimuli, presentation materials, data, and analysis scripts are available from the Open Science Framework at https://osf.io/agvz3/.

Notes
1 In the present manuscript, we define concrete concepts as concepts represented as physical entities, whereas affective abstract concepts are "entities that are neither purely physical nor spatially constrained" (Barsalou & Wiemer-Hastings, 2005, p. 129). More specifically, as in Part 1 (Schietecat et al., 2018), we define concrete concepts as features from (visual) sensory modalities (e.g., a red versus blue colored patch), and abstract concepts as affective words (e.g., aggression versus calmness).