More of what? Dissociating effects of conceptual and numeric mappings on interpreting colormap data visualizations

Soto, Alexis; Schoenlein, Melissa A.; Schloss, Karen B.

doi:10.1186/s41235-023-00482-1

Original article
Open access
Published: 19 June 2023

Alexis Soto^1,3,
Melissa A. Schoenlein^2,3 &
Karen B. Schloss ORCID: orcid.org/0000-0003-4833-4117^2,3

Cognitive Research: Principles and Implications volume 8, Article number: 38 (2023)

Abstract

In visual communication, people glean insights about patterns of data by observing visual representations of datasets. Colormap data visualizations (“colormaps”) show patterns in datasets by mapping variations in color to variations in magnitude. When people interpret colormaps, they have expectations about how colors map to magnitude, and they are better at interpreting visualizations that align with those expectations. For example, they infer that darker colors map to larger quantities (dark-is-more bias) and colors that are higher on vertically oriented legends map to larger quantities (high-is-more bias). In previous studies, the notion of quantity was straightforward because more of the concept represented (conceptual magnitude) corresponded to larger numeric values (numeric magnitude). However, conceptual and numeric magnitude can conflict, such as using rank order to quantify health—smaller numbers correspond to greater health. Under conflicts, are inferred mappings formed based on the numeric level, the conceptual level, or a combination of both? We addressed this question across five experiments, spanning data domains: alien animals, antibiotic discovery, and public health. Across experiments, the high-is-more bias operated at the conceptual level: colormaps were easier to interpret when larger conceptual magnitude was represented higher on the legend, regardless of numeric magnitude. The dark-is-more bias tended to operate at the conceptual level, but numeric magnitude could interfere, or even dominate, if conceptual magnitude was less salient. These results elucidate factors influencing meanings inferred from visual features and emphasize the need to consider data meaning, not just numbers, when designing visualizations aimed to facilitate visual communication.

Significance

Visual communication is fundamental to sharing of information across sectors, spanning academia, politics, business, and general public discourse. People use various kinds of information visualizations to communicate about data, but colormap data visualizations (“colormaps”) are especially useful for showing how patterns of data unfold over space. In colormaps, gradations of color correspond to gradations of quantity over space. Common examples include maps of weather patterns across a city, election outcomes across a country, and disease prevalence across the globe. When people interpret colormaps, they have expectations about how colors will map to quantities, and it is harder to interpret colormaps that violate those expectations. Several biases contribute to inferences about the meanings of colors in colormaps. For example, the dark-is-more bias leads to the inference that darker colors map to larger quantities, and the high-is-more bias leads to the inference that colors represented higher on a vertically oriented legend map to larger quantities. Here, we investigated cases in which these biases make opposing predictions, depending on the degree to which they operate at the level of conceptual magnitude represented in the colormap (e.g., healthiness), or the level of numeric magnitude used to measure that concept (e.g., rank order). We found conceptual magnitude consistently dominated for the high-is-more bias and tended to dominate for the dark-is-more bias unless the concept was less salient. Thus, efforts to create visualizations for effective communication cannot merely rely on software defaults for mapping numbers to colors; it is necessary to consider the meaning of the data.

Introduction

When people communicate about data, they leverage perceptual representations to help make sense of patterns in datasets. Such perceptual representations include data visualizations, such as diagrams, charts, and maps (see Franconeri et al., 2021 for a review), data sonifications (audition) (Dingler et al., 2008; Mynatt, 1994), tactilizations (touch) (Jones, 2011), and even olfactations (smell) (Batch et al., 2020; Patnaik et al., 2018). In all of these cases, designers encode aspects of data (e.g., quantities or categories) using perceptual features (e.g., color, position, size, frequency, texture, or odor). Observers are then faced with the task of determining what those perceptual features mean in the context of the particular perceptual representation.

Colormap data visualizations are one common type of perceptual representation, which are used to display a wide variety of data types, such as weather patterns across different geographical regions, correlations in neural activity across different brain regions, and the spread of disease across the globe. In colormap data visualizations, variations of color are used to represent variations in magnitude within a dataset. When observers interpret colormaps, they have expectations about how colors should map to magnitude (Cuff, 1973; McGranaghan, 1989; Schloss et al., 2019; Schoenlein et al., 2023; Sibrel et al., 2020), known as their inferred mappings. Interpreting colormaps, and information visualizations more broadly, is easier when visualization design matches people’s inferred mappings (Hegarty, 2011; Lin et al., 2013; Mukherjee et al., 2022; Norman, 2013; Schloss et al., 2018, 2019, 2021; Schoenlein et al., 2023; Sibrel et al., 2020; Tversky, 2011). Thus, understanding the nature of people’s inferred mappings is fundamental to understanding how to design data representations that support effective communication.

Multiple factors influence inferred mappings for colormap data visualizations, including relational and direct associations. Relational associations are correspondences between relational properties of visual features (e.g., darkness, opacity, and spatial height) and relational properties of concepts (e.g., more or less of a concept). For example, the dark-is-more bias leads people to infer that darker colors map to larger quantities (Cuff, 1973; McGranaghan, 1989; Schloss et al., 2019; Schoenlein et al., 2023; Sibrel et al., 2020). Similarly, the high-is-more bias leads people to infer that colors represented higher in a vertically oriented legend map to larger quantities (Schloss et al., 2019; Sibrel et al., 2020).^{Footnote 1} This bias is consistent with the general notion that positions higher in a picture plane correspond to larger quantities (Hegarty, 2011; Tversky, 2011; Tversky et al., 1991). This phenomenon extends to gestures, as TV broadcasters tend to raise their hands vertically when they reference higher quantities (Winter et al., 2013). Other known relational associations for colormaps include the opaque-is-more bias (Schloss et al., 2019) and the hotspot-is-more bias (Schott, 2010; Sibrel et al., 2020). Direct associations are the degree to which a given concept is associated with a particular color (e.g., sunshine is strongly associated with light yellows but not with dark grays, whereas shade is strongly associated with dark grays but not light yellows). Direct associations can lead observers to infer that colors more associated with the concept (e.g., more sunshine, more shade) map to larger quantities, especially if those associations are particularly strong (Schoenlein et al., 2023).

When multiple factors are activated, they combine to produce people’s inferred mappings for a particular visualization (Schloss et al., 2019; Schoenlein et al., 2023; Sibrel et al., 2020). Sometimes these factors work together, but sometimes they conflict and can cancel each other out. For example, when colormaps have a hotspot (concentric regions that form a sort of bull’s eye) and that hotspot is dark, the dark-is-more bias and hotspot-is-more bias work together, leading people to infer that darker regions map to more. But, when the hotspot is light, these two biases conflict. Depending on the salience of the hotspot, the dark-is-more bias can dominate, the two biases can cancel out, or the hotspot-is-more bias can dominate (Sibrel et al., 2020). Schoenlein et al. (2023) laid the groundwork for a method that can predict people’s inferred mappings from a weighted sum over multiple, sometimes competing factors.

However, when evaluating mappings between visual features and magnitude in colormap visualizations, there are two types of magnitude to consider, and previously it was unknown which type(s) of magnitude influence inferred mappings. The first type is conceptual magnitude, which is the amount of the construct represented in the visualization. The second type is numeric magnitude, which is the quantitative value measured when operationalizing the construct. This distinction is shown in Fig. 1, using data from the World Health Organization (Ortiz-Ospina & Beltekian, 2018; Ortiz-Ospina & Roser, 2017; Simoes & Hidalgo, 2011). In Fig. 1A, the construct is health coverage, which is operationalized with a health index. Here, conceptual and numeric magnitude are congruent because greater health coverage corresponds to larger health index values. In Fig. 1B (left), the construct is economic complexity, which is operationalized using a complexity ranking. Here, conceptual and numeric magnitudes are incongruent because more complexity corresponds to smaller numbers in the rank order.

In previous studies on inferred mappings for colormaps, conceptual and numeric magnitude were congruent (or the concept was too vague to determine). When they were congruent, it was implied that “more” of the concept corresponded to “more” of the numeric magnitude. For example, in Cuff (1973), colormaps represented temperature, such that increased temperature corresponded to larger degrees. In Schloss et al. (2019) and Sibrel et al. (2020), colormaps represented alien animal sightings on a fictitious planet, such that increased sightings corresponded to larger counts of animals. In McGranaghan (1989), colormaps represented unspecified data, so congruency was too vague to determine.

Yet, as shown in Fig. 1B, cases arise in the real world in which conceptual and numeric magnitude conflict. In Fig. 1B (left), darker colors map to larger conceptual magnitude (greater economic complexity) and smaller numeric magnitude (rank position), whereas in Fig. 1B (right), darker colors map to smaller conceptual magnitude and larger numeric magnitude. Under such conflicts, are inferred mappings influenced by the conceptual level, the numeric level, or a combination of both levels? To address this question, we studied the dark-is-more bias and high-is-more bias under conditions when the conceptual and numeric level were congruent or incongruent.

Study overview

All experiments in this study followed the experimental paradigm established in Schloss et al. (2019). Participants were presented with colormaps along with a legend (Fig. 2A, left). The legend specified the encoded mapping (i.e., the correspondence between visual features and magnitude in the colormap). The lightness encoded mapping varied such that larger magnitude in the data corresponded to darker colors (dark-more; D+) or lighter colors (light-more; L+). The height encoded mapping also varied such that colors that were higher on a vertically oriented legend mapped to larger quantities (high-more; Hi+) or colors that were lower on the legend mapped to larger quantities (low-more; Lo+). Participants were asked to look at the colormap and legend and to indicate which side of the map had more (or less) of a target concept. We assessed response times (RT) to correctly interpret the legend. It is established that observers are faster at interpreting colormaps when the encoded mapping more closely matches their inferred mappings, so we can learn about inferred mappings by determining which kinds of encoded mappings enable faster RTs (Schloss et al., 2019; Sibrel et al., 2020).

In previous studies using this paradigm, the legend was labeled to indicate which endpoint represented “greater” and which endpoint represented “fewer” (conceptual magnitude) but there were no numbers on the legend (numeric magnitude) (Schloss et al., 2019; Sibrel et al., 2020). Here, we included numbers on the legend so we could vary the congruency of the conceptual and numeric magnitude. We will first describe our general approach in terms of the conditions in Experiment 1, and then describe how Experiments 2–5 adapted this procedure.

In Experiment 1, participants were presented with colormaps depicting fictitious data about the amount of time alien animals took to notice a scientist observing them in different regions of a planet (adapted from Schloss et al., 2019; Sibrel et al., 2020). All participants were told that time was measured in terms of seconds, and all participants saw legends that were labeled from 1 sec. to 9 sec. Congruency varied between subjects. For the congruent group, the instructions described time in terms of duration, and the legends were labeled with 1 sec. as “shorter” and 9 sec. as “longer” (Fig. 2A). Thus, greater duration corresponded to larger numeric values. The target concept was “longer,” such that participants were asked to look at the map and decide whether the time it took the animals to notice they were being observed was longer on the left or right side of the observation site. The encoded mapping at the conceptual level always matched the encoded mapping at the numeric level (e.g., both D+ or both L+ for lightness encoded mapping and both Hi+ or both Lo+ for height encoded mapping).

For the incongruent group, the instructions described time in terms of speed, and the legends were labeled with 1 sec. as “faster” and 9 sec. as “slower.” Thus, more speed corresponded to smaller numeric values of time. The target concept was “faster,” such that participants were asked to look at the map and decide whether the time it took the animals to notice they were being observed was faster on the left or right side of the observation site. The encoded mapping at the conceptual level was always mismatched with the encoded mapping at the numeric level (e.g., one was D+ and the other was L+ for lightness encoded mapping and one was Hi+ and the other was Lo+ for height encoded mapping). We acknowledge that speed entails time relative to distance and we only indicate speed in terms of time, but we chose this condition because response time is often described in terms of speed (faster/slower) in psychological studies.

Figure 3 shows potential patterns of results depending on whether the dark-is-more bias operates at the conceptual level (left), numeric level (middle), or an equal combination of both (right). For the congruent condition (conceptual and numeric levels both have dark-more (D+) encoding or light-more (L+) encoding), both levels can work together.^{Footnote 2} Thus, we expect RTs in the congruent condition will be faster for dark-more encoding than light-more encoding (extending Schloss et al., 2019; Sibrel et al., 2020). For the incongruent condition, the pattern of results will depend on the relative strength of the conceptual and numeric levels. If the conceptual level dominates inferred mappings, RTs will be faster when the conceptual level is encoded as dark-more, even though the numeric level is encoded as light-more. If the numeric level dominates, RTs will be faster when the numeric level is coded as dark-more, even though the conceptual level is encoded as light-more. If both levels play a role, then they may cancel out as shown in Fig. 3 (right), such that RTs will be similar for both conditions. Figure 3 is shown with respect to the dark-is-more bias, but the same patterns apply to the high-is-more bias if dark-more (D+) is replaced with high-more (Hi+) and light-more (L+) is replaced with low-more (Lo+).
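The three scenarios in Fig. 3 can be read as a weighted combination of the two levels. The sketch below is only an illustration of that reading and is not the authors' model: the function name, the weights `w_conceptual` and `w_numeric`, and the +1/−1 coding of the encodings are all assumptions introduced here.

```python
def dark_is_more_support(concept_encoding, congruent, w_conceptual, w_numeric):
    """Signed support that a trial's encoded mapping receives from the dark-is-more bias.

    concept_encoding: "D+" (darker = more of the concept) or "L+" (lighter = more).
    congruent: True if the numeric level has the same encoding as the conceptual level.
    w_conceptual, w_numeric: hypothetical strengths of each level (not fitted values).

    Positive values predict faster RTs, negative values slower RTs,
    and zero predicts no difference between D+ and L+ concept encoding.
    """
    concept = 1 if concept_encoding == "D+" else -1
    # In the incongruent condition the numeric level has the opposite encoding.
    numeric = concept if congruent else -concept
    return w_conceptual * concept + w_numeric * numeric


# Incongruent condition, dark-more concept encoding, under the three scenarios:
conceptual_dominates = dark_is_more_support("D+", False, 1.0, 0.0)  # positive
numeric_dominates = dark_is_more_support("D+", False, 0.0, 1.0)     # negative
equal_combination = dark_is_more_support("D+", False, 1.0, 1.0)     # zero
```

With equal weights the two levels cancel in the incongruent condition but add in the congruent condition, which corresponds to the right panel of Fig. 3.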

Experiments 2–5 were variations of Experiment 1 to test the generalizability of the results (Table 1). In Experiment 2, the displays were the same as Experiment 1, but the instructions were different. Instead of being asked about the more endpoint of the conceptual dimension (“longer” for duration, “faster” for speed), participants were asked about the less endpoint (“shorter” for duration, “slower” for speed). In Experiment 3, the instructions were the same as Experiment 1, but the conceptual magnitude was made less salient by omitting conceptual magnitude labels from the legend and only showing numeric magnitude. In Experiment 4, the data domain was changed from alien animals to antibiotic discovery. Participants were told that the colormaps represented data about the amount of time it took microbes to eliminate pathogens from a Petri dish. This scenario is based on real research in the Tiny Earth Project, which aims to discover new antibiotics to address the decline in effective antibiotics (Hurley et al., 2021). The numeric magnitude unit was changed from seconds to hours, but otherwise the experimental displays were the same as Experiment 1. Finally, in Experiment 5, the data domain was changed to public health. Participants were told that the colormaps represented health data in different counties, similar to the County Health Rankings report, an annual report of the physical and mental well-being of communities throughout states in the US (County Health Rankings & Roadmaps, 2022). The target concept was healthiness for both the congruent and incongruent conditions, but in the congruent condition, healthiness was quantified as an index (larger numbers indicated healthier), and in the incongruent condition, healthiness was quantified as a rank order (smaller numbers indicated healthier).

Table 1 Overview of Experiments 1–5

While creating the stimuli, we tried to avoid factors that would influence participant responses, beyond those we aimed to test. We used fictitious data in abstract scenarios to avoid cases in which participants have domain knowledge that could influence their responses. We tested domains in which participants were unlikely to have strong direct associations that would override the dark-is-more bias (alien animals, antibiotics, public health). We created colormaps that do not have strong perceptual evidence for opacity variation (see Schloss et al., 2019) and they were presented on a light background, so we focused on the dark-is-more bias and high-is-more bias and did not consider the potential effects of the opaque-is-more bias. The colormaps also did not have spatial cues in the data, such as hotspots often found in weather and neuroimaging data, so we did not consider potential effects of the hotspot-is-more bias (Sibrel et al., 2020).

The stimuli, data, and analysis code for all experiments can be found at https://osf.io/kpqjh.

Experiment 1

Experiment 1 assessed the degree to which inferred mappings operated at the conceptual or numeric level for colormaps representing sightings by alien animals. In the congruent condition, participants were told that the data represented duration, and participants judged whether the time was longer on the left/right side of the map. In the incongruent condition, participants were told that the data represented speed, and they judged whether the time was faster on the left/right side of the map.

Methods

Participants

We aimed for 60 participants (30 per group), based on the sample sizes in Schloss et al. (2019) and Sibrel et al. (2020). We collected data in batches (n = 85 total) until reaching at least 30 participants per group after excluding participants for atypical color vision (n = 8) and for accuracy less than 90% (n = 16; exclusion criteria set following Schloss et al. (2019) to ensure there were sufficient accurate trials to assess effects on response time). Color vision was assessed by asking participants: “Do you have difficulty seeing colors or noticing differences between colors compared to the average person?” and “Do you consider yourself to be colorblind?” The final sample was 61 participants (32 women, 29 men, mean age = 18.46; age and gender reported through open-response text fields in all experiments). All participants in this and all subsequent experiments participated online for extra credit in their Introductory Psychology course at the University of Wisconsin–Madison. Each experiment tested a different set of participants, all of whom were from an Introductory Psychology course within a single academic year. All gave informed consent, and the University of Wisconsin–Madison IRB approved the protocol.

Design and displays

As shown in Fig. 2A and B (left), the display for each trial contained a colormap visualization and a legend (stimuli adapted from Schloss et al., 2019). The colormap visualization (referred to as “colormap” for short) was an 8 \(\times\) 8 grid (4.8 cm \(\times\) 4.8 cm) placed in the center of the screen. These dimensions pertain to a 7 in. \(\times\) 11.25 in. monitor (2560 \(\times\) 1600 resolution) but can vary depending on the monitor size of individual participants. Each cell in the grid represented fictitious data about the amount of time it took for alien animals to notice they were being observed by a scientist. To help participants categorize the data coming from the left and right sides of the colormap, the left four columns were labeled “left side” and the right four columns were labeled “right side.”

The legend included a color scale (3.5 cm tall \(\times\) 0.5 cm wide; also known as a color ramp, as in Smart et al., 2019) displayed 1 cm to the right of the colormap. To the right of the color scale were numeric time labels: “1 sec.,” “3,” “5,” “7,” and “9 sec.” For the congruent group, the concept label “longer” (more of the concept) was next to “9 sec.” and “shorter” (less of the concept) was next to “1 sec.” (Fig. 2A). For the incongruent group, the concept label “faster” (more of the concept) was next to the numeric label “1 sec.,” and “slower” (less of the concept) was next to “9 sec.” (Fig. 2B). The colormap and legend were positioned on a white rectangle (13 cm \(\times\) 8 cm) centered on a medium gray background (RGB = [128, 128, 128]). The colormaps were generated using two possible color scales (Fig. 4): Hot (from MATLAB) and ColorBrewer Blue (“Blue” for short) (Harrower & Brewer, 2003).

Each colormap was constructed using a different underlying dataset to help ensure that the results were not due to particular spatial patterns of squares within any one colormap. The datasets were created by sampling eight points along an arctangent curve with added noise sampled from a normal distribution (see Schloss et al., 2019 for details). This approach resulted in colormaps in which one side was biased to be light and the other side was biased to be dark. Within each color scale participants judged 40 colormaps (treated as repetitions), 20 colormaps with the darker side on the left and 20 with the darker side on the right. Thus, there were 80 unique colormap stimuli: 2 color scales (Hot, Blue) \(\times\) 2 darker sides (left/right) \(\times\) 20 underlying datasets.
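As a rough illustration of this kind of dataset, the sketch below samples eight points along an arctangent curve for each row of an 8 × 8 grid and adds normally distributed noise. The exact procedure is given in Schloss et al. (2019) and is not reproduced here: the evenly spaced sampling positions, the curve range `x_max`, the noise standard deviation `noise_sd`, the per-row resampling, and the convention that larger values are drawn darker are all assumptions made for this example.

```python
import math
import random


def make_dataset(darker_side="left", n=8, x_max=2.0, noise_sd=0.3, seed=None):
    """Return an n x n grid of values in which one side tends to be larger.

    Larger values are assumed to be rendered as darker colors.
    All parameter defaults are illustrative, not the values used in the study.
    """
    rng = random.Random(seed)
    # n evenly spaced positions along the curve, from -x_max to +x_max.
    xs = [-x_max + 2 * x_max * i / (n - 1) for i in range(n)]
    grid = []
    for _ in range(n):
        # atan rises from left to right, so the right side is larger by default.
        row = [math.atan(x) + rng.gauss(0.0, noise_sd) for x in xs]
        if darker_side == "left":
            row.reverse()
        grid.append(row)
    return grid


grid = make_dataset(darker_side="left", seed=1)
```

Because the noise is added independently to each cell, individual cells on the lighter side can still be darker than cells on the darker side; only the overall left/right bias is controlled.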

Participants saw each of these 80 colormap conditions four times, corresponding to four different legend conditions: 2 lightness encodings at the conceptual level (dark-more, light-more) \(\times\) 2 height encoded mappings at the conceptual level (high-more, low-more) (Fig. 2). In the congruent condition, the numeric encoded mapping matched the conceptual encoded mapping (Fig. 2A), and in the incongruent condition, the numeric encoded mapping was opposite of the conceptual encoded mapping (Fig. 2B). This design also ensured that the orientation of the color scale in the legend was balanced over trials, such that the darker end was higher on half of the trials and lower on the other half.

In total, each participant completed 320 experimental trials, including these 4 legend conditions [2 lightness encoded mappings (D+, L+) \(\times\) 2 height encoded mappings (Hi+, Lo+)] presented with each of the 80 colormap stimuli. Congruency varied between subjects, with random assignment to the congruent group (n = 30) or the incongruent group (n = 31).
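The trial counts above can be checked by enumerating the design. This is a minimal sketch with hypothetical field names (`concept_lightness`, `numeric_height`, and so on); it is not taken from the authors' experiment code.

```python
from itertools import product

COLOR_SCALES = ("Hot", "Blue")
DARKER_SIDES = ("left", "right")
DATASETS = range(20)
LIGHTNESS = ("D+", "L+")   # encoded mapping at the conceptual level
HEIGHT = ("Hi+", "Lo+")    # encoded mapping at the conceptual level
OPPOSITE = {"D+": "L+", "L+": "D+", "Hi+": "Lo+", "Lo+": "Hi+"}


def build_trials(congruent):
    """List every experimental trial for one participant in the given group."""
    trials = []
    for scale, side, dataset, light, height in product(
        COLOR_SCALES, DARKER_SIDES, DATASETS, LIGHTNESS, HEIGHT
    ):
        trials.append({
            "scale": scale,
            "darker_side": side,
            "dataset": dataset,
            "concept_lightness": light,
            "concept_height": height,
            # The numeric level matches the conceptual level only when congruent.
            "numeric_lightness": light if congruent else OPPOSITE[light],
            "numeric_height": height if congruent else OPPOSITE[height],
        })
    return trials


def dark_end_is_high(trial):
    """True if the dark end of the color scale sits at the top of the legend."""
    return (trial["concept_lightness"] == "D+") == (trial["concept_height"] == "Hi+")


trials = build_trials(congruent=False)
n_trials = len(trials)                                    # 320
n_dark_high = sum(dark_end_is_high(t) for t in trials)    # 160
```

Crossing the two conceptual encodings is what balances the legend orientation: the dark end is at the top for D+ with Hi+ and for L+ with Lo+, which together make up half of the trials.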

Procedure

All participants were instructed that they would see colormaps that represent data collected by a scientist on a distant planet, Sparl. They were told that the data were about alien animals from different regions of observation sites and how much time the animals took to notice the scientist. In the congruent group, participants were told that in some observation sites, the time it took was LONGER on the left side of the site, and in other sites, the time was LONGER on the right side. Their task was to look at the colormap and legend, and indicate if the time it took the animals to notice they were being observed was LONGER on the left or right side of the observation site. In the incongruent group, the word “LONGER” was replaced with the word “FASTER”, but otherwise the instructions were the same. At the bottom of the screen, four example colormaps were displayed so participants could see the types of stimuli they would be asked to judge. The full set of instructions can be found in Additional file 1.

After reading the instructions, participants completed 20 practice trials, randomly sampled from all possible trials. They then began the 320 experiment trials. Each trial started with a 500-ms blank gray screen with a black fixation cross in the center. Next, the experimental display appeared and remained on the screen until the participant responded (pressing the left/right arrow key). If they responded correctly, the next trial appeared after a 500-ms delay. If they responded incorrectly, black text that said “WRONG” appeared 2 cm to the right of the legend for 500 ms, followed by a blank 500-ms screen, and then the next trial began. Participants were notified when they completed 25%, 50%, and 75% of the trials, and were told their accuracy after every 20 trials. At the end of the experiment, participants were presented two color vision questions as described above, followed by a debriefing message to inform them of the purpose of the experiment.

Results and discussion

To prepare the response time (RT) data for analysis, we excluded incorrect trials, and then pruned trials for each participant if RTs were more than ±2 standard deviations from their mean over all trials. Next, for each participant, we calculated the mean RT over the remaining trials within each of the four legend conditions for each color scale (also averaging over the left/right balance of which side of the colormaps was darker).
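The preparation steps can be sketched for a single participant as follows. This is an illustration rather than the authors' analysis code (which is available at the OSF link above). It assumes that the mean and standard deviation are computed over the participant's correct trials, that the sample standard deviation is used, and that each trial is a dictionary with the hypothetical keys `rt`, `correct`, and `condition`.

```python
from collections import defaultdict
from statistics import mean, stdev


def condition_means(trials, n_sd=2.0):
    """Mean RT per condition for ONE participant after exclusion and pruning."""
    # Step 1: drop incorrect trials.
    correct = [t for t in trials if t["correct"]]
    # Step 2: prune RTs more than n_sd standard deviations from the participant's mean.
    rts = [t["rt"] for t in correct]
    m, sd = mean(rts), stdev(rts)
    kept = [t for t in correct if abs(t["rt"] - m) <= n_sd * sd]
    # Step 3: average the remaining RTs within each condition.
    by_condition = defaultdict(list)
    for t in kept:
        by_condition[t["condition"]].append(t["rt"])
    return {cond: mean(values) for cond, values in by_condition.items()}


example = (
    [{"rt": rt, "correct": True, "condition": "D+/Hi+"} for rt in (500.0, 510.0, 490.0, 505.0)]
    + [{"rt": 3000.0, "correct": True, "condition": "D+/Hi+"}]   # slow outlier, pruned
    + [{"rt": rt, "correct": True, "condition": "L+/Lo+"} for rt in (495.0, 500.0, 502.0, 498.0)]
    + [{"rt": 100.0, "correct": False, "condition": "L+/Lo+"}]   # error trial, excluded
)
means = condition_means(example)
```

In the study, a condition here would be one of the four legend conditions within one color scale, giving eight means per participant.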

We analyzed the full dataset using a mixed-design ANOVA: 2 congruency groups (congruent vs. incongruent, between-subjects) \(\times\) 2 lightness encoded mappings (dark-more concept vs. light-more concept; within-subjects) \(\times\) 2 height encoded mappings (high-more concept vs. low-more concept; within-subjects) \(\times\) 2 color scales (hot vs. blue, within-subjects). Note: for analysis purposes we coded encoded mappings in reference to the concept, rather than the numeric magnitude, but this was an arbitrary decision. Additional file 1: Table S1 shows the full output of the analysis, and Additional file 1: Figure S1 shows the data plotted according to lightness encoded mapping and height encoded mapping for each color scale.^{Footnote 3} Here, we examine the effects of congruency on the dark-is-more bias and high-is-more bias separately because there was no 3-way interaction between congruency, lightness encoded mapping, and height encoded mapping (F(1,59) = 1.58, p = 0.214, \({\eta }_{p}^{2}\) = 0.026).

Dark-is-more bias

Figure 5A (left) shows mean RTs plotted for trials with dark-more concept encoding and light-more concept encoding, separated by whether participants were in the congruent or incongruent group. These data are averaged over height encoded mapping and color scale. The pattern of results is similar to the pattern if the conceptual level dominated inferred mappings (Fig. 3, left).

Consistent with a dark-is-more bias operating at the conceptual level, a main effect of lightness encoded mapping indicated quicker RTs for dark-more concept encoding than light-more concept encoding (F(1,59) = 29.45, p < 0.001, \({\eta }_{p}^{2}\) = 0.333). However, lightness encoded mapping interacted with congruency (F(1,59) = 4.43, p = 0.040, \({\eta }_{p}^{2}\) = 0.070). As shown in Fig. 5A (left), the degree to which RTs were quicker for dark-more than light-more concept encoding was greater for the congruent group than the incongruent group. Given this interaction, we conducted paired-samples t tests to compare the effects of encoded mapping separately within each group. RTs were quicker for dark-more concept encoding in both the congruent group (t(29) = −5.22, p < 0.001, d_z = 0.953) and incongruent group (t(30) = −2.40, p = 0.023, d_z = 0.430). Taken together, these results suggest that the dark-is-more bias operates primarily at the conceptual level, but the numeric level does play a role. Conflict from the numeric level reduces the effect of the conceptual level, but the dark-is-more bias at the conceptual level still dominated inferred mappings.
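The reported effect sizes can be related to the test statistics through two standard identities: partial eta squared is F·df_effect / (F·df_effect + df_error), and d_z for a paired-samples t test is t divided by the square root of the number of pairs. The sketch below applies them to the values above. The function names are introduced here for illustration, and taking the absolute value of t to report d_z as a magnitude is an assumption about the reporting convention.

```python
import math


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def cohens_dz(t_value, n_pairs):
    """Magnitude of Cohen's d_z for a paired-samples t test with n_pairs participants."""
    return abs(t_value) / math.sqrt(n_pairs)


eta_main = partial_eta_squared(29.45, 1, 59)        # main effect of lightness encoding
eta_interaction = partial_eta_squared(4.43, 1, 59)  # lightness encoding x congruency
dz_congruent = cohens_dz(-5.22, 30)                 # t(29), so 30 participants
dz_incongruent = cohens_dz(-2.40, 31)               # t(30), so 31 participants
```

Rounded to three decimals, the first three values match the reported 0.333, 0.070, and 0.953. The last comes out near 0.431 rather than the reported 0.430, a difference that is consistent with the t value itself having been rounded to two decimals.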

High-is-more bias

Figure 5A (right) shows mean RTs plotted for trials with high-more concept encoding and low-more concept encoding, separated by whether participants were in the congruent group or incongruent group. These data are averaged over lightness encoded mapping and color scale. Consistent with a high-is-more bias operating at the conceptual level, a main effect of height encoded mapping indicated quicker RTs for high-more concept encoding than low-more concept encoding (F(1,59) = 65.06, p < 0.001, \({\eta }_{p}^{2}\) = 0.524). This effect did not interact with congruency (F < 1), which suggests there was no interference when the numeric level conflicted with the conceptual level (i.e., when larger conceptual magnitude and smaller numeric magnitude were positioned higher on the legend). Paired-samples t tests indicated that RTs were quicker for high-more encoding than low-more encoding for both the congruent group (t(29) = −5.64, p < 0.001, d_z = 1.029) and the incongruent group (t(30) = −5.78, p < 0.001, d_z = 1.037).

In summary, the results of Experiment 1 suggest that both the dark-is-more and high-is-more biases are dominated by the conceptual level. Under conflicts between conceptual and numeric levels, the numeric level plays a role for the dark-is-more bias, but not enough to eliminate or override the bias at the conceptual level. No such interference occurred for the high-is-more bias. These results suggest that when assigning perceptual features to quantities in colormap data visualizations, it is important to consider what the data mean, and not just numeric values, to facilitate interpretation.

Experiment 2

Experiment 2 assessed whether the results of Experiment 1 were robust to the framing of the task. Instead of asking participants about the “more” endpoint of the conceptual dimension (“longer” in the duration (congruent) group, “faster” in the speed (incongruent) group), we modified the instructions to ask about the “less” endpoint of the conceptual dimension (“shorter” in the duration (congruent) group, “slower” in the speed (incongruent) group). Otherwise, Experiment 2 was the same as Experiment 1.