When IPCC graphs can foster or bias understanding: evidence among decision-makers from governmental and non-governmental institutions

To develop effective climate change policy, decision-makers need to have the best possible understanding of the available climate science. The IPCC Assessment Reports therefore aim to lay the foundation for informed political decision-making by providing policy-relevant information. But how successful are IPCC reports at communicating key findings? Although IPCC reports display key information in graphs, the interpretation of such graphs has received little attention. Here we provide an empirical evaluation of IPCC graph comprehension among IPCC target audience (N = 110), (political) decision-makers from climate-related (non-)governmental organizations from 54 countries, and a comparative sample of German junior diplomats, representing future international decision-makers (N = 33). We assess comprehension of current climate change risk visualizations using two IPCC graphs, one that employs principles of intuitive design, and one that violates principles of intuitive design. Results showed that (i) while a minority of IPCC target audience misinterpreted the intuitive graph, (ii) the majority of participants systematically misinterpreted the counter-intuitive graph, drawing the opposite conclusion from what was meant to be conveyed by the graph, despite (iii) having high confidence in the accuracy of their interpretation. Since misinterpretation of IPCC graphs does not allow for optimal use of the scientific information for policy-making, the results emphasize the importance of IPCC graphs that follow the principles of intuitive design.


Introduction
The Intergovernmental Panel on Climate Change (IPCC) was founded by the World Meteorological Organisation and the United Nations Environment Programme to provide policymakers with regular scientific assessments on climate change and its risks. One of the main principles of the IPCC Assessment Reports is to lay the foundation for informed political decision-making by providing policy-relevant information. But how successful are IPCC reports at communicating key findings? Within IPCC reports, key findings and major results are often displayed in graphs. Despite the importance of these graphs, research examining how people interpret climate data visualizations is limited, and has largely focused on either objective or subjective comprehension and preferences for different designs (McMahon et al 2015, 2016, Taylor et al 2015). However, it is largely unknown how well objective and subjective understanding align: Do viewers have insight into the degree of their understanding of IPCC graphs? Can they correctly indicate which graphs they do, and do not, understand, and does that vary with the design of the graph? There is substantial evidence that graphs that are not well-designed often lead to systematic misunderstandings and even suboptimal decisions (e.g. Okan et al 2012, 2016), while difficulties in understanding can be significantly reduced by depicting risk using simple, well-designed, graphical displays (e.g. Garcia-Retamero and Cokely 2013). Here we examined objective and subjective understanding of IPCC graphs in a sample of the IPCC target audience: political decision-makers from climate-related governmental organizations and staff from climate-related non-governmental organizations.
Graphs can have important cognitive and motivational advantages, such as reducing the cognitive effort required to understand risks or information processing time (Smerecnik et al 2010, Brewer et al 2012). Graphs can also reveal data patterns that may otherwise go undetected and evoke automatic mathematical operations (Lipkus 2007). However, the effectiveness of graphs to improve understanding depends crucially on how they are designed. An important design principle to support understanding is that spatial and conventional features in graphs should convey the same meaning (Okan et al 2012, 2016). Spatial features include bars of different heights or lines following a trend, whereas conventional features are graph elements linked to arbitrary graph conventions, such as green denoting gains, and red denoting losses. When spatial features are incongruent with conventional features (e.g. higher bars represent lower values), people tend to misinterpret graphs as they often fail to detect such incongruencies. Instead, people often rely on 'spatial-to-conceptual' mappings such as 'high equals more' or 'steeper equals faster' to interpret graphs, particularly if they have low graph literacy, that is, a low ability to understand graphically presented information (Okan et al 2012, 2016). Another important principle for effective design is to keep graphs as simple as possible, ideally focusing on one main message per graph (Garcia-Retamero and Cokely 2017, Kause et al 2020), or only on information that is essential for the intended purpose (Kosslyn 2006, Harold et al 2019). Graph complexity may also hinder viewers' insight into their objective comprehension (Fischer et al 2018). However, whether this varies with graph congruency is still unclear.
Here we examined people's interpretations of a more complex IPCC graph containing a conflict between spatial and conventional features (where higher values in the graph indicated lower levels of climate change impact), and a simpler one without such a conflict. These two graphs will be referred to as 'counter-intuitive' and 'intuitive', respectively.
While objective comprehension of the information conveyed in the IPCC reports clearly represents a necessary condition for it to be implemented in actual policy, psychology research also shows that subjective comprehension (individual confidence in one's comprehension) can be decision-relevant (Jackson and Kleitman 2014). In fact, people typically have to rely on subjective comprehension to estimate their objective comprehension since subjective comprehension can be assessed internally, whereas objective comprehension can only be assessed externally (e.g. through objective tests, or feedback). Detrimental political consequences can arise from differences in interpretations of complex information between decision-makers (van den Broek 2018), but also from unwarranted confidence (Robinson and Marino 2015). For example, public servants whose subjective comprehension of climate change was higher than their objective comprehension tended to endorse more risky policy choices, above and beyond attitudes and political views (Liu et al 2017). Furthermore, among incumbent members of national parliament, subjective overconfidence in the likelihood of their re-election (but not objective likelihood) predicted risk taking (Sheffer and Loewen 2019). Subjective confidence has also been found to be particularly ill-calibrated for politicized science such as climate change, compared to non-politicized science such as biology and physics (Fischer et al 2019). On the other hand, higher confidence in climate change knowledge predicts higher climate change beliefs (estimated riskiness and anthropogenicity), above and beyond knowledge (Fischer and Said 2020). Similarly, sufficiently high confidence in knowledge is required to put that knowledge into practice, also when controlling for differences in knowledge (Parker et al 2012). For example, healthcare workers with higher vaccination confidence (estimated benefits of vaccination) were more likely to recommend vaccination (Karlsson et al 2019).
Here, we evaluate objective and subjective IPCC graph comprehension among the main IPCC target audience, political decision-makers from climate-related governmental and non-governmental organizations. We also recruited future decision-makers, junior diplomats from the German Federal Foreign Office, as a comparison group of 'the next generation' of national representatives. Graphs on human health risks (figures 1 and 2) were selected from the Health chapter of the latest IPCC report (Smith et al 2014) because human health risks are among the most severe risks caused by climate change (Watts et al 2018). We used multiple-choice questions to assess objective graph comprehension, and asked participants to indicate their confidence in their answers to assess subjective comprehension. This allowed us to investigate how well subjective and objective understanding align: Do decision-makers realize whether they do, or do not, understand IPCC graphs? Does such insight vary with the design of the graph? And is subjective understanding correlated with objective understanding such that participants with higher subjective comprehension are, in fact, more accurate? The correlation between subjective

Participants
This study included two samples: IPCC target audience and German junior diplomats. The IPCC target audience sample consisted of climate change decision-makers (N = 110): n = 67 expert political decision-makers from climate-related governmental organizations (24 public servants working for Environmental Protection Agencies, EPAs, and 43 governmental policymakers working for ministries as well as governmental and inter-governmental institutions), and n = 43 experts from climate-related non-governmental organizations from various sectors such as environmental protection, urban planning, energy business or economic development. The sample represented a total of 54 countries (see figure A1 in the appendix for all countries, and table 1 for other demographics).
The comparative sample of German junior diplomats consisted of future diplomats for the German Federal Foreign Office (Auswärtiges Amt) in Berlin, Germany (N = 33). All junior diplomats had German citizenship.

Recruitment
For the IPCC target audience, invitations to take part in the survey were sent via email. The contact information was extracted from the UNFCCC website, which lists the admitted parties and observer organizations of the United Nations Climate Change conference, the Conference of the Parties (COP). Contact details for EPA staff were extracted from the respective websites. In total, 1036 invitation emails were sent, of which 110

Prior and posterior beliefs
Participants indicated their belief in the anthropogenic nature of climate change ('How much is climate change caused by humans?'). They also indicated their beliefs about specific aspects related to each graph, both before and after seeing the graphs. Specifically, they rated (1) the impact of climate change on human health ('How much can climate change affect human health?') and (2) the changes in the number of heat extremes until 2050 ('Is the number of heat extremes going to increase globally until the year 2050?'). Beliefs were rated using a slider (0 = not at all, 100 = very much). We recorded beliefs before and after inspection of the graphs to assess the degree to which IPCC visuals influence decision-makers' relevant beliefs.

IPCC graphs
The graphs were selected from chapter 11, Human health: impacts, adaptation and co-benefits, from the recent IPCC report (Smith et al 2014). In the intuitive graph, higher values indicated higher levels of climate change impact; in the counter-intuitive graph, higher values indicated lower levels of climate change impact. Content-wise, the intuitive graph displayed the relative health impact from climate change and the potential for impact reduction for eight health-related sectors, in light of three different temperature projections. The counter-intuitive graph displayed a world map showing the urban population increase factor until 2050 and bar graphs indicating mid-21st century projections of the frequency of extreme daily temperatures for each region, with lower numbers indicating more frequent events.

Objective graph comprehension
We developed multiple-choice questions assessing comprehension of each graph. The comprehension questions were constructed such that (i) the main messages of the graphs were assessed; and (ii) prior knowledge on climate change or health was not required. Figure captions were displayed to ensure that displays were fully representative of the IPCC report.
The item corresponding to the intuitive graph assessed participants' comprehension of the potential risk reduction ('What does the graph tell you about the +4 °C warmer world?'). Participants indicated for which facet the potential for risk reduction was greatest according to the graph: (a) undernutrition, (b) extreme weather events OR mental health and violence OR occupational health, (c) vector-borne diseases, or (d) heat OR food- and water-borne infections. Option (a) was coded as correct (1) and all other responses were coded as incorrect (0). The item corresponding to the counter-intuitive graph assessed participants' comprehension of the extreme weather projections depicted ('Which of the following regions is expected to have the highest frequency in maximum daily temperatures that would have occurred only once in 20 years in the late 20th century according to scenario A2 (red bar)?'). The response options included (a) North Asia, (b) Central America/Mexico and (c) North Europe. Option (b) was coded as correct (1) and all other responses were coded as incorrect (0). We also inspected the distributions of incorrect responses, to examine whether participants relied on the mapping higher = more/lower = less. A reliance on this mapping would lead viewers to assume that higher values in the graph indicated more climate change impact, and incorrectly infer that (c) (North Europe) is the correct response.
For the sub-sample of EPA staff, we included an additional question with respect to the counter-intuitive graph to further assess whether participants made use of a higher = more/lower = less mapping. Participants were asked to indicate the region that is expected to have the lowest frequency in heat extremes, with the same response options as the previous item. We would expect that participants relying on the mapping would in this case infer that option (b) (Central America/Mexico) is the correct answer, due to the lowest values displayed for this region in the graph.

Subjective graph comprehension
After each objective comprehension question, participants were asked to judge the certainty with which they answered the comprehension question correctly ('How certain are you that your answer is correct?'), which participants rated on a four-point scale ('25%: just guessing' to '100%: completely certain'). This four-point scale was chosen such that the subjective comprehension scale aligns naturally with objective comprehension (e.g. if participants are only guessing, they have a 25% chance of being correct), and because full-range scales (0%-100% certainty) were shown to yield worse calibrated confidence judgments (Weber and Brewer 2003).

Satisfaction with the graphs
Satisfaction with the graphs was assessed by asking 'How satisfied are you with the graphs overall?' (0: not at all, 100: completely).

Procedure
The survey was conducted in English for all participants. The order of the questions in the survey is given in table 2. Immediately below the display of each graph, participants were given the objective comprehension question, followed by the subjective comprehension question while the graph was still displayed. That is, participants had the option to look at the graphs while answering both questions.

Results
We report Bayesian analyses alongside frequentist analyses, where appropriate. We report BF10 throughout, which expresses the probability of the data given H1 relative to H0. For Bayes factors, values > 1 indicate increasing strength of evidence for H1, and values < 1 indicate increasing evidence for H0. As a guideline, values of 1-3 (1-0.33) are seen as anecdotal, 3-10 (0.33-0.1) as substantial, and 10-30 (0.1-0.03) as strong evidence for H1 (H0; Jeffreys 1998). All McNemar's χ² tests are continuity corrected.
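To make the guideline concrete, the following minimal sketch maps a BF10 value onto the verbal labels quoted above. The function name `interpret_bf10` is hypothetical, and two choices are assumptions rather than part of the reported analysis: values exactly on a boundary (e.g. 3) are assigned to the stronger category, and values beyond 30 (or below 0.03) are only flagged as lying outside the ranges given here.

```python
def interpret_bf10(bf10: float) -> str:
    """Label a BF10 value using the guideline ranges quoted in the text."""
    if bf10 <= 0:
        raise ValueError("a Bayes factor must be positive")
    if bf10 == 1:
        return "no evidence either way"
    # Evidence for H0 mirrors evidence for H1 on the reciprocal scale,
    # e.g. BF10 = 0.25 is as strong for H0 as BF10 = 4 is for H1.
    favoured = "H1" if bf10 > 1 else "H0"
    strength = bf10 if bf10 > 1 else 1 / bf10
    if strength < 3:
        label = "anecdotal"
    elif strength < 10:
        label = "substantial"
    elif strength <= 30:
        label = "strong"
    else:
        # Assumption: the text gives no label beyond 30.
        label = "beyond the ranges quoted in the text"
    return f"{label} evidence for {favoured}"


if __name__ == "__main__":
    # Bayes factors of the kind reported in the Results section.
    for value in (3.24, 0.26, 0.5, 2.3):
        print(value, "->", interpret_bf10(value))
```

Working on the reciprocal scale keeps the thresholds for H0 and H1 symmetric, so only one set of cut-offs has to be maintained.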

Objective graph comprehension
[Table 2 (order of the questions in the survey), partially recovered rows: Objective comprehension intuitive graph: Health risks and potential for adaptation; 4 Objective comprehension counter-intuitive graph: Frequency of heat extremes; 5 Posterior beliefs; 6 Satisfaction with the graphs; 7 Demographics.]
Of the IPCC target audience sample, 76% gave the correct answer to the intuitive graph. In contrast, only 41% of the sample gave the correct response to the counter-intuitive graph, representing a drop in accuracy of more than 30 percentage points, McNemar's χ²(1) = 25.3, p < .001 (figure 3). Similarly, 85% of junior diplomats gave the correct answer to the intuitive graph, but only 46% gave the correct answer to the counter-intuitive graph, McNemar's χ²(1) = 11.1, p < .001. Differences in proportions correct between both groups were negligible, χ²(1) = 0.81, p = .37, BF = 0.5 and χ²(1) = 0.07, p = .79, BF = 0.38, for the intuitive and counter-intuitive graph, respectively. The most frequently selected response gives an indication as to why IPCC target audience provided incorrect responses: a total of n = 59 (41%) of all participants answered that the region most affected by heat extremes was North Europe (as opposed to Central America/Mexico), which had the highest values in the graph but is actually the region that is least affected. The second comprehension question for the counter-intuitive graph, included for the sub-sample of EPA staff, allowed us to investigate the potential use of a higher = more mapping. Indeed, results showed that accuracy correlated strongly between both questions for the counter-intuitive graph, phi(22) = .81, p < .001. Specifically, most participants who answered 'North Europe' to question 1 also tended to answer 'Central America/Mexico' to question 2, suggesting that they did in fact employ a higher = more mapping leading to these highly systematic errors.
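For readers unfamiliar with the test, the sketch below shows how a continuity-corrected McNemar statistic is computed from the two discordant cells of a paired 2 × 2 table, using the standard formula χ² = (|b − c| − 1)² / (b + c) with one degree of freedom. The function name `mcnemar_cc` and the counts in the usage example are hypothetical: the discordant counts underlying the values reported above are not given in the text, so the example does not reproduce them.

```python
import math


def mcnemar_cc(b: int, c: int) -> tuple[float, float]:
    """Continuity-corrected McNemar test for paired binary outcomes.

    b: participants correct on the first graph only
    c: participants correct on the second graph only
    Returns the chi-square statistic (1 df) and its p-value.
    """
    if b < 0 or c < 0:
        raise ValueError("counts must be non-negative")
    n_discordant = b + c
    if n_discordant == 0:
        raise ValueError("the test is undefined without discordant pairs")
    statistic = (abs(b - c) - 1) ** 2 / n_discordant
    # Survival function of a chi-square variable with 1 df:
    # P(Z**2 > x) = erfc(sqrt(x / 2)) for a standard normal Z.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


if __name__ == "__main__":
    # Hypothetical counts, chosen for illustration only.
    stat, p = mcnemar_cc(b=45, c=6)
    print(f"chi2(1) = {stat:.2f}, p = {p:.2g}")
```

Only the discordant pairs enter the statistic; participants who answered both questions correctly, or both incorrectly, carry no information about a difference between the two graphs.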

Alignment of subjective and objective graph comprehension
Descriptive results of subjective comprehension per graph are given in figure 3. We used two indices to quantify the extent to which participants could judge whether they had correctly interpreted the graph, that is, the degree to which participants' confidence is justified by their objective comprehension: (i) Correlation between objective and subjective comprehension across participants, which quantifies the extent to which more confident participants are more accurate; and (ii) Calibration, which quantifies objective comprehension as a function of subjective comprehension, and is optimal when objective and subjective comprehension align (Weber and Brewer 2003). Calibration is hence relevant to assess the extent to which each level of subjective comprehension (e.g. pure guessing or 100% certainty) is predictive of objective comprehension.
(i) Correlation. Subjective comprehension was related to objective comprehension across IPCC target audience for the intuitive graph, r(108) = .22, 95% CI [.04, .39], p = .02, BF = 3.24. Importantly, however, subjective comprehension was unrelated to objective comprehension for the counter-intuitive graph, r(108) = .056, 95% CI [−.13, .24], p = .55, BF = 0.26. Among junior diplomats, correlation results provide anecdotal evidence for a positive relationship between subjective and objective comprehension for the intuitive graph, r(31) = .36, 95% CI [.02, .62], p = .04, BF = 2.3, and the counter-intuitive
(ii) Calibration. Figure 4 shows objective comprehension for each level of subjective comprehension for the IPCC target audience. Calibration curves could not be estimated reliably for junior diplomats due to an insufficient number of participants in the subjective × objective comprehension cells. Results show that calibration curves generally followed optimal calibration (the diagonal) more closely for the intuitive compared to the counter-intuitive graph, in that the confidence intervals of actual calibration for the intuitive graph included the optimal calibration at more levels of subjective comprehension than those for the counter-intuitive graph. This was especially so at the upper end of the confidence scale, that is, where IPCC target audience were almost or fully certain to have understood the graphs, which was indeed the case for the intuitive, but not the counter-intuitive graph. One interesting exception is the lower end of the confidence scale, that is, where participants believed they were guessing. Here, the subjective confidence overlaps with optimal calibration for the counter-intuitive graph only. Figure 4 also shows that for the counter-intuitive graph, viewers were vastly overconfident of their understanding for all judgments made with medium to high confidence (75% or 100% certainty). This contrasts with results for the intuitive graph, where viewers were overconfident in their understanding of judgments made with 100% certainty only, and to a considerably lesser extent.
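The calibration index described above can be illustrated with a short sketch that groups responses by confidence level and compares the proportion correct in each group with the stated confidence. The function name `calibration_table`, the data, and the intermediate scale point of 50% are assumptions for illustration; the sketch omits the confidence intervals shown in figure 4 and is not the analysis code used in the study.

```python
from collections import defaultdict


def calibration_table(confidence, correct):
    """Summarize calibration per confidence level.

    confidence: stated certainty per response, as a proportion (e.g. 0.25)
    correct:    1 if the response was correct, 0 otherwise
    Returns {level: (n, proportion_correct, overconfidence)}, where
    overconfidence = level - proportion_correct (0 means well calibrated,
    positive values mean overconfidence).
    """
    if len(confidence) != len(correct):
        raise ValueError("inputs must have the same length")
    outcomes = defaultdict(list)
    for level, is_correct in zip(confidence, correct):
        outcomes[level].append(int(is_correct))
    table = {}
    for level in sorted(outcomes):
        n = len(outcomes[level])
        accuracy = sum(outcomes[level]) / n
        table[level] = (n, accuracy, level - accuracy)
    return table


if __name__ == "__main__":
    # Hypothetical responses on a four-point scale (25% to 100%).
    confidence = [0.25, 0.25, 0.25, 0.25, 0.5, 0.5,
                  0.75, 0.75, 1.0, 1.0, 1.0, 1.0]
    correct = [1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0]
    for level, (n, acc, gap) in calibration_table(confidence, correct).items():
        print(f"confidence {level:.2f}: n = {n}, accuracy = {acc:.2f}, "
              f"overconfidence = {gap:+.2f}")
```

In these made-up data the lowest level is perfectly calibrated (25% stated, 25% correct), whereas full certainty goes with only 50% accuracy, the kind of pattern the text describes for the counter-intuitive graph.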

Belief change through inspection of IPCC graphs?
The distribution of the perceived anthropogenic nature of climate change was clearly left-skewed for both IPCC target audience and junior diplomats. Beliefs about the anthropogenicity of climate change were, however, also varied, particularly for IPCC target audience (figure 5). IPCC target audience and junior diplomats did not appreciably update their prior beliefs regarding how much climate change can affect human health after inspection of the intuitive graph, nor their prior beliefs regarding the frequency of heat extremes after inspection of the counter-intuitive graph (figure 6). We additionally tested whether participants' prior beliefs were related to objective graph comprehension, as prior research has suggested that motivated interpretation can affect the accuracy of interpreting evidence as a function of the belief-congruency of the evidence (e.g. Kahan et al 2017). Results of these analyses are provided in the supplementary material (available online at https://stacks.iop.org/ERL/15/114041/mmedia).
A good predictor for graph satisfaction among IPCC target audience was subjective comprehension, r(108) = .24, 95% CI [.06, .41], p = .01, BF = 5.0.

Discussion
This study provides an empirical evaluation of the comprehension of IPCC graphs among IPCC target audience, decision-makers from climate-related governmental organizations and climate-related non-governmental organizations, as well as a comparative sample of future diplomats. Results showed that while a majority of IPCC target audience correctly interpreted the graph that employed an intuitive design, less than half (41%) did so for the graph with the counter-intuitive design. As a worrying consequence of the counter-intuitive design, IPCC target audience often drew the opposite conclusion from what was meant to be conveyed by the graph. This shows that the design of IPCC graphs can systematically bias interpretation among decision-makers.
Notably, the IPCC target audience seemed largely unaware of their misinterpretation of the counter-intuitive graph. Specifically, IPCC audience who answered with medium to high confidence were vastly overconfident of their objective understanding of the graph. In fact, even among those who were 100% certain that their interpretation was correct, only 50% interpreted the graph accurately (compared to approx. 80% for the intuitive graph). Importantly, a lack of awareness of the misinterpretations means that decision-makers may not be compelled to seek out information that will rectify their misunderstanding (Porto and Xiao 2016), and may even be prone to taking more risky policy choices (Simon and Houghton 2003, Dittrich et al 2005, Robinson and Marino 2015). Moreover, political decision-makers who are overconfident in their understanding might, ironically, appear particularly convincing to others (Schwardmann and Van der Weele 2019, Solda et al 2019).
The intuitive graph, in contrast, yielded not only higher objective comprehension, but also subjective comprehension that was more indicative of objective comprehension. As revealed by the calibration plot, this was particularly the case at the upper end of the confidence scale, that is, where viewers were almost or completely certain to have understood the graph. This was indeed the case for the intuitive, but not the counter-intuitive graph. Counter-intuitive IPCC graphs hence pose a double hazard: first, by preventing policymakers from distilling the correct scientific information, which risks political decision-making that is not based on the available evidence; and second, by preventing decision-makers from realizing their lack of understanding. Both perils can be prevented by employing principles of intuitive design in the creation of informative graphs.
IPCC target audience who felt more certain of their interpretation were more likely to be satisfied with the graphs, suggesting that a feeling of understanding of the information was associated with being satisfied with its communication. This is in line with previous research on graphical risk communication that reported positive associations between perceived understanding of graphs and user evaluations (Okan et al 2020), and suggests that subjective comprehension may play a key role in satisfaction with IPCC graphs among their target audience.
These results are particularly interesting in light of recent findings showing how IPCC authors themselves can reliably indicate which graphs viewers will find difficult to comprehend (Harold et al 2019). Although the two studies cannot be directly compared since they included different samples and graphs, the contrasting findings suggest that IPCC authors may be better at judging their audience's subjective graph comprehension than the audience themselves.
The pattern of results was broadly similar across junior diplomats and the IPCC target group, in that subjective understanding tended to be more aligned with objective understanding for the intuitive compared to the counter-intuitive graph. These results are noteworthy given substantial demographic differences between groups: junior diplomats tended to be younger, were more often female, and had higher education levels. Tentatively, the similarity in results between both groups suggests that better-calibrated subjective understanding of intuitive IPCC graphs might be a pattern that generalizes to different groups of viewers.
Decision-makers' misinterpretation of IPCC graphs may have dire consequences for climate change policy. Specifically, the design of IPCC graphs may lead to suboptimal use of the scientific information available. Fortunately, misinterpretations can be greatly reduced when risks are communicated using simple, well-designed graphical displays (Garcia-Retamero and Cokely 2017). The present findings therefore highlight a need for more intuitive design of IPCC graphs that follow evidence-based principles of effective graph design. Specifically, our findings emphasize the detrimental consequences of violating the general principle that spatial features (e.g. heights of bars) should convey the same meaning as conventional features such as legends or numerical labels on the scale (Okan et al 2012, 2016). Taken together, our findings suggest that graph design should reflect congruency with regard to spatial-to-conceptual mappings, to ensure that accurate interpretations can be reached without extensive elaboration. Our findings also support the idea that high visual complexity should be avoided where possible. Although our study design does not allow for determining which specific aspects of complexity contributed to misinterpretation of the counter-intuitive graph (e.g. multiple variables, multiple data points), complex visualizations tend to be associated with slower and less accurate responses (Hegarty et al 2012, Padilla et al 2018). Reducing the visual complexity of graphs depicting climate data may be challenging, as a certain level of detail may be needed to maintain scientific rigor and nuance. However, strategies have been outlined in the literature considering insights from the cognitive sciences, such as breaking down the data into different visual 'chunks' or considering whether information that is not essential for interpretation could be provided in text or a separate figure (Harold et al 2016).
Other evidence-based strategies to promote understanding of visualizations include using sufficiently large sizes for relevant graph features and reducing the spatial distance between the visual pattern and the captions or legends (Harold et al 2016, Kause et al 2020). This can help to direct viewers' attention to key information provided in textual elements and increase the likelihood that they will process and integrate such information. These strategies can be particularly helpful for individuals with lower graph literacy, who are less prone to attend to such information in counter-intuitive graphs (Okan et al 2016). Additionally, an important general principle for effective science communication is to test the recipient's understanding of the communication material (Bruine de Bruin and Bostrom 2013). This may help avoid unintended interpretations such as the ones documented here.
The findings of this study need to be carefully interpreted in light of several limitations. First, our sample of IPCC target audience was not a representative (or random) sample of the population of interest. Rather, the fairly low response rate among current political decision-makers likely implies a self-selection bias. This self-selection probably caused an overestimation of IPCC graph comprehension since the current sample may (1) be more motivated to contribute to science (Jun et al 2017), (2) be better at interpreting climate change information, or at least (3) have estimated the required time and effort to comprehend the graphs to be lower than policymakers who did not respond. Therefore, difficulties with interpretation, and outright misinterpretations, may be more prevalent in the entire population of political decision-makers. Another limitation is that we did not assess participants' graph literacy (the ability to understand graphically presented information) or numeracy (the ability to work with basic numerical concepts) using full scales. Numeracy might be relevant for comprehending IPCC graphs, because (i) numeracy may moderate any existing relationship between prior beliefs and accuracy of understanding (Kahan et al 2017), and because (ii) IPCC graphs tend to make extensive use of numerical information (Amelung et al 2016) that sometimes requires transformations. Furthermore, (iii) low graph literacy is associated with stronger reliance on spatial features such as heights of bars (Okan et al 2012, 2016), and future research could examine whether this is the case for IPCC graphs as well. An important avenue for future research lies in experimental variation of the extent to which principles of intuitive design are employed in one and the same graph. Since original IPCC graphs were selected in the present study, relevant differences (other than the use or violation of principles of intuitive design) exist between both graphs. Specifically, although care was taken to select graphs from one chapter only, the graphs still differ in the type and amount of information they display, and how they display it. In particular, the counter-intuitive graph uses nonlinear axes, which have been shown to affect both objective and subjective comprehension (Fischer et al 2018), and lacks clear axis labels that indicate what exactly the displayed numbers represent. Building on existing insights from the cognitive science of graph comprehension (e.g. Kosslyn 2006), as well as our findings that suggest that EPA staff may have relied on spatial-to-conceptual mappings, future experimental research could help estimate to what extent objective and subjective understanding of IPCC graphs can be improved through principles of intuitive design.
To conclude, climate change risk displays in IPCC reports can bias both objective and subjective understanding of the graphs, leading to systematically inaccurate conclusions, and ill-calibrated subjective comprehension. Specifically, the present results show that IPCC target audience tended to falsely believe they understood a graph the majority had misinterpreted. Since inaccurate interpretation of key findings conveyed in IPCC graphs can have far-reaching implications for climate change policy, these findings signal a critical need for more intuitive design of IPCC graphs.

Data availability statement
The data that support the findings of this study are openly available at the following URL/DOI: https://osf.io/tzkbg/. Data will be available from 24 June 2020.