The growth of climate change misinformation in US philanthropy: evidence from natural language processing

Two of the most consequential developments affecting US politics are (1) the growing influence of private philanthropy, and (2) the large-scale production and diffusion of misinformation. Despite their importance, the links between these two trends have not been scientifically examined. This study applies natural language processing and approximate string matching to a large collection of new data to examine the relationship between the large-scale climate misinformation movement and US philanthropy. The study finds that over a twenty-year period, networks of actors promulgating scientific misinformation about climate change were increasingly integrated into the institution of US philanthropy. The degree of integration is predicted by funding ties to prominent corporate donors. These findings reveal new knowledge about large-scale efforts to distort public understanding of science and sow polarization. The study also contributes a unique computational approach that can be applied to this increasingly important, yet methodologically fraught, area of research.

Private individual and corporate donors from across the political spectrum continue to exert immense influence on US politics [1][2][3][4][5][6][7]. This influence has grown in recent years with the rapid expansion of untraceable donor-directed philanthropy, which enables actors to give anonymously via pass-through organizations such as DonorsTrust and Donors Capital Fund. Indeed, much scholarly attention has been given in recent years to the increasingly important role of financial giving within US politics, exercised through campaign finance rulings such as Citizens United v. Federal Election Commission, or more generally, through the growing concentration of wealth in the US [8,9]. With the recent expansion of these activities, scholars have begun to assess empirically how, and with what impact, billions of dollars in contributions from individuals and organizations are influencing the political process [1][2][3][4][5][6][10].
Concurrently, the scientific study of misinformation has also expanded in recent years in response to the large-scale diffusion of misinformation and false news across social media, news media, and government [11][12][13][14][15][16][17][18]. These two developments, philanthropy and misinformation, have converged most clearly around the issue of climate change, where scholars have identified an empirical relationship between industry-led political philanthropy and the large-scale production and diffusion of scientific misinformation about climate change [2,10,19,20].
Yet, despite growing popular and scholarly attention at this critical intersection, our empirical understanding remains relatively one-sided because scholars have tended to focus primarily on misinformation at the expense of philanthropy. As a result, we have a much finer-grained understanding of climate misinformation (i.e. its producers, content, and diffusion) than we do about the philanthropic ecosystem that underwrites it. We know that misinformation has a significant impact on public attitudes about climate change and climate scientists [21][22][23], that there are recurring thematic patterns within climate change misinformation as revealed through automated content analysis [10,24], that misinformation can create polarization among politicians as it is amplified through their partisan echo chambers [11,12], that fringe misinformation can often obtain outsized attention in the mainstream media [25,26], and that the spread of misinformation is part of a broader 'post-truth' context that has developed in recent years [27].
The empirical knowledge that we do have about the philanthropic underpinnings of the production and spread of misinformation has largely been limited to studies of the fossil-fuel industry and its connection to lobbying efforts [19,28]. While immensely important, these are just one piece of a much larger and more complicated philanthropic puzzle. More than $410 billion is donated each year in the US alone [29], only 5% of which comes from corporations. Given this fact, and given the shortage of empirical research, it is incumbent upon scholars to examine broader avenues through which philanthropy might impact the production and diffusion of misinformation.
There are two primary reasons why these new avenues have not been explored. First, even though we know from empirical research that there has been an industry-led partnership with some conservative advocacy groups and think tanks to spread misinformation [1,2,18,30,31], scholars cannot and should not assume that mainstream philanthropy works the same way or is involved in the misinformation effort to the same degree. The fact that there is no empirical evidence linking liberal or progressive groups to the production of doubt and denial about climate change has created a tendency among those on the left to lump all right-leaning philanthropy into one camp, as if the relationship to misinformation works the same across all conservative institutions 1 . But this view is shortsighted, is not based on empirical evidence, and is inflamed by the deep ideological polarization and tribalism on the issue of climate change in the US [10,12]. Thus, the extent to which the climate denial effort involved other aspects of American life and civil society, especially mainstream philanthropy, is an open question only answerable with new data.
Relatedly, the second reason why these avenues have not been explored is that studying the connection between philanthropy and misinformation poses inherently difficult data constraints, effectively limiting the questions that can be asked. Political philanthropy and the spread of scientific misinformation can be intentionally furtive [2,6,19], especially in recent years with the growth of 'pass-through' organizations enabling untraceable contributions by economic elites that can underwrite furtive misinformation campaigns [2]. In lieu of financial transparency, research has been forced to rely on piecemeal aggregate statistics that limit the questions that can be asked about individual and organizational behavior in this arena. A finer-grained and more creative approach is needed. Fortunately, a growing body of work from the social and computer sciences is providing new tools that can open up alternative approaches to rigorously and transparently examine the link between philanthropy and misinformation, and its impact on the larger political and scientific process in the US.
Importantly, this study broadens the focus beyond well-known and well-researched actors to examine the critical, and empirically unanswered, question of the link between mainstream US philanthropy and climate misinformation. Bringing much-needed new data and methods to bear, the study asks three related research questions: (1) Is there empirical evidence of a relationship between US philanthropy and the climate misinformation movement? (2) If there is evidence of a relationship, what factors might predict links between US philanthropy and climate change misinformation efforts? (3) Has this relationship changed over time?

Data and analytical approach
In order to advance scientific understanding of these issues, this study had to overcome three obstacles that have long plagued research in this area: (1) defining the analytic bounds and unit of analysis of 'US philanthropy,' and subsequently collecting reliable and representative data on phenomena that are often purposefully and lawfully hidden from view when they involve misinformation efforts, (2) collecting reliable and representative data on the climate change misinformation movement, made up of a complex network of actors engaged in the furtive production and diffusion of misinformation at large scales, and (3) determining how to empirically assess the hypothesized link between philanthropy and climate change misinformation, given the inherent data constraints described above. In presenting the data and analytical approach, I explain how I overcame each of these challenges before moving to the results.
First, following a long line of research on philanthropy [33][34][35][36], I operationalize 'philanthropy' at the institutional level [37], and focus the measurement not on routine individualized donations (e.g. tithing to a place of worship; donating after a natural disaster), but on the broader philanthropic establishment and its social and political role in the US. Further, we know from a long line of research that climate change contrarianism is itself a coordinated movement between larger institutions in industry and politics [2,10], and thus the hypothesized link I am testing is also at the commensurate institutional level rather than at the level of the charitable giving patterns of everyday Americans. We also know that climate contrarianism, along with public and political skepticism about the consensus on climate change, is largely a conservative phenomenon, and research has not yet revealed evidence to suggest that liberal and progressive institutions have been involved in spreading doubt and skepticism 2 . Thus, in my analysis I do not examine the link between left-leaning philanthropy and climate contrarianism, but more accurately, and led by prior research, focus the data and analysis on the institution of moderate and right-leaning philanthropy.
Importantly, I satisfy these measurement requirements by using the most robust and representative indicator of US philanthropy: the Philanthropy Roundtable (PR). This powerful and far-reaching institution, which tends to be moderately right-leaning, is an ideal archetype for examining the unanswered research questions posed above about mainstream philanthropy. As perhaps the leading institution shaping US philanthropy, its mission is ambitious, aiming to 'foster excellence in philanthropy, to protect philanthropic freedom, to assist donors in achieving their philanthropic intent, and to help donors advance liberty, opportunity, and personal responsibility in America and abroad' [38]. PR's membership includes individual donors, families, and private foundations, and its activities include high-profile conferences and events, huge amounts of written material (e.g. Philanthropy Magazine, the Almanac of American Philanthropy, philanthropy guidebooks, historical monographs, and online articles), as well as, in their words, 'education of legislators on the value of philanthropy' [38] (see the supplementary materials, available online at stacks.iop.org/ERL/14/034013/mmedia, for more detail). Further, this institution birthed the untraceable and hugely influential donor-directed funds DonorsTrust and Donors Capital Fund, which some researchers have hypothesized have played a role in climate change contrarianism [2], again making it especially well-suited for this study.
In response to the challenge of uncovering reliable data on philanthropy, this study uses novel computational methods to uncover a complete dataset of persons, organizations, events, and written texts that create a robust and representative measure of the types of social, cultural, and human capital that ultimately drive institutional influence [37]. The observational approach taken here is especially important because information was collected in a naturally occurring setting, from persons who attended in-person events and philanthropy conferences, from the entirety of written publications on philanthropy, and from compiling administrative records such as board member lists.
I employ the Stanford Named Entity Recognizer in conjunction with the Python Natural Language Toolkit library to build a list of 52 994 persons (14 776 unique) and 41 594 organizations (13 855 unique) connected to PR, compiled directly from three primary sources: (1) attendees and speakers at large in-person state-of-philanthropy meetings held at destinations around the US, which include a total of 131 events between 2001 and 2017 involving 3660 persons and 3525 organizations; (2) written materials, including all articles published in Philanthropy Magazine, almanacs, guidebooks, and online articles, amounting to more than three million words of text; (3) lists of board members and lifetime achievement award winners.
This natural language processing technique is ideal for this study because for any given set of texts, the machine automatically recognizes and classifies names of things, such as those of persons, organizations, locations, or company names [39]. Thus, rosters of names of persons and organizations compiled are not limited to clean pre-organized lists (e.g. in-person attendee or speaker lists; IRS-990 board member lists), but also include every person and organization ever mentioned anywhere within more than three million words of all text produced between 1997 and 2017. See the SM for an in-depth discussion and screen-shots illustrating this multifaceted data collection process. Many of these documents are no longer available, and thus I utilized the Internet Archive to uncover historical texts and additional event attendee lists. Finally, I combined these sources with the two smaller lists of persons: all winners of the prestigious William E Simon Prize for Philanthropic Leadership, as well as the full list of the board of directors between 1997 and 2017, which were collected from Internal Revenue Service form 990 filings.
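As a minimal sketch of the roster-building step described above, the Python below assumes that a named entity tagger has already labeled each token, and shows how adjacent tokens with the same label can be merged into full names and then tallied as total and unique mentions. The tagged input, the label names, and the helper functions `group_entities` and `roster` are illustrative assumptions, not the pipeline used in the study.

```python
from collections import Counter

# Hypothetical tagger output: (token, label) pairs in a three-class style
# (PERSON, ORGANIZATION, LOCATION), with "O" marking all other tokens.
tagged = [
    ("Jane", "PERSON"), ("Roe", "PERSON"), ("of", "O"),
    ("Example", "ORGANIZATION"), ("Foundation", "ORGANIZATION"),
    ("spoke", "O"), ("in", "O"), ("Denver", "LOCATION"), (".", "O"),
    ("Jane", "PERSON"), ("Roe", "PERSON"), ("thanked", "O"),
    ("Sample", "ORGANIZATION"), ("Institute", "ORGANIZATION"), (".", "O"),
]

def group_entities(tagged_tokens):
    """Merge runs of adjacent tokens that share a non-"O" label into one entity.

    Limitation of this sketch: two different entities of the same type that
    appear back to back with no token between them would be merged.
    """
    entities = []
    current_tokens, current_label = [], None
    for token, label in tagged_tokens:
        if label == current_label and label != "O":
            current_tokens.append(token)
            continue
        if current_tokens:
            entities.append((" ".join(current_tokens), current_label))
        current_tokens, current_label = ([token], label) if label != "O" else ([], None)
    if current_tokens:
        entities.append((" ".join(current_tokens), current_label))
    return entities

def roster(entities, label):
    """Count every mention of each entity that carries the given label."""
    return Counter(name for name, lab in entities if lab == label)

entities = group_entities(tagged)
persons = roster(entities, "PERSON")
organizations = roster(entities, "ORGANIZATION")

print("person mentions:", sum(persons.values()), "unique:", len(persons))
print("organization mentions:", sum(organizations.values()), "unique:", len(organizations))
```

The distinction between total and unique counts mirrors the way the rosters are reported above (for example, 52 994 persons, of which 14 776 are unique).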
Second, in response to the challenge of collecting reliable and representative data on climate change contrarianism, I built upon the gold-standard and well-established dataset used in previous peer-reviewed research [10,18]. The data include all persons connected to organizations actively involved in the widespread promulgation of scientific misinformation about climate change between 1993 and 2017 (3532 persons, 116 organizations). These data are similarly observational and naturally occurring, collected from lists of organizations' board members taken from IRS filings, and include other established social, political, and economic ties these individuals have to a complex network of think tanks, public relations firms, trade associations, and industry front groups (see SM for extensive detail). Taken together, these proven data represent the most reliable collection of persons and organizations that have conducted possibly the most politically successful misinformation campaign in history, one that has created widespread public skepticism about science, sowed political polarization, and stalled or reversed policy action (e.g. Paris Climate Accord) in the US for nearly three decades.
2 As additional evidence for this partisan divergence on the promulgation of scientific misinformation, and to support my operationalization of moderate/right-leaning philanthropy, recent research by Brulle (2018) shows that interests opposing significant climate legislation have outspent progressive environmental groups on lobbying to the tune of 10 to 1, spending more than $2 billion between 2000 and 2016. This stark partisan split is also evident among politicians and the general public.
Third, to empirically model the hypothesized link between this large-scale misinformation network and philanthropy data, I developed a robust indicator using an approximate string matching method. Implemented in a custom R package for this project, this approach records a match occurrence for every instance in which either a person or an organization from the misinformation network is present at a philanthropy event, written about in a philanthropy publication, or is currently a member of the PR board. I use Jaro-Winkler distance to compute string distances. This is a normalized measure, ranging from 0 for identical strings to 1, that reflects how many characters two terms share in roughly the same positions, how many of those characters are transposed, and whether the terms begin with a common prefix (see SM for more on this). I employ a very conservative approach to fuzzy matching, using a maximum distance of 0.03 (nearly exact match) for persons and 0.00 (exact match) for organizations.
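To make the thresholds concrete, the following is a minimal pure-Python sketch of Jaro-Winkler distance and of a threshold test. It is an illustrative re-implementation, not the custom R package used in the study. The prefix weight `p=0.1`, the four-character prefix cap, and the function names are conventional defaults assumed here; the source does not report them.

```python
def jaro_similarity(a: str, b: str) -> float:
    """Jaro similarity: 1.0 for identical strings, 0.0 for no shared characters."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    # Characters count as matching only if they sit within this window.
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_matched = [False] * len(a)
    b_matched = [False] * len(b)
    matches = 0
    for i, ch in enumerate(a):
        lo, hi = max(0, i - window), min(len(b), i + window + 1)
        for j in range(lo, hi):
            if not b_matched[j] and b[j] == ch:
                a_matched[i] = b_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different order.
    a_chars = [ch for ch, m in zip(a, a_matched) if m]
    b_chars = [ch for ch, m in zip(b, b_matched) if m]
    transpositions = sum(x != y for x, y in zip(a_chars, b_chars)) // 2
    return (matches / len(a) + matches / len(b)
            + (matches - transpositions) / matches) / 3

def jaro_winkler_distance(a: str, b: str, p: float = 0.1) -> float:
    """Distance = 1 - similarity, with a bonus for a shared prefix (max 4 chars)."""
    sim = jaro_similarity(a, b)
    prefix = 0
    for x, y in zip(a[:4], b[:4]):
        if x != y:
            break
        prefix += 1
    return 1.0 - (sim + prefix * p * (1.0 - sim))

def is_match(a: str, b: str, max_distance: float) -> bool:
    """Record a match only when the distance is within the chosen threshold."""
    return jaro_winkler_distance(a, b) <= max_distance

# Persons use a near-exact threshold; organizations must match exactly.
print(is_match("John A. Doe", "John A Doe", 0.03))               # punctuation variant
print(is_match("MARTHA", "MARHTA", 0.03))                         # transposed letters
print(is_match("Example Institute", "Example Institut", 0.00))   # truncated name
```

Under these assumptions, a punctuation variant of a person's name falls inside the 0.03 threshold, while a transposition in a short name (distance of roughly 0.039) does not, which illustrates how conservative the person threshold is.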
Analytically and substantively, I take the occurrence of a person or an organization to be a reliable indicator, at the very least, that this prominent and influential institution of philanthropy is aware of said person/organization and is willing to integrate them into their public facing written materials, or even more substantially, have invited them to attend or speak at their high-profile philanthropy events, indicating that they see them as thought leaders or endorse their perspective. For example, if John Doe is on the board of one of the 116 climate misinformation organizations, and was invited to speak at a philanthropy event in 2001, and is then profiled in Philanthropy Magazine in 2008, a total of two matches would be recorded for him. Separately, if his organization was also listed on an event program, or profiled in the magazine, then a total of two matches would also be recorded for this organization.
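The counting rule in the John Doe example can be sketched as follows. The records, names, and the `count_matches` helper are hypothetical, and an exact comparison stands in for the fuzzy matcher to keep the example short.

```python
from collections import Counter

# Hypothetical misinformation-network rosters (illustrative names only).
network_persons = {"John Doe"}
network_orgs = {"Example Policy Institute"}

# Hypothetical philanthropy records: (year, source, entity_type, name).
records = [
    (2001, "event", "person", "John Doe"),
    (2008, "magazine", "person", "John Doe"),
    (2001, "event", "organization", "Example Policy Institute"),
    (2008, "magazine", "organization", "Example Policy Institute"),
    (2008, "magazine", "person", "Jane Roe"),  # not in the network, so ignored
]

def count_matches(records, persons, orgs):
    """Tally one match occurrence per record whose name is on a network roster."""
    per_entity = Counter()
    per_year = Counter()
    for year, _source, entity_type, name in records:
        roster = persons if entity_type == "person" else orgs
        if name in roster:  # exact comparison stands in for the fuzzy matcher
            per_entity[(entity_type, name)] += 1
            per_year[(entity_type, year)] += 1
    return per_entity, per_year

per_entity, per_year = count_matches(records, network_persons, network_orgs)
print(per_entity)  # two matches for the person, two for the organization
print(per_year)    # annual totals by entity type
```

Keeping a separate tally by year is one simple way to obtain annual totals of match occurrences alongside the per-entity counts.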
For the sake of methodological clarity, figure 1 depicts this important matching process visually using excerpts from two articles from the data published in Philanthropy Magazine in 1998 and 2006 respectively. The 1998 cover story casts doubt on global warming and cites several prominent climate denial scientists, and the cover story from 2008 highlights the growing link between think tanks and conservative donors looking to make a political impact.
This methodological approach reliably records, over twenty years' time, whether or not persons and organizations from the misinformation network were integrated into mainstream philanthropy in very concrete and impactful ways. Importantly, this analytical approach overcomes the reliance on detecting evasive financial exchanges that has hindered previous research, and instead captures more robust forms of repeated real-world social and political interactions, ranging from being invited to highly exclusive in-person meetings, to being written about in widely read national publications, to serving on a board of directors.
Further, there is reason to expect that some members of the misinformation network were better integrated into mainstream philanthropy than others, so I also tested for the influence of several covariates to predict variation in the number of match occurrences (see SM for frequency statistics for all variables). Most notably, because previous research has tended to focus on the influence of fossil-fuel related funding in climate change misinformation campaigns, I tested for this possibility. I followed a large body of research, based largely on Internal Revenue Service data and internal documents, that identified ExxonMobil and Koch family foundations as historically two of the most influential funders in the movement [2,6,10,18,19,20,40], and thus included in the multivariate regression models below a variable indicating whether or not an organization in the misinformation network had received contributions from either of these funders. To be clear, I use the phrase 'corporate funding' in a broad sense to denote for-profit businesses like ExxonMobil, but I also sought to capture foundation funding that is formally and informally linked to for-profit businesses such as Koch Industries, based on prior research. In doing so, I am able to measure both the direct and indirect influence of corporate entities (see the SM for more on the substantive and empirical justifications for including these two corporate actors in the model). Of the 116 climate contrarian organizations, slightly more than half (54%) had received such funding.

Results
To properly contextualize the quantitative results that follow, it is first important to consider ground-truth qualitative evidence about what a match occurrence substantively looks like in practice, and why it matters for the spread of misinformation.
Returning to figure 1 above, we see that the Philanthropy Magazine cover article from twenty years ago, 'The Global Warming Debate Heats Up: Politicized Science and Its Supporters,' (1998) argues that conservative foundations are trailing their liberal counterparts when it comes to influencing climate science. Earlier in this article (see full text in the supplementary material), the prominent climate scientist Michael Oppenheimer is discredited, and associated with a 'Hollywood' liberal elite who peddle climate 'doom and gloom' and ignore the purported body of alternative science casting doubt upon anthropogenic global warming. Importantly, in calling for more philanthropic funding to produce 'contrary evidence' and promote 'debate' in the public sphere, the article then draws heavily from interviews with three fringe contrarian scientists (Patrick Michaels, Richard Lindzen, and Frederick Seitz). These individuals spent several decades attacking evidence-based science, and eventually moved from the political fringe [41] to play a key role in promulgating misinformation to mislead the public, influence politicians, and create the appearance of credible scientific doubt [30,31].
The presence of such well-known actors in this philanthropic material provides one example of anecdotal evidence to suggest a possible relationship between scientific misinformation efforts and mainstream philanthropy. Further, and more broadly, their presence illustrates another channel through which the spread of misinformation can be effective, given that they are presented by mainstream philanthropy to be credentialed scientific experts with the requisite authority to affirm the supposed veracity of the misinformation.
But just how pervasive has this misinformation network become within philanthropy? And, which factors might predict why some actors from the misinformation network have been better integrated than others? To answer these questions, I turn to the main quantitative results of this study. Figure 2 presents the aggregate match results for persons and organizations across all years of the data. These graphs illustrate the total sum of match occurrences annually for twenty years' time, and indicate whether or not persons or organizations from the misinformation network were progressively integrated at in-person events and conferences and into philanthropy publications.
These data show that the number of persons from the misinformation network increased substantially. In 1997, 30 persons from the network were recorded present, yet less than ten years later, their presence had increased 443%. Organizations from the misinformation network saw a similar increase over this same period before tapering off slightly in later years. In 1997, a scant 20 organizations were present, but by 2006 their presence had grown by 345%. It is important to note that these aggregate increases are not attributable to increases in the total number of records in the data, but rather to the specific increase in the presence of persons and organizations from the misinformation network. It extends prior research showing that this network fully mobilized in the early 1990s [42], and began finding political success between 1997 and 2006, as the US abandoned its commitment to the Kyoto Protocol in 2001, creating a counter-movement to challenge scientific consensus at the foundation of the United Nations IPCC assessments, the UK government's Stern Review, and popular science mediums such as Al Gore's 2006 hit film An Inconvenient Truth.
Are there observable patterns that might explain this growth? For example, is the influence of the misinformation network broad, or concentrated among a smaller circle of persons and organizations? Figure 3 plots the same aggregate results for persons, but labels their affiliated organizations in the misinformation network according to whether or not they had received corporate funding. As noted above, only about half of all organizations in the misinformation network had received such funding (54%). The findings in this graph show very clearly that such funding is associated with much higher rates of integration into philanthropy, whereas the organizations without funding remained relatively constant, and significantly less integrated.
Further, the mean number of annual matches for organizations with corporate funding was 95.23, whereas for those without funding it was 9.57. Of the 46 unique organizations that matched, 42 of them (91%) had received such funding. Seventy percent of total occurrences in written articles and in-person events were from just five organizations, which had themselves played an especially significant role in the climate misinformation movement. All five had also received corporate funding. These patterns similarly hold true for persons who matched: of the 425 unique persons who matched at least once, 86% were affiliated with one or more organizations that had received corporate funding.
While these findings reveal very clear descriptive patterns, it is important to test this relationship in a multivariate predictive framework using negative binomial regression (table 1). Building on figures 1-3 above, the dependent variable is a total count of the number of occurrences (regressions were run separately for persons and organizations). This variable is examined using several covariates, including corporate funding (binary), time (year the in-person philanthropy event was held, or philanthropy publication was written), an organization's estimated assets, whether that organization had produced climate change misinformation texts of their own (binary), and the year the organization was founded (see SM for methodological details and frequency statistics for each variable).
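For readers less familiar with count models, the organization-level regression described above can be written schematically as follows. This is a sketch under stated assumptions: the log link, the quadratic variance form, and the symbol and variable names are standard choices for negative binomial regression rather than details reported in the text, and $i$ indexes the unit of observation used in table 1.

```latex
\begin{aligned}
y_i &\sim \operatorname{NegBin}(\mu_i, \alpha), \\
\log \mu_i &= \beta_0
  + \beta_1\,\mathrm{CorporateFunding}_i
  + \beta_2\,\mathrm{Year}_i
  + \beta_3\,\mathrm{Assets}_i
  + \beta_4\,\mathrm{ProducedTexts}_i
  + \beta_5\,\mathrm{YearFounded}_i, \\
\operatorname{Var}(y_i) &= \mu_i + \alpha\,\mu_i^{2}.
\end{aligned}
```

Here $y_i$ is the count of match occurrences and $\alpha$ is the overdispersion parameter; when $\alpha = 0$ the model reduces to Poisson regression. With a log link, $\exp(\beta_1)$ is the multiplicative change in the expected count for organizations with corporate funding relative to those without, holding the other covariates fixed.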
Net of other explanatory factors, organizations in the misinformation network that received corporate funding, compared with those that did not, are positively and significantly associated with higher occurrence counts (table 1, column 1). For ease of interpretation, this finding is presented visually in figure 4 using predicted counts, with statistically significant covariates appearing in red. Substantively, this figure shows that increasing corporate funding from 0 to 1 is associated with a match occurrence increase of 2.8.
Turning to the integration of persons, I am able to test the hypotheses using two different specifications of the corporate funding variable (table 1, columns 2 and 3). In Model 2 I used a binary variable, and in Model 3 I employed a count variable (Corporate Funding Total) that records how many total organizations a person is associated with that have received corporate funding (min=0, max=12). Net of confounding factors, these persons are positively and significantly associated with higher occurrence counts in both models. To illustrate this finding, figure 5 below shows predicted occurrence counts derived from the negative binomial regressions in table 1 at different levels of the Corporate Funding Total variable, indicating that predicted occurrence counts increase as the number of affiliated organizations with corporate funding increases 3 .
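As a small illustration of how the two person-level funding measures relate to each other, the sketch below derives a binary indicator and a Corporate Funding Total count from a person's organizational affiliations. The organization names, the affiliation lists, and the `funding_measures` helper are hypothetical and are not drawn from the study's data.

```python
# Hypothetical inputs: which network organizations received corporate funding,
# and which organizations each person is affiliated with.
funded_orgs = {"Org A", "Org C"}
affiliations = {
    "John Doe": ["Org A", "Org B", "Org C"],
    "Jane Roe": ["Org B"],
    "Sam Poe": ["Org C", "Org C"],  # a duplicate listing is counted once
}

def funding_measures(person_orgs, funded):
    """Return the binary indicator and the count of funded affiliations."""
    total = len(set(person_orgs) & funded)
    return {
        "corporate_funding": int(total > 0),   # binary measure
        "corporate_funding_total": total,      # count measure
    }

measures = {person: funding_measures(orgs, funded_orgs)
            for person, orgs in affiliations.items()}

for person, values in measures.items():
    print(person, values)
```

The count measure carries strictly more information than the binary one, since the binary indicator is simply whether the count exceeds zero.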
Finally, I considered the medium through which the integration of the misinformation network was most likely to happen: at in-person events and conferences or through written philanthropy publications? Overall, I found that persons and organizations matched more often in written publications. Nevertheless, the gap between the integration in philanthropy publications versus integration at in-person events remained relatively constant over time, both for relative proportion and absolute counts. Notably, the frequency of persons at events, as shown in figure 6, revealed a modest increase between 2001 and 2006, suggesting that the role of in-person events as a medium of integration was at its peak in the early 2000s.

Discussion
This study developed and tested a novel computational method that revealed a robust relationship between two of the most consequential and evolving movements impacting contemporary political life: large-scale misinformation campaigns and philanthropy. In so doing, the study introduces a new and broader pathway through which climate change misinformation travels, beyond the tendency of research to narrowly focus on the activities of think-tanks and fossil-fuel interests, often in isolation from mainstream American institutions like philanthropy. Yet, as this study also shows, funding from fossil-fuel sources still plays an important role: the relationship between the misinformation network and philanthropy is strongest for people and organizations directly tied to such funding. Finally, the study sidesteps many of the methodological roadblocks and data constraints that have severely impeded empirical research into these furtive issues, and shows how, and to what extent, the discursive and in-person interactions between philanthropy and the climate misinformation network have grown over time.
3 Note, as expected, that the predicted counts become less precise at higher levels because fewer and fewer people are associated with increasing numbers of organizations.
The reliance on in-person events and written publications, as well as the representative scale of the data, ensures the real-world validity of these findings. Future work on these opaque processes should similarly attempt to move beyond a tendency to focus solely on often piecemeal financial contributions data to understand social, political, and economic mechanisms behind the successful spread of scientific misinformation. This approach is especially important given that in recent years philanthropy, led by the PR itself, has moved toward untraceable donor-directed funding methods (e.g. DonorsTrust and Donors Capital Fund), shielding the gifts, identities, and political intent of donors. Instead, and in the absence of transparent funding data, future work should continue to develop creative research designs that reveal other potentially important pathways for social and political influence.
It is here where the computer sciences can be especially helpful, offering innovative ways to examine social and political problems [43][44][45][46][47][48][49]. And, as larger amounts of digitized observational and historical data are made available, researchers will do well to consider the ways that computational approaches like natural language processing or unsupervised learning might unlock new methods for conducting research on seemingly impenetrable and opaque issues like misinformation and philanthropy.
One unanswered question emerging from these findings involves the causal direction of influence, namely, which set of actors was responsible for the spread of misinformation networks within philanthropy? Were some active, and others passive? Did leaders in philanthropy, and pass-through organizations such as DonorsTrust, seek out these misinformation networks, or conversely, did the climate misinformation movement actively make inroads into philanthropy? The timing of integration, shown in part in figure 2, combined with a large body of research on the creation of the climate change misinformation movement [2,18,19,30,42], suggests that it was a dual process. This well-organized misinformation network likely sought out integration with philanthropy, as demonstrated by previous research showing that they made similar inroads into other spheres of conservative elite influence during the early 1990s [19,50]. With that said, the unique effects observed here also suggest that from very early on (e.g. figure 1 above) philanthropy was a natural fit for the misinformation network because it offered particular benefits to members who sought to protect their own political, economic, or ideological interests that may be threatened by regulatory action on climate change.
Finally, as researchers continue to turn their attention toward the empirical study of misinformation, these findings suggest that future research ought to pay closer attention to (1) broader avenues of societal influence that enable the spread of misinformation, and (2) the role of social class. First, the spread of misinformation, fake news, 'alternative facts,' and the like is not only the product of the usual suspects, such as rival nation-states or industries with much to lose (e.g. tobacco, fossil-fuel). Instead, as revealed here, these processes can also take hold among pillars of civic society such as philanthropy, and exert broad societal impacts [27]. Second, and related, the role of social class is paramount, given the prominent role economic elites play in many such civic institutions. The significance of social class, and especially of economic elites, will only continue to grow as wealth is further concentrated among a select few [8,9], and as laws continue to enable untraceable contributions that incentivize elite financial influence in politics [51]. It is therefore incumbent upon researchers to approach these knotty and clandestine processes with cutting-edge tools and new types of data in order to improve our scientific understanding of large-scale social and financial efforts to spread misinformation, undermine scientific facts, sow polarization, and exert disproportionate control over the political process.