Data campaigning: between empirics and assumptions

The use of big data in political campaigns extends far beyond micro-targeting, and has been singled out by journalists and campaign staffers alike as a powerful force that is integral to electoral victory. Current scholarship on the subject remains more mixed, however. This article provides an overview of what we know (and don’t yet know) about the effects of data campaigning across various goals of political campaigns, alongside more public-facing narratives that present data campaigning as an all-powerful tactic, highlighting the gap between these two views.


INTRODUCTION
Campaigns' use of data to craft and target messages has occurred for decades, and as the amount and availability of a variety of data points have increased exponentially, discussions of cutting-edge campaigning tactics have centred on data and the practices enabled by it. Often, these accounts go beyond descriptions of novel tactics and make assertions that campaigns' use of data is directly and causally linked to how well or poorly candidates are doing, or the overall electoral outcome. Articles following Trump's surprising presidential victory in 2016 emphasised how integral Facebook data was to the campaign with headlines like "How He Used Facebook to Win" (Halpern, 2017), and "The Data that Turned the World Upside Down" (Grassegger & Krogerus, 2017). In the months leading up to the election, however, news stories were full of how data operations gave Clinton a "crucial tactical advantage" (Goldmacher, 2016) - until the day she lost the race, that is. In 2012, which was dubbed the "first big data election" (Hellweg, 2012), a similar story played out. Following Obama's victory, journalists homed in on how "Obama trumped Romney with big data" (Thiessen, 2012), and political operatives on both sides of the aisle doubled down on the need to create a "culture of testing" within their ranks.
Despite these assertions of data's power and influence, knowledge of when, how, and if data campaigning actually works is more complicated. This article explores what the fields of political science and communication studies know about the empirical effects of data campaigning and highlights the gap between those realities and how data is so often described as all-powerful.
The uses of data and analytics in political campaigns represent cutting-edge practices, but also have a longer history within campaigns. What we understand as "data-driven campaigning" - using large data sets either to target messages to particular populations or to test the efficacy of variations of messages for a variety of goals - rose in prominence first during the Obama 2008 campaign, but gained precision in its application and its public profile during the 2012 campaign, which many news outlets called "the big data election" (Hellweg, 2012). While the practices of 2008 and 2012 were novel in many ways, they were also deeply connected to ongoing practices and strategies. They were clearly linked to findings developed by academics in the field of political science at the turn of the millennium, and have roots in what are now considered the routine and mundane practices of polling that were developed in the 1940s and which campaigns still use to help craft their messages. Yet, the use of data and analytics in making evidence-based decisions about campaign strategy is often held up as radically new and deeply effective. Although attempting to answer important questions with more data is generally a better alternative to using no data, the assumption that just because data is involved the answers must be right is itself dubious.
This article delves into the complexities of data's power and influence in political campaigning in the US, examining if, when, and how data campaigning has been shown to actually work. I analyse the US case because, due to a combination of enormous financial investment in campaigning and lax regulation around privacy and data use, data campaigning in the US far outpaces that in other countries (Dommett & Power, 2019; Dommett, 2019). Moreover, data use regulations and guidelines such as those in the EU and Canada prohibit targeting based on personal information like race, gender or religion, while political practitioners in the US have embraced such practices and made them central to campaigns at both the national and local levels. If we are to understand the vast terrain of data campaigning, doing so with the US case in mind provides a very full picture of the opportunities at hand, as well as their limits. Moreover, despite data use regulations in some countries, concern about the export of US data practices abounds, and it appears that data-driven practices developed in the US are indeed exported to other contexts by professionals at technology firms like Google or Facebook, off-the-shelf software companies like NationBuilder, or political consultants (McKelvey & Piebiak, 2018; Strömbäck & Kiousis, 2014).
Broadly, this article argues that data-driven practices have been much more productive at mobilising action, like getting out the vote and improving donation rates, than at persuasive goals of getting someone to support a candidate. It provides a comprehensive overview of what we know data-driven practices can and cannot do, and points to places where assumptions about their effects outpace empirical knowledge. It also discusses the ways that what David Beer (2019) has dubbed the "data imaginary", or "how a faith in data emerges and then becomes embedded or cemented in social structures and practices" (p. 127), is at work in coverage of data […] in the 1940s has become routine, as has the use of "lifestyle" consumer data like credit card purchases and magazine subscriptions that was all the rage in the 1990s. Today's novel data type, social media data that purports to provide insight into our deepest emotional states and analytics that track which messages get more attention and engagement, will likely be tomorrow's boring data set, regardless of how effective it is currently or becomes over time. As data-driven or analytics-based campaigning has become de rigueur, campaigns at all levels have adopted the language (sometimes even more than the practices) of data, and applied it to whichever level of practices they are operating at. The openness of the term (polling data is in fact data, after all), coupled with the complexity of newer data practices, offers this rhetorical flexibility, and a papering-over of the very real distinctions between a huge variety of campaign practices. This article parses those distinctions, providing an account of how many of these

TARGETING AND TESTING
At the most overarching level, data campaigning involves two genres of practice: targeting and testing. They can each be put to use toward a variety of campaign goals, including persuading members of the public to support their candidates, or mobilising people to take some sort of action, most often donating money or getting out the vote (GOTV) (Tufekci, 2014). Any of these goals and genres of practice can be supported by access to a variety of types of data, including public voting records and census block data, campaign or party-supplemented databases, consumer or "lifestyle" data purchased from a third-party vendor, and lifestyle-adjacent social media data, which can account for web browsing history, social graph data, and the algorithmic grouping of these data points into categories like emotional state and disposition. Many of these data points can be combined by a campaign or by a third-party firm selling data itself (e.g., Catalist, Aristotle, or NationBuilder) or strategy services around how to test and target messages using this data (e.g., TargetSmart or Targeted Victory). This piece not only breaks down each of these types of data-driven campaigning, but discusses which of them have garnered public attention, which have been touted as revolutionary and powerful, and which have been empirically tested to assess their power and efficiency. […] information about what messages voters would respond to. Micro-targeting, or using an increasing number of data points to target smaller and smaller slices of the population, is an extension of these early practices. Targeting tactics were celebrated well before the digital turn in politics, dating back to attempts to segment the American public by ideology and political interests that began in the 1950s (Issenberg, 2012), and, as Daniel Kreiss' (2016) work has traced, to the 1980s, when the Republican party created a set of index cards with detailed information about individual members of the voting public.
In the digital arena, the 2012 election saw much discussion of both the Obama and Romney campaigns' uses of micro-targeting for ads that ran on websites and before YouTube videos, as well as in video games (Otenyo, 2010), and on social media platforms. The ability to target audience segments in more specific and refined ways has only increased, as digital platforms like Facebook and Google as well as ad sales firms have developed ways to reach increasingly specific slices of audiences over time, and can charge a premium for doing so.
While much has been written about the possibilities of such tools and practices (Chester & Montgomery, 2017) and the subsequent dangers of these possibilities, much less work testing their efficacy has been produced. Public discussions of targeting programmatic ads often assume that more and smaller data points lead to better outcomes, but studies of consumer outcomes have recently cast doubt on how much more effective micro-targeting is (Marotta et al., 2019).
Beyond questions of its efficacy, targeting, and especially increasingly specific micro-targeting, has been particularly criticised for reducing opportunities for shared public deliberation (Howard, 2006; Kreiss, 2012; Tufekci, 2012), but a recent study from the UK shows that despite the ability to target, even major campaigns end up with messages that largely echo the narratives found in national-level ad campaigns (Anstead et al., 2018), raising the question of how different the content that results from micro-targeting really is. In the highly publicised case of the Trump campaign's use of Facebook's "dark posts" - which allowed the campaign to functionally make ads invisible to non-targeted populations and were used to target Black Americans with messages arguing that Clinton was racist and should not be supported - similar ideas were hardly absent from the larger campaign, with Trump himself tweeting about these ideas and the campaign running a national ad on the topic, too (Hellmann, 2016; Savransky, 2016).
Testing, on the other hand, allows campaigns to empirically measure how well messages perform against one another and use that information to drive content production. It is profoundly helpful in figuring out how to best get audiences to take action in a particular, immediate case - for instance, which ad results in the most clicks, donations, or email sign-ups. But its relevance for understanding long-term dispositions or actions is less clear. This has raised some concern among practitioners about messages that might work in the short term but potentially have negative long-term consequences, such as fear- or shame-based messages used to drive fundraising or turnout (Brooks, 2018).
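To make concrete what such a test measures, the following is a minimal sketch in Python of comparing two message variants with a two-proportion z-test. The function name, the counts, and the choice of this particular statistical test are illustrative assumptions, not a description of any campaign's actual tooling. Note that the outcome being compared is an immediate action (a click, donation, or sign-up), not a change in belief.

```python
import math


def two_proportion_z_test(actions_a, shown_a, actions_b, shown_b):
    """Compare the action rates of two message variants (hypothetical helper).

    Returns (rate_a, rate_b, z, two_sided_p_value).
    """
    rate_a = actions_a / shown_a
    rate_b = actions_b / shown_b
    # Pooled rate under the null hypothesis that both variants perform equally.
    pooled = (actions_a + actions_b) / (shown_a + shown_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / shown_a + 1 / shown_b))
    if std_err == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value


# Illustrative (invented) counts: each variant shown to 10,000 people.
rate_a, rate_b, z, p = two_proportion_z_test(200, 10_000, 260, 10_000)
print(f"variant A: {rate_a:.3%}  variant B: {rate_b:.3%}  z={z:.2f}  p={p:.4f}")
```

A small p-value here says only that one variant drew more immediate actions than the other in this audience; it says nothing about long-term dispositions, which is the limitation discussed above.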
While fundamentally different, targeting and testing can be, and often are, used in tandem. A campaign can target a message, then test it within that targeted audience, or test messages across audiences. Data about both targeting and testing are often provided by those technology firms selling digital and social media ads. Despite the differences, in much coverage of digital campaigning tactics, the lines between targeting and testing are blurred, and success in one arena is often used to define or show evidence of success in another. In 2016, the Trump campaign's targeting tactics were widely credited for the victory in profiles emphasising the aforementioned "dark posts" and "psychometric targeting" offered by Cambridge Analytica, which used data related to users' psychological state outside of politics and demographic information to segment audiences to receive different campaign messages (Grassegger & Krogerus, 2017). Fundamentally, the idea of psychographic targeting hinges on targeting users based on how they fit into one of five psychological profiles (openness, conscientiousness, extroversion, agreeableness, neuroticism). And yet, in describing how and why these tactics were productive, the Grassegger and Krogerus article blends the two as it notes that the campaign "tested 175,000 different ad variations for his arguments, […] in order to target the recipients in the optimal psychological way". While these tests may be useful for seeing which images or text perform better, they are functionally A/B tests, and not particularly tied to the use of the five psychological targeting categories.

PERSUASION VS. MOBILISATION
Both targeting and testing can be used for a variety of campaign goals related to both mobilising audiences to take particular action and persuading them to support a candidate they are not already supportive of. Although data and digital teams often have their hands in both mobilisation and persuasion activities, the two goals are fundamentally different: convincing someone to take an action they are likely to be supportive of, versus changing someone's mind about a candidate or issue. Campaign organisation in the US often reflects this divide, with persuasion-oriented goals typically being the purview of the Communications team, and mobilising GOTV and mobilising donations being run by the Field and Finance teams, respectively (Blodgett, 2008). While digital and data teams support all of these efforts and often hold equal power in campaigns, the separation of these efforts is illustrative of their core differences.
Persuasion is tremendously difficult to measure empirically. While polling, dial tests, and focus groups may get at changes in attitudes or immediate reactions to a message, disentangling those attitudes from more macro-level dispositions and contexts, and competing narratives in the world is hard. And yet, claims about the persuasive power of any number of tactics abound.
From assessments that a campaign's message was better so they won the race, to claims that highly targeted ads changed people's minds, assessments of persuasion oversell certainty and undersell the role of exogenous factors such as party identification, the state of the economy, or the obvious advantages of incumbency. These claims also fundamentally conflate persuasion and mobilisation. When campaigns talk about using A/B testing or randomised controlled experiments to figure out what messages worked, "working" is defined not by a change in belief, which would be all but impossible to measure, but by being mobilised to take a particular action, be it signing up for a newsletter or donating money. Even more commonly, when journalists and pundits discuss who was persuaded in an election, they are necessarily discussing who was mobilised to vote. To say persuasion is tremendously difficult to measure does not mean campaigns should abandon all attempts to cause it. Of course campaigns will develop narratives, test them, and target particular populations with them in a fashion that makes logical and contextual sense, and targeting may very well work better than no targeting at all.
The point is, rather, that the empirical backing of these practices is neither easily identifiable nor clearly established, and claims by campaign operatives and political professionals that it is, whether they are about overarching narratives or highly targeted messages, are overly confident.
Claims surrounding the role of Cambridge Analytica's psychographic targeting in both Brexit and the 2016 US election provide a useful example of these slippages. Although it was touted as a "psychological warfare mindfu*k tool" by creator-turned-whistleblower Chris Wylie (Cadwalladr, 2018), there simply is not much empirical evidence of its persuasive capacity. When former Cambridge Analytica CEO Alexander Nix boasted that the company could "predict the personality of every single adult in the United States of America" (Grassegger & Krogerus, 2017), what he literally meant was that the firm could assign a "personality" category to everyone, with conflicting accounts of how precise that designation was and little clarity on whether it made a difference in political beliefs (Sumpter, 2018). Political beliefs, as opposed to consumer behaviours like becoming interested in a new product, are especially difficult to dislodge, and when audiences notice that political content on social media is an advertisement, they react more sceptically toward it than they do toward content from consumer brands (Boerman & Kruikemeier, 2016). In just the same way that data-oriented practitioners criticise traditional messaging consultants' "gut-based" belief that a message works (which, it should be noted, does make use of "data" gained from dial tests and focus groups), they should be hesitant to oversell their understandings of what data shows about persuasion. While some journalists have written about the dubious nature of some of these claims, particularly those of Cambridge Analytica (Confessore & Hakim, 2018; Lapowsky, 2016a), many have also been more than willing to parrot campaigns' claims that they figured out a failsafe way to persuade citizens using a variety of data points. Much of this coverage echoes the earliest digital coverage of the Obama campaign, which often focused on the campaign's use of social media, overselling the persuasive and mobilising power of novel digital platforms.
We actually do know quite a bit about what data points and practices are important for mobilising a variety of actions, as it is easier to test when a clear outcome happens or doesn't. As field experiment experts David Nickerson and Todd Rogers (2014) have explained in detail, in order to decide which people are correct targets for any mobilising goals, from mobilising turnout or GOTV to fundraising, campaigns create predictive scores for each individual, which model the likelihood that someone will undertake a specific political behaviour, support the candidate, or respond in any way to a stimulus. These scores are often used in tandem, such as when Field teams need both a clear picture of support and voting behaviour in order to determine who to target with GOTV efforts.
The data that goes into these scores includes publicly available macro-level data like that from voting records and the census, purchased macro-level data that may be more up to date than those sources, purchased lifestyle data, and user-provided data gained from citizens directly reporting that information, or from analytics and cookies that track it through their web use (Nickerson & Rogers, 2014; Dommett, 2019). In practice, behaviour scores rely fundamentally on data that concerns prior behaviour: prior voting behaviour is integral to GOTV, while prior donations are integral to fundraising. Support data is similarly heavily dependent on publicly available voting record data, followed by publicly available census data, and direct voter contacts, wherein campaigns ask people how supportive they are or which issues they are interested in.
Support scores can also make use of analytics that can determine how someone is interacting with the campaign's digital messages to gain a better picture of those on the higher end of the scale, but this type of data is less useful for those with lower scores (Nickerson & Rogers, 2014). In lieu of getting input from everyone, they can also be used to model increasingly specific scores for others who have similar major data points, but haven't been contacted by the campaign or party. Responsiveness scores are an indicator of whether someone is likely to respond to a campaign's message or call to action, and are largely based on testing that the campaign does. They make use of a wide variety of data acquired through users' various digital footprints, from when and which campaign emails they open to whether they sign up for an event, and because they are not based on identity, could be useful in countries with identity-based targeting restrictions.
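The way behaviour and support scores can be used in tandem for GOTV targeting can be illustrated with a deliberately simplified sketch in Python. Everything here is a hypothetical stand-in: real campaigns fit statistical models to far richer data, whereas this sketch assumes a naive turnout score (the share of past eligible elections a person voted in), a support score already on a 0 to 1 scale, and arbitrary thresholds chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Voter:
    voter_id: str
    prior_votes: int      # elections voted in, as in a public voter file
    prior_elections: int  # elections the person was eligible to vote in
    support: float        # assumed 0..1 support score, e.g. from voter contact


def turnout_score(voter):
    """Naive behaviour score: share of past eligible elections voted in."""
    if voter.prior_elections == 0:
        return 0.5  # assumption: no history is treated as an uninformative midpoint
    return voter.prior_votes / voter.prior_elections


def gotv_targets(voters, min_support=0.7, max_turnout=0.8):
    """Pick likely supporters who are not already near-certain to vote.

    The thresholds and the ranking rule are illustrative assumptions.
    """
    picks = [
        v for v in voters
        if v.support >= min_support and turnout_score(v) <= max_turnout
    ]
    # Rank by support weighted by how much room there is to raise turnout.
    return sorted(
        picks,
        key=lambda v: v.support * (1 - turnout_score(v)),
        reverse=True,
    )


# Invented example records.
voters = [
    Voter("a", 4, 4, 0.9),  # strong supporter who always votes
    Voter("b", 1, 4, 0.8),  # supporter who rarely votes
    Voter("c", 2, 4, 0.3),  # unlikely supporter
    Voter("d", 2, 4, 0.9),  # strong supporter who sometimes votes
]
print([v.voter_id for v in gotv_targets(voters)])
```

Even in this toy version, the only behavioural input is prior voting history, which mirrors the point that these scores lean most heavily on prior behaviour and publicly available records.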
Some types of data have little value in terms of direct effects on persuasion or mobilisation, yet still have some value to campaigns. Data gained by campaigns through both testing and analytics connected to who read or opened content is particularly helpful to creating the responsiveness score, which assesses how likely an audience member is to respond to any stimulus, but it is much more effective for better understanding those who already support a candidate. That said, in much public discussion of data campaigning, it is the information that is digitally "provided" by voters (assessments of this data that are provided by web browsers, social media platforms, or third parties) that is touted as revolutionary, when in fact it has been shown to have limited predictive power. Additionally, data may not hold predictive value, but can still be a valuable asset to campaigns because they can sell or rent access to it (Tactical Tech, 2019).
When creating and refining these scores, information about known supporters gets increasingly rich, which may lead to mobilising power, but is much less likely to improve persuasive ability about a candidate overall. Laws regulating data collection and use by either corporate or political actors, such as the EU's GDPR, pose particular constraints on the ability to target for either mobilisation or persuasion (Kruschinski & Haller, 2017), and these regulations often place even greater importance on voting history, as it is less often regulated than data associated with identity such as gender or race, or more "micro" level data such as web use. While privacy and legal scholars have highlighted the possible loopholes in such regulations (Bennett, 2016), and argue that despite restrictions, targeting in particular can be engaged in (Dobber et al., 2017), such workarounds often involve using more and more proxies for intended categories, thus reducing the efficacy of such data, and increasing dependency on known and macro-level data points like voting history.
In a campaign, staffers rely on tests conducted in prior election cycles that have made empirical findings concerning what data points matter when targeting potential voters to get them to the polls. For mobilising turnout, Eitan Hersh's (2015) work has shown not only that publicly available data found in voter records is key, but that purchased, hyper-specific data - such as "lifestyle data" or consumer histories that tell you what magazines people subscribe to, what kind of car they purchased, or what their spending habits are like - is often redundant rather than additive to this public, macro-level data. Moreover, this data, on its own, is shown to have no predictive power, and is largely redundant to that which is publicly available in the US (Hersh, 2015; Nickerson & Rogers, 2014). Even other publicly available data from the census that people broadly consider useful to campaigns, like income or education level, doesn't hold explanatory power when controlling for voting history. Despite a lack of evidence concerning its importance, purchasable hyper-specific data has a storied history in campaigns, as it was touted by the 1996 Clinton campaign for its ability to produce smaller populations the campaign wanted to target, like "soccer moms" or "pools and patios" (MacFarquhar, 1996). Even if the available data and predictive modelling improve over the coming years and this data adds marginal benefit, use of the data is still risky, as there is a real threat of turning out voters who are not firmly in your corner. […] (Lau et al., 2007), but emotions like anger do encourage partisan responses to false or biased information (Weeks, 2015), we should consider the possibility that highly targeted negative ads may be of benefit for mobilising supportive audiences. Micro-targeting and increasingly specific data points are much more likely to yield impressive results for fundraising and other, less zero-sum actions than to effect change in turnout or persuasion.
That said, even in these cases where that seems likely, the effects of targeting are difficult to disentangle from the effects of testing that is also undertaken. Thus, newer tactics like using Facebook's "Lookalike Audiences", which allow political practitioners to find targets who share demographic and lifestyle qualities with those they already have contact information for, or targeting users who "liked content related to X politician", are much more likely to be of benefit in mobilising donations and email signups than in causing a change in political opinion. Moreover, there is little to no empirical research that shows that something like "psychographic" targeting works to persuade people, or does much other than deepen existing commitment.
In each of these cases, there may be meaningful reasons to use new targeting or testing abilities that lie outside of known empirical outcomes: in cases where no effects have been shown empirically, there is unlikely to be a penalty for engaging in them, and the everyday work of campaigning largely revolves around crafting narratives and reaching out to voters in ways that may not have empirical effects. In highlighting the gap between what is known about data campaigning and how it is discussed in public, this article seeks to decouple the assumption of empiricism and objectivity from the very fact of being data-driven.
Beer (2019) describes "the veneer of knowing that aims to draw people into a data rationality" as central to the "data imaginary" (p. 4). In Beer's vision of the data imaginary, six qualities or themes (that data is speedy, accessible, revealing, panoramic, prophetic, and smart) are key.
In the following section, I show how narratives of data campaigning focus especially on the data imaginary's themes of revealing, panoramic, and prophetic. For Beer, data as revealing refers to the idea that data "are represented as being the means by which 'hidden' value might be unearthed or new value might be tapped" (p. 25), and panoramic refers to the idea that "data analytics shine a light on blind spots […] in which nothing is outside of the knowledge that is produced by the data" (p. 26). In both of these qualities, information or its value is rendered visible by the use of data. Similarly, danah boyd and Kate Crawford (2012) have argued that understandings of "big data" hinge on the "widespread belief that large data sets offer a higher form of intelligence and knowledge that can generate insights that were previously impossible, with the aura of truth, objectivity, and accuracy" (p. 663). Beer's definition of the data imaginary's theme of prophetic emphasises that data "open up a world in which it is possible to anticipate what will happen and respond accordingly" (p. 27). The data imaginary is important not because it reveals a lie, but because it demonstrates how data practices that are effective are framed in the same ways as those that are less rigorously tested (or completely untested), resulting in oversold claims about the power and impact of data practices as a whole. What follows is an initial overview of how, rhetorically, public-facing narratives of data campaigning have fallen into these themes.
Themes of data as panoramic and prophetic have long dominated narratives of data campaigning. Sasha Issenberg's (2012) best-selling book, The Victory Lab, places current data campaigning practices in a historical context, and even his accounts of the earliest uses of "data" in the 1950s emphasise these themes. He details how Simulmatics, one of the earliest consulting firms, touted its ability to provide so much data that it revealed new information about voters and how they could be grouped by interest, as it "included 130,000 respondents… was able to divide the United States into 480 'voter types' […] and take the temperature of each voter type on fifty-two 'issue-clusters'" (p. 118). As early as 2004, coverage of data campaigning was marked not only by extensive claims of data's panoramic ability to see all issue positions, but also its ability to prophesy: that it could "divine [your] likely views on taxes, law enforcement, abortion, and law enforcement" (Gertner, 2004). Such claims are also in turn "proven" by reference to data's enormity. In this case, such divination was enabled by "the several whirring, refrigerator-size computer servers in the Washington area" (ibid). New York Times columnist David Carr (2008) echoed the focus on infrastructure, arguing that Obama would have political power as he entered the White House because he would have "not just a political base, but a database". The digital sublime, via whirring largess and spreadsheet columns, is valuable and astonishes without need to describe how or why it will actually work.
Accounts of data campaigning also emphasise how large data sets can reveal new people, or truer versions of people, to campaigns. A 2004 piece, "The Search for the Elusive Swing Voter", emphasises data's power to reveal in its title, and ultimately argues that what allows this previously unlocatable type of voter to be rendered legible is the vast amount of data parties have accumulated and crafted into additional analytics, with both parties holding over 150 million voter files (Green, 2004). These claims are renewed year after year, with numbers inching up - nearly 200 million files held by the Obama campaign in 2012 (Pilkington & Michel, 2012) - and commonly argue that new data creates the conditions under which "a campaign can literally know who on a block by block basis is persuadable" (Miller, 2012).
The data imaginary's frame of data as useful insofar as it reveals new information can also be seen in how the amount and type of data campaigns use is covered and revered. Descriptions of data operations that fundamentally centre scale - the "as many as 306 lifestyle variables" held by the Democrats in 2004 (Green, 2004), the "500 data points on every individual" the Obama 2012 campaign made use of, or Cambridge Analytica's boasts of having over 5,000 data points on every American (Chon, 2019) - emphasise how the scale of data operations will necessarily reveal new qualities about voters. Coverage of data campaigning also draws on and reproduces the data imaginary when it assumes that all data is equally valuable in producing insights about potential voters, and that larger data sets filled with novel data points are therefore most valuable. For instance, calling data campaigning "the Moneyball of politics" (Miller, 2012) argues that data's value is in strange, undervalued data points, not information like that which is in the publicly available voter file. Until about 2004, these novel data points were the lifestyle data discussed above, and in 2016, they were Cambridge Analytica's "psychographic" profiles.
When Cambridge Analytica CEO Alexander Nix boasts that psychographics "are equally important, or probably more important" than demographic categories (Nix, 2016), he is not only empirically wrong, but is relying on and reproducing the data imaginary. Implicitly, claims like this argue that out of 500 data points a campaign has about me, it is the 490 new and strange ones that allow them to see me more clearly, rather than the publicly available ten concerning my voting history, address, gender, age, and so on, that have been empirically shown to have the most explanatory power.
While the predictive models enabled by voter files and campaign-collected data like that gathered from phone banking have made measurable and important differences in turnout (Nickerson & Rogers, 2014) and can play a role in determining strategy across all aspects of a campaign, they are fundamentally questions of probability and prediction. Yet, prediction becomes prophecy in the data imaginary. Data journalists like Nate Silver have written extensively about the difficulty in explaining probability and uncertainty in models (Silver, 2015), and yet campaigns and political journalists often describe data campaigning strategies in ways that reify their certainty. In 2016, the Trump campaign described how they knew ads aimed at depressing turnout in Black communities would work by merely stating "we know because we've modelled this" (Green & Issenberg, 2016). Similarly, when Nix claims Cambridge Analytica can "form a model to predict the personality of every single adult in the United States of America" he is also making a claim about the infallibility of this prediction. This treatment of predictions as prophecies that inevitably reveal the truth occurred in discussions of the Obama campaign's use of data in 2012 as well, with news stories celebrating their ability to "make the data give up its secrets" (Dickinson, 2012).
Coverage of testing - whether simple A/B testing or more rigorous randomised controlled trials - also falls into the tropes of the data imaginary, particularly the emphasis on revealing. The major data story of the 2008 election was the use of A/B testing in the Obama campaign, and much coverage of the practices and results of testing falls into the themes of the revealing and the panoramic. After Obama won the 2008 election, narratives focused on how testing seemingly small differences in interface design, such as moving a button or adding a splash page, would lead to changes in behaviour that would otherwise be unknowable (Siroker, 2010). A/B testing was also central to the narrative about the Trump campaign's success with Facebook ads, as the campaign claimed to have run upwards of 100,000 ad variations per day to do "A/B tests on steroids" (Lapowsky, 2016b; Green & Issenberg, 2016). The sheer number of tests provides a panoramic vision of what messages would work, echoing the "test everything" mantra of 2012 Obama campaign manager Jim Messina, and contributing to the idea that testing all possible variables will reveal new information and make the world of digital campaigning entirely knowable. In one of the more widely discussed examples, wherein the Obama campaign tested small differences in the words written on a button ("Learn More" versus "Sign Up"), these analytics-based tests revealed differences that were barely observable (Siroker, 2010). Another lesson from the Obama campaign was that "Sometimes, ugly stuff won" (Engage DC, 2012). In cases like these, data acts as Beer's "prosthetic eye", seeing the advantage of choices that traditional experts like user experience designers and advertising creatives saw as poorly designed or aesthetically displeasing.
Within the data imaginary, testing is also framed as more than a way to reveal how to best mobilise actions like signing up for an email list or donating money; discussions of its power slip into assumptions of its persuasive capacity as well. The Cambridge Analytica whistleblower Chris Wylie has repeatedly described their targeting and testing practices as "psychological warfare tools" (Cadwalladr, 2018). In an article headlined "How He Used Facebook to Win", Sue Halpern (2017)  voter mobilisation efforts is fundamentally about using data to predict electoral outcome and emphasising its effects is integral to producing meaningful coverage of data campaigning. Yet, the narratives about other data points as necessarily adding predictive power are also presented as truth, though little research has been done in this area. This is the data imaginary at work.
Imaginaries sometimes reflect reality, but often reflect a hope, or fear, for how data will work, and emphasise the revealing, panoramic, and prophetic qualities of data to do so. Across all of these themes, practices of data campaigning go without rigorous treatment or attention to how, when, and why they actually result in meaningful change or affect political behaviour.
Instead, it is assumed that the new things that are revealed and made visible are of political importance, and that the future predicted by a data set is both necessarily correct and open to manipulation using additional data-driven tactics.

CONCLUSION
Data campaigning has been heralded as an effective tool for nearly all aspects of campaigning, from GOTV to persuasion to fundraising. This article has attempted to nuance those claims, providing an overview of the known empirical evidence that supports how, when, and what kind of data is most effective for a variety of campaign goals. Despite overarching claims of its importance, the empirical facts are more of a mixed bag. On one hand, data campaigning has optimised field organising and improved voter turnout, using reams of data found in voter files and party databases to figure out how to most effectively get people to the polls by canvassing, calling, or text messaging (Gerber & Green, 2017; Malhotra et al., 2011; Michelson & Nickerson, 2011; Nickerson & Rogers, 2010). On the other, there is little evidence that data-driven targeting works for persuasive outcomes, or changing someone's vote preference, despite the publicity concerning these practices' importance (Kalla & Broockman, 2018). What we do know is that basic, publicly available demographic information, not the reams of hyper-specific lifestyle information, is often most effective for improving turnout (Hersh, 2015). New tactics, like using models provided by social media platforms to locate and target previously unknown potential voters, seem to hold great promise for mobilising fundraising, but considerably less for persuasive outcomes. Moreover, despite the very real ways data campaigning matters, data capabilities are lacking in down-ballot races (Anstead, 2017; Baldwin-Philippi, 2016), and even successful top-of-the-ballot campaigns can lag behind (Baldwin-Philippi, 2017; Kreiss, 2016; Kreiss et al., 2018). And despite the ability to target hyper-specific messages to audience segments, social media ads' content often mirrors the narratives of broad national-level ads (Anstead et al., 2018).
Despite these limited findings, public discussion of data campaigning and micro-targeting persistently makes claims about its power and effectiveness, often drawing on common tropes of "the data imaginary". Drawing on scholarship rooted in science and technology studies and internet studies that critically examines the power that "big data" as both an object and method has gained across a variety of fields (Beer, 2014; boyd & Crawford, 2012), this article emphasises the way that these descriptions of data are often decoupled from descriptions of why data is correct or relevant to the case at hand. Overwhelmingly, they come instead of, not alongside, discussions of the known empirical effects of such data.
There are stakes to the publication of these claims and belief in their veracity beyond that of unrigorous journalism. Importantly, the data imaginary is not just a set of tropes that give credibility to data practices, but a productive, reifying process that reinforces the power of the imaginary itself. With that assumption of power, consulting firms and data corporations are more likely to receive investment and earn contracts, thus shoring up the data industry and the firms doing that type of work. Not only is this the case in the US, but these myths are exported to other countries as "cutting-edge" tactics. While some regulations prohibiting the use of certain types of data and targeting practices exist, it is likely that data-driven practices will push up to the limit of local laws, or even explicitly and clearly break them, as was the case when Cambridge Analytica violated campaign finance laws during its work for the Vote Leave campaign ahead of the 2016 Brexit referendum.
There are political risks to relying on claims about the effects of data campaigning to justify further regulation, too. Concerns about the use of data by political campaigns take many valid forms, including but not limited to those related to privacy, discrimination, and deceptive authorship. But resting these concerns upon assumptions of data's inherent power and manipulation makes for slippage between these many problems. As a brief example, a 2018 UK Information Commissioner's Office (ICO) report, "Democracy disrupted: Personal information and political influence", primarily focuses on the (important) problems of the lack of transparency and users' inability to control their data, but uses the claim of "influence" as a main reason for exigency in regulating. Regulation of commercial and marketing practices that protects privacy has not generally been concerned with how successful or effective marketing is, and yet implicit claims of effects and "manipulation" slide into political discussions seamlessly. Although the current environment seems ripe for opportunities to regulate, relying on, or even emphasising, the strong effects of micro-targeting to make persuasive arguments about doing so actually poses risks because of the faulty assumptions it involves. In attending to the particulars of how data is actually used, and what its effects are within campaigns, this article aims to re-orient conversations away from concern about media effects, and toward the more fundamental ethical and normatively democratic questions that regulation has been, and likely should be, concerned with.