Picturing the Arctic: digital imagery and the prospect of using search engines to collect data for interpretative political research

ABSTRACT Imagery frames reality, and political actors tell stories using images. In an increasingly digital communication landscape, political actors tell visual stories directly on websites or social media channels. This online shift places digital imagery centrally in how we picture political issues, events, and places. Digital images are mobile, circulable and appropriable, which means images are not fused to their immediately surrounding text. Telling a story with digital imagery constitutes a contribution toward a wider digital visual discourse, enabled by circulation. Interpretivist research lacks tools to unpack this digital visual discourse. This article critically evaluates a technique to tap into digital visual discourse using semi-automated data collection utilising search engines. Such data collection tools can divorce imagery from its immediately surrounding text and create a corpus that allows us to identify a digital visual discourse around a given topic. I draw on an attempt at scraping search engines to this end, studying how actors portray the Arctic. The technique is presented transparently with a call to engage with the tool, to spur methodological debates and innovation. Search engine scraping can, in the right research design and if applied critically, illuminate new dimensions of discourse by prying apart written text and imagery.


Introduction
The effort to visualise political power and action is part and parcel of politics. From the faces of emperors on early Roman coins to contemporary politicians posing for photo-ops and battling for followers on Instagram, there is political power in being able to visualise and shape how political action, events and spaces look (Crilley, Manor, and Bjola 2020). Representing the political visually has become increasingly important in a digital age where communication is virtually always accompanied by some form of visualisation, on conventional websites or on social media. Imagery is immediate and affective, it can trigger emotional response, and it holds a privileged position in many cultures as firm proof: in a court of law, photographic or video evidence can be damning. For political purposes, imagery can provide a distant audience with an experienced proximity to the issue that might act as a surrogate for first-hand knowledge and 'being there' (Zelizer 2007). As such, visuals shape the political world.
There is an established and growing corpus of work on visuals and politics (see the collection by Bleiker 2018). Studies often go into depth on single, particularly impactful images, or rely on traditional media like newspapers for harvesting a range of representations in composite communication including imagery and text. Digital imagery is less coupled to its original context and lends itself to appropriation and circulation (Boudana, Frosh, and Cohen 2017; Horsti 2017). The move towards digital imagery elevates it from being a subunit or a visualisation of a written text, towards constituting a discursive field of its own, propelled by digital imagery's circulability (Mielczarek 2016). In this article, I argue that scholars interested in the role of visuals in disciplines studying global politics ought to pay attention to the shift from analogue to digital imagery, as a shift that changes not only the amount of visual data available, but also the way in which imagery contributes to discourse, owing to its ability to circulate. I therefore present a tool that could expand our toolkit to account for digital imagery's unique circulability. While I sketch ways of dealing with and analysing the data, this article is primarily centred on the data collection stage of a research design, as a step with a profound impact on the type of analyses that are possible. Gathering data is not a neutral action of building a corpus, but a conceptually important step that has both empirical and conceptual ramifications, shaping the conclusions researchers are able to reach. To bring these discussions to a practical level, I evaluate an applied attempt at expanding the interpretivist toolkit with a technique to gather visual data using search engines, assessing whether and how we might use search engines to analyse a digital visual media landscape.
This article proceeds in three steps. In the first step, I situate digital images in relation to discourse, and identify an empirical development in how imagery circulates in a digital media landscape. I argue that methodological innovation and experimentation are needed to study digital visual discourses. The second step demonstrates my argument in practice, by transparently evaluating a study on the visual representation of the Arctic, and providing a free, open-source, semi-automatic data collection tool using a Python script to scrape Bing. The third step adopts a critical stance to consider several challenges with the Bing scraper. I conclude that this technique provides a tool, albeit an imperfect one, that expands the ability of interpretive research to tap into digital visual discourse, and argue for further debate grounded in practical experimentation with novel tools.

Discourse and images
Digital images, with a couple of clicks, can be extracted and recruited to a new purpose by a different author, and will typically blend in smoothly in a new context. This weakens the bond between digital images and their immediate context, and makes it necessary to consider new tools that can pry apart written text and imagery. This is crucial for interpretivist studies that see images, in their own right, as powerful discursive units, capable of shaping the way the world looks without relying on written text as a medium. Digital imagery constitutes a technological turn that both detaches photography from the chemical processes of analogue photography, and creates a new scopic regime where framing reality visually happens through a digital device connected, via the internet, to billions of other devices (Jay 2008; Ritchin 2009, 69-73). In this section, I argue that, while both digital and analogue imagery operate within broader discursive structures, digital images differ because of their ability to circulate without their immediate context and thereby generate digital visual discourses. This relies on images' ability to carry meaning and speak directly to a discourse.
Images can carry meaning independently, and do not rely on written text to convey a message. Images speak directly, both on an emotional and an evidential level. On an emotional level, images are so immediate they often cause news outlets to attach warning labels before showing disturbing imagery, 1 while equally descriptive text does not warrant such warnings (Bleiker 2018, 9-10). Likewise, images have evidential force as a technology that can, or purports to, present a snapshot or freeze a moment in time (Barthes 1981; Danchev 2018; Zelizer 2007). A clear example of how images can speak beyond their immediate textual surrounding exists in Heck and Schlag's research on an iconic cover image of TIME magazine from 2010, showing an Afghan woman with ears and nose cut off (Heck and Schlag 2013), an image that resonated beyond the fate of one woman to speak further to debates on the legitimacy of the war in Afghanistan as well as, on a higher level, the role of gender and the body in war. Presenting an image alongside a written text is therefore more than simply illustration. It contributes to political debates by sketching what is real and what is possible, creating a range of visual frames that can represent a given event or issue, place or dynamic.
Imagery, while capable of carrying meaning, relates to other channels of discourse, relying on what Butler considers a 'certain field of perceptible reality having already been established' (Butler 2009, 64). Images interact with other channels in an immediate as well as a distant way. Political storytelling on blogs, websites, online newspapers, and social media mirrors analogue communication in the sense that images are typically presented and consumed immediately next to written text. Even on the most visual platforms, such as TikTok or Instagram, visuals are typically supported by written or spoken text. This does not mean images are only meaningful in that immediate context. Rather, as images occur and recur within discourse on certain topics, visual motifs fuse with higher, distant discourses, becoming tightly intertwined, such as polar bears and their fusing with climate change discourse (O'Neill 2022). In a more distant perspective, images are embedded within a higher discursive structure, namely all discourse on climate change (including discourse unrelated to polar bears), even if we ignore the immediately surrounding text. Within this discursive structure, we find the sum of discourse on climate change, from textual to visual, as well as oral, audial or any other channel.
Under the higher discursive umbrella, a vocabulary emerges that relates to, and constitutes, that overarching discourse: words, turns of phrase and reference points that refer to the higher discursive structure, but also a pool of stories, sounds and, most interestingly for this article, visuals. In the climate change example, terms, shorthand and concepts like 'mitigation', 'crisis', 'the 2-degree goal' or 'decarbonisation' assume some knowledge and have certain meanings attached to them that a reader must decode. The same is true for visual vocabularies: consider visuals that recur in the climate change discourse, such as polar bears, forest fires, droughts or year-on-year temperature maps. All of these require readers to decode the images based on their knowledge of the broader story. What is interesting for this article is the relation between visual motifs and the higher discourse, where a motif presents aspects, or frames, of that topic. I see motifs here as the observed content of an image, such as a polar bear, a sectoral map, a tundra landscape or an icebreaker, whereas frames are the wider political stories those motifs refer to, situated within the higher, encompassing discursive formation. While images relate to other discourses on a given topic, invoking and contributing to a certain frame is a move to shape what an issue looks like. Images relate to a frame, and do not need to be captioned or supplemented with written text to convey their message (Butler 2009, 66).

Circulation and digital visual narratives
While both analogue and digital imagery carry meaning on their own within this higher discourse, digital imagery differs because of its ability to enter exchange and circulation. Digital images, like analogue ones, are rarely presented void of a context: they appear within a social media channel, on a blog post or in a gallery on a website. Why then separate the images from their immediate context? The qualitative difference that necessitates this lies in the ease of taking one image presented in one context and circulating it in another. Images in print media are fused, even materially, to their immediate context, and require a significant time investment and resources to feed into print circulation. Digital images, once posted in one context, very easily enter circulation by being shared, copied and reused elsewhere, without their immediate context (Faulkner, Vis, and Orazio 2018; Luna 2019, 50). This can be intentional, as it is for stock photo services, or unintentional, with images sometimes circulating against the photographer's or author's will (see, for example, Horsti 2017).
An analogue picture is awkward to copy and requires major resources to circulate. Digital images can be, and often are, copied or screenshot, shared and circulated without obstacles. Although an online newspaper or government web editor might typically not copy and paste in such an offhand way due to copyright laws, the images they use are picked up and shared in informal channels, appearing, for example, on blogs, social media, as illustrations on slideshows or in memes (Miltner 2018; Reyes Enverga 2019). While this paper is not chiefly concerned with social media, the ability for images to circulate online on, inter alia, social networks and thereby become enveloped in a visual vocabulary about a topic is a driver for the qualitative difference between analogue and digital imagery. Images from a range of sources feed into this online circulation, regardless of their initial immediate context. The immediate context in which an image appears is less fixed in digital media, as it is trivial to remove images from an original context and appropriate them for another use.
As images spread and enter circulation, feeding an image into this flurry constitutes a contribution to discourse on a certain topic, and impacts which kind of visuals can become representative of a topic. Mortensen dubs this the 'canonisation' of images, which, in digital culture, relies on 'accumulative "quantity" rather than a hierarchical "quality"' (see also Butler 2009, 11-12 on moments when digital images spread online; Mortensen 2015, 90-91). This happens without relying on the authoritative voice of a photographer or editor. In practice, an image becomes part of the visual canon surrounding an event because of the image's digital life, its online traction, and its spread, not due to careful selection by a media gatekeeper and adoption by influential elites. Actors, by presenting a visual representation, contribute directly to shaping the visual vocabulary around certain topics, without media mediators. This means that gleaning media sources, a method successfully applied to contemporary visual political research (for example Hansen, Adler-Nissen, and Andersen 2021), falls short of capturing the type of digital visual discourse I am interested in here, namely the sum of visual motifs that actors present to tell a story about a given region, dynamic or issue.
As actors speak about a topic, they select certain motifs from a visual vocabulary. Actors' choices of presenting certain motifs over others and feeding those into the circulation of digital imagery constitute attempts to create a visual narrative, as each motif also relates to a higher frame. 'Visual narratives are stories that are told through visual media such as photographs, films, memes, cartoons and so on, where such media are linked together and give meaning to actors, their actions, intentions and motivations as well as the events and places they are embroiled in' (Crilley, Manor, and Bjola 2020). In a digital context, images can be circulated, appropriated and used independently of written text, and actors can contribute to shaping digital visual narratives through their own channels. Visuals can speak, relying on assumptions about the readers' ability to decode and link the image to political stories. Freistein and Gadinger find that visual narratives can mask, for example, controversial patriarchal and nationalist stances by relating them to pop-cultural fantasy (2020). Such 'explicit or implicit linkages to a larger political story' (Freistein and Gadinger 2020, 224-225, 2022) are a hallmark of visual narratives, always leaving room for the reader's interpretation and ability to decode. This interacts powerfully with a digital sphere where images are decoupled, appropriated and circulated, but nonetheless link to broader narratives. Tapping into the visual narratives certain actors employ can speak volumes about the way in which they decide to represent a topic.

Identifying and probing digital visual narratives
It is not straightforward to probe these visual narratives, as images are nearly always juxtaposed with immediate context in the form of text or situated in an image gallery. There is no shortage of studies analysing digital imagery (see Chen, Sherren, and Smit 2021 for an overview specifically on social media images; Ritchin 2009 on the material difference between analogue and digital photography), but because they rely on an iconological approach built for analogue photography, they sidestep some novel challenges the digital turn presents. Not only is building a visual corpus of imagery from online sources technically challenging, because imagery is less clear and harder to pin down grammatically and syntactically than text (see Shim 2014, 28); the digital shift also makes it conceptually necessary to find new tools suited to this different empirical ground, where images are hallmarked by their circulation. Visual representations constitute choices and intentionally curated visualisations to support digital visual narratives that (re)present a certain story of how a given theme, issue or place looks. These feed into digital discursive circulation and are less linked to a fixed context than analogue images, becoming nodal points in a discourse of how a topic, space, issue or event looks. To unpack digital visual narratives and consider the unique mobility of digital visuals, isolating digital visual narratives as stories would help identify how actors contribute to the overarching digital visual discourse.
Extracting images from the context in which they originally appear and analysing them in the context of a folder on the researcher's computer does not remove their context; it changes it. As such, the tool I present below to crack open digital visual discourse is not neutral or value-free. Following Möller, Bellmer and Saugmann, I propose that this tool can appropriate images from their original context (websites) to reproduce them in another (the researcher's desktop), enabling analysis in the new context (Möller, Bellmer, and Saugmann 2022). This appropriation means images are assessed as contributions to the circulation of digital images, not based on their initial use, namely to serve as part of composite textual-visual communication. This appropriation helps flatten the field and create an archive where images stand on their own and do not risk subjugation to immediately surrounding text: once they are pried apart from their immediate context, it is possible to assess the type of visual vocabulary they apply and feed into digital circulation. This relates, as I have discussed above, to the contours of the higher level of discourse on a given topic. In shaping and representing political events, issues or spaces visually, as I concretise shortly, visual motifs relate to given frames that cut across visual, textual and other channels of Arctic discourse. I have argued that separating digital imagery from its immediate context is necessary to probe a digital visual discourse enabled by the circulation of images with distinct meanings. The challenge is how to do this in practice, given that images almost always appear next to text in online sources.

Semi-automated online imagery collection in discursive research: an Arctic example
An answer to this challenge is that we can use one of the drivers of this digital visual discourse to resolve it, namely digital technology. I therefore now turn to give a practical example of a study using semi-automated online imagery collection using search engines to probe actors' digital visual narratives, and how those constructed the Arctic as a political space. The study was motivated by a puzzling situation in the political discourse in the Arctic region. All actors seemed to agree on several things, such as the need to manage or save the Arctic, the need for fruitful cooperation and the necessity to maintain and strengthen regional governance structures. 2 Despite this, political debates often ran hot, and discussions seemed surprisingly competitive and contentious (Busch 2021, 2-5). I therefore looked at contestation at a more fundamental level: could it be that actors were agreeing on a prescription for the Arctic without sharing a common description? Situated in a critical geopolitical framework, the study questioned how actors discursively shaped the space, and which characteristics they assigned to the region through their inscription in communications about the Arctic. The political geographic literature concerns itself with the division of the world into discrete spaces, which relies on stylised views of politics: purporting to rely on neat, fact-based delineations arrived at using a 'geopolitical gaze', a bird's-eye perspective that claims to present a scientised, supposedly neutral representation of the real world (see Busch 2021, 5-7 for a further discussion on writing space; Lefèbvre 1974; Ó Tuathail 1996).
A major part of shaping discourse around the Arctic lies in visualising and communicating what it looks like to an audience with little first-hand experience. Visual representations show subjective, territorial and theatrical views of international politics (Hughes 2007; Ó Tuathail 1996, 18, 19-43) that rely on the visual performance of political interaction. The visual component of this two-pronged study, analysing both written text and visuals, sought to understand the visual narratives actors present about the Arctic on their websites. Whereas the visual analysis was an experimental component, discourse analysis on written text is firmly established in the interpretive scholarship, making it a reflex to assign greater analytical value to written text, and to consider imagery mere illustration. The challenge was therefore to build a research design that allowed me to see the visual discourse without allowing images to be subsumed by the text surrounding them. This would allow me to assess whether actors agreed on what the Arctic looked like, or whether there was contestation between different space-shaping visual discourses.

Trawling search engines to separate text and imagery
Trawling search engines is a technique that has been applied to generate large-N data (Hunter 2016). This trawling can also serve another function: it can appropriate imagery from its original context and allow images to feature one by one, without text, on the researcher's screen. Hunter's study carried out searches on three different search engines to gather images that constitute a social-semiotic system within which Seoul is visually constructed online as a tourist destination. Hunter's aim to go beyond the images used by commercial actors and unpack a heterogeneous discursive field (Hunter 2016, 222) is functionally similar to my aim, which is to understand a digital visual discourse surrounding the Arctic shaped by a range of actors' narratives.
Practically, I carry out these searches using basic automation based on Python scripts from GitHub (Ultralytics 2020), carrying out trawling searches of Microsoft's Bing search engine (see Whiting and Pritchard 2021 for more information on trawling versus tracking searches). In essence, the tool enters a search term (e.g. site:europa.eu 'Arctic') into a search engine's (Google, Bing, etc.) image search, and collects images from the results. Those images originate in websites and nodes where the search operator (Arctic) appears within the specified domain (europa.eu), including all sublevels (e.g. europarl.europa.eu or eea.europa.eu). The script will attempt to reach the number of images specified by the researcher from a respective domain, and moves on to repeat the process for all the domains the researcher has identified. The images are then downloaded to the researcher's computer automatically, though preparatory work to compile a list of domains, as well as troubleshooting and the occasional need for manual intervention on failed searches, which I return to later, makes it more appropriate to consider it a semi-automatic process that is not entirely out of the researcher's hands.
The use of this technique in this article is intended to concretise and inform debate on using digital tools to gather visual data, and should not be read as a dogmatic attempt to propose a unified, correct way of gathering digital visual data. Rather, it is a tool selected specifically for its relatively bare-bones design and resulting transparency, but also because its functionality makes it suitable to serve as a blind between text and imagery. Other scrapers, such as the Digital Methods Initiative's Image Scraper, scrape a particular URL (that is, a particular node on a website) and output a tabulation of the images on that specific page alongside alt texts and links to the image (Digital Methods Initiative 2022). The trawler search I use leverages search engines to search not only a specific URL, but entire web domains, and presents the most relevant imagery featuring alongside the search operator 'Arctic' within this whole domain, therefore capturing the span of communication by the actor owning the domain. Also importantly, it downloads the imagery directly to the researcher's computer, which, as opposed to visiting the websites to see the images, appropriates them to a simple folder structure, a setting where the images are veiled from their immediate written context and therefore allow the researcher to look for visual narratives. I will now elaborate how to prepare and carry out this Bing scraper search. To encourage an entirely transparent and practice-driven debate, full instructions, including the installation of supporting software, are provided in this article's supplemental data, for other researchers to build on and engage with.
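To make the shape of such a trawling query concrete, the following is a minimal sketch of how a domain-restricted image search term could be assembled and URL-encoded for each domain on the researcher's list. It is not the script used in the study: the function names and the domain list are hypothetical, and only the query pattern and the Bing image search address are taken from this article. The sketch builds URLs only and does not download anything.

```python
from urllib.parse import quote_plus

# Base address of Bing's image search, as it appears in this article's example URL.
BING_IMAGE_SEARCH = "https://www.bing.com/images/search?q="


def build_query(domain: str, term: str, exact: bool = False) -> str:
    """Return a domain-restricted search term, e.g. 'site:europa.eu Arctic'.

    With exact=True the term is wrapped in double quotation marks.
    (Hypothetical helper, for illustration only.)
    """
    phrase = f'"{term}"' if exact else term
    return f"site:{domain} {phrase}"


def build_search_url(domain: str, term: str, exact: bool = False) -> str:
    """URL-encode the query and append it to the image search address."""
    return BING_IMAGE_SEARCH + quote_plus(build_query(domain, term, exact))


# Hypothetical list of actor domains compiled by the researcher.
domains = ["europa.eu", "circumpolar.org"]
for domain in domains:
    print(build_search_url(domain, "Arctic"))
```

Keeping the query construction separate from any downloading step makes it easy to inspect, and to record in a methods appendix, exactly which search term was issued for each actor.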

Using Bing to divorce text and imagery
To carry out this type of search, the researcher must suit the technique to the objectives and framework of the study. The first questions should deal with who and what: whose representations of what am I gathering data for? In my case, I sought to understand how the Arctic region (what) was represented by a list of 51 actors based on the membership and observership of the Arctic Council, plus a handful of businesses and NGOs (who). These actors include everything from states to NGO/INGO observers, from indigenous organisations to UN bodies. They make up an 'inner circle' of actors whose discursive constructive power is underpinned by having their activity, involvement and expertise in the Arctic legitimised either by affiliation with the Arctic Council or ongoing business or advocacy activity. As Arctic insiders, they claim expertise and an ability to speak for the Arctic, and to show the Arctic to their viewers through visual means.
The researcher needs to compile a list of the web domains of the actors considered as contributors to the discourse on a given topic. There may be a lot or very little data: there were hundreds of relevant photos of the Arctic on the domains of large actors like the European Union (europa.eu, a large domain whose hierarchy includes a vast range of agencies and bodies), but few on the smaller websites of actors with limited resources, like the Circumpolar Conservation Union (circumpolar.org). Therefore, the researcher needs to make a choice on the number of images to attempt to gather from each actor, especially if comparison between actors is an objective.
This point requires attention, as the circulation of images on social media is part of the rationale behind paying attention to digital visual discourses: why do I stress websites, which seem like the web technology of yesteryear? This decision is made because of the emphasis on actors' choice of certain visual narratives, a choice that can be limited by third parties. Social media may have stricter technical limitations to the size of an image, or to certain formats, than websites managed in-house. More importantly still, the content of imagery is also regulated by social media platforms' terms and conditions. This can result in cultural bias that does not allow actors to freely choose visual forms of expression. This is a real concern in the Arctic, where images portraying culturally significant activities, like whaling, seal clubbing or wearing polar bear skin clothing, risk being either removed or hidden behind warnings of graphic or offensive content. Websites are actors' most autonomous mode of communication to contribute to digital visual discourse, allowing actors to be their own editors, unhindered by social media platforms' sometimes restrictive rules.
Having compiled a list of websites, the researcher needs to get technical and prepare software like Git, Python, Google Chrome and Chromium. The article's supplemental data gives a full overview of these. This will allow the computer to download and run a script called 'google-images-download'. Note that the script is called 'google-images-download' even though it scrapes Bing for images. I refer to it here as a Bing scraper. This foreshadows a discussion I return to later, namely the fluctuating display choices by commercial search engines, and the haphazard manner in which scripts are written or updated. After feeding a simple syntax into a command prompt, the process runs, gathering images from the Bing search. The figures below show two views of the process, running an example query for images on the domain europa.eu (see https://www.bing.com/images/search?q=site%3aeuropa.eu+Arctic). Figure 1 shows the script running this search, while Figure 2 shows a folder with images downloaded and stored locally.
There will naturally be a need to clean data and often also to carry out several searches after discovering technical hiccoughs. In my example, it was necessary to place the search operator 'Arctic' in quotation marks to filter out results containing the term 'Antarctic'. This process has challenges and requires refinement and, like any other research technique, requires a researcher to think on their feet to overcome issues like the Dutch government website yielding few results because most images were embedded in PDF format, which the script did not support, or the Chinese search failing because the script did not support Chinese characters. Both required me to carry out the process manually, downloading images from the Bing search interface. Likewise, it is difficult to predict the genre of the images: my attempt to suffix my searches with '-map' is an example of this. In some cases, like the EU and the World Wildlife Fund (WWF), this yielded a large catalogue of maps to analyse, but many smaller actors yielded few or no results.
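The consequence of choosing a per-actor image target can be sketched as follows. This is an illustrative sketch only: the function name, the file names and the target of three images are hypothetical and chosen purely for readability. It caps each actor's images at a common target and reports which actors fall short of it, which is the information needed when deciding whether actors can be compared on equal terms.

```python
def cap_per_actor(images_by_domain, limit):
    """Keep at most `limit` images per domain and report shortfalls.

    Returns (capped, shortfalls), where `shortfalls` maps each domain that
    yielded fewer than `limit` images to the number of images missing.
    (Hypothetical helper, for illustration only.)
    """
    capped = {domain: images[:limit] for domain, images in images_by_domain.items()}
    shortfalls = {
        domain: limit - len(images)
        for domain, images in capped.items()
        if len(images) < limit
    }
    return capped, shortfalls


# Hypothetical search results: a large domain and a small one.
results = {
    "europa.eu": ["eu_001.jpg", "eu_002.jpg", "eu_003.jpg", "eu_004.jpg"],
    "circumpolar.org": ["ccu_001.jpg"],
}
capped, shortfalls = cap_per_actor(results, limit=3)
print(capped)
print(shortfalls)
```

Reporting shortfalls explicitly, rather than silently accepting smaller samples, keeps the imbalance between well-resourced and small actors visible at the analysis stage.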
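The 'Antarctic' problem can be illustrated locally: the word 'Antarctic' contains 'Arctic' as a substring, so any naive text match on 'Arctic' also catches it. The sketch below shows this with plain Python string matching and a word-boundary regular expression. It is an assumption-laden illustration of the pitfall for any textual metadata a researcher might hold (for instance page titles, which are hypothetical here); it makes no claim about how Bing itself matches search terms.

```python
import re

# Matches 'arctic' only as a whole word, ignoring case.
ARCTIC_ONLY = re.compile(r"\barctic\b", re.IGNORECASE)


def mentions_arctic(text: str) -> bool:
    """True if `text` contains 'Arctic' as a standalone word."""
    return bool(ARCTIC_ONLY.search(text))


# Hypothetical page titles used purely for illustration.
titles = ["Arctic Council meeting", "Antarctic research station"]
for title in titles:
    naive = "arctic" in title.lower()  # substring match: also true for 'Antarctic'
    print(title, naive, mentions_arctic(title))
```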

Analysing appropriated digital imagery
This process yields a set of images, in a folder tree sorted by domaintherefore identifiable only by which actor presented the imageand removed from their original context (see Figure 2).The images have been appropriated from their context alongside text or description to be analysed only based on their motif and the researcher's coding scheme, to facilitate a probe of the visual narratives actors tell.While my emphasis here is on the data collection step, I will briefly illustrate what types of analyses this data collection enables.What could such a coding scheme look like, and where can it come from?I assessed both text and imagery around three thematic (environmental, economic and geostrategic, see Busch 2021) and two spatial (local and global, responding to, inter alia, divergence in research agendas between the 'global' and the 'exceptional' Arctic.See Dodds 2018; Finger and Heininen 2018; Heininen and Finger 2018; Käpylä and Mikkola 2019) discursive categories or frames that have been prominent in Arctic literature, both academic, policy-oriented and in actors' own communication.These discursive categories, recalling my previous discussion on the relation between text and imagery, were considered major frames around which large parts of Arctic discussions textual and visualrevolved.These frames constitute five distinct and prominent storylines to construct the space.With the frames identified, an exploratory pilot study led to a handful of archetypical motifs that make up the core vocabulary of visual communication, producing the following coding scheme (Table 1).
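As a minimal sketch of how a folder tree sorted by domain could be turned into a list ready for coding, the function below walks one level of subfolders and records the image files found in each. The folder layout (one subfolder per domain), the function name and the set of file extensions are assumptions made for illustration; the actual output structure of the scraper is documented in the supplemental data rather than here.

```python
from pathlib import Path

# File extensions treated as images in this sketch (an assumption).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".webp"}


def index_by_domain(root):
    """Map each domain subfolder under `root` to its sorted image file names.

    Assumes one subfolder per domain, e.g. root/europa.eu/0001.jpg.
    (Hypothetical helper, for illustration only.)
    """
    index = {}
    for folder in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        index[folder.name] = sorted(
            f.name
            for f in folder.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return index
```

Because the index records only the domain and the file name, it preserves the property that matters for the analysis: each image is identifiable by the actor who presented it, and by nothing else.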
What is political about a folder full of images, and what political dynamic can it help us understand?Visual narratives show a certain understanding of the topic at hand, as well as an attempt to shape the space according to a certain vision.My research design took an iconological approach that emphasised how different actors' conceptions of the space contested and interacted with others' conceptions.Each image was assigned only one code, allowing me to paint a profile of each actor, by the number of images assigning each frame representing the weight given to each.In the Arctic, visual representations matter for how audiences understand a region about which they have little knowledge.Presenting the region as a pristine, snowy landscape leaves the audience with a different sense of what the Arctic consists of than representing its urban centres or its hydrocarbon resources.These space-shaping discourses are political, as actors contest what the facts on the ground are, and shape a different type of political space.Secondarily, this may give fertile ground to argue political movesit is easier to argue for environmental protection in a region presented and understood as a pristine and sensitive ecosystem, while the need for exploiting natural resources appears more reasonable if the region is understood as one of communities that need economic activity to maintain demographic stability.As an actor-centric design, the aim of the Arctic study was to contrast the different narratives that emerged in imagery among different actors, and thus built on an assumption that discourses and their political messages rely on who expresses them.
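The profiling step described above, one code per image and a per-actor weighting by count, can be sketched as a simple tally. The sketch assumes that each image's single code is one of the five frames named in this article; the function name and the example codes are hypothetical and do not reproduce data from the study.

```python
from collections import Counter

# The three thematic and two spatial frames named in the article.
FRAMES = ("environmental", "economic", "geostrategic", "local", "global")


def actor_profile(codes):
    """Return the share of an actor's images assigned to each frame.

    `codes` holds exactly one frame name per image.
    (Hypothetical helper, for illustration only.)
    """
    counts = Counter(codes)
    unknown = set(counts) - set(FRAMES)
    if unknown:
        raise ValueError(f"Unknown codes: {sorted(unknown)}")
    total = sum(counts.values())
    return {frame: (counts[frame] / total if total else 0.0) for frame in FRAMES}


# Hypothetical codes for four images from one actor.
example_codes = ["environmental", "environmental", "economic", "local"]
print(actor_profile(example_codes))
```

Expressing the profile as shares rather than raw counts allows actors with very different numbers of images to be set side by side, although a share computed from a handful of images should be read with corresponding caution.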
The visual component of the study showed a diverse and contested visual field where visual narratives largely centred around a handful of discursive nodal points. In broad strokes, environmental motifs such as polar bears, ice sheets and open landscapes were prominent especially among NGOs, scientific organisations and some states; other states were more inclined to emphasise conferences, handshakes and posing in front of flags; businesses and some states were eager to show hydrocarbon extraction and shipping; and indigenous organisations were more likely to show traditional livelihoods and scenes from everyday lives. This aligns in some ways with my textual analysis (Busch 2021), where environmental motifs were prevalent, but the discursive field seemed to be crystallising around a handful of archetypes with different perceptions of what type of space the Arctic is. Interestingly, the visual analysis and textual analysis had similar macro-level patterns, such as the prominence of environmental narratives. On an actor-specific level, however, the prevalent visual narrative corresponded to the prevalent textual constructions only half of the time. This suggests that visual discourses, divorced from textual context, reveal different discursive dynamics than a singularly text-focused approach or a mixed approach where imagery is relegated to being subservient to written text. The data collection tool described here is only one step in a research design, but because of its ability to pry apart visual and textual discourses for the sake of examining visual narratives, it enables different research designs than visual analysis that treats the image only as a component of composite communication with text. In studies collecting visual data in this way, it is possible to observe discursive tensions within the digital visual discourse. Depending on the objective of the research, several political and discursive fault lines may be of interest. To name a few research directions that are enabled by divorcing images from their immediate textual context and appropriating them to a flatter structure in the data collection phase, there might be contestation (Arctic examples in parentheses):
• between specific frames (are economic and environmental frames incompatible and presented in a contesting manner, or are they compatible?),
• between the nature of actors (do the Arctic states' visual representations differ from those portrayed by NGOs or indigenous organisations? What is the effect of those differences?),
• on the perceived tone of the image (do economic framings show more positive motifs than environmental ones?),
• between the interacting types of frames (are actors who lean towards local framings more prone to economic motifs while globally minded ones tend towards environmental ones?), or
• between the genres of images (do abstracting or schematic genres like maps differ from photography, or even cartoons, in the frames they invoke?).

Critical reflections
The purpose of this tool is to provide a means to probe a digital visual discourse, or a visual vocabulary around a given topic, in this case the Arctic region. The first part of this reflective section deals with the extent to which the Python-automated Bing scraper was able to achieve this, on both a practical and methodological level. The second part discusses the impact of search engines and the link between visual representations and the empirical realities they present, while the third part opens the discussion to critically consider the role of search engines. While I engage with some of the potential criticism here, the intention of this reflection is neither to deflect critique, nor to provide an exhaustive list of issues, but rather to encourage debate by transparently highlighting challenges I faced and outlining some paths to start addressing these by proposing a handful of future research agendas.

Separating images from written text
Could the Bing scraper achieve its goal, to allow researchers to separate images and text? On a banal level: does it work? Using the Bing scraper was not a smooth process, and was at times somewhat precarious. Practical challenges, such as different display modes in search engines and on actors' own domains, hindered searches and at times even rendered the scraper inoperative. The fact that the script applied to scrape Bing images is called 'google-images-download' is telling. The script was originally developed for Google Images searches, and my initial plan was to rely on Google Images.
During the planning phase of the study, the script became inoperative after Google changed how image search results are displayed, and was subsequently rewritten for Bing, which led to a pragmatic shift. Relying on GitHub users writing accessible and reliable scripts whenever a design change causes technical issues makes the automation potentially unreliable. This became clear in the writing-up of this article for submission.
Changes to one of the frameworks the scripts run on, Selenium, and how it reads input data meant that the google-images-download script was no longer operative at the time of submission. The voluntary and sporadic manner in which many scripts are written and maintained means it remains unclear when the script will be patched and updated. This has a silver lining, however, as the tools do not require expensive, specialised software, which means this is an approach that is available to researchers regardless of affiliation or the size of research budgets.
Things can, and do, go wrong. Antarctic motifs showed up in a search about the Arctic, images embedded in PDF format could not be extracted and the scraper was unable to handle Korean and Chinese scripts. In an ideal scenario, it would be possible to follow a stepwise guide and be confident of success, but failures and changes to the tools or source material mean a researcher almost inevitably needs to do troubleshooting and resolve issues through awkward workarounds. This raises the threshold for the Bing scraper's usability for less tech-savvy researchers. Does that mean the scraper should be dropped wholesale? No social science methods follow a straight line. Carrying out interviews, analysing documents or conducting participant observation are pillars of qualitative research, but they almost always require researchers to adjust according to emerging challenges. That does not discredit those methods, but it does highlight that research is a bumpy road. For the Bing scraper, the issues are more visible, perhaps due to the novelty of the tools, but aspects remain useful despite obvious technical challenges.
Beyond the purely technical, other issues and questions emerge. The Bing scraper is not a method, but a data collection tool. It is nonetheless constructed with a methodological purpose, namely enabling research designs that treat digital visuals as a strand of discourse worth investigating apart from immediate textual context. The Arctic study and its research design show a contradiction in the relationship between text and imagery. While I have discussed how both text and imagery relate to a higher umbrella discourse that surrounds a given topic, event or region, the aim of the Bing scraper is to divorce text and imagery for the purpose of researching digital visual discourse. Despite this aim, my research design nonetheless leaned, in at least three places, on text to make sense of the discursive power of images. 3 Firstly, the frames and the coding scheme I identified for Arctic discourse came from written text: they were the result of reading written academic and policy literature on the Arctic, not the result of a familiarity with the visual representations of the Arctic, which only emerged while carrying out the study. The alternative route would be to allow the frames to emerge through a grounded approach, recognising shared motifs and weaving these into visual storylines.
Secondly, search engines work based on syntactic prompts. While it is true that search engines have a reverse-search function that can show where a certain image appears, as well as a rather opaque 'similar images' function, the only way to use a search engine to glean imagery from a given URL is to specify this in written text, and to use a textual signifier, such as 'Arctic', to gather relevant imagery.
Thirdly, the present stage, namely analysing, presenting and publishing research results, relies on writing. Academic publishing is overwhelmingly centred on texts. While there are incremental moves towards innovations such as allowing visual abstracts, lenience towards including images and willingness to provide supplemental data repositories that can contain multimedia files, the analysis itself and the genre of a research article require the researcher to square a circle by forcing visual discourse back to a textual level. It needs to be situated in a highly textual scholarly literature, and the format of an academic article requires elaboration in written text (see, however, Callahan 2015; Särmä 2016). What I have done, then, is not to avoid text entirely, as written text and imagery are fundamentally linked through their relation to an overarching discourse. Rather, I have kept text at arm's length for the duration of a phase of analysis, introducing a blind that covers the immediately surrounding text.

Images, search engines and the link to the empirical world
This insertion of a blind between image and immediate context raises another question that is increasingly pressing in an era of easy photo manipulation software, deepfakes and even AI-generated images: are images still representations of the real world? Or am I, with the Bing scraper, analysing manipulated images designed to convey a certain message or skew the truth, which can only be understood next to written text? Performing visual representation and achieving emotional proximity taps into what Barthes dubs its 'evidential force' (Barthes 1981, 89; see also Danchev 2018 on witnessing). At face value, this does not sound like a uniquely digital issue: even the Stalinist Soviet Union was systematic in its manipulation of images, both contemporaneous and historic (Dickerman 2000, 141), with the tools it had available. With the accessibility of image manipulation software, an image might always be manipulated. These are not fundamental challenges to a discursive approach that sees visual representations as attempts to tell a story, as the emphasis is on the image's relation to political discourse, not to an objective empirical reality.
A different reading of Barthes is that this evidential force is an inert, narrative quality of photography: regardless of whether it captures an empirical reality, the picture is an attempt to tell a story and back it up with hard proof. Images support a narrative. Edited or not, presenting one image over another constitutes a choice about what can represent an issue, event or place. It is nonetheless good to pay attention to the question of authenticity and manipulation, as retouched versions of the same image can convey different meanings. This occurred in my study (Figure 3).
A Creative Commons image of the Russian research icebreaker Kapitan Dranitsyn, originally taken on an expedition to research the Arctic climate and ocean, was reproduced by Bellona, a Norwegian environmental NGO, in a retouched format with higher contrast, making the ship's light exhaust appear nearly black (Bellona 2019; Dunn 2006). This image was used to advertise a panel debate on Arctic fuel and the Northern Sea Route. While it is unknown who edited the image and for what reason, the choice to include it constitutes a representation of a reality where ships' exhaust is a little darker, a little more menacing and presumably a little more polluting. Adding a method to flag images in various states of retouching would be a useful addition to this tool.
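One way such a flag might work, offered purely as an assumption and not as part of the tool used in the study, is to compare a coarse perceptual fingerprint of two images: if the fingerprints match while the pixel values differ, the pair may be retouched versions of the same picture. The sketch below uses a simple average hash on small greyscale grids of equal size; the function names and the toy pixel grids are hypothetical, and in practice the grids would have to be produced by decoding and downscaling the image files with an imaging library.

```python
def average_hash(pixels):
    """Return one bit per pixel: 1 if the pixel is brighter than the grid's mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equally long hashes differ."""
    return sum(a != b for a, b in zip(hash_a, hash_b))


def looks_retouched(pixels_a, pixels_b, max_distance=0):
    """Flag two same-sized grids that differ in pixel values but share a fingerprint."""
    if pixels_a == pixels_b:
        return False  # identical pixels: a plain copy, not a retouched version
    distance = hamming_distance(average_hash(pixels_a), average_hash(pixels_b))
    return distance <= max_distance


# Toy 2x2 greyscale grids (values 0-255), invented for illustration only.
original = [[100, 150], [120, 130]]
higher_contrast = [[75, 175], [115, 135]]  # same motif, contrast stretched
unrelated = [[200, 10], [20, 220]]

flag_contrast = looks_retouched(original, higher_contrast)
flag_unrelated = looks_retouched(original, unrelated)
```

A flag of this kind can only point the researcher to candidate pairs; whether a contrast change matters for the narrative an image tells remains an interpretative judgement.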
Another set of challenges relates to the technique's purpose: the effort to use search engines to black-box the imagery's immediate context and thereby be able to study visual discourses in their own right. This creates a situation where data collection is hidden from the researcher. Search engines are opaque. A typical digital user will experience their online activity as basic input-output, between themselves, their user interface, and the internet, not reflecting on the opaque structures enabling and influencing their activity (see Bernal 2020, 35-37). In the user's perception, the scopic regime of the web browser is linked to smartphones and computer screens, devices that feel directly connected to reality: a user looking up pictures of a restaurant before going there will rarely reflect on the technological procedures that lead to the user receiving the exact images they see.
Commercial search engines are an interface between the real world and visual representations, as they use sophisticated algorithms to sort and decide the order of images returned by a query. These algorithms are not in the public domain and are treated as business secrets. Major actors only share the principles by which results are sorted (for example, that they are sorted by relevance and recency); the specifics of each search engine's algorithm and what is considered 'relevant' remain unclear. This would be problematic for a study analysing domains with very large numbers of images, as relevance sorting might push other relevant imagery out of the requested number of images. The images may well be sorted by relevance, but it is unclear whether this definition of relevance renders the images captured representative or meaningful, or whether hundreds or thousands of images excluded due to opaque sorting orders may be relevant. On the flipside, an image featuring higher in a search result, by virtue of being more visible and accessible, is likely to be seen and even shared by more people or actors, and as such is likely to be a prominent discursive representation of the issue, place or process at hand. The process might not yield a perfect panorama of all visualisations, but it allows researchers to see like a search engine.

Search engines as gatekeepers
Introducing search engines as an interface in research comes with its own challenges, from technical to ethical ones. Mirroring the specific challenges of the Bing scraper, internet research broadly is precarious, something a research design seeking to probe digital imagery must acknowledge. Problematically for reproducibility, search engines and websites are fundamentally in flux, with changing contents and poor or no archiving (Zuev and Bratchford 2020, 42). The debate on 'link rot' and the lack of permanence and archive practices on the web are relevant here. Content may be moved or deleted, access privileges changed, and paywalls may be established, causing issues for any attempt at reproducing data collections. While this could in principle be solved by researchers and publishers providing open data repositories, this cascades into difficulties of copyright and reproduction.
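Where the images themselves cannot be shared for copyright reasons, a partial measure might be to publish a manifest of the collected corpus instead. The sketch below is an assumption about how this could be done, not a feature of the scraper: it walks a download folder organised by domain, as in Figure 2, and records a checksum for each file together with a collection timestamp. The function name `build_manifest` and the manifest layout are hypothetical.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def build_manifest(root):
    """Describe every file below `root` without reproducing the files themselves.

    Assumes the first folder level below `root` is named after the actor's domain.
    """
    root = Path(root)
    files = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root)
            files.append({
                "domain": relative.parts[0],
                "file": relative.as_posix(),
                "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
            })
    return {
        "collected_at": datetime.now(timezone.utc).isoformat(),
        "files": files,
    }
```

Calling `build_manifest` on the download folder and saving the result, for example with `json.dumps`, would let later researchers check whether an image they retrieve is byte-identical to one in the original corpus, although it cannot restore content that has since disappeared from the web.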
Another replicability challenge lies in search engines' opacity. Since sorting orders aiming at producing relevance are dictated by complex, flexible algorithms, search results vary by several factors. They will vary over time as new data appears and other data is no longer available online, arguably a problem that is shared by most forms of research that use online material. More unique to this approach using search engines are personalisation and search engine optimisation (SEO) (Fernando, Du, and Ashman 2014; Google n.d.; Martin-Martin, Orduna-Malea, and Harzing 2017). Researchers attempting to replicate a study might get different results, partly owing to the fluctuating data basis, but also based on who they are. Personalised searches mean results may potentially be sorted on calculated relevance to the individual researcher, based on anything from geographic location to demographic status or previous online activity. A researcher whose hobby is cycling, and whose web history reflects this, may attempt to replicate the searches above and see images of the Arctic Race of Norway cycling race rather than imagery of polar bears. Good research practices, such as using a dedicated research browser, go far to mitigate person-specific personalisation (see Digital Methods Initiative 2015), although geographic factors remain.
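How far two collection runs diverge can be made concrete with a simple overlap measure. The sketch below, which is an illustration under assumed inputs rather than a procedure used in the Arctic study, compares the sets of image identifiers returned by two runs of the same query and reports their Jaccard overlap alongside the images unique to each run. The function name `compare_runs` and the file names are hypothetical.

```python
def compare_runs(run_a, run_b):
    """Compare two collections of image identifiers (for example URLs or file names)."""
    set_a, set_b = set(run_a), set(run_b)
    union = set_a | set_b
    # Jaccard overlap: shared images divided by all distinct images in both runs.
    overlap = len(set_a & set_b) / len(union) if union else 1.0
    return {
        "jaccard": overlap,
        "only_in_a": sorted(set_a - set_b),
        "only_in_b": sorted(set_b - set_a),
    }


# Invented results for the same query run from two different browsing profiles.
run_research_browser = ["polar_bear.jpg", "ice_sheet.jpg", "icebreaker.jpg"]
run_cycling_profile = ["ice_sheet.jpg", "icebreaker.jpg", "cycling_race.jpg"]

comparison = compare_runs(run_research_browser, run_cycling_profile)
```

A set-based measure ignores ranking, so it would suit designs that keep every image returned for a domain; a design that relies on the order of results would need a rank-sensitive comparison instead.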
SEO and personalisation issues are not particularly problematic for the Arctic study, as it starts from a pre-defined list of actors. SEO and personalisation only affect the internal relevance ranking of each actor's images. It could, however, be a pitfall for studies using search engine visibility as a comparative measure of relevance or prominence. This could risk the researcher unwittingly ending up measuring SEO budgets. The actual impact of personalisation when employing a scraper is unclear, given that it trawls domain by domain. Giving up the actor-centric focus risks personalisation and SEO having a major impact. This would be an interesting site to start considering what digital reflexivity may look like, where the researcher's online footprint may affect the research object and results. An effective design to unpack this could set up identical, simultaneous searches on different devices with different browsing histories in different geographic locations and juxtapose the data.
Some ethical concerns arise when using visual data gleaned through search engines. The first concerns informed consent (Wiles, Clark, and Prosser 2011, 693). Arguably, using images in the public domain for research is no different from using texts. However, the consent of the persons in images does matter. This issue is blurry: few would disagree with analysing images of high-level politicians whose office and international profile make them public persons, but the issue may look different for others. The fluctuating and ever-changing internet makes it impossible for the researcher to approach all persons depicted for their informed consent (Zuev and Bratchford 2020). While some of the responsibility lies with the photographer, agreeing to have one's likeness reproduced for news or promotional material might not translate into agreeing to be placed under academic scrutiny.
Another ethical issue revolves around the independence of academic inquiry, and the privileged position of technology companies. Using a commercial tool not designed for research purposes is a choice with consequences. First, the researcher loses oversight over what data is included, as opaque search engines blur inclusion and sorting criteria. Secondly, the researcher needs to consider their own role as a user/customer of the search engine and the role and motivation of commercial actors that are certainly not value-free. Alphabet and Microsoft, the companies behind Google and Bing, are giant corporations, and hold economic and political influence. As political scientists, we must be careful not to bolster this influence, or to assume that these are neutral mediators without interests of their own. That interest is obviously profit, but recent years, not least the Cambridge Analytica scandal, have shown the political power of amassed information on online populations.
The companies behind search engines are actors, and the images extracted by search engines should not be considered a reflection of reality. Any application of the Bing scraper or other search engine-reliant tools should consider this as an intervening variable, and acknowledge that the perspective gained from collecting data from a search engine is just that: a particular perspective. This is valid beyond the Bing scraper and beyond visuals. How do search engines contribute to shaping discourse, and are there reasons to be sceptical of that portrayal? The fact that results are generated by opaque algorithms moves this influence away from human intervention, allowing software to structure discourse. Search engines are actors, which means the 'blind' between the image's original context and the researcher's archive is not neutral.
How do sorting algorithms define the parameters of a visual discourse? This is a gap with much unfulfilled potential. Future projects could start chipping away at some of the issues raised here. A study comparing the imagery returned by search engines with a participatory photography project could highlight how people and search engines differ in their visual representations, while a historic assessment of how an issue is visually represented through time could contextualise the impact of search engines on visual narratives, and whether they constitute a new type of interface or are comparable to other media gatekeepers. Similarly, studies ought to examine the influence of other types of visual discourse, which may be contrasted with this photography-centric design, to also capture visual discourses stemming from other (digital) media. In the Arctic example, future research could evaluate whether visual representations of the Arctic or the North in other digital visual media like TV series, films or video games contribute to new visual vocabularies or build on similar motifs and narratives as photographic images.

Conclusion
This article has assessed a technique for gathering visual data semi-automatically, using basic scripts to scrape commercial search engines' results. I have argued that empirical developments, namely the move from analogue to digital formats and the subsequent shift from editorial media towards in-house communication, constitute a qualitative shift in the role of visuals in political communication. As digital visual narrative acts become mobile and reproducible vocabularies, it is worthwhile to try to isolate digital visuals as a channel of discourse, investigating how that links up to higher, societal discourses beyond immediately surrounding text.
My practical example showed a way of inserting a screen in the data collection and tapping into visual discourse without reducing digital images to subcomponents of their immediately surrounding text. Here, this is achieved by running a script leveraging search engines as powerful resources to glean imagery from specified domains, with specified search terms. By outlining one technique to gather data, the goal of this article is not to forcefully endorse direct application of this exact technique, but rather to call for a critical, practice-informed debate around this and related approaches. There are undeniably pitfalls and problems with this approach. These must be evaluated against the benefits of the technique, such as the ability to divorce text and images for the purpose of a serious consideration of digital visual discourses, its power as a corpus-building tool and its accessibility, giving individual researchers, regardless of affiliation, budget or research team, the ability to collect visual data without prohibitively expensive specialist software. As such, it offers a powerful data collection tool that is attuned to research projects emphasising the empirical uniqueness of digital imagery.
It is crucial for research on visual discourse to keep up with the rapid changes happening to the source material, and to pay attention to the circulability of imagery in an age of direct online communication, social media sharing and memes. As images enter circulation and certain motifs carry certain meanings, the contours of a visual discourse appear. The claim that visuals ought to be researched as a stream of discourse in their own right, one that can relate to higher, societal discourses without going through the interface of text, should be considered as part of the discussion on how to deal with digital visuals, since these are mobile and appropriable to a degree that makes them qualitatively different from analogue imagery. Based on the practical example, this tool provides a starting step to collect data that allows an assessment of digital visual discourses enabled by circulation, which in turn allows us to capture a different (not necessarily better or more representative) discursive dynamic than a text-based or blended approach would. Using search engine scraping as a data collection tool allows us to probe digital visual discourse in a way where images are taken seriously as narrative acts. Automatic tools and interpretative research may be strange bedfellows, but critical engagement with this and similar techniques that leverage search engines to gather visual data might move the field in new and fruitful directions. The main thrust of this article, rather than a call to apply the Bing scraper as a blueprint, is to engage practically with unfamiliar and experimental tools to assess what benefits they can bring to interpretive research. By assessing techniques and methods based on practical experience and linking these to more developed research designs, it is possible to generate rigorous tools to add to the discourse researcher's repertoire. This may equip us to better traverse a changing empirical landscape and yield brand new insights about digital visual discourse.

Figure 1. A command window running the google-images-download script; note the errors highlighted in yellow. Screenshot by the author, 29 September 2021.

Figure 2. A Windows Explorer folder view displaying downloaded visual data. Screenshot by the author, 28 September 2021. 4

Table 1. Coding scheme for Arctic visuals.