The value of unsolicited online data in drug policy research

We alert readers to the value of using unsolicited online data in drug policy research by highlighting web-based content relevant to drug policy generated by distinct types of actor: people who consume, supply or produce illicit drugs, online news websites and state or civil society organisations. These actors leave 'digital traces' across a range of internet platforms, and these traces become available to researchers to use as data – although they have not been solicited by researchers, and so have not been created specifically to fulfil the aims of research projects. This particular type of data entails certain strengths, limitations and ethical challenges, and we aim to assist researchers in understanding these by drawing on selected examples of published research using unsolicited online data that have generated valuable drug policy insights not possible using other traditional data sources. We argue for the continued and increased importance of using unsolicited online data so that drug policy scholarship keeps pace with recent developments in the global landscape of drug policies and illicit drug practices.

Introduction: drug policy research in the digital age
For decades, researchers contributing to drug policy scholarship have used the internet to reach people who consume or supply illicit drugs, a population formerly predominantly 'hidden' but becoming more accessible to researchers as internet-facilitated communications have spread. While it is now commonplace for researchers to use the internet for recruiting research participants and hosting online surveys or interviews (for a review, see Miller & Sonderlund, 2010), less commonly known or exploited by drug policy researchers is the wide range of online data sources that researchers do not solicit themselves.
Because more and more of our lives are enacted in online locations – e.g. in the form of social media or online buying and selling – the 'digital traces' of these online activities become available to researchers to use as data (Décary-Hétu & Aldridge, 2015). The distinguishing feature of these digital traces is that they have not been solicited by researchers, and so have not been created specifically to fulfil the aims of research projects. In this sense, digital trace data is akin to what social researchers have long referred to as 'naturally occurring' data (Golato, 2017) or 'unsolicited' data (Robinson, 2001).
Although many readers will be aware that unsolicited online data has been used by some drug researchers, the under-use of this valuable data source provides the impetus for this article. Our aims are threefold. First, we begin by alerting readers to the full range of available online data sources with drug policy relevance and so encouraging researchers to consider whether these data could function as primary or supplementary data sources in their own research. To this end, we devised a typology of content generated by different actors across various online platforms distinctly relevant to drug policy: people who consume, produce or supply illicit drugs, online news websites and state or civil society organisations.
Our second aim is to demonstrate the value of unsolicited online data in generating fresh drug policy insights. Rather than systematically review the literature using these data, we instead elected to showcase a small number of published studies. We selected these studies from the published literature not for being typical or representative of studies with similar methods and data, but because they illustrate valuable drug policy insights not possible using more traditional data sources.
Drug policy-relevant digital trace data may be under-exploited by researchers, but the reasons for this are unlikely to result exclusively from researchers' unfamiliarity with the range of available drug-relevant content. Researchers may simply be comfortable using more familiar, tried-and-tested data collection methods like surveys or interviews, or may perceive that their own technical or skill deficits create a barrier to accessing and using unsolicited online data. Our third aim, therefore, functions to demystify and reassure. While some published studies have employed research designs requiring some specialist skills (e.g. Soska & Christin, 2015), most do not. It is unnecessary for researchers to have or develop highly specialised skills, tools and methods to take advantage of unsolicited online data beyond those basic to internet use. The published studies we showcase rely predominantly on approaches to quantitative and qualitative coding and analysis already familiar to most social researchers. Our article therefore aims to equip readers with an understanding of the strengths and limitations, alongside the particular ethical and technical challenges, of using this type of data.

Who generates drug-related online data?
In this section, we identify a variety of individuals, groups and organisations whose illicit drug-related activities generate online information that researchers can use as data. These include people who take, produce or supply illicit drugs, media outlets and state and civil society organisations. We provide examples of the forms these data can take, and highlight their utility and potential for drug policy researchers.

People who take illicit drugs
When people who take illicit drugs form online communities for public – but typically pseudonymous – discussion of drug-related issues, the digital traces of their discussions provide researchers with a rich and valuable source of data. Online drug communities can supply researchers with important insights into the lives of people who take illicit drugs, including drug-taking patterns and practices, drug-related beliefs and norms, and how these are developed and shared in communities.
Researcher access to pre-internet drug-using communities has been limited, as these will have been locally constituted, constrained in size by geography, and mostly hidden from outsiders in order to mitigate the risks of exposure that derive from illegality and social stigma. Indeed, people who take illicit drugs have for decades constituted the 'textbook' example of a hard-to-reach population. Therefore, research in the field of illicit drugs has been at the forefront of methodological innovation in accessing hidden populations, including for example snowball sampling (Salganik & Heckathorn, 2016), privileged access interviewing (Griffiths, Gossop, Powis, & Strang, 1993), ethnographic methods incorporating lengthy periods of fieldwork aiming to establish and develop relationships of trust (Bourgois, 2003) and even early examples of online participant recruitment (Coomber, 1997). Methodological innovations like these provide researchers with access points to potential research participants – albeit still in limited numbers – who are likely to be wary of disclosing illegal and stigmatised activities connected to their drug taking. Unsolicited data generated in online drug communities, in contrast, requires neither disclosures to researchers nor developing relationships of trust. The geographical constraints that limit the size and constitution of drug-using communities disappear in the online context and – most importantly for researchers – community activities are often enacted openly and publicly, making data easily accessible.
The popularity of online communities among people who take illicit drugs becomes clear when we consider how illegality and stigma limit access to appropriate drug safety information and advice, for example related to dosing, administration, combining drugs and harm reduction practices. People who take illicit drugs may intentionally eschew otherwise trusted individuals – teachers, doctors or other professionals – if they fear legal sanctions. The social stigma associated with taking illicit drugs has similar consequences and can exert perverse effects, for example in preventing or delaying people with problematic drug taking from seeking help (Ahern, Stuber, & Galea, 2007). Consequently, people who take drugs may confine their communications about drug-related experiences and advice to small groups and hidden local networks (Jacinto, Duterte, Sales, & Murphy, 2008; Race, 2008; Southgate & Hopwood, 2001), and mistrust or reject information produced or delivered via official agencies where content is anti-drug (see e.g. O'Malley & Valverde, 2016) and thus of little practical use to people who actively take illicit drugs. Notwithstanding the increased adoption in countries across the globe of harm reduction drug policies (Stone, 2016), official resistance to providing people who take illicit drugs with accurate information about safer drug use persists (Duff, 2004). Online platforms may, therefore, have particular appeal and utility for people who take illicit drugs and are wary of seeking out accurate and practical drug safety information offline. Information sourced online can be obtained by people who take illicit drugs without them having to make disclosures that risk social, institutional and legal sanctions.
Participation in online communities enables people who take illicit drugs to draw on a much wider network of individuals than they might come into contact with locally, and thus be welcomed by appreciative peers who share and obtain information otherwise considered controversial or illegal – all the while remaining comparatively anonymous (Montagne, 2008; Murguía, Tackett-Gibson, & Lessem, 2007; Murguía, Tackett-Gibson, & Willard, 2007). While the internet has to an extent functioned to make more visible a group previously mostly hidden to researchers, participation in online communities by people who use drugs will inevitably not be evenly spread. Online participation will vary substantially across demographic categories and drug types, as well as across geographical regions and language groups; only the narratives of those participating in online communities become visible, while others remain hidden.
Content created in online drug communities is often text-based, e.g. on discussion forums such as Bluelight.org, but can also include video/multimedia, e.g. on the video-sharing platform YouTube. Online communities of people who take illicit drugs have traditionally been located on dedicated websites, but are increasingly present on massively popular generic social platforms, including discussion forums like Reddit and social networking platforms like Facebook and Twitter. Where drug researchers have used data from platforms like these, they must employ data selection strategies to identify drug-related content relevant to their research aims and to exclude the majority of irrelevant content from their resulting samples. Various selection strategies have been employed to this end by researchers, including use of generic or platform-specific search engines (e.g. Twitter's standard search API). More recently, communities of people who take drugs have emerged on smartphone-facilitated platforms with restricted content not publicly visible to researchers, such as Snapchat, Instagram, Grindr, Tinder and WhatsApp. Where content is restricted, researchers have successfully employed informants to identify drug-relevant content from pages, channels or profiles (e.g. Barratt, Allen, & Lenton, 2014; Breuner, Pumper, & Moreno, 2014; Demant et al., 2019; Lange, Daniel, Homer, Reed, & Clapp, 2010; Nguyen et al., 2017).
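To make the idea of a data selection strategy concrete, the following is a minimal sketch of keyword-based filtering applied to posts that have already been collected. It is an illustration only, not the procedure of any study cited here: the keyword list, the post records and the function name `select_relevant` are all hypothetical, and a real study would derive its terms from its research aims and typically combine automated filtering with manual screening.

```python
import re

# Hypothetical keyword list (an assumption for illustration only).
KEYWORDS = ["mdma", "ketamine", "harm reduction", "dosing"]

# Word-boundary pattern, so that "dosing" does not match inside "overdosing".
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)

def select_relevant(posts):
    """Return the posts whose text mentions at least one keyword."""
    return [p for p in posts if PATTERN.search(p["text"])]

# Invented example posts standing in for content from a generic platform.
posts = [
    {"id": 1, "text": "Any advice on MDMA dosing?"},
    {"id": 2, "text": "Best pizza in town?"},
    {"id": 3, "text": "Harm reduction tips for festivals"},
]

relevant = select_relevant(posts)
print([p["id"] for p in relevant])
```

Simple filters of this kind trade recall for precision: slang, misspellings and coded language will be missed unless the keyword list is extended accordingly.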
In many cases, online data generated by people who take illicit drugs has produced valuable drug policy-relevant knowledge. A study by Boothroyd and Lewis (2016) that sought to understand harm reduction drug culture within online communities illustrates this. The study comprised a qualitative analysis of text-based data collected over 11 months from nine online platforms (blogs, discussion forums and 'story sites'). The cross-platform comparative study design allowed the authors to distinguish among approaches to harm reduction across different drug 'scenes' – taking 'legal highs', drug addiction and nonmedical use of prescription drugs. Injecting drugs, for example, was supported in some communities offering advice on safe injection practices but rejected within other communities as inconsistent with community-specific norms of harm reduction and acceptable drug taking. This study goes beyond the concerns about accuracy (Halpern & Pope, 2001) or misinformation (Boyer, Shannon, & Hibberd, 2001) often directed at 'lay' drug information, and leverages the otherwise unheard community-based narratives of illicit drug experiences and lifestyles to better understand how people who take illicit drugs adapt to risk. In doing so, the authors gain critical perspectives on public health harm reduction policies and practices, drawing attention to the discursive problem of who defines harm reduction. Irrespective of their intentions, policymakers often define illicit drug harm reduction in ways that directly conflict with the emic harm reduction ethos evident in anonymous online spaces. In these spaces people who take illicit drugs instead have the discursive power, as they participate in a 'peer-to-peer co-creation' of a notion of harm reduction centred around "doing drugs well" – which can even include a discourse of "the 'harm' a life of sobriety inflicts on their projects of living well" (Boothroyd & Lewis, 2016, p. 304) that would be unthinkable in most official contexts.
The 'harm reduction from below' observed in this study is harder to reach for researchers using traditional data collection methods such as interviews. In online communities, data production is initiated by people who take illicit drugs for themselves and their peers, in contrast to data solicited by researchers which is framed to address the aims and priorities of research. The aims, priorities and power relations that frame the production of these data arise from within the community. Researchers using data obtained from online communities are therefore able to observe knowledge production and community dynamics directly, and so gain unprecedented insight into the content and dynamics of harm reduction among people who take illicit drugs.

People who supply illicit drugs
The internet is not only a location for trading information about drugs but for the drug trade itself. This trade leaves digital traces that can be used by researchers as data to generate fresh drug policy-relevant knowledge and insights. Illicit drug trading has been facilitated by the internet in different virtual locations. Illicit drug trade on the conventional web (often termed the 'clearnet' in contrast to the encrypted 'darknet') has historically mainly included substances that are either legal or have variable legal status, such as prescription drugs (e.g. CASA (National Center on Addiction & Substance Abuse at Columbia University), 2004) and 'new psychoactive substances'/'legal highs' (e.g. Hillebrand, Olszewski, & Sedefov, 2010).
Because some categories of medicine, such as opioids and benzodiazepines, can be controlled substances that are illegal to possess without a doctor's prescription, information related to the legal trade in these products is valuable for understanding how illegal use and markets for these products arise. Data related to the legal trade in drugs categorised as medicines can be obtained online, for example, from pharmaceutical company websites, as well as in third-party online archives of drug industry documents (e.g. the Drug Industry Documents Archive maintained by the University of California).
The data obtained from clearnet markets primarily tells us about the range of substances available for purchase, as well as the prices and quantities in which they are offered for sale (Hillebrand et al., 2010). This data has been used to monitor emergent drug market trends by the European Monitoring Centre for Drugs and Drug Addiction (EMCDDA, 2015). In contrast to darknet marketplaces, clearnet shops generally do not leave digital traces of actual transactions and so cannot provide much insight into demand and selling volume (in some cases, customer reviews may be used as a suggestive proxy for actual sales – see e.g. Bruneel, Lakhdar, & Vaillant, 2014). Social media (such as Facebook) and smartphone-enabled messaging apps (such as Wickr) are increasingly used to facilitate various aspects of illicit drug buying and selling, including promotion targeted at potential customers, conveying information about products and arranging face-to-face transactions (Demant et al., 2019; Moyle, Childs, Coomber, & Barratt, 2018). As in the case of clearnet shops, drug trading via social media platforms generally does not leave traces that illuminate actual transactions. Traces left by drug trading via smartphone messaging apps are even more limited, as this mostly takes place in private – with the exception of drug advertisements on semi-open platforms such as the photo and video-based social media platform Instagram (Thanki & Frederick, 2016).
It was in 2011, with the advent of the first so-called 'darknet' marketplace Silk Road, that we saw technologies combine to create fully open online platforms that effectively support a substantial trade in illegal drugs. Growth in illicit drug sales on these markets – in spite of numerous successful law enforcement operations – derives from technologies that obscure links between the marketplace activities of buyers and sellers and their real identities. This anonymity also enables trading to take place openly, thus making richly detailed trading data available to researchers. These marketplaces are more often referred to as cryptomarkets in the literature, and their appearance, structure and function mimic well-known legal counterparts such as eBay. Cryptomarkets host sellers (known as 'vendors') who pay a commission on their sales to marketplace owners. Buyers can choose among vendors based on price and using the feedback from previous customers about available products and services. Cryptomarket product listings include fine detail on the type and quality of drugs offered by vendors alongside precise information on quantity and price, the country from which sellers indicate their products will be shipped, the available delivery destinations and reputation metrics based on customer feedback. Customer feedback can, in turn, be used as a proxy indicator for actual sales. Thus, cryptomarkets simultaneously provide researchers with supply and demand-side information enabling estimations for transactions and revenue generated across different drug types, prices, quantities and selling locations (e.g. Aldridge & Décary-Hétu, 2014, 2016; Demant, Munksgaard, & Houborg, 2016; Kruithof et al., 2016).
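The logic of using customer feedback as a proxy for sales can be sketched in a few lines. The example below assumes, purely for illustration, that each feedback item corresponds to one completed sale at the listed price; this is a simplification of our own, and the listing records, field names and the function `estimate_by_drug` are hypothetical rather than drawn from any of the cited studies.

```python
from collections import defaultdict

# Invented listing records (not real market data). Each record holds the
# drug type, the listed price and the number of feedback items received.
listings = [
    {"drug": "cannabis", "price_usd": 50.0, "feedback_count": 10},
    {"drug": "cannabis", "price_usd": 20.0, "feedback_count": 5},
    {"drug": "mdma", "price_usd": 100.0, "feedback_count": 3},
]

def estimate_by_drug(listings):
    """Estimate transactions and revenue per drug type.

    Assumption: one feedback item is counted as one sale at the listed price.
    """
    totals = defaultdict(lambda: {"transactions": 0, "revenue_usd": 0.0})
    for item in listings:
        entry = totals[item["drug"]]
        entry["transactions"] += item["feedback_count"]
        entry["revenue_usd"] += item["price_usd"] * item["feedback_count"]
    return dict(totals)

estimates = estimate_by_drug(listings)
for drug, entry in estimates.items():
    print(drug, entry["transactions"], entry["revenue_usd"])
```

Because feedback is a proxy rather than a record of transactions, figures produced this way are estimates whose accuracy depends on how closely feedback tracks actual purchases on a given marketplace.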
Compared to online drug trading, our understanding of traditional 'offline' drug markets is limited and imprecise, resulting from reliance on data predominantly derived from partial, non-representative samples. Samples are mostly generated in connection to law enforcement activities (e.g. controlled buys), or by independent researchers collecting self-reports about drug selling operations directly from drug sellers themselves, typically – but not exclusively – identifiable because of their arrests or incarcerations (Barratt & Aldridge, 2016). These methods produce at best a partial understanding about drug markets because study samples are drawn disproportionately from sellers whose activities are known to criminal justice agencies, and therefore likely to be different in important ways to sellers whose activities go undetected. Moreover, samples in these studies are typically small, with sizes significantly less than 100 being the norm (for an exception see Sevigny & Caulkins, 2004). In contrast, the data collected from cryptomarkets can be vast, with growing datasets now including information related to many thousands of drug sellers, many hundreds of thousands of listings offering drugs for sale and with information connected to transactions that number in the millions. Unconstrained by the need to draw samples, moreover, the exceptionally large datasets from drug cryptomarkets provide researchers with the opportunity to analyse near-complete populations (Barratt & Aldridge, 2016), thereby overcoming some of the sampling biases mentioned above.
A recent study by Martin, Cunliffe, Décary-Hétu, and Aldridge (2018) illustrates the value of cryptomarket data in producing evidence to inform drug policy. The study examined the effects of a 2014 US legislative change designed to restrict the supply and misuse of hydrocodone, a prescription opioid medication. One study published since this change suggests that it had the intended effect: the number of prescriptions issued by doctors for hydrocodone fell (Jones, Lurie, & Throckmorton, 2016). But the decreased availability of legitimate prescriptions may also go hand-in-hand with increased illicit supply. It is not easy to establish precise trends in illicit drug markets because the market activities are mostly hidden. But on cryptomarkets, researchers can use specialised crawling and scraping software such as DATACRYPTO (Décary-Hétu & Aldridge, 2013) to collect public traces of buying and selling. In this case, the authors collected data from 31 cryptomarkets operating between October 2013 and July 2016. They measured illicit sales and available supply across different prescription drug types. Analysis related to the nearly 3 million transactions generated by drugs listed for sale in the 30 days prior to data collection. The authors found that cryptomarket sales of hydrocodone, as well as other prescription opioids, increased immediately following the change in legislation. This increase was not found for other types of prescription drugs (sedatives, steroids and stimulants). The authors could not rule out other factors that may have driven the increased illicit opioid sales they observed. Nevertheless, their results are consistent with the possibility that the change in legislation may have played a causal role. Additional evidence was found that bolstered the causal explanation: the increases in illicit opioid sales that followed the change in legislation were only observed for US-based cryptomarket sellers, and not for sellers in other countries.
This study is valuable in highlighting some of the potential problems that may arise when supply-side restrictions are imposed without interventions to reduce demand, including prevention and treatment programmes that are evidence-based, and easily and widely accessible (Hadland & Beletsky, 2018). By using cryptomarket data, the study's authors were able to pinpoint precisely – temporally and geographically – changes in illicit drug selling following a supply-side intervention in a way that studies using traditional sources of data (e.g. data from operational policing activities and self-report surveys) could not. This study illustrates how unsolicited data from online drug markets can provide drug policy scholars with unprecedented insight into illicit drug trading, and a unique opportunity to measure directly the impact of specific policies on drug trade and demand.
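The reasoning behind a comparison of this kind, in which sales before and after a policy change are contrasted between sellers in the affected country and sellers elsewhere, can be sketched as follows. This is not the analysis conducted by Martin et al. (2018). The monthly figures are invented for illustration, the cut-off date is a placeholder (the text above gives only the year 2014), and the function `mean_change` is hypothetical.

```python
from datetime import date
from statistics import mean

# Placeholder cut-off (an assumption; only the year 2014 is stated above).
CUTOFF = date(2014, 10, 1)

# Invented monthly sales estimates: (month, seller location, estimated sales).
# These numbers are made up and are NOT results from any study.
observations = [
    (date(2014, 8, 1), "US", 100),
    (date(2014, 9, 1), "US", 110),
    (date(2014, 11, 1), "US", 150),
    (date(2014, 12, 1), "US", 160),
    (date(2014, 8, 1), "other", 80),
    (date(2014, 9, 1), "other", 90),
    (date(2014, 11, 1), "other", 85),
    (date(2014, 12, 1), "other", 95),
]

def mean_change(observations, group, cutoff):
    """Mean sales after the cut-off minus mean sales before it, for one group."""
    before = [n for d, g, n in observations if g == group and d < cutoff]
    after = [n for d, g, n in observations if g == group and d >= cutoff]
    return mean(after) - mean(before)

us_change = mean_change(observations, "US", CUTOFF)
other_change = mean_change(observations, "other", CUTOFF)

# A larger change among US-based sellers than among sellers elsewhere is
# consistent with, but does not prove, an effect of the policy change.
difference = us_change - other_change
print(us_change, other_change, difference)
```

A comparison group does not eliminate alternative explanations, but it narrows them: any factor driving the change would need to have affected sellers in one location and not the other at around the same time.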

Online media outlets
Online news reports from mainstream media – and the increasing number of alternative news providers enabled by the internet – can provide researchers with valuable drug policy-relevant data. Researchers have, of course, turned to news stories as data sources long before the internet. Classic studies like those by Cohen (1972) and Goode and Ben-Yehuda (1994) show the central role of media reporting in framing drugs and people who take drugs in popular discourse, thereby discursively obscuring, revealing, or even urging available policy solutions.
In the era of online news, however, the role of news stories in shaping popular drug discourse may be changing in important ways. This is illustrated in a study by Forsyth (2012) examining online news reports of deaths apparently linked to mephedrone before and after the drug was outlawed in the UK. Using the Google Trends application to chart trending search terms (e.g. 'buy mephedrone') and thus measure public interest in mephedrone, Forsyth showed that online news reports of ostensible mephedrone deaths were immediately followed by increased UK-based Google searches locating online sellers of the drug. Even stories aimed at raising awareness of the potential dangers of the drug may simultaneously have alerted readers to the ease with which mephedrone could be obtained by also reporting the existence of legal online sellers. Seemingly paradoxically, it was mephedrone news stories that were inaccurate or that exaggerated harms which generated the most public interest in buying the drug. This particular finding supports previous critical research on drug policies and drug prevention initiatives relying on fear-based messages or scare tactics (for specific exceptions, see systematic review by Esrick et al., 2018). However, by using online news stories as data, Forsyth's study adds unique policy insights not possible with the more conventional data sources used by drug researchers. Forsyth's unsolicited online data sources allowed him to link increased public interest in obtaining the drug precisely to the publication of online news stories about it, and to explain the magnitude of this interest with reference to variations in message accuracy.
Many news websites allow readers to contribute content themselves in the form of comments and interactive discussion. This user-generated online content provides researchers with yet more data sources for studies of illicit drug discourses in the public sphere, and further enables comparative studies of differing drug discourses associated with different journal readerships. Forsyth (2012) suggested that this kind of online content may have enabled alternative perspectives on mephedrone that contradicted stock drug scare themes in mainstream news reports. These kinds of unsolicited online data connected to news reports can provide useful insights into how public drug discourses emerge, compete or prevail, and so facilitate or constrain available policy options.
Sometimes online news reports can be used in a comparatively straightforward manner to generate factual information, as illustrated in a recent study by Groshkova et al. (2018). The authors report on a pilot project by the EMCDDA that assessed the potential of openly available online information to complement existing official data routinely collected on illicit drug seizures in Europe. The researchers created an automated monitoring tool for collecting data published online by law enforcement agencies, online news outlets and social media. The authors suggest that these new data sources improve on routine official data in timeliness and by providing additional context that enriches understanding of drug supply and fills knowledge gaps, for example by providing information for countries that report drug seizures inconsistently to the EMCDDA. The authors argue that their data may be used "for forecasting to support predictive law enforcement such as data-driven threat assessment of evolving patterns of drug trafficking" (Groshkova et al., 2018, p. PP), so enabling the efficient allocation of law enforcement resources.
Databases such as Google News and LexisNexis curate a wide range of national and international news stories, allowing news reports relevant to researcher aims to be identified using keyword searches. Because these databases include news stories going back several decades, researchers can also conduct longitudinal and comparative analyses of discursive shifts and contrasts, thus highlighting the interplay of drug policy developments and public drug discourses (e.g. Houborg & Enghoff, 2018).

State-affiliated agencies and civil society groups
A range of state and non-state agencies have functions that relate to illicit drugs. Because drugs are controlled by a series of international laws and conventions, international agencies also exist to support and guide governments in their legislation, enforcement and other work connected to illicit drugs. Civil society groups include those non-governmental organisations that have stakeholder interests or activities in relation to illicit drugs. Such groups are diverse and might include groups representing the interests of people who take illicit drugs (e.g. Erowid.org), organisations working for drug policy reform (e.g. NORML) or organisations campaigning against illicit drugs (e.g. Drug-Free World).
Websites for these organisations provide researchers with messages and official policy positions related to organisational aims, alongside agency activities. Analysis of these data can be approached in different ways. First, data can be treated as corresponding directly to the phenomenon it represents, and thus as a valid measurement tool. The study by Groshkova et al. (2018) discussed above used this approach in accessing public data published on the official websites of national and international law enforcement agencies when they publicly report cases of successful drug seizures. Alternatively, data can be analysed critically – i.e. as discourse rather than facts – to reveal often unarticulated assumptions that may shed a different light on the phenomenon in question. Official policy and practice-related documents produced by state and non-state agencies with illicit drug-related functions have long been analysed like this by researchers (e.g. Seddon, Williams, & Ralphs, 2012). However, we are not aware of any such studies using information available from the websites of these organisations. This information often extends beyond official documents to include information researchers have as yet under-exploited, but which may reveal drug policy-relevant insights.
O. Enghoff, J. Aldridge International Journal of Drug Policy xxx (xxxx) xxx-xxx
As such, unsolicited data obtained from the websites of state-affiliated agencies and civil society groups with an official drugs remit represent an under-utilised resource for discursive studies similar to those made possible by data obtained from media outlets, as well as for studies of the institutional side of the conflict in attitudes towards illicit drugs treated in the user-driven studies discussed in "People who take illicit drugs".

Methodological advantages and challenges in the use of unsolicited online data
New data sources entail fresh methodological and ethical challenges. Here we review a number of common advantages and disadvantages of the data types outlined above, which relate to the following aspects of drug policy research: sampling, bias, feasibility and ethics.

Sampling
Conventional data collection methods are constrained by limited access to the 'hidden populations' that people who take or sell drugs often comprise, or to highly specialised populations of people who take illicit drugs (Miller & Sonderlund, 2010). These limitations may result from available contacts, from constraints imposed by geography, institutional or informal gatekeepers, and by restrictions placed on research with human participants by institutional research ethics committees. As discussed above, online platforms attract stigmatised populations and have a potentially global reach, and consequently, data obtained from these platforms is less susceptible to such difficulties. Thus, data generated on online platforms by people who take or supply illicit drugs provide a reach that data collected with conventional methods may lack, and this has been taken advantage of in numerous studies, some of which are cited above. This advantage is also evident, although to a lesser degree, in online research using data from media outlets or stakeholder organisations, as the ability to use search engines to find relevant data makes it possible to achieve either a more diverse or a more specialised dataset than otherwise.
Even though the internet is now used by a vastly greater and more diverse population than ever before, online data are often criticised in terms of the issues of validity and representativeness that arise from the anonymity and demographic skew of internet users (Barratt et al., 2017; Hewson, Yule, Laurent, & Vogel, 2003). Within online drug research, this pertains mainly to data generated by people who take or supply illicit drugs, as media and organisational data are mostly clearly attributed, whereas individuals can create distorted or entirely fictional online identities due to internet anonymity. Online narratives cannot, therefore, be understood to correspond simply and unproblematically to the actual behaviour or attitudes of people who take or supply illicit drugs (Aldridge & Askew, 2017). Researchers may attempt to infer demographic characteristics using available information, but designations (e.g. gender of a discussion forum contributor) are not verifiable and therefore remain uncertain, and so limit the generalisability of research findings to the wider offline population of people who take illicit drugs. In demographic terms, not all people who take or sell illicit drugs participate in online communities. Those who do may differ from those who do not on key characteristics: type and pattern of drug taking/supply, age, socio-economic status, geographical location, and education (see e.g. Pew Research Center, 2018). This limits the ability of research using unsolicited online data to provide findings which are generalisable in the classic sense, such as population prevalence estimates.
However, this type of research can still provide answers to adequately specified research questions about adequately specified populations, not least relating to how the internet is used by drug-relevant actors, and contemporary population surveys about the use of the internet as a source of information on drugs and similar issues qualify the wider relevance of such findings (see e.g. European Commission, 2014; Murguía, Tackett-Gibson, & Lessem, 2007; Murguía, Tackett-Gibson, & Willard, 2007).

Bias
A distinct advantage of unsolicited online data generated by people who take or supply illicit drugs is that it will not be subject to observer/researcher effects where research participants consciously or unconsciously modify their behaviour in response to the research context (Hewson et al., 2003, pp. 46-47; Robinson, 2001). Self-reports by participants about their drug taking to researchers, for example, may involve exaggeration, omission or downplaying information, or even lying in response to their beliefs about what the researcher wants, or in response to their perceptions of the researcher's beliefs, values or prejudices. On the other hand, the absence of an active data collector means that unsolicited data only contains whatever the person generating it chooses to include, and so researchers must consider that relevant details may have been omitted. This is a common critique of self-report data, but it has also been argued that these subjective biases may instead represent an opportunity to study 'what matters' to the population in question (Solymosi, Bowers, & Fujiyama, 2018).
However, unsolicited online data is also sensitive to the context of its production, albeit in different ways. Online platforms that enable user interaction may create powerful norms, beliefs and meanings shared within a specific community, and these may actively shape how new and old community members behave and contribute. This was clearly seen with injecting drugs in the study by Boothroyd and Lewis (2016), and has also been noted in connection to online 'trip reports', first-person drug narratives that are often heavily characterised by intertextuality and derivation (see e.g. Bohling, 2017; Springer, 2015). The ethos and discursive repertoire of an online community, therefore, provides a subcultural context that will frame, influence or even police community activity, and therefore the content that eventually becomes the researcher's data. While this needs to be taken into account when analysing the data, internet researchers have argued that "holding a constructivist rather than positivist worldview, trustworthiness or truth value of the data can only be evaluated in context" (Robinson, 2001, p. 712). Additionally, the locally constituted nature of user-generated online data is a valuable field of study in itself, as it enables the study of online social learning and 'meaning-making' among people who take illicit drugs, representing an online parallel to seminal interactionist drug research by e.g. Becker (1953).

O. Enghoff, J. Aldridge
International Journal of Drug Policy xxx (xxxx) xxx-xxx

Feasibility
Collecting unsolicited online data typically consumes fewer resources than data obtained using more conventional social research methods. Limited time and budgets can constrain the extent of data collection by researchers using conventional methods. Collecting online data entails little or no costs for travel, and no staffing costs associated with conducting or transcribing interviews. Consequently, digital data sets are more cost-effective to collect, and thus lend themselves more easily to large-scale data collection and exploratory research, albeit with an increased risk of information overload (Kozinets, 2002). If using digitally automated methods such as network analysis or natural language processing, large quantities of data, either quantitative (e.g. networks) or qualitative (e.g. text), may also be analysed in a highly resource-efficient and scalable manner.
Since the generation and storage of unsolicited online data are by definition not controlled by the researcher, this type of data is volatile (Robinson, 2001, p. 713). Platform users are often allowed to edit or delete their content, and entire platforms can be erased due to an internal decision or external factors, e.g. when law enforcement shuts down cryptomarkets or when major platforms like Reddit and Facebook shut down pages that violate the site's policy (as when Reddit shut down several 'sub-reddits' dedicated to cryptomarket discussion; see Franceschi-Bicchierai, 2018). The same problem applies when platforms restrict access to their hosted content, e.g. by setting a limit on the number of posts that can be retrieved or by restricting access to old content (as seen with e.g. Facebook, Twitter and Reddit). Thus, researchers often have to collect and store online data continuously in order to prevent the loss of important contributions. These and other examples of the volatility and 'messiness' of unsolicited online data contribute to the importance of thoroughly documenting the steps taken to collect data (including the times at which websites were accessed), in order to achieve a degree of analytical replicability similar to that of more controlled research designs.
Finally, the use of digital data can in some cases entail a technical skill-based barrier. Automated online data collection, for example, can require specific computing skills, making specialised training or collaboration with other specialists valuable (Ramage, Rosen, & Chuang, 2009). In two of the studies mentioned in this paper, specialised software was developed (Cunliffe, Martin, Decary-Hetu, & Aldridge, 2017; Groshkova et al., 2018) requiring computer science expertise. This bridging between computer science and social science can present practical challenges, but may also pay dividends by encouraging interdisciplinary collaborations that enable new directions for drug policy research less confined to disciplinary silos.
The emerging literature connected to automated online data collection from drug cryptomarkets illustrates how the development of new technical expertise by researchers can instigate emerging 'good practice' consensus as researchers critically and publicly scrutinise one another's methods. When Dolliver (2015) published a paper using online data collected from a large cryptomarket, her results surprised other researchers whose data from the same marketplace painted a very different picture. Published responses to Dolliver concluded that such divergent findings may have resulted if Dolliver's web crawling software generated incomplete data due to software malfunction (e.g. Aldridge & Décary-Hétu, 2015; Van Buskirk, Roxburgh, Naicker, & Burns, 2015). A publication from a later study failing to replicate Dolliver's results devised guidelines for researchers to check for and report on integrity and completeness of data collected from cryptomarkets (see Munksgaard, Demant, & Branwen, 2016), a now-standard practice in the literature (e.g. Demant, Munksgaard, Décary-Hétu, & Aldridge, 2018; Kruithof et al., 2016). Unsolicited online data therefore can provide us with challenges in connection to replicability, but also with new opportunities, particularly when datasets are made openly available for secondary analysis.
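To make the point about documenting collection steps concrete, the following is a minimal sketch in Python of how each retrieved page might be logged with its access time and a content fingerprint. The function names and the record fields are our own illustrative assumptions, not a schema taken from any of the studies cited here, and real projects would adapt them to their platform and ethics protocol.

```python
import hashlib
import json
from datetime import datetime, timezone


def make_provenance_record(url, content, accessed_at=None):
    """Describe one retrieved page (hypothetical, illustrative schema)."""
    if accessed_at is None:
        # Record the access time in UTC so logs from different machines align.
        accessed_at = datetime.now(timezone.utc)
    return {
        "url": url,
        "accessed_at_utc": accessed_at.isoformat(),
        # A hash of the text lets later readers detect edits or deletions.
        "sha256": hashlib.sha256(content.encode("utf-8")).hexdigest(),
        "n_characters": len(content),
    }


def append_to_log(record, log_path):
    """Append one record to a JSON Lines file, one record per line."""
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
```

Comparing the stored hash with a hash of the same page retrieved later gives a simple indication of whether content has been edited since it was collected.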
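As a hedged illustration of what checking and reporting on completeness can involve, the sketch below compares the pages a crawler was asked to collect with the pages it actually returned. It is not the procedure set out in the guidelines cited above; the function name, the input format and the 95% threshold are assumptions made purely for the example.

```python
def completeness_report(requested_urls, retrieved_pages, min_share=0.95):
    """Summarise how much of a planned crawl was actually retrieved.

    requested_urls: iterable of URLs the crawler was asked to fetch.
    retrieved_pages: dict mapping URL -> page text (empty text = failed).
    min_share: illustrative threshold below which the crawl is flagged.
    """
    requested = set(requested_urls)
    # Count only non-empty pages that were part of the planned crawl.
    retrieved = {url for url, body in retrieved_pages.items() if body}
    retrieved &= requested
    missing = sorted(requested - retrieved)
    share = len(retrieved) / len(requested) if requested else 0.0
    return {
        "n_requested": len(requested),
        "n_retrieved": len(retrieved),
        "share_retrieved": share,
        "missing": missing,
        "passes_threshold": share >= min_share,
    }
```

Reporting these figures alongside the findings allows readers to judge whether divergent results might stem from an incomplete crawl rather than from the marketplace itself.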

Research ethics
The use of unsolicited online data has distinct ethical challenges documented in the growing methodological literature on the topic. The blurred line between public and private in online spaces has given rise to debates about whether unsolicited online data should be regarded as research with human participants, thus necessitating particular requirements of researchers, for example to obtain informed consent, and provide participants with the right to withdraw from the research (Boyd & Crawford, 2012; Roberts, 2015; Snee, 2013), or instead treated as public data that researchers can use without these ethical considerations (Wilkinson & Thelwall, 2010). Despite efforts to establish a 'best practice' for ethical online research (see e.g. Markham & Buchanan, 2012; Robinson, 2001), a broad consensus has yet to emerge among online researchers and research ethics committees.
In most of the literature on the subject, a highly context-specific and 'bottom-up' approach to ethical decision making is recommended, working from the general principle of balancing the potential harms to the studied with the potential benefits of the research. Online communities are valued by, for example, people who take illicit drugs as protected and safe spaces, and their members sometimes prefer to minimise external attention. Thus, research within such communities needs to employ research protocols that minimise the chances of creating harms or other negative impacts on these communities (for a comprehensive review of related issues, see Barratt & Maddox, 2016). In most cases, it will not be feasible to seek consent individually from community members, so if publications reporting research results will use direct quotes, it may be necessary to obscure usernames and paraphrase quotes to prevent the use of search engines to identify individual contributors (Wilkinson & Thelwall, 2010). Although online community members often post content using aliases, seemingly trivial disclosures (e.g. gender, age, town/city, college attended, occupation, arrests) may combine to enable real-world identification as an individual's contributions accumulate over time. This also necessitates reflections on data storage, as it may not be appropriate to indefinitely retain a copy of a discriminating post if the posting user has deleted it from the original platform. These concerns become increasingly salient as online metadata becomes increasingly detailed and sensitive (e.g. location data) and as analytical tools become increasingly able to correlate patterns and elicit personal identifiers from otherwise anonymous data. Thus, drugs researchers should constantly balance the benefits of their research with the imperative to protect vulnerable groups, and may thus have to refrain from sharing all the knowledge gained in their analyses of unsolicited online data.
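One way of obscuring usernames before analysis or storage can be sketched as follows. This is an illustrative assumption on our part rather than a procedure prescribed in the literature cited above: the function names, the "author" field and the "user_" prefix are hypothetical. The sketch replaces each username with a keyed hash, so that contributions by the same person can still be linked within a dataset without the original alias being retained.

```python
import hashlib
import hmac


def pseudonymise(username, secret_key):
    """Map a username to a stable pseudonym using a keyed hash (HMAC-SHA256).

    secret_key is a bytes value held only by the research team; without it,
    the pseudonym cannot be recomputed from a guessed username.
    """
    digest = hmac.new(secret_key, username.encode("utf-8"), hashlib.sha256)
    return "user_" + digest.hexdigest()[:12]


def pseudonymise_posts(posts, secret_key):
    """Return copies of the posts with the 'author' field replaced."""
    return [
        {**post, "author": pseudonymise(post["author"], secret_key)}
        for post in posts
    ]
```

This addresses usernames only. Quoted text can still be traced through search engines, so paraphrasing and careful handling of indirect identifiers in the post content remain necessary.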
In some cases, the administrators may have developed a site usage policy which sets out rules related to e.g. copyright claims, authorship attribution, required acknowledgements in publications and whether automated data collection is allowed (e.g. Erowid and Reddit). In other cases, the preferences of the site's administrators and users may be less clear (e.g. cryptomarket forums). In both scenarios, it is worth considering a collaborative and fully disclosed approach, as many online drug-related communities (e.g. Bluelight) actively support academic and other efforts to make use of their data.

Shifting landscapes: drug policy and the digital age
We have highlighted the utility of data gathered from online platforms related to illicit drugs, specifically in gaining emic perspectives on otherwise hard-to-reach phenomena, in studying illicit drug markets and in analysing illicit drug discourses. We have illustrated this by drawing on previous research and by pointing out aspects of this type of data that warrant further research. In this final section, we review a number of contemporary technological and policy developments which further underscore the usefulness and importance of this type of data in future drug policy scholarship.
Many countries are now reforming drug policies and laws to permit medical and, in some cases, recreational use of previously controlled substances. The most well-known example of this is the increasing adoption of medical cannabis, but substances such as MDMA, psilocybin and ketamine are also currently being trialled as potential aids in psychotherapy. It has recently become easier for researchers and medical practitioners to get permission to administer these substances to human subjects, and thus rapid advances are being made in the testing of their neurological effects and therapeutic potential (see e.g. Tupper, Wood, Yensen, & Johnson, 2015). However, since these substances have been driven underground by prohibitive policies, knowledge about their subjective effects is scarce. Consequently, underground knowledge that has been shared for decades among people who take illicit drugs in the absence of official guidance may now efficiently (and somewhat ironically) be used to address the information deficit faced by policymakers and medical practitioners interested in the therapeutic potential of previously stigmatised substances. Several scholars have leveraged this information to produce descriptive accounts of the subjective effects of (new and old) illicit drugs (e.g. Soussan & Kjellgren, 2017), which can be a vital resource for developing appropriate policies and codes of practice for the medical and therapeutic use of these drugs: core emic concepts of safe drug taking, such as 'set and setting', have now become valuable clinical knowledge. Others have gone beyond the descriptive and leveraged this information in a critical policy perspective, e.g. Bohling (2017), who analysed user reports of recreational psychedelic drug experiences to criticise the subjugation of purely recreational taking of these substances within the contemporary medical paradigm.
A similar institutional information deficit has arisen with regards to a number of newly popularised substances which are neither legal nor approved for medical use, especially with the rapid emergence of so-called 'novel psychoactive substances'. The specific effects and risks of these substances are often not well understood, creating significant challenges for policymakers, law enforcement agencies, medical practitioners, social workers, treatment providers and harm reduction workers, since it becomes difficult if not impossible for them to gather the necessary information in a timely way when relying only on official channels of information such as clinical research and drug seizures (Deluca et al., 2012). Thus, monitoring and leveraging data generated by people who take and discuss their use of little known substances in online communities (precisely because they lack accurate and timely drug knowledge) can generate insights uniquely valuable to drug policy researchers.
With regards to drugs not encompassed by the recent push towards medicalisation and decriminalisation, many countries have instead chosen to adopt harm reduction policies that maintain the illegal status of a drug but to varying degrees support initiatives which seek to reduce the harms rather than try to prevent consumption altogether. A classic example of this is the safe injection room, but there are other more recent examples such as drug checking services (Butterfield, Barratt, Ezard, & Day, 2016) and drug safety kits such as the 'slamming kits' handed out for free in the UK by the Burrell Street sexual health clinic to facilitate safety in sexual practices involving injecting drugs. These types of harm reduction initiatives have long existed before they appeared in an institutional context, not least on the internet where, for example, ecstasy pill-testing has been facilitated by websites such as EcstasyData.org, founded in 2001. Consequently, it becomes essential for policymakers to engage with civil society groups in order to have the best possible foundation for developing new and efficient harm reduction initiatives. This highlights the utility of research that seeks to map out online harm reduction actors and understand how and why they work.
Finally, we argue that since drug markets are changing due to the appropriation of new internet technologies such as encrypted messaging, cryptomarkets and social media, it becomes essential for policymakers and practitioners to understand these technologies in terms of their potential harms and benefits, and how people who take or supply illicit drugs adopt them. This understanding necessarily involves a direct engagement with these new technologies through online research. It also becomes necessary to consider how people who take or supply illicit drugs are finding their ways to these technologies. Doing so involves considerable personal research and peer support, which can be facilitated by online community platforms such as Reddit. This platform hosts guides contributed by users, and offers interactive community support for potential cryptomarket buyers, in order for them to gain the necessary technical expertise and to build an understanding of the implicit risks and required safety measures (Kowalski, Hooker, & Barratt, 2018). Thus, we once again see how the online civil society, including numerous highly active members who take on the social role of leaders and experts, becomes an important actor for policymakers and practitioners to take into consideration.
In light of these new tendencies in drug policies and illicit drug practices, and considering the sizeable body of online drug research already conducted, it becomes clear that unsolicited online data generated by drug-relevant actors is a highly valuable, and in some cases indispensable, source of data on a range of issues within drug policy scholarship. Consequently, it is essential that drug policy scholars make the best possible use of this data: by continuing the highly successful work already being done in this field, by using the data responsibly in line with the ethical concerns mentioned above, and by continuously rethinking, renewing and innovating the ways in which we can harness the considerable power inherent in this diverse and constantly evolving data source.

Declarations of interest
None.

Author contribution
Both authors contributed ideas and writing.