Transnational repression: data advances, comparisons, and challenges

ABSTRACT Research on state repression generally focuses on what states do to populations within their own borders. However, recently scholars working at the intersection of comparative politics and international relations have begun to systematically analyse states repressing their populations outside their borders as part of their foreign policy. Variously called transnational repression, extraterritorial repression, or sometimes global authoritarianism, the focus is on the motives, methods, and effects of states extending repressive practices to their citizens abroad. Much of the research in this area has developed theories and findings using fieldwork and interview-based methods. Recently, however, multiple researchers and research groups have produced cross-national publicly available event data on transnational repression. This research note explains the main features of those datasets, including their scope, sources, structure, definitions, and strengths and limitations. In addition to descriptive introduction, it discusses the challenges associated with gathering data on transnational repression as well as suggestions for moving forward. The main aims are to introduce available data on transnational repression to researchers interested in working in this area and to highlight issues they may confront in gathering new data.

Instances of states violently repressing their exiles overseas periodically hit the global headlines. The grisly murder of journalist Jamal Khashoggi in 2018 by Saudi agents in Turkey revealed the shocking lengths to which some states will go to silence their critics. Belarus' dictator Alexander Lukashenko in 2021 forced a passenger plane flying over the country's airspace en route from Athens to Vilnius to land so that his henchmen could remove and detain a dissident Belarussian journalist and his girlfriend. In cases familiar to observers of African politics, for nearly two decades agents of Rwanda's Paul Kagame have hunted critics across the world to coerce them into silence (Wrong 2021). The issue of transnational repression comes to the public consciousness sporadically, but to those in politically active exile communities it can be ever present as cases unfold far from the public eye. The issue is important from an academic perspective because it speaks to the extraterritorial control mechanisms of states and their nonstate actor proxies.
This research note focuses on the data underlying emerging scholarship trying to understand the scope and drivers of transnational repression (TR). Scholars examining TR have begun to systematically analyze how, when, and why states attempt to repress their populations outside their borders as part of their foreign policy. This literature has made substantial progress in illuminating the shadowy world of TR. However, given that most TR is meant to be unseen or at least deniable by the perpetrating state, researchers face a challenge in gathering reliable data. Many of the works in this genre, therefore, use qualitative methods like interviews or case studies to painstakingly reconstruct processes of transnational repression in fine-grained detail.
Recently, however, several research teams have attempted to create broader databases of TR. These efforts document large numbers of instances and/or individual case histories across varying geographic and temporal spaces. This research note describes the basics of these efforts, discusses them in comparison with one another, clarifies the challenges of gathering this data, and suggests possible options for moving forward. As will be explained further below, the focus of the inquiry is on the foreign policy of authoritarian states as the primary source of TR. The aim of this brief research note is not to elaborate a new theory or make causal claims. Rather, given the nascent state of the research area, the objective is distinctly modest: to provide a synthetic summary of existing research materials and a basic comparison of available datasets.

Existing research
Repression, as a state-led control mechanism, occurs when authorities aim to prevent dissident beliefs and/or activities which they deem to imperil political order (Goldstein 1978). Through undertaking repressive actions, the government aims to threaten individuals and/or groups with sanctions aimed at keeping them under control and suppressing dissent (Davenport 2000; Escriba-Folch 2013). This literature has generated important insights into the relationship between dissent and domestic repression, that is, how domestic repression affects dissent and violence and vice versa (e.g. Lichbach 1987; Carey 2010). To answer these questions, scholarship about state repression at the domestic level uses advanced methods and microlevel data to understand the precise practices and outcomes of repression aimed at preventing dissent and dismantling its organizational foundations (e.g. Ritter and Conrad 2016; Sullivan 2016).
All states repress pre-emptively and responsively to some degree, but in general consolidated democratic states repress less than autocracies (Henderson 1991; Davenport 2007; deMeritt 2016). As is well documented in the literature on authoritarianism, repression constitutes one of the key instruments that dictators use to stay in power (e.g. Olar 2019). In general, research on political repression and the preservation of political control has focused on the domestic sphere, rarely considering the potential reach of repression beyond the borders of the nation state (for example, see recent literature reviews by deMeritt 2016; Hassan, Mattingly, and Nugent 2022), although some research has found that states engage in pre-emptive clampdowns in light of dissent and conflict in neighbouring states (Danneman and Ritter 2014).
Widely used datasets on human rights violations and state repression generally do not account for TR. The well-known Cingranelli-Richards (CIRI) Human Rights Data Project excludes transnational repression, stipulating in its codebook that 'except in certain cases of occupation, only violations that occur within a country's internationally recognized borders are coded' (Cingranelli and Richards 2014, 5). The widely-used Political Terror Scale also excludes TR, with its codebook stating that the dataset measures political terror, which it defines as 'violations of basic human rights to the physical integrity of the person by agents of the state within the territorial boundaries of the state in question' (Gibney et al. 2021, 1). These and related datasets have been used to generate latent models of respect for human rights which could in principle capture TR (Fariss, Kenwick, and Reuning 2020), but at present those models are built on data that does not include TR, meaning that the resulting latent measures also do not account for it. These observations should not be read as criticisms of these data efforts, but rather just to note that they generally only measure repression that takes place within the borders of the perpetrating state.
Yet state repression can also take on a transnational character whereby (mostly) autocratic states aim to maintain control over their populations living abroad through their foreign policies. Forms of state repression in such cases are characterized by cross-border interaction between the home state and various transnational actors and agencies aiming to target a population deemed to threaten or compromise the survival of the regime in place. In this important sense, TR forms an extension of home state repressive strategies.
In this context, as noted by Moss (2016, 481), the population cannot fully 'exit' from the domestic political sphere. Thus, while TR is clearly linked with the study of state repression and political violence, research on TR has so far mostly grown from the literature on migration (e.g. Adamson 2020; Tsourapas 2021), the international dimensions of authoritarianism (e.g. Lewis 2015; Dukalskis 2021), and research at the nexus of comparative politics or area studies and international relations (e.g. Moss 2016; Cooley and Heathershaw 2017). The transnational element of TR means that the state has different capabilities and available agents than with domestic repression. Conceptually, with TR sovereignty becomes an obstacle for the state and a source of protection for the target, whereas with domestic repression sovereignty is usually advantageous for the state but detrimental to the target. Moreover, TR and the diaspora response can become entangled in the domestic politics of the host country and/or relations between the host and source state in ways that differ from strictly domestic repression (Moss 2022).
With these similarities and differences between domestic and transnational repression in mind, the empirical focus of TR research to date has had two often implicit scope conditions. First, it has generally focused on states repressing their own citizens abroad, not citizens of other states. If repression is fundamentally about maintaining domestic political order, then controlling the citizenry takes priority over taming the citizens of other states (see Adamson 2020; Tsourapas 2021). Second, TR research has focused on authoritarian or hybrid regimes as the source of repression. Democratic states can engage in TR but, building on deMeritt (2016), there are persuasive theoretical reasons to believe that they will do so less than authoritarian counterparts. Democratic states will generally have fewer exiles to begin with because they allow dissent domestically and provide institutional channels to change the status quo. Even if those channels are unsatisfactory, democracies generally tolerate more extra-institutional challenges provided they are non-violent. Even if there are exiled dissidents of democratic states, the latter are likely to face more legal constraints and audience costs were they to target citizens abroad.
As such, the TR literature captures empirically and theoretically the spatial politics of contemporary authoritarianism and political violence across borders (Lewis 2015; Cooley and Heathershaw 2017; Dalmasso et al. 2018; Glasius 2018; Conduit 2020; Dukalskis 2021; Tsourapas 2021; Furstenberg, Lemon, and Heathershaw 2021; Moss 2022). TR highlights the relational and contingent nature of state power that is embedded in international systems and structures. This wave of new scholarship demonstrates in depth how authoritarian states attempt to control and silence their dissident populations abroad, including through the use of threats, surveillance and intelligence-gathering operations, physical attacks, abduction, politically motivated extradition requests or International Criminal Police Organization (Interpol) Red Notices, and even assassination (e.g. Moss 2016; Michaelsen 2018; Lemon 2019; Baser and Ozturk 2020; Dukalskis 2021; Moss, Michaelsen, and Kennedy 2022). In this context, Adamson and Greenhill (2021) argue for the need to view the current global security environment as entangled and interconnected, which demands supplementing state-centric analysis with attention to non-state spaces, networks, actors, and geographies of security. State power is not neatly demarcated by national borders. Rather, it diffuses and operates across multiple frameworks, agencies, and foreign policies (Furstenberg, Lemon, and Heathershaw 2021).
As a result of the spatial and diffuse character of TR, digital methods feature prominently in existing TR research (e.g. Deibert 2015; Michaelsen 2017, 2018; Moss 2018; Josua and Edel 2021, 598-560). As digital technologies became ubiquitous, TR perpetrators took advantage. Digital tools allow activists and diaspora communities to organize and network easily. However, the same digital environment also facilitates state surveillance, intelligence gathering, hacking, reputational attacks, and intimidation. Adjacent concepts in the digital sphere like misinformation campaigns or external propaganda are generally treated separately from TR, although Dukalskis (2021) integrates them and other authoritarian actions into a comprehensive framework of 'authoritarian image management'. Illustrating the connections between digital and non-digital spheres, the domestic repression of family members of exiles, sometimes called 'proxy punishment', is often communicated to the exile via digital channels in an effort to silence the target. The 'traditional' method of pressuring a family to silence the exile gained new dimensions with the ease of digital communication. The transnational digital sphere is challenging to regulate, but some have identified policies and practices by host states that may mitigate digital TR (Anstis and Barnett 2022). Politically active exiles often rely on self-help and take measures to protect their digital identity and links (Michaelsen 2018), with some NGO and civil society groups, such as Tactical Tech or the Electronic Frontier Foundation, offering toolkits and trainings.
Of course, TR is not new or inherently digital. Historically, states have engaged in political violence against their citizens abroad with, for example, former Soviet revolutionary insider Leon Trotsky assassinated in Mexico by NKVD agents in 1940. In South America, from the mid-1970s until the early eighties Operation Condor saw coordinated forms of cross-border repression perpetrated between cooperating intelligence services of Argentina, Uruguay, Chile, Paraguay, Bolivia and Brazil against political opponents of the member countries (Lessa 2019).
What does appear relatively new is the label and the concerted scholarly attention to it outside of case study or historically rooted area studies literature. Scholars studying what we would now call TR analyzed features of extraterritorial violence or intimidation by, for example, Libya under Gaddafi (see discussion in Tsourapas 2020, 359-360), Revolutionary Iran in the 1980s (Halliday 1994, 314-315, 322-323), and North Korea throughout much of its history (Fahy 2019). The aforementioned Operation Condor predates wide usage of the term 'transnational repression' but has received scholarly attention and in more recent research has been explicitly labelled as such (e.g. McSherry 2002; Lessa 2019). These episodes generally predate the time spans of the data sets described below, but there are advantages to analyzing historical cases of TR as archives and records may be available that illuminate details with more certainty (Yom 2022).
Research on TR has hitherto been driven by theoretical work analysed using qualitative cases and based primarily on semi-structured interviews and participant observations. This pioneering work has provided conceptual, theoretical, and empirical foundations for studying a previously under-researched topic. One new development has been the emergence of datasets of TR that attempt to aggregate cases into comparable units for further analysis. It is to those efforts that the next section turns.

Data advances: four TR datasets
This section describes four datasets available to researchers attempting to understand patterns of TR. The geographical and/or thematic scope, temporal scope, data sources, structure and extent of the data, and salient definitions and categories of each data gathering effort will be briefly identified. Each will be presented in the temporal order that they became available. Definitions vary between the datasets, but in general capture events consistent with Moss' definition of TR as 'attempts by regimes to punish, deter, undermine, and silence activism in the diaspora' (Moss 2022, 71).
Central Asian Political Exiles Database (CAPE). The CAPE database aims to measure patterns of TR carried out by the states of Central Asia (Uzbekistan, Kazakhstan, Kyrgyzstan, Turkmenistan, and Tajikistan) on their populations abroad. It describes and categorizes TR carried out by these five states from 1990 to 2018 against five categories of political exile and across three stages of intensity. Additionally, the database includes entries that characterize the types of physical violence experienced, arrests, notice, and gender characteristics.
The CAPE database identifies five categories of political exile which are observable in a Central Asian context: (1) former regime insiders and their family members; (2) members of opposition political parties and movements; (3) banned clerics and alleged religious extremists; (4) independent journalists, academics, and civil society activists; (5) others: businessmen, employees or relatives of political exiles. It excludes war criminals and individuals convicted of terrorist offences overseas in a court in a jurisdiction where a high standard of the rule of law is upheld, members of transnational clandestine groups, including proscribed terrorist organizations, and labour migrants subject to bureaucratic controls.
The database further identifies three stages of TR developed inductively from the study of Central Asian political exiles. Each indicates an escalation of action taken against the exile.
(1) Put on notice includes informal warnings and threats to individuals and intimidation of family members and formal arrest warrants, including Interpol notices, and extradition requests.
(2) Arrest and/or detention includes short-term and long-term periods of detention ordered by courts, irregular detention and detention without charge, and conviction either overseas on charges connected to political activity or, in absentia, at home.
(3) 'End game' includes a formal extradition to face torture and imprisonment, informal rendition or deportation often following release from detention, disappearance, serious attacks with an attempt to murder or disable, assassination attempt, assassination and other suspicious deaths.
During the period 1990-2018, the database records 278 political exiles (245 male, 33 female). The highest number of incidents is recorded from Uzbekistan (131 cases), followed by Tajikistan (68) and Turkmenistan (39). The most targeted category of exiles is banned clerics (32%), followed by opposition groups (28%), journalists and civil society groups (20%), and former regime insiders (10%). In terms of incident stages reached, 13% of recorded exiles were put on notice (stage 1), 47% were subject to arrest or detention (stage 2), and 41% to rendition and kidnapping (stage 3). Around half of those who reached stage 3, or about 21% of all recorded exiles, have been subjected to serious physical attacks.
Transnational Repression Database, Freedom House. The Transnational Repression Database compiled by Freedom House (FH) and released in February 2021 and updated in June 2022 documents incidents of TR that occurred around the world between January 2014 and December 2021 (see Gorokhovskaia and Linzer 2022). It captures cases of direct, physical coercion: successful and attempted assassination, assault, intimidation, rendition, detention, deportation, and disappearance.
Data come from a variety of public sources including reports produced by non-governmental and international organizations, legal documents, and journalistic accounts. In some cases, individual incidents were further investigated via interviews with relevant actors. However, the published database only contains 'public' cases where information about the targeted individual is already in the public domain to safeguard against bringing unwanted attention to victims of TR who wish to remain out of public view.
Identifying the influence of an autocratic state with absolute certainty is difficult, especially where the tactics are criminal or indirect. Each incident in the database was therefore coded according to the coder's confidence that it was state driven based on contextual evidence: an established campaign against the target or community, evidence of a political motivation to target the individual, or a mechanism by which the state can target the individual.
For each recorded incident, the following information is included: the country where it took place (host state), the state responsible (origin state), the name of the individual, date, and other information such as a connection to Interpol, the target's ethnic or religious identity and professional profile, and criminal accusations made against the targeted individual. These additional descriptive characteristics make it possible to illustrate the ways in which TR intersects with other transnational issues such as the global 'War on Terror', abuse of international law enforcement organizations by autocrats, and the often precarious legal status of asylum seekers.
The database includes 735 incidents of transnational repression committed by 36 origin states in 84 host states. The most active perpetrators of TR are China (229 incidents), Turkey (123 incidents), Tajikistan (43 incidents), Egypt (42 incidents), and Russia (41 incidents). Almost all origin states (35 of 36) use more than one tactic of TR, often repeatedly against the same targeted individual.
Authoritarian Actions Abroad Database (AAAD). The AAAD captures incidents of TR by authoritarian states between 1991 and 2019 (see Dukalskis 2021, 67-79). Information was gathered from publicly available media, NGO, and academic sources, including data sources like CAPE, mostly using key search terms in Google News and Lexis Advance UK to gather a corpus of articles for analysis. Most data gathering was done in English, but follow-up searches were completed in Arabic, Chinese, French, Korean, Turkish and Russian.
The AAAD contains information about cases in which authoritarian states threaten, threaten the family of, facilitate arrest or detention, attack, extradite, abduct, or assassinate their exiled critics. Unsuccessful extraditions, abductions, and assassinations are also recorded. Targets include journalists, activists, opposition members, former government officials, and a category of 'citizens' that captures people who may not be especially politically active abroad but are targeted because some aspect of their identity renders them politically threatening to the origin government.
It records 1,177 total incidents of TR between 1991 and 2019. The data is structured with the event as the unit of analysis, with most entries recording one state doing one thing to one individual in another state in a particular month and year. However, some events target multiple people, so the AAAD records the number of targets for each event. The 1,177 events, therefore, involve at least 2,585 people, not including the family members in the origin state who are repressed domestically because the state perceives their relative abroad to be threatening. The 'top 5' source countries by incident are Uzbekistan (195 events), China (167), North Korea (156), Turkey (111), and Russia (74). By target, the most common category is citizens (about 37% of cases), followed by activists (33%), journalists (14%), former government officials (10%), and opposition (6%). The most common actions were threats (about 19%), arrest/detention (18%), extradition attempts (17%), extradition (15%), and threatening the family of the person abroad (15%). Assassinations (4.4% of cases, or 52 incidents) and assassination attempts (2.6% of cases, or 30 incidents) were among the lowest totals but obviously deserve attention because they are the most consequential and violent events.
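The event-level structure described above can be illustrated with a minimal sketch. The field names and the toy rows below are hypothetical and are not taken from the AAAD itself; the sketch only shows why counting events and counting people give different totals when one event can have several targets.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One origin state doing one thing to target(s) in one host state.

    Field names are illustrative assumptions, not the AAAD's column names.
    """
    origin: str
    host: str
    action: str
    target_type: str
    year: int
    month: int
    n_targets: int = 1  # some events target more than one person


def people_involved(events):
    """Lower-bound count of people: sum of targets across events."""
    return sum(e.n_targets for e in events)


def action_shares(events):
    """Percentage of events (not people) falling under each action type."""
    counts = Counter(e.action for e in events)
    return {a: round(100 * n / len(events), 1) for a, n in counts.items()}


# Invented toy rows for illustration only.
toy = [
    Event("Origin A", "Host X", "threat", "activist", 2015, 3),
    Event("Origin A", "Host Y", "extradition", "citizen", 2016, 7, n_targets=12),
    Event("Origin B", "Host X", "threat", "journalist", 2017, 1),
    Event("Origin B", "Host Z", "assassination", "opposition", 2018, 9),
]

print(len(toy), "events involving at least", people_involved(toy), "people")
print(action_shares(toy))
```

With this layout, shares such as those reported above are computed over events, while the count of affected people is a separate, larger quantity.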
China's Transnational Repression of Uyghurs Dataset. The China's Transnational Repression of Uyghurs (CTRU) dataset contains incidents of TR conducted by the People's Republic of China (PRC) against citizens from the Xinjiang Uyghur Autonomous Region (XUAR) since 1997, when the first cases of rendition were publicly recorded. It adopted a multi-stage methodology to identify relevant cases. First, existing reports on the targeting of Uyghurs and others from the XUAR beyond China's borders were compiled, including reports by Amnesty International, Human Rights Watch, the OSCE, the World Uyghur Congress, and the Uyghur Human Rights Project. Second, these were supplemented with keyword searches on local news services and newswires. Third, further verification and additions were based on global datasets of transnational repression such as FH and the AAAD. Lastly, verification was done in conjunction with diaspora and advocacy groups. To establish the reliability of the data collection, the information was triangulated across multiple independent sources.
Included are those who were (a) located outside the territory of the PRC when targeted; (b) members of a non-Han ethnic group from the XUAR, including Uyghurs, Kazakhs, Kyrgyz, and Tajiks; (c) targeted by the government of China or its agents. Like the AAAD, the dataset is structured with the event as the unit of analysis, meaning in some cases individuals appear multiple times. Following CAPE, the degree of TR is measured on a 3-point ordinal scale. So far, Stage 1 incidents are not included in the dataset due to the vast scale of China's harassment of Uyghurs globally.
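The inclusion rules above can be read as a simple filter. The following is a minimal sketch under stated assumptions: the record fields are hypothetical, and the stage threshold reflects the statement that Stage 1 incidents are not yet included. It is not the CTRU team's own procedure.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical candidate case; field names are assumptions."""
    outside_prc: bool        # criterion (a): located outside PRC territory
    non_han_from_xuar: bool  # criterion (b): non-Han ethnic group from the XUAR
    prc_or_agents: bool      # criterion (c): targeted by the PRC or its agents
    stage: int               # 3-point ordinal scale following CAPE


def include(case):
    """Return True if the case meets criteria (a)-(c) and is stage 2 or 3."""
    if case.stage not in (1, 2, 3):
        raise ValueError("stage must be 1, 2, or 3")
    meets_criteria = (
        case.outside_prc and case.non_han_from_xuar and case.prc_or_agents
    )
    return meets_criteria and case.stage >= 2


# A stage 2 detention abroad would be included; a stage 1 warning would not.
print(include(Candidate(True, True, True, stage=2)))
print(include(Candidate(True, True, True, stage=1)))
```

Writing the rules as an explicit filter makes it easy to see which exclusion (location, group, perpetrator, or stage) removes a given candidate case.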
The dataset contains 7,106 cases of transnational repression from 1997 until November 2021 in 44 countries. This figure includes mass cases, where individual biographical details such as names could not be identified. Detailed biographical information is available for 524 of the cases. CTRU documents how China's transnational repression of Uyghurs has expanded rapidly in recent years. While it logs 238 incidents from 1997 to 2013, it records 6,868 events since 2014. A majority of cases have occurred in three regions: Europe (3,059 cases), Southwest Asia (India, Nepal, Pakistan, Afghanistan; 2,121 cases), and the Middle East and North Africa (773 cases).
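The counts reported for CTRU can be checked and converted to shares with a few lines of arithmetic. The sketch below uses only the figures quoted in the text; the variable names are illustrative.

```python
TOTAL_CASES = 7106

# Counts as reported in the text.
by_period = {"1997-2013": 238, "2014 onwards": 6868}
by_region = {
    "Europe": 3059,
    "India, Nepal, Pakistan, Afghanistan": 2121,
    "Middle East and North Africa": 773,
}

# The two periods should add up to the reported total.
assert sum(by_period.values()) == TOTAL_CASES


def pct(count, total=TOTAL_CASES):
    """Share of all recorded cases, in percent, to one decimal place."""
    return round(100 * count / total, 1)


period_shares = {k: pct(v) for k, v in by_period.items()}
region_shares = {k: pct(v) for k, v in by_region.items()}
listed_regions_share = pct(sum(by_region.values()))

print(period_shares)
print(region_shares)
print(listed_regions_share)
```

Because the total includes mass cases, these shares describe recorded cases rather than identified individuals.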

Macro & micro comparisons across datasets
With the basics of the data gathering procedures and parameters established, this section compares the datasets at the macro and micro levels. The previous section discussed each dataset separately and described the logic, scope, timespan, data structure, and sources of each. Table 1 compares these from a macro perspective. Given that international relations and comparative politics research explicitly focused on TR is relatively new, clarifying its data foundations at this early stage can help researchers make choices that suit their objectives. These four datasets each make distinct contributions to researchers attempting to understand TR, but the utility of each will depend on the research questions brought to them. Each effort has strengths, weaknesses, and trade-offs that individual researchers using the data will wish to consider. Figure 1 visually compares the temporal coverage for each dataset.
From a micro perspective, it is instructive to zoom in on individual entries to show how the data looks. We selected one case of TR from each dataset and provide the details underlying the entry. The aim is to show how the data is structured and what researchers can and cannot do with it. Although there is overlap, due to the geographic and temporal differences between the datasets, there is no one case or incident that appears in all four datasets. However, with the aim of showing similarities and differences most clearly, for the CAPE, AAAD, and FH datasets we present the data for the same case: Tajik exile Namujon Sharipov. Sharipov was an active member of the Islamic Revival Party of Tajikistan and left Tajikistan for Turkey after the country's Supreme Court declared the party a terrorist organization and banned its activities in the country (RFE 2018). On 16 February 2018, Tajik officials, with the apparent acquiescence of Turkish authorities, returned him from Istanbul to Tajikistan (Human Rights Watch 2018).
The entry for the case of Namujon Sharipov in each of these datasets is illustrated below in Tables 2-4.
The three datasets offer descriptive information about TR incidents from open sources. The individual entries contain information about the targeted individual, the country of origin responsible for carrying out the attack, the type of TR incident, the host country where the attack happened, and the year of the incident. However, despite these general similarities, there are important variations among the three datasets. Both FH and AAAD focus on incident type whereas CAPE focuses on individual exile cases. The datasets also vary according to their coverage of the TR target. The FH dataset offers several binary questions about the profile of the individual, which means that an exile can have more than one identity marker. For CAPE and AAAD, the identity is categorical (e.g. former regime insider) and mutually exclusive. FH includes data on criminal offenses of which the targeted exile is accused, even if the accusations are not made by a prosecutor or in court, as well as gender, which is also presented in CAPE but not in AAAD.
Table note (CAPE entry): The CAPE dataset is anonymised; however, for the present paper, to allow case comparison across different datasets, we have de-anonymised the data. Notes on coding: State of concern 702: Tajikistan; Gender 2: Male; CatExile 2: Member of opposition political parties; Worststageexperienced 3: Rendition/Kidnapping; Violence 0: Unknown; InterpolArrest 0: Unknown.
In terms of TR incidents, in comparison to AAAD and FH, CAPE offers disaggregated information about the type of physical violence experienced by the targeted individual as well as information about Interpol Red Notices. In AAAD and FH the information about physical violence is aggregated under the 'incident type' or 'action from origin country'. Finally, a major difference lies in the presentation of the information: the CAPE dataset is numerically coded and fully anonymised, which means that users will need the codebook to read it, whereas this is not the case for AAAD and FH. Additionally, data anonymisation in CAPE does not allow one to search for an individual, while the AAAD, FH, and CTRU datasets make it possible to check specific cases.
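Because CAPE entries are numerically coded, reading one requires a lookup against the codebook. A minimal sketch of that step follows. The lookup table contains only the codes quoted in the coding notes for the Sharipov entry; it is a partial, illustrative stand-in for the CAPE codebook, and the function name is hypothetical.

```python
# Partial lookup built only from the coding notes quoted for the Sharipov
# entry. All other codes would have to come from the actual CAPE codebook.
CODEBOOK = {
    "State of concern": {702: "Tajikistan"},
    "Gender": {2: "Male"},
    "CatExile": {2: "Member of opposition political parties"},
    "Worststageexperienced": {3: "Rendition/Kidnapping"},
    "Violence": {0: "Unknown"},
    "InterpolArrest": {0: "Unknown"},
}


def decode(record):
    """Translate numeric codes to labels, flagging codes not in the lookup."""
    return {
        field: CODEBOOK.get(field, {}).get(code, f"uncoded ({code})")
        for field, code in record.items()
    }


sharipov = {
    "State of concern": 702,
    "Gender": 2,
    "CatExile": 2,
    "Worststageexperienced": 3,
    "Violence": 0,
    "InterpolArrest": 0,
}

decoded = decode(sharipov)
for field, label in decoded.items():
    print(f"{field}: {label}")
```

Codes that are missing from the lookup are returned as "uncoded" rather than silently dropped, so gaps in a partial codebook remain visible.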
The CTRU dataset contains targets from Xinjiang. Here we provide an example of the well-known case of Dolkun Isa. He is a prominent Uyghur political leader based in Germany. He has been the target of several TR attempts by China. The data on his case from the CTRU is captured in Table 5.
As the most recently constructed of the four, CTRU applies a similar logic to the abovementioned datasets but differs in some respects according to its research goals. It follows CAPE's three stages of incident level and reports information about the individual's profile, including age, citizenship, and profession. It further provides information about country of origin, year of incident, and the host state. By contrast to AAAD, CAPE and FH, the information presented in CTRU is uncoded and more textual, which may be an advantage to researchers who wish to compare details across TR cases.
These macro and micro comparisons, along with the basics of data collection for each effort, illustrate the data available to researchers. They also show the different structures of the information and the uses to which it could be put. Over time these data gathering efforts can be refined with verification procedures, merged in part or in whole to give a comprehensive picture of TR in one source, or cross-referenced with topic-specific data sets that may also contain relevant events, like academic freedom indices or data sources from refugee studies. However, there are challenges for data gathering on transnational repression, a topic to which the next section turns.
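Cross-referencing entries across datasets could begin with a coarse matching key. The sketch below is one possible approach under stated assumptions: the field names (origin, host, year) are hypothetical, the records are toy rows, and any match it returns is only a candidate for manual review, not a confirmed duplicate.

```python
from collections import defaultdict


def match_key(record):
    """Coarse key: normalized origin state, host state, and year."""
    return (
        record["origin"].strip().lower(),
        record["host"].strip().lower(),
        record["year"],
    )


def candidate_matches(*datasets):
    """Group records by key; keep keys that appear in more than one dataset.

    Each argument is a (dataset_name, list_of_records) pair.
    """
    buckets = defaultdict(list)
    for name, records in datasets:
        for record in records:
            buckets[match_key(record)].append((name, record))
    return {
        key: hits
        for key, hits in buckets.items()
        if len({name for name, _ in hits}) > 1
    }


# Toy rows; the first pair mirrors the 2018 Tajikistan-Turkey case in the text.
fh_rows = [{"origin": "Tajikistan", "host": "Turkey", "year": 2018}]
aaad_rows = [
    {"origin": "tajikistan ", "host": "Turkey", "year": 2018},
    {"origin": "Origin A", "host": "Host B", "year": 2005},
]

matches = candidate_matches(("FH", fh_rows), ("AAAD", aaad_rows))
for key, hits in matches.items():
    print(key, [name for name, _ in hits])
```

A coarse key like this will over-match when an origin state targets several people in the same host state and year, which is why the output is best treated as a shortlist for manual verification.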

Challenges in gathering and constructing TR data
Researchers may wish to move beyond these data gathering efforts to construct their own databases of TR. Some research questions may require data with different geographic, thematic, or temporal emphases, for example. This section outlines some of the challenges that may confront researchers collecting this sort of data in the hope that they may be overcome or at least mitigated. These can be categorized into three main areas and discussed in turn: definitional, empirical, and ethical.
Definitional challenges revolve around the context and intent of the action. Origin states for TR often use the cover of terrorism or corruption to target their critics abroad. This makes it difficult to disentangle legitimate law enforcement operations aimed at reducing violence or financial crimes from disingenuous cases of regime critics being intentionally discredited by origin states. The cloak of legality can create the appearance of procedural legitimacy for TR, which makes the researchers' job more cumbersome. It raises the possibility that some TR may be legal according to the strict letter of the law even if the underlying motivation is political repression.
Definitional issues are also exacerbated when host states have policies that increase the precarity of migrants or asylum-seekers. An incident in which a political exile is detained or deported in a democratic host state may be a potential case of TR, but on closer inspection it may be the result of host state policies operating 'as intended' without origin state interference. In such scenarios, it is the prevailing norms of strict migration systems in host states that ensnare political exiles rather than any specific intervention by the origin authoritarian state. Yet the result may still be that a political exile is returned to the authoritarian origin state. Finally, some exiles may return to the origin state voluntarily before being repressed domestically. If they were coerced or tricked into returning, this could be categorized as TR, but coercion is often difficult to determine with certainty. Ultimately, as in other areas of contested definitions, researchers gathering data on transnational repression must use their best judgement when applying definitions to cases, explain their choices, and be willing, if feasible, to revise data as new information arises.
[Table excerpt, example case entry] Arrested by a special unit of the Italian police in advance of a press conference that he would be speaking at, at the behest of the Chinese government. They took his photo and fingerprints, and released him that afternoon after intervention by German authorities. In 1988, he had been involved as a leader of the Uyghur democratic students' demonstration, and as a result he was placed under house arrest and was kicked out of university. In 1994, he left the country due to threats against him. Dolkun applied for refugee status in Germany, which was granted in 1996.
Empirical challenges are inherent to the study of TR. They cut in two directions: data scarcity and data abundance. In terms of scarcity, many of the actions involved are designed to be hidden and responsibility for them intentionally obfuscated. In extreme cases like assassinations, host governments may spend huge sums of money to investigate an incident and determine culpability, a luxury unavailable to academic researchers. In cases of threats to an exiled dissident's family in the origin state, the repression may never be publicly reported and therefore never recorded in datasets. Interpol abuse is a problem regularly reported by activists and victims, which researchers have also been able to document on a case-by-case basis. But Interpol itself provides no transparent statistics about how its system is used or abused (such as the number of notices withdrawn under appeal), making the problem difficult to quantify. Additionally, incidents involving Interpol are not consistently reported by the people involved.
Because of the challenges of collecting information on TR in authoritarian contexts, most analysis relies on open-source data such as publicly available reports and media outputs. While open-source data provide an important tool for examining TR, they have several limitations, which are well documented in the quantitative literature on state repression (e.g. Davenport and Ball 2002; Weidmann 2016; Fariss, Kenwick, and Reuning 2020). Information collected from the media may contain inaccuracies, and sources may conflict. Reporting may be subject to authoritarian state censorship and disinformation, which may skew findings toward cases involving democratic host countries, where incidents are more likely to be reported. It may be biased or partial because it is often collected in English and not always in the language of the origin country. Wealthier, more digitally connected targets may be over-represented in media or NGO reports about TR (Yom 2022). Given the intentionally hidden nature of much TR, there are vast realms of it that we will probably never be able to observe and categorize. Over time, and as the scholarly area grows, it may be possible to apply methodological fixes along the lines of Fariss, Kenwick, and Reuning (2020), or to cross-reference an in-depth sample with more general datasets to explore and rectify limitations. Technological tools like web scraping or machine learning applied to social media output may capture cases that would otherwise go unreported, although this method presents challenges in terms of verifying TR cases (for a machine learning application to human rights reports on domestic repression, see Cordell et al. 2022).
Data abundance also creates empirical challenges. Online threats, spyware and surveillance, or 'trolling', for example, can be so ubiquitous in the lives of many activists or critical journalists that recording them all would yield millions of data points (see Moss 2018; Michaelsen 2018). Likewise, some states have attempted to repatriate those abroad en masse (e.g. Human Rights Watch 2018). At the incident level, both examples make it difficult to construct commensurable cases and would therefore yield data of unclear utility. Some of these data challenges are inherent and can only be mitigated by researchers being clear about their choices. Others are in principle answerable with better analytic tools, such as the possibility of monitoring online threats using web scraping and related methods.
Ethical issues also arise in this area. The example of families of exiles being threatened in the origin state is instructive. Some of those abroad may not want to publicize the case because doing so may put their family at greater risk. Other exiles, by contrast, may wish to publicize this very same issue to draw attention to their family's case and to pressure the origin government to release them. Datasets that rely on information not previously publicly available must take care to gather and present their data in ways that do not jeopardize the safety or integrity of TR victims. In such instances, research ethics and data protection rules allow a public interest exception. Named cases of individuals and groups allow lawyers, human rights activists, and public authorities to observe patterns of transnational repression against certain groups and by certain states. However, one ethical challenge that researchers might face when using information not already in the public domain is the issue of consent. In the context of political exiles, obtaining explicit consent from individuals to publish their data can be difficult or impossible. There is a tension here between the protection of individuals' data via anonymisation and the public interest in non-anonymised databases that expose patterns of rights violations. The European Union's General Data Protection Regulation (GDPR) allows for such exemptions on public interest grounds. CAPE, for example, has sought to balance these two ethical imperatives by anonymising its data for archiving after the end of a project during which it provided non-anonymised data to rights organizations and lawyers to defend human rights.
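To make the idea of anonymising records for archiving more concrete, the following is a minimal sketch in Python of one possible approach, keyed pseudonymisation. It is not a description of CAPE's actual procedure. The field names, the secret key, and the placeholder record are hypothetical. The sketch replaces a name with a keyed hash, so that repeated incidents involving the same person remain linkable, and drops free-text fields that may contain identifying detail.

```python
import hashlib
import hmac


def pseudonym(name, secret_key):
    """Map a name to a stable ID using a keyed hash (HMAC-SHA256).

    The key must be kept by the research team and never archived with the data.
    """
    normalised = name.strip().lower().encode("utf-8")
    digest = hmac.new(secret_key, normalised, hashlib.sha256).hexdigest()
    return "P-" + digest[:12]


def anonymise(record, secret_key, drop_fields=("name", "description")):
    """Return a copy of the record with the name replaced and free text dropped."""
    archived = {k: v for k, v in record.items() if k not in drop_fields}
    archived["person_id"] = pseudonym(record["name"], secret_key)
    return archived


# Invented placeholder record and key, for illustration only.
key = b"hypothetical-project-key"
record = {
    "name": "Example Person",
    "origin": "Origin A",
    "host": "Host B",
    "year": 2017,
    "description": "free-text account of the incident",
}
archived = anonymise(record, key)
```

Pseudonymisation of this kind is weaker than full anonymisation. In small exile communities, the combination of origin state, host state, and year may be enough to re-identify a person, so researchers may also need to coarsen or withhold such fields depending on the risk to those involved.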

Conclusion
This research note has highlighted recent advances in data related to TR, a newly salient tool in the foreign policy toolkit of contemporary authoritarian states. It has compared four data gathering projects in terms of geographical and temporal scope, key definitions, sources, and data structure. The aim was both to alert researchers in global politics and foreign policy analysis to the availability of these resources and to discuss the challenges and possibilities in the area as researchers continue to gather data on TR. States will not stop repressing their exiled dissidents, so careful attention to the issue, built on clear and replicable data gathering procedures, will continue to be an important task for researchers in this area. In this conclusion we briefly flag some potential next steps.
Most data collection efforts to date have focused on identifying affected individuals, cataloguing incidents, and categorizing the physical tactics of TR. In future, researchers may wish to widen the analytical lens to include global aspects of TR and even supplement the focus on authoritarian states with analysis that includes democracies. Moving beyond individual cases, researchers can turn their attention to inter-state relations that facilitate the targeting of individual political exiles or even whole populations.
There is more scope to understand the dynamics of state cooperation. Sometimes this cooperation is accomplished via established informal practices of forced return, as is the case with North Koreans who flee to China (Fahy 2019), or through official agreements to repatriate asylum seekers en masse. This type of bilateral cooperation between states falls within the definition of transnational repression but is difficult to capture because it affects groups of people who are not individually identified and often have no obvious link to political activism. These people are perceived to be opponents of the regime simply because they have left its territory (Fahy 2019, 131; Dukalskis 2021, 68). Historical cases like the aforementioned Operation Condor provide rich material to advance understanding of state cooperation (McSherry 2002; Lessa 2019; Yom 2022).
Moving toward more policy-oriented research would mean switching the focus of analysis from the actions of authoritarian origin states to the responses of states that host political emigres, exiles, and refugees. Although the problem of TR has gained more traction among American and European policymakers, we are aware of no academic studies that systematically or comparatively examine available policy responses, their implementation, or their efficacy (although Gorokhovskaia and Linzer 2022 examine the policy responses of nine host countries in the areas of security, migration, and foreign policy). Beyond government policy, some non-governmental organizations such as Front Line Defenders or Protection International offer training and resources to transnational dissidents. From a research standpoint there is room for more systematic comparisons of the effectiveness of both government policies and the self-help methods that exiles themselves use to counter TR.