Asking sensitive questions using the unmatched count technique: Applications and guidelines for conservation

Researchers and practitioners are increasingly using methods from the social sciences to address complex conservation challenges. This brings benefits but also the responsibility to understand the suitability and limitations of these methods in different contexts. After years of use in other disciplines, the unmatched count technique (UCT) has recently been adopted by conservation scientists to investigate illegal and socially undesirable human behaviours. Here we provide guidance for practitioners and researchers on how to apply UCT effectively, and outline situations where it will be the most and least appropriate. We reviewed 101 publications in refereed journals that used UCT to draw conclusions on its use to date and provide recommendations on when and how to use the method effectively in conservation. In particular, we explored: type of studies undertaken (e.g. disciplines; behaviour being studied; rationale for using UCT); survey administration (e.g. sample size, pilot studies, administration mode); UCT outcomes (e.g. type of analyses, estimates, comparison with other methods); and type of recommendations. We show that UCT has been used across multiple disciplines and contexts, with 10 studies that focus on conservation and natural resource use. The UCT has been used to investigate topics falling into five categories: socially undesirable behaviours, socially undesirable views, illegal or non‐compliant behaviours, socially desirable behaviours, and personal topics (e.g. being HIV positive). It has been used in 51 countries and is suitable in a range of situations, but limitations do exist, and the method does not always improve reporting of sensitive topics. We provide best‐practice guidance to researchers and practitioners considering using UCT. We highlight that alternative methods should be considered if sample sizes are likely to be small, the behaviour in question is likely to be extremely rare, or if the behaviour is not particularly sensitive.
UCT can be a useful tool for estimating the extent of non‐compliance within a conservation context, but as with all scientific investigation, careful study design, robust sampling and consistent implementation are required in order for it to be effective.


| INTRODUCTION
Researchers and practitioners working on complex conservation challenges are increasingly encouraged to adopt social science methods to better understand human behaviour (Bennett et al., 2016; John, Keane, & Milner-Gulland, 2013). The creation of an interdisciplinary toolbox not only brings benefits, but also a responsibility to understand the requirements, feasibility and potential limitations of methods new to our field. Both qualitative and quantitative social science methods have proven valuable for investigating human dimensions of conservation. However, traditional methods face limitations when researching socially sensitive topics (Tourangeau & Yan, 2007). When asked directly, participants may conceal their true attitudes, beliefs or behaviours if the behaviour in question is illegal (Warner, 1965), or may temper their answers to appear more socially acceptable (social desirability bias), especially if data collection is observed by third parties. For example, teenagers interviewed with their parents present were more likely to deflate their smoking activity, compared to those surveyed privately (Tourangeau & Yan, 2007).
Perceived invasions of privacy, distrust of the interviewer and their research intentions, or fear of reprisal may also lead respondents to refuse to answer questions altogether (refusal bias) (Tourangeau & Yan, 2007). Within conservation, these issues are particularly pertinent when researchers work in remote, rural communities, where levels of literacy may be low, power-relations prevalent and distrust of outsiders, foreigners and authorities high (e.g., Razafimanahaka et al., 2012). In such situations, researchers may not be perceived as neutral, due to associations with non-governmental organizations or government agencies, which may exacerbate topic sensitivity.
In recognition of the biases associated with direct questioning about sensitive or stigmatizing topics, social scientists developed specialized questioning approaches, known as indirect questioning techniques, with the aim of providing respondents with greater levels of privacy and anonymity (Chaudhuri & Christofides, 2013). The most well-known is the randomised response technique (RRT), proposed by Warner (1965), but many other techniques now exist (Nuno & St. John, 2015). One such method is the unmatched count technique (UCT), which was developed to investigate topics such as racism (Kuklinski, Cobb, & Gilens, 1997). Since an initial application in 2013 (Nuno, Bunnefeld, Naiman, & Milner-Gulland, 2013), UCT has increasingly been used to understand sensitive conservation topics (e.g. illegal wildlife trade: Hinsley, Nuno, Ridout, St. John, & Roberts, 2017) with five studies deploying UCT in 2017 alone (see Appendix S1).
Originally termed the "block total response" method (Raghavarao & Federer, 1979), and also commonly known as the item count technique or list experiment (Glynn, 2013), UCT involves randomly assigning individuals to two groups: control and treatment. The control group receives a list of non-sensitive statements or "items" whilst the treatment group receives the same list of innocuous items, along with a sensitive item. Individuals in both groups are asked to indicate how many, but not which, items apply to them (Figure 1). Prevalence is estimated by calculating the difference in means between the two groups (Droitcour et al., 1991): p = mean(treatment group) − mean(control group), where p is the proportion of participants engaged in the sensitive behaviour.
Respondents never reveal having the sensitive characteristic as long as they report a value lower than the total number of items on the treatment list. However, secrecy is removed if someone reports possessing all characteristics in the treatment list, meaning that they may under-report their true answer to avoid admitting directly to the sensitive item ("ceiling effects"). Conversely, if someone only possesses the sensitive characteristic, they may over-report the number of non-sensitive characteristics they possess, to conceal their answer ("floor effects") (Zigerell, 2011). Ways to minimize these issues are discussed later in this paper.
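The difference-in-means estimator described above, together with its standard error, can be sketched in a few lines of Python. The counts below are fabricated purely for illustration:

```python
from statistics import mean, variance

def uct_prevalence(control_counts, treatment_counts):
    """Estimate sensitive-behaviour prevalence from unmatched count data.

    control_counts:   item counts reported by the control group
    treatment_counts: item counts reported by the treatment group
    Returns (estimate, standard_error) using the difference-in-means
    estimator; the SE is that of a difference between two independent means.
    """
    p = mean(treatment_counts) - mean(control_counts)
    se = (variance(treatment_counts) / len(treatment_counts)
          + variance(control_counts) / len(control_counts)) ** 0.5
    return p, se

# Fabricated data: number of applicable items reported per respondent
control = [1, 2, 2, 3, 1, 2, 2, 1, 3, 2]
treatment = [2, 2, 3, 3, 2, 2, 3, 2, 3, 2]
p, se = uct_prevalence(control, treatment)  # estimate is 0.5
```

Note that with such a tiny fabricated sample the standard error dwarfs the estimate, which is precisely why UCT demands large sample sizes in practice.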
Interest in using UCT has grown in conservation (see Appendix S1) as researchers trying to understand the prevalence of illegal or otherwise sensitive behaviours looked beyond their field for appropriate methods. In other disciplines, UCT studies were producing higher prevalence estimates of sensitive behaviours than both direct questioning (Tsuchiya, Hirai, & Ono, 2007) and other indirect methods, such as RRT (Coutts & Jann, 2011). Furthermore, UCT was promoted as easy to administer; participants simply state how many items on a pre-prepared list apply to them (Dalton, Wimbush, & Daily, 1994). Indeed, the first conservation study to use UCT demonstrated that it can be adapted for use in areas of limited literacy, where respondents reported being comfortable with UCT as a questioning technique (Nuno et al., 2013).
However, UCT has limitations. It is unsuitable for very rare behaviours due to its lower precision, and requires large sample sizes (Ulrich, Schröter, Striegel, & Simon, 2012). For example, a UCT survey of >1,000 households surrounding the Serengeti, Tanzania, returned an estimate of hunting prevalence with a ±5% SE (Nuno et al., 2013) whilst an online questionnaire of 814 orchid growers estimated a smuggling prevalence with a ±6% SE (Hinsley et al., 2017). Depending on specific research questions and research-user needs (e.g. monitoring behaviour prevalence over time), this uncertainty might undermine the use of this information to guide decisions, meaning trade-offs between accuracy and precision must be carefully considered.
Stimulated by a need to increase method efficiency, variations of UCT have emerged, including the double-list UCT (Glynn, 2013) and the single sample count (Petroczi et al., 2011) (Figure 2). In the double-list UCT, participants act simultaneously as control and treatment groups by answering two lists, one of which always has the sensitive statement. Again, respondents are randomly allocated into two groups but on this occasion, half receive Control A and Control B + sensitive item, whilst the remainder view Control B and Control A + sensitive item (see Appendix S2).
Estimates of the sensitive characteristic are averaged across the two groups to derive its prevalence (Droitcour et al., 1991; Glynn, 2013): p = (pA + pB)/2, with px = mean(treatment group x) − mean(control group x) for each list x, where p is the proportion of participants engaged in the sensitive behaviour.
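Because each respondent serves as control for one list and treatment for the other, the double-list estimate is simply the average of the two single-list differences. A minimal sketch with fabricated counts:

```python
from statistics import mean

def double_list_prevalence(ctrl_a, trt_a, ctrl_b, trt_b):
    """Double-list UCT estimator: group 1 answers list A as control and
    list B plus the sensitive item; group 2 the reverse. The two
    difference-in-means estimates are averaged to give the prevalence."""
    p_a = mean(trt_a) - mean(ctrl_a)  # estimate from list A
    p_b = mean(trt_b) - mean(ctrl_b)  # estimate from list B
    return (p_a + p_b) / 2

# Fabricated counts for illustration
p = double_list_prevalence(ctrl_a=[2, 1, 2, 3], trt_a=[2, 3, 2, 3],
                           ctrl_b=[1, 2, 1, 2], trt_b=[2, 2, 2, 2])
```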
The single sample count (Petroczi et al., 2011) utilizes existing data on population prevalence of innocuous characteristics (e.g. birth month or final digit of telephone number). This approach avoids the need to estimate the prevalence of non-sensitive items, meaning all participants are asked the sensitive question (see Appendix S2). The prevalence estimate from single sample count data is then calculated as: p = λ/n − b, where p is the estimated prevalence of the sensitive item, λ is the observed number of "yes" answers, n is the sample size and b is the population mean value of responses for the baseline non-sensitive questions. However, there may be challenges in locating data on non-sensitive characteristics that complement sensitive topics of conservation interest.
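Reading λ as the total number of "yes" answers across the sample and b as the expected count contributed by the baseline items per respondent (e.g. 2.0 for four items each with a known 50% prevalence), the single sample count estimator reduces to λ/n − b. A minimal sketch under that reading, with fabricated numbers:

```python
def single_sample_count_prevalence(total_yes, n, baseline_mean):
    """Single sample count estimator (after Petroczi et al., 2011): every
    respondent reports one total count over baseline items of known
    prevalence plus the sensitive item.

    total_yes:     sum of all reported counts across respondents (lambda)
    n:             number of respondents
    baseline_mean: expected count from the baseline items alone (b)
    """
    return total_yes / n - baseline_mean

# Fabricated example: 500 respondents, four 50%-prevalence baseline items,
# so b = 2.0; a total count of 1075 implies a mean of 2.15 per respondent
p = single_sample_count_prevalence(total_yes=1075, n=500, baseline_mean=2.0)
```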
Again, drawing inspiration from UCT, Trappman, Krumpal, Kirchner, and Jann (2013) proposed the item sum technique which enables researchers to estimate quantities of sensitive activities (e.g. number of hours engaged in undeclared work). Respondents randomly assigned to the control group respond to a question such as "How many hours did you spend in village meetings during the last 3 months?" whilst the treatment group answers the control and a sensitive question (e.g. "How many hours did you spend hunting in the last 3 months") simultaneously. Ideally, but not essentially, activities should be measured on the same scales (e.g. hours: Trappman et al., 2013). The quantity of the sensitive act is estimated by computing the mean difference of answers between control and treatment groups.
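The item sum estimator mirrors the basic UCT calculation, but on quantities rather than counts of items. A sketch with fabricated hours:

```python
from statistics import mean

def item_sum_quantity(control_totals, treatment_totals):
    """Item sum technique (Trappman et al., 2013): the control group reports
    a quantity for an innocuous activity only (e.g. hours in village
    meetings); the treatment group reports the combined total of the
    innocuous and the sensitive activity (e.g. meetings + hunting).
    The mean difference estimates the average quantity of the sensitive act."""
    return mean(treatment_totals) - mean(control_totals)

# Fabricated hours: meetings only (control) vs meetings + hunting (treatment)
hours = item_sum_quantity(control_totals=[4, 6, 5, 5],
                          treatment_totals=[9, 8, 10, 9])  # estimate: 4 hours
```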
Given the growing interest in UCT amongst conservation researchers and practitioners (Appendix S1), it is essential to acknowledge and understand the uncertainty about if and when UCT provides better estimates. This is especially important given the economic and temporal expense involved in conducting UCT surveys, particularly given demanding sample sizes. We therefore conducted a systematic review of all empirical applications of UCT in peer-reviewed literature, which aimed to: 1. Define the scope of subjects to which UCT has been applied; 2. Identify trends in design, implementation and analysis methods; and 3. Identify the reported success of the method, and the challenges encountered.
We use this information, and our own experience, to provide best practice guidelines to conservationists, highlighting when to use UCT, potential pitfalls and robust study design tips. Ultimately, this will provide a better understanding of the challenges and opportunities for employing UCT, allowing more critical and robust applications of this method for improving data collection about sensitive behaviours, and informing conservation decisions.

[Figure caption fragment: "…infractions (Hinsley et al., 2017). Respondents were randomly assigned to either the control or treatment group and answered using a drop-down list."]

| MATERIALS AND METHODS
We searched Web of Science and Scopus using the keywords "unmatched count technique," "item count technique," "list experiment," "list response technique," "list method," "list randomization" or "block total response method." At the search date (April 14, 2018), this set of criteria identified 661 papers. All titles and abstracts were read by AN, who assessed if each paper potentially used a form of UCT; 121 studies met these criteria. In order to ensure inclusion of recent conservation publications, an additional search in Google Scholar was undertaken; another five papers were identified and included because they specifically dealt with UCT in conservation. A total of 126 articles were thus transferred to the following step.
These 126 papers were randomly assigned for review by AH or AN. For each paper, we recorded: discipline/field; behaviour(s) being studied; rationale for using UCT; location; spatial scale; survey administration (e.g. sample size, pilot study, administration mode and material); and survey outcomes and conclusion (e.g. type of analyses, estimates, comparison with other methods) (see Appendix S3 for full details). Of 126 papers, 14 (11.1%) were theoretical only and/or only used data reported in other studies, six (4.8%) used non-directly comparable methods (e.g. item sum technique), two (1.6%) applied UCT with no sensitive statement to test baseline list design, two (1.6%) could not be accessed, even after contacting authors, and one (0.8%) was not written in English; thus, these 25 papers (19.8%) were excluded from further analysis. The remaining 101 papers were then analysed and summarized (see Appendix S4 for all reviewed papers).
To investigate potential effects on binary variables (e.g. effects on likelihood of checking design assumptions or not), GLMs with binomial error distribution and a logit link were fitted. In order to account for the quantitative nature of the information but without making assumptions about the distance between ordered categories, an ordered logistic regression was used to assess trends in the use of increasingly complex statistical approaches over time.
In the final 101 papers, authors often used multiple UCT lists to explore, for example, different behaviours or different modes of survey administration. To explore the potential effectiveness of UCT, we focused on the 73 studies comparing UCT to direct questions, which had a total of 229 separate UCT lists. Using an ordered logistic regression, UCT outcomes measured in terms of "success" (i.e. increase in social desirability bias, no significant difference, or reduction in bias when compared to direct questioning) were analysed as a function of key study characteristics (administration mode, number of control statements, match in topic of statements, design assumptions checked and pilot conducted) to explore potential predictors of UCT effectiveness.

| Types of studies
We found that UCT has been applied to a wide range of topics, including abortion, anti-immigration, plagiarism and voting. In studies related to natural resource use, all topics are related to noncompliance with conservation regulations (e.g. illegal fishing, hunting). The 101 reviewed studies justified the use of UCT due to their topic(s) of interest being in one or more of the following categories: socially undesirable behaviours (e.g. promiscuity) (n = 38 papers); non-compliant/illegal behaviours (e.g. smuggling) (n = 26); socially undesirable views (e.g. racism) (n = 21); socially desirable behaviours (e.g. recycling) (n = 13); and personal topics linked to possessing a socially stigmatised characteristic (e.g. being HIV positive) (n = 7).
The most frequent study fields were political science (n = 41), sociology (n = 12) and health (n = 10). Ten studies specifically related to biodiversity conservation or natural resource management. Other fields included: statistics (n = 9), development studies (n = 7), psychology and organization studies (n = 4 each) and migration studies and veterinary science (n = 2 each). Studies were conducted in 51 countries, with the highest number of studies in the USA (n = 41), and few or no studies in Asian and African countries (Figure 3). The majority of studies were at the national level (n = 55), followed by regional (n = 28), local (n = 10) and international (i.e. multiple countries; n = 7). No studies used UCT to study changes in behaviour prevalence over time.

| UCT implementation among surveyed studies
Most studies were administered face-to-face (n = 42) or online (n = 31); other modes of administration included: phone (n = 11) and self-administered questionnaires on paper or digital devices (n = 8), whilst six studies deployed multiple modes. Three did not report how surveys were administered. Most studies used a list of statements read by/to participants; only 6.9% (n = 7) used pictures and one study provided a counting device (stones) to help participants count the number of items that applied to them. Three studies explicitly mentioned using a training question before the actual UCT. Nine studies used the double-list design. Baseline lists included between 2 (n = 1) and 7 (n = 2) control items, with the majority of studies using 3 (n = 40) or 4 (n = 46). Ideally, baseline items should match the subject of the sensitive item so that they do not stand out (Glynn, 2013), which was true in 68.3% (n = 69) of the studies.
Twenty-nine (28.7%) studies explicitly mentioned piloting, but this information was often brief and/or placed in appendices, so it is possible that more studies pretested in some way without mentioning this in the manuscript. Pilot studies were used to: identify appropriate non-sensitive items for control lists; obtain baseline prevalence rates for non-sensitive items to avoid using those which were either too rare or too common; estimate correlation between items to reduce variability; and refine wording and order of questions.
The number of survey participants receiving UCT questions ranged from 50 to 24,020 (median = 1,000, lower quartile = 562, upper quartile = 1,605). Surprisingly, the sample size was not reported for two studies. We expected that face-to-face studies would have limited sample sizes due to the amount of resources they require, but we found face-to-face studies with sample sizes ranging from 50 to 13,686 respondents.

| Outcomes and conclusions from surveyed studies
After data collection, 30.7% (n = 31) of studies checked at least some UCT design assumptions, such as randomization or design effects. Overall, 73 studies compared UCT to direct questioning, with 10 studies comparing directly to at least one other indirect questioning approach, such as RRT (Figure 4). Across these 73 studies, 229 separate UCT question lists were asked, ranging from 1 list (n = 61) to 16 lists (n = 1) per study. When compared to estimates from direct questioning, online, phone and self-administered surveys were less likely than face-to-face surveys to result in successful UCT applications (i.e. to reduce bias rather than have no significant effect or increase bias) (Table 1; Figure 4).

| Recommendations for implementing UCT
Some surveyed studies (n = 27) provided specific recommendations for future implementation of UCT (Table 2).

| Steps to designing an effective UCT
As with all research methodologies, the quality of UCT data depends on careful study design and implementation; the steps below summarize the key decisions.

| Step 1. Sensitive item definition
One of the first choices that must be made is exactly how the sensitive item is to be defined. Sensitive items in conservation are typically an action (e.g. hunting wild animals), but the level of specificity (e.g., "hunting" or "setting wire snares"; whether the target is a specific species or a higher taxonomic group) and time-scale (e.g., "in the last month" or "in the last year") may vary according to study needs. It is also important to consider whether respondents are likely to interpret the chosen definition consistently.

| Step 2. Selection of non-sensitive items
The next step is to choose a set of non-sensitive, control list items.
Although conceptually straightforward, selecting control items raises a series of subtle challenges which are not always fully appreciated.
Ideally, a long list of potential items should be piloted to select the final list (see Hinsley et al., 2017). Deciding on the number of non-sensitive items is important; typically, applications of UCT in the literature use three to four control items. While shorter lists place less cognitive burden on respondents and provide greater statistical precision, they must be carefully designed to avoid ceiling and floor effects, in order to protect individual respondents (Zigerell, 2011). Longer lists naturally mitigate these risks, but at the cost of lower statistical power. Another way to minimize these risks is to avoid using items that are likely to have very high or very low prevalence in your sample. However, increasing the overall level of variation in responses can also reduce statistical precision (Glynn, 2013). Consequently, the best current advice is to choose a set of control items that includes at least one pair of negatively associated items; that is, a respondent saying yes to one item will be very likely to say no to the other (Glynn, 2013). For example,

Hinsley et al. (2017) used a pilot study to measure association between candidate control statements to ensure that some included in the final design were negatively correlated (e.g. "I have never been to an orchid show" and "I have won awards for my orchids"). If that is not possible, a mixture of high and low prevalence items should be used (Tsuchiya et al., 2007). Prevalence and association between different potential items can be determined using pilot data. Ideally, the control items should be reasonably familiar to the respondent and sufficiently similar in nature and specificity to the sensitive item so that it does not stand out (Droitcour et al., 1991; Kuklinski et al., 1997). For example, Nuno et al. (2013) estimated hunting prevalence by presenting it amongst other livelihood strategies. However, it is also important to avoid control items that might themselves be subject to bias, for example by being sensitive in some unanticipated way, or socially desirable and therefore prone to positive exaggeration.
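Screening candidate control items for negative association can be done directly from pilot data; a minimal sketch using a plain Pearson correlation, with fabricated 0/1 pilot responses whose item names echo the orchid example above:

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length lists of 0/1 pilot
    responses (1 = item applies to the respondent)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Fabricated pilot responses, anti-correlated by construction
never_been_to_show = [1, 1, 0, 0, 1, 0, 1, 0]
won_awards         = [0, 0, 1, 1, 0, 1, 0, 1]
r = pearson(never_been_to_show, won_awards)  # strongly negative
```

In a real pilot the correlations would of course be weaker; the point is simply to rank candidate pairs and retain at least one clearly negative pair in the final list.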

| Step 3. Pretesting and piloting
Piloting and pretesting is an essential part of a UCT project, but many empirical papers did not mention this stage, or give details of pilot studies that would enhance replicability of the experiment. The need to balance respondent secrecy, statistical precision and ease of comprehension means that researchers must have an excellent understanding of the system they wish to study. The early stages of a UCT design should draw heavily on qualitative understanding, existing literature and local expertise, and initial design ideas and "long lists" of potential control items should be carefully pretested, ideally with respondents who are representative of the target group. These pretests might take the form of in-depth qualitative interviews and/or focus group discussions, with the primary focus on assessing cultural acceptability and comprehension. Once a candidate design has been selected, it should also be formally piloted to determine whether further refinements are required and to ensure that respondents are comfortable with the method and confident that their privacy is being protected.

| Step 4. Choosing between standard UCT design and its variants and extensions
In order to overcome some UCT limitations, several variants of the basic method exist. For example, in some settings we may be interested in more than one sensitive item (e.g. we want to obtain estimates of hunting prevalence separately for two threatened species). In this case, rather than dividing the sample into two groups ("control" and "treatment"), the sample can be divided into three or more groups ("control," "Treatment 1" and "Treatment 2"), provided there is a large sample size.
A common challenge for the application of UCT is the need for large samples. The double-list UCT can reduce the overall cost of obtaining a prevalence estimate of the sensitive behaviour with the desired precision (Glynn, 2013), but is harder to design (e.g. it can be challenging to find enough reasonable control items), may be harder to explain to research assistants and increases the risk of respondent fatigue. Another modification designed to increase the precision of estimates is to ask respondents who do not receive the sensitive item separately about each of the control items (Corstange, 2009).
Other authors have argued that after a UCT, respondents should be asked about the sensitive item directly whenever this is ethical and feasible (Blair & Imai, 2012). However, comparisons between UCT and direct questions in conservation have sometimes shown no significant differences, a finding that reflects our review of the wider literature (Figure 4). This has been reported to be because the behaviours were not sensitive enough to create bias (Thomas, Gavin, & Milfont, 2015) or because the survey was carried out online, where participants felt sufficiently protected (Hinsley et al., 2017).
Including a direct question also allows researchers to test some core assumptions of the method, such as the lack of design effects (i.e. to verify that responses to the non-sensitive items are not affected by the presence or absence of the sensitive item), the honesty of responses (i.e. to assess whether respondents include the sensitive item in their count if it applies to them), and the ignorability of treatment assignment (i.e. that the allocation of respondents to treatment or control groups is truly random) (Imai, 2011). Procedures for testing UCT assumptions are implemented in specialized software designed for UCT analysis, such as the list package (Blair & Imai, 2010) in r.
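The formal assumption tests belong in specialized software such as the R list package; as a lightweight illustration only (not a substitute for those tests), some basic sanity checks on UCT counts can be coded directly. Function and data here are fabricated:

```python
from statistics import mean

def basic_design_checks(control_counts, treatment_counts, n_control_items):
    """Simple sanity checks on unmatched count data, flagging results that
    are impossible if the design assumptions hold:
    - the estimated prevalence must lie in [0, 1];
    - no control respondent can report more than the number of control items;
    - no treatment respondent can report more than n_control_items + 1."""
    p = mean(treatment_counts) - mean(control_counts)
    return {
        "prevalence_in_range": 0.0 <= p <= 1.0,
        "control_counts_valid": max(control_counts) <= n_control_items,
        "treatment_counts_valid": max(treatment_counts) <= n_control_items + 1,
    }

# Fabricated counts from a four-item control list
checks = basic_design_checks([1, 2, 2, 3], [2, 3, 3, 3], n_control_items=4)
```

A negative or above-one prevalence estimate, for instance, is a strong signal of design effects, dishonest responding or failed randomization, and warrants the formal tests before any substantive interpretation.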

| Step 5. Deciding what else to ask
Although UCT is inherently a quantitative, large-sample technique, it is almost always a good idea to use it together with other forms of data collection. Simple follow-up questions can help researchers understand how sensitive the topic was for participants, how much they trusted the protection offered, and how easy they found the technique to follow (e.g., Thomas et al., 2015). It can also provide opportunity to collect additional information that can improve understanding of the context in which the sensitive behaviour takes place.
For example, multivariate analyses that incorporate UCT data and socio-demographic or socio-psychological information can improve our understanding of why people engage in sensitive behaviours of conservation concern (e.g., Hinsley et al., 2017).

| Step 6. Implementing UCT in the field
The success of even the best-designed UCT study depends on how the method is implemented in the field. In some conservation contexts, choosing a survey mode may be straightforward, for example where respondents are located in areas where online or telephone surveys are not possible. While our findings suggest that online, phone or self-administered UCTs are less successful than face-to-face UCTs at reducing social desirability bias compared to a direct question, there are exceptions (e.g., a successful online UCT about drug use: Coutts & Jann, 2011), and UCTs that are not face-to-face should still be used where pilot results suggest that direct questions are not ethical or appropriate.
In conservation, several other aspects of implementation deserve particular attention, the most important of which is the ethics of studying sensitive or illegal behaviours. Techniques such as UCT can provide an additional level of anonymity and protection for individual respondents, but it is still important that free prior informed consent is sought from all participants. It should also be noted that UCT does not assure perfect protection, as individual anonymity may not automatically translate into anonymity for a group (St. John et al., 2016).
For an unfamiliar technique such as UCT, careful thought should therefore be given to how the research will be introduced and explained. An introductory script should be planned and included in the pre-testing and piloting to ensure that it uses appropriate language and can be easily understood. Planning should also include practicing agreed verbal explanations, in case respondents have poor eyesight or do not understand the words or images shown.
A practice UCT question on an innocuous topic can also be useful.
Although this was rarely implemented in our reviewed studies, it can help participants to understand what they need to do, while also acting as a warm-up exercise, which can help build rapport between participant and researcher.
Other simple considerations can also help to improve the quality and usefulness of UCT data. Although UCT is usually intended to be answered by an individual (whether for themselves or on behalf of a household), in conservation settings it may be difficult in practice to create a situation where the intended respondent is alone when giving their answers. The presence of other people (e.g., other household or community members) may influence the participants' responses and their willingness to answer honestly so it is important to have a clear plan in place for how to deal with this. Careful preparation can also help to make it easier for research assistants to follow experimental design and to record the data accurately, while also ensuring that the technique is presented in a simple, clear way to participants (e.g., randomisation to treatment or control groups should be carried out in advance).
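Carrying out randomisation in advance can be as simple as pre-generating a balanced, shuffled assignment sheet that enumerators read off in order. A sketch (function name and seed are illustrative, not from the reviewed studies):

```python
import random

def preassign_groups(n_respondents, groups=("control", "treatment"), seed=42):
    """Pre-generate a balanced, shuffled group assignment for each
    respondent ID, so field teams only read off the next assignment
    rather than randomizing on the spot."""
    per_group, remainder = divmod(n_respondents, len(groups))
    assignments = list(groups) * per_group + list(groups)[:remainder]
    rng = random.Random(seed)  # fixed seed keeps the plan reproducible
    rng.shuffle(assignments)
    return dict(enumerate(assignments, start=1))

plan = preassign_groups(100)
# plan maps respondent IDs 1-100 to groups, exactly 50 of each
```

Fixing the seed also leaves an auditable record of the randomization, which helps when later testing the ignorability of treatment assignment.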

| What to use instead
Considering the time and resources needed to follow these steps and design and implement a good quality UCT, it is clear that several factors must be considered in order to assess the suitability of the method in any given study ( Figure 5).
Where UCT is not appropriate, there are several other options. Six other types of indirect questioning technique have been identified, all of which ensure respondent privacy to encourage honest reporting (Nuno & St. John, 2015). The widely used RRT (Warner, 1965) uses a randomizing device (e.g. dice or a spinner) to force some respondents to answer "yes" or "no" to the sensitive questions. By considering the probability of forced yes responses, the population prevalence of the sensitive act can be calculated.
Furthermore, a specialized form of logistic regression can be applied to investigate predictors of rule-breaking (Chang, Cruyff, & Giam, 2018; Heck, 2018). Different forms of RRT have been applied in various conservation contexts to estimate the proportion of people engaged in sensitive acts (Nuno & St. John, 2015; Solomon, Jacobson, Wald, & Gavin, 2007) and the frequency of such acts (St. John et al., 2018). However, RRT takes time to explain to participants, can be cognitively burdensome and some do not like being "forced" to give an answer that implies they are a rule-breaker, when their true answer is otherwise. This latter effect may explain why RRT estimates of hunting by men living near Kerinci Seblat National Park, Indonesia were negative for three of four study species (St. John et al., 2018). In our reviewed studies, most UCT and RRT results were not significantly different, so consideration of the various limitations and trade-offs should be used to choose the best method.
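For comparison with UCT, the forced-response RRT estimator can be written directly. A sketch assuming a dice-style device (the forced "yes" and "no" probabilities below are illustrative, not from any cited study):

```python
def forced_rrt_prevalence(prop_yes, p_forced_yes, p_forced_no):
    """Forced-response RRT estimator. The randomizing device makes a
    respondent answer "yes" with probability p_forced_yes, "no" with
    p_forced_no, and truthfully otherwise. Solving
        P(yes) = p_forced_yes + (1 - p_forced_yes - p_forced_no) * p
    for p recovers the population prevalence of the sensitive act."""
    p_truthful = 1 - p_forced_yes - p_forced_no
    return (prop_yes - p_forced_yes) / p_truthful

# Illustrative dice device: one face forces "yes" (1/6), one forces "no"
# (1/6), the rest require a truthful answer; 30% of the sample said "yes"
p = forced_rrt_prevalence(prop_yes=0.30, p_forced_yes=1/6, p_forced_no=1/6)
```

Note that when the true prevalence is near zero, sampling noise can push this estimator below zero, which is consistent with the negative RRT estimates reported for Kerinci Seblat.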
Where the aim of a study is simply to gauge the group-level prevalence of a sensitive behaviour and multivariate analysis of individual-level data is not required, the simplicity of the bean method makes it an attractive alternative (Travers et al., 2016).
Vignettes and short descriptions of hypothetical situations, can be used to elicit participants' behavioural responses in situations similar to the one described, without individuals having to reveal their actual behaviour (Eifler, 2007). However, it is also important to note that discrepancies often exist between what people say they would do and what people actually do (Eifler, 2007).
Where access to known rule-breakers is possible, interviews can be used to gather in-depth information about sensitive behaviours directly.

FIGURE 5 Decision tree to assess when unmatched count technique (UCT) is suitable to use, and when other methods may be more appropriate

| CONCLUSIONS
The UCT is simple to administer, and prevalence estimates are easy to derive from the resulting data, meaning that it often seems like a "silver bullet" that will enable researchers to rapidly collect data on conservation rule-breaking. It is true that with careful design, UCT can provide useful results in situations where direct questioning is difficult, or help to validate the answers from other methods. However, it is the responsibility of researchers to understand the limitations of the methods they are using, and the contexts in which they are most suitable. With better understanding of how best to use them, methods such as UCT have real potential to allow researchers and practitioners to produce reliable findings that can be used to underpin conservation decision-making.

ACKNOWLEDGEMENTS

DATA ACCESSIBILITY
All data used in the analyses are freely available in the University of Oxford research archive at https://ora.ox.ac.uk/objects/uuid:556a8a97-2d3d-4bf2-8fc1-359ce9786986. Data were gathered from 101 English language publications that empirically tested the UCT method. For each paper, information on 17 variables was collected, including the context (e.g., discipline, behaviour studied, rationale for using UCT, location), details of survey administration (e.g., whether a pilot study was conducted, whether design assumptions were checked), type of analysis, and comparisons to other methods (e.g., direct questions).