Introduction

HIV is very effectively transmitted during anal intercourse (AI) unprotected by condoms (UAI), with a meta-analysis finding that women may have an 18-fold greater HIV acquisition risk during UAI compared to vaginal intercourse (VI) unprotected by condoms (UVI) [1]. Even a small proportion of intercourse acts being AI may therefore substantially contribute to HIV transmission [2, 3]. However, the role of AI within heterosexual epidemics has not been sufficiently examined and is frequently overlooked [4]. For example, recent reviews on HIV risk behaviour among female sex workers (FSW) in China [5] and among young people in Africa [6] examined multiple measures of sexual risk-taking but neither included AI practice. Likewise, public health messaging to FSW on HIV transmission seems to routinely neglect AI practice: none of the studies included in two systematic reviews on HIV prevention interventions among African FSW reported whether or not messaging on safe AI was included in the interventions [7, 8]. This omission may contribute to the lack of awareness of transmission risk during AI among FSW [3, 9] and subsequently to condoms being used less consistently during AI compared to VI [3, 10].

The practice of AI among FSW has been reported in many articles. However, the extent to which AI is practised by FSW and how often it is practised by age, region and over time has yet to be comprehensively described. It is particularly pertinent to examine these patterns among FSW, compared to other population groups, as FSW experience a far greater burden of HIV and STI infection than women in the general population [11]. This review will be useful to improve our understanding of AI practices, inform prevention messages and identify knowledge gaps. Parameter estimates derived from this review can be used in mathematical models to explore the contribution of AI to the HIV epidemic and assess the influence of AI on the predicted effectiveness of prevention interventions.

In order to estimate the contribution of AI to HIV and STI incidence among FSW and transmission to their sexual partners, it is first necessary to accurately describe AI practice in this group. To estimate this contribution, we need data on the proportion of FSW who practise AI and at what frequency, with which types of partner AI is practised and whether condoms are used [4]. The equivalent information for VI is required for a complete understanding of an individual’s potential HIV risk through heterosexual sex. We aim to systematically review and summarise published estimates of the proportion of FSW reporting AI and the number of AI acts, and to examine the sources of variation in AI practice.

Methods

The systematic review was undertaken following PRISMA guidelines for reviews of observational studies [12].

Search Strategy

PubMed, Embase and PsycINFO were searched for English-language articles published 1st January 1980 to 31st October 2018 reporting on sexual behaviour among FSW (see Supplement A for full search terms). Identified records were screened by a single reviewer, with BNO alone conducting the search from 1990 onwards and JE from 1980 to 1989. We did not include the term ‘anal’ in our search to avoid rejecting studies that, while eligible, did not refer to AI in the title or abstract. We discarded titles that were obviously irrelevant, then screened abstracts and retrieved full-text articles if any sexual behaviour among FSW (defined as exchanging sexual services for payment, either cash or in-kind) was reported. Bibliographies of included articles were scanned for further relevant articles. Studies were included in the review if they fulfilled the following criteria:

Published, peer-reviewed articles on cross-sectional studies, cohort studies or randomised controlled trials (RCTs) that reported data on FSW from which it was possible to extract or calculate the proportion practising AI and/or the number of AI and UAI acts over any recall period.

Although grey literature can be useful, its inclusion can introduce difficulties in ensuring that the search is systematic and that the studies included are methodologically sound. We therefore chose to restrict our review to capture the highest quality peer-reviewed evidence available using an easily replicable search strategy.

Data Extraction

We defined a priori the variables to be extracted. We used a standard procedure to extract data to a spreadsheet. Each publication was examined by two reviewers independently, with differences resolved by consensus. The intra-class correlation coefficient (ICC) was calculated for each outcome of interest to estimate inter-rater reliability. Our outcomes of interest were (1) AI prevalence (the proportion of participants reporting practising AI), (2) monthly frequency of AI and VI, and (3) the fraction of all intercourse acts and of all unprotected intercourse acts which are AI and UAI, respectively (details of how these were derived are in Supplement B and C). We extracted participant and study characteristics, including measures of study quality (listed in Table 1, with the addition of alcohol and drug use and sexual and physical violence victimisation). Baseline data only were extracted from longitudinal studies and unadjusted estimates were extracted from studies using respondent-driven sampling. We contacted authors of included studies when key variables of interest were not reported.

Table 1 Summary of (A) study and participant characteristics and (B) quality of included studies

Data Synthesis and Statistical Methods

Prevalence Data

We produced forest plots of individual study estimates for the most common recall periods. We calculated overall pooled estimates and 95% confidence intervals (95%CI) for AI prevalence across each available recall period. As our review includes diverse populations of FSW, we anticipated substantial heterogeneity in AI prevalence estimates across studies. We therefore pooled results using random-effects models and conducted extensive sub-group analysis to explore sources of heterogeneity [13,14,15]. Sub-group analyses of the effect of participant and study characteristics on pooled AI prevalence estimates were conducted for recall periods with over 10 estimates. Continuous variables were dichotomised at the median. To compare condom use during AI and VI we calculated the proportion reporting any UAI among those reporting AI, as well as the equivalent for VI. We plotted these individual study estimates and produced pooled estimates by recall period (for recall periods with > 3 estimates). Where studies reported condom use as ‘always’, ‘sometimes’ or ‘never’, rather than over a specific recall period, we defined answers other than ‘always’ as practising UAI or UVI and refer to this recall period as general condom use. All models were fitted using maximum-likelihood random-effects models [16, 17] with the package ‘metafor’ [18] in R version 3.20.1 [19]. Heterogeneity across study estimates was investigated using Cochran’s Q test and its p value [20] as well as I² estimates [21].
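
As an illustration only, the pooling and heterogeneity statistics described above can be sketched in Python. This is not the analysis code used in the review: the review fitted maximum-likelihood models in R, whereas the sketch uses the simpler DerSimonian-Laird moment estimator of between-study variance, works on untransformed proportions with Wald-type intervals, and runs on hypothetical study counts. The function name `pool_random_effects` and all input values are assumptions made for the example.

```python
import math


def pool_random_effects(events, totals):
    """Pool study proportions with a DerSimonian-Laird random-effects model.

    Assumes at least two studies and every proportion strictly between
    0 and 1 (otherwise the binomial variance is zero and weights blow up).
    """
    if len(events) < 2:
        raise ValueError("need at least two studies to estimate heterogeneity")

    # Study-level proportions and their binomial sampling variances.
    props = [e / n for e, n in zip(events, totals)]
    variances = [p * (1 - p) / n for p, n in zip(props, totals)]

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1 / v for v in variances]
    fixed = sum(wi * pi for wi, pi in zip(w, props)) / sum(w)

    # Cochran's Q and the I^2 statistic (percentage of total variation
    # attributed to between-study heterogeneity).
    q = sum(wi * (pi - fixed) ** 2 for wi, pi in zip(w, props))
    df = len(props) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # DerSimonian-Laird estimate of the between-study variance tau^2.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights, pooled estimate and Wald 95% CI.
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * pi for wi, pi in zip(w_re, props)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    ci = (pooled - 1.96 * se, pooled + 1.96 * se)
    return {"pooled": pooled, "ci": ci, "q": q, "df": df, "i2": i2, "tau2": tau2}


# Hypothetical data: number reporting AI and sample size in four studies.
result = pool_random_effects(events=[12, 45, 30, 8], totals=[100, 150, 400, 200])
print(result)
```

When tau² is estimated as zero the random-effects weights reduce to the fixed-effect weights, so the two pooled estimates coincide; with heterogeneous inputs such as those above, smaller studies receive relatively more weight than under a fixed-effect model.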

Frequency Data

To enable comparison across studies which reported number of AI acts by different recall periods, we standardised frequency estimates to number of acts per month. Where possible, we derived the proportion of all intercourse acts that were AI or UAI. When the mean number of AI acts was reported only among the sub-samples who practise AI, we also derived the mean among the whole sample, when AI prevalence was also reported. As very few studies reported measures of variance of intercourse act data, we were unable to conduct statistical synthesis of frequency data; thus, we limited our analysis to graphically exploring the effects of participant and study characteristics on the proportion of intercourse acts that were anal.
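
The derivations described above can be sketched as follows. The exact procedures used in the review are given in its Supplements B and C, which are not reproduced here, so this Python sketch rests on stated assumptions: a month is treated as 30 days, FSW who do not practise AI are assumed to contribute zero AI acts to the whole-sample mean, and all function names and input values are hypothetical.

```python
DAYS_PER_MONTH = 30.0  # assumption: a month is treated as 30 days

# Length in days of each recall period handled by this sketch (assumed values).
RECALL_DAYS = {"day": 1, "week": 7, "month": 30, "year": 365}


def acts_per_month(n_acts, recall_period):
    """Rescale a count of acts over a recall period to acts per month."""
    return n_acts * DAYS_PER_MONTH / RECALL_DAYS[recall_period]


def whole_sample_mean(mean_among_practising, ai_prevalence):
    """Mean AI acts across the whole sample, given the mean among those
    practising AI and the proportion practising AI over the same period.
    Those not practising AI are assumed to contribute zero acts."""
    return mean_among_practising * ai_prevalence


def fraction_anal(ai_acts, vi_acts):
    """Fraction of all (anal plus vaginal) intercourse acts that are anal."""
    total = ai_acts + vi_acts
    return ai_acts / total if total > 0 else float("nan")


# Hypothetical study: 2 AI acts and 18 VI acts reported per week.
monthly_ai = acts_per_month(2, "week")
monthly_vi = acts_per_month(18, "week")
print(monthly_ai, monthly_vi, fraction_anal(monthly_ai, monthly_vi))
```

Because both act counts are rescaled by the same factor, the fraction of acts that are anal is unchanged by standardisation, provided AI and VI were reported over the same recall period.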

Dealing with Bias

Our sub-group analyses included exploring the effect of different measures of methodological quality: interview method, study design, recruitment method and response rate. We also examined through sub-group analysis the section in the article where AI was first mentioned (title, abstract or main text), which we used to explore the possible effect of publication bias, as authors may be more likely to include or highlight AI data when the practice is more common.

Results

Search Results

Figure S1 summarises the study selection procedure and search results. Of the 13,658 unique articles initially identified, 131 were included. Most articles were identified from the database searches, and two were identified through reference scanning. Additional information was obtained from 23 of the 35 authors contacted. Inter-rater reliability for the outcomes of interest was high, with ICC ranging from 0.85 for AI frequency data to 0.96 for AI prevalence data.

Study and Participant Characteristics

Details of each included study are presented in Table S1 and participant and study characteristics are summarised in Table 1. AI prevalence was reported over various recall periods by 128 studies (including five studies reporting UAI prevalence only [22,23,24,25]), with five comparing AI prevalence over two or more recall periods [3, 9, 29,30,31]. The most common AI prevalence recall periods were lifetime (N = 30) and 1 month (N = 18). Many studies did not state the recall period (N = 52); these included 35 studies which reported whether FSW provided AI as part of their service. AI frequency data (either number of AI acts and/or the proportion of intercourse acts which were AI) were provided by only 13 studies.

Sample sizes ranged from 12 to 9667 for a total sample size of 74,426 across all studies (Table S1). Nearly half of the studies specified partner type, with 15 reporting AI practice separately for non-paying partners and paying clients. Most studies were conducted in Asia (N = 53), followed by Africa (N = 34) and Europe (N = 23), with few conducted in the Americas (N = 14 in North America and N = 10 in South America). Median age across studies was 28 years and median survey year 2003. Most studies either did not report location of work (N = 53) or reported on samples with a mixture of indoor and outdoor sex workers (N = 38).

We were unable to include alcohol use (reported by 23 studies), drug use (reported by 20 studies) or physical and sexual violence (reported by 12 and 11 studies, respectively) in our analysis, because they were too rarely reported and, when reported, used a wide range of recall periods.

Study Quality and Potential Bias

More studies reported on FSW who worked only indoors (N = 33) than only outdoors (N = 12) (Table 1). Most studies used face-to-face interviews (FTFI) (N = 111), were cross-sectional in design (N = 116) and employed convenience sampling (N = 96). Three studies compared the reporting of AI practice by interview method [23, 26, 27]. Most studies did not report the response rate (N = 110). More studies first mentioned AI in the main text (N = 88) than in the abstract (N = 32) or title (N = 11) (Table 1).

Meta-analysis of AI Prevalence

Figure 1 displays pooled estimates of AI prevalence for all recall periods and Fig. S2a–c displays individual study estimates for the three most common recall periods (lifetime and past month), respectively. Reported AI prevalence varied substantially between studies, ranging from 0.0 to 84.0% across recall periods (Table S1). Estimates stratified by recall period remained very heterogeneous (I² > 90% and all Q tests showing statistically significant heterogeneity). Pooled AI prevalence did not vary substantially by length of recall period, apart from the 2-month, 15-day and 1-day recall periods, each of which had only one study (Fig. 1). Aside from these, pooled estimates varied between 10.5% (95%CI 5.5–15.6%, N = 8) in the past week and 21.5% (95%CI 15.6–27.5%, N = 6) in the past year, and the pooled estimate for ever having practised AI was 15.7% (95%CI 12.2–19.3%).

Fig. 1

Pooled estimates of the prevalence of anal intercourse over each recall period reported. AI anal intercourse, NA not applicable, 95% CI 95% confidence interval. The top of each diamond represents the pooled estimate, while the furthest points represent the 95% CI. I² and the Q test are both measures of heterogeneity, with higher values in both indicating greater heterogeneity. I² ranges from 0 to 100%. The results of the Q test are displayed in bold when the p-value is < 0.05, which indicates that the level of heterogeneity found is statistically significant

Sub-group Analysis of AI Prevalence

Table 2 shows pooled estimates from sub-group analyses of AI prevalence by participant and study characteristics for recall periods with sufficient numbers of study estimates (ever and past 1 month).

Table 2 Sub-group analysis of AI prevalence over the most common recall periods, by participant and study characteristics

Participant Characteristics

Pooled estimates of lifetime AI practice tended to be higher among older FSW [28+ years = 20.7% (95%CI 14.5–26.9%, N = 13) vs. < 28 years = 11.9% (95%CI 7.9–15.9%, N = 14)] and in studies conducted after 2002 [2003 onwards = 19.2% (95%CI 15.4–24.8%, N = 18) vs. pre-2003 = 12.9% (95%CI 5.3–19.2%, N = 13)]. The same patterns were seen for AI practice in the past month, but as with lifetime prevalence, differences between sub-groups were not statistically significant. Pooled estimates did not vary by partner type, continent, average number of clients or location of work.

Study Quality and Bias

Pooled estimates of lifetime and past month prevalence for cross-sectional studies were lower than estimates from RCT and cohort studies, respectively. However, these observations are inconclusive as only one RCT and one cohort study reported lifetime and past month prevalence, respectively. Pooled estimates of lifetime and 1 month AI practice were higher when the word ‘anal’ was first mentioned in the article title compared to in the abstract or main text [e.g. for lifetime, title = 23.9% (95%CI 14.0–33.8%, N = 4) versus text = 13.2% (95%CI 8.0–18.3%, N = 17)]. Pooled estimates did not vary by interview method, recruitment method or response rate.

Comparative Condom use During AI and VI

Pooled estimates of the prevalence of UAI among those reporting AI were higher than UVI among those reporting VI in four of the five recall periods over which it was reported (Fig. 2) [e.g. general UAI = 46.0% (95%CI 30.8–61.3), UVI = 31.6% (95%CI 18.7–44.5)], although 95%CIs overlapped substantially (individual study estimates are plotted in Fig. S3a–d).

Fig. 2

Pooled estimates of the prevalence of anal intercourse and vaginal intercourse unprotected by condoms, by recall period. Pooled estimates of the proportion of those who report any AI unprotected by condoms among those reporting any AI over the most commonly reported recall periods, and the equivalent pooled estimates for UVI. UAI anal intercourse unprotected by condoms, UVI vaginal intercourse unprotected by condoms, 95% CI 95% confidence interval, general report that condom use is anything other than ‘always’ using condoms

Frequency of AI Compared to VI

Of the 13 studies which provided data on the number of AI acts, we were able to extract or derive eight estimates among the subset of FSW who report practising AI [3, 9, 10, 28,29,30,31,32] and eight over the whole sample [3, 10, 26, 32,33,34,35,36], which includes FSW not practising AI (Table 3). AI frequency estimates varied substantially across studies. Across the studies providing data among the subset of FSW reporting AI, the number of AI and UAI acts per month ranged from 1.8 to 27.8 (N = 8) and from 0.2 to 6.2 (N = 3), respectively. Among studies reporting mean frequency across the whole study sample, the number of AI and UAI acts per month ranged from 1.1 to 16.9 (N = 8) and from 1.0 to 1.7 (N = 3), respectively. The percentage of all intercourse acts that were anal ranged from 2.4 to 15.9% in the six studies that reported it across the whole sample [3, 26, 33,34,35,36]. In the sole study which reported it among the subset practising AI [3], 17.0% of intercourse acts were anal. The proportion of intercourse acts that were anal did not vary substantially by any participant or study characteristics (Fig. 3).

Table 3 Frequency of anal intercourse acts, standardised per month, and fraction of reported vaginal and anal intercourse acts that are anal

Fig. 3

Proportion of intercourse acts that are anal by selected study and participant characteristics. Scatter plots of the proportion of intercourse acts that are anal among the whole sample (i.e. including those reporting no AI), by participant characteristics and study characteristics. ACASI audio computer assisted self-interview, CD coital diary, CRS cluster-randomised sampling, FTFI face-to-face interview, Mix data only available for men and women combined, NS not stated, RCT randomised controlled trial, RDS respondent-driven sampling, SAQ self-administered questionnaire, SRS simple randomised sampling, TLS time-location sampling

Discussion

This extensive review adds to the current literature and understanding of AI practices among FSW. We found that reported AI practice is generally common among FSW worldwide, with a pooled estimate of 15.7% (95%CI 12.2–19.3%) ever having practised AI. There was substantial heterogeneity across study estimates that was largely not explained by any of the measured participant and study characteristics. AI tended to be more often unprotected by condoms than VI, although the difference was not statistically significant. Although scarce, the available data on AI frequency suggest that AI is practised frequently, with 2.4–15.9% of all intercourse acts being anal across whole study samples of FSW.

Similar to previous review findings regarding heterosexual AI practice among young people and South Africans [37, 38], we found a non-statistically significant indication that AI prevalence may have increased over time. In qualitative research, Indian and East African FSW have described AI practice during sex work as becoming more common over time due to increased client demand [9, 39,40,41]. Pooled AI prevalence varied little across recall periods and, in the four studies which reported AI practice over multiple recall periods, AI prevalence changed little as recall periods lengthened [3, 28, 42, 43]. These findings suggest that those who initiate AI continue to practise it.

The strengths of our study include conducting a wide search and identifying a large number of eligible studies, resulting in a large sample size. Our review was strengthened by the use of wide search terms; for example, omitting the word ‘anal’ ensured that we captured eligible studies which first mentioned AI in the main text, rather than the title or abstract. Given that AI prevalence tended to be lower the later in the article that AI was first mentioned, our search strategy limited the impact of publication bias, thus increasing the accuracy of our results. Deriving estimates for AI practice where possible also helped reduce publication bias. We conducted a detailed sub-group analysis to identify potential sources of heterogeneity in AI practice based on characteristics measured in the study, including measures of study quality.

Our review has a number of limitations. We did not include articles in languages other than English, or grey literature, which may have resulted in omission of potentially eligible articles. Our language restriction resulted in the exclusion of 42 potentially eligible full-text articles. Eleven percent of the full-text articles examined were found to be eligible; if the same proportion of the non-English full-text articles were eligible, an additional four or five studies would have been included in our review. However, the language restriction is unlikely to have influenced results substantially given the large number of articles included (N = 131). We searched for grey literature in our similar review of heterosexual AI among South Africans [37] and found none eligible.

Our review was mainly limited by the quality of reporting on AI practice. Of the 131 included studies, 52 failed to report the recall period of AI prevalence. Only a third of studies reporting AI prevalence also provided data on condom use during AI as well as VI. Only 10% of included studies (13 of 131) reported any type of AI frequency data, and only a single study provided the number of each type of intercourse act necessary to fully describe AI frequency (number of anal and vaginal acts over the same recall period, both condom protected and not) [36]. Only two studies [3, 26] provided a standard deviation or 95%CI for intercourse act data, which prevented us from pooling the few data available.

AI is a highly stigmatised behaviour in many societies and thus its reporting is likely subject to social desirability bias and is likely more accurately reported using more confidential interview methods [37, 38]. As the majority of studies in this review used FTFI, the least confidential interviewing method, our pooled estimates of AI prevalence and estimates of AI frequency likely underestimate its practice among FSW. Our sub-group analysis found that AI prevalence was not higher in the small number of studies which used more confidential methods compared to those that used FTFI. However, the two included studies which compared AI prevalence by interview method both found non-significantly higher prevalence using more confidential methods compared to FTFI [23, 27]. One study in this review compared AI frequency by interview method, finding more than five times as many anal intercourse acts were reported by FSW in South Africa when using coital diaries compared to daily FTFI [26].

Recommendations for Future Reporting of AI Practice

It is clear from this review and others [37, 38] that data collection on AI practice requires improvement, especially given how effectively HIV is transmitted during AI and how commonly it is practised. Previous research suggests that survey items must be carefully piloted in order to minimise misunderstanding and that one effective approach may be the use of pictograms to unambiguously clarify what is meant by AI [44]. Using confidential interview methods would help reduce social-desirability bias.

We need data that paint a complete picture of AI practice and allow the proportion of all intercourse acts that are anal to be estimated. Accurately estimating this proportion is key to estimating the extent to which AI impacts on HIV epidemics among FSW [4]. In order to minimise bias when estimating the fraction of intercourse acts that are AI, the same recall period should be used to collect data on AI and VI practice. We recommend that the following questions be included in all surveys on sexual behaviour among FSW:

  • Have you had AI in the past 12 months?

  • How many VI acts have you had in the past week with (a) clients and (b) non-paying partners?

  • Was a condom used throughout your last VI act with (a) a client and (b) a non-paying partner?

  • How many AI acts have you had in the past week with (a) clients and (b) non-paying partners?

  • Was a condom used throughout your last AI act with (a) a client and (b) a non-paying partner?

These recall periods may not be suitable for all FSW populations. In the case of low client volume, for example, we recommend collecting data on the number of intercourse acts over the past month. Equivalent questions should also be included in surveys among general population men and women, although past month may be a more suitable recall period for intercourse act data.

Public Health Implications

This review provides valuable information that can be used to guide policy, research and survey design internationally, as well as to inform future mathematical models of HIV epidemics among FSW and to predict the influence that AI practice may have on intervention effectiveness. Our review has found that, while its practice varies, AI is commonly and frequently practised by FSW, and that condoms are often less consistently used during AI compared to VI. As such, AI may substantially contribute to HIV epidemics among FSW and their sexual partners. Messaging on safe AI practice is often absent from current interventions among FSW, but should be included [39, 45, 46]. As practice of AI by FSW is most often driven by client demand [9, 39, 40, 47], programmes should address the social and environmental factors which contribute to vulnerability and hinder negotiation of safe practice, as well as target clients with safe AI messages.