Anopheles salivary antigens as serological biomarkers of vector exposure and malaria transmission: A systematic review with multilevel modelling

Background: Entomological surveillance for malaria is inherently resource-intensive and produces crude population-level measures of vector exposure which are insensitive in low-transmission settings. Antibodies against Anopheles salivary proteins measured at the individual level may serve as proxy biomarkers for vector exposure and malaria transmission, but their relationship is yet to be quantified. Methods: A systematic review of studies measuring antibodies against Anopheles salivary antigens (PROSPERO: CRD42020185449). Multilevel modelling (to account for multiple study-specific observations [level 1], nested within study [level 2], and study nested within country [level 3]) estimated associations between seroprevalence with Anopheles human biting rate (HBR) and malaria transmission measures. Results: From 3981 studies identified in literature searches, 42 studies across 16 countries were included contributing 393 study-specific observations of anti-Anopheles salivary antibodies determined in 42,764 samples. A positive association between HBR (log transformed) and seroprevalence was found; overall a twofold (100% relative) increase in HBR was associated with a 23% increase in odds of seropositivity (OR: 1.23, 95% CI: 1.10–1.37; p<0.001). The association between HBR and Anopheles salivary antibodies was strongest with concordant, rather than discordant, Anopheles species. Seroprevalence was also significantly positively associated with established epidemiological measures of malaria transmission: entomological inoculation rate, Plasmodium spp. prevalence, and malarial endemicity class. Conclusions: Anopheles salivary antibody biomarkers can serve as a proxy measure for HBR and malaria transmission, and could monitor malaria receptivity of a population to sustain malaria transmission. Validation of Anopheles species-specific biomarkers is important given the global heterogeneity in the distribution of Anopheles species. Salivary biomarkers have the potential to transform surveillance by replacing impractical, inaccurate entomological investigations, especially in areas progressing towards malaria elimination. Funding: Australian National Health and Medical Research Council, Wellcome Trust.


Introduction
Sensitive and accurate tools to measure and monitor changes in malaria transmission are essential to track progress towards malaria control and elimination goals. Currently, the gold standard measurement of malaria transmission intensity is the entomological inoculation rate (EIR), a population measure defined as the number of infective Anopheles mosquito bites a person receives per unit of time. EIR is calculated as the human biting rate (HBR; measured at the population level by entomological vector-sampling methodologies [gold standard: human landing catch]) multiplied by the sporozoite index (proportion of captured Anopheles with sporozoites present in their salivary glands). However, estimation of EIR and HBR via entomological investigations is inherently labour and resource intensive, requiring trained collectors, specialised laboratories, and skilled entomologists. Furthermore, these approaches provide a crude population-level estimate of total vector exposure at a particular time and location, precluding investigation of heterogeneity and natural transmission dynamics of individual-level vector-human interactions (Monroe et al., 2020). For example, indoor human landing catches provide poor estimates of outdoor biting and thus total vector exposure (Mathenge et al., 2005). The sensitivity of EIR is further compromised in low transmission settings where the number of Plasmodium-infected specimens detected is low and often zero.
Evaluation of the human antibody response to Anopheles spp. salivary proteins has the potential to be a logistically practical approach to estimate levels of exposure to vector bites at an individual level. Several Anopheles salivary proteins have been shown to be immunogenic in individuals naturally exposed to the bites of Anopheles vectors and have been investigated as serological biomarkers to measure Anopheles exposure (Badu et al., 2012b;Drame et al., 2013a;Drame et al., 2010a;Drame et al., 2010b;Drame et al., 2015;Rizzo et al., 2011a;Rizzo et al., 2011b;Stone et al., 2012;Drame et al., 2012), malaria transmission (Londono-Renteria et al., 2015a;Noukpo et al., 2016), and as an outcome for vector control intervention studies (Drame et al., 2013a;Drame et al., 2010a;Drame et al., 2010b;Noukpo et al., 2016;Idris et al., 2017). However, a major shortcoming of the literature is that studies are largely descriptive and do not quantify the association between entomological and malariometric measures and anti-Anopheles salivary antibody responses. We undertook a systematic review with multilevel modelling to quantify the association between HBR, EIR, and other markers of malaria transmission, with anti-Anopheles salivary antibody responses, and to understand how these associations vary according to transmission setting and dominant Anopheles vectors which can exhibit different biting behaviours. In particular, we were interested in comparing the African context (where Anopheles gambiae and Plasmodium falciparum predominates) to non-African settings (where An. gambiae is absent and where both P. falciparum and Plasmodium vivax are prevalent). This knowledge is pertinent to advance the use of salivary antibody biomarkers as a vector and malaria transmission serosurveillance tool.

Search strategy and selection criteria
We performed a systematic review with multilevel modelling according to the Meta-analysis of Observational Studies in Epidemiology (MOOSE) and Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (Moher et al., 2009;Stroup et al., 2000) (Reporting Standards Document). Five databases were searched for published studies investigating antibodies to Anopheles salivary antigens as a biomarker for mosquito exposure or malaria transmission published before 30 June 2020. The protocol (Appendix 1) was registered with PROSPERO (CRD42020185449).
The primary criterion for inclusion in this systematic review was the reporting of estimates of seroprevalence or total levels of immunoglobulin (Ig) in human sera against Anopheles salivary antigens. We considered for inclusion cross-sectional, cohort, intervention, and case-control studies of individuals or populations living in all geographies with natural exposure to Anopheles mosquitoes. Studies that were solely performed in participants not representative of the wider naturally exposed population (i.e. mosquito-allergic patients, soldiers, returned travellers) were excluded.

Measures Outcomes
The primary outcome of our systematic review was antibodies (seroprevalence or levels, including all Ig isotypes and subclasses) against any Anopheles salivary antigens (full-length recombinant proteins, peptides, and crude salivary extract). Study-reported salivary antibody data was extracted at the most granular level (i.e. for each site; time point), with each observation of seroprevalence or levels included as a study-specific salivary antibody observation. As measurement of antibody levels does not produce a common metric between studies, only values of seroprevalence could be included in multilevel modelling analyses. Therefore, to maximise data, authors of studies that reported only antibody levels were contacted and asked to classify their participants as 'responders' or 'non-responders' according to seropositivity (antibody level relative to unexposed sera). Studies that provided antibody levels or categorised seropositivity based upon arbitrary cut-offs are included in narrative terms only.

Exposures
The primary exposures of interest were the entomological metrics HBR (average number of bites received per person per night) and EIR (infectious bites received per person per year). Secondary exposures included study-reported prevalence of Plasmodium spp. infection (confirmed by either microscopy, rapid diagnostic test (RDT), or polymerase chain reaction [PCR]) and seroprevalence of antimalarial antibodies against pre-erythrocytic and blood stage Plasmodium spp. antigens. Where exposure estimates were not provided, we attempted to source data from other publications by the authors or used the site geolocation (longitude and latitude) and year to obtain estimates of EIR from the Pangaea dataset (Yamba et al., 2018), P. falciparum rates in 2-10 year olds (PfPR 2-10 ), and dominant vector species (DVS) from the Malaria Atlas Project (MAP; The Malaria Atlas, 2017). Malarial endemicity classes were derived by applying established endemicity cut-offs to MAP PfPR 2-10 estimates (Bhatt et al., 2015). For the purposes of the modelling analyses, we defined DVS as where An. gambiae sensu lato (s.l.) was the only DVS, where An. gambiae s.l. was present with additional DVS, or where An. gambiae s.l. was absent. Studies of salivary antigens where exposure variables could not be sourced and data could not be extracted were excluded.

Statistical analysis
Where observations of the seroprevalence of antibodies against the same salivary antigen and exposure of interest were reported in more than one study, generalised linear multilevel modelling (mixed effects, logistic) was used to quantify associations between the exposures of interest and salivary antibody seroprevalence measurements (Song et al., 2019). Random intercepts for study and country were estimated to account for nested dependencies induced from multiple study-specific salivary antibody observations (level 1) from the same study (level 2) and studies from the same country (level 3). Additionally, study-level random slopes for the entomological and malariometric exposure parameters were estimated to model study-specific heterogeneity in the effect of the exposure of interest (HBR/EIR/malaria prevalence/antimalarial antibody seroprevalence). The associations between the various exposures and the different salivary antigens were analysed separately; however, observations of IgG seroprevalence against the recombinant full-length protein (gSG6) and synthetic peptide (gSG6-P1, the one peptide determined in all studies utilising peptides) form of the gSG6 antigen were analysed together.
Potential effect modification of the associations between exposures and anti-Anopheles salivary antibody responses was explored. In analyses quantifying the associations between HBR, as well as EIR, and seropositivity, we included an interaction term with DVS and for vector collection method (human landing catch or other indirect measures, e.g. light traps, spray catches, etc.). For the association between Plasmodium spp. prevalence and seropositivity, interaction terms with malaria detection

Results
Literature searches identified 158 potentially relevant studies, of which 42 studies were included in the systematic review ( Figure 1) and are described in Table 1. From these studies, we extracted n = 393 study-specific observations of anti-Anopheles salivary antibodies determined from antibody measurements in a total of 42,764 sera samples. These studies were performed in 16 countries mostly in hypo-or mesoendemic areas of Africa (32 studies), with a minority performed in South America (four studies), Asia (four studies), and the Pacific (two studies). Studies were classified according to their DVS which reflected the region where the study was conducted. An. gambiae s.l. was a DVS in all African study sites (n = 151 study-specific observations from 23 studies where An. gambiae s.l. was the only DVS and n = 68 from 16 studies where An. gambiae s.l. was present with additional DVS [i.e. Anopheles funestus, Anopheles pharoensis]), with the exception of one study, which together with the 10 non-African studies contributed n = 174 study-specific estimates where An. gambiae s.l. was absent. Most observations came from cross-sectional (n = 191 from 16 studies) or repeated crosssectional studies (n = 137 from 18 studies), with n = 60 from cohort studies (six studies) and n = 5 from case-control studies (two studies).
Generalised linear multilevel modelling (mixed effects, logistic) of n = 132 study-specific observations from 12 studies estimated a positive association between Anopheles spp.-HBR (log transformed) and seroprevalence of IgG to An. gambiae gSG6 salivary antigen Drame et al., 2015;Rizzo et al., 2011a;Stone et al., 2012;Drame et al., 2012;Soma et al., 2018;Traoré et al., 2019;Sagna et al., 2013b;Sarr et al., 2012;Pollard et al., 2019; Figure 2, Appendix 4-table 1). As we have log transformed HBR to account for the non-linear relationship between HBR and log odds of gSG6 IgG seropositivity, we have presented estimated odds ratios for different incremental percent increases in HBR (Figure 2-figure supplement 1). For example, the magnitude of the association was such that a twofold (100% relative) increase in HBR was associated with a 23% increase (OR: 1.23; 95% CI: 1.10-1.37; p<0.001) in the odds of anti-gSG6 IgG seropositivity (Figure 2). Heterogeneity in the effect of HBR on gSG6 across studies was observed (likelihood ratio χ 2 (1) = 109.25, p<0.001); the 95% reference range of study-specific effects for a twofold increase in HBR ranged from a 12% reduction to a 70% increase in odds (OR: 0.88-1.70). There was no evidence that the association between HBR and gSG6 IgG varied according to vector collection method (human landing catch or other indirect methods; p=0.443) or study design (longitudinal cohort or cross-sectional/repeated cross-sectional; p=0.138). Given the global heterogeneity in the distribution of Anopheles species, we sought to quantify the extent to which the association between An. gambiae gSG6 IgG seropositivity and HBR is moderated by DVS. We observed that the magnitude of the association between An. gambiae gSG6 IgG seropositivity and HBR was greatest in African studies where An. gambiae s.l. was the only dominant vector (p<0.001, Appendix 5); a twofold increase in HBR was associated with a 37% increase (OR: 1.37; 95% CI: 1.19-1.58; p<0.001) in the and non-African studies where An. gambiae s.l. was absent (OR: 1.05 per twofold increase in HBR; 95% CI: 1.03-1.08; p<0.001). In order to quantify the relationship between gSG6 IgG seroprevalence and HBR, for given HBR values we estimated gSG6 IgG seroprevalence by producing model-based predicted probabilities overall and by DVS ( Figure 3). In African studies where An. gambiae s.l. is the only DVS, predicted seroprevalence of An. gambiae gSG6 ranged from 21% (95% CI: 0-45%) to 86% (95% CI: 67-100%) for an HBR of 0.1-100 bites per person per night, respectively ( Figure 3, Figure 3-figure supplement 1).
Similar positive associations were also found between anti-gSG6 IgG levels, HBR, and EIR in 11 studies Drame et al., 2012;Stone et al., 2012;Soma et al., 2018;  Observed study anti-gSG6 IgG seroprevalence Average anti-gSG6 IgG probability (95%CI) by HBR Figure 2. Association between anti-gSG6 IgG seroprevalence and log 2 human biting rate (HBR). Figure shows the observed anti-gSG6 (either recombinant or peptide form) IgG seroprevalence (%) and HBR for each study-specific observation, as well as the predicted average anti-gSG6 IgG seroprevalence (predicted probability for the average study and country) with 95% confidence intervals (95% CI). Circles are proportional to the size of the sample for each study-specific observation, with colours indicating sample size: black (<50), red (50-100), navy (100-150), and green (>150). Association estimated using generalised linear multilevel modelling (mixed effects, logistic) to account for the hierarchical nature of the data, where study-specific anti-gSG6 IgG observations are nested within study and study is nested within country (model output shown in Appendix 4; p<0.001).

2012;
The association between anti-gSG6 IgG seroprevalence and population-level prevalence of Plasmodium spp. infection was investigated. Generalised linear multilevel modelling (mixed effects, logistic) of n = 212 from 14 studies that measured Plasmodium spp. prevalence contemporaneously in their study [Badu et al., 2012b;Rizzo et al., 2011b;Soma et al., 2018;Koffi et al., 2015;Traoré et al., 2019;Sagna et al., 2013b;Sarr et al., 2012;Perraut et al., 2017;Drame et al., 2010a;Idris et al., 2017;Proietti et al., 2013;Kerkhof et al., 2016] showed that for a twofold increase in the prevalence of Plasmodium spp. infection the odds Figure 3. Forest plots of predicted anti-gSG6 IgG seroprevalence (%) and Anopheles species-specific human biting rate (HBR). Panels show the predicted average anti-gSG6 IgG seroprevalence (predicted probability for the average study and country) with 95% confidence intervals for given HBR, for all Anopheles spp. (using model output from Appendix 4) and for specific-dominant vector species (DVS): where An. gambiae s.l. is the only DVS, where other DVS were present in addition to An. gambiae s.l. and where An. gambiae s.l. was absent (using model output from Appendix 5).
The online version of this article includes the following figure supplement(s) for figure 3: of gSG6 IgG seropositivity increased by 38%, although the confidence intervals were wide (OR: 1.38; 95% CI: 0.89-2.12; p=0.148) and heterogeneity in the study-specific effects was observed (95% reference range: 0.30-6.37; likelihood ratio χ 2 (1) = 235.5, p<0.001) ( Figure 5 and Appendix 7). In the association between gSG6 IgG seropositivity and Plasmodium spp. infection, there was no evidence for a moderating effect of Plasmodium spp. detection method (light microscopy or PCR, p=0.968), or species (African studies with P. falciparum versus non-African studies where P. falciparum and P. vivax are co-prevalent, p=0.538).
Additionally, 14 studies reported observations of anti-gSG6 IgG levels and the prevalence of Plasmodium spp. infections measured contemporaneously in their study. The median anti-gSG6 IgG antibody levels increased with increasing Plasmodium spp. prevalence in six of these studies Idris et al., 2017;Poinsignon et al., 2010b;Sarr et al., 2012;Kerkhof et al., 2016), or in Plasmodium spp.-infected compared to non-infected individuals (Londono-Renteria et al., 2015a;Montiel et al., 2020), but showed no association in eight studies (Rizzo et al., 2011b;Soma et al., 2018;Koffi et al., 2015;Traoré et al., 2018;Traoré et al., 2019;Sagna et al., 2013b;Poinsignon et al., 2009). Furthermore, we also investigated associations with serological measures of malaria exposure and found that for a twofold increase in pre-erythrocytic and blood stage antigen seroprevalence there was a 2.19-fold (OR: 2.19; 95% CI: 1.18-4.04; p=0.013) and 41% to 5.69-fold (OR range: 1.41-5.69; p range: <0.001 to 0.523) increase in the odds of anti-gSG6 IgG seropositivity, respectively (Appendix 8). Observed study anti-gSG6 IgG seroprevalence Average anti-gSG6 IgG probability (95%CI) by EIR Figure 4. Association between anti-gSG6 IgG seroprevalence and log 2 entomological inoculation rate (EIR). Figure shows the observed anti-gSG6 (either recombinant or peptide form) IgG seroprevalence (%) and EIR for each study-specific observation, as well as the predicted average anti-gSG6 IgG seroprevalence (predicted probability for the average study and country) with 95% confidence intervals (95% CI). Circles are proportional to the size of the sample for each study-specific estimate, with colours indicating sample size: black (<50), red (50-100), navy (100-150), and green (>150). Association estimated using generalised linear multilevel modelling (mixed effects, logistic) to account for the hierarchical nature of the data, where study-specific anti-gSG6 IgG observations are nested within study and study is nested within country (model output shown in Appendix 6; p<0.001).
The online version of this article includes the following figure supplement(s) for figure 4: To give epidemiological context, we estimated anti-gSG6 seroprevalence by producing modelbased predicted probabilities by malarial endemicity class (a categorical variable derived by applying established cut-off values for the PfPR 2-10 extracted from MAP). Generalised linear multilevel modelling (mixed effects, logistic) on 297 study-specific salivary antibody observations from 22 studies shows that the estimated anti-gSG6 IgG seroprevalence is higher for the higher endemicity classes ( Table 2). Interactions with DVS or region (Africa/non-Africa) could not be explored due to collinearity with malaria endemicity class. Therefore, in addition using Bayes best linear unbiased predictions (BLUPs) we estimated country-specific gSG6 IgG seroprevalence from an intercept-only multilevel model fitted to 301 study-specific salivary antibody observations from 22 studies. It showed that IgG seroprevalence to An. gambiae gSG6 was lowest in countries in the Pacific region where An. gambiae is absent (Vanuatu [31%] and Solomon Islands [32%]) and highest in countries where An. gambiae is a DVS (Benin [72%] and Burkina Faso [65%]; Appendix 9).
Assessments of internal and external study validity revealed there was a moderate risk of selection bias (Appendix 2) due to the study-specific inclusion criteria of populations at higher risk of malaria which contributed gSG6 seroprevalence observations. Sensitivity analyses exploring potential studylevel outlier influence on the estimated associations between anti-gSG6 IgG seroprevalence, HBR, and EIR showed no evidence of bias (effect estimates for each sensitivity analysis were consistent with model estimates overall) for studies identified as exhibiting potential influence (HBR: n = 6; EIR: n = 6). Observed study anti-gSG6 IgG seroprevalence Average anti-gSG6 IgG probability (95%CI) by Plasmodium spp. prevalence Figure 5. The association between anti-gSG6 IgG seroprevalence (%) and log 2 Plasmodium spp. prevalence (%). Figure shows the observed anti-gSG6 (either recombinant or peptide form) IgG seroprevalence (%) and prevalence of any Plasmodium spp. infection (%) for each study-specific observation, as well as the predicted average anti-gSG6 IgG seroprevalence (predicted probability for average study) with 95% confidence intervals (95% CI). Circles are proportional to the size of the sample for each study-specific observation, with colours indicating sample size: black (<50), red (50-100), navy (100-150), and green (>150). Association estimated using generalised linear multilevel modelling (mixed effects, logistic) to account for the hierarchical nature of the data, where study-specific anti-gSG6 IgG observations are nested within study. See Appendix 7 for model output.

Discussion
This systematic review and multilevel modelling analysis provides the first quantification of a positive non-linear association between seroprevalence of An. gambiae gSG6 IgG antibodies and HBR and demonstrated that its magnitude varied with respect to the DVS present in the area. Importantly, this review identified a paucity of studies conducted outside of Africa, as well as investigating salivary antigens representing different Anopheles spp. and antigenic targets. gSG6 antibodies were positively associated with the prevalence of Plasmodium spp. infection as well as established epidemiological measures of malaria transmission: malaria endemicity class and EIR. Overall, our results demonstrate that antibody seroprevalence specific for Anopheles spp. salivary antigens has the potential to be an effective measure of vector exposure and malaria transmission at the population and, potentially, individual level.
An. gambiae gSG6 IgG seropositivity increased with increasing HBR, although these increases had diminishing impact on An. gambiae gSG6 IgG seropositivity at higher levels of HBR (approximately greater than two bites per person per night). In our study, 17 studies performed across Africa (Angola, Benin, Burkina Faso, Cote d'Ivoire, and Senegal) and the Asia Pacific (Cambodia, Myanmar, and the Solomon Islands) reported an HBR < 2, demonstrating the applicability of gSG6 as a biomarker of HBR across a broad range of malaria-endemic regions. We also observed that the association was strongest in areas where An. gambiae s.l. was the only DVS (i.e. concordant An. gambiae species-specific HBR with An. gambiae gSG6 antibodies). Associations, albeit weaker, were also observed between discordant species-specific HBR and gSG6, most likely because the An. gambiae SG6 gene shares moderate sequence identity with vector species that are dominant in other regions (Africa: 80% An. funestus; Asia: 79% Anopheles stephensi and Anopheles maculatus; 54% An. dirus; Pacific: 52.5% Anopheles farauti), and is absent from the DVS of the Americas (An. albimanus and An. darlingi) . The generalisability of An. gambiae gSG6 IgG as a biomarker of exposure to other Anopheles spp. may therefore be limited. However, our review also identified a paucity of studies investigating additional salivary antigenic targets and Anopheles species not present in Africa. The identification of novel salivary antigens that are species-specific will be valuable in quantifying exposure to the other Anopheles vectors that share limited identity with An. gambiae SG6 (such as An. farauti and An. dirus), as well as Anopheles spp. which lack SG6 (as done for An. albimanus and An. darlingi; Londono-Renteria et al., 2020a;Londono-Renteria et al., 2020b). An Anopheles species-specific serological platform could advance vector surveillance by more accurately capturing exposure to DVS in the South American and Asia Pacific regions which exhibit diverse biting behaviours and vector Table 2. Association between gSG6 IgG seroprevalence (%) and malarial endemicity (PfPR 2-10 ). competence (DVS typically bite outdoors during the night and day, respectively; The Malaria Atlas, 2017; Sinka et al., 2012;Sinka et al., 2010;Trung et al., 2005;Herrera et al., 2015;Chaumeau et al., 2018), as well as the increasing threat of urban malaria from An. stephensi in Africa (Takken and Lindsay, 2019;Sinka et al., 2020). This review demonstrated that the prevalence of Anopheles salivary antibodies increased with increasing prevalence of Plasmodium spp. infection (although confidence intervals were wide and we observed heterogeneity in the effect between studies) as well as established epidemiological measures of malaria transmission: malaria endemicity class and EIR. Anti-salivary antibodies, such as SG6 IgG, may therefore have the potential to serve as a proxy measure for receptivity of a population to sustain malaria transmission. Their application could be particularly relevant in pre-elimination areas, or non-endemic areas under threat of imported malaria, where Anopheles salivary antibodies are more readily detectable than parasites; salivary antibodies were predicted to be prevalent (20%) in areas defined as eliminating malaria (<1% PfPR 2-10 ). Furthermore, if SG6 IgG seroprevalence can be effectively combined with a measurement of the sporozoite index, salivary antibodies as a marker of HBR could help overcome sensitivity limitations of EIR in low transmission areas. Additional measures could include estimates of malaria prevalence or serological biomarkers that are species-or life stagespecific (e.g. Plasmodium spp. pre-erythrocytic antigens as biomarkers for recent parasite inoculation). Indeed, positive associations between antibodies specific for Plasmodium spp. pre-erythrocytic and blood stage antigens with gSG6 were demonstrated in analyses of data from diverse malariaendemic areas. Serological tools combining salivary antigens with antigens specific for the different Plasmodium spp. could be easy to employ and complement malaria surveillance programmes. These tools may be particularly useful in the Asia Pacific, a region of relatively low malaria transmission with goals of elimination, but the highest burden of P. vivax malaria where blood stage infection can be caused by relapses from dormant liver stages. In these areas, parasite prevalence may therefore overestimate ongoing malaria transmission, making vector surveillance tools essential to informing elimination strategies in the Asia Pacific and other regions where P. vivax is endemic.
The gold standard entomological measures HBR and EIR provide crude population-level estimates of vector and malaria exposure that are specific in space and time and preclude investigation of individual-level heterogeneity and natural transmission dynamics. Our study demonstrated that salivary biomarkers measured at the individual level, such as gSG6 IgG, can be used to quantify total vector exposure at the population level, without requiring laborious entomological experiments. However, validating an individual-level serological measure, which demonstrates considerable individual-level variation, against the imperfect population-level gold standards of HBR and EIR is challenging and reflected in the variation in study-specific estimates in the association between gSG6 IgG and HBR in modelling analyses. However, the accuracy of salivary antibodies to measure individual-level exposure to Anopheles bites is yet to be validated; literature searches identified no studies investigating this association at the individual level. Without detailed measurements of individual-level vector exposure, or a detailed knowledge of the half-life of Anopheles salivary antibodies post biting event, the true accuracy of salivary antibodies, such as SG6 IgG, to measure individual-level HBR remains unknown. This knowledge is particularly pertinent where Anopheles salivary biomarkers might be applied to assess the effectiveness of a vector control intervention or used to measure temporal changes in malaria transmission; particularly in areas or populations where there is considerable heterogeneity in individual-level risk of Anopheles exposure (e.g. unmeasured outdoor biting due to occupational exposure for forest workers; Sandfort et al., 2020).
The broad nature of our inclusion and quality criteria was a key strength of our systematic review, which aimed to provide a comprehensive analysis of all Anopheles salivary biomarkers and determine their associations with entomological and malariometric measures of transmission. However, this review has two main limitations. First, despite the inclusive nature, assessment of the external validity of the review revealed a moderate risk of bias; some studies exhibited a high risk of selection bias as they were performed in specific high-risk populations not representative of the overall population (i.e. children only). This is accounted for to some degree by specification of a random effect (i.e. intercept) for study, which accounts for unmeasured study-specific factors that may introduce study-specific measurement error to measurement of the outcome. Second, with respect to internal validity, there may be potential selection bias introduced by the exclusion of studies reporting zero HBR (7 observations from three studies; Rizzo et al., 2011b;Pollard et al., 2019;Sagna et al., 2013b), EIR (22 observations from three studies; Rizzo et al., 2011b;Soma et al., 2018), and malaria prevalence (15 observations from three studies; Idris et al., 2017;Sagna et al., 2013b;Kerkhof et al., 2016) estimates, given we modelled the log of these factors. However, adding a small constant (e.g. 0.001) to a zero value to permit modelling of a log estimate can also introduce considerable bias (i.e. seemingly small differences between values become very large on the log scale). In light of this, we also chose to provide estimates of association and gSG6 IgG seroprevalence according to a selected range of epidemiologically relevant hypothetical HBRs (no widely accepted HBR classification exists in the literature) and according to widely accepted, discrete, endemicity classes according to MAP estimates (which permitted inclusion of all studies) to provide epidemiological context. However, there is the potential for misclassification of malarial endemicity class derived from geospatially extracted MAP predictions of PfPR 2-10 which increase in uncertainty in areas with scarce data. Similarly, we used MAP vector occurrence data to inform DVS categories for 7 (out of 42) studies. Cross-referencing these 7 studies with a 2017 updated database for African vectors (using data for the nearest neighbouring village) identified 10 discrepant datapoints from 3 studies (from a total of 28 datapoints from 7 studies) (Snow, 2017). Any misclassification events may cause us to underestimate the standard error in the effect of malaria endemicity class and DVS on gSG6 IgG.

Conclusions
In order to advance progress towards malaria elimination, the World Health Organization has called for innovative tools and improved approaches to enhance vector surveillance and monitoring and evaluation of interventions (World Health Organization, 2017). Our systematic review has provided evidence that Anopheles salivary antibodies are serological biomarkers of vector and malaria exposure, by quantifying their positive association with Anopheles-HBR and established epidemiological measures of malaria transmission. These salivary biomarkers have the potential to replace crude population-level estimates of entomological indices with a precise and scalable tool that measures Anopheles vector exposure at the individual level. This approach could be expanded into a serosurveillance tool to assess the effectiveness of vector control interventions, define heterogeneity in malaria transmission, and inform efficient resource allocation that would ultimately accelerate progress towards elimination. Freya    We performed a systematic review with multilevel modelling of the published literature according to the MOOSE guidelines (Stroup et al., 2000) and the PRISMA specifications (Moher et al., 2009). The protocol was registered with PROSPERO (CRD42020185449). The electronic databases PubMed, Scopus, Web of Science, African Index Medicus, and the Latin American and Caribbean Health Sciences Literature (LILACS) were searched for studies published before 30 June 2020 investigating Anopheles salivary antigens as a biomarker for mosquito exposure or malaria transmission. Search terms were as follows: Anophel* AND saliva* AND (antibod* OR sero* OR antigen OR marker* OR biomarker* OR gSG6* OR gSG* OR SG* OR cE5). The reference lists of included studies were screened for additional studies, and Google Scholar was used to identify additional works by key authors. No formal attempt was made to identify unpublished population studies as it would have required significant description of the design, methods, and analysis used in these studies, and a review of ethical issues.

Selection criteria
The primary criterion for inclusion in this systematic review was the reporting of observations of seroprevalence or total levels of Ig antibodies (including all isotypes and subclasses) in human sera against recombinant or synthetic peptide Anopheles salivary antigens. We considered for inclusion cross-sectional studies, cohort studies, intervention studies, and case-control studies of individuals or populations (including sub-populations) living in all geographies with natural exposure to Anopheles mosquitoes. Studies that were solely performed in participants not representative of the wider population (i.e. mosquito-allergic patients, soldiers, returned travellers) were excluded. The minimum quality criteria for inclusion in this review were antibody detection performed using enzyme-linked immunosorbent assay (ELISA), multiplex or Luminex assays.
The exposure variables of interest included entomological and malariometric parameters, including (i) HBR, defined as the number of bites received per person per unit of time; (ii) EIR, defined as the number of infectious bites per person per unit of time, calculated as the HBR multiplied by the sporozoite index; (iii) estimates of malaria prevalence; and (iv) population-level seroprevalence estimates against Plasmodium spp. malarial antigens. To ensure HBR estimates were given for the same unit of time (bites per person per night), biting rates given per week were divided by 7, and biting rates given per month we multiplied by 12 and divided by 365. Similar approaches were employed to ensure consistent units for EIR (infectious bites per person per year). Plasmodium spp. infections had to be confirmed by either microscopy, RDT, or molecular methods (PCR). Plasmodium spp. diagnosis was included for all Plasmodium spp. combined and the species level if provided. Where exposure estimates were not provided, we attempted to source data from other publications by the authors or used the site geolocation and year to obtain estimates of EIR from the Pangaea dataset (Yamba et al., 2018). P. falciparum rates in 2-10 year olds (globally, 2000-2017) and DVS from the MAP (The Malaria Atlas, 2017). Studies of salivary antigens where exposure variables could not be sourced and data that could not be extracted were excluded.

Selection of studies
One author performed database searches and screened reference lists to identify possible studies. One author screened studies against inclusion criteria, with discussion and input from a second reviewer.

Approaches to include all available studies
The authors of any studies that did not contain relevant information on the study design, populations, eligibility criteria, or key study data were contacted and relevant data requested. Authors were contacted via an initial email detailing the precise nature of the systematic review and the data required. If the authors did not reply to three email requests or were unable to provide relevant data, the studies were deemed to insufficiently meet inclusion/quality criteria and were excluded. As measurement of antibody levels does not produce a common metric between studies, authors were asked to classify their participants as 'responders' or 'no-responders' according to seropositivity (antibody level relative to unexposed sera) within each study to allow comparisons of seroprevalence between studies (Cutts et al., 2020;Cutts et al., 2014;Fowkes et al., 2010). Studies that were only able to provide antibody levels or categorised seropositivity based upon arbitrary cut-offs were excluded from multilevel modelling analyses and included in narrative terms. Where the salivary antibody response and exposure variable were measured in the same population and reported in multiple publications, the study with the largest sample size was included, otherwise the earliest study was included.

Data extraction
Data were extracted using a data collection form by one reviewer. Any data that was provided at the sub-population level was extracted at the lowest level, that is, if a study was performed across multiple sites, and an estimate for both salivary antibody seroprevalence/levels and the exposure of interest is given for each site, it was included the site level, rather than an aggregated level.

Outcomes
The primary outcome of interest of our systematic review was the reported antibody response (both seroprevalence and levels of all Ig subclasses and isotypes) to Anopheles salivary antigens. Multilevel modelling analyses were performed where the seroprevalence of antibodies against the same antigen and the exposure of interest were reported in more than one study.

Exposures
The primary exposures of interest included in the multilevel modelling analyses were the HBR and EIR, a measure of the average number of bites received per person per night and infectious bites received per person per year, respectively. Secondary exposures assessed include the prevalence of any Plasmodium spp. infection (including P. falciparum only, P. vivax only, or untyped infections). Additional secondary exposures include the P. falciparum infection rate in 2-10 year olds extracted from MAP, as well as the seroprevalence of antimalarial antibodies against pre-erythrocytic and blood stage antigens.
Clinical and methodological heterogeneity were explored using prespecified variables to minimise spurious findings. Variables considered for inclusion were study design (cohort, crosssectional, repeated cross-sectional), DVS, study participants (adults only, children only, adults and children), preparation of salivary antigen (recombinant full-length protein, synthetic peptide), malaria detection methodology (light microscopy, RDT, PCR), and entomological vector collection methodology (human landing catch, light traps, and spray catches).

Statistical analysis
Where there were sufficient data to pool observations of the same exposure and outcome measures, generalised linear multilevel modelling was used to undertake analyses quantifying associations between the exposures of interest and salivary antibody seroprevalence measurements. Models were generalised through use of the logit link function and binomial distribution (statistical notation for HBR model shown below as Equation 1). Seroprevalence was modelled in binomial form as the number of individuals seropositive to the total sample size. A three-level random effects model with a nested framework was used to account for dependency in the data, with random intercepts for country (level 3) and study (level 2) estimated. Hence, level 1 units represented multiple salivary antibody observations within a study induced by the study design (i.e. multiple time points, sites, age categories). Additionally, study-level random slopes for entomological and malariometric exposures were estimated to permit the effects to vary across studies. Model structure was determined empirically through likelihood ratio tests (p<0.05), with the exception of country at the third,level which was included in HBR and EIR analyses to estimate country-specific seroprevalence estimates of anti-salivary antibodies. The associations between the various exposures and the different salivary antigens were analysed separately; however, observations of IgG seroprevalence against the recombinant full-length protein (gSG6) and synthetic peptide (gSG6-P1, the one peptide determined in all studies utilising peptides) form of the gSG6 antigen were analysed together, with a fixed term for antigen construct considered for inclusion in the model. Of note, gSG6 peptide 2 (gSG6-P2) was excluded from being analysed with gSG6 and gSG6-P1 as the two studies that reported anti-gSG6-P2 IgG seroprevalence also reported the seroprevalence of anti-gSG6-P1 IgG and only one could be included. Potential effect modification of the associations between the exposures of interest and the anti-Anopheles salivary antibody responses was explored and undertaken by estimating interaction terms for DVS (An. gambiae s.l. only, An. gambiae s.l. and other DVS, or An. gambiae s.l. absent) and for vector collection method (human landing catch or other indirect measures, e.g. light traps, spray catches, etc.). For the association between Plasmodium spp. prevalence and gSG6 IgG seropositivity, interaction terms for malaria detection methodology (light microscopy or PCR), and malarial species type (P. falciparum only, or P. falciparum and P. vivax) were estimated. Other variables considered for inclusion in adjusted models were study design, participant, and salivary antigen construct; however, these variables showed no association with anti-gSG6 IgG and were thus excluded.
AIC and BIC fit indices were used to determine the best-fitting functional forms for the association between log odds of gSG6 IgG seropositivity and HBR, EIR, and Plasmodium spp. prevalencelinear, log, quadratic, and cubic functions were fitted, with a log transformation exhibiting superior model fit (Appendix 1-table 1). To aid interpretation, we present our results as a relative increase in the odds of the gSG6 IgG seropositivity for a twofold (100% relative) increase in the exposures. Additional relative percent changes in HBR and EIR are also presented.
Appendix Empirical Bayes BLUPs were used to estimate the probability of gSG6 IgG seropositivity in the average study and country, which is equivalent to an estimated gSG6 IgG seroprevalence. In order to maximise the number of included studies in our modelling, we predicted anti-gSG6 seroprevalence according to endemicity class, derived by applying established endemicity cut-offs to PfPR 2-10 estimates (Bhatt et al., 2015) extracted from MAP using site year and geolocation (if MAP data unavailable endemicity as stated in study). ICCs and 95% reference ranges were estimated for country-, study-, and slope-specific heterogeneity (where appropriate) using estimated model variance components.
Statistical notation for the generalised linear multilevel model (mixed effects, logistic) used to estimate the association between An. gambiae gSG6 IgG seropositivity and HBR The model can be formally written as where ζ 1j ~ N(0, ψ 1 ), ζ 2i ~ N(0, ψ 2 ) and ζ 3j log ( HBR ) ij ~ N(0, ψ 3 ), (2) where x ij is the vector of model covariates, β 1 is the model constant and represents the log odds (probability) of gSG6 IgG seropositivity for a log HBR of zero, β 2 is the fixed effect for log HBR for country j and study i, ζ 1j is the random effect (i.e. intercept) for between-country heterogeneity in probability of gSG6 IgG seropositivity, ζ 2i, is the random effect (i.e. intercept) for between-study heterogeneity in probability of gSG6 IgG seropositivity, and ζ 3i is the random effect (i.e. coefficient) for between-study heterogeneity in the effect of log HBR.

Risk of bias in individual studies
For cross-sectional, cohort or intervention studies, selection bias was assessed by reviewing the studies' inclusion and exclusion criteria. Any case-control studies or studies that presented salivary antibody data stratified by malaria infection status were included in narrative terms only. Risk of bias was assessed by one reviewer using the Risk of Bias in Prevalence Studies tool (Hoy et al., 2012). The risk of bias pertains to the reported observations of anti-Anopheles salivary antibody seroprevalence included in the multilevel modelling.

Model fit indices
Akaike's information criterion 2995.7 Bayesian information criterion 3010.1 HBR association: log odds ratio and standard error (SE), 95% confidence interval (95% CI), p-value, random-effect components (RE): variances (ψ), conditional intraclass correlation coefficient ICC (ρ),* and model log likelihood (ℓ) from generalised linear multilevel modelling (mixed effects, logistic). † This analysis is based upon n = 132 studyspecific observations from 12 studies. Of note, five studies that measured HBR and IgG antibodies to gSG6 were excluded from this analysis as they only reported gSG6 IgG levels. *ρ = ψk+ ...+ ψnk ψk+ ...+ ψnk+ π 2 /3 , where ψ k through ψ nk are random-effect variance estimates pertaining to each of the respective variance components (see table notes ‡- ¶ ) from the generalised linear multilevel modelling (mixed effects, logistic) for a specific ICC estimate. †Generalised linear multilevel modelling (mixed effects, logistic) estimating the association between log transformed HBR and anti-gSG6 IgG seropositivity with random effects for country-specific and study-specific heterogeneity in gSG6 IgG seroprevalence and study-specific heterogeneity in effect of HBR. ‡ ψ 1 , ψ 2 , and ψ 3 represent variances of the random effects for country, study, and effect of HBR, respectively. § ρ 1 represents conditional ICC for salivary antibody observations from the same country but different study. ¶ ρ 2 represents conditional ICC for salivary antibody observations from the same country and study with the median HBR.

Appendix 6
Association between gSG6 IgG seropositivity and EIR

Model fit indices
Akaike's information criterion 1071.3 Bayesian information criterion 1079.5 Entomological inoculation rate (EIR) association: log odds ratio and standard error (SE), 95% confidence interval (95% CI), p-value, random-effect components (RE): variances (ψ), conditional intraclass correlation coefficient (ICC) (ρ)* and model log likelihood (ℓ) from generalised linear multilevel modelling (mixed effects, logistic). † This analysis is based upon n = 38 study-specific observations from eight studies. * ρ = ψk+ ...+ ψnk ψk+ ...+ ψnk+ π 2 /3 , where ψ k through ψ nk are random-effect variance estimates pertaining to each of the respective variance components (see table notes ‡- ¶ ) from the generalised linear multilevel (mixed effects, logistic) modelling for a specific ICC estimate. † Generalised linear multilevel modelling (mixed effects, logistic) estimating the association between log transformed EIR and anti-gSG6 IgG seropositivity with random effects for country-specific and study-specific heterogeneity in gSG6 IgG seroprevalence and study-specific heterogeneity in effect of EIR. ‡ ψ 1 , ψ 2 , and ψ 3 represent variances of the random effects for country, study, and effect of EIR, respectively. § ρ 1 represents the conditional ICC for salivary antibody observations from the same country but different study. ¶ ρ 2 represents the conditional ICC for salivary antibody observations from the same country and study with the median EIR

Model fit indices
Akaike's information criterion 5202.5 Bayesian information criterion 5215.9 Any Plasmodium species infections (including prevalence estimates of P. falciparum only, P. vivax only, both P. falciparum and P. vivax and untyped infections): log odds ratio and standard error (SE), 95% confidence interval (95% CI), p-value, random-effect components (RE): variances (ψ), conditional intraclass correlation coefficient (ICC) (ρ),* and model log likelihood (ℓ) from generalised linear multilevel modelling (mixed effects, logistic). † This analysis is based upon n = 212 study-specific observations from 14 studies. Of note, six studies that measured Plasmodium spp. prevalence and IgG antibodies to gSG6 were excluded from this analysis as five only reported gSG6 IgG levels and one was a case-control study. * ρ = ψk+ ...+ ψnk ψk+ ...+ ψnk+ π 2 /3 , where ψ k through ψ nk are random-effect variance estimates pertaining to each of the respective variance components (see table notes ‡ and § ) from the generalised linear multilevel modelling (mixed effects, logistic) for a specific ICC estimate. † Generalised linear multilevel modelling (mixed effects, logistic) estimating the association between the log prevalence of any Plasmodium spp. infection and anti-gSG6 IgG seropositivity with random effects for studyspecific heterogeneity in gSG6 IgG seroprevalence and study-specific heterogeneity in effect of Plasmodium spp. prevalence. ‡ ψ 1 and ψ 2 represent variances of the random effects for study and effect of Plasmodium spp. prevalence, respectively. § ρ 1 represents the conditional ICC for salivary antibody observations from the same study and with the median Plasmodium spp. prevalence.