A High-throughput Bead-based Affinity Assay Enables Analysis of Genital Protein Signatures in Women At Risk of HIV Infection

Longitudinally collected genital samples from women in HIV-serodiscordant relationships were analyzed using a high-throughput bead-based affinity assay, revealing elevated levels of proteins involved in epithelial barrier integrity and inflammation. Proteins identified using the affinity set-up were validated by label-free tandem mass spectrometry in a partially overlapping cohort with concordant results. The identified proteins are important markers to follow during assessment of mucosal HIV susceptibility factors and a high-throughput bead-based affinity set-up could be suitable for such evaluation.

Highlights
Genital samples were analyzed using a high-throughput bead-based affinity assay.
Proteins were validated by tandem mass spectrometry with concordant results.
Genital epithelial proteins were increased in HIV-serodiscordant women.

Women at high risk of HIV infection, including sex workers and those with active genital inflammation, have molecular signatures of immune activation and epithelial barrier remodeling in samples of their genital mucosa. These alterations in the local immunological milieu are likely to impact HIV susceptibility. We here analyze host genital protein signatures in HIV uninfected women, with high frequency of condom use, living in HIV-serodiscordant relationships. Cervicovaginal secretions from women living in HIV-serodiscordant relationships (n = 62) were collected at three time points over 12 months. Women living in HIV-negative seroconcordant relationships (controls, n = 25) were sampled at one time point. All study subjects were examined for demographic parameters associated with susceptibility to HIV infection. The cervicovaginal samples were analyzed using a high-throughput bead-based affinity assay. Proteins involved in epithelial barrier function and inflammation were increased in HIV-serodiscordant women.
By combining several methods of analysis, a total of five proteins (CAPG, KLK10, SPRR3, elafin/PI3, CSTB) were consistently associated with this study group. Proteins analyzed using the affinity set-up were further validated by label-free tandem mass spectrometry in a partially overlapping cohort with concordant results. Women living in HIV-serodiscordant relationships thus had elevated levels of proteins involved in epithelial barrier function and inflammation despite low prevalence of sexually transmitted infections and a high frequency of safe sex practices. The identified proteins are important markers to follow during assessment of mucosal HIV susceptibility factors and a high-throughput bead-based affinity set-up could be a suitable method for such evaluation.


(1), and about 64% of these infections occurred in sub-Saharan Africa. Sexual transmission accounts for most new HIV infections, and given that young women run a 44% higher risk of HIV infection as compared with age-matched males (1), research to understand biological factors affecting sexual transmission is a global health priority.
In Kenya, about 66% of HIV-infected adults live in HIV-serodiscordant relationships (17). HIV-serodiscordant couples who have unprotected sexual intercourse despite the risk of HIV transmission have been the focus of several studies to determine markers of natural HIV resistance (18-23). Understanding protective mucosal factors in the female genital tract (FGT) of such a highly relevant risk group for HIV acquisition, and how these factors are expressed over time, may provide novel avenues for prevention or tools to evaluate efficacy of pre-exposure prophylaxis trials at mucosal surfaces. Genital protein signature profiles including immune activation have indeed been proposed as objective measures of mucosal safety in clinical trials (24-26). Elucidating these protective factors will also improve our understanding of natural resistance in high-risk groups such as HIV-serodiscordant couples.
Previous studies of genital protein signatures of HIV-exposed seronegative (HESN) female sex workers have been undertaken using mass spectrometry (MS)-based techniques and have been cross-sectional (7-10). Bead-based affinity proteomic techniques provide an alternate approach that confers a higher-throughput assessment of individual samples (27,28). Here, we applied this technique to examine genital secretions in a unique cohort of HIV-serodiscordant couples from multiple time points, using samples from HIV-uninfected women in HIV-seroconcordant relationships as a control. This study thus complements previous reports which defined mucosal molecular signatures in women representing other risk groups for HIV infection, such as sex workers and women with active genital inflammation (7-10,12). In contrast to these previous studies, the HIV-serodiscordant women in this study had a low prevalence of clinical signs and symptoms of genital inflammation as well as a high frequency of safe sex practices, and we therefore hypothesized that their genital proteome composition would be comparable to that of control women. The objectives were thus to characterize mucosal protein signatures of women living in HIV-serodiscordant relationships and to evaluate the feasibility of using a high-throughput bead-based affinity set-up for evaluation of protein expression in cervicovaginal secretions (CVS).

EXPERIMENTAL PROCEDURES
Study Setting and Participants-All study participants were recruited and screened through voluntary counseling and testing centers in Nairobi, Kenya and were part of a larger cohort study as presented elsewhere (21). Study participants were women in heterosexual HIV-serodiscordant relationships (the "serodiscordant" group) in which the male partner was HIV-infected and the female uninfected. In addition, HIV-uninfected women in HIV-seroconcordant relationships were enrolled as a control group. Eligible participants were older than 18 years of age, not pregnant, reported sexual intercourse with their study partner three or more times in the three months prior to screening, and planned to remain together for the duration of the study. Couples in which the male partner was on antiretroviral therapy or had a history of clinical AIDS (WHO stage IV) were excluded. For the present study, we also excluded women using any type of hormonal contraception. Written informed consent was obtained from all study participants, and ethical approval granted by Institutional Review Boards at University of Washington, Karolinska Institutet, University of Manitoba and Kenyatta National Hospital.
Clinical and Laboratory Procedures-Women in the larger study were seen in the clinic at enrollment and every 3 months for 12-24 months. During these visits, questionnaires were administered, and blood specimens were collected. CVS samples included in the present study were collected at enrollment (0 months) and at 6 and 12 months from the serodiscordant women, whereas control women were sampled at a single time point at enrollment. Two cotton-tipped swabs were used to collect secretions from the cervical os and posterior vaginal fornix. Both swabs were placed in the same vial containing 5 ml of phosphate-buffered saline (PBS) and transferred on ice to the laboratory within 2 h of collection, centrifuged at 800 × g for 10 min at 4°C to remove cellular debris, and the supernatant was aliquoted and cryopreserved in cryovials at −80°C. All participants were tested for HIV-1, bacterial vaginosis, Trichomonas vaginalis, Treponema pallidum, and herpes simplex virus type 2 (HSV-2). HIV-1 rapid tests were conducted at the study clinic using both the Determine HIV-1/2 Rapid Test (Abbott Laboratories, IL) and the Bioline Recombigen HIV Test (Standard Diagnostic Inc, Suwon, South Korea). If either commercial rapid kit was positive, results were confirmed with an HIV-1 ELISA, which was performed using the Vironostika HIV Uni-Form II Ag/Ab ELISA kit (bioMérieux Laboratories, Netherlands). Bacterial vaginosis was defined as Nugent score 7-10 in a vaginal gram stain and vaginal swabs were used for culture of Trichomonas vaginalis (InPouch TV, BioMed Diagnostics, San Jose, CA). Syphilis was tested for using rapid plasma reagin tests (Becton, Dickinson and Company, Franklin Lakes, NJ) with confirmation using a Treponema pallidum hemagglutination assay (Randox Laboratories, Crumlin, UK). HSV-2 serostatus was determined by the HerpeSelect IgG ELISA kit (Focus Diagnostics, Cypress, CA) with optical density >3.5 defined as positive.
HIV-1 and HSV-2 testing was performed at all visits included in this study (enrollment, at 6 and at 12 months), and syphilis, trichomoniasis and bacterial vaginosis were assessed at enrollment.
For the HIV-infected male partners, HIV-1 RNA levels were measured from blood samples using the Gen-Probe HIV-1 viral load assay (Gen-Probe Incorporated, San Diego, CA, United States). The lower limit of detection for the viral load assay was 150 copies/ml. Male CD4+ T-cell counts were measured using a Becton Dickinson FACS Calibur flow cytometer (Becton Dickinson Bioscience, Franklin Lakes, New Jersey, United States). HIV-1 RNA levels and CD4+ T-cell count were measured at enrollment and every 6 months.
Selection of Protein Targets and Suspension Bead Array Generation-Proteins analyzed were selected based on previous MS analysis (13,30) and supplemented with proteins described as related to HIV resistance or inflammation (8,9,11) (supplemental Fig. S1). This resulted in the inclusion of 329 antibodies (Human Protein Atlas (HPA) Antibodies) to 171 proteins with pro- and anti-inflammatory roles as well as involvement in antimicrobial response, proteolysis, innate immunity, redox reactions, chemotaxis, cell adhesion, tissue integrity, and apoptosis (supplemental Table S1). In addition, we also included three antibodies targeting different forms of PSA. Two positive (anti-albumin and anti-human IgG, Dako, Glostrup, Denmark) and two negative (bare bead and IgG from nonimmunized rabbits, Bethyl, Montgomery, AL) controls were also included. The antibodies were immobilized onto color-coded magnetic beads (MagPlex, Luminex Corp., Austin, TX) as described previously (28).
Protein Profiling on Suspension Bead Arrays-Previously reported body fluid profiling protocols were here adapted for analysis of mucosal secretions (27). Briefly, the crude CVS samples were randomized, diluted, biotinylated and heat treated together with twelve technical replicates. The bead array was distributed in 384-well microtiter plates followed by sample transfer and incubation overnight at ambient temperature. Subsequent washing, crosslinking and detection were performed, and for each sample at least 50 beads of each identity were measured in a FlexMap 3D instrument (Luminex Corp., Austin, TX) and the results displayed as median fluorescence intensity (MFI) in arbitrary units. To evaluate assay reproducibility, samples were relabeled and incubated with the previously generated bead array. This analysis was performed as part of a larger study also including women using various hormonal contraceptives. Further details of the protein profiling are included in the supplemental Methods.
Immunohistochemistry-Paraffin blocks corresponding to 44 different normal tissue types including cervix and vagina were used for production of tissue microarrays (TMAs), and each tissue type was represented by 1 mm cores from three individuals obtained through the Uppsala Biobank as previously described (31). Four-micrometer sections of the TMA blocks were deparaffinized in Neo-Clear® (Merck Millipore, Darmstadt, Germany), hydrated in graded alcohols and blocked for endogenous peroxidase in 0.3% hydrogen peroxide. Antigen retrieval was performed by boiling the slides in Citrate buffer®, pH 6 (Lab Vision, Fremont, CA) for 4 min at 125°C, followed by cooling of the slides down to 90°C. Automated immunohistochemistry was performed using an Autostainer 480 instrument® (Lab Vision) as previously reported (31). Primary antibodies anti-actin fila- [PI3] (HPA017737) and the secondary reagent (UltraVision LP HRP polymer®, Lab Vision) were applied for 30 min each at room temperature, and slides were developed for 10 min using diaminobenzidine (Lab Vision) as chromogen. Incubations were followed by a rinse in Wash buffer® (Lab Vision) for 5 min, and slides were counterstained in Mayer's hematoxylin (Histolab) and cover slipped using Pertex® (Histolab) as mounting medium. Incubation with PBS instead of primary antibody served as negative control. The dilution of the primary antibody was optimized by testing the antibody on a selection of normal tissues and comparison with previously published literature, gene characterization data and RNA expression data. The Aperio ScanScope XT Slide Scanner (Aperio Technologies, Vista, CA) system was used to capture digital whole slide high-resolution images.
Statistical Analysis-Statistical analysis and data visualization were performed using R (32). Samples with deviating profiles were identified as outliers using robust principal component analysis (R package rrcov) and excluded from further data analysis. Raw data (in MFI) were subjected to MA-LOESS normalization to adjust for effects related to the four 96-well plates, based on the assumption that the mean for each plate should be similar (33). This is considered a very mild normalization method and was used for all analyses, and the data subjected to this type of normalization are referred to as "normalized data". Because of the potentially biologically relevant variation in sample composition, no other normalization was performed. Variation of MFI for each antibody across technical replicates was evaluated via their coefficients of variation. Two-group comparisons between study groups were evaluated for normalized data from the three time points separately using the Wilcoxon rank sum test (p < 0.05 regarded as significant). To account for the simultaneous and parallel testing of all included analytes, the obtained p values were subjected to multiple testing corrections using the Benjamini-Hochberg method (p.adjust function, R package stats) (34) and the presented numbers are the least significant over all time points and repeated assays.
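The Benjamini-Hochberg step-up adjustment referred to above can be illustrated with a minimal sketch. The study used the p.adjust function in R; the Python function below is only an illustrative re-implementation of the same arithmetic, and the function name and the five example p values are hypothetical, not data from this study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order.

    Illustrative re-implementation; the study itself used R's p.adjust.
    """
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, so that adjusted values
    # never increase as the raw p value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p values for five analytes (not study data).
raw_p = [0.001, 0.008, 0.039, 0.041, 0.60]
adjusted_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adjusted_p]
```

In this toy example the third and fourth analytes are nominally significant (p < 0.05) but no longer pass the threshold after adjustment, which is the behavior the correction is meant to provide.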
Assessment of the Influence of Potential Confounders-We used two statistical models to assess the influence of potential confounders: a bivariate linear regression model and a linear mixed effects (LME) model.
Bivariate Linear Regression Model-Proteins that were significantly associated with the HIV-serodiscordant phenotype after multiple testing adjustments were further investigated to determine if confounding factors accounted for the observed associations. Bivariate linear regression models were fit with the log-transformed intensity of each analyte included as the dependent variable and HIV-serodiscordant/control status and the potential confounder included as independent variables. Separate adjusted models were fit for each potential confounder, which were selected from factors differentially distributed (p < 0.1) between HIV-serodiscordant and control participants (age, any unprotected sex, PSA-positivity, HSV-2 positivity and bacterial vaginosis). Marital status also differed between the groups (p = 0.001), but because 98% of HIV-serodiscordant women were married, it was not included as a potential confounder in the model. The coefficient for HIV-serodiscordant status from the adjusted model was compared with the coefficient from a crude unadjusted model. Confounding was considered present if the adjusted coefficient differed from the crude coefficient by more than ±10%.
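The change-in-estimate criterion described above can be sketched as follows. This is a minimal illustration using closed-form ordinary least squares for one and two predictors, not the authors' R code; all function names and the toy data (eight women, one binary confounder, log intensities constructed for the example) are assumptions.

```python
def centered_sum(a, b):
    """Sum of products of deviations from the means of a and b."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))


def crude_coefficient(log_intensity, group):
    """Slope for group status in a model with group as the only predictor."""
    return centered_sum(group, log_intensity) / centered_sum(group, group)


def adjusted_coefficient(log_intensity, group, confounder):
    """Slope for group status in a model that also includes one confounder."""
    s_gg = centered_sum(group, group)
    s_cc = centered_sum(confounder, confounder)
    s_gc = centered_sum(group, confounder)
    s_gy = centered_sum(group, log_intensity)
    s_cy = centered_sum(confounder, log_intensity)
    return (s_gy * s_cc - s_cy * s_gc) / (s_gg * s_cc - s_gc ** 2)


def is_confounded(crude, adjusted, threshold=0.10):
    """True if the adjusted estimate differs from the crude one by more than 10%."""
    return abs(adjusted - crude) / abs(crude) > threshold


# Hypothetical toy data: 1 = serodiscordant, 0 = control.
group = [1, 1, 1, 1, 0, 0, 0, 0]
confounder = [1, 1, 1, 0, 1, 0, 0, 0]        # e.g. a binary exposure
log_intensity = [8, 8, 8, 7, 6, 5, 5, 5]     # constructed as 5 + 2*group + confounder

crude = crude_coefficient(log_intensity, group)
adjusted = adjusted_coefficient(log_intensity, group, confounder)
confounded = is_confounded(crude, adjusted)
```

Here the crude coefficient is 2.5 and the adjusted coefficient is 2.0, a 20% change, so the toy confounder would be flagged under the 10% rule.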
Linear Mixed Effects Model and Principal Variance Component Analysis-Analytes differentially detected in HIV-serodiscordant individuals were further investigated for any associations with sexual activity or clinical information collected at baseline in a LME model using the lme4 package in R (35). Details can be found in supplemental Methods. Briefly, age, PSA-positivity, reporting of any unprotected sex, HSV-2 positivity and bacterial vaginosis were included as fixed effects; visit code as an estimate of sample time was included as a random effect within the LME model. Principal variance component analysis was used to estimate sources of variability within the data attributed to both fixed and random effects within the LME model before and after adjustment (36). The resulting data from the LME model were subjected to both univariate analysis as described above and a multivariate, discriminant data analysis approach described below. Data from the LME model are referred to as "adjusted data," as compared with the normalized data described above.
Multivariate Analysis Using Adjusted Data from the LME Model-The least absolute shrinkage and selection operator (LASSO) algorithm was applied to determine the minimum set of proteins necessary to distinguish our two study groups. This was implemented using the Matlab software (MathWorks, Natick) as previously described (13). Briefly, K-fold cross-validation was used to determine the optimum value of the tuning parameter ("s") such that the resulting model had the lowest possible mean-squared prediction error. Resulting features were chosen as the minimum set of attributes necessary for phenotypic classification, and partial least-squares discriminant analysis (PLSDA) was used to assess the predictive ability of LASSO-selected features to distinguish the study groups. To prevent any potential confounding effects of seminal exposure, samples that tested positive for PSA (as identified by either the ARCHITECT Total PSA assay or the bead-based array analysis as described above) were excluded from this analysis.
Tandem Label-free MS/MS Analysis-To validate our findings using a separate technique, a subset of samples (n = 102) from the larger cohort (21) was analyzed using both the above described suspension bead array technique and a label-free tandem MS/MS technique. This subset consisted of women in serodiscordant relationships both with and without hormonal contraceptives (women with hormonal contraceptives were excluded from the first part of the study). Results based on the MS analysis have been published previously (14). Briefly, 50 μg of protein from each sample was denatured in 8 M urea and digested with trypsin using filter-aided sample preparation. Peptides were cleaned of salts and detergents with reverse-phase liquid chromatography (high pH RP, Agilent 1200 series micro-flow pump, CA) as described previously (13). Equal amounts of sample peptides were analyzed using a Linear Trap Quadrupole Orbitrap Velos mass spectrometer (Thermo Fisher) coupled to a nano-flow Easy nLC II (Thermo Fisher). MS data were analyzed label-free based on peak intensities using the program Progenesis LC-MS Software (v3, Nonlinear Dynamics) with default settings. Peptide quantification was performed in Progenesis QI software with "Relative quantification using Hi-N" where the 3 (n = 3) most abundant peptides for each protein are averaged for quantification. For the minimum threshold for data quantification, Progenesis QI defaults were used. Peptide identity searches were performed using Mascot v2.4.0 (Matrix Science) against the UniProtKB/SwissProt database (356786 entries at the time of search, 19/02/2015), restricting taxonomy to Human (Genus), Bacteria (Kingdom), Candida (Genus), Trichomonas (Genus). For the present study, only proteins restricted to Human taxonomy were analyzed. Enzyme specificity was set to "Trypsin" and 2 missed cleavages were allowed.
A fixed modification was carbamidomethyl (C) of cysteine, and a variable modification was set to oxidation (O) of methionine. Mass tolerance was 10 ppm for precursor ions and 0.5 Da for fragment ions. Protein probabilities were assigned by the Protein Prophet algorithm (37). The peptides identified were further confirmed using Scaffold (version 4.4.1) with the following criteria: ≤0.1% false discovery rate (FDR) peptide identification, ≤1% FDR protein identification and ≥2 unique peptides/protein. When proteins could not be differentiated based on the MS analysis because they contain similar peptides, proteins were grouped parsimoniously. The normalized abundance of each protein was calculated in Progenesis LC-MS Software by comparing log-transformed ratiometric abundance data to a normalization reference using a median and mean absolute outlier filtering approach (38). To measure technical reproducibility, the coefficient of variation among proteins quantified in 9 mix samples was used, and proteins with a coefficient of variation >25% between mixes were excluded from further analysis. Samples with a protein median outside of the ± 1.5 × interquartile range of median abundance of proteins identified across all samples were identified as outliers and excluded from analysis.
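The reproducibility filter described above (excluding proteins whose coefficient of variation across the 9 mix samples exceeds 25%) could look like the sketch below. It assumes the sample standard deviation and abundances on a linear scale, neither of which is specified in the text; the function names, protein labels and abundance values are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Coefficient of variation in percent (sample SD divided by mean)."""
    return 100.0 * stdev(values) / mean(values)


def filter_by_cv(mix_abundance, max_cv_percent=25.0):
    """Keep only proteins whose CV across the mix samples is within the limit."""
    return {
        protein: values
        for protein, values in mix_abundance.items()
        if coefficient_of_variation(values) <= max_cv_percent
    }


# Hypothetical abundances for two proteins measured in 9 mix samples.
mix_abundance = {
    "PROT_A": [100, 105, 95, 102, 98, 101, 99, 103, 97],
    "PROT_B": [100, 40, 160, 30, 170, 90, 110, 20, 180],
}
kept = filter_by_cv(mix_abundance)
```

In this toy input PROT_A varies by a few percent and is retained, whereas PROT_B varies by roughly 60% and is dropped.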
Comparing MS and Bead-based Affinity Proteomics-To enable comparison of protein levels without influence of adjustments related to potential confounders, normalized MFI values from the bead-based affinity technique were compared with normalized abundance data from the MS analysis using Spearman's correlation in GraphPad Prism version 6. The p values generated were corrected for multiple comparisons using Benjamini-Hochberg. Antibodies with a median MFI < median MFI of the negative control bead with IgG from nonimmunized rabbits were excluded from the comparison analysis.
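Spearman's correlation, used above to compare the two platforms, is the Pearson correlation of the ranked values. The study computed it in GraphPad Prism; the sketch below is a plain Python illustration with tied values given their average rank, and the function names and the five paired measurements are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired measurements of one protein in five samples.
bead_mfi = [120, 340, 560, 800, 1500]        # normalized MFI
ms_abundance = [1.1, 2.0, 1.9, 3.5, 9.0]     # normalized MS abundance
rho = spearman(bead_mfi, ms_abundance)
```

Because only ranks enter the calculation, the result does not depend on the very different measurement scales of the two techniques.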
Experimental Design and Statistical Rationale-As explained in Protein Profiling on Suspension Bead Arrays, for the bead-based affinity technique the samples were analyzed twice to evaluate assay reproducibility and there was minimal difference between the two runs. Three technical replicates per 96-well plate, resulting in a total of twelve replicates for the entire assay, were used to monitor technical variation. The total number of samples analyzed was 368 because of sample availability. These samples included women using hormonal contraceptives. For the part of this study comparing women in serodiscordant relationships with controls, women using hormonal contraceptives were excluded. Samples were distributed over the 96-well plates using an in-house generated R script that randomized the samples equally in terms of group (HIV serodiscordant versus control) and age. Because the data were not normally distributed, the Wilcoxon rank sum test was used to identify proteins associated with the HIV-serodiscordant versus control phenotype. A bivariate linear regression model was used to evaluate the influence of potential confounders (parameters differentially distributed between the study groups), as explained in Bivariate Linear Regression Model. By including visit code as a random effect in the LME model, we could adjust for differences in protein expression between the longitudinal visits of the same woman, and thereby reach a more accurate estimate of proteins associated with either study group. In addition, other potential confounders were adjusted for in the LME model. All results were subjected to multiple testing correction using Benjamini-Hochberg to reduce type 1 error.
For the MS experiment, the samples were run once, and no replicate analysis was performed. The total number of samples analyzed by MS was 102. These samples were originally chosen from the larger cohort described in Study Setting and Participants (21) to be included in a separate study examining the effect of injectable hormonal contraceptives and intravaginal practices on the mucosal barrier and microbiome in the FGT (14). This 102-sample subset consisted of women in HIV-serodiscordant relationships both with and without hormonal contraceptives.
This resulted in a subset of samples from serodiscordant women being analyzed by both MS and bead-based affinity proteomics. As described in Tandem Label-free MS/MS Analysis, protein mixes were used to monitor technical variation. In this MS/MS experiment, no biological replicates were performed; however, the same tandem MS/MS method has previously been validated with Western blotting (8). Because we could not assume normal distribution in our two data sets (bead-based affinity proteomics and MS), Spearman's correlation was used for the correlation analysis.

RESULTS
Study Group Characteristics-We performed protein profiling of 368 CVS samples using a bead-based affinity approach with direct biotin-labeling of proteins. Of these samples, 244 were selected for the initial part of the study where women not using hormonal contraceptives in HIV-serodiscordant relationships (referred to as serodiscordant) were compared with women in HIV-seroconcordant relationships (both partners HIV-seronegative, referred to as controls). For the women in serodiscordant relationships, three consecutive samples were collected every 6 months over a one-year period. The control women were sampled once. Out of the 244 samples, 17 samples from 13 women (10 serodiscordant and 3 controls) were identified as outliers. In order to only include complete time series from all women in the data analysis, we also excluded the additional time points from the 10 serodiscordant women. This resulted in 211 samples (62 serodiscordant women with 3 samples per woman and 25 controls with 1 sample per woman) that were subjected to biological data analysis in the initial stage (see Fig. 1 for an overview of the study). At enrollment, demographic data were collected and compared between the groups (Table I). A significant difference was found in age (mean: 31 years old in the serodiscordant group versus 25 years old in the control group, p = 0.007) and marital status (percentage married: 98% versus 64%, p = 0.001). Women in serodiscordant relationships were expected, partly because of extensive counseling at study enrollment, to have more protected sex (unprotected sex in past month 18% versus 76%, p < 0.001) and less PSA positivity (5% versus 35%, p = 0.002). Further, women with an HIV-seropositive partner were more likely to be HSV-2 seropositive (65% versus 36%, p = 0.03). Serodiscordant women also displayed lower levels of bacterial vaginosis (10% versus 26%, p = 0.071).
Groups were however comparable in number of sex acts in the past month, frequency of vaginal washing and vaginal drying habits, number of reported lifetime sexually transmitted infections (STIs), trichomoniasis, cervicitis, vaginitis and seropositivity for syphilis.
Seminal Proteins Detected in Both Serodiscordant and Control Women-Because of the higher degree of PSA positivity (as determined by the ARCHITECT Total PSA assay) in the control group at enrollment, we investigated the levels of seminal proteins including PSA and semenogelin-1 and 2 [SEMG1 and SEMG2] in all samples using our bead array. In concordance with the demographical data, a subset of individuals also showed high levels of these additional proteins (supplemental Fig. S2). Apart from one antibody detecting free PSA (p < 0.01, antibody 8A6), the differences at group level were not significant.

Proteins Involved in Inflammation and Epithelial Barrier Remodeling Found At Elevated Levels At All Three Time Points in Serodiscordant Women-To find differences in protein expression between the serodiscordant and control women, the results from each of the three consecutive time points were separately compared with the control group. Out of the 171 proteins analyzed we identified 24 proteins, analyzed by 30 antibodies, as significantly altered at all time points (Table II) with a median technical variance of 7% (range 2-13%, with two antibodies at 22% and 25%, respectively). For proteins targeted by multiple antibodies, the antibody providing the highest significance was selected for visualization (supplemental Fig. S3). A high correlation between antibodies targeting different parts of the same protein indicates that the correct protein was detected.

TABLE II Proteins with altered levels in CVS of serodiscordant women compared to controls.
Proteins with significantly altered levels at all three time points between the two study groups all displayed higher levels in the serodiscordant group. Adjusted p values were corrected for multiple testing with Benjamini-Hochberg (BH) and results for adjusted data are based on the linear mixed effects model.
Columns: Gene, Gene description, Uniprot ID, Antibody.

All 24 proteins were found to be higher in the serodiscordant group than the control group. The levels of four of these proteins (TTR, SPRR3, KLK10, and CAPG) also differed significantly when p values were adjusted for multiple testing (Fig. 2A; see supplemental Fig. S4 for the remaining 20 proteins). The intensities obtained from all individuals for the 24 altered proteins were subjected to correlation analysis, and correlation coefficients between antibodies were used for hierarchical clustering to investigate potential co-variance in protein levels. This resulted in three main clusters with KLK10 and CAPG organized into the same cluster and TTR and SPRR3 in two separate clusters (Fig. 2B).

Adjustment for Potential Confounders Reveal Concordant Results-To investigate whether potential confounders (age, unprotected sex, PSA positivity, HSV-2 seropositivity or bacterial vaginosis) affected the observed protein profiles for TTR, SPRR3, KLK10 and CAPG, we fit linear regression models, both with and without inclusion of the potential confounder, and re-evaluated the association between the serodiscordant women and controls. The results revealed that several of the parameters altered the relative associations by ±10%; however, most adjusted models resulted in stronger associations and all comparisons remained statistically significant (see supplemental Table S2). However, the magnitude of the association between serodiscordant status and CAPG decreased by 17% after adjustment for the presence of PSA (adjusted p value = 0.02 versus crude p value = 0.001) and the association between serodiscordant status and SPRR3 decreased by 12% after adjustment for bacterial vaginosis (adjusted p value = 0.01 versus crude p value = 0.002). In addition, all obtained intensities were adjusted using a LME model and this adjusted dataset was also subjected to univariate analysis. This analysis revealed concordant significant results for 13 of the 24 proteins, as indicated in Table II.
Changes in the Mucosal Proteome of the FGT Over Time Is Associated with Partner's Viral Load-Individual variation in protein intensity levels was assessed longitudinally in all time points collected in the serodiscordant group. Levels were classified as "changed" if the intensity ratio varied more than 1.5-fold between any of the three time points. For each protein, the number of samples with a profile classified as "unchanged"/"changed" was summarized, revealing annexin A3  We identified a group of serodiscordant women (n = 16) with a changing profile for most proteins (ranging from 179-311 out of the 332 antibodies) (supplemental Fig. S5) and compared their clinical and demographic characteristics to the remaining serodiscordant women to identify potential correlates of these changing profiles (supplemental Table S3). The only significant correlate was partner's viral load at enrollment, which was one log10 copies higher among the women who exhibited a high degree of protein profile changes over time.
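The "changed"/"unchanged" classification described above can be sketched as a ratio of the highest to the lowest intensity across a woman's three visits. This is one reading of "more than 1.5-fold between any of the three time points" and assumes strictly positive intensities; the function names, antibody labels and MFI values are hypothetical.

```python
def is_changed(intensities, fold_threshold=1.5):
    """True if any two time points differ by more than the fold threshold.

    Comparing the maximum with the minimum covers every pair of visits.
    Assumes strictly positive intensities.
    """
    return max(intensities) / min(intensities) > fold_threshold


def count_changed(profiles):
    """Number of antibodies whose three-visit profile is classified as changed."""
    return sum(is_changed(values) for values in profiles.values())


# Hypothetical MFI values at months 0, 6 and 12 for one woman.
profiles = {
    "antibody_1": [1000, 1200, 1400],   # max/min = 1.4  -> unchanged
    "antibody_2": [1000, 900, 1600],    # max/min = 1.78 -> changed
    "antibody_3": [500, 2000, 700],     # max/min = 4.0  -> changed
}
n_changed = count_changed(profiles)
```

Counting changed antibodies per woman in this way gives the kind of per-individual tally (for example, 179-311 of 332 antibodies) that the text uses to single out women with broadly changing profiles.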
Hierarchical clustering of protein level ratios was performed for the 24 proteins associated with serodiscordant status (Fig. 3). Among these, SPRR3 and S100 calcium-binding protein A9 [S100A9] were the two least-changing proteins; FGA, TTR and alpha 2-HS glycoprotein [AHSG] were the proteins with the highest ratios, indicating the greatest levels of change in serodiscordant women.
Multivariate Modeling Confirms Protein Profiles and Reveals Additional Anti-inflammatory Proteins Associated with the Serodiscordant Phenotype-As a complement to the univariate data analysis described above, data-driven multivariate models were generated using LASSO as a feature-selection method to reveal potentially intersecting biological relationships or functions. PLSDA was applied to the selected features to determine how well each protein set classified our phenotypes of interest, and separate models compared each of the three time points to the control group. The models were built on the LME-adjusted data, and PSA-positive samples were not included. The first model (Fig. 4, Model I), from time point 1 (month 0), demonstrated 89% calibration and 87% cross-validation accuracy, with the first two latent variables accounting for 29% of the variance (p = 0.021). Latent Variable 1 (LV1) partially differentiated serodiscordant participants (red) from the control group (blue). Five features (CSTB, SPRR3, TTR, KLK10, and PI3) were negatively loaded on LV1, indicating that they were positively associated with the serodiscordant group and/or negatively associated with controls, whereas six other proteins (mucin 12 [MUC12], C-C motif chemokine ligand 5 [CCL5], CCL18, calpain 1 [CAPN1], superoxide dismutase 1 [SOD1], and lactotransferrin [LTF]) were positively loaded on LV1, indicating a negative association with the serodiscordant group and positive association with controls. In a similar manner, the model generated using samples from time point 2 (month 6) demonstrated 91% calibration and 85% cross-validation accuracy, with the first two latent variables accounting for 31% of the variance (p = 0.0032; Fig. 4, Model II). Three features were negatively loaded on LV1, whereas four features were positively loaded on LV1.
The model generated using samples from time point 3 (month 12) demonstrated 92% calibration and 81% cross-validation accuracy, with the first two latent variables accounting for 35% of the variance (p = 0.007; Fig. 4, Model III). Eight features were negatively loaded on LV1, whereas three different proteins were positively loaded on LV1.
The three models generated through this analysis varied in feature number and specificity. Despite this variability, two proteins were consistently associated with the serodiscordant phenotype at all three time points, namely SPRR3 and TTR. Further, CSTB (a thiol protease inhibitor), PI3 (an antiprotease), and KLK10 (a serine protease) were positively associated with the serodiscordant phenotype in two out of the three time points measured. CAPN1 (a thiol protease involved in cytoskeletal remodeling), KRT1 (a keratin involved in the inflammatory response), and SOD1 (an antioxidant) were negatively associated with the serodiscordant phenotype in two out of the three time points.
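The feature-selection step can be illustrated with a minimal LASSO sketch solved by cyclic coordinate descent. It assumes centered features and a centered outcome (so no intercept is fitted), a hypothetical penalty `lam`, and toy data; the software, penalty selection, and the subsequent PLSDA step of the study are not reproduced here.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - Xb||^2 + lam * ||b||_1 by coordinate descent.

    X: list of rows with centered columns; y: centered outcome.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the residual that ignores feature j
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * beta[k]
                                      for k in range(p) if k != j))
                for i in range(n)
            ) / n
            norm = sum(X[i][j] ** 2 for i in range(n)) / n
            beta[j] = soft_threshold(rho, lam) / norm
    return beta


def select_features(beta):
    """Indices of features with a non-zero LASSO coefficient."""
    return [j for j, b in enumerate(beta) if b != 0.0]


# Toy data (hypothetical): outcome coded -1 = control, +1 = serodiscordant
y = [-1.0, -1.0, -1.0, 1.0, 1.0, 1.0]
X = [[-2.0, 1.0], [-1.5, -1.0], [-2.5, 0.0],
     [2.0, 1.0], [1.5, -1.0], [2.5, 0.0]]
beta = lasso(X, y, lam=0.1)
print(beta, select_features(beta))
```

In this toy example only the first feature tracks the group label, so the penalty is expected to drive the second coefficient to zero; the retained features would then be passed to a classifier such as PLSDA.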
Top Proteins of Interest Expressed in the Human Cervix and Vagina-The four proteins identified in the univariate analysis as significantly increased in the serodiscordant group after correction for multiple testing (TTR (39), SPRR3 (40), KLK10 (41), and CAPG (42)) were investigated for protein expression in human cervical and vaginal tissue sections from donors unrelated to the serodiscordant women (Fig. 5). Two additional proteins (CSTB (43) and PI3 (44)) that were associated with the serodiscordant phenotype in at least two out of three time points using the data-driven multivariate models were also included. Visualization of a specific protein in a tissue section suggests that it is locally produced and not only a result of plasma transudation into the genital secretion. TTR could not be visualized in any of these sections (data not shown), whereas SPRR3, KLK10, CSTB and PI3 were found expressed in superficial layers of squamous epithelium throughout the cervix and vagina. SPRR3 and CSTB displayed nuclear and cytoplasmic staining, whereas KLK10 and PI3 showed cytoplasmic and membranous expression. For CAPG, the antibody used for the protein profiling did not identify the protein in the tissue samples, whereas an independent antibody (HPA018843) generated distinct cytoplasmic staining in macrophages present in connective tissue of both cervical and vaginal mucosa. In summary, five (SPRR3, KLK10, CAPG, CSTB, PI3) out of the six proteins identified as associated with the serodiscordant phenotype by using multivariate analyses were also identified in situ in female genital tissue.

DISCUSSION
Here we found proteins involved in epithelial barrier integrity and inflammation at higher levels in CVS from HIV-uninfected women living in HIV-serodiscordant relationships relative to women with an HIV-seronegative partner. Previous studies using proteomics have identified both "danger signals" and "resistance factors" for sexual HIV acquisition in sex workers and in women with a high frequency of genital inflammation (7-10, 12). In contrast to these previous studies, the present study group of HIV-serodiscordant women had a low prevalence of clinical signs and symptoms of genital inflammation, as well as a high frequency of safe sex practices. These women were therefore hypothesized to have a genital protein signature comparable to that of control women. However, our data showed a significantly different molecular signature in women in HIV-serodiscordant relationships as compared with control women. This implies that women living in HIV-serodiscordant relationships must also be evaluated for subclinical mucosal markers of epithelial barrier dysfunction and inflammation, factors that could influence HIV susceptibility. Recently, it was demonstrated that vaginal dysbiosis (45) and genital inflammation (46) impair topical antiretroviral efficacy, thus highlighting the importance and clinical relevance of studies on the genital mucosal milieu in women living in HIV-endemic countries.
In the univariate analysis our bead-based affinity set-up identified several proteins associated with the serodiscordant phenotype, including serine proteases, serine protease inhibitors, and other factors with multiple functions in mucosal pro- and anti-inflammatory pathways. Common underlying functions of these proteins, such as SPINK5, KLK10, and SPRR3, included epithelial differentiation and wound repair. SPINK5, a serine protease inhibitor, is involved in epithelial desquamation and contributes to epithelial barrier integrity (47). KLK10 plays an active role in the desquamation of vaginal epithelial cells and the activation of cervicovaginal antimicrobial proteins (48,49). SPRR3 is induced during epithelial squamous differentiation and is an important component for epithelial barrier integrity (50,51). Both KLK10 and SPRR3 were visualized in tissue sections of cervical and vaginal mucosa. Some other factors, including the thyroid hormone transport protein TTR, are most likely derived from plasma transudate in genital secretions; TTR has no described role in genital mucosa and could not be visualized in the genital tissue sections. Although TTR protein levels are stable over the menstrual cycle (52), TTR gene expression can be affected by HIV exposure (53). In our present study, TTR levels were also stable over time and consistently associated with the HIV-serodiscordant phenotype. The mechanisms influencing TTR levels in this study group are, however, unknown.
By also using data-driven multivariate models and confirming the presence of identified proteins in situ in genital tissue sections, we altogether revealed five proteins (CAPG, KLK10, SPRR3, PI3 and CSTB) to be associated with the serodiscordant phenotype. PI3, also named elafin or skin-derived antileukoprotease, is a serine protease inhibitor that plays a major role as an anti-inflammatory mediator at mucosal surfaces and participates in the control of epithelial barrier integrity (54). PI3 is secreted from epithelial cells in the FGT (55) and exhibits anti-HIV-1 activity in vitro (55,56). Additionally, elevated levels of PI3 in genital secretions have been associated with relative HIV resistance in both cross-sectional and independent prospective cohort studies of commercial sex workers from HIV-endemic geographical areas (7,56). Two other studies representing the same sex-worker cohort (8,9) expanded these findings to identify elevated levels of both PI3 and CSTB among other proteins. Our present study complements not only previous studies assessing genital protein profiles in commercial sex workers but also studies in women with genital inflammation living in HIV-endemic areas (7-10, 12). In one of these studies, women with elevated mucosal cytokines and increased genital inflammation displayed protein signatures of dysfunctional epithelial barriers and increased immune cell movement, and it was proposed that these mechanisms contribute to increased risk of HIV acquisition (12).
An interesting observation in our study was the HIV-serodiscordant group's elevated levels of not only serine protease inhibitors, but also serine proteases. These serine proteases were primarily involved in tissue desquamation. The balance of proteases and protease inhibitors, as well as the combined effect of other pro- and anti-inflammatory proteins, likely contributes to the mucosal antiviral innate immune defense (57). To better determine this total effect, further functional analyses of the samples would be required. Serine protease inhibitors, including CSTB, aid in wound healing and have a role in epithelial damage repair and inhibition of inflammatory proteases. HIV-resistant women seem to overexpress serine protease inhibitors and underexpress serine proteases, possibly contributing to an anti-inflammatory environment and a more robust epithelial barrier (8-10). Another study further supported the hypothesis that HIV resistance is the result of a balance between down-regulation of serine proteinases and upregulation of their inhibitors in the genital mucosa (10). In contrast, inflammation profiles, which are associated with increased HIV transmission risk (58), led to significant epithelial barrier disruption in the FGT, reduction of cornified envelope factors, and lower antiprotease levels (12).
MS-based techniques have been used extensively to study secretions in the FGT (8, 12-14, 45). However, MS is time- and labor-intensive, and high-throughput antibody-based microarray techniques have emerged as a complement (59). A large portion of the work can be automated and performed in microtiter plates, thereby allowing many samples to be processed at the same time in a high-throughput fashion. Although the detection of a limited number of proteins using antibody-based methods, such as ELISA, is widely used, the use of high-throughput antibody-based assays for analysis of CVS is to our knowledge rare.
In a parallel study, a partially overlapping sample set was analyzed with shotgun mass spectrometry, which allowed us to make a direct comparison between two complementary proteomic techniques. Several proteins exhibited a high degree of correlation between the two methods, among them proteins associated with the HIV-serodiscordant phenotype (SPRR3, KLK10, PI3, and CAPG), whereas CSTB showed lower concordance. In addition, for several proteins there was a difference in correlation between antibodies directed toward the same protein. There are several possible reasons for such discrepancies between the two methods and between antibodies directed toward the same protein. First, both methods offer only relative, not absolute, quantification. Khoonsari et al. observed a greater correlation between MS and bead-based affinity proteomics when performing data normalization with spiked-in chicken ovalbumin as compared with other normalization methods, indicating that a more quantitative approach could yield higher correlations (60). However, there were some major differences between their study and ours, such as the study of cerebrospinal fluid (not CVS) and the depletion of high-abundance proteins prior to MS analysis. Second, the HPA antibodies used in our study have been validated for antibody selectivity using protein microarrays and their functionality explored for immunohistochemistry, but not for protein detection in solution. It is known that antibodies are context dependent (61), and an antibody might not recognize its target protein in its natural configuration, where hidden epitopes and/or posttranslational modifications could affect antibody affinity. However, in general we observed strong correlations between antibodies directed toward different regions of the same protein, indicating that the correct protein was captured.
In summary, results from the bead-based affinity proteomics technique correlated well with those from an MS-based method, and the technique is suitable for evaluating protein expression in genital secretions in a high-throughput setting.
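A per-protein comparison between the two platforms can be sketched as a correlation across the samples measured by both methods. The sketch below assumes a rank-based (Spearman) coefficient, which suits two relative, differently scaled read-outs; this section does not state which coefficient was used, and the function names and example values are hypothetical.

```python
def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical paired read-outs for one protein in five shared samples
affinity_signal = [210.0, 340.0, 295.0, 880.0, 415.0]  # bead-based intensity
ms_abundance = [1.2e6, 2.9e6, 2.1e6, 9.5e6, 2.6e6]     # label-free MS intensity
print(round(spearman(affinity_signal, ms_abundance), 2))
```

Because only the ordering of samples enters the calculation, the result is unaffected by the different intensity scales of the two platforms.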
Overall, the present study indicates a different baseline of protein abundance in HIV-serodiscordant women as compared with lower-risk women. Such a difference must be taken into consideration when interpreting results from clinical trials comprising individuals with high risk of HIV infection. Large intra-individual variations were observed for many proteins. ANXA3, ITIH1, and APOE had changing profiles in the largest number of individuals (60 of 62 serodiscordant women). Of the 24 proteins identified in the univariate analysis, SPRR3 and S100A9 were the two least-changing proteins, whereas FGA, TTR, and AHSG were the most variable in the largest fraction of serodiscordant women. The latter proteins are acute phase response proteins indicating a recent reaction to an inflammatory stimulus. Many parameters including semen deposition (62,63), composition of the vaginal microflora (15,16,64), hormonal status (13,14,65) and STIs (6,66) can influence genital inflammatory activity and protein expression. Such activities could possibly contribute to the fact that some of the serodiscordant women in our study exhibited changing profiles for most of the examined proteins. In the present study, we excluded women using any type of hormonal contraception and carefully recorded seminal contaminants and history of STIs to control for confounders. A closer look at their demographic parameters could however not reveal any specific activity that differed between women with changing versus stable protein expression, except for the sexual partner's plasma HIV levels, which were higher in those with high intensity variations. It could be speculated whether sexual exposure to higher levels of HIV induces inflammatory reactions, reflected as higher intensity variation, but the present study format does not allow such an evaluation. We also lack information on the menstrual cycle phase of the women as a potential confounder. 
Studies indicate that the menstrual cycle influences the genital proteome (13,65), and it is possible that the fluctuations of the genital proteome in our study were partly attributed to such hormonal changes. A previous study in the same cohort showed a very low prevalence of Chlamydia trachomatis and Neisseria gonorrhoeae (67), and therefore these potential confounders were not tested here.
The present study has further limitations including differences between the study groups for some of the demographic parameters including age, unprotected sex, PSA positivity, HSV-2-seropositivity and bacterial vaginosis. However, for TTR, SPRR3, CAPG and KLK10, most of the adjusted models resulted in stronger associations, and the HIV-serodiscordant versus control comparisons remained statistically significant. Similarly, 13 of the 24 proteins with significant association to the HIV-serodiscordant group in normalized data also showed concordant results when data was adjusted for the abovementioned parameters in a LME model, indicating that our findings are valid despite differences in demographics. Because seminal fluid in itself may induce inflammation in the cervix (62,63), it is especially relevant that the results remain after adjusting for PSA-positivity and unprotected sex. Despite this, unknown confounders may still exist that could contribute to differences in protein expression between the groups. Further, although some proteins were found with increased levels, their biological activities cannot be determined by the present techniques. Functional activity is dependent on the simultaneous presence of other proteins, including counteractive proteins, as well as their timing and location in the mucosal tissue (57). Another physiological parameter that could not be addressed was that proteins from CVS are not always representative of proteins active in the adjacent tissue. Basolateral secretions have a more direct effect on potential HIV target cells as well as the underlying mucosal fibroblasts, and the secretions of these cells, in turn, contribute to the female reproductive tract immune environment (68,69). Even though the methods do not address functional activity of the significantly elevated proteins, we could confirm their expression in situ. 
However, these stainings were performed on tissues that were neither age nor race matched to the women in the main study. An additional limitation is the low number of controls included in the study which could have contributed to the study being underpowered. However, despite our low sample number we observed statistically significant differences in protein expression between HIV-serodiscordant women and controls, yet larger studies are needed to confirm our findings.
In conclusion, factors that influence the integrity of the epithelial barrier and genital inflammation are present also in the general population in HIV-endemic regions and may contribute to altered susceptibility to HIV. Thus, the characterization of cervicovaginal protein signatures in different cohorts of women at risk for HIV infection, such as women living in HIV-serodiscordant relationships, and of how these signatures vary over time, is highly relevant to understanding HIV transmission and pathogenesis. In addition, a high-throughput bead-based affinity set-up could be a suitable method for evaluation of such cervicovaginal protein signatures.