Identification of Potential Serodiagnostic and Subunit Vaccine Antigens by Antibody Profiling of Toxoplasmosis Cases in Turkey*

Toxoplasmosis, caused by infection of the protozoan parasite Toxoplasma gondii, is associated with mild disease in healthy individuals, whereas individuals with depressed immunity may develop encephalitis, neurologic disorders, and other organ diseases. Women who develop acute toxoplasmosis during pregnancy are at risk of transmitting the infection to the fetus, which may lead to fetal damage. A diagnosis is usually confirmed by measuring IgG, or IgM where it is important to determine the onset of infection. A negative IgM result essentially excludes acute infection, whereas a positive IgM test is largely uninterpretable because IgM can persist for up to 18 months after infection. To identify antigens for improved diagnosis of acute infection, we probed protein microarrays displaying the polypeptide products of 1357 Toxoplasma exons with well-characterized sera from Turkey. The sera were classified according to conventional assays into (1) seronegative individuals with no history of T. gondii infection; (2) acute infections defined by clinical symptoms, high IgM titers, and low avidity IgG; (3) chronic/convalescent cases with high avidity IgG but persisting IgM; (4) true chronic infections, defined by high avidity IgG and no IgM. We have identified 38 IgG target antigens and 108 IgM target antigens that can discriminate infected patients from healthy controls, one or more of which could form the basis of a 'tier-1' test to determine current or previous exposure. Of these, three IgG antigens and five IgM antigens have the potential to discriminate chronic/IgM persisting or true chronics from recent acutely infected patients (a 'tier-2' test). Our analysis of the antigens revealed several enriched features relative to the whole proteome, which include transmembrane domains, signal peptides, or predicted localization at the outer membrane. This is the first protein microarray survey of the antibody response to T. gondii, and will help in the development of improved serodiagnostics and vaccines.


Toxoplasma gondii is an intracellular protozoan parasite found worldwide that infects mammals and birds, and is the causative agent of the disease toxoplasmosis in humans. Cats are the definitive host, and infection is transmitted to humans from infected fecal matter via ingestion of oocysts from contaminated soil or cat litter. Infection can also be transmitted from infected meat by consuming raw or undercooked meat (1,2). Seroprevalence, a measure of the number of people chronically infected, increases with age and decreased socioeconomic status. In general, rates have been declining in developing countries to around 20%, although seroprevalence may exceed 70% elsewhere (1,3,4). In Turkey, recent estimates indicate seroprevalence at 30-60% (5-9).
Most infections are asymptomatic in healthy individuals, although around 10-20% may experience mild, flu-like symptoms and lymphadenopathy during a primary infection. Individuals with compromised immune systems, such as those with AIDS or those undergoing therapeutic immunosuppression after transplantation, are particularly at risk from fatal complications, such as encephalitis, myocarditis, and pneumonitis. Women who become infected for the first time during the first trimester of pregnancy may also transmit infection to the fetus (congenital toxoplasmosis), causing spontaneous abortion or stillbirth, or overt symptoms in the newborn. Infections acquired later in pregnancy are usually asymptomatic in the newborn, although most will go on to develop retinochoroiditis, and some will develop blindness (1). Rapid and accurate diagnosis of acute infection in the pregnant mother is vital because treatment can reduce the risk of transmission and the seriousness of disease in the neonate. An incorrect diagnosis can result in unnecessary abortion or treatment with potentially teratogenic drugs (10).
Routine diagnosis of toxoplasmosis is based on the detection of T. gondii-specific antibodies. The original test, developed in 1948, is the Sabin-Feldman dye test (DT) for specific IgG (11). This is a sensitive and specific bioassay in which live parasites are sensitized with patient sera in the presence of complement, and neutralization (lysis) of the organisms is quantified. The dye test is still considered to be the reference test, but it is currently used by only a few specialized diagnostic laboratories owing to the requirement for cultured organisms and a high level of technical skill. More rapid and user-friendly ELISAs and agglutination tests are now routinely used for diagnosis. Most are based on largely undefined T. gondii lysates as the detection antigens (12-14), and recent attempts to use more defined antigens have met with only limited success (15-23). An indirect immunofluorescence assay (IFA) using whole, formalin-fixed T. gondii tachyzoites is also widely used to detect specific IgG (24,25).
Suspected cases are confirmed by testing for IgG, although this can be an indicator of either current or prior exposure. A negative result in someone with clinical symptoms of toxoplasmosis requires that the test be repeated after 2-3 weeks, by which time an immunocompetent infected individual should have seroconverted. IgG testing can also be conducted to help diagnose congenital toxoplasmosis in the newborn. A negative IgG result helps exclude infection, whereas a positive result is interpreted with caution as it may reflect passively acquired maternal IgG. Diagnosis in immunocompromised individuals is particularly challenging. For example, IgG titers in AIDS patients are often low and frequently below the sensitivity of detection. Diagnostic approaches based on other biomarkers or on PCR-based molecular techniques are being sought for these patients (26,27).
During pregnancy, a positive IgG result is followed by an IgM test to help determine whether the infection is current or from a previous exposure. Despite the widespread use of commercially available IgM test kits, their performance differs widely and the results are open to misinterpretation. A negative IgM test helps rule out acute infection, although a positive result is difficult to interpret because IgM can persist long after a primary infection (15,28,29). A positive IgM, negative IgG result requires that the patient be tested again ~2 weeks later to confirm seroconversion to IgG. No change in the IgG titer indicates the IgM was a false positive. If the patient is pregnant and positive for both IgG and IgM, an IgG avidity test is performed to help establish the time of infection (30,31). High avidity IgG in the first trimester indicates the exposure probably occurred before pregnancy and the fetus is at no risk for congenital toxoplasmosis. In contrast, low avidity IgG may help in the diagnosis of acute infection, although the result should be interpreted with caution as some individuals have low affinity IgG that persists for several months after infection (32-34). In all cases, it is recommended that a diagnosis of recent or acute infection be re-tested by an experienced toxoplasmosis reference laboratory using a panel of serologic and molecular tests, including the complement fixation test. Confirmation is particularly important in cases of suspected acute infection during pregnancy, as decisions on whether to terminate a pregnancy will rest on accurate diagnosis.
Overall, the algorithm for diagnosis of recent infection by serology is complex, and would benefit from a simpler set of tests. We hypothesized that a proteome-microarray approach to profile the antibody response during infection against thousands of different T. gondii proteins might help identify novel antigens with more precise serodiagnostic use. The aims of this study were to identify (1) IgM target antigens that were high in acute infection but which declined thereafter, and (2) IgG target antigens that were low in acute infection but high in chronic infection and in chronic infection with persisting IgM. To address these aims we used protein microarrays to screen 1357 prioritized T. gondii exon products with 106 well-characterized sera from Turkish toxoplasmosis cases and controls. We report the identification of target antigens of IgG and IgM specifically associated with infection, and a subset of antigens that may discriminate acutely infected patients from chronic/IgM persisting and true chronically infected patients. Seroreactivity is unevenly distributed among protein antigens, being enriched in those assigned to certain Gene Ontology (GO) categories or predicted to localize to the outer membrane. Our study will help in understanding the human response to natural infection and may lead to improved diagnosis of T. gondii infection.

MATERIALS AND METHODS
Ethics Statement-Human sera, collected previously from patients for serological testing of toxoplasmosis in Bornova/İzmir, Turkey, were maintained at the Ege University Medical School, Department of Parasitology, Bornova/İzmir, Turkey. All patients provided written informed consent prior to usage of their previously collected sera. For the present study, sera were coded and anonymized, and supplied to UCI for probing without patient identifiers other than clinical diagnosis, as approved by the Ege University Medical School, Research Ethics Committee (protocol number 10-10/7).
Sera-The serum samples used in the present study were classified into four groups. Group 1 was composed of seronegative individuals from Turkey with no known history of T. gondii infection. Group 2 was composed of sera from recently acute patients, collected in an outbreak that occurred in Turkey in 2002 (35). Sera were collected from these patients 1-2 weeks after the onset of symptoms. Group 3 was composed of chronic patients with persisting IgM antibodies and a high IgG avidity index. Group 4 was composed of chronic patients with negative IgM antibodies and a high IgG avidity index.
IgG IFA-IFA was performed as described with some modifications (36-39). Briefly, the slides coated with HeLa cell culture and BALB/c derived T. gondii RH Ankara strain tachyzoites were probed with anti-Toxoplasma IgG positive patient serum samples at dilutions of 1/16, 1/64, 1/128, 1/256, 1/512, and 1/1024 for 30 min at 37°C, and washed three times with phosphate buffered saline (PBS). The slides were then probed with anti-human IgG antibody conjugated with fluorescein (Biomerieux, France) at 1/1250 for 30 min at 37°C. Slides were washed and examined under an immunofluorescence microscope (Olympus, Tokyo, Japan) for quantification of fluorescent parasites. Sera that retained activity over 1/16 dilution were considered seropositive.
IgG and IgM ELISA-ELISAs were performed as described (14, 37, 40-42) with some modifications. Detection antigen was prepared from T. gondii RH Ankara strain tachyzoites obtained from peritoneal exudates of infected BALB/c mice. The tachyzoites were centrifuged at 500 × g for 5 min and quantified in the supernatant with a hemocytometer. The supernatant was then centrifuged for 10 min at 3000 × g and the pellet washed 3 times with PBS (pH 7.4). The pellet was resuspended in 1% SDS in distilled water and subjected to several cycles of freezing and thawing. Finally, the lysate was centrifuged at 14,000 × g for 15 min and the supernatant (antigen suspension) passed through a 0.22 μm filter (Macherey-Nagel, Germany).
For ELISAs, each well of a flat-bottom high-binding microtiter plate (Costar, Corning Inc., Corning NY) was coated with 100 μl antigen suspension containing the equivalent of 1 × 10^5 tachyzoites. Plates were incubated for 1.5 h at room temperature (RT). Next, serum samples at a dilution of 1/256 for IgG ELISA and 1/64 for IgM ELISA in blocking buffer (0.5% casein in 1× PBS, pH 7.5) were added to each well, incubated for 1 h at RT and washed 3 times with 1× PBS. The wells were probed with recombinant protein G (Zymed Laboratories Inc., San Francisco, CA) or anti-human IgM (Sigma) conjugated with peroxidase at dilutions of 1/50,000 and 1/5,000, respectively, for 30 min at RT. Thereafter, bound antibodies were visualized after adding 3,3′,5,5′-tetramethylbenzidine substrate. Reactions were stopped by adding 75 μl of 2 N sulfuric acid and the results quantified in a microtiter plate reader (Bio-Tek ELx808, U.S.A.) at 450 nm. Samples were considered positive if the absorbance value (AV) of the serum sample exceeded the mean AV + 7 S.D. (for IgG ELISA) or the mean AV + 5 S.D. (for IgM ELISA) of the negative control serum samples.
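The positivity rule above can be illustrated with a short sketch. This is only an illustration under stated assumptions: the function names are hypothetical, the absorbance values are invented for the example, and the sample standard deviation (n - 1 denominator) is assumed because the text does not say which estimator was used.

```python
from statistics import mean, stdev

# Multipliers taken from the text: mean + 7 S.D. for IgG, mean + 5 S.D. for IgM.
SD_MULTIPLIER = {"IgG": 7, "IgM": 5}


def elisa_cutoff(negative_avs, isotype):
    """Cutoff = mean AV of the negative controls + k * S.D. (sample S.D. assumed)."""
    k = SD_MULTIPLIER[isotype]
    return mean(negative_avs) + k * stdev(negative_avs)


def is_positive(sample_av, negative_avs, isotype):
    """A sample is positive when its AV exceeds the isotype-specific cutoff."""
    return sample_av > elisa_cutoff(negative_avs, isotype)


# Hypothetical absorbance values at 450 nm for five negative control sera.
negatives = [0.10, 0.12, 0.08, 0.11, 0.09]
igg_call = is_positive(0.20, negatives, "IgG")
igm_call = is_positive(0.20, negatives, "IgM")
```

With these invented controls, the same absorbance of 0.20 falls below the stricter IgG cutoff but above the IgM cutoff, which shows how the two multipliers change the call.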
IgM Capture ELISA-A commercially available IgM capture ELISA kit (Radim Diagnostics, Italy) was used according to the manufacturer's instructions. Controls provided in the kit and the serum samples were diluted to 1/100 in sample diluent and added to the microtiter plate precoated with monoclonal anti-human IgM antibody to capture serum IgM. The plate was incubated for 1 h at 37°C and washed 4 times with 1× PBS-T (PBS containing 0.05% Tween-20). Each well was probed with lyophilized inactivated Toxoplasma antigen reconstituted with monoclonal anti-Toxoplasma antibody conjugated with biotin, incubated for 1 h at 37°C and washed 4 times with 1× PBS-T. After incubation in streptavidin-conjugated HRP at 37°C for 30 min, the plates were washed 4 times in 1× PBS-T and bound antibodies visualized using tetramethylbenzidine developer at RT for 15 min. Reactions were stopped and quantified as above. The presence or absence of anti-Toxoplasma IgM was defined against the AV of the cutoff control supplied in the kit.
Avidity Assay-The IgG avidity assay was performed as described (30,43,44) with minor modifications. Flat-bottom high-binding microtiter plates (Costar) were coated with tachyzoite lysate as described for IgG ELISAs above. Next, serum samples diluted to 1/256 in 0.5% casein buffer were added to each well in two titration rows (Row A and B) and incubated for 15 min at RT. Then, 6 M urea in 0.5% casein buffer was added to each well of row A and 0.5% casein buffer without urea was added to row B. After incubation for 15 min at RT, each well was washed 3 times with 1× PBS and probed with recombinant protein G-HRP conjugate (Zymed Laboratories Inc.) at a dilution of 1/50,000 for 15 min at RT. Thereafter, bound antibodies were visualized using tetramethylbenzidine developer and stopped as above. The avidity index (AI) was expressed as a percentage using the formula (absorbance value of row A / absorbance value of row B) × 100. Sera associated with early infection (< 3-4 months) typically have an AI < 20%. Sera associated with late infection (> 6 months) typically have an AI > 30%, whereas an AI of 20-30% was considered borderline. In the present study, a serum sample with a low AI that was also positive by IgM capture ELISA was classified as an infection occurring in the previous 3-4 months (i.e. recently acute infection). A high AI with a positive IgM assay was classified as chronic/IgM persisting, whereas a high AI with a negative IgM assay was classified as chronic.
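The avidity index and the resulting group assignment can be sketched as follows. This is a minimal illustration, not the laboratory's software: the function names and the "unclassified" label are hypothetical, and combinations the text does not describe (for example a borderline AI) are deliberately left unclassified rather than guessed.

```python
def avidity_index(av_with_urea, av_without_urea):
    """AI (%) = (AV of row A, urea-treated / AV of row B, untreated) x 100."""
    if av_without_urea <= 0:
        raise ValueError("untreated absorbance must be positive")
    return av_with_urea / av_without_urea * 100.0


def avidity_band(ai):
    """Bands from the text: < 20% low, > 30% high, 20-30% borderline."""
    if ai < 20:
        return "low"
    if ai > 30:
        return "high"
    return "borderline"


def classify_serum(ai, igm_capture_positive):
    """Combine the AI band with the IgM capture ELISA result, as in the text."""
    band = avidity_band(ai)
    if band == "low" and igm_capture_positive:
        return "recently acute"
    if band == "high" and igm_capture_positive:
        return "chronic/IgM persisting"
    if band == "high" and not igm_capture_positive:
        return "chronic"
    # Not covered by the classification described in the text (assumption).
    return "unclassified"


# Hypothetical absorbance pairs (row A with urea, row B without urea).
acute_example = classify_serum(avidity_index(0.3, 2.0), True)
persisting_example = classify_serum(avidity_index(1.0, 2.0), True)
chronic_example = classify_serum(avidity_index(1.0, 2.0), False)
```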
Microarray Fabrication and Probing-Proteome microarrays were fabricated essentially as described previously (45-47) by PCR amplification of coding sequences in genomic DNA, followed by insertion of amplicons into a T7 expression vector by homologous recombination, and expression in coupled transcription-translations in vitro before printing onto microarrays. Rather than use cDNA as the PCR template, which may underrepresent genes expressed at low levels in vivo, we used genomic DNA and amplified exons separately, a strategy we have used previously with the malaria parasite (46). For PCR primer design, the genomic sequence of type II strain ME49 of T. gondii was obtained from the Toxoplasma Genomics Resource (http://toxodb.org/toxo/) and a sequential bioinformatic filtering strategy was applied to prioritize genes targeted for cloning based on features we have found in previous studies to be enriched within seroreactive antigen sets (see Results). Custom PCR primers were designed to amplify 2000 exons. The PCR primers comprised 20 bp of exon-specific sequence with 20 bp of "adapter" sequences, and were used in PCR reactions with 20 ng genomic DNA. Genomic DNA was obtained from type II Prugniaud strain T. gondii parasites that were freshly lysed out of monolayers of human foreskin fibroblasts and extracted using the Wizard Genomic Purification Kit (Promega) as per the manufacturer's instructions. For genes >3 kb, additional primer pairs were designed to amplify overlapping fragments of 3 kb each. PCR primers were also designed to amplify complete genes RON5 (48), ROP13 (49), PP2C-hn (50), PP2C2 (50), 002200, RON4, Toxolysin-1 putative rhoptry metalloprotease (51), ISP2 (52) and ISP1 (52) from plasmids encoding cDNAs.
The adapter sequences, which become incorporated into the termini flanking the amplified gene, are homologous to the cloning site of a linearized T7 expression vector and allow the PCR products to be cloned by in vivo homologous recombination in competent DH5α cells. We have since introduced a more efficient method which allows the recombination to occur in vitro, with much lower amounts of recombination product being transformed into Escherichia coli (53). The resulting protein incorporates an ATG translation start codon, a 5′ polyhistidine epitope, a 3′ influenza hemagglutinin epitope and a T7 terminator. Up to three additional rounds of amplification were attempted for amplification failures, which were usually recovered by adjusting the PCR conditions, using a different polymerase, or ordering new primers. For cloning, PCR products were mixed with a linearized expression vector (Antigen Discovery Inc., Irvine, CA) and used to transform supercompetent DH5α cells to kanamycin resistance. The cells were cultured at 37°C with vigorous aeration and checked for turbidity the following day. DNA was purified from the overnight cultures without prior colony selection using QIAprep 96 Turbo Miniprep Kits from Qiagen.
For array fabrication, purified minipreparations of DNA were expressed in the E. coli based in vitro transcription/translation (IVTT) expression system (RTS-100 from Roche). Ten-microliter reactions were set up in sealed 384-well plates and incubated for 16 h at 24°C in a platform shaker at 300 rpm. A protease inhibitor mixture (Complete, Roche Diagnostics) and Tween-20 to a final concentration of 0.05% were then added prior to printing. The RTS reactions were printed in singlicate without further purification onto three-pad nitrocellulose-coated FAST slides (Whatman) using a Gene Machine OmniGrid Accent microarray printer (Digilabs Inc.) in 4 × 4 subarray format, with each subarray comprising 108 spots. Each subarray included multiple negative control spots comprising "mock" RTS reactions lacking DNA template. Each subarray also included positive control spots of four serial dilutions of mouse, rat and human whole IgG and two serial dilutions of human IgM and mouse IgM. Together these positive and negative controls were used to normalize the data from different arrays (see below). Also included was purified recombinant Epstein-Barr virus nuclear antigen-1 (EBNA-1, DevaTal, Inc., Hamilton, NJ), which is recognized by the majority of humans and serves as a useful guide to serum quality.
To monitor the expression in each spot we used antibodies against the N-terminal poly-His (clone His-1, Sigma) and the C-terminal HA (clone 3F10, Roche) tags engineered into each protein. Arrays were first blocked for 30 min in Protein Array Blocking Buffer (Whatman, Keene, NH) at RT and then probed for 1 h with anti-tag antibodies diluted 1/1,000 in blocking buffer. The slides were then washed 6× in tris(hydroxymethyl)aminomethane (Tris)-buffered saline containing 0.05% (v/v) Tween 20 (T-TBS) and incubated in biotinylated secondary antibodies (Jackson ImmunoResearch). After washing the slides 6× in T-TBS, bound antibodies were detected by incubation with streptavidin-conjugated SureLight® P-3 (Columbia Biosciences, Columbia MD). The slides were then washed 3× each in T-TBS followed by TBS, and dipped in distilled water prior to air drying by brief centrifugation. Slides were scanned in a Perkin Elmer ScanArray confocal laser scanner and data acquired using ScanArrayExpress software.
For probing with human sera, samples were diluted to 1/200 in Protein Array Blocking Buffer supplemented with E. coli lysate (Antigen Discovery Inc.) at a final concentration of 10 mg/ml protein to block anti-E. coli antibodies, and incubated at 37°C for 30 min with constant mixing. Meanwhile the arrays were incubated in Protein Array Blocking Buffer for 30 min and probed with the pretreated sera overnight at 4°C with gentle rocking. The slides were then washed 6× in T-TBS and incubated in biotinylated anti-human IgG H+L (Jackson ImmunoResearch) diluted 1/400 in Protein Array Blocking Buffer. After washing the slides 3× each in T-TBS and TBS, bound antibodies were visualized as described above.
Data Analysis and Statistical Treatment-Raw data were collected as the mean pixel signal intensity data for each spot. To stabilize variance of the raw data, a variant of the log-transformation (asinh) was used (54,55), and negative and positive control spots (the "no DNA" and IgG spots, respectively) were used to normalize the data using the "VSN" package in R from the Bioconductor suite (http://Bioconductor.org/). We then calculated p values on the normalized data by comparing signals between groups of donors using a Bayes-regularized t test adapted from Cyber-T for use with protein arrays (54-59). To account for multiple test conditions, we calculated p value adjustments by the Benjamini-Hochberg method (60). Reactive antigens were defined as positive when the normalized signal intensity was above the mean + 4 SD of the average "no DNA" control spots. Discriminatory antigens were those having a Benjamini-Hochberg adjusted Cyber-T p value <0.05. Multiple antigen classifiers were built using Support Vector Machines (SVMs). The "e1071" and "ROCR" packages in R were used to train the SVMs and to produce receiver operating characteristic curves, respectively. For heat maps and histograms, raw data were used; the average signal of the corresponding control spots (IVTT lacking DNA template) was first subtracted from the signals for each donor before the plots were generated. To assess functional enrichment significance, computational predictions of signal peptides and transmembrane domains were obtained from the ToxoDB database. Predictions of subcellular localizations were made using WoLF PSORT (61). p values for enrichment statistical analysis were calculated using Fisher's exact test in the R environment.
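Three of the statistical steps described above (the mean + 4 SD reactivity cutoff, the Benjamini-Hochberg adjustment, and Fisher's exact test for enrichment) can be sketched in a few lines. This is a pure-Python illustration under stated assumptions, not the R pipeline used in the study: the function names are hypothetical, the sample standard deviation is assumed, the Fisher test shown is the standard two-sided form, and the VSN normalization and Bayes-regularized t test are not reproduced.

```python
import math
from statistics import mean, stdev


def reactive_cutoff(no_dna_signals, k=4.0):
    """Mean + k * SD of the 'no DNA' control spots (k = 4 in the text)."""
    return mean(no_dna_signals) + k * stdev(no_dna_signals)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more likely than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical inputs for illustration only.
cutoff = reactive_cutoff([100, 110, 90, 105, 95])
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
discriminatory = [q < 0.05 for q in adjusted]
```

In an enrichment analysis of the kind described, the 2x2 table would count antigens with and without a feature (for example a predicted signal peptide) among the reactive set and among the remaining proteins.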

RESULTS
Chip Fabrication-The Toxoplasma Genomics Resource (62) or "ToxoDB" (http://toxodb.org/toxo/) lists 8,155 genes in the T. gondii genome, comprising a total of 43,010 exons. The genes have varying numbers of exons, ranging from 1 (n = 2135 genes) to 63 (n = 3 genes) (Fig. 1A). Gene lengths vary from 71 to 35,589 bp. As expected, there is a general trend for the longer genes to have more exons (Fig. 1B). We aimed to down-select the number of exons to ~2000 using a bioinformatic filtering process based on antigenic features seen in other proteome-wide serological screens in bacteria (63,64). First, genes lacking a mass spectrometry profile were excluded to enrich for functional genes. Then we used GO annotation and "product description" from ToxoDB to identify proteins belonging to the categories of "outer membrane," "heat shock protein," "chaperone," "transport protein," "integral membrane protein," "transmembrane protein," "lipoprotein," or "virulence associated protein." This gave a list of 1059 genes (= 6829 exons). Exons below 200 bp were also excluded, based on observations from our work with Plasmodium falciparum arrays and many bacterial arrays which revealed that the majority of reactive antigens tended to be > 8 kDa. PCR primers for the remaining 2705 exons (from 952 genes) were designed for high throughput cloning and expression for microarray printing.
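The sequential filter described above can be sketched as a small function. This is an illustration only: the `Gene` record, its field names, the substring matching of category keywords, and the `gene_exonNumber` identifier format are assumptions made for the example and are not taken from ToxoDB or from the authors' scripts.

```python
from dataclasses import dataclass

# Category keywords quoted in the text.
KEYWORDS = (
    "outer membrane", "heat shock protein", "chaperone", "transport protein",
    "integral membrane protein", "transmembrane protein", "lipoprotein",
    "virulence associated protein",
)


@dataclass
class Gene:
    gene_id: str            # hypothetical identifier
    has_mass_spec: bool     # True if a mass spectrometry profile exists
    annotation: str         # GO annotation plus product description text
    exon_lengths_bp: list   # exon lengths in base pairs, in genomic order


def select_exons(genes, min_exon_bp=200):
    """Apply the three filters in order and return IDs of the retained exons."""
    selected = []
    for gene in genes:
        if not gene.has_mass_spec:
            continue  # step 1: drop genes without mass spectrometry evidence
        text = gene.annotation.lower()
        if not any(keyword in text for keyword in KEYWORDS):
            continue  # step 2: keep only the annotation categories of interest
        for number, length in enumerate(gene.exon_lengths_bp, start=1):
            if length >= min_exon_bp:  # step 3: exclude exons below 200 bp
                selected.append(f"{gene.gene_id}_{number}")
    return selected


# Hypothetical genes for illustration.
example_genes = [
    Gene("g1", True, "putative chaperone", [150, 250, 300]),
    Gene("g2", False, "transmembrane protein", [400]),
    Gene("g3", True, "ribosomal protein", [500]),
]
kept = select_exons(example_genes)
```

Only the second and third exons of the first hypothetical gene survive all three filters, which mirrors how the published filter reduced 6829 exons to 2705.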
Cloning commenced with the smallest exons first. A first iteration of the chip ("TG1") comprised the first 1357 exon products (from 615 genes) amplified from T. gondii Prugniaud strain, which ranged from 67 to 158 amino acids in length. This represents 50% of the target number of 2705 exons. An additional 10 gene products from cloned cDNAs, listed in the Materials and Methods, were also expressed and printed. Protein expression for each spot on the array was verified using antibodies to the N- and C-terminal polyhistidine and hemagglutinin epitope tags. This confirmed that 93% of the expression products were detected by at least one of the epitope tag antibodies (supplemental Table S1).
IgG Profiles Defined by Microarray Correlate with Conventional IgG Assays-Sera used in this study were classified into four groups according to the panel of four conventional antibody assays (see Materials and Methods). These data are summarized in Table I. Group 1 was composed of individuals from Turkey with no known history of T. gondii infection and who were seronegative by all conventional assays. Group 2 comprised recently acute cases from the 2002 Turkish outbreak (35). T. gondii infection in immunocompetent people is symptomatic in about 10% of cases, where the most common symptoms are cervical or occipital lymphadenopathy (1). Thus, it is not easy to find sporadic recently acute toxoplasmosis cases because of these subtle clinical symptoms. Antibody levels in sporadic cases also differ widely owing to the unknown onset of infection. In contrast, outbreak cases have roughly similar times of infection and antibody levels. All sera were collected within ~2-3 weeks of the outbreak occurring, and each donor had clinical symptoms consistent with a recent infection. These individuals were IgG positive/IgM positive with low IgG avidity in conventional tests. Group 3 comprised chronic/IgM persisting infections, which were characterized as IgG positive/IgM positive with high IgG avidity in conventional tests. Although these cases have high avidity IgG, a hallmark of chronic infection, the persisting IgM response normally precludes them from being classified as regular chronic cases. Similarly, the high avidity excludes them from the acute group. Group 4 comprised true chronic infections characterized by being IgG positive, with high avidity, but IgM negative.
An overview of the IgG antibody profile is shown by the heat map in Fig. 2. The data obtained from the conventional IgG and IgM assays are also represented above the heat map for comparison. The reactive antigens (listed on the vertical axis) are separated into discriminatory (BH-corrected p value <0.05; n = 38) and nondiscriminatory (BH-corrected p value ≥0.05; n = 16) when the negative control population was compared with each of the three infected populations by t test. Overall the array data agreed well with the conventional assays. Reactivity by the seronegative group was also minimal on the array, whereas the acute population showed strong IgG reactivity on the arrays. Interestingly, three antigens were recognized by the majority of the donors in the acute stage (027620_2, dense granule protein GRA2; 001390_1, small secreted protein; and the putative rhoptry metalloproteinase TLN-1), consistent with the conventional IgG ELISA and IFA. All three of these are secreted proteins. The remaining reactive antigens (n > 30) were recognized by fewer than half of the acute donors. These donors may represent the earliest acute infections where the maximal profile is yet to be attained, although there appears to be no increase in IgG titer or avidity associated with the expanded IgG profile as might be expected. The chronic/IgM persisting group was characterized by the broader antibody profile, as was seen in around half of the acute infections. Thus, a second characteristic feature of the chronic/IgM persisting phenotype, in addition to the persistence of IgM itself, is a broad IgG antibody specificity profile. Unexpectedly, the true chronic cases, which were diagnosed on the basis of high IgG/high avidity but no IgM, were seen to have a narrower IgG specificity profile on arrays. Overall, the breadth of the IgG profile appears to increase to a peak in the chronic/IgM persisting stage, and decrease again in the true chronic population.
(Table I) defined by a panel of conventional assays as the "gold standard." The seropositive sera were subclassified into three groups: acute, chronic/IgM persisting, and chronic infections. Array data were log transformed and normalized for statistical analysis, and raw data were used to generate the heat map. Antigens are listed on the vertical axis, and the individual donors along the top. Corresponding IgG and IgM titers by conventional assays for each donor are shown immediately above the heat map. Only the positive antigens ("hits") are listed (defined by an average signal of an infected group ≥ mean + 4SD of "no-DNA" control spots; n = 55 out of 1357). These antigen "hits" were subclassified by t tests into discriminatory (p < 0.05; n = 38) and nondiscriminatory (p > 0.05; n = 16) by comparing the negative group with each of the three groups of infected individuals separately. The three sets of antigens were combined and duplicates that were significant in more than one comparison removed. The hits are ranked by the average response of the chronic group, and donors sorted from left to right within each of the four patient populations by increasing average signal. Antigen 070220_3 was significant when comparing negative and acute populations, but lost significance when the naives were compared with groups 2-4 as a whole.
Identification of IgG Responses with Potential Diagnostic Use-To begin to identify serodiagnostic antigens, we focused on the IgG target antigens that best discriminated between naives and each of the three infected states (Fig. 3). Eight antigens were identified that discriminate negative versus acute cases (Fig. 3A). Seven showed minimal reactivity in negative controls, stronger reactivity in the acute and chronic/IgM persisting groups, but lower reactivity in the true chronic stage. An exception was antigen 086450_1/dense granule protein 5, whose average signal continued to rise throughout the time course. Reactivity to antigen 070220_3/hypothetical protein was seen in all four stages, including the negative control population, and may represent nonspecific cross-reactivity that is unrelated to Toxoplasma infection. The largest number of discriminatory antigens (n = 37) was found when comparing negative versus chronic/IgM persisting, i.e. when the response peaks, although the majority of these antigens had weak signals (Fig. 3B). Finally, four antigens discriminated chronic from negative cases (Fig. 3C). Antigens 086450_1/GRA5 and 001390_1/hypothetical protein were discriminatory in all three stages of infection, and antigen 070250_2/GRA1 appeared to react more strongly in chronic populations than in acute populations. These antigens of interest are detailed in supplemental Table S2, where the properties of each protein are described. Antigens characterized by "ascending" IgG responses (i.e. low signal in acute but high in chronic/IgM persisting and chronic) may be of particular use in excluding a diagnosis of acute infection. To better identify such antigens, we compared acute stage infections with chronic/IgM persisting and chronic infections (Figs. 4A and 4B, respectively) using t tests. Responses to only two of the 18 discriminatory antigens in Fig. 4A were of the ascending type (exons 086450_1/GRA5 and 070250_2/GRA1). The response to antigen PP2C-hn (phosphatase 2C) also remained relatively high in chronic infection. The remaining 15 responses peaked in the chronic/IgM persisting stage and fell to almost background levels by the chronic stage. In isolation, these antigens would provide no discrimination between acute and chronic infection. The IgG response to exon 027620_2/GRA2 was too high in both the acute and the chronic groups to be of diagnostic use. Fig. 4B shows a similar analysis comparing acute infections with true chronic infections.
Responses to five antigens were of the ascending type, whereas the remaining four peaked in chronic/IgM persisting and were lower in true chronic individuals. As noted earlier, one antigen (exon 070220_3/hypothetical protein) was recognized by control individuals and was therefore excluded. Of those remaining, average responses to two ascending antigens (exons 086450_1/GRA5 and 070250_2/GRA1) were low in acute infection but high in both chronic/IgM persisting and true chronic infection. The three best discriminatory antigens (exons 086450_1/GRA5, 070250_2/GRA1, and PP2C-hn/phosphatase 2C) are shown in dot plots (Fig. 4C) to display the response at the level of the individual donor. Of the three antigens, 086450_1/GRA5 appears to have the best discriminating capacity.
Human IgM Profiles-An overview of the IgM response is shown in the heat map in Fig. 5. There were 108 antigens that discriminated between the negative population and the three infected populations (BH-corrected p value <0.05; Fig. 5). This number is considerably greater than the number of discriminatory antigens recognized by IgG (n = 38). Reactivity was minimal in negative donors, highest in the acute and chronic/IgM persisting stages, and returned to background levels in chronic donors. IgM target antigens that discriminate between naïve controls and each of the three infected states are shown in Fig. 6. There were 93 antigens that could discriminate acute cases from negative controls (the top 40 of which are shown in Fig. 6A), 60 antigens that could discriminate chronic/IgM persisting cases from negative controls (top 30 in Fig. 6B), 48 of which discriminated both acute and chronic/IgM persisting cases from negative controls, and 3 that could discriminate chronic cases from negative controls (Fig. 6C). These antigens are listed in supplemental Table S3 with the properties of each.
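The Benjamini-Hochberg (BH) adjustment applied to the t test p values above can be illustrated with a minimal sketch. The function name `benjamini_hochberg` and the example p values are hypothetical and are not taken from the study; the code only shows the standard step-up adjustment, after which an antigen would be called discriminatory if its adjusted value is below 0.05.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p values in the same order as the input.

    Illustrative helper (hypothetical name), not the authors' code.
    """
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p values for four antigens.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
discriminatory = [p < 0.05 for p in adj]
print(adj, discriminatory)
```

With real array data the input would be one raw p value per reactive antigen for a given group comparison.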

Identification of IgM Responses with Diagnostic Use-The detection of IgM responses is useful for diagnosing acute infection by many pathogens. However, conventional IgM ELISAs are of limited use in toxoplasmosis owing to IgM that persists in many individuals. It was of considerable interest, therefore, to use the arrays to identify specific antigens recognized by IgM in the acute stage of infection whose titers consistently fell in the chronic/IgM persisting and chronic stages ("descending" antigens). We first combined the chronic/IgM persisting and chronic data sets and compared the pool with acute infections using BH-corrected t tests. A total of 91 discriminatory antigens (p < 0.05) were found, of which 20 peaked in the acute stage (Fig. 7A). However, IgM titers to all but three of these antigens remained elevated in both the chronic/IgM persisting and acute stages and were therefore of limited use for diagnosing acute infection. These three were 046330_5/CRAL/TRIO domain-containing protein, 044280_1, and 023540_10 (both hypothetical proteins). We repeated the analysis comparing IgM in acute versus chronic/IgM persisting only, to determine whether additional candidates could be discovered. Eight antigens were discriminatory (Fig. 7B), of which five were of the descending type and of interest. The top 3 (046330_5, 044280_1, and 023540_10) had already been identified in the first analysis. Responses to the remaining two antigens, 048670_5/H+ translocating inorganic pyrophosphatase/TVP and 088400_9 (hypothetical protein), were also potentially diagnostic, although the signals were low. These 5 IgM target antigens are different from the three IgG targets shown in Fig. 4C.
Although the averages presented here indicate an overall trend to peak in the acute stage and descend thereafter, it is important to know whether this also applies at the individual patient level. Thus, these five best descending-type antigens are shown in the dot plots in Fig. 7C. Although a significant number of the chronic/IgM persisting and true chronic cases retain relatively high IgM titers, further analysis of these candidate diagnostic antigens in more conventional ELISA and immunostrip formats is warranted. To validate the recognition by another method, 5 candidate serodiagnostic antigens were expressed in vitro and probed with immune sera in Western blots (supplemental Fig. S1). Sera from patients with chronic infection reacted against the antigens with variable signal intensities, whereas naive samples showed very little reactivity against these antigens.
Receiver Operating Characteristic (ROC) Analysis to Build a Serodiagnostic Classifier-To assess the accuracy of this collection of antigens in distinguishing acute infection from all chronic cases (chronic and chronic/IgM persisting), cross-validation ROC curves and area under the curve (AUC) box plots were generated (Fig. 8). The candidate serodiagnostic antigens were ranked by decreasing single antigen AUC. The three IgG target antigens (086450_1, 070250_2, and PP2C-hn) have AUC values of 0.83, 0.79, and 0.64, respectively, with 086450_1 giving the best single antigen discrimination, with sensitivity and specificity of 81% and 80%, respectively (Figs. 8A and 8B). The five IgM exons (046330_5, 044280_1, 023540_10, 088400_9, and 048670_5) have AUC values of 0.92, 0.82, 0.79, 0.73, and 0.65, respectively. Antigen 046330_5/CRAL/TRIO domain-containing protein gave the best single antigen discrimination, with sensitivity and specificity of 85% and 83%, respectively (Figs. 8C and 8D). We used kernel methods and support vector machines (65,66) to build linear and nonlinear classifiers. As input to the classifier, we used the highest-ranking 1, 2, 3, 4, and 5 antigens on the basis of single antigen AUC. The results were validated with 30 runs of 3-fold cross-validation, and the validation results were averaged over the runs. The variability and mean accuracy of this model are shown in the AUC box plot. For the IgG antigens, the classifier yielded its highest sensitivity and specificity, 81% and 85%, with the top 2 antigens, with a mean accuracy of 85%. Although combining the top three antigens increased the sensitivity to 85%, the specificity fell to 80%. For the IgM antigens, the combined top three produced sensitivity and specificity over 85%, with a mean AUC of 0.94.
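As an illustration of the single antigen ranking step only, the AUC can be computed as the probability that a randomly chosen acute serum gives a higher signal than a randomly chosen chronic serum (ties counted as one half). The sketch below is not the kernel/support vector machine classifier used in the study; the function names `auc` and `sensitivity_specificity`, the threshold, and the signal values are all hypothetical. It is written for a descending IgM antigen, where acute sera are expected to score higher; for an ascending IgG antigen the two groups would be swapped.

```python
def auc(pos_scores, neg_scores):
    """Rank-based AUC: P(pos > neg) + 0.5 * P(pos == neg)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def sensitivity_specificity(pos_scores, neg_scores, threshold):
    """Call a serum positive when its signal is >= threshold."""
    true_pos = sum(1 for s in pos_scores if s >= threshold)
    true_neg = sum(1 for s in neg_scores if s < threshold)
    return true_pos / len(pos_scores), true_neg / len(neg_scores)


# Hypothetical array signals for one antigen (arbitrary units).
acute = [3.0, 4.0, 5.0]      # treated as the positive class
chronic = [1.0, 2.0, 4.0]    # treated as the negative class

single_antigen_auc = auc(acute, chronic)
sens, spec = sensitivity_specificity(acute, chronic, threshold=3.0)
print(single_antigen_auc, sens, spec)
```

Ranking candidate antigens by this value, highest first, would give the order in which they are fed to a multi-antigen classifier.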
FIG. 5. Heat map overview of the IgM profiles. Arrays were probed with four groups of patient sera as described in Fig. 2 and IgM visualized as described in the Materials and Methods. Shown are only seropositive antigens that were classified by t tests as discriminatory (p < 0.05; n = 108) by comparing the negative group with each of the three groups of infected individuals separately. The three sets of antigens were combined and duplicates that were significant in more than one comparison removed. The antigen names, and all of the non-discriminatory reactive antigens (p > 0.05), have been omitted for clarity. Antigens are ranked by the average response of the acute group, and donors sorted from left to right within each of the four patient groups by increasing average signal to all 108 antigens.
Overlap of IgM and IgG Profiles-Class-switching from IgM to other immunoglobulin isotypes is an important component of the maturation of an immune response. We noted earlier that the number of IgM targets that discriminate between negative controls and all three stages of infection was substantially more than the discriminatory IgG targets (108 and 38, respectively). Scatter plots (Fig. 9) of the average IgG and IgM signals in each of three stages of infection illustrate the extent of the overlap; combined, there are a total of 126 different antigens in both IgM and IgG profiles, of which 20 were seen in both, consistent with class switching. In addition, 88 antigens were seen in the IgM profile but not the IgG profile, and a further 18 antigens were seen in the IgG profile but not the IgM profile.
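The counts in this paragraph are internally consistent, which can be checked with simple inclusion-exclusion arithmetic. The variable names below are illustrative; the numbers are those quoted in the text.

```python
# Counts quoted in the text.
igm_targets = 108   # discriminatory IgM antigens
igg_targets = 38    # discriminatory IgG antigens
shared = 20         # antigens seen in both profiles

igm_only = igm_targets - shared                       # IgM profile only
igg_only = igg_targets - shared                       # IgG profile only
total_distinct = igm_targets + igg_targets - shared   # union of both profiles

print(igm_only, igg_only, total_distinct)
```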
Enrichment Analysis-To better understand the properties of a protein that may determine whether or not it elicits an antibody response during infection, we performed an enrichment analysis of the antigens identified in this study that discriminated between naïve controls and Toxoplasma infection. For this, the reactive antigens were assigned to a GO classification (component, process, and function) as defined by ToxoDB. In addition, computational predictions were made for transmembrane domains, signal peptides, isoelectric point (pI), ortholog group information, and subcellular localization. The number of reactive antigens identified on the array in each classification was divided by the total number of genes in the T. gondii genome with this classification to give a figure for fold-enrichment. The significance of the enrichment values was also calculated using Fisher's exact test in the R environment. Classifications over-represented (enriched) among "hits" have values >1 and those under-represented have values <1. A p value of <0.05 indicated a significant fold-enrichment.
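The enrichment calculation can be sketched as follows. Two assumptions are made here that the text does not spell out: fold-enrichment is taken as the fraction of hits in a class divided by the fraction of all genome proteins in that class, and the 2x2 table for Fisher's exact test is laid out as hits versus non-hits against in-class versus not-in-class. The function names and the example counts are hypothetical, and the study itself used R rather than this pure-Python sketch.

```python
from math import comb


def fold_enrichment(hits_in_class, total_hits, genome_in_class, genome_total):
    """Assumed definition: (fraction of hits in class) / (fraction of genome in class)."""
    return (hits_in_class / total_hits) / (genome_in_class / genome_total)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: hits in class        b: hits not in class
    c: non-hits in class    d: non-hits not in class
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in-class proteins among the hits.
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - (n - row1)), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical counts: 2 of 100 hits fall in a class that holds 22 of 8000 proteins.
fold = fold_enrichment(2, 100, 22, 8000)
p_value = fisher_exact_two_sided(2, 98, 20, 7880)
print(round(fold, 2), p_value)
```

A class would be reported as significantly enriched when the fold value is above 1 and the p value is below 0.05.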
Computational analyses are shown in Table II. We found that proteins harboring transmembrane domains were significantly enriched among discriminatory antigens. Interestingly, as the number of predicted transmembrane domains increased from 1 to 10, fold-enrichment also increased from 2.2 to 9.6, with p values of 4.47E-04 and 1.327E-10, respectively. Conversely, proteins without transmembrane domains were significantly underrepresented (0.6-fold enrichment; p value 7.01E-18). Proteins with signal peptides were significantly enriched, as were outer membrane proteins (fold-enrichment of 2.9 in both cases; p values 1.366E-21 and 2.533E-13, respectively). Conversely, proteins that do not have signal peptides were significantly underrepresented (0.5-fold), as were proteins predicted by WoLF PSORT to localize in the cytosol and nucleus (0.6-fold and 0.4-fold, respectively).
Antigens classified according to GO components are shown in Table III. Membrane-associated proteins were enriched (fold-enrichment of 5.6; p value 4.079E-15), in agreement with the computational predictions. Interestingly, 2 antigens were classified as GO protease complexes, compared with a total of 22 GO protease complexes in the T. gondii genome (6.4-fold enrichment; p value 0.038). Proteins not assigned to GO component categories were underrepresented (0.7-fold enrichment; p value 3.876E-10).
We also analyzed enrichment of antigens assigned by GO process and GO function classifications (supplemental Tables S4 and S5, respectively). When assigned by GO process, proteins involved in ATP biosynthetic process were highly enriched (23.3-fold; p value 8.361E-09). Several proteins involved in transport were also significantly enriched: ion transport, protein transport, vesicle-mediated transport, and other transport functions (7.8-, 4.5-, 6.9-, and 7.0-fold, respectively). Proteins involved in metabolic process, proteolysis, and signal peptide processing were also enriched (3.4-, 4.1-, and 20.0-fold, respectively). Conversely, proteins not assigned to GO process categories were significantly underrepresented (0.5-fold; p value 3.301E-21). When assigned by GO function, proteins involved in protein binding, catalytic activity, transporter activity, and transferase activity were significantly enriched (2.0-, 4.0-, 5.3-, and 2.8-fold, respectively). Proteins with enzymatic activity other than kinase activity were enriched at 2.0-fold, and enzyme regulator activity, structural molecule activity, and ion channel activity were enriched at 21.5-, 9.7-, and 7.6-fold, respectively. Interestingly, we identified 2 antigens with GO solute:hydrogen antiporter activity, out of 4 in the genome, giving 32.2-fold enrichment. There were a total of 5491 proteins with GO null functions, which were underrepresented at 0.6-fold. Proteins involved in nucleotide and nucleic acid binding were also underrepresented at 0.4-fold.
DISCUSSION
A specific aim of the work here was to identify antigens that could be used to improve the serodiagnosis of toxoplasmosis. Many antigens were identified that were discriminatory between controls and all infected cases. One or more of these could form the basis of a first-level ('tier-1') test to simply establish evidence of a current or previous infection.
The persistence of IgM in many patients compromises current IgM-based tests for accurate diagnosis of acute infection. Therefore, it was of considerable interest to use the antibody profiling technology to identify particular IgM responses that peaked in the acute stage, but which rapidly declined thereafter. In addition, IgG responses that were low in the acute stage, but which were elevated in chronic/IgM persisting and chronic stages, may also be of diagnostic use. As a result, three IgG and five IgM candidate target antigens were identified (Figs. 4C and 7C). One or more of these could be used for a 'tier-2' test to help diagnose whether the exposure is current/acute. These eight antigens are currently undergoing testing in ELISA and immunostrip formats to evaluate their use in meeting these aims. These data may also assist in the discovery of subunit vaccine candidates, as currently there exists no vaccine against T. gondii.
Over the past couple of decades, around 20 different proteins from T. gondii have been evaluated as antigens for serodiagnosis of toxoplasmosis (reviewed recently in (16)). These antigens include surface antigens (SAG1 and 2), rhoptry proteins (ROP1 and 2), dense granule proteins (GRA1, 2, and 4-8), microneme proteins (MIC2-5), and a matrix protein (MAG1). Most have shown only limited use for distinguishing recent acute from chronic infections and have not found their way into routine diagnostic use (15-23). Meanwhile, the array employed here represents only a fraction of the nearly 8000 genes encoded in the T. gondii genome. Although the protein microarray strategy has been used extensively in bacteria (45,47,53,63,64,67,68), its application to antigen discovery in eukaryotic genomes is relatively novel, having been applied previously only to Plasmodium falciparum (46,69). Of the 15 known serodiagnostic antigens listed above, the smallest exons of four were represented on the current chip, namely GRA1 (070250_1 and _2), GRA2 (027620_2), GRA5 (086450_1), and MIC5 (077080_1). The two exons of GRA1 (070250_1 and 070250_2) were both reactive with IgG, indicating that relevant epitopes were presented by both fragments. GRA1, 2 and MIC5 were also reactive, with GRA1, 2 and 5 being discriminatory between cases and controls (see Fig. 2). (The larger exons from the 12 antigens absent from the array will be expressed on the next iteration of the chip, TG2, which will have 2705 exon products.) One of the strongest
Another aim of the study was to use the antibody profiling technology to provide insight into the humoral immune response to T. gondii infection in humans. In toxoplasmosis, we see a simultaneous appearance of IgG and IgM antibodies in the acute stage, consistent with other reports (72). However, the arrays revealed two sub-populations.
One is a group with a broad IgG profile (>30 antigens); the other has a more focused profile, with 3 antigens dominating the response in most individuals (027620_2, 001390_1, and TLN-1). The latter may represent an earlier stage of infection, although there was no correlation between the IgG profile on the array and IgG titer or avidity by conventional assay. Studies conducted with animals infected at the same time can be compared with human data to better understand the antibody response against these antigens. Interestingly, the TLN-1 protein contains 13 tandem repeats of 11 amino acids (YPDDLPTSSTP). Tandem repeats have been reported in several secreted proteins in T. gondii, and the repeats are thought to be involved in immune evasion or protein-protein interactions (51,73). The high prevalence of its detection here may indeed suggest that these repeats are highly immunogenic in human infections. The portion of the protein containing the repeats appears to be highly immunogenic when used to produce antibodies in mice (Hajagos and Bradley, unpublished).
An interesting aspect of T. gondii infection is that T. gondii-specific IgM may persist for up to 2 years after infection (the so-called chronic/IgM persisting group). This contrasts with the more usual pattern in which IgM titers arise transiently during acute infection, followed by a more gradual but sustained IgG response as the response matures. The array data revealed that the chronic/IgM persisting individuals, in addition to having a long-lived IgM response, are also characterized by having the broadest IgG profile of all three stages of infection. This group also has a slightly lower IgG avidity than the true chronics (average AI = 50 ± 13 and 62 ± 10, respectively), which is consistent with some of the chronic/IgM persisting group being in an intermediate stage between acute and chronic stages. Somewhat unexpectedly, the IgG profile is reduced in the true chronic stage, despite the avidity being highest overall. One possibility is that the IgG profile expands during acute and chronic/IgM persisting stages, but as the parasite enters latency in the chronic stage, fewer parasite genes are expressed, thereby reducing the antigenic dose presented to the immune system. Individuals in which IgM (and IgG) persists may have an increased parasite load, may not have entered latency, or may have recurrent infection.
To better understand the rules underlying antigenicity of microbial pathogens in general, we have analyzed the properties of the proteins recognized by the humoral response to T. gondii. It is not surprising that some of the enriching features in bacteria, such as transmembrane domains, signal peptides, and outer membrane localization, are also shared by parasites. We also found that proteins classified with certain GO categories, including GO component protease complex and proteins involved in ATP biosynthetic process, transport, metabolic process, proteolysis, and signal peptide processing, were significantly enriched, as were proteins predicted by GO function to have catalytic activity, transporter activity, transferase activity, some enzymatic activity excluding kinase activity, enzyme regulator activity, structural molecule activity, solute:hydrogen antiporter activity, and ion channel activity. We wish to emphasize that our analysis may overrepresent the enrichment of antigens in the full proteome because we studied only a subset of proteins on the array, selected on the basis of antigenic features seen in bacteria. We anticipate, however, essentially similar results from the full proteome chip. This is our first enrichment analysis of a eukaryotic organism; it should be especially helpful for prioritizing proteins from other eukaryotic genomes for quick and relatively small-scale protein array studies, and important for predicting immune responses to other eukaryotic organisms.