Autoimmune Regulator (AIRE) Is Expressed in Spermatogenic Cells, and It Altered the Expression of Several Nucleic-Acid-Binding and Cytoskeletal Proteins in Germ Cell 1 Spermatogonial (GC1-spg) Cells*

Autoimmune regulator (AIRE) is a gene associated with autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED). AIRE is expressed heavily in thymic epithelial cells and is involved in maintaining self-tolerance by regulating the expression of tissue-specific antigens. The testes are the predominant extrathymic site where heavy expression of AIRE has been reported. Homozygous Aire-deficient male mice were infertile, possibly due to impaired spermatogenesis, deregulated germ cell apoptosis, or autoimmunity. We report that AIRE is expressed in the testes of neonatal, adolescent, and adult mice. AIRE expression was detected in glial cell derived neurotrophic factor receptor alpha (GFRα)+ (spermatogonia), GFRα−/synaptonemal complex protein (SCP3)+ (meiotic), and GFRα−/phosphoglycerate kinase 2 (PGK2)+ (postmeiotic) germ cells in mouse testes. GC1-spg, a germ-cell-derived cell line, did not express AIRE. Retinoic acid induced AIRE expression in GC1-spg cells. Label-free LC-MS/MS analysis of GC1-spg cells ectopically expressing AIRE identified a total of 371 differentially expressed proteins, of which 100 were up-regulated and 271 were down-regulated. Data are available via ProteomeXchange with identifier PXD002511. Functional analysis of the differentially expressed proteins showed increased levels of various nucleic-acid-binding proteins and transcription factors and decreased levels of various cytoskeletal and structural proteins in the AIRE-overexpressing cells as compared with the empty-vector-transfected controls. The transcripts of a select set of the up-regulated proteins were also elevated. However, there was no corresponding decrease in the mRNA levels of the down-regulated set of proteins.
Molecular function network analysis indicated that AIRE influenced gene expression in GC1-spg cells by acting at multiple levels, including transcription, translation, RNA processing, protein transport, protein localization, and protein degradation, thus laying the foundation for understanding the functional role of AIRE in germ cell biology.

diasis-ectodermal dystrophy (APECED), a rare autoimmune disorder in humans (1). Since then, it has become increasingly evident that AIRE is a protein with multiple talents. While some studies have found that AIRE is expressed exclusively in the thymic epithelial cells (2), several groups have reported its expression in multiple peripheral tissues (3-5). The expression of AIRE has been demonstrated in the testes (6) and more recently in stem cells (7). In the thymus, AIRE has been shown to mediate the expression of a number of peripheral tissue restricted proteins (promiscuous gene expression) and is hence thought to play a pivotal role in regulating the negative selection of self-reactive thymocytes (8-10). In testes, it has been shown to regulate the scheduled apoptosis event necessary for normal spermatogenesis (6). Moreover, Aire-knockout mice have been shown to be infertile (2,11).
AIRE has several functional domains that point toward its possible role as a transcriptional regulator (12). It has a nuclear localization signal, which explains its predominantly nuclear localization in tissues and cultured cells (10,13). However, in transfected cell lines, AIRE was localized in nuclear dots and cytoskeletal filaments (11,12). AIRE has a Speckled nuclear protein 100, AIRE-1, Nucleoside phosphorylase, Deaf1 transcription factor (SAND) domain, a DNA-binding domain found in many nuclear proteins (13,14). It has been demonstrated that AIRE can bind to DNA through the SAND domain (12,15) and, upon binding, activate transcription (16) of the downstream genes. AIRE also has two plant homeodomains (PHD1 and PHD2) (14,15). PHD domains are found in many DNA-binding proteins and are known to mediate protein-protein interaction (17,18). Recent studies have shown that AIRE binds to unmethylated H3K4 residues using its PHD domain (19,20). The PHD1 domain of AIRE has been shown to have E3 ubiquitin ligase activity (21), and it has been suggested that the ubiquitin-proteasome pathway is important in AIRE-dependent gene regulation (22). AIRE also has a caspase recruitment domain (23), also referred to as the homogeneously staining region, a region usually involved in the oligomerization of proteins that mediate apoptosis (24,25). The extended caspase recruitment domain has also been shown to bind DNA (26). Studies have pointed toward a possible role of AIRE in regulating germ cell apoptosis (6). Recent studies suggested that AIRE-positive cells were likely to undergo spontaneous apoptosis and that they were less resistant to apoptosis inducers (27).
Although AIRE is considered to be a transcription factor, the exact mechanisms through which AIRE functions remain elusive. One difficulty in studying the mechanism of AIRE-dependent gene expression is the number of target genes, which is on the order of several hundred in thymic epithelial cells, peripheral lymph nodes, stromal cells (28,29), and modified monocytes (30). Also contributing to the problem is the diversity of the genes under the regulation of AIRE, as it seemingly targets different sets of genes in different cell types (31,32).
An analysis of the mutations in AIRE causing APECED (33) indicated that the loss of function of AIRE due to the loss of SAND and/or PHD domains might be behind the disease phenotypes in APECED patients. Arg15fsSTOP19, Asp70fsSTOP216, Arg139STOP, Ala170fsSTOP219, Gln173STOP, Arg203STOP, and Arg257STOP are some of the reported disease-causing truncations of AIRE resulting in the loss of SAND and PHD domains (33). Knock-out studies have revealed that AIRE positively and negatively regulates the expression of a plethora of genes, encoding both tissue-restricted antigens and otherwise (32,34). Thus, most of the available data suggestive of the functional role of AIRE were deduced from AIRE-deficient models. However, complementary gain-of-function studies have been limited. Moreover, the potential targets of AIRE in testes, which is also a site for promiscuous gene expression, are not known. In this context, we evaluated the expression of Aire in mouse testes during the initiation of the first wave of spermatogenesis and in testicular germ cells at various stages of development. Though the spermatogonia expressed AIRE (6), we report that spermatogonia-derived GC1-spg cells did not express AIRE endogenously. Using GC1-spg cells as an efficient AIRE-deficient model, we evaluated the impact of transient AIRE expression on the cellular proteome of these cells and the possible gain of function that could be attributed to AIRE. Functional analysis of the proteome revealed that the major classes of proteins differentially displayed as a result of overexpression of AIRE in GC1-spg cells are the (i) nucleic-acid-binding proteins and transcription factors and (ii) cytoskeletal elements and structural proteins. Network analysis revealed two highly interacting clusters: (i) proteins involved in transcription and translation and (ii) proteins involved in protein degradation.
Animals-Mus musculus, Swiss albino strain, housed and inbred at the Laboratory Animal Research Centre (LARC) of Rajiv Gandhi Centre for Biotechnology, Trivandrum, India, were used. All animal experiments were approved by the Institutional Animal Ethical Committee vide approval nos. IAEC/65/PRK/2008 and IAEC/200/PRK/2013.
RNA Isolation and cDNA Preparation-RNA isolation from mouse tissues and transfected cell lines was carried out using TRI reagent. From each age group, four animals were used for RNA extraction. In the case of transfected cells, two sets from two independent transfections were used. Total RNA (5 μg in 33 μl RNase-free water) quantitated with Nanodrop and having a 260/280 ratio of 2 or above was reverse transcribed using the Ready-To-Go™ T-primed first strand cDNA synthesis kit (Amersham Biosciences, NJ). RNA isolation from FACS-sorted cells and from primary cultures of testicular cells after puromycin selection was carried out using the RNeasy Mini Kit (Qiagen, Hilden, Germany), and 1 μg RNA was reverse transcribed using the SUPERSCRIPT® VILO™ cDNA synthesis kit (Invitrogen, Carlsbad, CA).
FACS-Based Separation of Mature Mouse Testicular Cells-Testes of two mature male mice were surgically removed and placed in PBS. The tunica albuginea and the blood vessels on the surface of seminiferous tubules were removed. The tubule masses were rinsed three times in PBS, minced into small pieces, and then placed in 5 ml DMEM supplemented with 10% FBS at room temperature. Fragments of seminiferous tubules were flushed in and out several times through a micropipette tip to ensure maximal dispersal of cells. This was followed by a 15 min incubation at room temperature to allow sedimentation of large fragments. The supernatant was centrifuged at 600 × g for 5 min, and the pellet obtained was resuspended in DMEM-FBS. Nuclear staining was performed by incubating the cells with 2 μg/ml Hoechst 33342 (Sigma Aldrich, MO) at 37°C for 1 h. The cell suspension was filtered using a 40 μm cell strainer (BD Falcon, Mexico) and was analyzed using a BD FACS Aria flow cytometer (Becton Dickinson, CA) with Diva software. Cells collected were directly used for protein and RNA extraction.
Isolation and Culture of Spermatogonia from Mouse Testis-Isolation and culture of spermatogonia from neonatal and adult mouse testes were done following published protocols (35-38). The single cell suspension of testicular cells prepared was resuspended in Shinohara medium with FBS (SF). The SF cell culture medium was composed of StemPro34 SFM base (Invitrogen) supplemented with StemPro34 nutrient supplement, nongrowth factor components [5 mg/ml bovine serum albumin (Calbiochem); 6 mg/ml d-(+) glucose, 10 μg/ml d-biotin, 25 μg/ml insulin, 30 μg/ml pyruvic acid sodium salt, 0.06% dl-lactic acid (60% solution), 100 μM ascorbic acid, 30 μM sodium selenite, 60 μM putrescine, 100 μg/ml bovine apo-transferrin, 60 ng/ml progesterone, 30 ng/ml β-estradiol 17-cypionate, 10 μM 2-mercaptoethanol (Sigma-Aldrich); 2 mM L-glutamine, 1X antibiotic-antimycotic, 1X MEM vitamins, 1X nonessential amino acids (Invitrogen); 1% fetal bovine serum (Hyclone, Logan, UT)] and growth factor components [10³ U/ml LIF, 10 ng/ml recombinant human basic FGF (Sigma Aldrich); 20 ng/ml recombinant mouse EGF, 15 ng/ml recombinant rat glial cell derived neurotrophic factor (GDNF) (R&D Systems, Minneapolis, MN)]. The cells were placed in a gelatin-coated 12-well plate at a cell density of 2 × 10⁵ cells/ml and incubated for 16-24 h at 37°C in a humidified incubator with 5% CO2. The somatic cells of the testis attached to the gelatin plate, and the germ cells did not. The floating germ cells were harvested using a P1000 pipette and then transferred onto a new 12-well plate after centrifugation at 270 × g for 5 min at room temperature. The germ cells were replenished with fresh SF medium every 3 days for 1 week. The disaggregated germ cells formed clusters from day 2, and they grew into small spermatogonial stem cell (SSC) colonies by day 7. Compact SSC colonies were observed in cultures derived from testicular cells of both immature and adult mice on day 9 after seeding.
The SSC colonies were harvested using a micropipette and were placed into fresh SF medium in new 12-well plates and were used for RNA and protein preparation as detailed elsewhere.
Quantitative Real-Time PCR-Quantitative real-time PCR was performed on the cDNA samples prepared from EGFPN1-AIRE-transfected GC1 cells (test) as well as empty-vector-transfected GC1 cells (control) using SYBR green master mix (Applied Biosystems) in ABI PRISM® 384-well optical reaction plates. All reactions were performed in triplicate. The following cycling parameters were used: 50°C for 10 min, 90°C for 10 min, and 95°C for 10 min. This was followed by 40 cycles at 95°C for 10 s and a combined annealing/extension temperature of 60°C for 2 min. Real-time primers were designed for a few selected genes in the up-regulated set (greater than 1.5-fold increase in the level of expression in the Aire-transfected samples as compared with the empty-vector-transfected control), the control set (no change in the level of expression), and the down-regulated set (expression ratio below 0.5 in Aire-transfected cells as compared with the control). The primers used for real-time PCR are listed in Table I. The expression level of 18S rRNA was used as an internal control. A minimum of two biological replicates with three technical replicates each was used for the experiment. The mean fold change and standard deviation for each of the target genes were computed, and a histogram was plotted using Microsoft Excel software (Microsoft Corporation, Redmond, WA).
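The mean fold change and standard deviation described above can be sketched as follows. This is a minimal illustration that assumes the widely used 2^-ΔΔCt calculation with 18S rRNA as the reference; the text does not name the quantification method, and all Ct values below are hypothetical placeholders, not measured data.

```python
from statistics import mean, stdev


def fold_change(ct_target_test, ct_ref_test, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression (test vs. control) by the 2^-ddCt method.

    Assumption: the 2^-ddCt method is used here for illustration only;
    the source text does not state which calculation was applied.
    """
    d_ct_test = ct_target_test - ct_ref_test   # normalize test to 18S rRNA
    d_ct_ctrl = ct_target_ctrl - ct_ref_ctrl   # normalize control to 18S rRNA
    dd_ct = d_ct_test - d_ct_ctrl
    return 2 ** (-dd_ct)


# Hypothetical Ct values for one target gene, one entry per biological replicate:
# ((target Ct, 18S Ct) in AIRE-transfected cells, (target Ct, 18S Ct) in control cells)
replicates = [
    ((24.0, 10.0), (25.0, 10.0)),
    ((24.5, 10.5), (25.5, 10.5)),
]

folds = [fold_change(t[0], t[1], c[0], c[1]) for t, c in replicates]
mean_fold = mean(folds)
sd_fold = stdev(folds)
print(f"mean fold change = {mean_fold:.2f}, SD = {sd_fold:.2f}")
```

In practice, each Ct entered here would itself be the average of the three technical replicates for that biological replicate.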
Retinoic Acid Treatment of GC1-spg Cells-The mouse-derived spermatogonial cell line GC1-spg, maintained in DMEM supplemented with 1% penicillin-streptomycin and 10% FBS at 37°C in a 5% CO2 in air atmosphere, was seeded into a fresh dish containing the same medium supplemented with 10⁻⁷ M trans-9 retinoic acid dissolved in 0.05% ethanol. Cells grown in medium supplemented with 0.05% ethanol served as control. We also maintained another control dish in which the cells received no treatment. The cells were maintained for 2 weeks, and total RNA was prepared from RA-treated, ethanol-treated, and untreated cell lines using TRI reagent, and cDNA was synthesized using a Superscript VILO cDNA synthesis kit (Invitrogen) following the protocols recommended by the manufacturers.
Plasmid Construction-Full-length AIRE was cloned into plasmid enhanced green fluorescent protein N-terminal 1 (pEGFPN1) between the EcoRI and SalI sites. The insert was amplified from a previously generated plasmid for expression by T7 RNA polymerase (pET)32a-AIRE construct available in our laboratory, with 5′-GAATTCATGGCGACGGACGCGGCGCTA as forward primer and 5′-GTCGACGAGGGGAAGGGGGCCGCCGG as reverse primer. The amplified insert was gel-purified. Both the insert and the plasmid were incubated separately with five units of EcoRI and 10 units of SalI (New England Biolabs) at 37°C for 4 h. Digested products were run on an agarose gel and were purified using the QIAquick Gel Extraction Kit (Qiagen). A 10 μl ligation reaction was set up with T4 DNA ligase and ligase buffer containing ATP (New England Biolabs, MA), along with the double-digested insert and vector. The reaction was incubated in a thermal cycler at 16°C for 12 h, followed by an inactivation step at 65°C for 20 min. The ligated product was used for transformation of Escherichia coli DH5α. The transformation reaction was plated onto selection plates, and the colonies obtained after 12 h of incubation were screened for positive clones. The constructs were authenticated by direct sequencing of the inserts and flanking vector regions.
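As a quick sanity check on the cloning design, the two primers can be inspected for the restriction sites used for directional cloning. The sketch below uses the standard recognition sequences of EcoRI (GAATTC) and SalI (GTCGAC) and the primer sequences given in the text; the helper name `leading_site` is hypothetical.

```python
# Standard recognition sequences of the two enzymes used for cloning.
SITES = {"EcoRI": "GAATTC", "SalI": "GTCGAC"}

# Primer sequences as given in the text (5' to 3').
FORWARD = "GAATTCATGGCGACGGACGCGGCGCTA"
REVERSE = "GTCGACGAGGGGAAGGGGGCCGCCGG"


def leading_site(primer):
    """Return the name of the restriction site at the 5' end of a primer, or None."""
    seq = primer.upper()
    for name, site in SITES.items():
        if seq.startswith(site):
            return name
    return None


fwd_site = leading_site(FORWARD)
rev_site = leading_site(REVERSE)
# The forward primer carries the ATG start codon immediately after its site.
start_codon = FORWARD[len(SITES["EcoRI"]):len(SITES["EcoRI"]) + 3]

print(f"forward primer: {fwd_site} site, then {start_codon}")
print(f"reverse primer: {rev_site} site")
```

Each primer carries a different site at its 5′ end, which is what allows the double-digested insert to ligate into the vector in a single orientation.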
RT-PCR-Reverse transcriptase PCR was performed on the cDNA samples prepared from EGFPN1-AIRE-transfected GC1 cells (test) and empty-vector-transfected GC1 cells (control) using full-length Aire-specific primers: 5′-GAATTCATGGCGACGGACGCGGCGCTA as forward primer and 5′-GTCGACGAGGGGAAGGGGGCCGCCGG as reverse primer. The PCR conditions were 94°C for 2 min, followed by 35 cycles of 94°C for 30 s, 58°C for 45 s, and 72°C for 1.5 min, followed by a final extension of 10 min at 72°C. EGFPN1-AIRE plasmid was used as a PCR positive control. The PCR products were separated on a 1% agarose Tris-borate-EDTA gel, and the image was captured using a VersaDoc Gel Imager (Bio-Rad, Hercules, CA).
Overexpression of AIRE in GC1-spg Cells and Confocal Microscopy-For overexpression in GC1-spg cells, full-length AIRE-pEGFP-N1 plasmid was isolated using Nucleobond Xtra Midi Plus EF kit (Macherey Nagel, Duren, Germany) as per the manufacturer's instructions, and plasmids were quantitated using Nanodrop (Nano Drop 1000, Thermo Scientific, Waltham, MA). GC1-spg (ATCC, Manassas, VA) cells were cultured in DMEM supplemented with 10% FBS and antibiotics. All transfections were carried out in Opti MEM, reduced serum medium, using Lipofectamine 2000 following manufacturer's instructions. Four hours after transfection, the culture medium was replaced with fresh culture medium. For imaging the transfected cells, GC1 cells were grown on coverslips and transfected as described above. 24 h after transfection, the cells were washed in PBS, fixed in 4% formaldehyde, stained with propidium iodide (Sigma Aldrich), and imaged using a Nikon A1R Confocal Microscope using NIS Elements AR 4.00.04 software (Nikon Instruments, Shizuoka-ken, Japan).
Western Blotting-Protein lysates from mouse testis, freshly isolated spermatogonia, and SSC cultures were prepared following previously published protocols (22). EGFPN1-AIRE-transfected GC1 cells (test) as well as empty-vector-transfected GC1 cells (control) were lysed in RIPA buffer (150 mM NaCl, 1.0% octyl phenoxy polyethoxy ethanol (IGEPAL), 0.5% sodium deoxycholate, 0.1% SDS, and 50 mM Tris, pH 8.0) containing protease inhibitor mixture (Sigma Aldrich). The lysate was clarified by centrifugation at 4°C at 12,000 × g, and the supernatant was collected as the protein sample. The protein samples were resolved on a 12% polyacrylamide gel at a constant voltage of 120 V using a PowerPac HC (Bio-Rad), followed by electrophoretic transfer for 2 h at a constant current of 175 mA in a Mini Trans-Blot cell (Bio-Rad) to a PVDF membrane in the presence of 20% methanol, 25 mM Tris, and 190 mM glycine (pH 8.3). The membrane was blocked using 5% milk in PBS containing 0.1% Tween 20 (PBST). After three washes in PBST, the membranes were incubated for 2 h in a 1:1000 dilution of anti-Aire antibody (sc-33188, Santa Cruz Biotechnology) at room temperature. The membrane was washed, incubated for 1 h in PBST containing a 1:2000 dilution of HRP-conjugated anti-rabbit secondary antibody (sc-2317, Santa Cruz Biotechnology), and developed using 0.1% NiCl2, 0.05% 3,3′-diaminobenzidine, and hydrogen peroxide (H2O2).
Quantitative Proteomics of GC1-spg Cells Overexpressing AIRE-AIRE Overexpression and Protein Extraction-The GC1-spg cells were transfected with the pEGFPN1-AIRE construct, as detailed earlier. Empty pEGFPN1-transfected GC1 cells were used as control for all our experiments. Cells were harvested 24 h posttransfection. 100 μl of packed cell volume was homogenized in 500 μl of detergent-free hypotonic lysis buffer (10 mM HEPES, pH 7.9, 1.5 mM MgCl2, 10 mM KCl) by passing it through a narrow gauge hypodermic syringe and centrifuged at 11,000 × g for 20 min. The supernatant was saved as the cytoplasmic fraction, and the pellet was further extracted using 100 ml of extraction buffer (20 mM HEPES, pH 7.9, 1.5 mM MgCl2, 0.42 M NaCl, 0.2 mM EDTA, 25% (v/v) glycerol). The crude cytoplasmic and the nuclear fractions were pooled together as the total protein.
Tryptic Digestion-The extracted protein samples were centrifuged at 18,000 × g for 15 min, and the protein concentration of the supernatant was measured using the Bradford assay. The concentration of the samples was normalized using 50 mM ammonium bicarbonate to yield a final concentration of 1 μg/μl. Approximately 100 μg of protein from each sample was subjected to in-solution trypsin digestion to generate peptides. Disulfide bonds were reduced by incubation of proteins with 100 mM dithiothreitol in ammonium bicarbonate for 30 min at 60°C. After cooling at room temperature for 5 min, the samples were given a short spin at 18,000 × g, and 200 mM iodoacetamide in ammonium bicarbonate was added to perform the alkylation step in the dark at room temperature for 30 min. Proteins were then digested using sequencing grade modified trypsin (Sigma Aldrich) at a trypsin:total protein ratio of ~1:25 and incubated for 17 h at 37°C. The enzymatic reaction was stopped by adding formic acid to each sample to a final concentration of 1.0% and incubating at 37°C for 20 min. The digested peptide solutions were centrifuged at 18,000 × g for 12 min, and the supernatant was collected. The supernatant was transferred to autosampler vials (Total Recovery Vial, Waters, Manchester, UK) for peptide analysis via LC-MS^E (MS at elevated energy) with ion mobility.
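The normalization and enzyme amounts in this step reduce to simple arithmetic, sketched below. The 1 μg/μl target, the 100 μg protein input, and the ~1:25 trypsin:protein ratio come from the text; the starting concentration of 2.5 μg/μl and the function names are hypothetical, chosen only to make the example concrete.

```python
def dilution_volumes(stock_ug_per_ul, target_ug_per_ul, protein_ug):
    """Return (sample volume, buffer volume) in microliters.

    Dilutes `protein_ug` of protein from a stock at `stock_ug_per_ul`
    down to `target_ug_per_ul` with ammonium bicarbonate buffer.
    """
    if stock_ug_per_ul < target_ug_per_ul:
        raise ValueError("stock is more dilute than the target concentration")
    sample_ul = protein_ug / stock_ug_per_ul
    final_ul = protein_ug / target_ug_per_ul
    return sample_ul, final_ul - sample_ul


def trypsin_amount_ug(protein_ug, protein_per_trypsin=25.0):
    """Micrograms of trypsin for a 1:`protein_per_trypsin` enzyme:protein ratio."""
    return protein_ug / protein_per_trypsin


# Hypothetical Bradford result of 2.5 ug/ul; 100 ug of protein at 1 ug/ul final.
sample_ul, buffer_ul = dilution_volumes(2.5, 1.0, 100.0)
enzyme_ug = trypsin_amount_ug(100.0)

print(f"sample: {sample_ul:.1f} ul, buffer: {buffer_ul:.1f} ul, trypsin: {enzyme_ug:.1f} ug")
```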
Liquid Chromatography-The tryptic peptides were separated by reversed-phase chromatography on a nanoACQUITY UPLC® chromatographic system (Waters). Instrument control and data processing were done with MassLynx 4.1 SCN781 software. The peptide sample was injected in partial loop mode in a 5 μl loop (injection volume 3.0 μl). Water was used as solvent A and acetonitrile as solvent B. All solvents for the UPLC system contained 0.1% formic acid. The tryptic peptides were trapped and desalted on a 180 μm × 20 mm C18 trap column (5 μm pore size) (Symmetry®, Waters) for 1 min at a flow rate of 15 μl/min. The trap column was placed in line with the reversed-phase analytical column, a 75 μm inner diameter × 200 mm bridged ethylene hybrid (BEH) C18 column (Waters) with a particle size of 1.7 μm. Peptides were eluted from the analytical column with a linear gradient of 1 to 40% solvent B over 55.5 min at a flow rate of 300 nl/min, followed by a 7.5 min rinse with 80% solvent B. The column was immediately re-equilibrated at initial conditions (1% solvent B) for 20 min. The column temperature was maintained at 40°C. The lock mass, [Glu1]-Fibrinopeptide B human (Sigma Aldrich) (positive ion mode [M+2H]²⁺ = 785.8426), for mass correction was delivered from the auxiliary pump of the UPLC system through the reference sprayer of the NanoLockSpray™ source at a flow rate of 500 nl/min. Each sample was injected in triplicate with blank injections between each sample.
MS Analysis-Mass spectral analysis of eluting peptides from the nanoACQUITY UPLC® was carried out on a SYNAPT® G2 High Definition MS™ System (HDMS^E System, Waters) controlled by MassLynx 4.1 SCN781 (Waters Corporation, Milford, MA) software. The instrument settings were: nano-ESI capillary voltage, 3.2 kV; sample cone, 35 V; extraction cone, 4 V; ion mobility spectrometry (IMS) gas (N2) flow, 90 ml/min. To perform the mobility separation, the IMS T-Wave™ pulse height was set to 40 V during transmission, and the IMS T-Wave™ velocity was set to 800 m/s. The traveling wave height was ramped over 100% of the IMS cycle between 8 V and 20 V.
All analyses were performed using positive mode ESI with a NanoLockSpray™ source. The lock mass channel was sampled every 45 s. The time of flight analyzer of the mass spectrometer was calibrated with a solution of 500 fmol/μl of [Glu1]-Fibrinopeptide B human (Sigma Aldrich). This calibration set the analyzer to detect ions in the range of 50-2000 m/z. The mass spectrometer was operated in resolution mode (V mode) with a resolving power of 18,000 full width at half maximum (FWHM), and the data acquisition was done in continuum format. The data were acquired by rapidly alternating between two functions: Function-1 for acquiring low energy mass spectra (MS) and Function-2 for acquiring mass spectra at elevated collision energy with ion mobility (HDMS^E). In Function-2, collision energy was set to 4 eV in the trap region of the mass spectrometer and was ramped from 20 eV to 45 eV in the transfer region of the mass spectrometer to attain fragmentation in the HDMS^E mode. The continuum spectral acquisition time in each function was 0.9 s with an interscan delay of 0.024 s.
Data Analysis-The acquired ion mobility enhanced MS^E spectra were analyzed using ProteinLynx Global SERVER™ v2.5.3 (Waters, Manchester, UK) for protein identification as well as for the label-free relative protein quantification. Data processing included lock mass correction postacquisition. Processing parameters for ProteinLynx Global SERVER™ v2.5.3 were set as follows: noise reduction thresholds for low energy scan ion, 150 counts; high energy scan ion, 50 counts; and peptide intensity, 500 counts (as suggested by the manufacturer). The protein identifications were obtained by searching against the mouse database downloaded from NCBI (ftp://ftp.ncbi.nih.gov/refseq/M_musculus/mRNA_Prot/, dated July 22, 2014) containing 77,623 entries. During the database search, the protein false positive rate was set to 4%.
The parameters for protein identification were set such that a peptide was required to have at least one fragment ion match, a protein was required to have at least three fragment ion matches, and a protein was required to have at least one peptide match for identification. Oxidation of methionine was selected as a variable modification, and cysteine carbamidomethylation was selected as a fixed modification. Trypsin was chosen as the enzyme, with a specificity of one missed cleavage. Data sets were normalized using the "auto-normalization" function of ProteinLynx Global SERVER™ v2.5.3, and label-free quantitative analysis was performed by comparing the normalized peak area/intensity of identified peptides between the samples. Thus, we obtained parameters such as score, sequence coverage, and number of peptides identified for each protein. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (39) via the Proteomics Identification (PRIDE) partner repository with the dataset identifier PXD002511 and 10.6019/PXD002511. The protein data set was filtered by considering only those identified proteins that had at least two peptides. Furthermore, only a fold change of more than 50% (ratio of either <0.50 or >1.5) was considered to be indicative of significantly altered levels of expression.
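The filtering rule stated above (at least two peptides, and a ratio of either <0.50 or >1.5) can be expressed as a small classifier. This is an illustrative sketch only: the `Protein` record, its field names, and the example entries are hypothetical and do not reflect the export format of ProteinLynx Global SERVER.

```python
from dataclasses import dataclass


@dataclass
class Protein:
    name: str        # hypothetical identifier
    peptides: int    # number of peptides identified for the protein
    ratio: float     # AIRE-transfected / control abundance ratio


def classify(protein, low=0.5, high=1.5, min_peptides=2):
    """Apply the peptide-count filter, then the fold-change thresholds."""
    if protein.peptides < min_peptides:
        return "excluded"
    if protein.ratio > high:
        return "up"
    if protein.ratio < low:
        return "down"
    return "unchanged"


# Hypothetical example entries, not taken from the data set.
examples = [
    Protein("ProtA", peptides=5, ratio=2.1),
    Protein("ProtB", peptides=3, ratio=0.4),
    Protein("ProtC", peptides=4, ratio=1.1),
    Protein("ProtD", peptides=1, ratio=3.0),
]

calls = {p.name: classify(p) for p in examples}
print(calls)
```

Note that the thresholds are strict inequalities, so a protein with a ratio of exactly 0.50 or 1.5 is treated as unchanged in this sketch.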
Bioinformatics-Functional analysis of the differentially expressed proteins was carried out using the Protein analysis through evolutionary relationships (PANTHER) gene analysis tool (http://pantherdb.org/), and functional network analysis was carried out using the Search tool for the retrieval of interacting genes/proteins (STRING) network analysis software (http://string-db.org/). The influence of the changes in the proteome on molecular networks in GC1-spg cells was analyzed using NetworkAnalyst (http://www.networkanalyst.ca/) (40, 41) with the up- and down-regulated proteins as input (Table S1). The data were trimmed to the minimum connected network, and Function explorer was run using all nodes in the network as input and the Gene Ontology (GO) term molecular function (MF).

Aire Transcripts and Protein Are Present in Mouse Testis of All Age Groups-Testes from mice aged 1 through 4 weeks were evaluated for the expression of Aire in the context of the initiation of the first wave of spermatogenesis. The Aire gene was amplified from cDNA using primers 1306F and 1456R, which yielded a single band of 150 bp length (Fig. 1A). The amplified fragment was sequenced, and the identity of the product was confirmed (Fig. S1). Real-time PCR data showed a gradual decline in Aire expression with sexual maturity. Fold differences in Aire expression in other age groups with respect to mature mice were calculated. The highest level of Aire expression was seen in 1-week-old testis, which was significantly higher (p ≤ 0.005) when compared with 4-week-old testis. The levels of Aire transcripts were four- and threefold higher (p ≤ 0.05) in the testes of 2-week-old and 3-week-old mice, respectively, when compared with that of 4-week-old mice (Fig. 1A). AIRE protein was detectable in mouse testicular lysates of all evaluated stages of germ cell development (Fig. 1B). Testicular germ cells were sorted based on Hoechst emission in the red and blue channels (Fig. 1C). Spermatogonia enriched from neonatal mice were positive for Thy1 and Aire. The spermatogonia enriched from adult mouse testis were also positive for Aire, though the signal for Thy1 was very weak (Fig. 1D). Further, the GFRα+ spermatogonial cells from adult mouse testis expressed AIRE (Fig. 1E). The cell population that was SCP3+/GFRα− (meiotic germ cells, enriched in primary spermatocytes) was also positive for AIRE. Similarly, the PGK2+/GFRα− population (postmeiotic cells, enriched in secondary spermatocytes and spermatids) was also positive for AIRE (Fig. 1E). Enriched populations of spermatogonia from neonatal mouse testis (Spg N), when maintained in culture for 12 days, formed colony-like structures, confirming their stemness (Fig. 1F). Thus, AIRE is expressed in spermatogonia, meiotic spermatocytes, and postmeiotic germ cells.
Mouse GC1-spg Cells Do Not Express AIRE-Protein lysates from adult mouse testis, freshly isolated spermatogonia, and SSC in culture expressed AIRE, which could be detected on Western blots. However, GC1-spg cells derived from mouse spermatogonia were negative for AIRE (Fig. 2A). Stimulation of GC1-spg cells with retinoic acid activated the transcription of Aire (Fig. 2B) and its translation (Fig. 2E), as detected by RT-PCR and Western blotting, respectively, while the vehicle control had no effect.
Transient Overexpression of Aire EGFPN1 in GC1 Cell Line-The Aire EGFPN1 construct and the empty EGFPN1 vector construct were transfected into the mouse spermatogonial cell line GC-1 (70-80% confluent) using the lipofectamine-mediated transfection method. GC1 cells transfected with empty vector were negative for AIRE, and pEGFPN1-AIRE transfection brought about overexpression of AIRE in GC1 cells, as confirmed by RT-PCR (Fig. 2C) and Western blotting analysis (Fig. 2D). In GC1 cells, AIRE protein was predominantly detected in the nucleus as dots and in the cytoplasm as filamentous structures, very similar to the localization pattern that has been reported in other cell lines (Fig. 3).
Identification of Differentially Expressed Proteins in Aire Overexpressing GC1 Cells-At 24 h, a transfection efficiency of 60-80% was achieved, upon which the AIRE-transfected (test) and empty-vector-transfected (control) GC1 cells were harvested and lysed for protein preparation. Proteins were analyzed using liquid chromatography in tandem with mass spectrometry as described in the methods. Proteins uniquely present or with >1.5-fold higher levels of expression in the AIRE-transfected GC1 cells when compared with the controls were considered to be up-regulated. Proteins uniquely present or with >1.5-fold lower levels of expression in AIRE-transfected GC1 cells were considered to be down-regulated. The protein data set was filtered by considering only those identified proteins that had at least two matched peptides. Further, only those proteins that showed a differential display in at least two out of the three biological replicates were considered to have significantly altered levels of expression. Thus, we were able to shortlist a total of 371 proteins that showed a differential expression pattern as a result of overexpression of AIRE in GC1-spg cells. 100 proteins were up-regulated, whereas 271 proteins were down-regulated. The details of the shortlisted proteins are listed in Supplemental Table S1.
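The replicate rule described above (a differential call in at least two of the three biological replicates) can be sketched as a simple majority check. The function name and the example call lists are hypothetical; only the two-of-three rule and the final counts (100 up-regulated, 271 down-regulated, 371 in total) come from the text.

```python
from collections import Counter


def consensus(calls, min_agree=2):
    """Return 'up' or 'down' if at least `min_agree` replicates agree, else 'unchanged'.

    `calls` holds one label per biological replicate
    ('up', 'down', or 'unchanged').
    """
    counts = Counter(calls)
    for label in ("up", "down"):
        if counts[label] >= min_agree:
            return label
    return "unchanged"


# Hypothetical per-replicate calls for three proteins.
print(consensus(["up", "up", "unchanged"]))
print(consensus(["down", "down", "down"]))
print(consensus(["up", "down", "unchanged"]))

# Counts reported in the text.
UP, DOWN, TOTAL = 100, 271, 371
percent_up = round(100 * UP / TOTAL, 1)
percent_down = round(100 * DOWN / TOTAL, 1)
print(f"up-regulated: {percent_up}%, down-regulated: {percent_down}%")
```

With three replicates and a threshold of two, a protein cannot qualify as both up- and down-regulated, so the order in which the labels are checked does not affect the result.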
Classification of Differentially Expressed Proteins and Network Analysis-Functional analysis of the differentially expressed proteins was carried out using the PANTHER gene analysis tool. The results of the analysis are summarized in Figs. 4A and 4B. A detailed list of proteins is included in Supplemental Table S2. The major class of proteins that were up-regulated was the nucleic-acid-binding proteins and transcription factors (33% in the up-regulated set versus 17% in the down-regulated set). The majority of the proteins belonging to this category were RNA-binding proteins such as ribosomal proteins (Rps28, Rpl12, Isg15, Rpl29, Rplp1), ribonuclear proteins, translational factors (Eif1ax, Eif4h, Eef1a2, and Gsto1), and proteins involved in mRNA processing (Hnrnpa/b, LSm3, Naa38, Ewsr1). Strikingly, the down-regulated set of proteins also included many RNA-binding proteins, especially RNA helicases such as Ddx39a, Ddx39b, Ddx3y, Ddx3x, and Ddx21.
We built two molecular function (GO term = MF) networks using the up-regulated and the down-regulated proteins listed in Supplemental Table S1 as inputs. The list of 100 proteins in the up-regulated class yielded 45 seed proteins that generated 74 nodes and 84 edges (Supplemental Fig. S2A-S2C). DNA-binding (Supplemental Fig. S2A) and RNA-binding (Supplemental Fig. S2B) molecules were predominant in this network, though a small class of tubulin-binding branches (Supplemental Fig. S2C) was also detected. The list of 271 down-regulated proteins yielded 135 seed proteins that generated 229 nodes and 401 edges (Supplemental Fig. S3). Here again, DNA-binding (Supplemental Fig. S3A) and RNA-binding (Supplemental Fig. S3B) proteins were highly enriched in the network. Interestingly, stemness-determining genes including Nanog, Kit, and Pou5f1 were present in these networks, along with Rara, which is involved in spermatogenesis.
Transcript-Level Analysis of a Few Candidates from Our Up-Regulated and Down-Regulated Set of Proteins-Aire is a putative transcription factor. In order to determine whether the observed change in protein expression upon AIRE overexpression can be attributed to a change in gene expression, we carried out real-time PCR analysis for a few selected genes from the up-regulated set, the down-regulated set, and a control set (which did not show any change in protein expression). The list of the genes and the real-time PCR primers is included in Table I. The seven genes tested from the up-regulated set showed a greater than 1.5-fold increase in expression in the AIRE overexpressing cells as compared with the empty-vector-transfected control (Fig. 6A). However, no corresponding decrease in mRNA levels could be detected for the down-regulated set of genes (Fig. 6B). Further, the three control genes tested remained around a fold level of 1.0, showing no change in their expression (Fig. 6C).
DISCUSSION
Until the identification of Aire protein in mouse germ cells, testicular Aire expression was thought to be part of promiscuous gene expression in testes. We have amplified and cloned full-length Aire from mouse testis (22) and have established the presence of AIRE in testes and mouse spermatozoa. In this study, we extended Aire expression analysis to mice of different age groups and to germ cells from different stages of division. We could confirm the expression of Aire in spermatogonia, primary spermatocytes, and postmeiotic germ cells.
A large number of studies carried out on AIRE-deficient models revealed that AIRE regulated promiscuous expression of a large number of genes in immune-related tissues. Although AIRE is abundantly expressed in testes, another site known for its promiscuous gene expression, the potential targets of AIRE in testes are not known. Several of the dis- (32,34). Thus, most of the available data suggestive of the functional role of AIRE were deduced from AIRE-deficient models. There have been no gain-of-function studies on AIRE, let alone in cell models that express AIRE under normal conditions. Moreover, the potential targets of AIRE in testis, which is also a site for promiscuous gene expression, are not known. Although AIRE has several molecular signatures of a transcription factor, the only function assigned to AIRE in testes is based on the reduction in germ cell apoptosis in AIRE-deficient mice, which resulted in reduced fertility due to impaired spermatogenesis (6). It has been argued that AIRE-deficient mice are infertile for autoimmune reasons and that recombination activating gene (RAG)-1-KO mice on an AIRE-KO background were fertile because T/B lymphocytes were depleted in this model (42). However, these authors did not evaluate the effect of Aire deficiency on spermatogenesis per se. AIRE is abundantly expressed in the testes, spermatogonia, and SSCs in culture (Fig. 2A). However, GC1-spg cells, a mouse spermatogonia-derived cell line, were negative for AIRE expression (Fig. 2A). Though GC1-spg cells are AIRE-negative, their transplantation into testes results in their differentiation into spermatids (43), indicating their functional competence. Retinoic acid could induce the expression of Aire in GC1-spg cells (Fig. 2B), raising the possibility that GC1-spg cells could resume Aire expression in the testicular microenvironment. This provided us with an ideal cell model to evaluate the AIRE-mediated changes in the germ cell proteome.
This study reports the analysis of the impact of AIRE expression on the cellular proteome of a spermatogonia-derived germ cell line, by comparing AIRE-transfected and empty-vector-transfected GC1-spg cells. We preferred this model over the retinoic-acid-induced Aire expression model of GC1-spg cells since the latter would also capture RA-dependent, AIRE-independent changes in the cellular proteome. We identified a total of 371 proteins that are directly or indirectly influenced by AIRE in the GC1 cell line, of which 100 proteins showed an increased expression and 271 proteins showed a decreased expression in the AIRE overexpressing cells as compared with the control. A few of the putative targets of AIRE that were up-regulated at the protein level also showed an increase in expression at the transcript level in our real-time PCR experiments. However, the transcripts of the selected molecules from the down-regulated class did not show the expected down-regulation at the mRNA level. We suspect that this could be the result of posttranscriptional regulation of gene expression mediated directly or indirectly by AIRE, affecting the translation of these mRNAs. The functional analysis of the differentially expressed proteins and the comparison with previous studies carried out on AIRE yielded important insights. We observed increased expression of several nucleic-acid-binding proteins and transcription factors in the AIRE-transfected GC1 cells, a major portion of which included RNA-binding proteins, ribosomal proteins, and proteins involved in mRNA processing and translation. This result correlates very well with the observation that pre-mRNA processing and/or storage occurs in the nucleus in bodies (44) that resemble AIRE-containing bodies.
AIRE has also been shown to interact specifically with proteins involved in nuclear transport, chromatin binding, transcription elongation, and pre-mRNA processing (45). Five proteins from our proteomics data were previously reported to interact with AIRE in the mammary epithelial cell (MEC) line in a study carried out by Abramson et al. (45). Of these, one protein was up-regulated (nuclear autoantigenic sperm protein (Nasp)), and four proteins were down-regulated (small nuclear ribonucleoprotein (Snrpd3), importin 7 (Ipo7), Cullin-associated NEDD8 dissociated protein 1 (Cand1), and RuvB like 2 (Ruvbl2)) in our proteomics data. In addition to importin 7, many other nuclear transfer/carrier proteins (importin 5, importin subunit alpha 3, importin subunit alpha 2, and exportin 2) were also seen to be down-regulated in the Aire-transfected GC1 cells. A large number of structural proteins were down-regulated (7% in the down-regulated set as compared with 1% in the up-regulated set) in the AIRE-transfected sample, which included a number of keratin, actin, myosin, tubulin, and kinesin subunits as previously described. Our data also showed an up-regulation of a number of chaperones (Hspe1, Hspa1a, and Hspb1). In the study carried out by Colome et al. (27), 2D difference gel electrophoresis (DIGE) and isotope-coded protein labeling (ICPL) were used to analyze the effect of AIRE on the cellular proteome of AIRE-transfected and nontransfected epithelial cell lines. They showed a similar increase in the levels of several chaperones (HSC70, HSP27, and tubulin-specific chaperone A) in AIRE-expressing cells, while various cytoskeleton-interacting proteins (transgelin, caldesmon, tropomyosin alpha-1 chain, myosin regulatory light polypeptide 9, and myosin-9) were decreased.
We could identify at least five proteins in our proteomics data that were shown to be modulated in the study carried out by Colome et al. (27): heat shock protein beta 1 (Hspb1); PDZ and LIM domain protein (PDZ: postsynaptic density protein (PSD95), Drosophila disc large tumor suppressor (Dlg1), and zonula occludens-1 protein (ZO-1); LIM: Lin11, Isl-1, and Mec-3); trp/asp (WD) repeat-containing protein 1; tubulin-specific chaperone A; and keratin type I cytoskeletal 18. Their studies suggest that AIRE-positive cells were more likely to undergo spontaneous apoptosis and that they were less resistant to apoptosis inducers. In addition, recent studies have pointed toward a possible role of AIRE in regulating germ cell apoptosis (6) and toward its association with NO-induced cellular stress in AIRE-expressing cells (46). Caspase 3, an executioner caspase, was up-regulated in our proteomics data. Three antiapoptotic factors were down-regulated (mast/stem cell growth factor receptor kit, DnaJ homolog subfamily B member 4, and signal transducer and activator of transcription 3). Further, keratin 17 and AIRE together regulated gene expression in skin tumor keratinocytes (47). The presence of a large number of DNA/RNA-binding proteins in the molecular function network and the appearance of Nanog, Pou5f1, Kit, and Rara as some of the prominent nodes suggest a role for AIRE in functional network switching between stemness and differentiation in spermatogonial cells.
Members of the E3 ubiquitin pathway also showed decreased expression in the AIRE-transfected cells. These included a large number of proteasome regulatory subunits (Psmc5, Psmd3, Psmd14, Psmd13, Psmc3, and Psmd8) and ubiquitin-conjugating enzymes (Nedd4 and Ube2k). The ubiquitin-proteasome pathway plays an important role in regulating cell cycle progression, cell signaling, and apoptosis. The PHD 1 domain of AIRE has been shown to have E3 ubiquitin ligase activity (21), and it has been suggested that the ubiquitin-proteasome pathway is important in AIRE-dependent gene regulation (22). STRING analysis of the differentially expressed proteins showed two major clusters of interacting proteins: (i) proteins involved in pre-mRNA processing, translation initiation factors, ribosomal subunits, and ribonuclear proteins, which are up-regulated in AIRE overexpressing cells, and (ii) proteins involved in the ubiquitin-proteasome-dependent protein degradation pathway, which are down-regulated in Aire overexpressing cells. STRING network analysis also predicted, with very high confidence, significant interactions among proteins involved in protein transport and localization.
Thus, using a global high-throughput approach, we have been able to gain insight into the Aire-induced changes in the proteome of GC1-spg cells. Many of the changes in the proteome have been reflected in the transcriptome as well, which is in agreement with the role of AIRE as a master regulator of gene expression. Our molecular function network analysis suggests that AIRE regulates gene expression in GC1-spg cells by acting at multiple levels, including transcription elongation, translation initiation, RNA processing, protein transport, protein localization, and protein degradation.