Targetome Analysis of Malaria Sporozoite Transcription Factor AP2-Sp Reveals Its Role as a Master Regulator

ABSTRACT Malaria transmission to humans begins with sporozoite infection of the liver. The elucidation of gene regulation during the sporozoite stage will promote the investigation of mechanisms of liver infection by this parasite and contribute to the development of strategies for preventing malaria transmission. AP2-Sp is a transcription factor (TF) essential for the formation of sporozoites or sporogony, which takes place in oocysts in the midguts of infected mosquitoes. To understand the role of this TF in the transcriptional regulatory system of this stage, we performed chromatin immunoprecipitation sequencing (ChIP-seq) analyses using whole mosquito midguts containing late oocysts as starting material and explored its genome-wide target genes. We identified 697 target genes, comprising those involved in distinct processes parasites experience during this stage, from sporogony to development into the liver stage and representing the majority of genes highly expressed in the sporozoite stage. These results suggest that AP2-Sp determines basal patterns of gene expression by targeting a broad range of genes directly. The ChIP-seq analyses also showed that AP2-Sp maintains its own expression by a transcriptional autoactivation mechanism (positive-feedback loop) and induces all TFs reported to be transcribed at this stage, including AP2-Sp2, AP2-Sp3, and SLARP. The results showed that AP2-Sp exists at the top of the transcriptional cascade of this stage and triggers the formation of this stage as a master regulator.

TFs create gene expression patterns for each stage by targeting hundreds of genes directly (6-9). Therefore, identifying the master TF that governs specific stages and determining its target genes are important strategies for elucidating the mechanisms of infection in this parasite.
Sporozoites are motile forms that play a central role in parasite transmission to the vertebrate host (10). They are generated in the oocysts formed within the mosquito midgut and then migrate to the salivary glands. When a mosquito bites, the sporozoites are released into the skin and are transported to the liver via circulation. There, they enter the liver stage to transform into merozoites that are erythrocyte-invasive forms (11). Studies on gene regulation during this stage are important for understanding the molecular mechanisms underlying parasite infection of the liver. Sporozoites alter infection capacities by controlling gene expression at the transcriptional and translational levels (12,13). Thus far, four TFs have been reported to play important roles in gene regulation during this stage: AP2-Sp, AP2-Sp2, AP2-Sp3, and sporozoite and liver-stage asparagine-rich protein (SLARP) (3,5,14). However, studies of these TFs, including their target genes and their interrelationships, are lacking, and the roles of TFs in transcriptional regulation during the sporozoite stage remain to be elucidated. The likely reason for this paucity of research is the necessity to obtain sporozoites by dissecting infected mosquitoes (10). Genome-wide studies of these TFs, particularly chromatin immunoprecipitation sequencing (ChIP-seq) analyses to determine their binding sequences and target genes, have been hampered due to the small number of sporozoites obtained from infected mosquitoes. Thus, the question of how the gene expression repertoire unique to this stage occurs has been left unanswered.
AP2-Sp is a TF that is essential for formation of the sporozoite stage (5). In the rodent malarial parasite Plasmodium berghei, the expression of AP2-Sp is first observed in the nuclei of oocysts approximately 7 to 10 days after an infective blood meal of a mosquito, i.e., a few days earlier than the onset of sporozoite formation or sporogony in the oocyst. AP2-Sp expression then continues through the sporozoite stage. Depletion of this gene arrested oocyst development before sporogony, and no sporozoites were produced in the oocyst. The AP2-domain of this TF binds the six-base motif [C/T]GCATG. Many genes important for sporozoite functions harbor this binding motif in their upstream regulatory regions, where this motif functions as a cis-acting element of the gene promoters. These findings have suggested an important role of this TF during the sporozoite stage. However, it remains unclear to what extent this TF contributes to gene regulation during this stage and how it interconnects with other concomitantly expressed TFs.
In this study, we aimed to demonstrate the role of AP2-Sp in P. berghei as a master regulator by investigating the whole range of targets or the "targetome" of this TF by ChIP-seq. To circumvent the problems of the low recoveries of sporozoites obtained from mosquitoes, we used whole mosquito midguts containing oocysts under sporogony as the starting materials for ChIP-seq. The results comprehensively demonstrate that AP2-Sp regulates genes expressed during this stage; its targets broadly encompass genes for sporogony, salivary gland infection, and hepatocyte infection. We also show that AP2-Sp directly induces other TFs expressed during this stage, playing the role of a stage-specific master gene regulator.

RESULTS
ChIP-seq of AP2-Sp using mosquito midguts containing oocysts. In ChIP-seq, the number of cells utilized in the assay is an important determinant of data quality because the maximum read depth achieved by ChIP from a single haploid cell is only 1 on a given genome locus. In previous ChIP-seq studies, we used 5 to 10 hundred million haploid parasites to obtain nanogram quantities of DNA by ChIP. To attain this number in the sporozoite stage (in our laboratory, no more than 50,000 oocyst sporozoites and 10,000 to 20,000 salivary gland sporozoites are obtained from one mosquito), it would be necessary to dissect at least 10,000 infected mosquitoes. Thus, performing ChIP-seq analysis of AP2-Sp in this stage was unrealistic.
To solve this problem, we conceived using oocysts as the starting material. After an infective blood meal, parasites transform into round oocysts on the basal lamina of the midgut. The oocysts grow, replicating their genome, and after 10 to 14 days, divide into multinuclear cells, called sporoblasts, which act as centers of sporogony. Hundreds of sporozoites bud promptly from each sporoblast. AP2-Sp expression begins just before sporoblast formation and then is localized in the nuclei aligned beneath the plasma membrane of sporoblasts (5). If an average of 100 oocysts is formed in the mosquito midgut and each oocyst contains 10 thousand nuclei, one midgut is estimated to contain nuclei equivalent to 1 × 10^6 haploid parasites, and 100 mosquitoes would satisfy the required number of nuclei described above. However, one drawback is that the oocysts cannot be purified from the mosquito midgut and thus, to perform this experiment, whole midguts must be used as the starting material. This means that the samples will contain considerable amounts of contaminating proteins and genomic DNA derived from the mosquito host. We anticipated that the efficiency of ChIP would be diminished under such conditions, which would affect the data quality deleteriously.
Therefore, we performed the ChIP-seq experiments using infected midguts and assessed the data quality. We used a transgenic P. berghei parasite line expressing green fluorescent protein (GFP)-fused AP2-Sp developed in a previous study (5). In the first experiment, we harvested 450 Anopheles stephensi mosquito midguts for ChIP-seq at 14 days after an infective blood meal, when AP2-Sp expression was observed in the majority of oocysts. In the midguts, due to asynchronous oocyst development (15), there were oocysts at various developmental stages, including presporogony oocysts, oocysts undergoing sporogony, and oocysts containing sporozoites. Based on the estimation described above, these oocysts were estimated to contain 4.5 × 10^8 parasite nuclei.
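The cell-number estimates above can be reproduced with simple arithmetic. The following Python sketch uses only figures quoted in the text (a lower requirement of 5 × 10^8 haploid genomes, at most 50,000 oocyst sporozoites per mosquito, and an assumed average of 100 oocysts per midgut with 10,000 nuclei each); the constant and function names are illustrative and were not part of the original analysis.

```python
# Back-of-the-envelope estimate of haploid genome equivalents for ChIP-seq.
# All numbers are taken from the text; names are illustrative only.

REQUIRED_MIN = 500_000_000          # lower bound used in earlier ChIP-seq studies
OOCYST_SPZ_PER_MOSQUITO = 50_000    # upper yield of oocyst sporozoites per mosquito
OOCYSTS_PER_MIDGUT = 100            # assumed average number of oocysts per midgut
NUCLEI_PER_OOCYST = 10_000          # assumed average number of nuclei per oocyst


def mosquitoes_needed(required, per_mosquito):
    """Smallest whole number of mosquitoes giving at least `required` genomes."""
    return -(-required // per_mosquito)  # integer ceiling division


def nuclei_in_midguts(n_midguts):
    """Estimated haploid genome equivalents in `n_midguts` infected midguts."""
    return n_midguts * OOCYSTS_PER_MIDGUT * NUCLEI_PER_OOCYST


sporozoite_route = mosquitoes_needed(REQUIRED_MIN, OOCYST_SPZ_PER_MOSQUITO)
nuclei_per_midgut = nuclei_in_midguts(1)
nuclei_experiment_1 = nuclei_in_midguts(450)

print(sporozoite_route, nuclei_per_midgut, nuclei_experiment_1)
```

Under these assumptions, the sporozoite route requires on the order of 10,000 mosquitoes, whereas 450 midguts are estimated to supply 4.5 × 10^8 nuclei.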
The harvested midguts were immediately fixed with 1% paraformaldehyde and stored in the fixation solution until all mosquitoes were dissected. The fixed midguts were then subjected to ChIP with anti-GFP antibodies. Sequencing of the input DNA, which was extracted from the midgut lysate, showed that the ratio of the parasite genome to the Anopheles stephensi genome in the original sample was approximately 1:3.1, indicating that the starting sample contained a substantial amount of genomic DNA from the parasites. By peak-calling using ChIP-seq data, 1,270 peaks were called (see Table S1A in the supplemental material; false discovery rate [FDR] < 0.01, fold enrichment > 3).
To investigate the binding sequences in vivo, regions within 100 bp from the peak summits were extracted, and the sequences around the summits were analyzed. The results demonstrated that the motif  Table S1B). This motif is identical to the one that we reported in a previous study, [T/C]GCATG, except that the current motif contains TGCACA and TGCATA as minor variants that were not covered in that study using electrophoretic mobility shift assays (5). Of these peaks, 90.6% contained at least one motif sequence within their peak regions, and the average distance of the predicted summit to the nearest motif was 54.0 bp (Fig. 1B). The mapped view of reads showed that representative sporozoite-specific genes, including the AP2-Sp targets predicted in our previous paper (5), harbor clear AP2-Sp peaks in their upstream regions (examples are shown in Fig. 1C). These results (i.e., a high proportion of parasite DNA in the infected midgut and a low background in the obtained ChIP-seq data) suggest that the efficiencies of ChIP-seq from infected midguts were higher than we initially supposed. Therefore, we next performed ChIP-seq using a smaller number of mosquitoes, i.e., 190 mosquitoes. In this experiment, the ratio of parasite DNA to A. stephensi DNA in the input DNA was 1:0.96. By peak calling, 1,156 peaks were identified (see Table S2A), and the motif around the peak summits was identical to that obtained with the former experiment (see Table S2B). Of these peaks, 95.9% contained at least one motif sequence within the peak region, and the average distance of the predicted summit to  Fig. 1D. Between these experiments, there were 895 common peaks (peak summits were within 150 bp) (Fig. 1E to G), demonstrating that the results were highly reproducible.
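As an illustration of how motif occurrences near a peak summit might be located, the following Python sketch scans a sequence on both strands for [T/C]GCATG and the two minor variants named above (TGCACA and TGCATA) and reports the distance from a summit to the nearest motif. It is a minimal sketch under stated assumptions, not the motif-discovery procedure used in this study; the regular expression, the function names, and the use of 0-based coordinates are all illustrative choices.

```python
import re

# Core motif [T/C]GCATG plus the minor variants TGCACA and TGCATA.
# A lookahead is used so that overlapping occurrences are all reported.
MOTIF = re.compile(r"(?=([TC]GCATG|TGCACA|TGCATA))")
MOTIF_LEN = 6
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def motif_positions(seq):
    """Sorted 0-based forward-strand start positions of hits on either strand."""
    seq = seq.upper()
    n = len(seq)
    forward = {m.start() for m in MOTIF.finditer(seq)}
    # Convert reverse-strand hit positions back to forward-strand coordinates.
    reverse = {n - m.start() - MOTIF_LEN
               for m in MOTIF.finditer(reverse_complement(seq))}
    return sorted(forward | reverse)


def nearest_motif_distance(seq, summit):
    """Distance in bp from `summit` to the nearest motif base, or None if no hit."""
    best = None
    for start in motif_positions(seq):
        end = start + MOTIF_LEN - 1
        if start <= summit <= end:
            d = 0
        else:
            d = min(abs(summit - start), abs(summit - end))
        if best is None or d < best:
            best = d
    return best
```

For a window of 100 bp on each side of a summit, `seq` would be the 201-bp extracted sequence and `summit` would be index 100.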
Comprehensive activation of host cell invasion-related genes by AP2-Sp. To investigate the role of AP2-Sp in transcriptional regulation in the sporozoite stage, the genome-wide targets of AP2-Sp were predicted. The prediction assumed that the binding sites of a TF identified by ChIP-seq exist within 1,200 bp of the first methionine codon of the gene. This criterion produced useful results in our previous studies, and its validity is also supported by our previous observation that in many sporozoite-specific genes, the binding sequences of this TF were located within this region (5). Using this criterion, 784 and 850 genes were predicted as targets from the peaks obtained via experiments 1 and 2, respectively (see Tables S1C and S2C in the supplemental material), and 697 genes were common between them (Fig. 2A; see also Table S3A). These common genes were used for the following analyses.
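The target-prediction criterion can be sketched in Python as follows. The sketch assumes that "within 1,200 bp of the first methionine codon" refers to the region upstream of the start codon and that distances are measured in a strand-aware manner; both points, as well as the `Gene` record and function names, are assumptions made for illustration rather than details confirmed by the text.

```python
from dataclasses import dataclass

PROMOTER_WINDOW = 1200  # bp upstream of the first methionine codon


@dataclass(frozen=True)
class Gene:
    gene_id: str
    chrom: str
    atg: int      # coordinate of the first base of the start codon
    strand: str   # "+" or "-"


def is_target(gene, summits, window=PROMOTER_WINDOW):
    """True if any (chrom, position) summit lies within `window` bp upstream."""
    for chrom, pos in summits:
        if chrom != gene.chrom:
            continue
        # Positive values mean the summit is 5' of the start codon.
        upstream = gene.atg - pos if gene.strand == "+" else pos - gene.atg
        if 0 < upstream <= window:
            return True
    return False


def predict_targets(genes, summits, window=PROMOTER_WINDOW):
    """Set of gene IDs with at least one summit in the upstream window."""
    return {g.gene_id for g in genes if is_target(g, summits, window)}


def common_targets(genes, summits_exp1, summits_exp2):
    """Genes predicted as targets in both experiments."""
    return predict_targets(genes, summits_exp1) & predict_targets(genes, summits_exp2)
```

Taking the intersection of the two experiments corresponds to the set of common genes used for the downstream analyses.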
The classification of these targets according to functional categories is shown in Fig. 2B and Table S3B. Genes related to pellicle structure and host cell invasion were the most abundant in the predicted targets. Consistently, Gene Ontology (GO) term analysis revealed that terms such as "pellicle," "cell motility," and "apical complex" were significantly enriched (P < 0.01) in the targets (Fig. 2C). Figure 2D shows target genes related to these terms that are grouped based on the subcellular localization of the products: "pellicle cytoskeleton," "glideosome," "microneme," "rhoptry," and "rhoptry neck." We also classified the target genes according to their involvement in the steps that parasites undergo during this stage: "sporogony," "egress oocyst and salivary-gland invasion," "migration to the liver and hepatocyte invasion," and "development into the liver stage" (Fig. 2E) (10,11,16). Importantly, the predicted targets broadly contained genes expressed in the sporozoite, not only genes belonging to the groups "sporogony" and "egress oocyst and salivary-gland invasion," which may have expression peaks in the oocyst/oocyst sporozoite used for ChIP-seq, but also genes belonging to the groups "migration to the liver and hepatocyte invasion" and "development into the liver stage," which may have expression peaks in the salivary gland sporozoite. In addition, each group contained a comprehensive set of genes that could belong to the group. These results suggest that AP2-Sp determines the repertoire of gene expression during the sporozoite stage and already binds to the promoters of these genes in the oocyst/oocyst sporozoite regardless of when these genes display their highest expression during the sporozoite stage.
Predicted targets constitute a major proportion of the genes highly expressed in sporozoites. To determine the importance of AP2-Sp in transcriptional regulation during the sporozoite stage, the abundance of target gene transcripts in the sporozoite transcriptome was investigated. Transcriptome sequencing (RNA-seq) analysis was performed in the oocyst/oocyst sporozoite (infected midguts at 14 days postinfection) and salivary gland sporozoites (sporozoites at 24 days postinfection), and the P. berghei genes were ordered by expression levels based on RPKM (Reads Per Kilobase of exon per Million mapped reads) values (see Table S4A and B). In the oocyst/oocyst sporozoite stage, 67 genes were targets among the top 100 genes. We further examined the upstream regions of the 33 genes, which were not predicted as targets, to check for peaks. If peaks were present, we examined whether the transcription starts downstream of these peaks using RNA-seq data. We also verified whether these 33 genes were initially missed as targets because of incorrect annotations. Detailed manual examinations revealed that these genes included two putative targets: a gene that harbored peaks more than 1,200 bp upstream with accompanying transcripts downstream of the peak and a gene that was not predicted as a target due to incorrect annotation or alternative transcription initiation. These results suggest that targets constitute a major proportion of the highly expressed genes during this stage. Among 31 nontarget genes, 22 were genes related to protein synthesis, including 17 genes encoding ribosomal proteins (Fig. 3A, left). In the salivary gland sporozoite transcriptome, among the top 100 genes, 94 were targets (Fig. 3A, right), including 13 genes that were manually predicted as targets. Eleven genes and two genes were not predicted to be targets due to the 1,200-bp criterion and presumably incorrect annotation of the genes (or alternative transcription initiation), respectively.
Three of six nontarget genes were genes related to protein synthesis, including two ribosomal proteins. The high proportion of ribosomal proteins in the nontarget genes in the oocyst/oocyst sporozoite transcriptome suggests that transcripts derived from oocysts before sporogony, which may abundantly produce proteins, are included in the transcriptome. This explains why the proportion of targets was higher in the salivary gland sporozoite than in the oocyst/oocyst sporozoite. Figure 3B shows the relationship between the proportions of target genes and gene expression levels (RPKM values). The proportion of targets increased with RPKM values. These results suggest that the targets of AP2-Sp constitute a major proportion of the highly transcribed genes both in the oocyst/oocyst sporozoite and in the salivary gland sporozoite.
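The ranking analysis described above can be illustrated with a short Python sketch. It computes RPKM from raw counts using the standard definition (reads per kilobase of exon per million mapped reads) and then reports the fraction of predicted targets among the most highly expressed genes. The function names and the toy inputs used when calling them are hypothetical; the pipeline used to generate Table S4 is not described in this section.

```python
def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads Per Kilobase of exon per Million mapped reads."""
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


def top_genes(expression, n=100):
    """Gene IDs ordered by decreasing expression, truncated to the top `n`."""
    return sorted(expression, key=expression.get, reverse=True)[:n]


def target_fraction_in_top(expression, targets, n=100):
    """Fraction of the top-`n` expressed genes that are predicted targets."""
    ranked = top_genes(expression, n)
    if not ranked:
        return 0.0
    return sum(1 for g in ranked if g in targets) / len(ranked)
```

Applying `target_fraction_in_top` over successive expression bins, instead of only the top 100 genes, would give a relationship of the kind shown in Fig. 3B.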
Transcriptional Regulation in the Malaria Sporozoite mBio
specific protein 1), SSP3 (sporozoite surface protein 3), SPECT (sporozoite protein essential for cell traversal), SPECT2, B9, and SPELD (sporozoite surface protein essential for liver-stage development), which are transcribed in the sporozoites (17-24). These results suggest that the chimeric TF could bind the regions upstream of these genes and directly induce them. In the present study, we verified these results using ChIP-seq data. Among these 33 genes, 32 were the targets of AP2-Sp, as predicted in this study (see Table S4C). These results suggest that promoters for sporozoite-specific genes are accessible and also functional in ookinetes.
AP2-Sp activates its transcription through a transcriptional positive-feedback loop. Master TFs in the development of multicellular organisms are thought to stabilize their own expression by transcriptional autoactivation mechanisms such as transcriptional positive-feedback loops, which further stabilize expression of the target genes, contributing to establishment of the cell type (25). A similar mechanism may be employed in gametocytogenesis of Plasmodium parasites because AP2-G (a master regulator of gametocytogenesis) harbors binding sites of itself in its own upstream regulatory region (26). In the P. berghei AP2-G gene, ChIP-seq peaks for AP2-G extend across a 2-kbp upstream region beyond the ordinary promoter size (1,200 bp) that we assume in this parasite (27). To investigate whether this is true for AP2-Sp as well, we investigated AP2-Sp ChIP-seq peaks in the upstream regulatory regions of the AP2-Sp gene. As shown in the graphical view in Fig. 4A, in the upstream regions of AP2-Sp there is a large zone harboring multiple ChIP-seq peaks extending between 1,940 and 2,940 bp upstream of the gene and a corresponding cluster of AP2-Sp binding motifs (23 binding motifs in the region, including overlapping motifs).
The RNA-seq data showed that a cluster of AP2-Sp transcripts initiates downstream of this area, indicating that this is a regulatory region for the AP2-Sp gene. These observations suggest that AP2-Sp utilizes a transcriptional positive feedback loop, as observed with AP2-G, while AP2-Sp itself was not predicted to be a target due to the 1,200-bp inclusion criterion.
We wanted to prepare parasites with all the motifs mutated to investigate whether this putative positive-feedback loop actually contributes to AP2-Sp transcription. Because the motifs were spread over a large region, it was difficult to mutate all of them simultaneously by standard genetic modification methods. Thus, the entire region was replaced with a synthetic 1,000-bp DNA fragment with mutated motifs using a double CRISPR method (Fig. 4B) (28). Two independent clones were prepared, and the effects of these mutations on sporogony were assessed in mosquitoes. The oocyst number formed on the mosquito midgut wall was normal in these parasites, but the number of oocyst sporozoites obtained from them decreased significantly compared to the original parasite, pbcas9 (Table 1). The AP2-Sp transcripts extracted from infected midguts were five to seven times lower in both clones than in pbcas9 (Fig. 4C). These results demonstrate that positive transcriptional feedback is important for maintaining AP2-Sp expression at a level sufficient for the normal progression of sporogony.
The results obtained in this study can be explained by reduced expression of AP2-Sp caused by disruption of the positive-feedback loop, which further reduces the expression of its target genes, resulting in impaired sporogony. We performed RNA-seq analysis of AP2-Sp_pro_mut and the original pbcas9 parasites to examine whether the expression of the target genes decreased in AP2-Sp_pro_mut parasites. As shown in Fig. 4D, the RPKM values of most of the target genes (which were normalized with the total RPKM values of nontarget genes) decreased in AP2-Sp_pro_mut parasites compared to those in pbcas9 parasites (84.1% and 85.8% of target genes in clones 1 and 2, respectively). A Student t test of the fold changes for target and nontarget genes revealed that the expression of target genes was significantly decreased in AP2-Sp_pro_mut parasites (P = 1.28 × 10^-66 and P = 3.42 × 10^-97, respectively). These results were further confirmed by reverse transcription-quantitative PCR (RT-qPCR) assay for genes expressed in the sporozoite stage, including another AP2 family TF, AP2-Sp2 (Fig. 4E). These results suggest that transcriptional autoactivation by a positive-feedback loop in AP2-Sp is required for stabilizing gene expression at the sporozoite stage.

mutations (basically, [T/C]GCA[T/C][G/A] was changed to TGAA[T/C][G/A]), which were confirmed by sequencing (lower panels). (C) RT-qPCR assays of AP2-Sp transcripts were performed for pbcas9 and AP2-Sp_pro_mut parasites. Total RNA was extracted from midguts at 14 days after infective blood meal. The 60S ribosomal protein L21 mRNA was used as an internal control. The results are shown as expression of AP2-Sp relative to that in pbcas9. The data are averages of three biologically independent experiments (± the standard errors [SE]). Student t tests were used for statistical analyses. P values of <0.05 were considered significant (*). (D) The expression of AP2-Sp target genes was compared between pbcas9 and AP2-Sp_pro_mut parasites using RNA-seq. The RPKM values of each gene were normalized with average RPKM values of nontarget genes, and the log2(fold change) for these values was calculated for pbcas9 and AP2-Sp_pro_mut parasites. Analyses were performed in two independent clonal parasite lines described earlier, and the results of the target (upper panel) and other (lower panel) genes are shown as histograms. (E) The expression of eight target genes was compared using RT-qPCR assays in pbcas9 and AP2-Sp_pro_mut parasites. Assays were performed as described for panel C. The 60S ribosomal protein L21 mRNA was used as an internal control. The results are shown as gene expression relative to that in pbcas9. Data represent the averages of three biologically independent experiments (± the SE). *, P < 0.05; **, P < 0.01. RON2, rhoptry neck protein 2; CSP, circumsporozoite protein; TRAP, thrombospondin-related anonymous protein; MAEBL, merozoite apical erythrocyte-binding ligand; SPECT2, sporozoite microneme protein essential for cell traversal 2; UIS4, upregulated in infectious sporozoite 4.
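The normalization described for the RNA-seq comparison can be sketched in Python as follows: each RPKM value is divided by the average RPKM of nontarget genes in the same sample, and the log2 fold change between mutant and control is then computed per gene. This is a minimal sketch rather than the original analysis code. In particular, the small pseudocount that avoids division by zero is an assumption that does not appear in the text, and all function names are illustrative.

```python
import math
from statistics import mean


def normalize(rpkm_by_gene, nontarget_ids):
    """Divide each RPKM value by the mean RPKM of the nontarget genes."""
    reference = mean(rpkm_by_gene[g] for g in nontarget_ids)
    return {g: value / reference for g, value in rpkm_by_gene.items()}


def log2_fold_changes(control, mutant, nontarget_ids, pseudocount=1e-3):
    """Per-gene log2(mutant / control) after nontarget normalization.

    `pseudocount` is an assumed safeguard against zero expression values.
    """
    c = normalize(control, nontarget_ids)
    m = normalize(mutant, nontarget_ids)
    return {
        g: math.log2((m[g] + pseudocount) / (c[g] + pseudocount))
        for g in c
        if g in m
    }


def fraction_decreased(log2fc, gene_ids):
    """Fraction of the listed genes whose log2 fold change is negative."""
    ids = [g for g in gene_ids if g in log2fc]
    if not ids:
        return 0.0
    return sum(1 for g in ids if log2fc[g] < 0) / len(ids)
```

Calling `fraction_decreased` on the target gene IDs gives the kind of percentage reported above for each clone, and the two groups of fold changes (target and nontarget) are the inputs to the t test.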
Transcriptional cascade in sporozoites is triggered by AP2-Sp. AP2 family TFs sometimes have large upstream regulatory regions, as observed in AP2-Sp and AP2-G, and are not predicted to be targets by the 1,200-bp criterion. In sporozoites, four TFs have been reported to be transcribed: the AP2 family proteins, AP2-Sp2, AP2-Sp3, and AP2-L, and the unclassifiable TF SLARP (3,14,29). Except for AP2-Sp2 (Fig. 5A), these TFs were not predicted to be targets of AP2-Sp. However, manual investigation of the upstream regions of these genes revealed that all harbored AP2-Sp peaks, which accompanied their downstream transcripts (Fig. 5B to D). These results suggest that AP2-Sp regulates these genes directly.
Next, we prepared parasites in which the AP2-Sp binding motifs were mutated in the upstream region of the AP2-Sp2 gene and examined whether AP2-Sp2 was directly activated by AP2-Sp. As shown in Fig. 4E, a significant decrease in expression of AP2-Sp2 was observed in AP2-Sp_pro_mut parasites using RT-qPCR. The AP2-Sp2 gene harbors multiple AP2-Sp peaks in its upstream region, with nine binding motifs under the peaks (Fig. 5A). We intended to mutate all of these motifs using a double CRISPR system by targeting those nearest to each end of the peak region, but protospacer adjacent motif sequences were not present around the motif at the 3′ end of the peak region. Thus, the third motif inward from this end was targeted by gRNA (Fig. 5E). Two independent clones were obtained. Two motifs near the 3′ end of the peak region of one of the clones were not mutated, probably because of crossover homologous recombination between the second and third motifs. Despite these differences, the phenotypes were essentially the same in both clones. Oocysts formed as in wild-type parasites, but the number of sporozoites collected from the midguts was approximately five times lower in the mutants than in the wild type (Table 1). RT-qPCR analysis showed that transcripts of AP2-Sp2 decreased significantly (Fig. 5F). These results demonstrate that AP2-Sp activates AP2-Sp2 directly through the seven binding sites and that this direct regulation is essential for the normal progression of sporogony.

DISCUSSION
The mechanisms of gene regulation in the sporozoite stage have been elusive. AP2-Sp is a candidate master regulator of this stage; it is essential for sporogony, it is expressed throughout the sporozoite stage, and many genes important for sporozoite functions harbor its binding motif in their upstream regulatory regions. However, genome-wide studies of its target genes have not been performed due to difficulties in preparing sufficient numbers of parasites. To solve this problem, we explored ways to perform ChIP-seq on this stage and finally determined that high-quality data could be obtained using infected mosquito midguts as the starting material. The data revealed that this TF is a master TF of this stage with two prominent features. First, the target genes of this TF constitute a major proportion of the genes that are highly expressed during this stage and comprehensively represent genes known to be important for stage-specific functions. This style of gene regulation, which we called "direct and comprehensive regulation" in our previous paper on gene regulation in the ookinete (another motile stage) (6), suggests that AP2-Sp binding determines the gene expression repertoire in this stage. Second, AP2-Sp stabilizes its expression by a positive transcriptional feedback loop and controls other sporozoite-specific TFs, i.e., it is at the pinnacle of transcriptional regulation during this stage. The mechanism of a positive transcriptional feedback loop may be important for complete stage conversion by increasing AP2-Sp transcripts rapidly and stabilizing new gene expression repertoires that are constituted from its target genes. These identified features of this TF imply simple mechanisms of stage-specific gene regulation and suggest that a complete picture of gene regulation during this stage can be obtained by studying transcriptional regulation starting with AP2-Sp.
a For each transcription factor, two biologically independent mutant parasites, clone 1 and clone 2, were prepared from pbcas9 and used for experiments. The numbers of oocysts and oocyst sporozoites were compared between pbcas9 and mutant parasites 14 days after infective blood meal.
Two clones were prepared independently, in which seven (clone 1) and nine (clone 2) motifs (not including overlapped ones) were found to be mutated. (F) RT-qPCR assays of AP2-Sp2 transcripts. Experiments were performed between pbcas9 and AP2-Sp2_pro_mut parasites at 14 days after infective blood meal using mosquito midguts as starting material. The 60S ribosomal protein L21 mRNA was used as an internal control. The results are shown as expression of AP2-Sp2 relative to that in pbcas9. The data are averages of three biologically independent experiments (± the SE). The statistical analyses were performed using Student t tests (*, P < 0.05).
One question to be answered in the next step is how different profiles of gene expression are generated among the AP2-Sp target genes. Sporozoites target different organs in vertebrate and mosquito hosts. Consequently, the expression profiles of many genes change before and after salivary gland infection. For example, the expression of UIS4, which is required for liver-stage development, is low before salivary gland infection but upregulated greatly after salivary gland infection (30). In contrast, the expression of serine repeat antigen 5, which is required for oocyst rupture, is high in the oocyst sporozoite but decreases in the salivary gland sporozoite (31). If AP2-Sp binds to the promoters of these genes irrespective of their expression profiles, how can the different gene expression profiles observed in this stage occur? Do additional regulatory mechanisms exist?
At present, the most likely answer to these remaining questions is a scenario where TFs downstream of AP2-Sp play these roles, moderating the basal transcriptional activity of the promoter controlled by AP2-Sp and acting as repressors or enhancers to switch the expression pattern of a group of target genes. Indeed, considering that parasites show different phenotypes by disruption of AP2-Sp2, AP2-Sp3, and SLARP at this stage (3,14), there is a possibility that these factors regulate different groups of genes, producing different profiles of gene expression among target genes of AP2-Sp. The present ChIP-seq method should help us to examine this assumption, at least in AP2-Sp2 and AP2-Sp3, which could be expressed in oocysts/oocyst sporozoites, by determining the target genes of these TFs and investigating whether binding of these TFs explains their expression profiles. On the other hand, further strategies to analyze genome binding may be required to study gene regulation by SLARP, which is mainly expressed in the salivary gland sporozoite, where lower numbers of parasites are obtained.
As discussed above, the cascade induction of TFs such as AP2-Sp2, AP2-Sp3, and SLARP could play an important role in determining gene expression patterns unique to the sporozoite stage. In contrast, the induction of AP2-L is not responsible for transcriptional regulation in the sporozoite stage but plays a role in initiating prompt conversion to the liver stage (29). This role seems to be analogous to that observed in gametocyte master TF AP2-G. AP2-G induces hundreds of genes and establishes early gametocytes, but at the same time, activates TFs that act in subsequent female gametocytes, which makes it possible to sustain sexual development that continues after expression of AP2-G is over (27). Presumably, the master TFs of this parasite play a role in triggering the conversion to subsequent stages and a series of cascade events of master TFs drives the life cycle forward. Taken together, these observations suggest two roles of a master TF in the life cycle: create the basic gene expression pattern of a new stage and maintain it after stage conversion, and prepare the next stage by activating its master TFs.
According to this assumption, AP2-Sp is transcriptionally activated by a TF in the preceding oocyst stage, and its expression is subsequently maintained by transcriptional autoactivation after completion of stage conversion. In the experiment using AP2-Sp_pro_mut parasites, which lacked a transcriptional positive-feedback loop, the expression of AP2-Sp was significantly reduced. However, a level of expression equivalent to approximately 10% of that observed in the original pbcas9 parasites remained. This phenotype appeared imperfect (because the expression was not completely abolished), but the residual expression can be explained by activation by a TF expressed in the preceding oocyst stage. On the other hand, if AP2-Sp2 transcription is exclusively activated by AP2-Sp, the similar phenotype of AP2-Sp2_pro_mut parasites cannot be explained because transcripts of AP2-Sp2 should have been eliminated in this case. We have previously reported that several genes involved in the development of the female gametocyte are transcriptionally activated in two steps: first activated by a gametocyte master TF AP2-G and then by a female gametocyte master TF AP2-FG (27,32). Therefore, we speculate that transcription of AP2-Sp2 is activated in two steps: initially by a TF in the preceding oocyst stage and then by AP2-Sp. In the next step, investigating the TFs expressed during sporogony and filling the gap in the TF cascade between the oocyst and sporozoite stages by identifying this TF is crucial.
In conclusion, this study demonstrates the role of AP2-Sp as a master regulator in the sporozoite stage and, to the best of our knowledge, is the first to reveal an overall picture of transcriptional regulation in this stage. Gene regulation at the sporozoite stage may be under multiple levels of regulation. Although many issues remain to be resolved, we believe this study is a vital step toward unveiling gene regulation during the sporozoite stage.

MATERIALS AND METHODS
Ethics statement. This study was performed according to the recommendations in the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health in order to minimize animal suffering. These experiments were approved by the Animal Research Ethics Committee of Mie University (permit 23-29).
Parasite preparations. The ANKA strain of P. berghei was used in all experiments. To infect mosquitoes, A. stephensi mosquitoes were allowed to feed on infected BALB/c mice. Fully engorged mosquitoes were selected and maintained at 20°C in a humid environment. The number of oocysts was counted under a microscope 14 days after an infective blood meal. To ensure the quality of mosquito samples used for ChIP-seq, we used mosquitoes for experiments only when 9 of 10 mosquitoes randomly sampled from the breeding cage contained over 100 oocysts in the midgut.
ChIP-seq. The mosquito midgut was excised, fixed immediately in 1% paraformaldehyde, and stored in this solution until the dissections of all mosquitoes were completed. ChIP-seq was performed using the same procedure as reported previously (27). Briefly, the fixed midguts were sonicated with a Bioruptor (Tosho Denki, Yokohama, Japan) in 0.25 mL of lysis buffer (50 mM Tris-HCl, 1% sodium dodecyl sulfate [SDS], 10 mM EDTA) under the following conditions: 20 cycles (30 s/cycle) at middle power. The lysate was centrifuged at 4°C to remove particulates. Input sample was collected at this point from the supernatant of sheared cell lysate, and the resulting supernatant was subjected to immunoprecipitation with anti-GFP antibodies using Dynabeads-protein A (Thermo Fisher Scientific). After a washing step, immunoprecipitated chromatin was eluted from beads in elution buffer (10 mM Tris-HCl, 1% SDS, 5 mM EDTA, 300 mM NaCl), and the supernatant was incubated in a 65°C water bath for 8 h with shaking for cross-link reversal. The sample was further processed with RNase H for 1 h and proteinase K for 2 h. DNA fragments were purified from the immunoprecipitated chromatin by phenol-chloroform extraction and ethanol precipitation and used for library construction. Input DNAs were obtained from the chromatin by the same procedure but without IP. Libraries were constructed using a HyperPrep kit (KAPA Biosystems) according to the manufacturer's protocol. Sequencing was performed using the SOLiD 5500 (Life Technologies) and Illumina NextSeq Systems in experiments 1 and 2, respectively.
Analysis of ChIP-seq data. The sequence data were mapped onto the P. berghei genomic sequence (PlasmoDB, version 3) and the A. stephensi genome (IndV3) (33) using Bowtie software under default settings. The mapping data of P. berghei were further analyzed using the MACS2 peak-calling algorithm (FDR < 0.01). To identify common peaks between two experiments, peak summits within 150 bp of each other were selected using an in-house script. To estimate the binding sequences of AP2-Sp, six-base sequences concentrated within 100 bp from the predicted summits were investigated by an in-house script using a Fisher exact test as described previously (6). Briefly, a Fisher exact test was performed between the 200-bp regions around summits of ChIP-seq peaks and the 200-bp regions excised from the genome excluding the former regions, to cover the entire genome sequence. Common sequence motifs were searched among the sequences with the lowest P values. Genes were assigned as targets of AP2-Sp when a predicted peak summit was located within 1,200 bp upstream of the gene. When the upstream intergenic region was shorter than 1,200 bp, the whole intergenic region was used instead for predictions. GO term enrichment analysis was performed for P. berghei genes using the statistical programming tool R in combination with the Bioconductor GOstats library. Graphical views of ChIP-seq peaks were created from the IP and input data as follows: reads were mapped on the P. berghei genome as described earlier, and read coverage of input was subtracted from that of IP after normalization with the total read number using bamCoverage of deepTools2 (bin = 10) (34). Data were visualized using the Integrative Genomics Viewer (IGV) software (35).
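The in-house scripts are not given in the text, so the following is only a minimal sketch of the two selection rules described above: keeping summits that are reproduced within 150 bp in a second experiment, and assigning a gene as a target when a summit falls in its upstream window (1,200 bp, or the whole intergenic region if that is shorter). The function names, the tuple layouts, and the choice to measure the window from the 5' end of the gene are assumptions made for illustration.

```python
from bisect import bisect_left


def common_summits(summits_a, summits_b, max_dist=150):
    """Return summits of experiment A with a summit of experiment B on the
    same chromosome within max_dist bp. Summits are (chrom, position)."""
    # Index experiment B summits by chromosome, sorted for binary search.
    by_chrom = {}
    for chrom, pos in summits_b:
        by_chrom.setdefault(chrom, []).append(pos)
    for positions in by_chrom.values():
        positions.sort()

    shared = []
    for chrom, pos in summits_a:
        positions = by_chrom.get(chrom, [])
        i = bisect_left(positions, pos)
        # Only the neighbours on either side of the insertion point
        # can be the closest summit.
        neighbours = positions[max(i - 1, 0):i + 1]
        if any(abs(pos - q) <= max_dist for q in neighbours):
            shared.append((chrom, pos))
    return shared


def assign_targets(summits, genes, window=1200):
    """Return IDs of genes with a summit in their upstream window.

    genes: (gene_id, chrom, strand, five_prime_pos, neighbour_edge), where
    neighbour_edge is the boundary of the adjacent gene on the upstream side
    (hypothetical layout). The window is clipped to the intergenic region."""
    targets = set()
    for gene_id, chrom, strand, five_prime, neighbour_edge in genes:
        if strand == "+":
            lo, hi = max(five_prime - window, neighbour_edge), five_prime
        else:
            lo, hi = five_prime, min(five_prime + window, neighbour_edge)
        if any(c == chrom and lo <= p <= hi for c, p in summits):
            targets.add(gene_id)
    return targets
```

A typical use would pass the MACS2 summit coordinates of the two experiments to `common_summits` and then feed the shared summits, together with gene coordinates from the genome annotation, to `assign_targets`.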
RNA-seq. For the RNA-seq of the oocyst/oocyst sporozoite, total RNA was extracted from the midguts dissected from mosquitoes 14 days after an infective blood meal. The dissected midguts were soaked in RNAlater solution until all midguts had been collected. Ten midguts were used for each extraction, and three independent experiments were performed. For the RNA-seq of salivary gland sporozoites, total RNA was extracted from salivary gland sporozoites 21 days after an infective blood meal. One hundred mosquitoes were used for each extraction (usually approximately one million sporozoites were obtained), and three independent experiments were performed. RNA was extracted using Isogen II (Nippon Gene) according to the manufacturer's protocol. The dissected midguts were lysed in the lysis solution supplied with the kit using the TissueLyser system (Qiagen). The library for RNA-seq was prepared using a HyperPrep kit (KAPA). Briefly, mRNA was purified from 400 ng of total RNA and was subjected to cDNA synthesis with random primers. The second strand was synthesized with dUTP incorporation, and Y-shaped adapters were ligated to both ends of the double-stranded cDNA. After dUTP strand degradation, the strand-specific library for RNA-seq was amplified by PCR (12 cycles). These libraries were used for Illumina sequencing. Reads were mapped on the P. berghei genome with HISAT2 software. The HISAT2 parameters were set to default except for "--max-intronlen," which was set to 1,000. The mapping data were analyzed by featureCounts software to calculate RPKM values. Average RPKM values obtained by three independent experiments were used to quantify gene expression. To compare target gene expression between pbcas9 and AP2-Sp_pro_mut parasites, RNA-seq analysis was performed for the infected midguts 14 days after an infective blood meal. RPKM values were calculated, and genes with RPKM values of >50 in pbcas9 were used for subsequent analyses. For comparison of gene expression between pbcas9 and AP2-Sp_pro_mut parasites, the obtained RPKM values were normalized with the average RPKM values of nontarget genes, and the fold changes of these values were calculated between mutated parasites and pbcas9. A Student t test was performed using these fold change values for target and nontarget genes.
Preparation of mutant parasites by CRISPR/Cas9 system. Parasites with mutated AP2-Sp binding motifs were prepared with the CRISPR/Cas9 system using P. berghei parasites constitutively expressing Cas9 (pbcas9), as reported previously (28). A DNA fragment corresponding to the upstream region of the gene was synthesized with all motifs mutated (two point mutations per motif). Sequences for homologous recombination were added to both sides of the fragment by overlapping PCR and used as donor DNA. Two guide RNA (gRNA) sequences were designed, corresponding to the motifs on opposite sides of the synthesized fragment. The gRNA target sequences were subcloned into a plasmid for double CRISPR gRNA expression. Mature schizonts were transfected with donor DNA and the gRNA plasmid, and mutants were selected with pyrimethamine for 1.5 days, beginning 30 h after transfection. Parasite clones were obtained by limiting dilution, and correct exchange of the original locus with the donor DNA fragment by double-crossover homologous recombination was verified by sequencing. The primers used in this experiment are listed in Table S5.
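The RNA-seq comparison between pbcas9 and AP2-Sp_pro_mut parasites (RPKM calculation, filtering at RPKM > 50 in pbcas9, normalization by nontarget genes, and fold change) can be illustrated with a short sketch. It uses the standard RPKM definition; the function names and dictionary inputs are hypothetical, and computing the nontarget average over the filtered genes only is an assumption, since the text does not specify it.

```python
from statistics import mean


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def normalized_fold_changes(rpkm_ctrl, rpkm_mut, target_ids, min_rpkm=50):
    """Fold change (mutant / control) of RPKM values after scaling each
    sample by the average RPKM of its nontarget genes.

    rpkm_ctrl, rpkm_mut: dicts mapping gene ID to RPKM (hypothetical inputs).
    Only genes with RPKM > min_rpkm in the control are kept."""
    kept = [g for g, v in rpkm_ctrl.items() if v > min_rpkm and g in rpkm_mut]
    nontarget = [g for g in kept if g not in target_ids]
    # Assumption: the scaling factors use the filtered nontarget genes.
    ctrl_scale = mean(rpkm_ctrl[g] for g in nontarget)
    mut_scale = mean(rpkm_mut[g] for g in nontarget)
    return {
        g: (rpkm_mut[g] / mut_scale) / (rpkm_ctrl[g] / ctrl_scale)
        for g in kept
    }
```

The resulting fold changes for target and nontarget genes would then be compared with a two-sample Student t test, which is omitted here to keep the sketch free of external dependencies.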
Reverse transcription-quantitative PCR assay. Twenty midguts were dissected 14 days after an infective blood meal. Total RNA was extracted from the excised midguts using the same procedures as described above. cDNA was synthesized from 1 μg of total RNA using PrimeScript RT reagent kit with gDNA Eraser (TaKaRa). qPCR was performed using TB Green Fast qPCR Mix (TaKaRa) and Thermal Cycler Dice Real Time System II (TaKaRa). For each qPCR, 1/40 of the synthesized cDNA was used as a template, and the delta cycle threshold was calculated using the 60S ribosomal protein L21 (RPL21, PBANKA_1018600) mRNA as an internal control. Three biologically independent experiments were performed for each gene. The primers used in this assay are listed in Table S5.
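As an illustration of the delta cycle threshold calculation, the sketch below subtracts the Ct of the internal control (RPL21) from the Ct of the gene of interest. Converting the difference to a relative expression level as 2 to the power of minus delta Ct assumes that the product doubles in each cycle; this conversion and the function names are assumptions, not a description of the exact calculation used in this study.

```python
def delta_ct(ct_target, ct_reference):
    """Delta Ct: Ct of the gene of interest minus Ct of the internal control."""
    return ct_target - ct_reference


def relative_expression(ct_target, ct_reference):
    """Expression relative to the internal control, assuming the product
    doubles every cycle (100% amplification efficiency)."""
    return 2.0 ** -delta_ct(ct_target, ct_reference)


def expression_ratio(mutant_cts, control_cts):
    """Ratio of mutant to control relative expression.

    Each argument is a (ct_target, ct_reference) pair from one sample."""
    return relative_expression(*mutant_cts) / relative_expression(*control_cts)
```

For example, a gene whose delta Ct is three cycles larger in the mutant than in the control would be estimated at one-eighth of the control level under this assumption.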
Data availability. The ChIP sequencing data for this article have been deposited in the Gene Expression Omnibus (GEO) database under accession number GSE188985.

SUPPLEMENTAL MATERIAL
Supplemental material is available online only.