Genome-Wide Analysis of Watermelon HSP20s and Their Expression Profiles and Subcellular Locations under Stresses

Watermelon (Citrullus lanatus L.), an economically important cucurbit crop that is cultivated worldwide, is vulnerable to various adverse environmental conditions. Small heat shock proteins (HSP20s) are the most abundant plant HSPs and play important roles in various biotic and abiotic stress responses. However, they have not been systematically investigated in watermelon. In this study, we identified 44 watermelon HSP20 genes and analyzed their gene structures, conserved domains, phylogenetic relationships, chromosomal distributions, and expression profiles. All of the watermelon HSP20 proteins have a conserved α-crystallin domain (ACD). Half of the ClHSP20s arose through gene duplication events. Plant HSP20s were grouped into 18 subfamilies, and a new subfamily, nucleo-cytoplasmic XIII (CXIII), was identified in this study. Numerous stress- and hormone-responsive cis-elements were detected in the putative promoter regions of the watermelon HSP20 genes. Unlike in other species, half of the watermelon HSP20s were repressed by heat stress. Plant HSP20s displayed diverse responses to different virus infections, and most of the ClHSP20s were generally repressed by Cucumber green mottle mosaic virus (CGMMV). Some ClHSP20s exhibited similar transcriptional responses to abscisic acid, melatonin, and CGMMV. Subcellular localization analyses of six selected HSP20-green fluorescent protein fusion proteins revealed diverse subcellular targeting. Some ClHSP20 proteins were affected by CGMMV, as reflected by changes in the size, number, and distribution of fluorescent granules. These systematic analyses provide a foundation for elucidating the physiological functions and biological roles of the watermelon HSP20 gene family.


Introduction
Heat shock proteins (HSPs), an important group of molecular chaperones, participate in various plant stress responses and in the regulation of plant growth and development [1]. On the basis of their molecular weight, HSPs can be classified into six families, namely, HSP100, HSP90, HSP70, HSP60, HSP20 (or small heat shock protein), and ubiquitin. Of these groups, HSP20 is the largest family and the most well studied [2]. Numerous investigations have revealed the roles of HSP20 proteins in various abiotic stress responses. Most HSP20s are highly induced by heat stress, and transgenic plants overexpressing HSP20s in Arabidopsis, rice, maize, wheat, and Chenopodium sp. showed enhanced tolerance to heat stress because these proteins protect other proteins from irreversible denaturation [3][4][5][6][7]. Overexpression of OsHSP17.0 and OsHSP23.7 in rice decreases membrane damage, increases the abundance of protective molecules, and leads to improved seed vigor under osmotic and salt stress

Identification of HSP20 Genes in Watermelon and Cucumber
We conducted a genome-wide analysis of HSP20 gene family members in watermelon. Protein Basic Local Alignment Search Tool (BLASTP) searches were performed in the Cucurbit Genomics Database to identify putative HSP20s in watermelon. The search queries were 273 HSP20 protein sequences from Arabidopsis, rice, soybean, tomato, pepper, wheat, barley, and switchgrass. A total of 53 putative hits were identified in the watermelon genome database. In parallel, 41 sequences were annotated as watermelon HSP20 genes after searches with HMMER 3.0 using the global hidden Markov model (HMM) profile of the HSP20 characteristic domain. Redundant sequences identified by the two methods were removed to obtain unique putative HSP20 genes, and the remaining hits were further filtered using the Conserved Domains Database (CDD) and the Simple Modular Architecture Research Tool (SMART) according to the presence of structural characteristics and the conserved ACD domain. Finally, 44 HSP20 genes were identified in watermelon. The same methods were applied to identify 45 HSP20s in cucumber. HSP20 genes have been intensively studied in some model plant species and important crops. Among the plants studied to date, Arabidopsis has the smallest number of HSP20s and switchgrass has the largest number. The numbers of HSP20 genes in watermelon (44) and cucumber (45) were greater than those in all other plant species, except for soybean (51) and switchgrass (63) (Table 1).

Table 1. Summary of the number of heat shock protein 20 (HSP20) genes in diverse plant species.

The amino acid sequences of all ClHSP20s ranged from 97 to 502 amino acids in length, corresponding to molecular weights ranging from 13.72 to 57.37 kDa (Table S1). The HSP20s in watermelon were named according to their protein molecular weight, as in previous studies [2,25,26]. Almost all of the ClHSP20s had a highly conserved α-crystallin domain (ACD) at the C-terminal end, as identified by SMART, as well as two conserved motifs (motif 1 and motif 2), as identified by MEME [30] (Figures 1 and 2). Multiple sequence alignments indicated that the ACD domain contained two conserved regions with a β-sandwich of two antiparallel β-sheets (Figure S1 and Figure 2). These conserved regions are known to play important roles in the chaperone function of HSP20 proteins [31,32]. Seven ClHSP20 proteins (ClHSP18.9A, ClHSP21.8, ClHSP27.5, ClHSP38.8, ClHSP39.8, ClHSP42.6, and ClHSP50.3) had a transmembrane domain in the C- or N-terminal region. According to the number of introns, the watermelon ClHSP20 genes could be divided into three types: type I genes have no introns (21 ClHSP20s); type II genes have one intron (20 ClHSP20s); and type III genes have more than one intron (ClHSP23.1B, ClHSP27.9, and ClHSP46.3) (Figure 1).
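The intron-based grouping described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in this study: the function names and the gene identifiers in the example are hypothetical, and real intron counts would come from the gene structure annotation.

```python
from collections import Counter


def intron_type(n_introns: int) -> str:
    """Map an intron count to the type used in the text (I, II, or III)."""
    if n_introns < 0:
        raise ValueError("intron count cannot be negative")
    if n_introns == 0:
        return "I"    # no introns
    if n_introns == 1:
        return "II"   # exactly one intron
    return "III"      # more than one intron


def tally_types(intron_counts: dict[str, int]) -> Counter:
    """Count how many genes fall into each intron type."""
    return Counter(intron_type(n) for n in intron_counts.values())


# Hypothetical gene names and intron counts, for illustration only.
example_counts = {"geneA": 0, "geneB": 1, "geneC": 3, "geneD": 0}
print(tally_types(example_counts))
```

Applied to the 44 ClHSP20 genes, a tally of this kind should reproduce the 21, 20, and 3 genes reported for types I, II, and III.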

Figure 2. Amino acid sequence alignment of the α-crystallin domain (ACD) from watermelon HSP20s. Sequences were aligned using the Clustal X program. Conserved motifs are marked. Conserved regions I and II are shown with red and yellow backgrounds, respectively, and the typical amino acid residues within these regions are indicated by red boxes.

Phylogenetic Relationship of Plant HSP20 Members
To further analyze the phylogenetic relationships of HSP20 proteins, the amino acid sequences of HSP20 proteins from Arabidopsis, rice, tomato, soybean, switchgrass, cucumber, and watermelon were used to perform multiple sequence alignments and to construct a phylogenetic tree (Figure 3). The HSP20s were divided into 18 subfamilies. Notably, a new subfamily, nucleo-cytoplasmic XIII (CXIII), was identified in this study. This subfamily comprised only HSP20 proteins from watermelon and cucumber. Of the 18 subfamilies, 16 contained watermelon HSP20 proteins. Among all the watermelon HSP20 genes, 34 were nucleo-cytoplasmic (C) HSP20 genes (11 subfamilies); three were mitochondrial (M) HSP20 genes (two subfamilies); three were endoplasmic reticulum (ER) HSP20 genes; two were plastidic (P) HSP20 genes; one was a peroxisomal (Px) HSP20 gene; and one was an orphan gene (ClHSP42.6). Watermelon had no HSP20 genes in the CVIII or CX subfamilies, since these two subfamilies were exclusively found in monocots. The CIV and CXIII subfamilies each had only one dicotyledon HSP20 gene. The largest subfamily was CI, which contained 12 ClHSP20 genes. Members of the CI subfamily were highly conserved. In fact, the gene pairs ClHSP11.1A and ClHSP11.1B, ClHSP17.6C and ClHSP17.6D, and ClHSP18.1A and ClHSP18.1D in the CI subfamily shared identical amino acid sequences with each other (Table S2).

Genomic Distribution and Evolutionary Analysis of Watermelon HSP20s
All of the HSP20 genes in watermelon were non-randomly located on the 11 watermelon chromosomes (Chr), except for ClHSP18.1A and ClHSP27.2. Chr07 had the most ClHSP20 genes. Only one ClHSP20 was mapped to each of Chr01, Chr06, and Chr11 (Figure 4). Three tandem duplication clusters involving 10 ClHSP20 genes were identified (Table 2). A duplicated gene pair (ClHSP23.1B and ClHSP43.7) was mapped on Chr02. This pair shared 54.2% amino acid sequence similarity (Table S2). The Ks value (i.e., synonymous substitution rate) for this duplicated pair was 1.7552, corresponding to a divergence time of 135.02 million years ago (Mya). ClHSP11.1A, ClHSP18.1B, ClHSP17.6C, and ClHSP18 were tandemly repeated on Chr07 and shared 69.1% to 96.8% similarity with each other (Figure 4 and Table S2). The Ks values of these duplicates ranged from 0.0865 to 1.2543, corresponding to divergence times of 6.65 to 96.48 Mya (Table 2). ClHSP18.1D, ClHSP11.1B, ClHSP18.1C, and ClHSP17.6D were also distributed on Chr07 and showed 70.1% to 90.6% amino acid similarity. The Ks values of these duplicated genes ranged from 0.3202 to 1.1953, corresponding to divergence times of 24.63 to 91.95 Mya. Interestingly, all eight tandem duplicates mapped on Chr07 belonged to the CI subfamily, suggesting that the watermelon CI subfamily has probably expanded via tandem duplication. Thirteen gene pairs involving 14 ClHSP20s resulted from segmental duplication (Table 2). Eight pairs of segmental duplicates occurred between four ClHSP20s on Chr07 and five genes that were scattered on Chr03, Chr04, Chr10, and Chr11. Two putative duplicated gene pairs (ClHSP16/ClHSP22.8
There is a high degree of intergenomic homology between watermelon and cucumber, which belong to the Cucurbitaceae family [33]. Therefore, we assessed the syntenic relationships of HSP20 genes from the watermelon and cucumber genomes using the Cucurbit Genomics Database to investigate cucurbit chromosomal evolution. A total of 42 syntenic gene pairs, involving 30 ClHSP20s located on all watermelon chromosomes except for Chr01, were identified between watermelon and cucumber. Four genes (ClHSP17.6C, ClHSP17.6D, ClHSP18.1B, and ClHSP18.1D) on watermelon Chr07 had the highest number of syntenic genes (10) in the cucumber genome (Figure 5).

Figure 3. Phylogenetic relationship of HSP20 proteins in Arabidopsis, rice, tomato, soybean, switchgrass, cucumber, and watermelon. Phylogenetic trees were constructed using the neighbor-joining method with bootstrap tests in MEGA 5.0. The bar represents the relative divergence of the sequences examined. The diverse HSP20 subgroups are indicated with differently colored arcs. The differently colored symbols at the branch tips represent different species. C, cytoplasmic/nuclear; ER, endoplasmic reticulum; P, plastid; PX, peroxisome; M, mitochondria.

Analysis of Putative cis-Acting Elements in ClHSP20 Promoters
Heat shock elements (HSEs) are common cis-acting elements in the promoters of HSP20s that confer heat shock-inducible expression. We identified and analyzed cis-acting elements in the putative promoter regions of the HSP20 genes in watermelon (Table S3). Numerous hormone- and stress-responsive elements were found in the ClHSP20 promoters. We detected MeJA-, GA-, salicylic acid (SA)-, and ABA-responsive elements in 23, 20, 20, and 10 HSP20 promoters, respectively. Of the 44 watermelon HSP20 genes, 27 had HSE motifs in their promoter regions. Drought- and low temperature-responsive motifs were identified in the promoters of 8 and 10 ClHSP20 genes, respectively.

Figure 5. Relationship diagrams of syntenic HSP20 genes in chromosomal synteny regions distributed in watermelon and cucumber. Chromosomes of watermelon and cucumber are represented by blue and red arcs according to their own sizes. Differently colored lines link representative syntenic HSP20 genes in watermelon and cucumber.

Subcellular Localizations of HSP20 Proteins in Watermelon
The subcellular localizations of six watermelon HSP20 proteins from different subfamilies, namely ClHSP22.8, ClHSP15.9, ClHSP17.4, ClHSP18.9A, ClHSP18.9B, and ClHSP15.3, were analyzed by transiently expressing their green fluorescent protein (GFP) fusion proteins in tobacco leaf epidermal cells. The results showed that these six ClHSP20 proteins were localized to various subcellular compartments, suggesting that the watermelon ClHSP20 proteins have diverse functions (Figure 6). Specifically, ClHSP15.9 was predicted to be located in the peroxisome according to the WoLF PSORT website; however, it was predominantly detected in cytoplasmic granules (Table S1 and Figure 6B,b). ClHSP22.8 and ClHSP15.3 were predicted to be cytoplasmic and nuclear proteins, respectively, but their GFP fusion proteins were mainly agglomerated into granules in the cytoplasm and were weakly detected in the endoplasmic reticulum (ER) (Figure 6A,F,a,f). In contrast, ClHSP18.9A, ClHSP17.4, and ClHSP18.9B were predicted to be located in the Golgi, cytoplasm, and chloroplasts, respectively, but the corresponding GFP fusion proteins were in fact localized to the endoplasmic reticulum (Figure 6C-E,c-e). Additionally, ClHSP17.4-, ClHSP18.9A-, ClHSP18.9B-, and ClHSP15.3-GFP fusion proteins were also detected in the nucleus (Figure S2). After CGMMV infection, the subcellular localizations of some ClHSP20s appeared to change (Figure 7). For example, ClHSP15.9-GFP fusion proteins accumulated into more vesicles and some newly formed vesicae in the cytoplasm, whereas ClHSP17.4-GFP began to agglomerate in relatively large granules along the endoplasmic reticulum, and ClHSP15.3-GFP no longer appeared to agglomerate in granules but was instead present at the endoplasmic reticulum.

(a-f) display enlarged images of the sections delimited by white squares in A to F, respectively. Fluorescent granules are indicated by white arrows. Vesicae are marked by red arrows. White stars label the GFP fluorescent signal located on the endoplasmic reticulum. Scale bars, 25 μm.

Expression Analyses of ClHSP20 Genes in Response to Abscisic Acid and Melatonin
The expression patterns of 30 randomly selected ClHSP20 genes in response to ABA and melatonin (MT) were analyzed in a qRT-PCR assay (Figure 8, Table S6). Most of the ClHSP20s responded to ABA and MT. Under ABA treatment, the expression levels of 14 ClHSP20s were clearly downregulated from 4 h, while eight ClHSP20s exhibited upregulated expression. Of these genes, ClHSP18.1E, ClHSP18.2, and ClHSP23 expression levels increased by 70-, 40-, and 39-fold, respectively, at 4 h after the ABA treatment (Figure 8A, Table S6). Some ClHSP20s, such as ClHSP11.1A, ClHSP17.6B, and ClHSP21.6, were induced after initially being repressed by the ABA treatment. In contrast, MT downregulated the expression of most of the analyzed ClHSP20 genes, especially at 4 and 12 h after treatment. Moreover, the expression levels of some ClHSP20 genes, including ClHSP11.1A, ClHSP26.3, ClHSP21.6, and ClHSP23, were downregulated from 1 to 12 h after the MT treatment (Figure 8B). Meanwhile, MT upregulated the expression of 11 ClHSP20 genes, especially at 1 h after treatment. The ClHSP17.6A expression level increased by 145-fold at 12 h, whereas ClHSP22.8 expression was evidently downregulated at 1 h but increased by nearly three-fold at 4 h after the MT treatment.
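To make the fold changes above concrete, the following Python sketch shows one common way of deriving a relative expression value from qRT-PCR threshold cycle (Ct) values, the 2^(-ΔΔCt) calculation. This section does not state how the fold changes were computed, so the method, the function name, and the Ct values below are assumptions made for illustration only.

```python
def fold_change(ct_target_treated: float, ct_ref_treated: float,
                ct_target_control: float, ct_ref_control: float) -> float:
    """Relative expression by the 2^(-ddCt) calculation (assumed method).

    Each Ct of the target gene is first normalized to a reference gene
    measured in the same sample; the treated sample is then compared
    with the untreated control.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: the target amplifies 6 cycles earlier after treatment.
example = fold_change(ct_target_treated=20.0, ct_ref_treated=18.0,
                      ct_target_control=26.0, ct_ref_control=18.0)
print(example)  # values above 1 indicate upregulation, below 1 downregulation
```

Under this calculation, a 70-fold increase corresponds to a ΔΔCt of roughly -6.1 cycles.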

Gene Ontology Enrichment Analysis
To further explore their functions, the ClHSP20 genes underwent a Gene Ontology (GO) enrichment analysis. The enriched GO terms were grouped into the following three categories: biological process, molecular function, and cellular component (Figure 10 and Table S4). The significantly enriched GO term for molecular function was protein binding (GO:0005515). The ClHSP20 proteins were mainly located in the cytoplasm, plastid, and peroxisomal matrix. The main enriched biological process terms were hyperosmotic salinity response (GO:0042538), heat acclimation (GO:0010286), response to heat (GO:0009408), response to hydrogen peroxide (GO:0042542), and response to high light intensity (GO:0009644). The results of the GO enrichment analysis suggested that the ClHSP20 genes have significant roles in various stress responses.

Discussion
The HSP20 gene family has been identified in Arabidopsis, rice, soybean, wheat, barley, tomato, pepper, and switchgrass [2,22,25,27-29,34,35]. The number of HSP20 gene family members ranges from 13 (barley) to 63 (switchgrass). In this study, we identified 44 ClHSP20s in watermelon, fewer than in switchgrass (63), soybean (51), and cucumber (45), but more than in the other species (Table 1). Although the watermelon genome (425 Mb) is larger than those of Arabidopsis (125 Mb) and cucumber (367 Mb), the sizes of many gene families in watermelon are similar to those in cucumber but slightly smaller than those in Arabidopsis [36][37][38]. This is probably because watermelon avoided more recent whole-genome duplication events, except for the core-eudicot common hexaploidization (ECH) event [39,40]. However, the number of HSP20s in watermelon found in this study is twice that in Arabidopsis, which appears to contradict the above findings [39] but is consistent with the latest research showing that a cucurbit-common tetraploidization (CCT, 90-102 Mya) event occurred in the Cucurbitaceae shortly after the ECH event (115-130 Mya) [33]. The 13 tandemly duplicated gene pairs of ClHSP20s arose 6.65 to 135.02 Mya, probably as a result of gene losses and chromosomal rearrangements after the CCT event. However, there are relatively low rates of preserved CCT colinear genes in the watermelon genome (2.5-5.5%), which explains why many other watermelon gene families are quite small [36][37][38][39]. Watermelon has retained high genomic homology and shares more colinear genes with cucumber [33]. Therefore, we analyzed the colinear relationships between HSP20 genes in watermelon and cucumber in this study. Forty-two colinear gene pairs between the two species were identified (Figure 5). Some HSP20s in the watermelon genome had more than one syntenic gene in the cucumber genome. These watermelon colinear genes were distributed on all of the chromosomes, except for Chr01, with the largest number of colinear genes on Chr07. The existence of so many colinear genes and colinear blocks between the watermelon and cucumber genomes was probably due to another round of genomic repatterning after the CCT event [33].
The HSP20 genes are important for responses to various stresses [3,5,6,14,15]. In this study, GO enrichment results indicated that ClHSP20 genes were mainly enriched in the heat, salinity, hydrogen peroxide, and high light response pathways. The HSP20 genes are characterized by high and rapid upregulation in response to heat stress. Almost all HSP20s from soybean, pepper, and switchgrass exhibit upregulated expression in response to heat stress [21,28,29]. In watermelon, all HSP20s responded to heat stress, except for ClHSP39.8, ClHSP43.7, and ClHSP55.8. However, the expression levels of approximately half of the ClHSP20 genes were repressed by heat shock, especially ClHSP27.5, ClHSP18.1E, ClHSP16.1, ClHSP18.9B, ClHSP15.3, and ClHSP23.1B. The induced HSP20s could enhance heat tolerance by protecting proteins from irreversible denaturation, but the roles of the downregulated genes remain to be elucidated [41]. Infections by CGMMV are devastating for cucurbit crops and have been responsible for considerable watermelon yield losses [42]. However, little is known about the molecular mechanism underlying the CGMMV-induced watermelon disease. In this study, most watermelon HSP20s were clearly repressed from 6 h after CGMMV infection, following a slight induction at 1 h. Only a few reports have indicated that plant HSP20s are involved in viral responses. A microarray analysis indicated that Arabidopsis HSP17.4 was induced by infection with five plant viruses and was speculated to be part of a common mechanism in response to viral infection, but its homologous gene in watermelon, ClHSP18.1E, was repressed by CGMMV infection [43]. An RNA-Seq analysis found that two HSP20s from the ER subfamily, namely ClHSP23 and ClHSP21.6, were significantly repressed in CGMMV-inoculated watermelon fruits [44], and they showed similar responses in leaves in this study. These results suggest that ER subfamily genes in watermelon probably play negative roles in the response to CGMMV.
Recent studies have found that melatonin participates widely in plant biotic and abiotic stress responses, and some experts have even begun to consider it a plant hormone [45]. HSP20s have been shown to participate in stress responses via the MT signaling pathway, and the expression levels of tomato HSP20s increased under heat and cadmium stress after melatonin treatment [46,47]. In this study, the qRT-PCR results indicated that most of the watermelon HSP20 genes were induced by MT at 1 h but clearly repressed at 4 and 12 h. Similarly, most of the watermelon HSP20s were generally repressed by ABA treatment, just as the expression levels of nearly all switchgrass HSP20 genes are downregulated in response to an ABA treatment [29]. Notably, some ClHSP20s had similar response patterns to ABA, MT, and CGMMV, as they were generally repressed by the three treatments. Take the newly identified CXIII subfamily gene ClHSP55.8 and ClHSP15.3 as examples: following ABA and MT treatments, the ClHSP55.8 transcript abundance dramatically increased at 1 h, decreased at 4 h, and then increased to high levels at 12 h. Similarly, ClHSP55.8 expression was induced by CGMMV at 1 and 48 h, but was repressed at 6 h. In contrast, ClHSP15.3 expression was repressed by ABA, MT, and CGMMV, except for a transient increase at 1 h in response to ABA. These findings imply that these ClHSP20 genes probably participate in a common signaling pathway in response to MT, ABA, and CGMMV.
Plant HSP20 proteins have been detected in various cellular locations, including the cytosol, nucleus, chloroplast, endoplasmic reticulum, mitochondrion, and peroxisome [25]. This diversity in subcellular localization is probably due to the multiple functions of these proteins. To analyze the subcellular localization and distribution pattern of watermelon HSP20 proteins, six ClHSP20s fused with GFP at their N terminus were constructed and introduced into tobacco epidermal cells. When expressed alone in tobacco epidermal cells, the ClHSP20s displayed different subcellular localizations. The fluorescence signals of the ClHSP18.9A-, ClHSP17.4-, and ClHSP18.9B-GFP proteins were mainly detected in the endoplasmic reticulum and nucleus (Figure 6, Figure S2). ClHSP22.8, ClHSP15.9, and ClHSP15.3 formed different fluorescent granules in the cytoplasm. These granules were probably large oligomers, such as those reported to be formed in vitro by HSP20 proteins (Figure 6) [48]. Recently, a study found that the Rice stripe virus could alter the subcellular distribution of HSP20s in rice and tobacco [23]. In this study, some ClHSP20-GFP fusion proteins, including ClHSP15.9-GFP, ClHSP17.4-GFP, and ClHSP15.3-GFP, were affected by CGMMV, especially regarding the size, number, and distribution of the granules that they formed (Figure 7). The ClHSP15.9-GFP and ClHSP17.4-GFP proteins primarily accumulated in granules. Notably, ClHSP15.9-GFP proteins formed new cytoplasmic vesicles. In contrast, the fluorescence of ClHSP15.3-GFP proteins, which was initially detected in granules, was subsequently observed mainly in the membrane. The granules or vesicles detected in this study were similar to the stress granules that have been identified in animal cells in response to virus infection. The formation of stress granules is probably an anti-viral response [23,49]. A more thorough characterization of these granules will likely clarify the mechanism underlying ClHSP20 responses to viral infections.

Phylogenetic Analysis, Gene Structure Construction, and Motif Analysis
The similarity of HSP20s from Arabidopsis, watermelon, and cucumber was calculated using DNAStar software (Madison, WI, USA). Phylogenetic analysis based on full-length protein sequences was performed using the MEGA 5.0 program by the neighbor-joining (NJ) method with 1000 bootstrap replicates [50]. The structures of all watermelon HSP20 genes were analyzed via the Gene Structure Display Server (http://gsds.cbi.pku.edu.cn/). Motif analysis and annotation of conserved motifs in HSP20 proteins were conducted using MEME (http://meme.nbcr.net/meme/cgi-bin/meme.cgi). The predicted peptide sequences of conserved domains in HSP20 proteins were confirmed using the CDD (https://www.ncbi.nlm.nih.gov/Structure/bwrpsb/bwrpsb.cgi) and SMART (http://smart.embl-heidelberg.de/) databases. Multiple-sequence alignment of predicted peptide sequences of the conserved α-crystallin (ACD) domain was carried out using Clustal X v1.81 with default parameters [51].

Chromosomal Localization, Gene Duplication, and Evolutionary Analysis
All HSP20 genes were assigned to corresponding watermelon chromosomes based on the Cucurbit Genomics Database. Gene pairs that were separated by fewer than five intervening genes and shared ≥40% amino acid sequence similarity were considered to have undergone a tandem duplication event [52]. The synonymous (Ks) and non-synonymous (Ka) substitution rates were estimated as described by Tang et al., 2008 [53]. Synteny analysis was performed with the Cucurbit Genomics Database. CLUSTALW (http://www.genome.jp/tools/clustalw/) was used to align the amino acid sequences and corresponding CDS sequences of the HSP20 genes, and then Ks and Ka values were calculated using the codeml procedure of PAML (http://www.bork.embl.de/pal2nal/). The divergence time (T) of each duplicated gene pair was estimated from Ks and the synonymous substitution rate x (substitutions per synonymous site per year) as follows: T = Ks/2x (x = 6.56 × 10⁻⁹) [54]. Gene Ontology (GO) enrichment analysis was performed using AgBase (http://agbase.arizona.edu/cgi-bin/tools/index.cgi).
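As a minimal sketch of the two numerical criteria in this section, the following Python snippet applies the tandem-duplication thresholds and the divergence-time formula. It reads T = Ks/2x as Ks/(2x) and reports the result in million years; the function names and the unit conversion are illustrative assumptions, not part of the original analysis pipeline.

```python
# Illustrative sketch only: function names are hypothetical and are not
# taken from the software used in this study.

# x: synonymous substitutions per synonymous site per year (value given in the text)
SUBSTITUTION_RATE = 6.56e-9


def divergence_time_mya(ks: float, rate: float = SUBSTITUTION_RATE) -> float:
    """Return T = Ks / (2x), converted from years to million years (Mya)."""
    if ks < 0:
        raise ValueError("Ks must be non-negative")
    return ks / (2 * rate) / 1e6


def is_tandem_pair(intervening_genes: int, similarity_pct: float) -> bool:
    """Apply the stated criteria: fewer than five intervening genes
    and at least 40% amino acid sequence similarity."""
    return intervening_genes < 5 and similarity_pct >= 40.0


if __name__ == "__main__":
    # Made-up Ks value, used purely to show the call pattern.
    print(f"{divergence_time_mya(0.1312):.2f} Mya")
    print(is_tandem_pair(intervening_genes=2, similarity_pct=65.0))
```

Under this reading, a gene pair with a hypothetical Ks of 0.1312 would correspond to a divergence time of roughly 10 million years.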

Analysis of cis-Acting Elements in HSP20 Putative Promoter Regions in Watermelon
To identify the cis-elements in the promoter sequences of HSP20 genes in watermelon, the 1.5-kb upstream regions were obtained from the Cucurbit Genomics Database and analyzed using the online tools at the PlantCARE website (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/search_CARE.html).
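The coordinates of a 1.5-kb upstream region depend on the strand of the gene. The sketch below shows one way to compute them, assuming 1-based inclusive coordinates and measuring upstream from the annotated gene start; the text does not specify the reference point, so both conventions and the function name are hypothetical.

```python
from typing import Optional, Tuple


def upstream_region(
    gene_start: int,
    gene_end: int,
    strand: str,
    length: int = 1500,
    chrom_length: Optional[int] = None,
) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the region upstream of a gene.

    Coordinates are 1-based and inclusive. For a '+' strand gene the region
    lies before gene_start; for a '-' strand gene it lies after gene_end.
    The region is clipped to the chromosome boundaries, and None is returned
    if nothing remains after clipping.
    """
    if strand == "+":
        start, end = max(1, gene_start - length), gene_start - 1
    elif strand == "-":
        start, end = gene_end + 1, gene_end + length
        if chrom_length is not None:
            end = min(end, chrom_length)
    else:
        raise ValueError("strand must be '+' or '-'")
    return (start, end) if start <= end else None


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    print(upstream_region(5000, 6000, "+"))
    print(upstream_region(5000, 6000, "-"))
```

For a minus-strand gene, the extracted sequence would also need to be reverse-complemented before it is submitted for cis-element prediction.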

Watermelon Plant Growth and Treatments
The watermelon advanced inbred line 'JJZ-M' was used for expression analyses. The plants were grown in a growth chamber in temperature-controlled greenhouses under day/night temperatures of 28/22 ± 1 °C, a light intensity of 200 µmol·m⁻²·s⁻¹, and a 16-h light/8-h dark photoperiod. Three-week-old watermelon seedlings were used for stress and exogenous hormone treatments. For MT and ABA treatments, the seedlings were sprayed with 150 µM MT and 100 µM ABA, respectively [38,55]. The second true leaf on each plant was sampled at 0 (control), 1, 4, and 12 h after treatment. For the heat treatment, watermelon plants were exposed to 42 °C and the leaves were collected at 0 (control), 1, 4, and 12 h. For CGMMV infection, transformed Agrobacterium cells containing the infectious full-length cDNA of the CGMMV genome were incubated at 28 °C for two days. The cells were then resuspended in infiltration solution (500 mM acetosyringone, 10 mM MES, pH 5.8, and 10 mM MgCl₂) and injected into the abaxial side of watermelon leaves, as described by Voinnet et al., 2003 [56]. Infiltrated leaves were collected at 0 (control), 1, 6, and 48 h after infection. All treatments were repeated three times and each treatment contained 20 seedlings. All materials were frozen at −75 °C until RNA isolation.

RNA Isolation and qRT-PCR
Total RNA was extracted from samples using TRIZOL reagent (Invitrogen, Germany), according to the manufacturer's protocol. First-strand cDNA was generated from 1 µg of total RNA using the PrimeScript RT reagent kit (Takara, Japan), according to the manufacturer's instructions. Specific primers for qRT-PCR were designed using Primer 5 software, and each primer was searched against the watermelon database to ensure its specificity. The qRT-PCR reactions (reaction volume, 15 µL) were performed on a CFX96 Real Time System machine (Bio-Rad, USA), programmed to heat for 30 s at 95 °C, followed by 40 cycles of 5 s at 95 °C and 45 s at 55 °C, with a final cycle of 1 min at 95 °C, 30 s at 50 °C, and 30 s at 95 °C. Two biological and three technical replicates for each sample were analyzed using the SYBR Premix Ex Taq kit (TOYOBO, Japan). Watermelon β-actin (Cla007792) was selected as an internal control [57]. The relative gene expression level was calculated using the 2^(−ΔΔCt) method. The heatmap was generated from relative gene expression data using Multiple Array Viewer.
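The relative expression calculation can be sketched as follows. This is a minimal illustration of the standard 2^(−ΔΔCt) arithmetic, assuming approximately 100% amplification efficiency for both the target gene and the β-actin reference; the function name and the Ct values are hypothetical and are not data from this study.

```python
from statistics import mean
from typing import Sequence


def relative_expression(
    ct_target_treated: Sequence[float],
    ct_ref_treated: Sequence[float],
    ct_target_control: Sequence[float],
    ct_ref_control: Sequence[float],
) -> float:
    """Return the fold change 2^(-ddCt) of a target gene versus the control sample.

    Each argument holds the technical-replicate Ct values for one gene in one
    sample; replicates are averaged before the deltas are computed.
    """
    # dCt: target Ct normalized to the reference gene within each sample
    d_ct_treated = mean(ct_target_treated) - mean(ct_ref_treated)
    d_ct_control = mean(ct_target_control) - mean(ct_ref_control)
    # ddCt: treated sample relative to the 0 h control
    dd_ct = d_ct_treated - d_ct_control
    return 2 ** (-dd_ct)


if __name__ == "__main__":
    # Made-up Ct values (three technical replicates each), for illustration only.
    fold = relative_expression(
        ct_target_treated=[22.0, 22.1, 21.9],
        ct_ref_treated=[18.0, 18.0, 18.0],
        ct_target_control=[24.0, 24.0, 24.0],
        ct_ref_control=[18.0, 18.0, 18.0],
    )
    print(f"fold change: {fold:.2f}")
```

Values above 1 indicate induction relative to the 0 h control, and values below 1 indicate repression.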

Subcellular Localization Analyses
The CDS sequences of ClHSP22.8, ClHSP15.9, ClHSP17.4, ClHSP18.9A, ClHSP18.9B, and ClHSP15.3 were amplified using gene-specific primers and cloned into the pFGC-eGFP plasmid via the XbaI and BamHI restriction sites (Table S5), generating recombinant expression vectors in which the ClHSP20s were fused to the CDS of enhanced green fluorescent protein (eGFP). These plasmids were transformed into Agrobacterium tumefaciens GV3101 and transiently expressed in tobacco leaf cells with or without transformed Agrobacterium containing the infectious full-length cDNA clone of the CGMMV genome. The pFGC:eGFP empty vector served as the positive control. Images were acquired at 48 h using a Leica DMLE camera (Leica, Wetzlar, Germany).

Conclusions
HSP20s are the most abundant HSPs in plants and they play important roles in various biotic and abiotic stress responses. However, HSP20 genes in watermelon had not been systematically analyzed. Here, 44 HSP20 genes in watermelon were identified and their gene structures, conserved domains, phylogenetic relationships, chromosome evolution, expression profiles, and subcellular localizations were analyzed. All of the watermelon HSP20 proteins contained a conserved α-crystallin (ACD) domain. Plant HSP20s could be divided into 18 subfamilies, and a new subfamily, nucleo-cytoplasmic XIII (CXIII), was identified in this study. Of the 26 duplicated gene pairs in the watermelon genome, 13 arose by tandem duplication and 13 arose by segmental duplication. Numerous stress- and hormone-responsive cis-elements were detected in the putative promoter regions of the watermelon HSP20 genes. Almost all HSP20s from soybean, pepper, and switchgrass are induced by heat stress, but the qRT-PCR analysis showed that about half of the watermelon HSP20s were repressed by heat stress. Plant HSP20s displayed diverse responses to different virus infections in previous studies, and most of the ClHSP20s were generally repressed by Cucumber green mottle mosaic virus (CGMMV). Several ClHSP20s showed similar response patterns to ABA, MT, and CGMMV. Subcellular localization analyses of six selected HSP20-GFP fusion proteins revealed diverse subcellular targeting in the epidermal cells of N. benthamiana. The subcellular localization and distribution patterns of ClHSP20 proteins, especially the size, number, and distribution patterns of the granules formed by ClHSP20-GFP proteins, were markedly affected by CGMMV infection. This systematic analysis provides a foundation for elucidating the physiological functions and biological roles of the HSP20 family.