Characteristics and expression of lncRNA and transposable elements in Drosophila aneuploidy

Summary
Aneuploidy can globally affect the expression of the whole genome, which is detrimental to organisms. Dosage-sensitive regulators usually have multiple intermolecular interactions, and changes in their stoichiometry are responsible for the dysregulation of the regulatory network. Currently, studies on noncoding genes in aneuploidy are relatively rare. We studied the characteristics and expression profiles of long noncoding RNAs (lncRNAs) and transposable elements (TEs) in aneuploid Drosophila. We found that lncRNAs and TEs are affected by genomic imbalance and appear to be more sensitive to an inverse dosage effect than mRNAs. The expression patterns of several dosage-sensitive lncRNAs and TEs were examined during embryogenesis, and their biological functions in the ovary and testes were investigated using tissue-specific RNAi. This study advances our understanding of the noncoding sequences in imbalanced genomes and provides a novel perspective for the study of aneuploidy-related human diseases such as cancer.


INTRODUCTION
6][7] Aneuploidy is also frequently present in human cancers. 1,5][10] Not only is the expression of genes located on varied chromosomes (cis) regulated, but the genes in the remainder of the genome (trans) also have altered expression, [10][11][12][13] and the influence can affect the entire gene expression regulatory network. 14,15 The molecular basis of genomic imbalance caused by dosage changes of single chromosomes or chromosome segments is related to the disturbance of the balanced relationship of regulatory genes. 16,17 Changes in the stoichiometry of subunits of macromolecular complexes, mainly including transcription factors, chromatin proteins, and signal transduction components, 18 will alter their assembly kinetics and overall function. 19,20 This view, namely, the Gene Balance Hypothesis, has been confirmed by evolutionary genomics, quantitative trait genetics, and other evidence. 2][23] Currently, most of the gene expression studies in aneuploidy are conducted for protein-coding genes, while the modulation of noncoding genes has rarely been studied. 24,25
Long noncoding RNAs (lncRNAs) are defined as RNAs that are longer than 200 nucleotides and lack a significant open reading frame. 26 LncRNAs are mainly transcribed by RNA polymerase II, similarly to mRNAs (some can also be transcribed by RNA polymerase III). Pre-mature lncRNAs usually have 5′-end m7G caps and 3′-end poly(A) tails and undergo alternative splicing. LncRNAs have high diversity and low evolutionary conservation, but they usually have conserved secondary structures, which may be crucial to their functions. 27,28 9][30] Some lncRNAs can participate in modulations at epigenetic, transcriptional, and post-transcriptional levels through different mechanisms, and many lncRNAs are regulated by other genes. 27,28 Functional lncRNA transcripts can serve as signals, decoys, guides, and scaffolds to exert their cis or trans molecular functions. 31 In other cases, the transcription or splicing process of lncRNAs will have a transcription regulation function. ][39][40][41][42]
Another genomic component that is frequently overlooked is the transposable elements (TEs). 43 Almost all eukaryotes contain TEs, with variable abundance and diversity among species. 43 There are thousands of TE families in plants, and in some species, TEs constitute more
noteworthy that there is an excessive amount of TEs relative to chromosome length on the fourth chromosome, even more than the number of mRNAs (Figure 1B). This is because many TEs tend to accumulate in heterochromatic regions with low recombination rates, and the fourth chromosome is a specific transposon-dense chromosome. 43,69 The peak position of the transcript length distribution of TEs is lower than that of mRNAs and lncRNAs, but there are two small peaks at larger values, indicating that there is a polarization in the size of TEs (Figure 1D). The total expression of TEs in each sample accounts for 0.2%-0.3% of all mapped reads. Analysis of the TE families with the highest expression in each genotype reveals a similar picture (Figure 1H). Copia, INE-1, and roo, the three most highly expressed TE families, are all retrotransposons, and their expression accounts for approximately half of all TEs. More than half of the 20 most highly expressed TE families are LTR (long terminal repeat) retrotransposons, the second most abundant type is LINE-like TEs, and a few TIR (terminal inverted repeat) DNA transposons, represented by the 1360 family, are also highly expressed (Figure 1H).

There are dosage sensitivities of lncRNAs and TEs in aneuploidy
Previous studies have found that the global expression in Drosophila and other species is affected by genomic imbalance. The genes located on the varied chromosomes (cis) generally show dosage compensation, while the genes on unvaried chromosomes (trans) are mainly subject to negative modulation with chromosome number changes, which is called an inverse dosage effect. 12,13,64 To determine whether lncRNAs and TEs are regulated by dosage-dependent effects, we generated ratio distributions of expression changes for different types of transcripts in aneuploidies compared with diploid controls according to their chromosomal locations (Figure 2).
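The reference ratios used to read these distributions follow from simple arithmetic: with three copies instead of two, a proportional gene dosage effect gives 3/2 = 1.5, full compensation gives 1.0, and an inverse dosage effect gives 2/3, or about 0.67. The Python sketch below shows one way such a ratio distribution could be tabulated in bins of 0.1. The function name, the dictionary inputs, and the decision to skip transcripts with zero control expression are illustrative assumptions and not the procedure described in STAR Methods.

```python
from collections import Counter

# Reference ratios for a trisomy (3 copies versus 2 in the diploid control).
DOSAGE_EFFECT = 3 / 2    # expression proportional to copy number
COMPENSATION = 1.0       # expression unchanged despite the extra copy
INVERSE_EFFECT = 2 / 3   # expression inversely proportional to copy number


def ratio_distribution(aneuploid, diploid, bin_width=0.1):
    """Return {bin_start: percent of transcripts} for aneuploid/diploid ratios.

    `aneuploid` and `diploid` map transcript IDs to mean normalized
    expression (hypothetical inputs). Transcripts that are missing from, or
    have zero expression in, the control are skipped (an assumption).
    """
    counts = Counter()
    for transcript_id, expr in aneuploid.items():
        control = diploid.get(transcript_id, 0.0)
        if control <= 0:
            continue
        ratio = expr / control
        # The small epsilon keeps values such as 0.7 from falling one bin low
        # because of floating-point rounding.
        counts[int(ratio / bin_width + 1e-9)] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        round(index * bin_width, 10): 100.0 * n / total
        for index, n in sorted(counts.items())
    }
```

The bin with the highest percentage, for example `max(dist, key=dist.get)`, can then be compared against the three reference ratios.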
In trisomy chromosome 2 left arm (2L) females, the ratio distribution curves of X-linked lncRNAs and mRNAs are inversely regulated by the addition of an autosomal arm and concentrated at 0.67, as indicated by the vertical yellow dashed line, and there is no significant difference between lncRNAs and mRNAs (Figure 2A; Kolmogorov-Smirnov (K-S) test p value = 0.66). In contrast, the ratio distribution of mRNAs expressed from chromosome 2L is centered around the ratio of 1.0, as indicated by the vertical purple solid line, representing dosage compensation (Figure 2B). However, the ratio distribution of lncRNAs expressed from 2L is significantly different from that of mRNAs (K-S test p value = 4.4e-3). Although there is a high distribution around 1.0, the highest peak appears at a ratio of 0.7 (Figure 2B). This result indicates that most of the lncRNAs expressed from the varied chromosome can be further inversely regulated. At the same time, the distribution of lncRNAs also has a small peak at 1.5, indicated by the vertical yellow solid line, which represents lncRNAs that exhibit a proportional dosage effect without compensation. For lncRNAs and mRNAs expressed from other autosomes, the ratio distribution curves both shift to the left, with the peak of mRNA at 0.6 and the peak of lncRNA at 0.4 (Figure 2C; K-S test p value = 7.5e-6). Therefore, the trans lncRNAs and mRNAs, which are located on the unvaried chromosomes, both exhibit an inverse dosage effect, but lncRNAs are affected to a greater extent.
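The K-S comparisons quoted here ask whether two sets of ratios (for example, lncRNAs and mRNAs on the same chromosome) could come from the same distribution. As a minimal sketch of what that test measures, the Python below computes the two-sample K-S statistic as the largest gap between the two empirical cumulative distributions and converts it to an approximate p value. The function name is hypothetical, and the p value uses the asymptotic Kolmogorov series with a commonly used small-sample correction, which is an assumption here and may differ from the software the authors used.

```python
import math
from bisect import bisect_right


def ks_two_sample(x, y):
    """Return (D, approximate two-sided p value) for two samples of ratios."""
    if not x or not y:
        raise ValueError("both samples must be non-empty")
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)

    # D is the largest absolute difference between the two empirical CDFs,
    # checked at every observed value.
    d = 0.0
    for v in xs + ys:
        gap = abs(bisect_right(xs, v) / n - bisect_right(ys, v) / m)
        d = max(d, gap)

    # Asymptotic approximation: p = 2 * sum_{k>=1} (-1)^(k-1) exp(-2 k^2 lam^2).
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.3:
        # The series converges slowly here and the p value is essentially 1.
        return d, 1.0
    p = 2.0 * sum(
        (-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2) for k in range(1, 101)
    )
    return d, min(max(p, 0.0), 1.0)
```

A small p value indicates that the two ratio distributions differ in shape or position; it does not by itself say which transcript class is shifted further, which is read from the peak positions.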
The aneuploid effect in trisomy 2L males is different from that in females. The peaks of the distributions of X-linked lncRNAs and mRNAs are both centered around the ratio of 1.0 (Figure 2D; K-S test p value = 0.71), indicating no changes in expression. The peak of the ratio distribution curve of mRNAs expressed from 2L is slightly lower than 1.5 (Figure 2E), which indicates that the expression level of mRNA is compensated to some extent but not completely. The distribution of 2L lncRNAs is significantly different from that of mRNA (K-S test p value = 4.2e-7), with one peak near the ratio of 1.1, representing dosage compensation, and two secondary peaks near 0.7 (inverse dosage effect) and 1.5 (gene dosage effect). The peak of mRNAs located on other autosomes is at the ratio of 0.9 (Figure 2F), representing a slight trans-acting inverse dosage effect. The peak of lncRNAs is around 1.0, but there is a shoulder peak at the ratio of 0.8 and a small peak at a ratio of 0.5 (Figure 2F; K-S test p value = 1.3e-11). This suggests that a subset of trans lncRNAs is more inversely regulated than mRNAs in trisomy 2L males.
In addition to autosomal aneuploidy, we also analyzed the expression of lncRNAs and mRNAs in sex chromosome aneuploidy. For X-linked cis mRNAs, the distribution curve is around 1.0, indicating that the expression of most mRNAs is unchanged in XXX; AA metafemales (Figure 2G). The ratio distribution of cis lncRNAs is significantly different from that of mRNAs (K-S test p value = 2.5e-5), showing a multipeak shape, with the highest peak at 0.7-0.8, which may represent slightly excessive dosage compensation compared to the normal female level. There is a second peak at the ratio of 0.3, which may represent stronger inverse regulation, and a small peak at 1.1, which represents partial incomplete compensation (Figure 2G). The ratio distributions of trans lncRNAs and mRNAs are both biased to the left of 1.0 (Figure 2H), suggesting decreased expression levels due to the inverse dosage effect of the triple X genotype. However, lncRNAs are more left-shifted than mRNAs (K-S test p value = 7.6e-13); that is, the expression level of lncRNAs is further down-regulated.
Compared with trisomy 2L males, in trisomy 2L females and metafemales, cis genes appear to be more likely to show dosage compensation, while trans genes tend to show inverse dosage effects. Therefore, cis and trans genes are generally regulated in the same direction, which indicates that the trans genes affected by an inverse dosage effect and the cis compensated genes may be affected by related mechanisms. In addition, the inverse dosage effect seems to be stronger in females than in males, reflecting a sexual dimorphism of genomic imbalance. Similar to some mRNAs, lncRNAs are dosage-affected in aneuploid Drosophila, and there is dosage compensation of cis lncRNAs and inverse dosage effects of trans lncRNAs as described previously. 12,13,64 However, the responses of lncRNAs and mRNAs to the addition of an extra chromosome arm (whether autosomal or sex chromosome) are different, in that lncRNAs have a greater magnitude of expression modulation (Figures 2B, 2C, and 2E-2H). This result is also consistent with the comparisons of the mean CPM values of lncRNAs and mRNAs among samples (Figure 1F). Also, the sex chromosomes and autosomes have different characteristics in aneuploidy. In trisomy 2L Drosophila, only lncRNAs expressed from the X chromosome produce ratio distributions that are not significantly different from those of mRNAs (Figures 2A and 2D).
Furthermore, we examined whether lncRNAs have similar responses in other species. In the five trisomies of Arabidopsis (Figure S2), the peaks of the ratio distributions of cis mRNAs are mostly between 1.0 and 1.5, which represents partial compensation to different degrees. The number of cis lncRNAs is small, showing a scattered distribution (Figures S2A, S2C, S2E, S2G, and S2I). The ratio distributions of mRNAs and lncRNAs located on the unvaried chromosomes are significantly shifted to the left of 1.0 in all aneuploidies, in which lncRNAs have main peaks or secondary peaks with smaller abscissae than mRNAs (Figures S2B, S2D, S2F, S2H, and S2J). As for the ratio distributions of four trisomies of mouse embryonic stem cells, cis mRNAs have a shoulder peak at the position below 1.5 (Figures S3A and S3D), or have a main peak at the left side of 1.5 (Figures S3G and S3J). The distributions of cis lncRNAs are more left-shifted than those of mRNAs (K-S test p values <0.05), which is close to dosage compensation. The mRNAs expressed from the X chromosome are affected by autosomal aneuploidy, and their ratio distribution curves, centered at 1.0, form a shoulder peak on the left side; X-linked lncRNAs, on the other hand, produce a sharp peak at about 0.7, which represents the consequence of an inverse dosage effect (Figures S3B, S3E, S3H, and S3K). For the genes on unvaried autosomes, the ratio distributions show that the expression levels of most mRNAs in mouse trisomy cells have not changed, but lncRNAs have relatively more left-leaning curves (Figures S3C, S3F, S3I, and S3L), indicating that lncRNAs might be more sensitive to genomic imbalance. In another dataset of induced pluripotent stem cell-derived vascular endothelial cells (iPSC-derived ECs) of human trisomy 21 (Figure S4), it is found that, compared with wild type, or isogenic corrected disomy 21 cells, or both of them together, the ratio distribution curves of mRNAs and lncRNAs on chromosome 21 have main peaks at 1.0, which means that most cis genes are completely compensated. It is even found that lncRNAs have a peak around 0.4 when compared with isogenic corrected disomy 21 cells. The main peaks of mRNAs on the X chromosome and other autosomes are all located at 1.0, and there is a shoulder peak on the left (Figures S4B, S4C, S4E, S4F, S4H, and S4I). The distribution curves of trans lncRNAs have peaks whose abscissae are smaller than those of mRNAs (Figures S4B, S4E, and S4H), or peaks with an overall left deviation (Figures S4C, S4F, S4I; K-S test p values <0.05). Generally speaking, the datasets of three different species we have analyzed all show that lncRNA and mRNA can be affected by genome imbalance, and lncRNA seems to be more strongly affected by the inverse dosage effect than mRNA in aneuploidy. Therefore, the inverse dosage effect and the high dosage sensitivity of lncRNAs in aneuploidy are generalizable across taxa.
(G and H) Ratio distributions of the expression levels of lncRNA and mRNA located on chromosome X (G) and autosomes (H) in metafemales compared with wild-type females. (I-K) Ratio distributions of the expression levels of TE families in trisomy 2L females compared with wild-type females (I), trisomy 2L males compared with wild-type males (J), and metafemales compared with wild-type females (K). The vertical purple solid line represents the ratio of 1.00, the vertical yellow solid line represents the ratio of 1.50, and the vertical yellow dashed line shows the ratio of 0.67. The ratio distributions were generated as described in STAR methods, and the percentages of frequencies were plotted in bins of 0.1.
Due to the high conservation of DNA sequences among insertions of the same TE family, we combined all insertions by TE family and calculated their expression. The ratio distributions of TE families no longer distinguish cis and trans chromosomes. In trisomy 2L females, the two main peaks of the ratio distribution are observed at ratios of 1.0 and 0.7, representing the TEs with unchanged expression and the TEs that are subject to an inverse dosage effect (Figure 2I). The ratio distribution curve of trisomy 2L males is different from that of females (K-S test p value = 0.0103), with two main peaks located at ratios of 1.0 and 1.3 (Figure 2J), indicating complete and incomplete dosage compensation, respectively. Meanwhile, there is a small peak at 0.7, which indicates that some TEs are regulated by an inverse dosage effect. In metafemales, the main peak of the distribution is located at 0.7 (Figure 2K), indicating that the expression of most TEs is inversely regulated. The second highest peak is at 1.0, representing no change in expression levels or dosage compensation. Therefore, TEs are also dosage-affected in aneuploidy and regulated by inverse dosage effects to a certain degree.
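Combining insertions by family amounts to summing the read counts of every insertion that belongs to the same family before normalizing. The Python sketch below illustrates that aggregation under stated assumptions: the function name, the insertion IDs, the insertion-to-family mapping, and the use of counts per million (CPM) as the normalization are all illustrative and are not taken from the methods of this study.

```python
from collections import defaultdict


def family_cpm(insertion_counts, insertion_to_family, total_mapped_reads):
    """Sum read counts over all insertions of each TE family and scale to CPM.

    `insertion_counts` maps a (hypothetical) insertion ID to its read count,
    and `insertion_to_family` maps the same IDs to a TE family name.
    """
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    totals = defaultdict(float)
    for insertion_id, count in insertion_counts.items():
        totals[insertion_to_family[insertion_id]] += count
    scale = 1e6 / total_mapped_reads  # counts per million mapped reads
    return {family: n * scale for family, n in totals.items()}
```

Because the family-level value pools insertions from every chromosome, the resulting ratios cannot be assigned to cis or trans locations, which is why the TE distributions are reported without that split.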

Differential expression analysis of lncRNAs in aneuploid Drosophila
Next, we performed differential expression analysis for different types of transcripts in aneuploid Drosophila to provide a statistical description of the lncRNAs and mRNAs affected by aneuploidy. We identified 6,195, 3,121, and 5,300 differentially expressed mRNA (DE-mRNA) transcripts in trisomy 2L females, trisomy 2L males, and metafemales, respectively (Figure S5A). Trisomy 2L females and metafemales share more DE-mRNAs, while trisomy 2L females and trisomy 2L males have slightly fewer common DE-mRNAs (2,774 compared with 1,507). In terms of DE-mRNAs, aneuploidies of the same sex and different chromosome arms are more similar than aneuploidies of different sexes and the same chromosome arm. Furthermore, the number of DE-mRNAs shared by any two aneuploidies is significant (Fisher's exact test p value <2.2e-16), suggesting that multiple aneuploidies can affect the same genes. There are 829 common DE-mRNAs in all three aneuploidies (Figure S5A), and their clustering shows obvious regularity (Figure S5B). The expression levels of more than half of the common DE-mRNAs in aneuploidy are higher than those in diploid, but a large number of DE-mRNAs are still down-regulated (Figure S5B). Functional enrichment analysis of these common DE-mRNAs reveals that they are mainly involved in multiple biological metabolic processes and have various enzyme activities (Figure S5C). Their intracellular locations are mainly in the proteasome, peptidase complex, and CMG complex. In addition, the pathway analysis of these mRNAs also focuses on metabolism and the proteasome (Figure S5D). These results are consistent with the fact that aneuploidy is generally characterized by altered metabolism, reduced viability, abnormal protein production, and increased sensitivity to conditions interfering with protein synthesis, folding, and degradation. 3,4,70
For lncRNAs, we found 266, 108, and 239 differentially expressed transcripts in trisomy 2L females, trisomy 2L males, and metafemales, respectively (Figure 3A). The rules found with mRNAs seem to be equally applicable to lncRNAs. The number of differentially expressed lncRNAs (DE-lncRNAs) shared by trisomy 2L females and metafemales is greater than the number of DE-lncRNAs shared by trisomy 2L females and trisomy 2L males (81 compared with 31). Moreover, the intersection of any two groups of DE-lncRNAs is significant (Fisher's exact test p value <0.05), indicating that the modulation of lncRNAs in aneuploidy is also gene-specific. Seventeen lncRNAs are differentially expressed in the three aneuploidies (Figure 3B). To gain a better picture, we analyzed a total of 110 lncRNAs that are significantly differentially expressed in at least two comparisons (Figure S6).
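The significance of an overlap between two differentially expressed sets can be illustrated with the one-sided form of Fisher's exact test, which is the upper tail of a hypergeometric distribution: given a universe of tested transcripts, how likely is an overlap at least as large as the one observed if the two sets were drawn independently? The Python sketch below is a minimal version under stated assumptions. The function name is hypothetical, and the universe size in the usage example (2,500 tested lncRNAs) is a placeholder chosen only for illustration, since the number of tested transcripts is not given in this section.

```python
from math import comb


def overlap_p_value(overlap, size_a, size_b, universe):
    """One-sided Fisher's exact (hypergeometric) p value for a set overlap.

    Returns the probability that a random set of `size_b` transcripts drawn
    from `universe` shares at least `overlap` members with a fixed set of
    `size_a` transcripts.
    """
    if not (0 <= overlap <= min(size_a, size_b) and max(size_a, size_b) <= universe):
        raise ValueError("inconsistent set sizes")
    # Exact integer arithmetic for the tail, then a single division.
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, min(size_a, size_b) + 1)
    )
    return tail / comb(universe, size_b)


# Usage with the DE-lncRNA counts reported above and an assumed universe:
# 266 (trisomy 2L females), 239 (metafemales), 81 shared.
p = overlap_p_value(81, 266, 239, 2500)
```

Under that assumed universe, the expected overlap by chance is about 25 transcripts, so an observed overlap of 81 corresponds to a very small p value; the exact figure depends on the true number of tested lncRNAs.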
Volcano plots were drawn for all DE-lncRNAs, and the numbers of significantly up-regulated and down-regulated transcripts were counted (Figures 3C-3E). There are more up-regulated lncRNAs in trisomy 2L females and trisomy 2L males, but the number of significantly down-regulated lncRNAs is greater than the number of up-regulated lncRNAs in metafemales (Figures 3C-3E). Significantly down-regulated lncRNAs are found in all three trisomies, indicating that there is a subset of lncRNAs that are subjected to statistically significant inverse regulation, especially in sex chromosome aneuploidy. We labeled the DE-lncRNAs that have known GO functions in the plots (Figures 3C-3E), and listed their functions in the table (Figure 3F). The majority of them are related to spermatogenesis (for example, lncRNA:CR43306), and several others are related to growth and development (lncRNA:CR31781), gene expression regulation (lncRNA:let7C), and dosage compensation (lncRNA:roX1 and lncRNA:roX2) (Figure 3F). In addition, the expression levels of lncRNA:CR43306 and lncRNA:CR31781 are significantly up-regulated in two aneuploidies, while the expression levels of lncRNA:Hsromega, which is related to nuclear omega speck organization and the regulation of protein metabolism, are significantly down-regulated in two aneuploidies (Figures 3C-3E).

Potential interactors of lncRNAs affected by aneuploidy
The function of most DE-lncRNAs we identified is unknown. Since lncRNAs may exert their functions by interacting with other factors in cis or trans, 32,65 we predicted their functions by looking for interactors co-expressed or co-located with DE-lncRNAs. DE-lncRNA and DE-mRNA pairs with an absolute Pearson correlation coefficient >0.95 and a p value <0.05 were considered co-expressed. The lncRNA-mRNA pairs shared by at least two aneuploidies were selected to form a co-expression network, which contains 41 lncRNAs, 294 mRNAs, and a total of 568 edges (Figures 4A-4C). It can be observed that there are two large subnetworks in the whole co-expression network, which we call cluster 1 (Figure 4A) and cluster 2 (Figure 4B), in addition to some scattered small clusters (Figure 4C). Protein-protein interaction (PPI) analysis was performed for all DE-mRNAs in the network, and the connectivity degree of each mRNA was determined. Therefore, mRNAs with a higher degree, i.e., the redder nodes in Figure 4 (for example, Ubi-p63E encoding a polyubiquitin precursor, Su(var)205 as a structural component of heterochromatin, Ras85D as an essential component involved in the signal pathway regulating growth and development, etc.), may play a more critical role in the dysregulation of expression in aneuploidy.
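The co-expression criterion can be sketched as follows: compute the Pearson correlation between the expression profile of each DE-lncRNA and each DE-mRNA across samples, and keep pairs whose absolute correlation exceeds 0.95. The Python below is a minimal illustration with hypothetical function names and inputs. It applies only the correlation threshold; the additional p value filter (<0.05) used in the text is omitted here, and the step that keeps only pairs shared by at least two aneuploidies would be applied afterwards.

```python
import math
from itertools import product


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    sxx = sum(v * v for v in dx)
    syy = sum(v * v for v in dy)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile carries no correlation information
    return sum(a * b for a, b in zip(dx, dy)) / math.sqrt(sxx * syy)


def coexpressed_pairs(lncrna_expr, mrna_expr, threshold=0.95):
    """Return (lncRNA, mRNA, r) edges with |r| above the threshold.

    Both arguments map transcript IDs to per-sample expression values
    (hypothetical inputs, same sample order in both).
    """
    edges = []
    for (lnc, lnc_values), (mrna, mrna_values) in product(
        lncrna_expr.items(), mrna_expr.items()
    ):
        r = pearson(lnc_values, mrna_values)
        if abs(r) > threshold:
            edges.append((lnc, mrna, r))
    return edges
```

Using the absolute value keeps both positively and negatively correlated pairs, so an edge in such a network indicates a tight association in either direction rather than co-regulation in the same direction.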
Further analysis of the lncRNA-mRNA co-expression network shows that some lncRNAs have numerous co-expressed mRNAs; for example, lncRNA:roX2, which plays a role in the regulation of the X chromosome in Drosophila males, has 88 co-expressed mRNAs (Figure 4D). Many potential interactors may indicate the functions of these lncRNAs. On the contrary, most mRNAs have only one co-expressed lncRNA (Figure 4E). Next, functional enrichment analysis and pathway analysis were performed on the two large co-expression subnetworks (Figures 4F-4I). The results showed that the functions of cluster 1 are mainly related to microtubule-based processes, such as cilium movement, cilium organization, and sperm axoneme assembly (Figure 4F); some pathways are enriched by genes in cluster 1, including mitophagy, RNA degradation, amino acid metabolism, aerobic respiration, and so on (Figure 4G). Cluster 2 is mainly enriched in proteasome-mediated ubiquitin-independent protein catabolic process, proteasome assembly, and transcription initiation (Figure 4H). Its enriched pathways are proteasome, lysosome, and basal transcription factors (Figure 4I). Using interacting mRNAs to predict the functions of individual lncRNAs (Figure S7A), most of the lncRNAs in cluster 1 are assumed to function in cilium assembly and movement, sperm axoneme assembly, oxidative phosphorylation, biosynthesis of amino acids, mitophagy, and peroxisomes. The lncRNAs in cluster 2, represented by lncRNA:CR33938, are speculated to function in ubiquitin-dependent protein catabolic process, proteasome assembly, cell fate specification, and other important processes affecting development. For the entire lncRNA-mRNA co-expression network, the enriched functions or pathways with the names of genes that contributed to the enrichment were displayed using cnetplots (Figures S7B and S7C).
Furthermore, we analyzed DE-mRNAs close to DE-lncRNAs in genome location (within 10 kb), and constructed a co-located lncRNA-mRNA network. This network consists of 89 co-located lncRNA-mRNA pairs, but they are so dispersed that no major clusters are formed (not shown). Most lncRNAs have one or two co-located mRNAs, while each mRNA has only one co-located lncRNA (Figures S8A and S8B). Enrichment analysis revealed that the functions of genes in the co-localization network are mainly enriched in somatic muscle development, myoblast fusion, organic hydroxy compound metabolic process, and negative regulation of the Notch signaling pathway (Figure S8C). There were also enrichments of multiple metabolic pathways, such as fatty acid biosynthesis, amino sugar and nucleotide sugar metabolism, inositol phosphate metabolism, and carbon metabolism (Figure S8D).
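The co-location criterion can be illustrated with a simple interval comparison: a lncRNA and an mRNA on the same chromosome are paired when the gap between them is at most 10 kb. The Python sketch below is one possible reading of that rule. The function names and the coordinate tuples are hypothetical, and measuring the distance edge to edge (with overlapping genes counted as distance 0) and ignoring strand are assumptions, since the text does not specify how the 10 kb window was measured.

```python
def interval_gap(a_start, a_end, b_start, b_end):
    """Distance in bp between two intervals; 0 if they overlap or touch."""
    return max(0, max(a_start, b_start) - min(a_end, b_end))


def colocated_pairs(lncrnas, mrnas, max_gap=10_000):
    """Return (lncRNA, mRNA) pairs on the same chromosome within `max_gap` bp.

    Both arguments map transcript IDs to (chromosome, start, end) tuples
    (hypothetical inputs).
    """
    pairs = []
    for lnc_id, (lnc_chrom, lnc_start, lnc_end) in lncrnas.items():
        for mrna_id, (mrna_chrom, mrna_start, mrna_end) in mrnas.items():
            if lnc_chrom != mrna_chrom:
                continue
            if interval_gap(lnc_start, lnc_end, mrna_start, mrna_end) <= max_gap:
                pairs.append((lnc_id, mrna_id))
    return pairs
```

This all-against-all comparison is adequate for a few hundred DE-lncRNAs; for genome-wide annotations, sorting by chromosome and start coordinate would avoid most of the comparisons.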

Differentially expressed TE families and their interactors
We additionally performed differential expression analysis for TE families. We found 44, 31, and 43 differentially expressed TE families (DE-TEs) in trisomy 2L females, trisomy 2L males, and metafemales, respectively (Figure 5A). Similar to mRNA and lncRNA, the number of DE-TEs shared by trisomy 2L females and metafemales was greater than the number of DE-TEs shared by trisomy 2L females and trisomy 2L males (24 compared with 21), and the sets of differentially expressed families were significantly associated (Fisher's exact test p values <0.05). Fourteen DE-TEs were present in all three aneuploidies (Figure 5B), while 32 TE families were differentially expressed in at least two groups (Figure 5C). Furthermore, most of these DE-TEs are LTR or LINE-like retrotransposons (Figures 5B and 5C).
Subsequently, we performed co-expression analysis of these TE families with DE-mRNAs (Figures 5D-5F). The TE-mRNA co-expression network includes 9 TE families, 117 mRNAs, and 123 pairs of interactions. This network can also be further divided into two smaller subnetworks, cluster 1 and cluster 2 (Figures 5D and 5E), and some scattered clusters (Figure 5F). The DE-TE with the most potential interactors is an LTR transposon, HMS-Beagle, with 58 co-expressed mRNAs (Figure 5G). On the contrary, there are only one or two co-expressed TE families per mRNA, and most mRNAs have only one (Figure 5H). Enrichment analysis of cluster 1 with the core of HMS-Beagle showed that its functions are mainly concentrated in ubiquitin-dependent protein catabolic process, proteasome assembly, response to heat, and response to biotic stimulus (Figure 5I). Pathway analysis found that the proteasome, apoptosis, and Hippo signaling pathways are enriched (Figure 5J). The genes in cluster 2 are enriched in lipid modification, ATP metabolic process, purine nucleotide biosynthetic process, and other metabolic processes (Figure 5K). For cellular pathways, they are enriched in oxidative phosphorylation, drug metabolism, phagosome, etc. (Figure 5L). We also predicted the functions of individual TE families based on their co-expressed mRNAs (Figure S9A), and identified specific mRNAs that are responsible for the enriched functions and pathways (Figures S9B and S9C).

Expression patterns of dosage-affected lncRNAs and their interactors in aneuploid Drosophila embryos
To examine the developmental expression patterns of the dosage-affected noncoding genes identified previously, we designed probes (Figure S10) for several lncRNAs and their potential interactors, which were used for tyramide signal amplification (TSA)-based fluorescence in situ hybridization (FISH) of Drosophila embryos. With high resolution, sensitivity, and consistency, TSA-FISH can provide information about the temporal and spatial patterns of RNA expression and allow us to predict the function of RNAs based on their subcellular and subembryonic localization. 8 A total of sixteen candidate genes that may perform different functions were selected for detection, including co-expressed lncRNA-mRNA pairs, as well as a TE family that has the highest connectivity in the co-expression network (Table 1). Full results are shown in Data S1, including subembryonic and subcellular localization of probes and corresponding descriptions; the relative quantification of signal intensities at three developmental stages is also listed.
Stage 1-5 and Stage 6-11 include many important early developmental markers, such as the maternal-zygotic transition and the germband elongation. It was found that the zygotically expressed genes in early embryogenesis are enriched in transcription regulation, cell fate determination, tissue and organ development, and morphogenesis. 8 The candidate gene tsh is thought to regulate the development of the eye, leg, midgut, head, and prothoracic segment, and to participate in transcription regulation and the Wg signaling pathway. It is also one of the genes with interesting expression patterns in our FISH experiment. In the two early periods, the subembryonic localizations of tsh show segmented patterns, which are distributed in the midposterior and posterior parts of the embryos, respectively; and the localizations of tsh in wild type and trisomy 2L are similar (Figures 6A, 6A', 6B, 6B', and Data S1). Therefore, tsh may be a vital gene for maintaining early embryogenesis. For lncRNAs, although the functions of the majority are unknown, specific expression patterns can also be detected. For example,

Transposable element HMS-Beagle TE -
GID complex, glucose-induced degradation-deficient complex; RISC, RNA-induced silencing complex; CIP/KIP family, CDK-interacting protein/kinase inhibitory protein family; CycE-Cdk2 complex, Cyclin E/Cyclin-Dependent Kinase 2 complex; TE, transposable element.

Two other genes, CG3295 and HmgD, in the same co-expression cluster have different localizations in trisomy 2L and wild-type embryos. CG3295, which is involved in regulation of proteasomal ubiquitin-dependent protein catabolic process, loses its localization at posterior yolk plasm in the trisomy 2L blastoderm (Figures 6D, 6D', and Data S1), while HmgD, which is related to transcription regulation and chromatin organization, not only loses its posterior yolk plasm localization in trisomy 2L but also shows an apical enrichment that is not found in wild-type embryos (Figures 6E, 6E', and Data S1). The distribution changes of CG3295 and HmgD may lead to subsequent disturbance of ubiquitination and chromatin organization, which may be associated with the abnormal development of trisomy 2L. In addition, signals of HmgD are strongly enriched in the brain and ventral nerve cords of Drosophila embryos during the third period (Stage 12-17) (Figures 6F and 6F', and Data S1), suggesting a possible nervous system function of HmgD.
The expression patterns of genes in the other co-expressed clusters also reflect their biological functions. For example, the probe signals of ifc, which is related to spermatogenesis, are enriched in posterior yolk plasm and pole cells in Stage 1-5, indicating a reproduction-related function (Figures 6G, 6G', and Data S1). Su(var)205, which is associated with heterochromatin assembly, shows a pole cell exclusion pattern, and the posterior yolk plasm localization disappears (Figures 6H, 6H', and Data S1). Both nonA-l, which is related to mRNA splicing, and lncRNA:CR45916, which is involved in miRNA-mediated gene silencing, have small foci distributed on the surface of blastoderms, and both have intranuclear subcellular localizations (Figures 6M, 6L, and Data S1). Notably, several probes are enriched in posterior yolk plasm in wild-type embryos in St. 1-5, but not in trisomy 2L (Data S1). These common localization differences show that trisomy 2L has an impact starting in early embryonic development. The characteristic events of Drosophila embryonic development in the third period (St. 12-17) are dorsal closure, head involution, and midgut closure. Most of the candidate probes can be detected in the brain, midgut, hindgut, proventriculus, amnioserosa, and the medial side of the embryos during St. 12-17 (Data S1). This may be due to the full activation of the zygotic genome and generally active transcription of genes in late embryonic development, in preparation for entering the larval stage.
At the subcellular level, the distribution patterns of most probes (87.5%) are perinuclear in at least some stages (Data S1). tsh, lncRNA:CR45916, nonA-l, and the transposable element HMS-Beagle have intranuclear localization patterns (Figures 6K-6N). Among the nine co-expressed pairs determined by RNA-seq results, five pairs of lncRNA and mRNA have the same subcellular localization patterns, supporting the interaction relationship between these lncRNAs and mRNAs. In addition, there is no visible difference in the subcellular distributions of any candidate gene between wild-type and trisomy 2L embryos.
The expression levels of candidate genes in wild-type and trisomy 2L Drosophila embryos were analyzed according to the relative fluorescence intensities of probes. Compared with the wild type, the expression levels of most candidate genes changed significantly in trisomy 2L regardless of whether the probe distributions changed or not (Data S1). The expression changes of 75% (12/16) of the candidate genes in most developmental stages are consistent with the results of RNA sequencing (Figure 6O and Data S1). Furthermore, according to the RNA-seq results, the sixteen candidate genes contain nine groups of co-expression relationships, and five of these co-expression relationships are also observed in the FISH experiment, which supports the association between these lncRNAs and mRNAs. The differences between sequencing results and FISH quantification may be due to the different developmental stages examined.

Biological functions of candidate lncRNAs and TEs in Drosophila gonads
To further investigate the biological functions of candidate noncoding genes, we used the GAL4-UAS system to knock down several candidate lncRNAs and a TE in Drosophila gonads. The ovary of female Drosophila consists of dozens of ovarioles containing sequentially developing egg chambers, with the germarium at its anterior. The Sex-Lethal (SXL) protein is required for germline stem cells (GSCs) to enter the differentiation pathway. In the absence of SXL, germ cells will be blocked in a state between GSC and cystoblast. 71 According to immunofluorescence staining, SXL protein was only enriched in the cytoplasm of a few cells at the top of the germarium (Figures 7A-7G). There was no obvious difference in its location in the lncRNA:CR33938, lncRNA:CR45916, and TE HMS-Beagle ovarian-specific knockdown lines compared with the control. However, the relative quantity of SXL in the germarium of these RNAi Drosophila was down-regulated, especially in the lncRNA:CR33938 RNAi strain (Figure 7H). Studies have reported that lack of SXL will lead to continued proliferation and germline tumors that inappropriately express a set of male-specific markers. 72 In our experiment, lncRNA or TE knockdown lines showed reduced SXL levels, but no ovarian tumor phenotype was observed.
The development of egg chambers is divided into 14 stages, which can be determined by the morphology and size of the egg chamber. 73 In addition to the germline cells, the egg chambers also include a layer of somatic follicle cells. Follicle cells undergo mitosis until stage 5, after which they undergo the mitosis-endocycle (M/E) transition and begin to differentiate. When the mitotic marker H3Ser10P was used to indicate the dividing cells, the positive signal in the normal ovary was mainly distributed in the germarium and in follicle cells of the egg chambers before stage 6 (that is, before the M/E switch) (Figures 7A, 7B, 7D, and 7F). However, in the three ovarian-specific RNAi strains, the H3Ser10P signal of follicle cells persisted to stage 6 or even stage 7 (Figures 7C, 7E, and 7G). Therefore, knocking down lncRNA:CR33938, lncRNA:CR45916, or HMS-Beagle may interfere with the M/E transition of follicle cells and affect the development of the Drosophila ovary. Relative quantification of the signal intensity of H3Ser10P revealed that in the lncRNA:CR33938 RNAi strain, the level of H3Ser10P was significantly decreased in stage 4-5 egg chambers, but increased in stage 6-7 egg chambers (Figures 7I and 7J). This suggests that lncRNA:CR33938 may play a role in female reproduction during oogenesis, and that its deficiency not only reduces the level of SXL in the germarium but also delays the development of the egg chambers. For lncRNA:CR45916 and HMS-Beagle RNAi, H3Ser10P was significantly increased in both stage 4-5 and stage 6-7 egg chambers (Figures 7I and 7J), which indicates disruption of follicle cell proliferation and abnormal ovarian development.
Based on the results of RNA sequencing, both lncRNA:CR33938 and HMS-Beagle have more than fifty co-expressed mRNAs, many of which have developmental regulatory functions (Figures 4B, 4D, 5D, and 5G). For example, Prosalpha7, CSN3, and Ras85D, which are co-expressed with lncRNA:CR33938, and Atg7, ecd, and mats, which are co-expressed with HMS-Beagle, have ovary-related functions such as oogenesis and ovarian follicle cell or germ cell development. In addition, dap, which is co-expressed with lncRNA:CR45916, functions in oocyte fate determination and germline cell cycle switching. There is also a link between HMS-Beagle and the male-specific lethal (MSL) complex, as the expression of HMS-Beagle was significantly down-regulated in the transcriptome data of MSL2-overexpressing Drosophila (Figures 7K and 7L). These results indicate that the candidate lncRNAs and TE are important and may affect sex determination and dosage compensation, or gonad development and reproductive ability of Drosophila, by interacting with SXL, MSL2, and their co-expressed protein-coding genes.
Another candidate gene, lncRNA:CR44418, has 48 co-expressed mRNAs (Figures 4A and 4D), of which five (Atg8b, CG42355, Mst84Da, nsr, and Rcd7) have spermatogenesis-related functions, and the co-expressed genes are enriched for the GO term ubiquitin-dependent protein catabolic process (p value = 2.0e-12). Therefore, we examined the level of ubiquitin in the testes of lncRNA:CR44418 gonad-specific knockdown Drosophila. The ubiquitin signal was mainly concentrated in the apex, but it was also widely distributed in other parts of the testes (Figure 7M). Although the distribution of ubiquitin in the testes of lncRNA:CR44418 RNAi Drosophila was unchanged, its relative level was significantly down-regulated (Figure 7N). Since ubiquitination is required for chromatin remodeling and spermatid terminal differentiation (i.e., individualization) of Drosophila sperm, 74,75 reduced expression of lncRNA:CR44418 may lead to decreased ubiquitination in testes and impair spermatogenesis. Furthermore, we also observed a significant decrease of the H3Ser10P signal in RNAi testes (Figure 7O), implying a reduction in cell division and a disturbance in the development of male gonads.

DISCUSSION
[16][17] This phenomenon has been found in different taxa such as Drosophila, maize, Arabidopsis, Datura, and human cells. 10,11,16 Some hold the view that the expression of genes on the varied chromosomes will change proportionally, and that this is the primary basis of the deleterious phenotype of aneuploidy. 4,76 [11][12][13][77][78][79][80] A few studies have examined the expression changes of protein-coding genes in aneuploidy, [10][11][12][13]63,64,77 with a focus on several potential dosage-sensitive factors, mostly components of protein complexes. 16,64,77 In this study, we performed an analysis of the expression and characteristics of lncRNAs and TEs in aneuploid Drosophila, as well as their potential interactors. Their distribution patterns during embryonic development and biological functions in gonads were also investigated. We demonstrated that lncRNAs and TEs are dosage affected and may be involved in gene expression modulation and developmental control in aneuploidy.
The patterns of expression in aneuploidy were first detected in maize and are also applicable to other species. 16,81 Early studies of genomic imbalance revealed a phenomenon in which the genes without copy-number changes in aneuploidy are inversely modulated, which is called the inverse dosage effect. 81,82 Later, with the rapid development of high-throughput sequencing technology, comprehensive studies were carried out on different dosage series of segmental aneuploidies. 11,77,83 It was found that the expression of cis genes generally ranged from compensation to dosage effect, while the expression of trans genes mainly ranged from unchanged to negatively correlated with chromosome changes. 10,11,77 Trans genes tend to show an inverse dosage effect when cis genes approach dosage compensation. 11,77 Drosophila, with a relatively simple chromosome complement, has also been the subject of many in-depth studies on genomic imbalance. 16,18 In addition, these studies also revealed sexual dimorphism in aneuploid Drosophila, that is, differences between sex-biased and non-sex-biased genes, and the specificity of the response of sex chromosomes. 13,63 The fact that the inverse dosage effects act on the entire genome suggests that the dosage compensation of cis genes results from a simultaneous inverse modulation that counteracts the direct proportional dosage effect that might otherwise occur. 64,84,85
Previous aneuploidy studies have seldom distinguished between protein-coding and noncoding genes, and most have paid little attention to whether different types of genes have different responses. Only a few studies examined the dosage-dependent expression of some atypical RNAs; for example, serine-4 transfer RNA and copia retrotransposons were found to show dosage compensation. 86,87 A recent study examined the dosage sensitivity of miRNAs in a series of aneuploid maize, and found that miRNAs are affected as well and could act as regulators in aneuploidy. 25,88 We performed an analysis of mRNAs, lncRNAs, and TEs in aneuploid Drosophila and revealed that the expression of lncRNAs and TEs can also be affected by genomic imbalance. Similar to protein-coding genes, a subset of cis noncoding genes is dosage compensated, while trans noncoding genes mostly show down-regulation in trisomies. The down-regulation of lncRNAs is generally more significant than that of mRNAs, indicating that lncRNAs may be more strongly affected. The greater sensitivity of lncRNAs was documented in different species. Furthermore, the responses of lncRNAs are distinct between males and females in aneuploid Drosophila, and the expression in females tends to be lower, which represents a sexual dimorphism. The specific responses of lncRNAs on the X chromosome were also demonstrated. For TEs, although the expression levels of each insertion were not analyzed by chromosomal location, the overall expression of TE families also showed that TEs are dosage affected, with their expression mainly inversely regulated, and are sexually dimorphic.
It is reasonable that lncRNAs and TEs have dosage sensitivity similar to that of mRNAs. On the one hand, they share the same transcription mechanism with mRNAs. 89,90 [32][33][34] It has been found that dosage-dependent regulators are usually transcription factors, signal transduction components, and chromatin proteins, which have in common that they are members of macromolecular complexes or play roles in multiple interactions. 18 Among these regulators, the best studied is Inverse regulator-a (Inr-a or pcf11), a bridge between RNA polymerase II and nascent transcripts, whose gene copy-number variations will perfectly mimic aneuploid effects. 64,91,92 [22][23] In contrast, in small-scale duplications, genes of the same category are often underrepresented. 21,93 Genes whose products have more interactions with other molecules are more likely to be influenced by stoichiometry changes and participate in the regulatory networks in aneuploidy.
LncRNAs and TEs may act as dosage-sensitive regulators in imbalanced genomes. LncRNAs can have regulatory roles by interacting with proteins or other molecules. 32,90 For example, a large number of transcription factors interact with lncRNAs, which may be crucial for their functions. 94 Many lncRNAs are located on chromatin, and promote or inhibit the binding and activity of their interacting proteins at their targets. 28 Some lncRNAs are associated with chromatin remodeling complexes (such as polycomb repressive complexes PRC1 and PRC2), and further participate in chromatin remodeling and epigenetic regulation of gene expression through histone modifications. 90,95 In addition, lncRNAs can also form hybridization structures with DNA to influence chromatin accessibility, 28,90 or complementarily combine with pre-mRNA to affect mRNA splicing, editing, and stability. 90,96 Other lncRNAs can sponge miRNAs as competitive endogenous RNAs (ceRNAs), indirectly affecting the translation of mRNA. 28,90,97 With such complex intermolecular interactions, lncRNAs are likely to be important in gene regulatory networks and susceptible to the disruption of stoichiometric balance.
For TEs, their movement and accumulation may participate in the rewiring of the regulatory networks. 54 Studies have found that TEs form a large number of tissue-specific and alternative promoters in various species, 47 and nearly a quarter of the experimentally determined human promoters contain TE-derived sequences. 98 In mammals, TEs provide an average of 20% of binding sites for different transcription factors. 47 One-third of the transcription factors screened in human cells and tissues bind to the L1 retrotransposon. 99 Furthermore, many regulatory RNAs, like some miRNAs and lncRNAs, are derived from TE sequences. 47,58,59 Other studies suggest that TEs can also be used as a source of ceRNAs, combining with miRNAs. 100 These complicated relationships may make TEs also dosage sensitive.
Through differential expression analysis, we identified many mRNAs, lncRNAs, and TE families affected by aneuploidy. The enriched functions of common DE-mRNAs explain the reduced viability of aneuploids, consistent with previous studies. 3,4,64,70 The functions of most DE-lncRNAs are unknown, so we performed co-expression and co-localization analysis with protein-coding genes to determine their potential interactors. We found that the lncRNA-mRNA co-expression network contains two large clusters, and their functions are predicted to be mainly related to microtubule-based processes (cluster 1) and proteasome-mediated protein catabolism (cluster 2). Several lncRNA-mRNA pairs in the co-expression network were selected for subsequent experimental verification. For the differentially expressed TE families, we also established a co-expression network with DE-mRNAs. The TE family with the most interacting mRNAs is HMS-Beagle, whose functions are predicted to be related to ubiquitin-dependent protein catabolic processes and responses to heat and biotic stimulus.
Evidence shows that the expression of lncRNAs is finely regulated and has higher tissue, cell type, developmental stage, and biological context specificity than that of mRNAs. 35,36 In Drosophila, approximately 30% of lncRNAs have the highest expression in testis, 101 and most lncRNAs studied in embryonic development have dynamic expression patterns 35 or complex subcellular localization. 102 Moreover, a large number of lncRNAs are significantly up-regulated at the beginning of metamorphosis in Drosophila, indicating that the enrichment of lncRNAs is critical for developmental transformation and organogenesis. 65,103 Similar to lncRNAs, TEs are often expressed in tissue- and developmental stage-specific ways. 104,105 Highly conserved TE fragments in mammals tend to cluster around genes involved in development and transcriptional regulation. 106 A large number of developmentally regulated promoters derived from TEs are found in different species and drive the transcription of neighboring genes in a tissue- or stage-specific fashion. 47 In addition, endogenous retroviruses (ERVs) are dynamically expressed in human and mouse embryonic development, 61,107 and some TEs of Drosophila can regulate early embryogenesis by inducing the degradation of maternal mRNAs through piRNAs produced by them. 108 Since both lncRNAs and TEs seem to have important developmental control functions, we also explored the roles of lncRNAs, TEs, and their interactors in the embryonic development of aneuploid Drosophila.
We screened sixteen candidate genes from the differentially expressed lncRNA-mRNA pairs that may perform different types of functions, together with an important TE family, and detected their expression patterns in aneuploid Drosophila embryos. The results showed that most candidate genes have specific subembryonic and subcellular localizations in three stages of embryogenesis, and some show different distributions in aneuploidy and wild type. Several candidate genes have interesting localization patterns. For example, the probe signals of tsh, which has multiple developmental and transcriptional regulatory functions, show segmented patterns in early embryogenesis; HmgD, which is related to transcription regulation and chromatin organization, is enriched in the nervous system in late embryonic development; ifc, which is related to spermatogenesis, is enriched in pole cells of the blastoderm. Both nonA-l (related to mRNA processing) and lncRNA:CR45916 (involved in gene silencing) have intranuclear subcellular localization. Although the functions of most lncRNAs and TEs are unknown, the candidate lncRNAs and TE have specific localization patterns in embryonic FISH. Compared with wild type, the distributions of several probes are changed in early trisomic embryos, indicating that these genes may contribute to the abnormal development of aneuploid Drosophila. Furthermore, almost all of the candidate genes have significant differences in relative fluorescence intensities between aneuploid and wild-type embryos, and the directions are mostly consistent with the sequencing results. Through the analysis of RNA localization and relative expression, the potential relationship between lncRNAs and their paired mRNAs was confirmed. These results suggest that lncRNAs, TEs, and their interacting mRNAs can affect the development of aneuploid Drosophila embryos through spatiotemporally specific expression.
RNAi transgenic Drosophila lines were constructed to investigate the biological functions of some candidate lncRNAs and a TE. We knocked down lncRNA:CR33938, lncRNA:CR45916, and the TE HMS-Beagle in Drosophila ovaries. By immunofluorescence staining, it was found that the relative quantity of SXL protein, whose enrichment at the top of the germarium is required for GSC differentiation, was significantly reduced in the germarium in all three RNAi lines. Also, in mutant ovaries, the mitotic marker H3Ser10P tended to be distributed to later stage egg chambers, such as stages 6-7, whereas in control groups, H3Ser10P normally disappears in egg chambers at stage 6 when the follicle cells are entering the endocycle. Therefore, depletion of lncRNA:CR33938, lncRNA:CR45916, or HMS-Beagle may lead to abnormal ovarian development in Drosophila. Furthermore, the three genes all have co-expressed mRNAs with functions related to developmental regulation, and the interaction between them may be critical for their functions. The expression of HMS-Beagle is also affected by MSL2, which is implicated in Drosophila X chromosome expression. We also knocked down lncRNA:CR44418 in Drosophila testes. It was predicted to have functions related to spermatogenesis and ubiquitination. We found that the ubiquitin signal was significantly down-regulated in mutant testes, confirming the relationship between lncRNA:CR44418 and ubiquitination, and indicating that its reduction may affect male Drosophila fertility. These results all suggest that dosage-sensitive lncRNAs and TEs could have important biological functions.
In summary, our study provides a comprehensive picture of the characteristics and expression of lncRNAs and TEs in aneuploidy and demonstrates that lncRNAs and TEs, together with their interactors, are dosage-sensitive. Noncoding genes can also be involved in expression modulation and developmental control in aneuploid Drosophila. Moreover, based on the fact that aneuploidy is a feature of solid tumors 1,5 and that both lncRNAs and TEs are strongly associated with tumorigenesis, 42,49 studying the relationships among noncoding genes, unbalanced genomes, and aneuploidy-related human diseases may help to uncover the molecular mechanisms of disease and develop new therapies.

Specifically, samtools (version 1.15) 116 was used to process the bam files after HISAT2 alignment, and the multiple alignments were removed (samtools view -h -b -q 1 -F 256). Then, the filtered results were reassembled and quantified by using StringTie. TEs were selected from the genome annotation and the read counts of TE insertions of the same family were added to obtain the total expression of a TE family.
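The family-level summation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the insertion identifiers, and the counts are hypothetical, and the mapping from insertion to family is assumed to come from the genome annotation.

```python
from collections import defaultdict


def sum_te_family_counts(insertion_counts, insertion_to_family):
    """Add up the read counts of all TE insertions that belong to the same family."""
    family_totals = defaultdict(int)
    for insertion_id, count in insertion_counts.items():
        family_totals[insertion_to_family[insertion_id]] += count
    return dict(family_totals)


# Hypothetical toy input: insertion IDs and counts are invented for illustration.
insertion_counts = {"ins_1": 12, "ins_2": 30, "ins_3": 7}
insertion_to_family = {
    "ins_1": "HMS-Beagle",
    "ins_2": "HMS-Beagle",
    "ins_3": "copia",
}
family_counts = sum_te_family_counts(insertion_counts, insertion_to_family)
print(family_counts)
```

Summing at the family level avoids attributing reads to individual insertions, which is consistent with the statement in the Discussion that insertions were not analyzed by chromosomal location.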

Ratio distributions
The ratio distributions of transcript expression changes were generated as described previously. 63 The expression of protein-coding genes and lncRNA genes was normalized to CPM (counts per million). The expression of each TE family was normalized with the overall expression of TEs. Low-expression transcripts were filtered to retain only those with mean read counts greater than 5. Subsequently, the ratios of expression levels of different types of transcripts between aneuploidy and diploid were calculated, respectively. These ratios were plotted as distributions in bins of 0.1, with the vertical axis showing the percentage of the number of transcripts contained in each interval to the total number of transcripts of this type. The plots were generated using ggplot2 (version 3.3.6) 114 in the R program (version 4.2.1). 117
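As a rough illustration of the steps above (CPM normalization, filtering on mean read counts, ratio calculation, and binning in steps of 0.1), a minimal Python sketch follows. It is not the authors' R code. The function names and toy counts are hypothetical, and two details are assumptions: the mean-count filter is applied across all samples of both genotypes, and CPM is computed from the unfiltered library size.

```python
import math
from collections import Counter


def counts_to_cpm(sample_counts):
    """Scale one sample's raw read counts to counts per million (CPM)."""
    library_size = sum(sample_counts.values())
    return {t: c * 1_000_000 / library_size for t, c in sample_counts.items()}


def mean_by_transcript(samples):
    """Average a list of per-sample dicts, transcript by transcript."""
    return {t: sum(s[t] for s in samples) / len(samples) for t in samples[0]}


def ratio_distribution(aneuploid_samples, diploid_samples,
                       min_mean_count=5, bin_width=0.1):
    """Return {bin lower edge: percentage of transcripts} for aneuploid/diploid ratios."""
    # Assumption: the low-expression filter uses the mean raw count over all samples.
    mean_counts = mean_by_transcript(aneuploid_samples + diploid_samples)
    kept = [t for t, m in mean_counts.items() if m > min_mean_count]

    aneuploid_cpm = mean_by_transcript([counts_to_cpm(s) for s in aneuploid_samples])
    diploid_cpm = mean_by_transcript([counts_to_cpm(s) for s in diploid_samples])
    ratios = [aneuploid_cpm[t] / diploid_cpm[t] for t in kept if diploid_cpm[t] > 0]

    # The small offset guards against floating-point error at bin edges (e.g. 0.7 / 0.1).
    bins = Counter(math.floor(r / bin_width + 1e-9) for r in ratios)
    return {round(b * bin_width, 10): 100 * n / len(ratios)
            for b, n in sorted(bins.items())}


# Hypothetical toy data: one sample per genotype, equal library sizes.
diploid = [{"a": 100, "b": 200, "c": 2, "d": 698}]
trisomy = [{"a": 150, "b": 134, "c": 3, "d": 713}]
distribution = ratio_distribution(trisomy, diploid)
print(distribution)
```

In this toy input, transcript "c" is dropped by the filter, and the three remaining transcripts fall into separate bins, so each bin holds one third of the transcripts.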

Differential expression analysis and functional enrichment analysis
Differential expression analysis was performed for protein-coding genes and noncoding genes with mean read counts greater than 5 in each sample using the R package DESeq2 (version 1.36.0). 112 Adjusted p value <0.1 was set as the threshold for significantly different expression. The heatmaps were generated using ComplexHeatmap (version 2.12.1). 118 Enrichment analysis was performed by clusterProfiler (version 4.4.4) 113 using the annotation package org.Dm.eg.db (version 3.15.0) from Bioconductor (https://www.bioconductor.org/). The KEGG pathway data were obtained online (https://www.kegg.jp/, accessed on 27 August 2022). The differential expression of TE families was analyzed separately.
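To make the adjusted p value threshold concrete, the sketch below implements the Benjamini-Hochberg adjustment in plain Python. DESeq2 reports Benjamini-Hochberg adjusted p values by default, but the text does not state whether default settings were used, so treating this as the adjustment method is an assumption. The function name and the example p values are hypothetical, and the sketch omits DESeq2's independent filtering step.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity of the adjusted values.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p values for four transcripts.
raw_p = [0.01, 0.04, 0.03, 0.5]
adjusted_p = benjamini_hochberg(raw_p)
# Threshold taken from the text: adjusted p value < 0.1.
is_significant = [q < 0.1 for q in adjusted_p]
print(adjusted_p, is_significant)
```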

Screening of potential interactors of lncRNAs and TEs
For differentially expressed lncRNAs (DE-lncRNAs) or TEs (DE-TEs), we screened their possible interacting genes according to co-localization or co-expression. To find mRNAs that were highly related to the expression pattern of the DE-lncRNAs, we calculated the Pearson correlation coefficient between the DE-lncRNAs and the differentially expressed mRNAs (DE-mRNAs). LncRNA-mRNA pairs with absolute values of correlation coefficient >0.95 and p value <0.05 were considered to have interactions. In addition, there may be cis interactions between lncRNAs and mRNAs with adjacent genome coordinates. Therefore, the DE-mRNAs within 10 kb upstream and downstream of the DE-lncRNAs were identified with bedtools to form co-located lncRNA-mRNA pairs. We used the mRNAs in these gene pairs to infer the function of the corresponding lncRNA. The interaction networks of lncRNA or TE and mRNA were drawn using Cytoscape (version 3.7.1). 114

Fluorescence in situ hybridization (FISH) of Drosophila embryos
TSA-FISH of Drosophila embryos was performed as described. 63 In brief, the embryos of three different developmental stages from wild-type and aneuploid Drosophila were first collected according to the corresponding time window, fixed with formaldehyde, and stored in methanol at −20 °C. Next, primers containing flanking T7 promoter elements were designed for candidate genes (Figure S10A). After PCR amplification and in vitro transcription, antisense RNA probes containing digoxigenin-labeled UTP were obtained. The probes were tested by agarose gel electrophoresis (Figure S10B). Subsequently, the embryos were subjected to permeabilization, pre-hybridization, and hybridization according to the previously published protocol. 119 Probe signals were detected using the tyramide signal amplification (TSA) system. Finally, Drosophila embryos were placed in anti-fade mounting medium to make slides, and observed with a fluorescence microscope (Zeiss Investigated Fluorescence Microscopy Observer Z1) and a confocal microscope (ZEISS LSM880). For relative fluorescence intensity analysis, samples of different genotypes hybridized with the same probe were photographed using the same parameters. Fluorescence images were processed and analyzed using ImageJ (version 1.53c) (https://fiji.sc/).

Immunofluorescence of Drosophila ovaries and testes
Three-day-old females and one-day-old males were selected to dissect ovaries and testes. The tissues were fixed for 20 min in 4% paraformaldehyde at room temperature. Next, tissue samples were washed with PBTT (1× PBS with 0.1% Tween 20 and 0.3% Triton X-100) three times (15 min each time), and blocked with 3% BSA for 30 min. Subsequently, ovaries or testes were incubated with primary antibodies overnight at 4 °C. The following day, samples were first incubated with 1 μg/mL DAPI for 15 min and then washed with PBTT four times (20 min each time). Secondary antibodies were added to samples and incubated for 2 h at room temperature. After three 20 min washes with PBTT and three 5 min washes with PBS, slides were made with the anti-fade mounting medium. Microscopy images were acquired with a confocal microscope (ZEISS LSM880). The following antibodies were used in this study: Rabbit anti-H3Ser10P (1:300, EMD Millipore, 06-570); Mouse anti-SXL (1:100, Developmental Studies Hybridoma Bank, M18-s); Mouse anti-Ubiquitin (1:100, Santa Cruz Biotechnology, sc-8017); Alexa Fluor 488 and 594 (1:200, Jackson ImmunoResearch).

Transcripts with adjusted p value <0.1 were considered to be differentially expressed. Kolmogorov-Smirnov tests were used to compare the difference of ratio distributions between two groups, and a p value <0.05 was considered significant. Fisher's exact tests were used for enrichment analysis. The threshold of co-expressed lncRNA-mRNA pairs was set as absolute values of correlation coefficient >0.95 and p value <0.05. Co-located lncRNA-mRNA pairs were defined as being within 10 kb upstream and downstream. ImageJ was used to analyze the fluorescence intensity of photographs taken by confocal microscopy. Two-tailed Student's t-tests were performed to compare the relative fluorescence intensity, and the levels of significance are shown in figure legends. All statistical tests and plots were performed with R software.
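The two pairing criteria stated above (absolute Pearson correlation >0.95, and location within 10 kb upstream or downstream) can be illustrated with the following Python sketch. It is a simplified stand-in for the R and bedtools steps, not the authors' pipeline. All function names, gene identifiers, expression values, and coordinates are hypothetical. The accompanying p value <0.05 test on each correlation is omitted here, and the window is applied as a plain interval overlap, which is an assumption about how the 10 kb distance was measured.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    denominator = math.sqrt(sxx * syy)
    return sxy / denominator if denominator > 0 else 0.0


def coexpressed_pairs(lnc_expr, mrna_expr, min_abs_r=0.95):
    """Pairs whose |r| exceeds the threshold (the p value test is not included)."""
    pairs = []
    for lnc, x in lnc_expr.items():
        for mrna, y in mrna_expr.items():
            r = pearson_r(x, y)
            if abs(r) > min_abs_r:
                pairs.append((lnc, mrna, r))
    return pairs


def colocated_pairs(lnc_loci, mrna_loci, window=10_000):
    """Pairs on the same chromosome where the mRNA overlaps the lncRNA +/- window."""
    pairs = []
    for lnc, (lnc_chrom, lnc_start, lnc_end) in lnc_loci.items():
        for mrna, (chrom, start, end) in mrna_loci.items():
            if (chrom == lnc_chrom and start <= lnc_end + window
                    and end >= lnc_start - window):
                pairs.append((lnc, mrna))
    return pairs


# Hypothetical expression values across six samples.
lnc_expr = {"lncA": [1, 2, 3, 4, 5, 6]}
mrna_expr = {"m1": [2, 4, 6, 8, 10, 12], "m2": [6, 5, 4, 3, 2, 1],
             "m3": [1, 3, 2, 5, 4, 6]}
coexpressed = coexpressed_pairs(lnc_expr, mrna_expr)

# Hypothetical coordinates: name -> (chromosome, start, end).
lnc_loci = {"lncA": ("2L", 50_000, 52_000)}
mrna_loci = {"m1": ("2L", 58_000, 60_000), "m2": ("2L", 70_000, 75_000),
             "m3": ("X", 50_500, 51_000)}
colocated = colocated_pairs(lnc_loci, mrna_loci)
print(coexpressed, colocated)
```

Using the absolute value of the correlation keeps both positively and negatively correlated pairs, which matches the threshold as stated in the text.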

Figure 1. Characteristics of long noncoding RNA (lncRNA) and transposable elements (TEs) in Drosophila samples
(A) Percentage of different types of lncRNAs in Drosophila. (B) The number of lncRNA, mRNA, and TE insertions located on different chromosomes. (C) Density plot of exon numbers of lncRNA and mRNA. (D) Density plot of transcript length of lncRNA, mRNA, and TE. (E) Boxplots of the expression levels (CPM) of lncRNA and mRNA. (F) Mean expression of lncRNA and mRNA in each sample. The error bars indicate standard deviation (SD) of three biological replicates. (G) Percentage of different types of TEs in Drosophila. (H) Pie charts showing the proportion of the expression of each TE family to the total expression of TEs in each sample. The text colors of the top 20 TE families listed represent their classification: green represents LTR (long terminal repeat) retrotransposons, blue represents LINE-like TEs, yellow represents SINE-like TEs, and red represents TIR (terminal inverted repeat) DNA transposons. CF, wild-type female control; CM, wild-type male control; 2LF, trisomy 2L female; 2LM, trisomy 2L male; XXX, metafemale.

Figure 2. Distributions of the expression ratios of lncRNA and TE family in aneuploid Drosophila
(A-C) Ratio distributions of the expression levels of lncRNA and mRNA located on chromosome X (A), 2L (B), and other autosomes (C) in trisomy 2L females compared with wild-type females. (D-F) Ratio distributions of the expression levels of lncRNA and mRNA located on chromosome X (D), 2L (E), and other autosomes (F) in trisomy 2L males compared with wild-type males. (G and H) Ratio distributions of the expression levels of lncRNA and mRNA located on chromosome X (G) and autosomes (H) in metafemales compared with wild-type females. (I-K) Ratio distributions of the expression levels of TE families in trisomy 2L females compared with wild-type females (I), trisomy 2L males compared with wild-type males (J), and metafemales compared with wild-type females (K). The vertical purple solid line represents the ratio of 1.00, the vertical yellow solid line represents the ratio of 1.50, and the vertical yellow dashed line shows the ratio of 0.67. The ratio distributions were generated as described in STAR methods, and the percentages of frequencies were plotted in bins of 0.1.

Figure 3. Differential expression analysis of lncRNA in aneuploid Drosophila
(A) Venn diagram of the number of differentially expressed lncRNAs (DE-lncRNAs) in trisomy 2L female, trisomy 2L male, and metafemale. (B) Clustering heatmap of lncRNAs differentially expressed in all three aneuploidies. (C-E) Volcano plots of DE-lncRNAs in trisomy 2L female (C), trisomy 2L male (D), and metafemale (E). The positions of DE-lncRNAs with known functions are marked in the plots. The text in the upper left corner gives the numbers of up- and down-regulated lncRNAs. (F) List of lncRNAs with known functions that are differentially expressed in at least one aneuploidy. CF, wild-type female control; CM, wild-type male control; 2LF, trisomy 2L female; 2LM, trisomy 2L male; XXX, metafemale.

Figure 4. Co-expressed differentially expressed lncRNAs and mRNAs
(A-C) Network of co-expressed lncRNAs and mRNAs. The triangular nodes represent lncRNAs, and the round nodes represent mRNAs. The color of the round nodes indicates the degree of this mRNA in the protein-protein interaction (PPI) network composed of DE-mRNAs, with red indicating high connectivity and green indicating low connectivity. The lncRNA and mRNA connected by edges are co-expressed (the absolute value of Pearson correlation coefficient >0.95 and p value <0.05). The entire co-expression network can be divided into two larger clusters (A and B), and a number of smaller, dispersed clusters (C). (D) The top 10 lncRNAs with the largest number of co-expressed mRNAs. (E) The number of co-expressed lncRNAs per mRNA. (F) GO functional enrichment analysis of mRNAs involved in the first large cluster of the co-expression network. (G) KEGG pathway enrichment analysis of mRNAs involved in the first large cluster of the co-expression network. (H) GO functional enrichment analysis of mRNAs in the second large cluster of the co-expression network. (I) KEGG pathway enrichment analysis of mRNAs in the second large cluster of the co-expression network.

Figure 5. Co-expressed differentially expressed TE families and mRNAs
(A) Venn diagram of the number of differentially expressed TE families (DE-TEs) in trisomy 2L female, trisomy 2L male, and metafemale. (B) Clustering heatmap of TE families differentially expressed in all three aneuploidies. (C) Clustering heatmap of TE families differentially expressed in at least two groups of comparisons. The text colors of the TE families represent their classification, with green indicating LTR retrotransposons, blue indicating LINE-like TEs, and red indicating TIR transposons. CF, wild-type female control; CM, wild-type male control; 2LF, trisomy 2L female; 2LM, trisomy 2L male; XXX, metafemale. (D-F) Network of co-expressed TE families and mRNAs. The diamond nodes represent TE families, and the round nodes represent mRNAs. The color of the round nodes indicates the degree of this mRNA in the PPI network composed of DE-mRNAs, with red indicating high connectivity and green indicating low connectivity. The TE family and mRNA connected by edges are co-expressed (the absolute value of Pearson correlation coefficient >0.95 and p value <0.05). The entire co-expression network can be divided into two larger clusters (D and E), and a number of smaller, dispersed clusters (F). (G) TE families sorted by the number of their co-expressed mRNAs. (H) The number of co-expressed TE families per mRNA. (I) GO functional enrichment analysis of mRNAs involved in the first large cluster of the co-expression network. (J) KEGG pathway enrichment analysis of mRNAs involved in the first large cluster of the co-expression network. (K) GO functional enrichment analysis of mRNAs in the second large cluster of the co-expression network. (L) KEGG pathway enrichment analysis of mRNAs in the second large cluster of the co-expression network.

Figure 6. Expression patterns of candidate genes in embryonic development of Drosophila
(A-J) Subembryonic distribution of probes in wild-type Drosophila. (A'-J') Subembryonic distribution of probes in trisomy 2L Drosophila. The name of the gene is shown on the left of the picture, and the genotype is shown above. Red, probe; green, DAPI. Arrowheads indicate regions of enriched or differential probe signal. Scale bars, 50 μm. (K-N) Subcellular localization of probe signals. Probe names are written in the lower left corner of the picture. Red, probe; green, DAPI. Arrowheads indicate the foci of probe signals. Scale bars, 10 μm. (O) Expression changes of candidate genes in aneuploid Drosophila detected by RNA-seq and TSA-FISH. The colors of the heatmap represent log2 fold changes in trisomy compared with wild type. 2LF, trisomy 2L female; 2LM, trisomy 2L male; XXX, metafemale. FISH data represent the mean value of 10 embryos.

Figure 7. Immunofluorescence staining of ovaries and testes from lncRNAs and TE knockdown Drosophila
(A-G) Immunofluorescence staining of Drosophila ovaries. Each multi-color fluorescent photograph focuses on one ovariole whose egg chambers are marked with the development stages, and an asterisk indicates the germarium. The genotypes are annotated on the left side of each row. Red, SXL; green, H3Ser10P; blue, DAPI. The red dashed squares indicate germarium and are enlarged in the middle panels. The green dashed rectangles indicate egg chambers and are enlarged in the right panels. Scale bars, 25 μm. (H) Relative quantification of SXL at the top of germarium based on fluorescence intensity. Sample size ≥12. (I and J) Relative quantification of H3Ser10P in stage 4-5 (I) and stage 6-7 (J) egg chambers based on fluorescence intensity. Sample size ≥12. (K and L) Scatterplots of gene expression after overexpression of MSL2 in female (K) and male (L) Drosophila. The axes indicate the average expression of a gene, in the form of regularized-logarithm transformation. Red and blue dots represent significantly up-regulated and down-regulated genes, respectively (padj < 0.1). DEGs in the candidate genes mentioned above and MSL complex are marked with their symbols. (M) Immunofluorescence staining of Drosophila testes. The apex of the testes, indicated by the arrows, is shown in the middle and right panels. The genotypes are annotated on the left. Red, Ubiquitin; green, H3Ser10P; blue, DAPI. Scale bars, 50 μm. (N and O) Relative quantification of Ubiquitin and H3Ser10P in Drosophila testes based on fluorescence intensity. Sample size ≥4. Student's t test *p < 0.05, **p < 0.01, ***p < 0.001.

Table 1. Information of candidate genes