Potential pitfalls of modelling ribosomal RNA data in phylogenetic tree reconstruction: Evidence from case studies in the Metazoa

Background Failure to account for covariation patterns in helical regions of ribosomal RNA (rRNA) genes has the potential to misdirect the estimation of the phylogenetic signal of the data. Furthermore, the extremes of length variation among taxa, combined with regional substitution rate variation can mislead the alignment of rRNA sequences and thus distort subsequent tree reconstructions. However, recent developments in phylogenetic methodology now allow a comprehensive integration of secondary structures in alignment and tree reconstruction analyses based on rRNA sequences, which has been shown to correct some of these problems. Here, we explore the potentials of RNA substitution models and the interactions of specific model setups with the inherent pattern of covariation in rRNA stems and substitution rate variation among loop regions. Results We found an explicit impact of RNA substitution models on tree reconstruction analyses. The application of specific RNA models in tree reconstructions is hampered by interaction between the appropriate modelling of covarying sites in stem regions, and excessive homoplasy in some loop regions. RNA models often failed to recover reasonable trees when single-stranded regions are excessively homoplastic, because these regions contribute a greater proportion of the data when covarying sites are essentially downweighted. In this context, the RNA6A model outperformed all other models, including the more parametrized RNA7 and RNA16 models. Conclusions Our results depict a trade-off between increased accuracy in estimation of interdependencies in helical regions with the risk of magnifying positions lacking phylogenetic signal. We can therefore conclude that caution is warranted when applying rRNA covariation models, and suggest that loop regions be independently screened for phylogenetic signal, and eliminated when they are indistinguishable from random noise. In addition to covariation and homoplasy, other factors, like non-stationarity of substitution rates and base compositional heterogeneity, can disrupt the signal of ribosomal RNA data. All these factors dictate sophisticated estimation of evolutionary pattern in rRNA data, just as other molecular data require similarly complicated (but different) corrections.


Results:
We found an explicit impact of RNA substitution models on tree reconstruction analyses. The application of specific RNA models in tree reconstructions is hampered by interaction between the appropriate modelling of covarying sites in stem regions, and excessive homoplasy in some loop regions. RNA models often failed to recover reasonable trees when single-stranded regions are excessively homoplastic, because these regions contribute a greater proportion of the data when covarying sites are essentially downweighted. In this context, the RNA6A model outperformed all other models, including the more parametrized RNA7 and RNA16 models.

Conclusions:
Our results depict a trade-off between increased accuracy in estimation of interdependencies in helical regions with the risk of magnifying positions lacking phylogenetic signal. We can therefore conclude that caution is warranted when applying rRNA covariation models, and suggest that loop regions be independently screened for phylogenetic signal, and eliminated when they are indistinguishable from random noise. In addition to covariation and homoplasy, other factors, like non-stationarity of substitution rates and base compositional heterogeneity, can disrupt the signal of ribosomal RNA data. All these factors dictate sophisticated estimation of evolutionary pattern in rRNA data, just as other molecular data require similarly complicated (but different) corrections.

Background
Progress of molecular techniques has eased the use of genomic data for phylogenetic analyses. Nevertheless, whole genomes are currently available for relatively few metazoans. Molecular studies of phylogenetic relationships within higher taxonomic groups, e.g. at the intra-ordinal level, therefore still rely on individual genes, among which the nuclear and mitochondrial ribosomal RNA genes are the most frequently sequenced. A pattern of highly variable positions, nested within conserved, slowly substituting sites across the alignment, yields a valuable resource for studying phylogenetic relationships of both recent and ancient splits [1][2][3][4]. This, combined with the ease of amplification, has lead to a widespread use of rRNA genes in phylogenetics and furthermore uncovered several specific properties of these genes, which should be considered, using these sequences as phylogenetic markers. Paired regions in rRNA sequences evolve via selectively neutral substitutions in the form of compensatory mutations [5] to maintain energetically stable secondary structures. Additionally, a strong bias of nucleotide composition between paired and unpaired areas has been observed [5,6]. Ribosomal RNA loop and stem regions are therefore subject to very different selectional regimes, which can hamper the use of rRNA genes for phylogenetic purposes [2,5,[7][8][9][10][11]. In particular, correlated variation of nucleotides in stem regions, has been suggested to corrupt phylogenetic analyses as these covariation patterns of paired sites do not display independent phylogenetic signal. Ignoring this correlation results in an overestimation of phylogenetic information of these sites, which can lead to inflated measurements of tree robustness [12,13]. As a solution, rRNA secondary structural information as independent set of characters have been advocated to aid tree reconstruction by the use of specific RNA substitution models [12,[14][15][16][17]. Application of RNA substitution models in phylogenetics is still confined to a few studies [11,[18][19][20][21][22][23][24][25][26][27], most of them emphasing improvement of the analyses. In contrast, a recent study on hexapod phylogeny found mixed RNA/DNA model setups leading to a higher sensitivity to systematic problems ("long-branch attraction") [28]. This has been suggested as a result of potential homoplasy in loop positions. RNA models virtually downweight stem partitions, leading to an increased impact of loops. If these loop positions are saturated and/or misaligned, this "noisy" signal might dominate the phylogenetic signal of unsaturated stem positions and lead to inaccurate tree reconstruction.
In the present study, we want to test this hypothesis by comparing the performance of mixed RNA/DNA model setups in the tree reconstruction of different ribosomal RNA data sets with the level of relative homoplasy in loop and stem positions. Current studies on the topic of modelling rRNA data in tree reconstruction have utilised simulation analyses [28,29], which can generally be seen to be a sophisticated complement to empirical studies in order to test hypotheses in algorithmically rooted phylogenetics. However, in Letsch et al. [28], tree reconstructions on simulated data were not able to reveal a potential correlation between homoplasy and data modelling. Consequently, the present analyses were based on case studies. Eight ribosomal RNA data sets were initially compiled, using from one to three mitochondrial and/or nuclear ribosomal RNA gene partitions, covering a broad spectrum of phylogenetic levels (Echinodermata (18S), Tunicata (18S), Heterobranchia (18S) Chilopoda (18S), Hexapoda (18S + 28S), Mammalia (12S + 16S), Primates (12S + 16S) and Anisoptera (12S + 16S + 28S)). All data sets were aligned with the RNASALSA alignment software [30], considering rRNA secondary structures. Ambiguously aligned positions where identified and excluded prior to the tree reconstruction. Based on the complete aligned data sets, we further conducted Maximum Likelihood (ML) tree reconstructions with the RAxML v7.2.6 software package [31][32][33] with (1) a standard DNA model setups and (2) 13 mixed RNA/DNA model setups. In the latter, loop positions are covered by a standard DNA model and stem positions are covered by a specific RNA model. Performance of different model setups was compared according to recent morphological and molecular expectations of taxonomy. To test the relative homoplasy between stem and loop regions, all alignments were divided into unpaired (loop) and paired (stem) positions according to a consensus secondary structure. Both partitions were then separately tested for homoplasy by estimating the level of substitutional saturation. Additionally, ML analyses were conducted on loop and stem partitions separately and the results were compared to the trees from the combined data sets. The analyses setup is depicted in Figure 1.

Phylogenetic analyses
In the following, we represent and discuss the results of Echinodermata, Tunicata and Mammalia data sets as

Maximum Likelihood
Test for potential homoplasy Loops Stems examples. Discussion on the results of all other data sets are provided in Additional file 1. To investigate the impact of homoplasy in loop regions on the behaviour of mixed RNA/DNA models setups in the tree reconstruction, the tree reconstruction results of all 13 RNA models were compared to the trees relying on the DNA model setup. Trees were evaluated in comparisons with recent morphological and molecular understanding of the accordant group, where we mainly focus on "benchmark clades" to check the reliability of each model setup. Clades were defined as "benchmark clades", if they have repeatedly received support in previous studies, based on independent morphological and/or molecular data.

Phylogeny of echinoderm classes
Echinodermata is divided into five extant classes, the Crinoidea (sea lilies), Ophiuroidea (brittle stars), Asteroidea (starfishes), Holothuroidea (sea cucumbers) and Echinoidea (sea urchins). Monophyly in these five classes is well founded [34], whereas the relationships among them are still debated. Nevertheless, there is some consensus regarding major aspects of echinoderm phylogeny [34][35][36]. Crinoids are seen as the most basal split within Echinodermata, forming the sister group to the four remaining classes (Eleutherozoa). Furthermore, there is strong support for a sister group relationship of echinoids and holothurians (Echinozoa). Debates on the phylogenetic position of the stellate forms (starfishes and brittle stars) revolve around two competing hypotheses: are the ophiurids alone sister group to Echinozoa [37,38] or do asteroids and ophiuroids form a clade (Asterozoa), which is then the sister taxon to Echinozoa [34]? The above outlined hypotheses are only reflected by the results of the GTR and the RNA6A model setups. These trees all show basal Crinoidea and Eleutherozoa divided into Asterozoa and Echinozoa. In contrast, all other mixed RNA/DNA model setups show either Holothuroidea or a clade of Holothuroidea + Echinoidea the sister taxon to the rest of Echinodermata (Figure 2), thus clearly contradicting current expectations of echinoderm phylogeny.

The position of Appendicularia within Tunicata
Molecular approaches to the phylogeny of Tunicata are generally hampered by the base composition biases and elevated substitution rates in Aplousobranchia, Appendicularia and Mogulidae (Stolidobranchia). Appendicularia retain larval characters throughout their lifespan, which made an understanding of their phylogeny crucial for understanding the evolution of body plans and developmental modes in Tunicata [27]. Recent molecular studies using phylogenomic or rRNA data to target the phylogeny of Tunicata usually recover Appendicularia as sister group to all other tunicate groups [39][40][41][42]. However, this position is suspected to be a result of a "long branch attraction" artefact, due to genome-wide elevated substitution rates in this group [27,40]. As an alternative, Appendicularia as sister to Stolidobranchia has been recovered through analyses of 18S rRNA genes [27,39,43,44]. However, this position was generally weakly supported and has been discussed as a possible result of base composition bias in Appendicularia and Mogulidae, a family of Stolidobranchia [27]. These problems are reflected by the results of our study on tunicates ( Figure 3), which either show Appendicularia as first split within Tunicata (RNA7C, E, F and RNA16A model setups) or as sister group to Stolidobranchia (all other model setups). According to the currently unresolved position of Appendicularia, none of these alternatives can be chosen as superior.

Relationships within Mammalia
The first molecular analyses on the phylogeny of Mammalia, using mitochondrial genes have remarkably challenged previous morphological hypotheses on the relationship among mammalian groups [45]. However, subsequent studies on nuclear markers and more sophisticated analyses of mitochondrial genomes led to more consistent hypotheses of mammalian relationships, which are in several aspects congruent to morphological studies [45][46][47][48][49][50][51]. General congruence among these independent markers has resulted in a well resolved and strongly corroborated backbone tree of mammalian groups, representing the four "superorders" Xenarthra, Afrotheria, Euarchontoglires and Laurasiatheria and several subgroups, e.g. monophyletic Theria, the Paenungulata (containing elephants, hyraxes and sirenians), Tetytheria (elephants and sirenians), and Euarchonta (Scandentia + Dermoptera + Primates). Consequently, the evaluation of tree reconstructions targeting mammalian phylogeny has been formalised by defining theses groups as "benchmark clades" [52], whose appearance is used to evaluate the performance of the method that was used. In the present study, most analyses are highly congruent in their results of mammalian relationships and display many of the proposed benchmark clades with sufficient support. However, in the RNA6A, RNA16 and RNA16A analyses, a clade combining Afrotheria + Xenarthra is sistergroup to Laurasiatheria and Rodentia appear paraphyletic to the remaining eutherian groups, with Muroidea + Anomalurus as first split within Eutheria. This reflects the suggestions of several previous analyses based on mt genes, but must been interpreted as a result of model misspecification ignoring among-site rate variation [53,54] and compositional bias [55]. In contrast, all other analyses show Afrotheria + Xenarthra as first split within Eutheria and paraphyletic Rodentia, but the latter are nested within monophyletic Laurasiatheria. It is notable, that bootstrap support values for potentially correct groupings increased, if mixed RNA/DNA model setups are applied (cf. Figure 4 and Additional file 2 for complete mammalian trees).

Separate analyses of loops and stems
Homoplasy due to multiple substitutions was tested with the index of substitution saturation (ISS) [56,57], which assumes a critical index of substitution saturation (ISSc) that defines a threshold for significant saturation in the data. The ISSc is compared with the observed ISS of the data. If the ISS value is larger than the critical ISSc values, saturation is assumed. To contribute to different tree shapes, the ISSc is estimated, using a symmetrical (balanced) and an asymmetrical (pectinate) tree topology. The test for homoplasy reveals striking differences between paired and unpaired positions. In the stem portions of all data sets, the asymmetrical and the symmetrical ISSc is always larger than the observed ISS. The differences are significant, thus indicating that the paired partitions are not saturated. In contrast, we detected potential saturation of substitution in the unpaired positions of several data sets. For Echinodermata, Hexapoda, Tunicata, Chilopoda and Heterobranchia, the ISS of was notably larger than the asymmetrical ISSc, suggesting substantial saturation in these alignments. The complete results of the saturation tests are summarised in Table 1 Table 2 saturation vanishes in all of the combined data sets.
Subsequently, ML tree reconstructions were conducted on the separated loop and stem partitions, using a DNA model setup. To characterise the phylogenetic signal in both partitions, we checked whether the trees from the paired or the unpaired partition were more congruent to the combined data results. Trees resulting from all three setups (combined, paired, unpaired) were compared with the Robinson-Foulds [58] (RF) distance score, which accounts for topology differences. This indicates a closer similarity of trees based on combined and unpaired data. Comparisons between the combined data set, analysed under different mixed model schemes, usually strengthen this effect. With the exception of Chilopda and Hexapoda, most RF distances between the combined data and the unpaired data diminished, whereas RF distances between the combined data and the paired data often increased (cf. Figure 6 and Additional file 3 Table S4 and S5).

Relative homoplasy and RNA modelling
To our knowledge, this is the first work on a separate characterization of homoplasy in paired and unpaired regions of rRNA sequences. We examined relative homoplasy separately, as due to their distinct physiological  function in protein biosynthesis, paired and unpaired positions can be expected to evolve differentially and might therefore also differ in their sensitivity to numerous substitutions occurring at the same position, thus hiding or completely erasing phylogenetic signal. Our results indicate separate exploratory analyses of loops and stems as crucial, because homoplasy due to multiple substitutions in loop positions could not be detected if the combined data sets (loops + stems) are tested for excessive homoplasy (Table 2). Pooling loop portions may be subject to the same kind of underestimation of homoplasy, if rates among loop regions are highly heterogeneous, as is apparent in Van de Peer et al. [59]. Thus, it may be advisable to estimate homoplasy in each loop region separately. These homoplastic substitution patterns have generally been addressed as "substitutional saturation" [60,61]. However, "saturation" is a concept that is only relevant to distance based analyses, where "saturation" refers to saturation curves, in which increasing phylogenetic depth does not increase pairwise distances. Phylogenies have nodes at many levels, from the tips to the root, and character based analyses can insulate homoplasy and mediate errors due to homoplastic sites in ways that distance based analyses cannot [62,63], for example by increasing the taxon sampling to break up long branches [64][65][66][67][68][69]. In this context, it can further be stated that "saturation" is not an inherent character of an aligned sequence position in a given alignment, it rather depends on the considered phylogenetic level. Additional measurements of the ISS in the tunicate data set, applied to the loop partitions of distinct monophyletic subgroups within Tunicata, shows an ISS significantly smaller than the ISSc for both symmetrical and asymmetrical topologies, thus indicating a decrease of homoplasy in these groups ( Table 3). The ISS method applied here accounts for position specific nucleotide frequency pattern in a given alignment, which are supposed to reflect the occurrence of multiple substitutions in the data set [57], thus implying relative homoplasy. Therefore, the use of the ISS might be a reasonable heuristic to estimate the level of homoplasy according to the deep splits discussed in the considered data sets. As depicted in Figure 5, most of the RNA models only lead to reasonable tree hypotheses, if the loop regions are found to contain meaningful phylogenetic data. In data sets identified as saturated, nearly all RNA models failed to recover an expected hypothesis. In contrast, the standard DNA model setup usually led to trees congruent with recent views on the relationships in the accordant groups. In this context, especially the results of the hexapod data are interesting, as these analyses did not provide superior trees by the DNA models setups, although saturation is detected. In the case of hexapods, no model setup led to the expected tree hypothesis. This indicates a generally insufficient phylogenetic signal of the 18S and 28S RNA data to resolve the shortest of the internodes among the deep hexapod splits. However, in [28], Bayesian inference analyses of the identical data set led to wrong tree hypotheses in the mixed RNA/ DNA model setup, but not in the standard DNA model setup. Kjer [11] found that topologies were virtually  identical between the standard and RNA models, but support values varied: in some cases, expected nodes were more strongly supported with GTR models (Archaeognatha, Pterygota, Paraneoptera + Holometabola, and Hymenoptera) other cases favoured the doublet model (Hexapoda, lestoids sister to other Zygoptera). Similarly, support for a paraphyletic Anisoptera (presumably incorrect) went down with the doublet model. Nevertheless it is noteworthy, that mixed RNA/DNA model setups frequently led to an increased support for probably correct clades in data sets without homoplastic loops (Mammalia and Primates), thus supporting previous studies on RNA models in phylogenetics [14,19,23]. Our analyses setup also allows comparisons between the different RNA model setups. Current RNA substitution models can be divided into two distinct classes, rooted in population genetics [5,10]. Models of the first class assume a one-step process of compensatory substitution in paired positions, thus allowing double substitutions (e.g. AU ↔ GC): a mutation in a base pair (AU GU) may led to slightly deleterious UG or GU pairs. If selection against these intermediates is strong, these are kept in low frequency in the population. If a second mutation occurs at the corresponding site (GU GC), drift in gene frequency may lead to a domination of this new base pairing in the population (RNA6A-D, RNA7A-B, D and RNA16A). In contrast, models of the second class assume a two-step process of compensatory substitution in paired positions, considering one substitution in base-pairs, with a probability of zero for all double substitutions. This approach considers a fixation of intermediate states in the population at a high frequency, after a mutation in a base pair and before a second mutation at the corresponding site (RNA7C, F and RNA16, B). RNA models can be further discriminated by their treatment of mismatches: 6-state models completely ignore these pairings, 7-state models lump all mismatches in one category, whereas 16-state models apply distinct frequency and substitution rate parameters to the individual mismatches.
Previous studies on RNA models in phylogenetics have predicted the superiority of models allowing double substitutions and the superiority of the most general models (RNA6A, RNA7A) [5,10,70]. The latter is corroborated by the AICc modeltest of the present analyses, which frequently show higher likelihoods and AICc values for the most general models (see Additional file 3 Table S2 and S3). Furthermore, the RNA6A model led to the expected topologies in two data sets (Echinodermata and Chilopoda) showing relative homoplasy in loop partitions, whereas all other RNA models fail to display presumably correct trees, if significant homoplasy was identified ( Figure 5). In this context, the RNA6D and RNA16B models are performing worst, as both are only able to display one potentially correct tree hypotheses. Additionally, congruencies between the performance of RNA models and the results of the AICc modeltest can be drawn from our results. According to the AICc modeltest, in the class of the RNA6 models, the most general RNA6A model is always superior to all other RNA6 models (see Additional file 3 Table S2 and S3), which is further congruent to its performance in tree reconstruction analyses. This is not reflected by the RNA7 and RNA16 models, where the models with the highest AICc scores (RN7A-B and RNA16 did not perform best (cf. Figure 5).

Potential pitfalls of RNA modelling
Consequences of different evolutionary constraints in stem and loop regions of rRNA sequences for phylogenetic analyses has long been suspected and led to different recommendations for weighting stem positions in parsimony analyses [2,7,8]. Beside suggestions for simple one-half weighting of paired positions [7], empirical investigation of compensatory substitution rates in stem positions [8] reveals a rate of about 40% of that expected under a hypothesis of perfect compensation. Therefore, the weighting of stem characters is suggested to be reduced by no more than 20%. Consequently, in model based tree reconstruction methods, like Bayesian inference and Maximum Likelihood, it should be reasonable to use specific RNA models (which can be seen as an algorithmic equivalent to weighting stem positions in Maximum Parsimony) as simply applying a standard DNA model to data from one part of the helical regions. Application of these RNA models has frequently been justified by a consistent phylogenetic signal of coevolved paired sites, decreasing the information content in the data [5,18]. Analyses ignoring this interdependence should tend to overestimate the support for dubious or even wrong nodes in a tree [13]. Due to a reduced number of effective sites, the application of specific RNA models, which take interdependencies into account, reduces tree confidence, but is more reliable in the light of the information content in the data [12,16].
Our results actually imply a reduced impact of stem positions in the combined data set, if mixed RNA/DNA model setup are used. This is depicted by the tree distance results of the separate analyses of the stem and loop partitions. In most data sets, the distances between the trees based on only the loop partition and the combined data are reduced, if RNA models are applied for the combined data, whereas the distances between the stem partition and the combined data are mostly enlarged (Figure 2). This could have been expected, if coevolution in paired sites is assumed and thus these positions do not provide independent phylogenetic information. However, for several of the currently tested data sets, the substitution saturation test reveals that the unpaired positions clearly experience excessive homoplasy, which indicates a loss of phylogenetic information, as these positions are no longer informative [71]. In this context, stem positions in the current data sets should contain more reliable signal, compared to loop regions, as they exhibit a much lesser grade of homoplasy due to multiple substitutions. Consequently, the application of RNA models increases the relative impact of noisy positions in the data set and reduces the influence of more informative portions. Thus, the results of the current analyses corroborate the hypothesis proposed by Letsch et al. [28].
RNA models doubtlessly provide a better depiction of the phylogenetic information content of rRNA data sets, but this might be a trap, if homoplasy is far greater in loop positions. In this case, the informative phylogenetic content is obscured by noise. The situation might probably be depicted best, if we thought of a weighting scheme for rRNA data sets: in standard DNA model setup, the signal of paired positions is virtually weighted twice, as both positions are linked and signal of pairs can be seen redundant. As outlined above, previous studies have mostly targeted this as a problem [5,10,12,13,18,19], but the current analyses showed relative homoplasy as delimiting the confidence of the phylogenetic signal provided by loop regions, revealing a more or less hidden coherence between two factorscovariation and homoplasy -contributing to the phylogenetic signal of the rRNA data sets. For this scenario it can be stated, that in contrast to a previously proposed overestimation of wrong support by ignoring site interdependencies [13], the application of RNA models will tend to overestimate the support for dubious or wrong nodes in a tree.
As depicted above, loops and stems can be expected to experience different selectional regimes, which has resulted in the development of the specific RNA models. Nevertheless, it has been noted as early as 1991 [72] that substitution rates do not fit neatly into stem-loop partitions, and thus weighting according to stems vs. loops might be problematic, which was later demonstrated by Van de Peer [59]. Consequently, selectional constraints on rRNA may not only differ between paired and unpaired regions, but also among the individual loop or stem regions, which would depend on the individual function and their relative location in the 3D rRNA molecule. Binding sites of ribosomal proteins, for example protein L11-binding domain (L11-BD) within the LSU rRNA domain II and the sarcin-ricin loop within domain VI, constituting the GTPase-associated center [73] or the LSU rRNA domain V, which contains the peptidyl transferase center (PTC) [74], are highly conserved throughout metazoa. Furthermore, many of the rRNA regions of domain IV that are involved in tRNA and inter-subunit interactions are also preserved [74,75]. In contrast, the domain I of the mt LSU is highly variable on sequence level and until now, no conserved secondary structures could be detected [4,73]. Consequently, the partitioning into loops and stems must be seen as only an relative coarse approximation to model rRNA sequences. In future phylogenetic studies on rRNA, more sophisticated partitioning schemes, depending on the function, base composition and relative location of rRNA regions, would be able to enhance model based tree reconstruction analyses

Conclusions
The results of the present study can be interpreted as a trade-off between using specific RNA models for a hopefully more accurate estimation of covariation in paired sites and the risk of augmenting relatively homoplastic unpaired positions in the tree reconstruction. For future phylogenetic studies based on rRNA sequences, we would therefore highly recommend a separate test for saturation of substitution in loop and stem partitions of the aligned data set. The use of a mixed RNA/DNA model setup should be avoided if saturation occurs in the loop partitions, as otherwise the valuable phylogenetic signal of the stem partitions might be masked by potentially noisy signal provided by the loops. In contrast, if no substantial homoplasy is detected in the data, the use of mixed RNA/DNA models can be highly recommend, as these lead to an increased support to probably correct clades.
Based on the presents results, we cannot advocate an general exclusion of the potentially noisy loop positions: First, noise is not an inherent character of a certain nucleotide position, but depends on the considered phylogenetic level. And second, differences among loop (or stem) regions can be expected and excluding these regions as a whole reduces the phylogenetic signal of the data set. Consequently, we rather recommend to think about enhanced partitioning strategies, which would allow a more careful modelling of rRNA sequences and provide a first approach to detect noisy signal among loop (or stem) partitions.
However, covariation and substitution saturation are only two parameters of the evolutionary inherent pattern displayed (or hidden) in the data. Other phenomena, like non-stationarity of substitution rates across sites and branches as well as base composition heterogeneity, might also maul the signal content of the data set. A previous study [26] based on nuclear rRNA genes, identified deviation of base composition in certain clades as probably misleading tree reconstruction analyses, rather than the covariation pattern in stem regions. A sophisticated estimation of evolutionary pattern in rRNA sequence data is therefore principally desirable and newly developed methods should be applied, which are able to consider background knowledge as covariation, non-stationary processes or heterogeneity in the data [26,76].

Compilation of data sets
As exemplary data sets to test the performance of we chose Echinodermata (18S), Tunicata (18S), Heterobranchia (18S) Chilopoda (18S), Hexapoda (18S + 28S), Mammalia (12S + 16S), Primates (12S + 16S) and Anisoptera (12S + 16S + 28S). All sequences were downloaded from NCBI Genbank (Additional file 4 provides complete taxon tables, including Genbank accession numbers). To apply mixed RNA/DNA models in the tree reconstruction, we had to infer reliable individual secondary structures. Consequently, we only considered 18S sequences with at least 1700 bp and 28S sequences with at least 3000 bp. 12S and 16S rRNA sequences in the primate and mammalian data sets were taken from entire mitochondrial genomes and therefore span the entire rRNA locus. In Anisoptera, the 12S and 16S rRNA sequences have minimum lengths of 500 bp and 1250 bp respectively. For the combined data sets, we only considered taxa that were represented by all genes.

Alignment procedures
Alignment was done with the RNASALSA software [30], which aligns ribosomal RNA sequences by utilising existing hypotheses of structural patterns, in order to constrain thermodynamic folding algorithms and favour the alignment of sites that contain compensatory substitutions. In three steps, RNASALSA accumulates structure information, until each sequence receives its individual secondary structure string. In the first steps, conserved structure features are recognized via primary sequence conservation and consistent and/or compensatory substitution, which provides a structure skeleton for the next step, where the more variable regions gain structures by thermodynamic folding. Finally, the combined sequence and structure strings are simultaneously aligned, where sequence and structure information come into account. The program uses structural constraints as an input file, and our constraints of nuclear and mitochondrial SSU/ LSU genes (see Supplement S1), were originally retrieved from the European Ribosomal Database (ERDB) [77][78][79]. The structures of these sources are coded in the proprietary DCSE format and were recoded into the required dot-bracket format with the program extractfromdcse of the PHASE software package [18,80]. The ERDB homepage is offline now, but readily recoded constraint structure files (representing various metazoan groups), as well as tools to divide loop and stem partitions, are available at the RNAsalsa homepage http://rnasalsa.zfmk.de. RNA-SALSA also requires a "pre-alignment" input file, which was obtained from the E-INS-i algorithm of the MAFFT alignment package [81], using default settings. The stringency settings for adoption of secondary structures in different alignment steps was relaxed (0.51), as we wanted to retain as much structure information as possible (see [30] for a detailed description of the RNASALSA method). Subsequent evaluation of the alignments was done with ALISCORE [82], which identifies ambiguously aligned regions in multiple sequence alignments. For gap treatment (g), window size (ws) and random pairwise comparisons (pc), the following settings were used (g: gaps as ambiguous characters; ws: four positions; pc: taxa 2 ).

Maximum Likelihood analyses
Maximum Likelihood analyses were conducted with RAxML v7.2.6 [31][32][33], which is an enhanced program for computing phylogenetic trees based on Maximum likelihood inference that includes RNA substitution models (RNA6A-D, RNA7A-F, RNA16, RNA16A and RNA16B, for a detailed description of the RNA models, please refer to the manual of the PHASE software package [80]). To define paired an unpaired partitions, the consensus structures in dot bracket format were used, obtained from the RNAsalsa alignments. In the standard DNA setup, the GTR model was used with all model parameters estimated from the data, with among site rate variation modelled with gamma distributed rates across sites with four discrete rate categories. Additionally, model parameters were optimised for different partitions, representing SSU and LSU rRNA sequences respectively. In the RNA model setups, a third partition is defined, according to the consensus secondary structure of the whole alignment and all paired position are extracted and pooled in third partition. The consensus structure provided by RNAsalsa Model fitting in both single nucleotide partitions is applied as in the standard model setup and the in paired nucleotide partition a specific RNA model is used. Within each class of RNAmodels, the best model is evaluated by an Akaike Information criterion (AICc) test.

Test for homoplasy
Relative homoplasy was examined between loop and stem regions. For this purpose, the aligned data sets were divided into paired and unpaired partitions, according to the consensus structures, provided by the RNASALSA alignments. Subsequently, each partitions was compared for the level of homoplasy in the data, using the substitution saturation test of the program package DAMBE v5.2.9 [56,57], which estimates an "index of substitution saturation", based on the notion of entropy in information theory. Prior to the saturation test, we accounted for invariant sites, which provides a more reasonable estimation of potential saturation in the data sets [57].

Additional material
Additional file 1: Additional tree reconstruction results. Detailed discussion on the results of tree reconstruction of Chilopoda, Hexapoda, Anisoptera, Primates and Heterobranchia.
Additional file 2: Tree reconstruction results. All trees (Newick le format) provided by the DNA model setups the and mixed RNA/DNA model setups of all applied data sets.
Additional file 3: Tables. Tables providing the Genbank accession numbers of constraint sequences used for the RNASALSA alignment, the detailed results of the AICc test and the detailed results of the Robinson-Foulds distance measurements.
Additional file 4: Taxa list. A list of all applied sequence data with according Genbank accession numbers.