Circular RNAs: Methodological challenges and perspectives in cardiovascular diseases

Abstract Circular RNAs are generated by back‐splicing of precursor‐mRNAs. Although they have been known for many years, only recently have we started to appreciate their widespread expression and their regulatory functions in a variety of biological processes. Not surprisingly, circular RNA dysregulation and participation in pathogenic mechanisms have started to emerge in many instances, including cardiovascular diseases. Detection, differential expression analysis and validation are the three critical points for the characterization of any RNA, and circular RNAs are no exception. Their characteristics, however, generate several problems that are yet to be completely addressed, and the literature still lacks a comprehensive definition of best practices. We present a map of the current knowledge regarding circular RNAs and the critical issues limiting our understanding of their regulation and function. The goal was to provide the readers with the tools to critically decide which of the many approaches available is most suitable to their experimental plan. Although particularly focused on cardiovascular diseases, most critical issues concerning circular RNAs are common to many other fields of investigation.

widespread use of deep sequencing techniques aimed towards the detection of ncRNAs, prompting the need for their functional characterization.
Although no generalized function has been identified yet, four main roles have been described 3,4,9-12 and are shared by a subset of circRNAs. First, circRNAs can work as miRNA sponges. Some circRNAs contain an above-average number of seed-matching sequences for a specific miRNA and are therefore able to compete for the miRNA and sequester it, reducing its bioavailability and final effect. A key example of this is CDR1AS, a circRNA transcribed in antisense from the CDR1 locus, which contains more than 60 sites for miR-7. Second, circRNAs can act as RNA binding protein sponges. CircRNAs have been observed to interact with RNA binding proteins in order to bind protein complexes and target them towards specific sequences. For instance, Argonaute, RNA Polymerase II and MBNL1 can all bind exonic circRNAs. Third, circRNAs can impact gene expression of their host gene locus. Modulation of circRNA expression is often correlated with modulation of the linear counterparts, although the mechanism through which this is achieved is still unclear. CircRNAs may bind U1 snRNP and act as cis regulators, or simply reduce the amount of pre-mRNA available for canonical splicing. Finally, circRNAs can have a role in translation, as a small fraction of circRNAs seems to contain the necessary information to be translated with a cap-independent mechanism.
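The sponge role described above rests on counting seed-matching sites, which can be sketched in a few lines. The following is a minimal illustration, assuming perfect complementarity to miRNA nucleotides 2-8 and ignoring every other determinant of binding; the function names and the toy sequences are hypothetical and do not correspond to any real miRNA or circRNA. The only circRNA-specific detail is that the sequence is treated as a circle, so that a site spanning the back-splice junction is also counted.

```python
def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (A, C, G, U)."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna.upper()))


def count_seed_sites(circ_seq: str, mirna_seq: str) -> int:
    """Count perfect matches to the miRNA seed (nucleotides 2-8) on a circle."""
    seed = mirna_seq.upper()[1:8]      # miRNA positions 2-8 (1-based)
    site = reverse_complement(seed)    # sequence expected on the target RNA
    circ = circ_seq.upper()
    # Append the first len(site) - 1 bases so that windows starting near the
    # end of the string wrap around the back-splice junction.
    unrolled = circ + circ[: len(site) - 1]
    return sum(
        1 for start in range(len(circ))
        if unrolled[start:start + len(site)] == site
    )


# Hypothetical toy sequences, not taken from any database.
TOY_MIRNA = "UACGGAUCCAUGCAUGCAUGCA"
TOY_CIRC = "CCGUAAAGAUCCGUAAAGAU"

# One site is internal and one spans the junction between the last and
# first bases of the string, so the circular count exceeds the linear one.
circular_sites = count_seed_sites(TOY_CIRC, TOY_MIRNA)
linear_sites = TOY_CIRC.count(reverse_complement(TOY_MIRNA[1:8]))
print(circular_sites, linear_sites)
```

Comparing the circular and linear counts shows why a scan of the linearized sequence alone can miss junction-spanning sites.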
CircRNA detection experiments have identified thousands 3,13,14 of tissue- and condition-specific species. As of today, however, no gold standard is available for these kinds of experiments, and the differences between the approaches can be significant.
In spite of a number of still open issues, our understanding of the regulation and role of circRNAs is expanding rapidly in all areas of biomedical investigation, including the cardiovascular system. This comes as no surprise, as cardiovascular diseases are among the leading causes of mortality. Indeed, ischaemic heart disease alone accounted for more than 15% of all deaths in 2015 worldwide (WHO Media centre. http://www.who.int/mediacentre/factsheets/fs310/en/), highlighting the need for a better understanding of the pathogenetic mechanisms underpinning cardiovascular diseases.
Additionally, circRNAs are stable and resist degradation, and they are present in the bloodstream 15 and in exosomes; 16 they therefore represent excellent candidates as non-invasive biomarkers.
The goal of this review was to provide an overview of the approaches used for circRNA investigation. For each method, positive and negative aspects will be illustrated, in order to let the readers critically choose the one more appropriate to their own experimental design. We will also review some of the landmark papers on circRNA regulation and role in the cardiovascular system, paying attention to the methodological aspects adopted in these studies.
FIGURE 1 Proposed Mechanisms of CircRNA Biogenesis. A, Canonical linear splicing: Canonical linear splicing determines the maturation of a pre-mRNA by joining exons together. A donor site and a downstream acceptor site are spliced together, as the introns generate a lariat which is afterwards degraded. B, Lariat-driven circularization: The linear splicing event takes place first. The lariat generated by the splicing event can then be spliced itself to remove the introns, giving birth to an exonic circRNA. C, Intron pairing-driven circularization: Alu repeats and other complementary intronic sequences are statistically overrepresented in introns adjacent to back-splice junctions. RNA molecules are hypothesized to acquire a secondary structure by binding of these sequences and to facilitate circularization
Despite that, there are two common models of biogenesis available that can explain the known features of circRNAs, as schematized in Figure 1. The first model is based on lariat-driven circularization (Figure 1B). Linear alternative splicing commonly produces exon-skipping events in which the pre-mRNA is spliced by joining two nonadjacent exons by the recognition of a downstream branch point. 17 This event generates a lariat containing the skipped exon (or exons) that can undergo splicing itself. The removal of lariat introns causes circularization of the exon(s) involved.
This circularization event is a consequence of a linear splicing event, and this implies the production of a co-linear RNA containing the unskipped exons, pairing the expression of each circRNA with the expression of a specific isoform of the linear counterpart. This model was the first to be suggested, arising from the evidence of the correlation between circRNAs and their co-linear counterparts, 18 a trend not observed in further separate studies. 5 The second model is based on intron pairing-driven circularization (Figure 1C). Introns adjacent to the exons involved in the formation of the back-splice junction have been observed to be significantly longer 19 and to have a significantly higher concentration of intron motifs and Alu repeats. 17 It has been theorized that, in the pre-mRNA, these introns may pair by the interaction between two complementary motifs, promoting a circularizing splicing event. This event, although similar to the previous one, does not couple linear splicing and circularization, as a single pre-mRNA can produce either a circular or a linear species, thus putting the two events in competition for the pool of pre-mRNA.

| BIOGENESIS
Additional hypotheses have been presented based on the biogenesis of aberrant non-collinear splicing events identified in cancer. 17 Regardless of the specific mechanism, it is critical to remember that the biogenesis might explain expression correlations between circular and linear species, although this relation does not always hold.

| DATABASES
The collective effort in circRNA detection using either custom or published pipelines has created the need for a comprehensive database that easily summarizes published events that were predicted, detected or validated.
At the time of writing, we were able to find 11 online databases that include circRNAs, have a web-based interface and are easily reachable from the links provided in the corresponding papers (Table 1): BBBomics, 20 circ2Traits, 21 circBase, 22 circInteractome, 23,24 circNet, 25 circRNADb, 26 CSCD, 27 exorBase, 28 PlantcircBase, 29 Soma-miR, 30 TSCD. 31 A recent study by Ying Xu provides an overview of the major databases currently available. 32 The two most widely used are circBase and circNet, but their manual curation slows down the inclusion of the increasing number of studies published every month on the topic. It is evident that, at present, we lack comprehensive databases that gather all information regarding circRNAs in one common place. To achieve this, a database needs to be generic, structured, updated frequently and partially automated in a fashion similar to Gene Expression Omnibus, RefSeq 33,34 and miRBase. 34 The first critical step would be to allow research teams to submit published detected circRNAs and to provide a background system to aggregate the submitted data. This ever-growing, low-maintenance system can then be used as a base to provide all kinds of exploratory and statistical data on circRNAs.

| Detection
The advent of total RNA sequencing libraries depleted of rRNA opened the door for the detection and analysis of all classes of ncRNAs characterized by the absence of a 3′ polyadenylated tail. CircRNAs are present in such libraries, although canonical alignment methods are not able to detect them, as they rely on the properties typical of linear splicing events. The alignment begins with a seed that is then extended and, when an intron is reached, the alignment is resumed downstream to take into account introns and collinear splicing. Non-collinear splicing events are characterized by a splicing that may connect one exon with another that is far away, upstream or on a different chromosome. All reads supporting a non-collinear splicing, including back-splicing, are discarded by classic aligners.

Database       Website                                          Notes
BBBomics       http://bioinformaticstools.mayo.edu/bbbomics/    Specific for blood-brain barrier
circ2Traits    http://gyanxet-beta.com/circdb/index.php         CircRNA-disease associations

The discovery of non-collinear splicing events generated a collective effort for the development of pipelines designed for the detection of these events in general and of circRNAs in particular. A comprehensive review by Zeng et al 2 describes the main circRNA detection tools (Table 2) and evaluates them for precision and sensitivity. While all have merit, there is still clear room for improvement in the detection of back-splicing events.
CircRNA detection is based on the correct assignment ("recall") of otherwise not-mappable reads to putative back-splice junctions and can be divided into two large categories: segmented read based (or fragmentation based) and candidate based (or pseudo reference based).
Segmented read-based strategies split unmapped reads into smaller fragments that are aligned separately. The reconstruction of the alignment pattern makes it possible to identify non-collinear splicing events if the positions of the fragments are discordant, that is, on different loci or chromosomes, in reversed (upstream) order or in a different orientation. Their nature allows de novo detection and is completely independent of any existing annotation.
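The segmented read logic can be illustrated with a deliberately simplified sketch. It assumes exact string matching on a single forward-strand toy reference, unique anchors and no sequencing errors, none of which holds for real data; the function name, the anchor length and the reference sequence are hypothetical, and published tools use full aligners rather than substring search. The point is only the discordance test: a read whose tail anchor maps upstream of its head anchor is a back-splice candidate.

```python
from typing import Optional, Tuple


def find_back_splice(read: str, reference: str,
                     anchor_len: int = 8) -> Optional[Tuple[int, int]]:
    """Return (donor_end, acceptor_start), 0-based, or None if not discordant."""
    if len(read) < 2 * anchor_len or read in reference:
        return None  # too short to split, or the read maps collinearly
    head_pos = reference.find(read[:anchor_len])
    tail_pos = reference.find(read[-anchor_len:])
    if head_pos < 0 or tail_pos < 0 or tail_pos >= head_pos:
        return None  # unmapped anchor, or anchors in collinear order
    # Extend the head anchor until the read stops matching the reference:
    # the last matching base is the putative donor end.
    k = anchor_len
    while (k < len(read) and head_pos + k < len(reference)
           and reference[head_pos + k] == read[k]):
        k += 1
    # The rest of the read must match the reference ending at the tail anchor.
    acceptor_start = tail_pos + anchor_len - (len(read) - k)
    rest = len(read) - k
    if acceptor_start < 0 or reference[acceptor_start:acceptor_start + rest] != read[k:]:
        return None
    return head_pos + k - 1, acceptor_start


# Hypothetical 60-base toy reference.
REFERENCE = "ACGTTGCAATGGCTAGCTTACCGATATCGGTACGCATGACTTCACCAGTAAGGTCTCATG"

# Back-spliced read: bases 30-41 joined to the upstream bases 10-21.
back_spliced = REFERENCE[30:42] + REFERENCE[10:22]
# Linearly spliced read: the same two blocks in collinear order.
linear_spliced = REFERENCE[10:22] + REFERENCE[30:42]

print(find_back_splice(back_spliced, REFERENCE))    # (41, 10)
print(find_back_splice(linear_spliced, REFERENCE))  # None
```

In this toy case the junction position is unambiguous; when the bases flanking the donor and acceptor sites are identical, the extension step can shift the reported junction by a few bases, which is one reason real tools also check splice signals.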
Candidate-based strategies rely on existing annotation and exploit it to generate putative events against which reads are tested for alignment. These tools lack the positive traits of segmented read-based strategies and can often report only exonic circRNAs, but they tend to be faster in execution and to produce a list of exonic circRNAs with a more reliable position of the splice site.
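A candidate-based search can be sketched as follows, under strong simplifying assumptions: exon sequences of a single gene are given in genomic order, reads are matched by exact substring search, and a read counts only if it overhangs the junction by a minimum number of bases on each side. All names, the flank and overhang values, and the toy exons are hypothetical; this is an illustration of the pseudo-reference idea, not the procedure of any specific tool.

```python
from itertools import combinations_with_replacement
from typing import Dict, List, Tuple

# (pseudo-reference sequence, index of the first acceptor base)
Junction = Tuple[str, int]


def build_junctions(exons: List[str], flank: int) -> Dict[Tuple[int, int], Junction]:
    """Build one putative back-splice junction per exon pair (up <= down)."""
    junctions = {}
    for up, down in combinations_with_replacement(range(len(exons)), 2):
        donor = exons[down][-flank:]   # 3' end of the downstream exon
        acceptor = exons[up][:flank]   # 5' start of the upstream exon
        junctions[(up, down)] = (donor + acceptor, len(donor))
    return junctions


def count_junction_reads(reads: List[str],
                         junctions: Dict[Tuple[int, int], Junction],
                         min_overhang: int = 4) -> Dict[Tuple[int, int], int]:
    """Count reads that span a junction with enough bases on both sides."""
    counts = {key: 0 for key in junctions}
    for read in reads:
        for key, (seq, boundary) in junctions.items():
            pos = seq.find(read)
            if pos < 0:
                continue
            left = boundary - pos                # bases on the donor side
            right = pos + len(read) - boundary   # bases on the acceptor side
            if left >= min_overhang and right >= min_overhang:
                counts[key] += 1
    return counts


# Hypothetical toy exons (genomic order) and reads.
EXONS = ["AAACCCGGGTTT", "ACGTACGTTGCA", "TTGGCCAATTGG"]
JUNCTIONS = build_junctions(EXONS, flank=8)
READS = [
    "TTGCAAAACC",  # spans the exon 2 -> exon 1 junction, 5 bases each side
    "ACGTTGCAAA",  # reaches only 2 bases past the junction: rejected
]
COUNTS = count_junction_reads(READS, JUNCTIONS)
print(COUNTS[(0, 1)])
```

Because the candidates come from the annotation, the junction coordinates are exact by construction, which mirrors the reliability noted above; by the same token, anything absent from the annotation cannot be found.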
All tools implement a number of filters built to increase precision. The nature and strength of these filters directly affect precision and sensitivity: the tools using the most stringent filters are the most precise and the least sensitive, while the tools using less stringent filters tend to trade precision for sensitivity. As suggested for other non-collinear splicing events, 35 it is advisable to achieve the highest sensitivity and add biologically meaningful, manually curated filtering whenever possible.
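The trade-off between filter stringency, precision and sensitivity can be made concrete with a small calculation. The sketch below assumes a single filter (a minimum number of supporting reads) and a set of independently validated circRNAs used as ground truth; the function name, the candidate names and all counts are invented for illustration and are not taken from any study.

```python
from typing import Dict, Set, Tuple


def precision_sensitivity(candidates: Dict[str, int], validated: Set[str],
                          min_reads: int) -> Tuple[float, float]:
    """Precision and sensitivity of a read-count filter against a truth set."""
    called = {name for name, reads in candidates.items() if reads >= min_reads}
    true_pos = len(called & validated)
    false_pos = len(called - validated)
    false_neg = len(validated - called)
    precision = true_pos / (true_pos + false_pos) if called else float("nan")
    sensitivity = true_pos / (true_pos + false_neg) if validated else float("nan")
    return precision, sensitivity


# Invented data: supporting reads per candidate and the validated subset.
CANDIDATES = {"circA": 12, "circB": 5, "circC": 2, "circD": 2, "circE": 1}
VALIDATED = {"circA", "circB", "circC"}

for threshold in (1, 5):
    p, s = precision_sensitivity(CANDIDATES, VALIDATED, threshold)
    print(f"min_reads={threshold}: precision={p:.2f}, sensitivity={s:.2f}")
```

With these invented numbers, raising the threshold from 1 to 5 reads removes both false positives but also loses one validated circRNA, which is the pattern described above.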
CircRNAs often display low expression levels compared to mRNAs. 13,14,36 This is particularly crucial in RNA-Seq datasets, because the most important filter to apply is based on the reads spanning the back-splice junction 35 40 With a balance more on sensitivity than on precision, they provide longer lists of events that may easily reach the level of thousands. Flexible approaches are more suitable for subsequent differential expression analysis, allowing the user to manually define the selection criteria used to rank the events for validation.
In the landscape of detection tools, miARma-Seq 41 fills a different role. miARma-Seq is a complete pipeline that allows detection of short and long RNAs bundling major detection and differential expression tools in a single package. CircRNA detection is performed using CIRI internally, with similar expected results.

| Differential expression
Differential expression analysis of alternative splicing events is a problem that is only partially solved. Back-splicing events introduce an additional level of complexity, one that has not yet been properly defined. CircRNAs, in particular, possess a number of critical aspects that influence the differential expression analysis, and we shall list them. (a) Coverage distribution on circRNAs is yet to be modelled. It is important to remember that, regardless of the tool of choice, so far no differential expression analysis software has been designed to correctly handle circRNAs, thus making any result unreliable. Nevertheless, there are two main tools that can be used for differential expression analysis: limma 42 and edgeR. 43 The edgeR method limits the number of assumptions for the statistical analysis, and it is the method of choice whenever the expression pattern is unknown and not modelled. The downside of this approach is its sensitivity to variability between samples, a common characteristic of circRNA data.
The limma method was designed for differential expression analysis based on linear models on microarray data and was subsequently used for RNA-Seq data as well. Of particular interest was the introduction of the voom transformation, a method explicitly created to increase the power of the analysis by compensating for variability.
Limma therefore provides the tools to soften one of the most serious problems in circRNA data, under the assumption that the expression patterns of circRNAs follow those of mRNAs.
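To make the starting point of such an analysis concrete, the sketch below computes log2 counts per million with the offsets used in the published voom definition, log2((count + 0.5) / (library size + 1) × 10^6), and a simple difference of group means. It is only an illustration in plain Python: the precision weights that voom derives from the mean-variance trend, and the moderated statistics of limma, are not reproduced here. Using back-splice junction read counts as input and total mapped reads as library size are assumptions of this example, and all numbers are invented.

```python
import math
from typing import List


def log_cpm(count: float, lib_size: float) -> float:
    """log2 counts per million with voom-style offsets (0.5 and 1)."""
    return math.log2((count + 0.5) / (lib_size + 1.0) * 1e6)


def mean_log_fold_change(counts_a: List[float], libs_a: List[float],
                         counts_b: List[float], libs_b: List[float]) -> float:
    """Difference of mean log-CPM between group A and group B."""
    mean_a = sum(log_cpm(c, n) for c, n in zip(counts_a, libs_a)) / len(counts_a)
    mean_b = sum(log_cpm(c, n) for c, n in zip(counts_b, libs_b)) / len(counts_b)
    return mean_a - mean_b


# Invented junction read counts for one circRNA in two groups of three
# samples, with invented library sizes (total mapped reads).
disease_counts = [9, 14, 11]
control_counts = [2, 0, 3]
disease_libs = [21_000_000, 25_000_000, 19_000_000]
control_libs = [20_000_000, 22_000_000, 24_000_000]

lfc = mean_log_fold_change(disease_counts, disease_libs,
                           control_counts, control_libs)
print(f"mean log2 fold change: {lfc:.2f}")
```

The 0.5 offset keeps zero counts finite, which matters for circRNAs because many junctions are supported by very few reads; it also means that fold changes at such low counts are dominated by the offset and should be read with caution.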
A third alternative is represented by DESeq2, a method allowing quantitative data analysis focused on the strength rather than the presence of differential expression. 44 However, further tests are necessary to validate its use for circRNA analysis.
The quantification of circRNAs relies on methods that are avail-

| Proposed pipeline
Differential expression analysis of circRNAs in RNA-Seq datasets still lacks a fully validated pipeline. Therefore, there are at least three control points which allow freedom of choice that might drastically change the final results: detection, differential expression analysis and enrichment (Figure 2).
Several detection tools exist, and they can be categorized according to their precision and sensitivity. A precise tool is advisable when high-confidence, highly expressed circRNAs are the objective. Conversely, sensitive tools provide large datasets useful for exploratory studies and enable tentative differential expression analysis.
Multiple differential expression analysis tools exist but, because of the low expression levels of circRNAs, no whole-transcriptome method is efficient or has been described as valid. Differential expression analysis is therefore limited to PCR quantification of selected events or to normalization and comparison via clustering and heatmaps.
Enrichment and validation are more evolved and rely on the combination of multiple methods to confirm position and existence of back-splice junction, circularity and nucleotide sequence by qPCR, RNase R digestion and sequencing, respectively.

| MICROARRAYS
Microarrays are the method of choice for detection and differential expression analysis. Arraystar Inc. produces the first commercial microarray that hybridizes specifically to a selected panel of circRNAs (https://www.arraystar.com/arraystar-human-circular-rna-microarray) and was used in milestone studies in the field. 3,13,14,36,48,49 A number of recent investigations take advantage of Arraystar's microarrays and report strong detection and validation efficiency. 50-56 Microarray analysis removes the uncertainty that shrouds RNA-Seq analysis because of a lack of generalization. The procedure is targeted and, when the reproducibility and efficiency are ensured by the manufacturer, the standard analysis methods can be applied regardless of the hybridized RNA species. It is, however, important to remember that any microarray analysis depends on the annotation accuracy at the time of development and allows only the detection of the annotated RNA species.
Finally, the labelling procedure requires digestion with RNase R to enrich for circular species. As we will see in the "Validation" section, digestion might introduce biases that are not yet quantified, potentially skewing the differential expression analysis for a small number of circRNAs.

| VALIDATION
Validation of circRNAs is a multi-step process. Initially, the presence of the back-splice junction must be verified. This can be achieved with a standard qPCR experiment using divergent primers ( Figure 2  Additionally, whenever relevant, CRISPR-Cas9 genome editing technology can be used to remove the locus encoding the circRNA. 60

| CIRCRNA IN CARDIOVASCULAR DISEASES
As has happened in many other areas of investigation, circRNA regulation and function in the cardiovascular system have become a research hotspot in the last few years. Although very little is still known, evidence of circRNA involvement in heart and blood vessel development, function and disease is clearly emerging. We will review some of the most paradigmatic studies, hoping to stimulate a rapid growth of the field.

| Heart function and disease
A landscape of the circRNAs expressed in the heart has started to be defined in broad RNA-seq projects, 61,62 as well as in some more focused studies characterizing human, rat and mouse hearts. 63-65 CircRNAs have been shown to be dynamically expressed in a model of cardiac development constituted by human induced pluripotent stem cell-derived cardiomyocytes. 66 Sequencing revealed more than 4500 circRNAs, some of which show host-gene-independent regulation. Examples of these circRNAs of interest are those generated from ATXN10, CHD7, DNAJC6 and SLC8A11 genes, which may interact with ribosomes and the RISC complex. In a qPCR screening of 100 randomly selected circRNAs, mm9circ-012559, renamed HRCR, was found to be decreased in a mouse model of heart hypertrophy. 68 CircRNA HRCR can act as a sponge for miR-223, resulting in an increase in the miR-223 target ARC and reducing cardiac hypertrophy and heart failure in mice.
Hypothesis-driven studies are also important. CDR1AS is a well-known circRNA, first identified as a miR-7a/b sponge and inhibitor in brain cells. 3,60,69 Geng and collaborators found that CDR1AS is also expressed in the heart. In myocardial cells, it acts as a miR-7a sponge, interfering with the protective role of the miRNA in myocardial infarction injury. 70 Ageing is a major risk factor for decreased heart function and a crucial stratification factor for heart-related diseases. circFOXO3 has been found to be up-regulated in aged hearts of both humans and mice 71 and correlates with markers of cell senescence. The functional relevance of circFOXO3 was demonstrated in doxorubicin-induced mouse cardiomyopathy. In these mice, heart disease is aggravated by the overexpression of circFOXO3 and is attenuated by its silencing. Mechanistically, circFOXO3 interacts with ID1, E2F1, FAK and HIF1A in the cytoplasm, blocking their nuclear translocation, thus inhibiting their anti-stress and anti-senescence functions.
RNA-splicing regulators are of obvious relevance for circRNA biogenesis and modulation. Indeed, important insights came from the study of RBM20, a gene involved in the process of exon skipping in the heart. Of note, its mutation in patients causes a severe form of familial dilated cardiomyopathy. 72 CircRNA profiling by RNA-seq of human hearts allowed the identification of 80 circRNAs originating from the titin gene (TTN), a gene that is known to undergo highly complex alternative splicing. 73 A subset of these circRNAs are dynamically regulated in dilated cardiomyopathy and RBM20-null mice completely lack these titin circRNAs. Specifically, the loss of RBM20 affects only the circRNAs that originate from the I-band of titin. Thus, RBM20, by excluding specific exons from the pre-mRNA, might provide the substrate to form this class of titin circRNAs.
Another important RNA-binding protein that regulates pre-mRNA splicing is Quaking. The expression of the Quaking gene is down-regulated in the murine myocardium exposed to doxorubicin, and its deletion increases cardiomyocyte sensitivity to the treatment, while its overexpression blocks doxorubicin-induced apoptosis. 74 Of note, Quaking regulates the expression of specific circRNAs derived from a subset of genes, including titin, and the inhibition of titin-derived circRNA increases the susceptibility of cardiomyocytes to doxorubicin.

| Endothelium and angiogenesis
A landscape of the circRNAs expressed in vascular cells has started to be defined. 61 Whole transcriptome analysis of circRNAs in endothelial cells exposed to low oxygen tension was performed by Boeckel et al. 75 Among the hypoxia-induced circRNAs identified, silencing of circZNF292 reduced tube formation and spheroid sprouting of endothelial cells in vitro. Moreover, circZNF609 was identified among the most abundant circRNAs in endothelial cells. 75

| Atherosclerosis
CircRNA studies in atherosclerosis patients are dominated by circANRIL. This is an antisense circRNA generated by the 9p21 locus, whose SNPs have been linked in GWAS studies to atherosclerotic vascular disease, as well as to type 2 diabetes mellitus and other diseases. 80,81 The predominant circANRIL isoform consists of exons 5, 6 and 7, and circANRIL is expressed in both healthy and diseased human vascular tissues, as well as smooth muscle cells and monocyte/macrophages, which all play an important role in atherogenesis. 82 Interestingly, in these cells and tissues, the abundance of the circular species largely exceeds that of their linear counterpart.
CARRARA ET AL. | 5183
Carriers of the coronary artery disease-protective haplotype at 9p21 show increased expression of circANRIL and decreased linANRIL in peripheral blood mononuclear cells; linear regression analysis indicates that patients with high circANRIL expression develop less coronary artery disease, and the highest circular/linear ANRIL ratios are found in disease-free patients. 82 Mechanistically, circANRIL is able to inhibit ribosome biogenesis, triggering the activation of the p53 pathway and thus increasing apoptosis and reducing proliferation. This, in turn, induces atheroprotection by reducing the proliferation of the cells within the plaque. 82 These events are mediated by the binding of circANRIL to Pescadillo homologue 1 (PES1), an essential 60S-pre-ribosomal assembly factor, preventing rRNA maturation.
In an independent study in a rat model of atherosclerosis, reduced circANRIL has been correlated with decreased coronary atherosclerosis and reduced apoptosis and inflammatory factors expression, as well as lower endothelial damage. 83 While further studies are necessary, this apparent contradiction between human and rat data highlights the importance of a close attention to the techniques used to modulate circANRIL in vivo and to the targeted cells (endothelial cells vs smooth muscle and macrophages).

| Stroke
Microarrays on HT22 mouse neuronal cells exposed to oxygen-glu- Accordingly, circDLGAP4 overexpression significantly inhibits endothelial-mesenchymal transition and protects the blood-brain barrier integrity in the mouse stroke model.

| Biomarkers in cardiovascular diseases
As previously stated, circRNAs possess strong biomarker potential and many studies aim to characterize new markers for risk stratification and early detection of diseases. 85 CircRNA MICRA associates with heart failure after acute myocardial infarction. 86,87 It allows prediction of the development of heart failure and enables risk stratification. Of note, MICRA also originates from the ZNF609 locus, although different exons are involved compared to circZNF609 described in the vasculature, 75,76 as well as in muscle and neuronal cells. 10,77 In the plasma of coronary heart disease patients, microarrays detected 24 differentially expressed circRNAs. Bioinformatics analysis generated an interaction network mediated by hsa-miR-130a that involves TRPM3 and 9 circRNAs. 88 The 9 circRNAs are described as sponges for miR-130a and therefore able to indirectly cause the upregulation of TRPM3.
Finally, atherosclerotic plaque rupture is accompanied by an acute decrease in the carotid plaque expression of miR-221. 89 CircR-284 is a potential inhibitor of miR-221 activity, and the serum circR-284/miR-221 ratio was shown to be a potential diagnostic biomarker of carotid plaque rupture and stroke.

| CONCLUSIONS
The attention of the scientific community to circRNAs is growing

CONFLICT OF INTEREST
The authors state that there are no conflicts of interest.