iCRBP-LKHA: Large convolutional kernel and hybrid channel-spatial attention for identifying circRNA-RBP interaction sites

Circular RNAs (circRNAs) play vital roles in transcription and translation. Identification of circRNA-RBP (RNA-binding protein) interaction sites has become a fundamental step in molecular and cell biology. Deep learning (DL)-based methods have been proposed to predict circRNA-RBP interaction sites and have achieved impressive identification performance. However, these methods cannot effectively capture long-distance dependencies, nor can they effectively utilize the interaction information of multiple features. To overcome these limitations, we propose a DL-based model, iCRBP-LKHA, which uses deep hybrid networks to identify circRNA-RBP interaction sites. iCRBP-LKHA adopts five encoding schemes. Meanwhile, its neural network architecture, which consists of a large kernel convolutional neural network (LKCNN), a convolutional block attention module with one-dimensional convolution (CBAM-1D) and a bidirectional gated recurrent unit (BiGRU), can automatically explore local information, global context information and multiple features interaction information. To verify the effectiveness of iCRBP-LKHA, we compared its performance with shallow learning algorithms on 37 circRNAs datasets and 37 circRNAs stringent datasets. We also compared its performance with state-of-the-art DL-based methods on 37 circRNAs datasets, 37 circRNAs stringent datasets and 31 linear RNAs datasets. The experimental results not only show that iCRBP-LKHA outperforms other competing methods, but also demonstrate the potential of this model for identifying other RNA-RBP interaction sites.


Introduction
Circular RNAs (circRNAs) are a large class of non-coding RNAs that exist ubiquitously in many species [1,2] and are characterized by stable structure and highly tissue-specific expression [3,4]. CircRNAs affect transcription and translation processes by acting as transcriptional regulators and microRNA (miRNA) sponges, and by interacting with RNA-binding proteins (RBPs) [5]. The interaction with RBPs is one of the main activities of circRNAs. CircRNAs participate in the occurrence and development of diseases by interacting with RBPs. For example, circCwc27 plays a critical role in Alzheimer's disease pathogenesis by binding the purine-rich element-binding protein A (Pur-α) [6]. The interaction of circFndc3b and the RBP FUS improves the functional reconstruction of the myocardium after infarction [7]. Identifying circRNA-RBP interaction sites has become a fundamental step in exploring the role of circRNAs in the occurrence and progression of diseases [8-12].
Since high-throughput sequencing technology is expensive and time-consuming, researchers have proposed many computational methods to predict circRNA-RBP interaction sites [13-15]. Recently, many DL-based methods have achieved remarkable results in predicting circRNA-RBP interaction sites. For example, CRIP [16] used a stacked codon-based encoding scheme and a hybrid deep learning architecture incorporating CNN and LSTM to predict circRNA-RBP interaction sites. CircSLNN [17] predicted circRNA-RBP interaction sites by combining CNN, LSTM and a conditional random field (CRF). PASSION [18] selected an optimal feature subset from six feature encoding schemes using the XGBoost algorithm, then applied CNN and BiLSTM to identify the interactions between circRNAs and RBPs. iCircRBP-DHN [19] proposed a novel encoding scheme, CircRNA2Vec, and used a deep multi-scale residual network (MSRN) and self-attention BiGRUs to predict circRNA-RBP interaction sites. Inspired by iCircRBP-DHN, CRBPDL identified circRNA-RBP interaction sites by introducing five feature encoding schemes and the AdaBoost algorithm [20]. ASCRB used five feature encoding schemes and channel attention mechanisms to identify circRNA-RBP interaction sites [21]. These methods have achieved impressive results in predicting circRNA-RBP interaction sites. Nevertheless, they still have several limitations. For long circRNA nucleotide sequences, traditional CNNs or LSTMs cannot effectively capture long-distance dependencies (relationships between non-adjacent nucleotides in a circRNA). Furthermore, existing methods fail to effectively utilize the interaction information of multiple features, and insufficient consideration of interaction information leads to biased circRNA-RBP interaction predictions.
To overcome these limitations, we propose iCRBP-LKHA, based on a large convolutional kernel and hybrid channel-spatial attention, for identifying circRNA-RBP interaction sites. iCRBP-LKHA adopts five sequence encoding schemes, including k-nucleotide frequency (KNF), Doc2Vec, electron-ion interaction pseudopotential (EIIP) [22], chemical characteristic of nucleotide (CCN) and accumulated nucleotide frequency (ANF), to extract comprehensive feature information. Subsequently, a large kernel convolutional neural network (LKCNN) is applied to capture long-distance dependencies and update the feature maps [23]. Then, the updated feature maps are fed to a modified hybrid channel-spatial attention module, CBAM-1D (convolutional block attention module with one-dimensional (1D) convolution) [24], which focuses on important features and multiple features interaction information while suppressing unnecessary features. Finally, the refined features are fed to a bidirectional gated recurrent unit (BiGRU) network to identify circRNA-RBP interaction sites [25,26]. The schematic overview of iCRBP-LKHA is shown in Fig 1. To verify the effectiveness and generalizability of iCRBP-LKHA, we compared the performance of iCRBP-LKHA with state-of-the-art methods on 37 circRNAs and 31 linear RNAs datasets, respectively. Experimental results show that iCRBP-LKHA outperforms other competing methods. Moreover, we observe that iCRBP-LKHA can accurately identify linear RNA-RBP binding sites.

Model performance under different network layers
The performance of a neural network depends heavily on its architecture, especially the network depth. Compared with shallow neural networks, deep neural networks exhibit a stronger ability to extract features and learn complex representations. However, too many layers can lead to overfitting and reduce model performance. In this section, we analyze the impact of network depth by reducing or increasing the number of convolutional blocks in LKCNN. We add two 1x1 convolutional layers to, or remove two 1x1 convolutional layers from, iCRBP-LKHA; the modified models are called iCRBP-LKHA+2 and iCRBP-LKHA-2, respectively.

Model performance under different feature encoding schemes
The performance of neural networks is affected by the feature encoding scheme. To evaluate the contribution of the feature encoding scheme we used (named Fea-iCRBP-LKHA), under the same neural network architecture as iCRBP-LKHA, we replaced the original feature encoding scheme with the encoding scheme of PASSION (named Fea-PASSION) [18] and the encoding scheme of CRIP (named Fea-CRIP) [16], which are two widely used encoding schemes. The AUC line graphs of the three methods on the 37 circRNAs datasets are shown in Fig 2B. The AUC values of the three methods on the 37 circRNAs datasets are listed in S2 Table.
As shown in Fig 2B, Fea-iCRBP-LKHA performs better than Fea-PASSION and Fea-CRIP on all datasets. As shown in S2 Table, the average AUC of Fea-iCRBP-LKHA is 0.9423, which is higher than Fea-PASSION's 0.8844 and Fea-CRIP's 0.8772. The experimental results clearly demonstrate the effectiveness of the adopted feature encoding scheme.

Contributions of different encoding schemes
To evaluate the contribution of each encoding scheme relative to all five encoding schemes together, we conducted leave-one-encoding-out experiments on the 37 circRNAs datasets. We trained iCRBP-LKHA models using only four of the encoding schemes with the same hyperparameters and compared their performance with models trained with all encoding schemes together.
As shown in Fig 3A, the models suffer a performance drop whenever one encoding scheme is left out. Among the five encoding schemes, ANF is the most important and CCN is the second most important. The results demonstrate the effectiveness of using the five encoding schemes together. The detailed results were recorded in S3 Table.

Performance of neural network architecture in iCRBP-LKHA
To evaluate the performance of the neural network architecture in iCRBP-LKHA, the same five features (see Section Feature encoding in Materials and methods) were fed to six CNN-based methods, and the performance of these methods was compared with iCRBP-LKHA. These methods include iDeepE [27], ResNet [28], CRIP [16] and CRBPDL [20].

Comparison with traditional machine learning methods
In this section, we compared iCRBP-LKHA with SVM (Support Vector Machine) [29], Random Forest (RF) [30], XGBoost [31], LightGBM [32] and Rotation Forest [33] to test its prediction performance. We used the same feature sets and applied PCA (Principal Component Analysis) [34] for dimensionality reduction before applying these shallow learning algorithms. We implemented these methods and trained and evaluated them on the 37 circRNAs datasets and the 37 circRNAs stringent datasets. The detailed parameters of these shallow learning algorithms are presented in Table 1. All experiments were done on an NVIDIA RTX 3090 GPU with 24 GB VRAM, and the evaluation metrics are AUC, ACC, F1 and MCC. The results are shown in Fig 4. The AUCs on the 37 circRNAs datasets are listed in Table 2, and the ACC, F1 and MCC on the 37 circRNAs datasets are listed in S6, S7 and S8 Tables, respectively. The AUC, ACC, F1 and MCC on the 37 circRNAs stringent datasets are listed in S9, S10, S11 and S12 Tables, respectively.
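As a concrete illustration, the shallow-learning baseline pipeline (PCA for dimensionality reduction followed by a classifier) can be sketched with scikit-learn. The toy data, component count and SVM parameters below are placeholders for illustration only, not the feature sets or tuned parameters used in the paper.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Toy stand-in for encoded circRNA fragments: 200 samples, 50 features.
X = rng.normal(size=(200, 50))
X[:, :2] *= 5.0                              # make two features carry the signal
y = (X[:, 0] + X[:, 1] > 0).astype(int)      # separable toy labels

# Reduce to 10 principal components, then fit an RBF-kernel SVM.
model = make_pipeline(PCA(n_components=10), SVC(kernel="rbf", C=1.0))
model.fit(X, y)
train_acc = model.score(X, y)
```

The same pipeline shape would apply to the other shallow learners by swapping `SVC` for the corresponding estimator.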

The generalizability performance of methods
To evaluate the generalizability of the methods, we trained these models (iCRBP-LKHA, ASCRB, iCircRBP-DHN, PASSION, CRIP, CSCRites, CircSLNN and CRBPDL) on one dataset and tested their capabilities on another dataset. We constructed a training dataset and an independent testing dataset from the 37 circRNAs datasets. The training dataset consists of 26 circRNAs datasets with 537,698 samples (268,849 positive samples and 268,849 negative samples), which is about 80% of the total number of samples. The testing dataset includes the remaining 11 datasets (67,127 positive samples and 67,127 negative samples). The circRNA names in the training and testing datasets were listed in S16 Table.
As shown in Table 4, in terms of four evaluation metrics, iCRBP-LKHA outperforms other competing methods.The results show that iCRBP-LKHA has excellent generalization capacity.

The prediction performance on 31 linear RNAs datasets
CircRNA-RBP interaction identification methods are generally able to identify linear RNA-RBP interaction sites. To assess the effectiveness of iCRBP-LKHA in identifying linear RNA-RBP interaction sites, we compared iCRBP-LKHA with ASCRB [21], CRBPDL [20], iCircRBP-DHN [19], CRIP [16], CSCRites [35], CircSLNN [17] and iDeepS [36] on 31 benchmark datasets of linear RNAs. iDeepS is a linear RNA-RBP interaction prediction method that integrates both sequence and secondary structure information. The results are shown in Fig 7A-7D. The ROC curves of iCRBP-LKHA are shown in Fig 7E. The AUCs are listed in Table 6, where the last row is the average AUC. The ACC, F1 and MCC are listed in S20, S21 and S22 Tables, respectively.
As shown in Table 6, the average AUCs of iCRBP-LKHA, ASCRB, iCircRBP-DHN, CRIP, CRBPDL, iDeepS, CSCRites and CircSLNN are 0.9374, 0.9393, 0.895, 0.860, 0.9163, 0.842, 0.833 and 0.803, respectively. On the AGO1234 dataset, the AUC of ASCRB is higher than that of iCRBP-LKHA, but on the remaining 30 datasets, the average AUC of iCRBP-LKHA is higher than that of ASCRB (0.9412 vs. 0.9395). On 24 of the 31 datasets, our model iCRBP-LKHA achieved the highest AUC value, improving on the performance of state-of-the-art prediction methods. On the hnRNPC-1 and TDP-43 datasets, iCRBP-LKHA is slightly worse than ASCRB. In terms of AUC, iCRBP-LKHA outperforms the other competing methods overall, and the same was found for ACC, F1 and MCC. The above results indicate that iCRBP-LKHA is better than competing methods at predicting linear RNA-RBP interaction sites.

Discussion
In this paper, we proposed a novel DL-based model, iCRBP-LKHA, based on a large convolutional kernel and hybrid channel-spatial attention, for identifying circRNA-RBP interaction sites. To effectively extract features from sequences, we adopted five encoding schemes, including KNF, Doc2Vec, EIIP, CCN and ANF, to extract comprehensive feature information. Meanwhile, the neural network architecture, which consists of LKCNN, CBAM-1D and BiGRU, was proposed to explore local information, global context information and multiple features interaction information automatically. By integrating multiple sources of information, iCRBP-LKHA improved model performance compared with several state-of-the-art methods. The experimental results on the 37 circRNAs datasets, 37 circRNAs stringent datasets and 31 linear RNAs datasets not only demonstrate the effectiveness of iCRBP-LKHA but also demonstrate the potential of this model for identifying other RNA-RBP interaction sites.

Data preparation
To evaluate the prediction performance of iCRBP-LKHA, we worked with the 37 circRNAs datasets (https://github.com/kavin525zhang/CRIP) that have been widely used by DL-based algorithms for benchmarking [16,18-20]. CD-HIT was used to eliminate redundant sequences with a sequence identity threshold of 80% [37]. A total of 32,216 circRNAs were obtained from the 37 circRNAs datasets. The wet-lab-verified interaction sites were treated as positive samples, and negative samples of equal size were randomly selected from the remaining fragments. 335,976 positive samples and 335,976 negative samples were used to evaluate the model performance. To observe the performance of the method on a more stringent dataset, CD-HIT was used to eliminate redundant sequences with a sequence identity threshold of 60%, resulting in a total of 139,293 positive samples and 139,293 negative samples (https://github.com/nathanyl/iCRBP-LKHA). This dataset was named the 37 circRNAs stringent datasets. The number of samples in the 37 circRNAs datasets and the 37 circRNAs stringent datasets were listed in Table 7.
80% of the samples were randomly selected as the training set, and the remaining 20% were used as the testing set. 10-fold cross validation was applied to optimize the parameters. Additionally, we compared the performance of iCRBP-LKHA with state-of-the-art linear RNA-RBP interaction site identification methods. The benchmark human datasets of 31 linear RNAs were collected by iONMF [38] and downloaded from https://github.com/mstrazar/ionmf. Each dataset consists of 5000 training samples and 1000 testing samples.

The framework of iCRBP-LKHA

Traditional CNNs usually use small convolutional kernels, such as 1x1, 3x3 and 5x5. However, small convolutional kernels may not effectively capture long-distance dependencies in sequence data. Compared with small convolutional kernels, large convolutional kernels can increase the effective receptive field (ERF) [39] by increasing the kernel width and height, thereby better capturing long-distance dependencies [23]. Traditional attention mechanisms are usually implemented by learning weights, such as using the softmax function. In contrast, a hybrid attention mechanism can simultaneously consider multiple attention mechanisms and their combinations to obtain more comprehensive feature information from the data [40].
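The receptive-field argument can be made concrete with a small bookkeeping function: the (theoretical) receptive field of a stacked network grows by (kernel_size - 1) times the cumulative stride at each layer, so a single wide kernel widens the receptive field faster than stacking narrow ones. This is a generic sketch, not code from the iCRBP-LKHA implementation.

```python
def receptive_field(layers):
    """Theoretical receptive field of a stack of 1D conv/pool layers.

    `layers` is a list of (kernel_size, stride) pairs; the receptive field
    grows by (kernel_size - 1) * cumulative_stride at each layer.
    """
    rf, jump = 1, 1
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf

# Two stacked 3-wide convolutions see 5 input positions...
small = receptive_field([(3, 1), (3, 1)])  # -> 5
# ...while a single 7-wide kernel sees 7 positions in one layer.
large = receptive_field([(7, 1)])          # -> 7
```

Note that strided pooling compounds the effect: inserting a stride-2 pooling layer between convolutions doubles how much each subsequent kernel tap widens the field.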
CBAM is a simple yet effective attention module [24].
Inspired by large convolutional kernels and the hybrid attention mechanism, we designed a novel DL-based method, named iCRBP-LKHA, for predicting circRNA-RBP interaction sites. As shown in Fig 1, iCRBP-LKHA adopts five encoding schemes. The neural network architecture of iCRBP-LKHA mainly includes LKCNN, CBAM-1D and BiGRU.

Feature encoding
In this section, all fragments are encoded into five different types of features, including KNF, Doc2Vec, EIIP, CCN and ANF.These encoding schemes can extract various feature information from sequences.

k-nucleotide frequency (KNF)
KNF was used to extract local contextual features from circRNA sequences. KNF describes the frequency of all possible polynucleotides of k nucleotides occurring in the sequence. KNF integrates various local sequence information while preserving a large amount of original sequence information [41]. Compared with traditional one-hot encoding [16], KNF can retain more effective information from the sequence. The k-nucleotides refer to all of a sequence's subsequences of length k; for example, the sequence ACGT has four 1-nucleotides (A, C, G and T), three 2-nucleotides (AC, CG and GT), two 3-nucleotides (ACG and CGT) and one 4-nucleotide (ACGT). In this paper, we set k = 1, 2, 3, which are called single-nucleotide, dinucleotide and trinucleotide composition frequency, respectively. A sequence of length L has L-k+1 k-nucleotides, and there are 4^k possible k-nucleotides in total.
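A minimal sketch of KNF encoding follows; the lexicographic k-mer ordering and normalization by the L-k+1 k-mer count are one reasonable choice, not necessarily the exact implementation used in the paper.

```python
from itertools import product

def knf(sequence, k):
    """k-nucleotide frequency vector over all 4**k possible k-mers."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = len(sequence) - k + 1  # a length-L sequence has L-k+1 k-mers
    for i in range(total):
        counts[sequence[i:i + k]] += 1
    return [counts[m] / total for m in kmers]

freq1 = knf("ACGT", 1)  # each nucleotide occurs once out of four positions
```

Concatenating the vectors for k = 1, 2, 3 gives a feature vector of length 4 + 16 + 64 = 84 per sequence.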

Doc2Vec
In order to extract more sequence context and higher-order biological information, a continuous high-dimensional word embedding method, Doc2Vec, was used to vectorize the sequences and train the vectorization model [42]. The 10-mer sequence fragments were input into the model, and the feature vectors were obtained through word embedding training. Doc2Vec captures the continuous distribution of global contextual features and semantic information to model long-term dependencies in sequences.
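As a sketch, the 10-mer tokenization that feeds Doc2Vec can be written as follows. The function name is illustrative, and the actual embedding training (e.g. with gensim's Doc2Vec on the resulting token lists) is omitted here.

```python
def to_kmers(sequence, k=10):
    """Split a sequence into overlapping k-mer 'words' for Doc2Vec training."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]

# Each fragment becomes a document of overlapping 10-mer tokens; a library
# such as gensim's Doc2Vec could then be trained on these token lists to
# obtain a fixed-length embedding per fragment (training step not shown).
tokens = to_kmers("ACGTACGTACGTA", k=10)
```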

Electron-ion interaction pseudopotential (EIIP)
EIIP characterizes the free electron energy of nucleotides. This free electron energy is considered to be related to binding site interactions [20,22]. The EIIP values of nucleotides A, T, G and C are 0.1260, 0.1335, 0.0806 and 0.1340, respectively. For example, TACCGAA is encoded as the numeric vector (0.1335, 0.1260, 0.1340, 0.1340, 0.0806, 0.1260, 0.1260). We used the EIIP encoding method to encode the sequences into numeric vectors.
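The EIIP encoding is a direct per-nucleotide lookup, which can be sketched as:

```python
# EIIP value per nucleotide, as given in the text.
EIIP = {"A": 0.1260, "T": 0.1335, "G": 0.0806, "C": 0.1340}

def eiip_encode(sequence):
    """Map each nucleotide to its electron-ion interaction pseudopotential."""
    return [EIIP[base] for base in sequence]

vec = eiip_encode("TACCGAA")  # reproduces the worked example above
```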

Chemical characteristic of nucleotide (CCN)
Each nucleotide has three chemical characteristics (CCN): ring structure, chemical function and hydrogen bonding. Research shows that these three chemical features are related to binding site interactions [43]. For ring structure, A and G are coded as 1, while C and T are coded as 0. For chemical function, A and C are coded as 1, while G and T are coded as 0. For hydrogen bonding, A and T are coded as 1, while C and G are coded as 0. For example, applying these rules per nucleotide, GTACCGA is encoded as (1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1).
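The CCN lookup table follows directly from the three coding rules above; `ccn_encode` is an illustrative name, not necessarily the paper's implementation.

```python
def ccn_encode(sequence):
    """Encode each nucleotide as (ring structure, chemical function, hydrogen bond)."""
    code = {
        "A": (1, 1, 1),  # purine ring, amino group, weak (A-T) hydrogen bonding
        "G": (1, 0, 0),  # purine ring, keto group, strong (C-G) hydrogen bonding
        "C": (0, 1, 0),  # pyrimidine ring, amino group, strong hydrogen bonding
        "T": (0, 0, 1),  # pyrimidine ring, keto group, weak hydrogen bonding
    }
    out = []
    for base in sequence:
        out.extend(code[base])
    return out

vec = ccn_encode("GTACCGA")  # 7 nucleotides x 3 features = 21 values
```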

Accumulated nucleotide frequency (ANF)
ANF describes the occurrence frequency of the i-th nucleotide within the fragment composed of the first i nucleotides and is widely used to represent the density feature of nucleotide sequences [44]. ANF can be used to identify sequence features [45]. The density d_i of the nucleotide s_i at position i is defined as d_i = (1/|S_i|) Σ_{j=1}^{i} f(s_j), where f(s_j) = 1 if s_j = q and 0 otherwise, |S_i| = i is the length of the i-th prefix string {s_1, s_2, ..., s_i} of the sequence, q ∈ {A, C, G, T} is the nucleotide at position i, and i ranges from 1 to the sequence length L.
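The definition above reduces to counting how often the current nucleotide has appeared in the prefix ending at its position, divided by the prefix length, as this sketch shows:

```python
def anf_encode(sequence):
    """Accumulated nucleotide frequency: d_i = (count of s_i in s_1..s_i) / i."""
    densities = []
    for i, base in enumerate(sequence, start=1):
        densities.append(sequence[:i].count(base) / i)
    return densities

vec = anf_encode("AACG")
# Position 1: A seen 1/1; position 2: A seen 2/2; position 3: C seen 1/3;
# position 4: G seen 1/4.
```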

Large kernel convolutional neural network (LKCNN)
Compared with small convolutional kernels, large convolutional kernels can increase the ERF and capture more complex patterns and nonlinear relationships, thereby improving the performance of neural networks [23,46]. In this paper, for the five feature matrices obtained by the five encoding schemes, a large kernel CNN was used to reparameterize the feature matrices to help the downstream feature extraction task. Since the distributions of the five original feature matrices are different, we first applied 128 1D convolutional filters with kernel size 3 to the original feature matrices to obtain five feature matrices of the same size. The five feature matrices were concatenated to form a new feature map. This feature map was fed to a 1x1 convolutional layer with 512 kernels, followed by a 2x2 average pooling operation with a stride of 2. Subsequently, we used a 1x1 convolutional layer with 256 kernels, followed by a batch normalization (BN) operation. After that, we used a convolutional layer with 256 7x7 kernels, and then a 5x5 convolutional layer with 256 kernels followed by a max pooling operation. Finally, a 3x3 convolutional layer with 128 kernels and a max pooling operation was applied, and the resulting feature map was fed to a 1x1 convolutional layer with 128 kernels.
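A hedged PyTorch sketch of this layer stack follows, reading the paper's 1x1/7x7/5x5/3x3 kernels as 1D kernel sizes 1/7/5/3 on (batch, channels, length) tensors. The paddings and the per-scheme input channel counts are illustrative assumptions, not the published configuration.

```python
import torch
import torch.nn as nn

class LKCNN(nn.Module):
    """Sketch of the LKCNN stack described in the text (shapes assumed)."""

    def __init__(self, in_channels_per_scheme=(4, 4, 4, 4, 4)):
        super().__init__()
        # One kernel-size-3 stem per encoding scheme, each producing 128 channels.
        self.stems = nn.ModuleList(
            [nn.Conv1d(c, 128, kernel_size=3, padding=1) for c in in_channels_per_scheme]
        )
        self.body = nn.Sequential(
            nn.Conv1d(128 * 5, 512, kernel_size=1),
            nn.AvgPool1d(kernel_size=2, stride=2),
            nn.Conv1d(512, 256, kernel_size=1),
            nn.BatchNorm1d(256),
            nn.Conv1d(256, 256, kernel_size=7, padding=3),  # large kernel
            nn.Conv1d(256, 256, kernel_size=5, padding=2),
            nn.MaxPool1d(2),
            nn.Conv1d(256, 128, kernel_size=3, padding=1),
            nn.MaxPool1d(2),
            nn.Conv1d(128, 128, kernel_size=1),
        )

    def forward(self, features):
        # features: list of five (batch, C_i, L) tensors, one per encoding scheme.
        stems = [stem(f) for stem, f in zip(self.stems, features)]
        return self.body(torch.cat(stems, dim=1))

model = LKCNN()
x = [torch.randn(2, 4, 64) for _ in range(5)]
out = model(x)  # length 64 is halved by each of the three pooling stages
```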

Convolutional block attention module with one-dimensional convolution (CBAM-1D)
The attention mechanism is a widely used method for improving the feature representation of a model [47]. Inspired by CBAM, we propose CBAM-1D to extract the key information of the feature matrices and the correlation information among the five features. The CBAM-1D module generates attention maps in both the channel and spatial dimensions, and the two attention maps are multiplied with the original feature map for adaptive feature refinement to generate the final feature map. CBAM-1D focuses on important features and suppresses the influence of noisy data and irrelevant information.
CBAM-1D consists of two modules, a 1D-channel attention module and a spatial attention module. In the channel attention module, the feature map first passes through global max pooling and 1D-global average pooling respectively, and each result then passes through a shared multilayer perceptron (MLP). 1D-global average pooling means first performing a 1D convolution operation and then performing global average pooling, which can improve the feature representation ability of the model. The two outputs are merged by element-wise summation and passed through a ReLU to generate the final channel attention map. Finally, the channel attention map and the input feature map are element-wise multiplied to generate the feature map required by the spatial attention module.
In the spatial attention module, the feature map first passes through global max pooling and global average pooling along the channel dimension, and the two results are concatenated. After a convolution operation with kernel size 7x7, the dimension is reduced to one channel. Next, the spatial attention map is generated through a sigmoid function. Finally, the spatial attention map and the feature map are multiplied to obtain the final feature.
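The two attention stages can be sketched in NumPy as follows. This is a simplified illustration: the learned MLP and convolution weights are replaced with random placeholders, and the extra 1D convolution before global average pooling is omitted, so it shows the data flow rather than the trained module.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cbam_1d(x, rng=np.random.default_rng(0), reduction=4):
    """Simplified CBAM-style refinement of a (channels, length) feature map."""
    c, length = x.shape
    # Channel attention: pool over length, shared two-layer MLP, sum, ReLU gate.
    w1 = rng.normal(size=(c // reduction, c))
    w2 = rng.normal(size=(c, c // reduction))
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0.0)
    channel_att = np.maximum(mlp(x.max(axis=1)) + mlp(x.mean(axis=1)), 0.0)
    x = x * channel_att[:, None]
    # Spatial attention: pool over channels, width-7 conv to one channel, sigmoid.
    pooled = np.stack([x.max(axis=0), x.mean(axis=0)])  # (2, length)
    kernel = rng.normal(size=(2, 7))
    padded = np.pad(pooled, ((0, 0), (3, 3)))
    spatial = np.array(
        [np.sum(padded[:, i:i + 7] * kernel) for i in range(length)]
    )
    return x * sigmoid(spatial)[None, :]

out = cbam_1d(np.random.default_rng(1).normal(size=(8, 16)))
```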

Bidirectional gated recurrent unit (BiGRU)
In this section, BiGRU was used to extract important information from the sequence [48]. A GRU has two gates: a reset gate and an update gate. The reset gate enables the model to ignore previous state information, while the update gate allows the model to incorporate the previous state into the current state when processing the sequence. By updating previous state information into the current state, the model can capture important contextual information that contributes to the final prediction. In BiGRU, the hidden unit size is set to 128, the batch size is 1024, the learning rate is 0.003, and the dropout rate is set to 0.8.
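The gating described above can be illustrated with a single GRU step in NumPy. The weights are random placeholders and the gate-combination convention is one common form, so this is a sketch of the mechanism rather than the trained network.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, params):
    """One GRU step with the reset gate r and update gate z described above."""
    Wr, Ur, Wz, Uz, Wh, Uh = params
    r = sigmoid(Wr @ x + Ur @ h)              # reset gate: how much past to forget
    z = sigmoid(Wz @ x + Uz @ h)              # update gate: balances old vs. candidate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))  # candidate state from gated history
    return (1.0 - z) * h + z * h_tilde        # blend previous and candidate states

rng = np.random.default_rng(0)
d_in, d_h = 4, 8
params = [rng.normal(size=(d_h, d_in)) if i % 2 == 0 else rng.normal(size=(d_h, d_h))
          for i in range(6)]
h = np.zeros(d_h)
for x in rng.normal(size=(5, d_in)):  # forward pass over a 5-step toy sequence
    h = gru_step(x, h, params)
# A BiGRU runs a second GRU over the reversed sequence and concatenates
# the two hidden-state streams.
```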

Evaluation methods
In the experiment, AUC, ACC, F1 and MCC are used to evaluate the performance of these methods.
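Three of these metrics follow directly from confusion-matrix counts and can be computed as below (AUC is omitted because it requires ranked prediction scores rather than counts); these are the standard definitions, not code from the paper.

```python
import math

def metrics(tp, tn, fp, fn):
    """ACC, F1 and MCC from confusion-matrix counts."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return acc, f1, mcc

acc, f1, mcc = metrics(tp=50, tn=40, fp=10, fn=0)
```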
MCC = (TP × TN − FP × FN) / √((TP + FP)(TP + FN)(TN + FP)(TN + FN))

Funding: Project of Science and Technology of Guangxi (Grant no. 2021AB20147), Guangxi Natural Science Foundation (Grant nos. 2021JJA170204 & 2021JJA170199) and Guangxi Science and Technology Base and Talents Special Project (Grant nos. 2021AC19354 & 2021AC19394); the Natural Science Foundation of Ningbo City under Grant No. 2023J199; and the Key Research and Development (Digital Twin) Program of Ningbo City under Grant Nos. 2023Z219 and 2023Z226. CHZ is supported by the University Synergy Innovation Program of Anhui Province (No. GXXT-2021-039). LY is supported by the National Natural Science Foundation of China (No. 62002189), the Ability Improvement Project of Science and Technology SMEs in Shandong Province (2023TSGC0279), the Youth Innovation Team of Colleges and Universities in Shandong Province (2023KJ329) and the Qilu University of Technology (Shandong Academy of Sciences) Talent Scientific Research Project (No. 2023RCKY128). ZS is supported by the National Natural Science Foundation of China (No. 62102200). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Fig 1. Schematic overview of iCRBP-LKHA. The input circRNA sequence is encoded by five schemes: KNF, Doc2Vec, EIIP, CCN and ANF. Then, the concatenated features are fed to a deep neural network architecture formed by LKCNN, CBAM-1D and BiGRU to extract local information, global context information and multiple features interaction information. Finally, a flatten layer integrates the resulting information, followed by a fully connected layer with softmax for label classification. https://doi.org/10.1371/journal.pcbi.1012399.g001

Fig 2. (A) Performance comparison of different network depths in terms of the distribution of AUCs across the 37 circRNAs datasets. (B) Performance comparison of deep neural network architectures among multiple feature encoding schemes, visualized as a line graph. https://doi.org/10.1371/journal.pcbi.1012399.g002

Fig 3. (A) Performance comparison of different encoding scheme combinations in terms of the distribution of AUCs across the 37 circRNAs datasets. iCRBP-LKHA means using all five encoding schemes; iCRBP-LKHA/KNF means using the other four encoding schemes except KNF. (B) Determination of the suitable neural network architecture from multiple possible neural architectures and algorithms in terms of the distribution of AUCs, shown as a heatmap. (C) Performance comparison of different neural network architectures in terms of the distribution of AUCs in ablation experiments. iCRBP-LKHA means our proposed neural network architecture; (w/o)LKCNN means the neural network architecture without LKCNN; LKCNN->CNN means LKCNN is replaced by CNN. https://doi.org/10.1371/journal.pcbi.1012399.g003

A deep neural network architecture is proposed to extract important local and global information from the five encoding schemes. The model architecture shown in Fig 1 mainly consists of three parts, namely the LKCNN, CBAM-1D and BiGRU networks.
Performance comparison of iCRBP-LKHA, iCRBP-LKHA+2 and iCRBP-LKHA-2 on the 37 circRNAs datasets. Bold data represent the best AUC values of experimental results. The results were recorded in S4 Table. These results demonstrate the effectiveness of the neural network architecture of iCRBP-LKHA.

Table 2. Comparison of AUC between iCRBP-LKHA and five shallow learning algorithms on 37 circRNA datasets.
Bold data represent the best AUC values of experimental results.

Table 3. Comparison of prediction performance of different methods on 37 circRNA datasets.
Bold data represent the best AUC values of experimental results. The AUCs of the other seven methods are obtained from references [19-21].

Table 4. Performance comparison of different methods on independent datasets.
Bold data represent the best values of experimental results. https://doi.org/10.1371/journal.pcbi.1012399.t004

Table 5. Comparison of AUC of different methods on 37 circRNAs stringent datasets.
Bold data represent the best AUC values of experimental results.

Table 6. Comparison of prediction performance of different methods on 31 linear RNAs datasets.
Bold data represent the best AUC values of experimental results. The AUCs of the other seven methods are obtained from references [19-21].