Published in: BMC Medical Informatics and Decision Making 3/2020

Open Access 01-07-2020 | Research

A machine learning framework for accurately recognizing circular RNAs for clinical decision-supporting

Authors: Yidan Wang, Xuanping Zhang, Tao Wang, Jinchun Xing, Zhun Wu, Wei Li, Jiayin Wang

Published in: BMC Medical Informatics and Decision Making | Special Issue 3/2020

Abstract

Background

Circular RNAs (circRNAs) are RNA molecules that lack poly(A) tails and form a closed-loop structure. Recent studies have emphasized that some circRNAs perform functions different from those of canonical transcripts and are associated with complex diseases. Several computational methods have been developed for detecting circRNAs from RNA-seq data. However, the existing methods favor high-sensitivity strategies, which introduce many false positives. Thus, a clinical decision-support system needs a comprehensive filtering approach that accurately recognizes real circRNAs for its decision models.

Methods

In this paper, we first review the detection strategies of the existing methods. Using features extracted from RNA-seq data, we show that no single feature (data signal) selected by the existing strategies can accurately distinguish a circRNA. However, we find that some combinations of those features can serve as signatures for recognizing circRNAs. To avoid the high computational complexity of the resulting combinatorial optimization problem, we present CIRCPlus2, which adopts a machine learning framework to recognize real circRNAs from multiple data signals captured from RNA-seq data. After comparing multiple machine learning frameworks, we adopted a Gradient Boosting Decision Tree (GBDT) framework for CIRCPlus2.
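The idea that a combination of signals can succeed where any single signal fails can be illustrated with a small, self-contained sketch. It is not the CIRCPlus2 implementation: the two feature names (back-splice junction reads, paired-end support), the eight toy candidates, the squared-error loss, and the use of one-split trees (stumps) are all assumptions made for illustration only.

```python
from typing import List, Tuple

Stump = Tuple[int, float, float, float]  # feature index, threshold, left value, right value

# Hypothetical toy candidates: (back-splice junction reads, paired-end support).
X = [(1, 1), (2, 2), (1, 9), (2, 8), (9, 1), (8, 2), (8, 9), (9, 8)]
# 1 = real circRNA, 0 = false positive; only candidates strong in BOTH signals are real.
y = [0, 0, 0, 0, 0, 0, 1, 1]


def best_single_feature_accuracy(X, y) -> float:
    """Best accuracy reachable by thresholding one feature on its own."""
    best = 0.0
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for positive_above in (True, False):
                preds = [(row[j] > t) == positive_above for row in X]
                acc = sum(p == bool(l) for p, l in zip(preds, y)) / len(y)
                best = max(best, acc)
    return best


def fit_stump(X, residuals) -> Stump:
    """Pick the one-feature split that minimizes squared error on the residuals.

    Assumes at least one feature takes two distinct values.
    """
    best = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lmean, rmean = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lmean) ** 2 for r in left) + sum((r - rmean) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, j, t, lmean, rmean)
    return best[1:]


def fit_gbdt(X, y, n_rounds: int = 50, lr: float = 0.5):
    """Gradient boosting with squared loss: each stump fits the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps: List[Stump] = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        j, t, lv, rv = fit_stump(X, residuals)
        stumps.append((j, t, lv, rv))
        pred = [p + lr * (lv if row[j] <= t else rv) for row, p in zip(X, pred)]
    return base, lr, stumps


def predict(model, row) -> bool:
    """Sum the shrunken stump outputs and call the candidate real above 0.5."""
    base, lr, stumps = model
    score = base + sum(lr * (lv if row[j] <= t else rv) for j, t, lv, rv in stumps)
    return score > 0.5


model = fit_gbdt(X, y)
predictions = [predict(model, row) for row in X]
print("best single-feature accuracy:", best_single_feature_accuracy(X, y))
print("boosted-stump predictions:", predictions)
```

On this toy data no single threshold exceeds 75% accuracy, whereas the boosted stumps label all eight candidates correctly. A production pipeline would more plausibly rely on an established GBDT library and a held-out test set rather than this hand-rolled version.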

Results

Given a set of candidate circRNAs reported by any existing detection tool(s), the features of each candidate are extracted from the aligned reads. The GBDT framework is trained on a training dataset. Applying the trained framework to the selected features yields a prediction of true or false positive for each candidate. To verify the performance of the proposed approach, we conducted several groups of experiments on both real RNA-seq datasets and a series of simulated datasets with different preset configurations. The results demonstrate that CIRCPlus2 clearly improves specificity while maintaining high sensitivity.
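The two metrics named above can be computed as in the following sketch. The function name and the toy labels are hypothetical and do not reproduce any result from the paper; they only show how filtering trades a little sensitivity for a large gain in specificity.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(truth: Sequence[bool], predicted: Sequence[bool]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for per-candidate real/false calls."""
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    tn = sum(1 for t, p in zip(truth, predicted) if not t and not p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one real and one false candidate")
    return tp / (tp + fn), tn / (tn + fp)


# Made-up ground truth for eight candidates: three real circRNAs, five false positives.
truth = [True, True, True, False, False, False, False, False]
# A high-sensitivity detector alone: every reported candidate is accepted as real.
unfiltered = [True] * 8
# After a hypothetical filtering step: one real circRNA lost, four false positives removed.
filtered = [True, True, False, False, False, False, False, True]

print("unfiltered:", sensitivity_specificity(truth, unfiltered))
print("filtered:  ", sensitivity_specificity(truth, filtered))
```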

Conclusions

Filtering false positives is important in RNA-seq data analysis pipelines. A machine learning framework is well suited to this filtering problem. CIRCPlus2 is an efficient approach for distinguishing false-positive circRNAs from real ones.
Metadata
Publication date
01-07-2020
Publisher
BioMed Central
DOI
https://doi.org/10.1186/s12911-020-1117-0
