Published in: BMC Medical Informatics and Decision Making | Special Issue 6/2019

Open Access 01-12-2019 | Research

MultiSourcDSim: an integrated approach for exploring disease similarity

Authors: Lei Deng, Danyi Ye, Junmin Zhao, Jingpu Zhang

Abstract

Background

Collections of disease-associated data help in studying the associations between diseases. Discovering closely related diseases plays a crucial role in revealing their common pathogenic mechanisms, which may in turn suggest treatments that can be transferred from one disease to another. Over the past decades, a number of approaches for calculating disease similarity have been developed. However, most of them draw on only one or a few data sources, which results in low accuracy.

Methods

In this paper, we propose a novel method, MultiSourcDSim, that calculates disease similarity by integrating multiple data sources: gene-disease associations, GO biological process-disease associations, and symptom-disease associations. First, we build three disease similarity networks, one from each of the three disease-related data sources. Second, we obtain a representation for each node by integrating the three disease similarity networks. Finally, the learned representations are used to calculate the similarity between diseases.
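
The three steps can be illustrated with a minimal sketch. The abstract does not specify the algorithms used at each step, so every choice here is an assumption made for illustration only: cosine similarity to build each network, random walk with restart to diffuse over it, truncated SVD to integrate the three networks into node representations, and cosine similarity again for the final score. All function names, the toy association matrices, and the parameters `dim` and `restart` are hypothetical and are not taken from MultiSourcDSim.

```python
import numpy as np


def cosine_similarity_matrix(X):
    """Pairwise cosine similarity between the rows of X."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # leave all-zero rows as zero vectors
    Xn = X / norms
    return Xn @ Xn.T


def random_walk_with_restart(S, restart=0.5):
    """Closed-form diffusion states; row i is the walk restarted at node i."""
    S = np.asarray(S, dtype=float)
    row_sums = S.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0
    P = S / row_sums  # row-normalised transition matrix
    n = P.shape[0]
    # Solves Q = (1 - restart) * Q @ P + restart * I for Q.
    return restart * np.linalg.inv(np.eye(n) - (1.0 - restart) * P)


def integrate_networks(networks, dim=2, restart=0.5):
    """Assumed integration step: stack diffusion states, compress with SVD."""
    states = np.hstack([random_walk_with_restart(S, restart) for S in networks])
    U, s, _ = np.linalg.svd(states, full_matrices=False)
    return U[:, :dim] * s[:dim]  # one low-dimensional vector per disease


def multi_source_similarity(association_matrices, dim=2, restart=0.5):
    # Step 1: one disease similarity network per data source.
    networks = [cosine_similarity_matrix(A) for A in association_matrices]
    # Step 2: a representation for each disease node.
    embedding = integrate_networks(networks, dim, restart)
    # Step 3: similarity between the learned representations.
    return cosine_similarity_matrix(embedding)


# Toy data (made up): rows are 4 diseases, columns are genes, GO terms, symptoms.
genes = np.array([[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1], [0, 0, 0, 1]])
go_terms = np.array([[1, 0, 0], [1, 1, 0], [0, 1, 1], [0, 0, 1]])
symptoms = np.array([[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 0], [0, 0, 1, 1]])

similarity = multi_source_similarity([genes, go_terms, symptoms])
print(np.round(similarity, 3))
```

In this toy input, diseases 0 and 1 share most of their annotations, as do diseases 2 and 3, so the within-pair scores come out higher than the cross-pair scores.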

Results

Our approach outperforms three other popular methods. In addition, the similarity network built by MultiSourcDSim suggests that our method can also uncover latent relationships between diseases.

Conclusions

MultiSourcDSim is an efficient approach for predicting the similarity between diseases.
Metadata
Title
MultiSourcDSim: an integrated approach for exploring disease similarity
Authors
Lei Deng
Danyi Ye
Junmin Zhao
Jingpu Zhang
Publication date
01-12-2019
Publisher
BioMed Central
DOI
https://doi.org/10.1186/s12911-019-0968-8
