Published in: BMC Cancer 1/2016

Open Access 01-12-2016 | Research article

Prediction of anticancer molecules using hybrid model developed on molecules screened against NCI-60 cancer cell lines

Authors: Harinder Singh, Rahul Kumar, Sandeep Singh, Kumardeep Chaudhary, Ankur Gautam, Gajendra P. S. Raghava

Abstract

Background

In the past, numerous quantitative structure-activity relationship (QSAR) based models have been developed for predicting the anticancer activity of a specific class of molecules against different cancer drug targets. In contrast, limited attempts have been made to predict the anticancer activity of a diverse class of chemicals against a wide variety of cancer cell lines. In this study, we describe a hybrid method developed on thousands of anticancer and non-anticancer molecules tested against the National Cancer Institute (NCI) 60 cancer cell lines.

Results

Our analysis revealed that the majority of anticancer molecules contain 18–24 carbon atoms and are dominated by functional groups such as R2NH, R3N, ROH, RCOR, and ROR. We also observed that certain substructures (e.g., 1-methoxy-4-methylbenzene, 1-methoxy benzene, nitrobenzene, indole, propenyl benzene) are more abundant in anticancer molecules. Next, we developed anticancer molecule prediction models using various machine-learning techniques and achieved a maximum Matthews correlation coefficient (MCC) of 0.81 with 90.40 % accuracy using support vector machine (SVM) based models. In a second approach, we developed a novel similarity or potency score based method using selected fragments/fingerprints, which achieved a maximum MCC of 0.82 with 90.65 % accuracy. Finally, we combined the strengths of the above methods into a hybrid method with a maximum MCC of 0.85 and 92.47 % accuracy.
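For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how MCC and accuracy are conventionally computed from the confusion-matrix counts of a binary classifier. The function name and the example counts are hypothetical and chosen for illustration only; they are not taken from the study.

```python
import math


def mcc_and_accuracy(tp, tn, fp, fn):
    """Return (MCC, accuracy in percent) from confusion-matrix counts.

    tp/tn: correctly predicted anticancer / non-anticancer molecules
    fp/fn: wrongly predicted anticancer / non-anticancer molecules
    """
    total = tp + tn + fp + fn
    accuracy = 100.0 * (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report MCC as 0 when a row or column sum is empty.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return mcc, accuracy


# Hypothetical counts for illustration only, not results from the paper.
mcc, acc = mcc_and_accuracy(tp=90, tn=85, fp=15, fn=10)
print(f"MCC = {mcc:.2f}, accuracy = {acc:.2f} %")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the two classes are unbalanced; this is presumably why both values are reported together.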

Conclusions

We developed a hybrid method that combines the best of the machine learning and potency score based methods. This highly accurate hybrid method can be used to classify anticancer and non-anticancer molecules. To assist the scientific community working in the field of anticancer drug discovery, we integrated the hybrid and potency methods in a web server, CancerIN. The server provides several facilities, including virtual screening of anticancer molecules, analog based drug design, and similarity with known anticancer molecules (http://crdd.osdd.net/oscadd/cancerin).
Metadata
Title
Prediction of anticancer molecules using hybrid model developed on molecules screened against NCI-60 cancer cell lines
Authors
Harinder Singh
Rahul Kumar
Sandeep Singh
Kumardeep Chaudhary
Ankur Gautam
Gajendra P. S. Raghava
Publication date
01-12-2016
Publisher
BioMed Central
Published in
BMC Cancer / Issue 1/2016
Electronic ISSN: 1471-2407
DOI
https://doi.org/10.1186/s12885-016-2082-y
