Published in: Journal of Translational Medicine 1/2021

Open Access 01-12-2021 | Research

TPM, FPKM, or Normalized Counts? A Comparative Study of Quantification Measures for the Analysis of RNA-seq Data from the NCI Patient-Derived Models Repository

Authors: Yingdong Zhao, Ming-Chung Li, Mariam M. Konaté, Li Chen, Biswajit Das, Chris Karlovich, P. Mickey Williams, Yvonne A. Evrard, James H. Doroshow, Lisa M. McShane

Abstract

Background

To correctly decode phenotypic information from RNA-sequencing (RNA-seq) data, careful selection of the quantification measure is critical, both for inter-sample comparisons and for downstream analyses such as differential gene expression between two or more conditions. Several methods have been proposed and continue to be used, but no consensus has been reached on the best gene expression quantification method for RNA-seq data analysis.

Methods

In the present study, we used replicate samples from each of 20 patient-derived xenograft (PDX) models spanning 15 tumor types, for a total of 61 human tumor xenograft samples available through the NCI Patient-Derived Models Repository (PDMR). We compared reproducibility across replicate samples for TPM (transcripts per million), FPKM (fragments per kilobase of transcript per million fragments mapped), and normalized counts, using the coefficient of variation, the intraclass correlation coefficient, and cluster analysis.
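
The three measures differ mainly in what each sample is scaled by. As a minimal sketch (not the authors' pipeline), the Python below computes FPKM and TPM for one sample from raw fragment counts and transcript lengths, and derives normalized counts using median-of-ratios size factors. The abstract does not say which count normalization was applied, so median-of-ratios is an assumption chosen for illustration, and all function names are hypothetical.

```python
import math
import statistics


def fpkm(counts, lengths_bp):
    """FPKM for one sample: fragments per kilobase of transcript
    per million mapped fragments."""
    total = sum(counts)
    return [c * 1e9 / (length * total) for c, length in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """TPM for one sample: divide by length first, then rescale so the
    values sum to one million."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    scale = sum(rates)
    return [r * 1e6 / scale for r in rates]


def size_factors(samples):
    """Median-of-ratios size factor per sample (illustrative assumption).
    `samples` is a list of per-sample count lists in the same gene order.
    Genes with a zero count in any sample are skipped."""
    n_genes = len(samples[0])
    usable = [g for g in range(n_genes) if all(s[g] > 0 for s in samples)]
    if not usable:
        raise ValueError("no gene has non-zero counts in every sample")
    # Log of the per-gene geometric mean across samples.
    log_ref = {
        g: sum(math.log(s[g]) for s in samples) / len(samples) for g in usable
    }
    return [
        math.exp(statistics.median(math.log(s[g]) - log_ref[g] for g in usable))
        for s in samples
    ]


def normalized_counts(samples):
    """Divide each sample's raw counts by its size factor."""
    factors = size_factors(samples)
    return [[c / f for c in s] for s, f in zip(samples, factors)]
```

TPM and FPKM are computed within a single sample, whereas the size factors are estimated from all samples jointly; that between-sample step is what distinguishes normalized counts in this sketch.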

Results

Our results revealed that hierarchical clustering on normalized count data tended to group replicate samples from the same PDX model together more accurately than clustering on TPM or FPKM data. Furthermore, compared with TPM and FPKM data, normalized count data had the lowest median coefficient of variation (CV) and the highest intraclass correlation coefficient (ICC) values, both across all replicate samples from the same model and for the same gene across all PDX models.
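
The two reproducibility statistics can be illustrated with a short sketch. The Python below computes a per-gene CV across replicates, the median CV over genes, and a one-way random-effects ICC for a balanced design. The abstract does not state which ICC form was used, and the study's replicate numbers per model are not necessarily equal, so the balanced one-way form and the function names here are simplifying assumptions rather than the authors' method.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


def median_cv(gene_by_replicate):
    """Median CV over genes; each row holds one gene's values across
    replicates of the same model."""
    return statistics.median(
        coefficient_of_variation(row) for row in gene_by_replicate
    )


def icc_oneway(groups):
    """One-way random-effects ICC for one gene.
    `groups` holds one list of replicate values per model; every model
    must have the same number of replicates (balanced design)."""
    n = len(groups)
    k = len(groups[0])
    if n < 2 or k < 2 or any(len(g) != k for g in groups):
        raise ValueError("need >= 2 models with equal replicate counts >= 2")
    grand_mean = statistics.mean(x for g in groups for x in g)
    group_means = [statistics.mean(g) for g in groups]
    # Between-model and within-model mean squares from one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in group_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
```

A lower CV means replicates of the same model agree more closely, while an ICC near 1 means most of a gene's variance lies between models rather than between replicates of the same model.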

Conclusion

We provided compelling evidence for a preferred quantification measure to conduct downstream analyses of PDX RNA-seq data. To our knowledge, this is the first comparative study of RNA-seq data quantification measures conducted on PDX models, which are known to be inherently more variable than cell line models. Our findings are consistent with what others have shown for human tumors and cell lines and add further support to the thesis that normalized counts are the best choice for the analysis of RNA-seq data across samples.
Metadata
Title
TPM, FPKM, or Normalized Counts? A Comparative Study of Quantification Measures for the Analysis of RNA-seq Data from the NCI Patient-Derived Models Repository
Authors
Yingdong Zhao
Ming-Chung Li
Mariam M. Konaté
Li Chen
Biswajit Das
Chris Karlovich
P. Mickey Williams
Yvonne A. Evrard
James H. Doroshow
Lisa M. McShane
Publication date
01-12-2021
Publisher
BioMed Central
Published in
Journal of Translational Medicine / Issue 1/2021
Electronic ISSN: 1479-5876
DOI
https://doi.org/10.1186/s12967-021-02936-w
