Published in: BMC Medical Research Methodology 1/2019

Open Access 01-12-2019 | Research article

Development of an algorithm for evaluating the impact of measurement variability on response categorization in oncology trials

Authors: Jeong-Hwa Yoon, Soon Ho Yoon, Seokyung Hahn

Abstract

Background

Radiologic assessments of baseline and post-treatment tumor burden are subject to measurement variability, but the impact of this variability on the objective response rate (ORR) and progression rate in specific trials has been unpredictable on a practical level. In this study, we aimed to develop an algorithm for evaluating the quantitative impact of measurement variability on the ORR and progression rate.

Methods

First, we devised a hierarchical model for estimating the distribution of measurement variability using a clinical trial dataset of computed tomography scans. Next, a simulation method was used to calculate the probability representing the effect of measurement errors on categorical diagnoses in various scenarios using the estimated distribution. Based on the probabilities derived from the simulation, we developed an algorithm to evaluate the reliability of an ORR (or progression rate) (i.e., the variation in the assessed rate) by generating a 95% central range of ORR (or progression rate) results if a reassessment was performed. Finally, we performed validation using an external dataset. In the validation of the estimated distribution of measurement variability, the coverage level was calculated as the proportion of the 95% central ranges of hypothetical second readings that covered the actual burden sizes. In the validation of the evaluation algorithm, for 100 resampled datasets, the coverage level was calculated as the proportion of the 95% central ranges of ORR results that covered the ORR from a real second assessment.
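
The simulation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a multiplicative log-normal measurement error with an arbitrary standard deviation `sigma` on the log scale (the paper instead estimates the error distribution from a hierarchical model), and it classifies response only by the RECIST 1.1 rule of at least a 30% decrease in tumor burden. The names `simulate_orr_range` and `percentile` are hypothetical.

```python
import math
import random


def percentile(sorted_vals, q):
    """Linear-interpolation percentile of a sorted list, q in [0, 1]."""
    pos = q * (len(sorted_vals) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def simulate_orr_range(baseline, followup, sigma=0.1, n_sim=2000, seed=0):
    """Return (observed ORR, 95% central range of ORR under re-measurement).

    baseline, followup: per-patient tumor burden (e.g. sum of diameters, mm).
    sigma: ASSUMED log-scale SD of measurement error (illustrative value).
    """
    rng = random.Random(seed)
    threshold = -0.30  # RECIST 1.1: response if burden decreases by >= 30%

    def orr(bs, fs):
        responders = sum((f - b) / b <= threshold for b, f in zip(bs, fs))
        return responders / len(bs)

    observed = orr(baseline, followup)
    simulated = []
    for _ in range(n_sim):
        # Hypothetical second reading: perturb each measurement independently.
        b2 = [b * math.exp(rng.gauss(0, sigma)) for b in baseline]
        f2 = [f * math.exp(rng.gauss(0, sigma)) for f in followup]
        simulated.append(orr(b2, f2))
    simulated.sort()
    return observed, (percentile(simulated, 0.025), percentile(simulated, 0.975))


if __name__ == "__main__":
    base = [100, 80, 60, 50]   # made-up burdens, for illustration only
    post = [60, 78, 30, 55]
    print(simulate_orr_range(base, post))
```

Patients whose measured change lies close to the 30% cutoff are the ones most likely to switch category on reassessment, so the width of the central range depends on how many such borderline cases a trial contains as well as on the assumed error size.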

Results

We built a web tool for implementing the algorithm (publicly available at http://studyanalysis2017.pythonanywhere.com/). In the validation of the estimated distribution and the algorithm, the coverage levels were 93% and 100%, respectively.
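
A coverage level of this kind is the proportion of 95% central ranges that contain the corresponding observed value. The sketch below shows that calculation under stated assumptions: the function name `coverage_level` and the example numbers are hypothetical and are not taken from the study.

```python
def coverage_level(intervals, observed):
    """Proportion of (low, high) ranges that contain the paired observed value."""
    if not intervals or len(intervals) != len(observed):
        raise ValueError("need equally sized, non-empty inputs")
    hits = sum(low <= x <= high for (low, high), x in zip(intervals, observed))
    return hits / len(intervals)


if __name__ == "__main__":
    # Made-up central ranges and second-assessment values, for illustration only.
    ranges = [(0.2, 0.4), (0.3, 0.5), (0.1, 0.2), (0.4, 0.6)]
    second = [0.3, 0.55, 0.15, 0.4]
    print(coverage_level(ranges, second))
```

If the ranges are well calibrated, the coverage level would be expected to be near the nominal 95%.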

Conclusions

The validation exercise using an external dataset demonstrated the adequacy of the statistical model and the utility of the developed algorithm. Quantification of variation in the ORR and progression rate due to potential measurement variability is essential and will help inform decisions made on the basis of trial data.
Metadata
Title
Development of an algorithm for evaluating the impact of measurement variability on response categorization in oncology trials
Authors
Jeong-Hwa Yoon
Soon Ho Yoon
Seokyung Hahn
Publication date
01-12-2019
Publisher
BioMed Central
Published in
BMC Medical Research Methodology / Issue 1/2019
Electronic ISSN: 1471-2288
DOI
https://doi.org/10.1186/s12874-019-0727-7