Published in: Systematic Reviews 1/2014

Open Access 01-12-2014 | Methodology

Recovering the raw data behind a non-parametric survival curve

Authors: Zhihui Liu, Benjamin Rich, James A Hanley

Abstract

Background

Researchers often wish to carry out additional calculations or analyses using the survival data from one or more studies by other authors. When it is not possible to obtain the raw data directly, reconstruction techniques provide a valuable alternative. Several authors have proposed methods and tools for extracting data from published survival curves using digitizing software. Instead of using a digitizer to read in the coordinates from a raster image, we propose directly reading in the lines of the PostScript file of a vector image.

Methods

Using examples and a formal error analysis, we illustrate to what extent, with what accuracy and precision, and in what circumstances this information can be recovered from the various electronic formats in which such curves are published. We focus on the additional precision, and the elimination of observer variation, achieved by using vector-based formats rendered by PostScript rather than the lower-resolution image-based formats that have been analyzed up to now. We provide some R code to process these vector-based files.
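The idea of reading curve coordinates straight from the PostScript text can be illustrated with a minimal sketch. It is written in Python rather than the authors' R code, and it is not their implementation. It assumes that the curve is drawn with the plain `moveto` and `lineto` operators (real files often use abbreviated procedure names, so the pattern would need adjusting) and that the page positions of two reference points on each axis are known. The function names, the sample path, and the axis positions are all hypothetical.

```python
import re

# Assumption: path segments appear as "x y moveto" or "x y lineto".
PATH_OP = re.compile(r"(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s+(moveto|lineto)\b")


def extract_points(ps_text):
    """Return the (x, y) page coordinates in the order they are drawn."""
    return [(float(x), float(y)) for x, y, _ in PATH_OP.findall(ps_text)]


def make_scale(p0, v0, p1, v1):
    """Build a linear map from page units to data units.

    p0 and p1 are page positions of two axis reference points;
    v0 and v1 are the data values printed at those points.
    """
    slope = (v1 - v0) / (p1 - p0)
    return lambda p: v0 + slope * (p - p0)


# Hypothetical fragment of a PostScript file describing a step function.
sample = """
newpath
100 500 moveto
150 500 lineto
150 460 lineto
220 460 lineto
220 400 lineto
stroke
"""

points = extract_points(sample)

# Hypothetical axis calibration: x = 100 is time 0 and x = 600 is time 10;
# y = 100 is survival 0 and y = 500 is survival 1.
to_time = make_scale(100, 0.0, 600, 10.0)
to_surv = make_scale(100, 0.0, 500, 1.0)

curve = [(to_time(x), to_surv(y)) for x, y in points]
for t, s in curve:
    print(f"t = {t:.3f}, S(t) = {s:.4f}")
```

Because the coordinates are read as numbers from the file rather than clicked on an image, repeating the extraction gives the same values every time.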

Results

If the raster-based images are available, one can reliably recover much of the original information that seems to be ‘hidden’ beneath published survival curves. If the original images can be obtained as a PostScript file, the data recovered from it can then be either input into the previously proposed tools or processed directly. We found that the PostScript used by Stata discloses considerably more of the data hidden behind survival curves than that generated by other statistical packages.
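One way to see why accurately read step heights reveal ‘hidden’ data is the Kaplan-Meier product formula: at each event time the curve is multiplied by 1 - d/n, where d is the number of events and n the number at risk. The Python sketch below inverts that relation. It is an illustration under stated assumptions, not the authors' procedure: it assumes exactly one event per step (`events_per_step=1`) and survival levels precise enough for rounding to identify n. The function names and the example values are hypothetical.

```python
def at_risk_from_steps(surv_levels, events_per_step=1):
    """Infer the number at risk just before each drop of a Kaplan-Meier curve.

    surv_levels lists the successive plateau heights, starting at 1.0.
    Each drop satisfies after / before = 1 - d / n, so n = d / (1 - ratio).
    """
    at_risk = []
    for before, after in zip(surv_levels, surv_levels[1:]):
        ratio = after / before
        at_risk.append(round(events_per_step / (1.0 - ratio)))
    return at_risk


def censored_between_steps(at_risk, events_per_step=1):
    """Count the subjects who left the risk set without an event between drops."""
    return [n - events_per_step - n_next for n, n_next in zip(at_risk, at_risk[1:])]


# Hypothetical curve: 10 subjects, events at the first two steps,
# then 2 censored observations before the third event.
levels = [1.0, 0.9, 0.8, 0.8 * 5 / 6]

n_at_risk = at_risk_from_steps(levels)
n_censored = censored_between_steps(n_at_risk)

print(n_at_risk)   # number at risk before each drop
print(n_censored)  # censored observations between consecutive drops
```

The inversion is sensitive to error in the step heights when n is large, which is one reason the precision of the extracted coordinates matters.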

Conclusions

When it is not possible to obtain the raw data from the authors, reconstruction techniques are a valuable alternative. Compared with previous approaches, one advantage of ours is that there is no observer variation: there is no need to repeat the digitization process, since the extraction is completely replicable.
Metadata
Title
Recovering the raw data behind a non-parametric survival curve
Authors
Zhihui Liu
Benjamin Rich
James A Hanley
Publication date
01-12-2014
Publisher
BioMed Central
Published in
Systematic Reviews / Issue 1/2014
Electronic ISSN: 2046-4053
DOI
https://doi.org/10.1186/2046-4053-3-151