Published in: BMC Health Services Research 1/2009

Open Access 01-12-2009 | Correspondence

The SAIL Databank: building a national architecture for e-health research and evaluation

Authors: David V Ford, Kerina H Jones, Jean-Philippe Verplancke, Ronan A Lyons, Gareth John, Ginevra Brown, Caroline J Brooks, Simon Thompson, Owen Bodger, Tony Couch, Ken Leake

Abstract

Background

Vast quantities of electronic data are collected about patients and service users as they pass through health service and other public sector organisations, and these data present enormous potential for research and policy evaluation. The Health Information Research Unit (HIRU) aims to realise the potential of electronically-held, person-based, routinely-collected data to conduct and support health-related studies. However, there are considerable challenges that must be addressed before such data can be used for these purposes, to ensure compliance with the legislation and guidelines generally known as Information Governance.

Methods

A set of objectives was identified to address the challenges and establish the Secure Anonymised Information Linkage (SAIL) system in accordance with Information Governance. These were to: 1) ensure data transportation is secure; 2) operate a reliable record matching technique to enable accurate record linkage across datasets; 3) anonymise and encrypt the data to prevent re-identification of individuals; 4) apply measures to address disclosure risk in data views created for researchers; 5) ensure data access is controlled and authorised; 6) establish methods for scrutinising proposals for data utilisation and approving output; and 7) gain external verification of compliance with Information Governance.
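Objectives 2 to 4 can be illustrated with a minimal sketch. It is not a description of the SAIL implementation, which the abstract does not detail. It assumes a keyed hash (HMAC-SHA256) as a stand-in for the anonymisation and encryption step, and a simple small-group count as a stand-in for disclosure-risk checks. The key, the function names `anonymised_link_field` and `small_groups`, the threshold `k`, and all sample records are hypothetical.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical secret key; in practice it would be held only by the party
# performing the anonymisation, never by researchers.
SECRET_KEY = b"example-key-not-for-production"


def anonymised_link_field(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Map a person identifier to a stable pseudonym using a keyed hash.

    The same identifier always yields the same pseudonym, so records can be
    linked across datasets without exposing the identifier itself.
    """
    # Normalise formatting so trivially different spellings still match.
    normalised = identifier.strip().upper().replace(" ", "")
    return hmac.new(key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


def small_groups(rows, quasi_identifiers, k=5):
    """Return quasi-identifier combinations shared by fewer than k rows.

    Such combinations carry a higher re-identification risk and are
    candidates for suppression or aggregation in a researcher's data view.
    """
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return {combo: n for combo, n in counts.items() if n < k}


# Two hypothetical source datasets holding the same (made-up) identifier.
gp_records = [{"id": "943 476 5919", "event": "asthma review"}]
hospital_records = [{"id": "9434765919", "event": "admission"}]

# Replace identifiers with pseudonyms, then link on the pseudonym.
gp_keys = {anonymised_link_field(r["id"]) for r in gp_records}
linked = [r for r in hospital_records if anonymised_link_field(r["id"]) in gp_keys]

# A hypothetical researcher view with two quasi-identifiers.
view = [
    {"age_band": "30-39", "area": "A"},
    {"age_band": "30-39", "area": "A"},
    {"age_band": "80-89", "area": "B"},
]
risky = small_groups(view, ["age_band", "area"], k=2)
```

Here `linked` contains the hospital record, because both spellings of the identifier normalise to the same pseudonym, and `risky` flags the single-row `("80-89", "B")` group. A keyed hash is used rather than a plain hash so that someone without the key cannot rebuild pseudonyms by hashing candidate identifiers.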

Results

The SAIL databank has been established and it operates on a DB2 platform (Data Warehouse Edition on AIX) running on an IBM 'P' series Supercomputer: Blue-C. The findings of an independent internal audit were favourable and concluded that the systems in place provide adequate assurance of compliance with Information Governance. This expanding databank already holds over 500 million anonymised and encrypted individual-level records from a range of sources relevant to health and well-being. This includes national datasets covering the whole of Wales (approximately 3 million population) and local provider-level datasets, with further growth in progress. The utility of the databank is demonstrated by increasing engagement in high quality research studies.

Conclusion

Through the pragmatic approach that has been adopted, we have been able to address the key challenges in establishing a national databank of anonymised person-based records, so that the data are available for research and evaluation whilst meeting the requirements of Information Governance.
Metadata
Publication date
01-12-2009
Publisher
BioMed Central
Published in
BMC Health Services Research / Issue 1/2009
Electronic ISSN: 1472-6963
DOI
https://doi.org/10.1186/1472-6963-9-157
