Published in: Journal of Translational Medicine 1/2012

Open Access 01-12-2012 | Methodology

The Stanford Data Miner: a novel approach for integrating and exploring heterogeneous immunological data

Authors: Janet C Siebert, Wes Munsil, Yael Rosenberg-Hasson, Mark M Davis, Holden T Maecker


Abstract

Background

Systems-level approaches are increasingly common in both murine and human translational studies. Because these approaches employ multiple high-information-content assays, there is a need for tools that integrate heterogeneous types of laboratory and clinical/demographic data and that allow exploration of that data by aggregating and/or segregating results on particular variables (e.g., mean cytokine levels by age and gender).

Methods

Here we describe the application of standard data warehousing tools to create a novel environment for user-driven upload, integration, and exploration of heterogeneous data. The system presented here currently supports flow cytometry and immunoassays performed in the Stanford Human Immune Monitoring Center, but could be applied more generally.

Results

Users upload assay results contained in platform-specific spreadsheets of a defined format, and clinical and demographic data in spreadsheets of flexible format. Users then map sample IDs to connect the assay results with the metadata. An OLAP (on-line analytical processing) data exploration interface allows filtering and display of various dimensions (e.g., Luminex analytes in rows, treatment group in columns, filtered on a particular study). Statistics such as mean, median, and N can be displayed. The views can be expanded or contracted to aggregate or segregate data at various levels. Individual-level data is accessible with a single click. The result is a user-driven system that permits data integration and exploration in a variety of settings. We show how the system can be used to find gender-specific differences in serum cytokine levels, and compare them across experiments and assay types.
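The workflow described above (map sample IDs, join assay results to metadata, then aggregate along chosen dimensions) can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library, not the system's actual implementation, which the abstract describes as built on data warehousing and OLAP tools. All field names, sample IDs, and values below are invented for illustration, and the helper functions `integrate` and `summarize` are hypothetical.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical assay results, keyed by the sample ID used on the assay plate.
# All values are made up for illustration; they are not data from the paper.
assay_rows = [
    {"assay_sample_id": "A1", "analyte": "IL-6", "value": 4.0},
    {"assay_sample_id": "A2", "analyte": "IL-6", "value": 6.0},
    {"assay_sample_id": "A3", "analyte": "IL-6", "value": 11.0},
    {"assay_sample_id": "A1", "analyte": "Leptin", "value": 20.0},
    {"assay_sample_id": "A2", "analyte": "Leptin", "value": 8.0},
    {"assay_sample_id": "A3", "analyte": "Leptin", "value": 10.0},
]

# Hypothetical clinical/demographic metadata, keyed by subject ID.
metadata = {
    "S01": {"gender": "F", "group": "treated"},
    "S02": {"gender": "M", "group": "treated"},
    "S03": {"gender": "M", "group": "control"},
}

# User-supplied mapping that connects assay sample IDs to metadata subject IDs.
sample_map = {"A1": "S01", "A2": "S02", "A3": "S03"}


def integrate(assay_rows, sample_map, metadata):
    """Join each assay row to its subject's metadata via the sample ID map."""
    facts = []
    for row in assay_rows:
        subject = sample_map.get(row["assay_sample_id"])
        if subject is None or subject not in metadata:
            continue  # skip samples that have not been mapped to metadata
        facts.append({**row, "subject_id": subject, **metadata[subject]})
    return facts


def summarize(facts, dimensions):
    """Group values by the given dimensions and report mean, median, and N."""
    groups = defaultdict(list)
    for fact in facts:
        key = tuple(fact[d] for d in dimensions)
        groups[key].append(fact["value"])
    return {
        key: {"mean": mean(vals), "median": median(vals), "n": len(vals)}
        for key, vals in groups.items()
    }


facts = integrate(assay_rows, sample_map, metadata)

# Aggregated view: one row per analyte.
by_analyte = summarize(facts, ["analyte"])

# Expanded view: the same data segregated by gender as well.
by_analyte_gender = summarize(facts, ["analyte", "gender"])
```

Adding or removing a name in the `dimensions` list plays the role of expanding or contracting a view, and the joined `facts` list corresponds to the individual-level data behind each summary cell.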

Conclusions

We have used the tools and techniques of data warehousing, including open-source business intelligence software, to support investigator-driven data integration and mining of diverse immunological data.
Metadata
Publisher
BioMed Central
Published in
Journal of Translational Medicine / Issue 1/2012
Electronic ISSN: 1479-5876
DOI
https://doi.org/10.1186/1479-5876-10-62
