Abstract
The last decade has seen substantial advances in the understanding of the genetics of complex traits and disease. These advances have been driven largely by genome-wide association studies (GWAS), which have identified thousands of genetic loci associated with such traits and diseases. This chapter provides a guide to performing GWAS on both binary (case–control) and quantitative traits. Because poor data quality, arising from both genotyping failures and unobserved population structure, is a major cause of false-positive genetic associations, particular attention is given to the crucial steps required to prepare the SNP data prior to analysis. The chapter then describes the methods used to perform the GWAS itself and to visualize the results.
© 2017 Springer Science+Business Media New York
About this protocol
Cite this protocol
McRae, A.F. (2017). Analysis of Genome-Wide Association Data. In: Keith, J. (ed) Bioinformatics. Methods in Molecular Biology, vol 1526. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-6613-4_9
DOI: https://doi.org/10.1007/978-1-4939-6613-4_9
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-6611-0
Online ISBN: 978-1-4939-6613-4
eBook Packages: Springer Protocols