Published in: International Journal of Legal Medicine 2/2016

01-03-2016 | Original Article

Selection of highly informative SNP markers for population affiliation of major US populations

Authors: Xiangpei Zeng, Ranajit Chakraborty, Jonathan L. King, Bobby LaRue, Rodrigo S. Moura-Neto, Bruce Budowle

Abstract

Ancestry informative markers (AIMs) can be used to detect and adjust for population stratification and to predict the ancestry of the source of an evidence sample. Autosomal single nucleotide polymorphisms (SNPs) are the best candidates for AIMs. It is essential to identify the most informative AIM SNPs across relevant populations. Several informativeness measures for ancestry estimation have been used for AIM selection: absolute allele frequency difference (δ), the F statistic (F_ST), and the informativeness for assignment measure (I_n). However, their efficacy has not been compared objectively, particularly for determining affiliations of major US populations. In this study, these three measures were directly compared for AIM selection among four major US populations: African American, Caucasian, East Asian, and Hispanic American. The results showed that the F_ST panel performed slightly better for population resolution, based on principal component analysis (PCA) clustering, than did the δ panel, and both performed better than the I_n panel. Therefore, the 23 AIMs selected by the F_ST measure were used to characterize the four major American populations. Genotype data from nine sample populations were used to evaluate the efficiency of the 23-AIM panel. The results indicated that individuals could be correctly assigned to the major population categories. Our AIM panel could contribute to the candidate pool of AIMs for potential forensic identification purposes.
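
To make the three informativeness measures named in the abstract concrete, the following is a minimal Python sketch of how each could be computed for one biallelic SNP. It assumes known allele frequencies, equally weighted populations, Wright's heterozygosity-based form of the F statistic, and the Rosenberg-style definition of the informativeness for assignment. The function names are hypothetical, and the exact estimators used in the study are not stated on this page, so this should be read as an illustration and not as the authors' procedure.

```python
import math

def delta(p1, p2):
    """Absolute allele frequency difference between two populations."""
    return abs(p1 - p2)

def fst(freqs):
    """F statistic for one biallelic SNP: (H_T - H_S) / H_T.

    freqs holds the frequency of one allele in each population.
    Populations are weighted equally (an assumption of this sketch).
    """
    k = len(freqs)
    p_bar = sum(freqs) / k
    h_t = 2 * p_bar * (1 - p_bar)                      # total expected heterozygosity
    if h_t == 0:
        return 0.0                                     # monomorphic SNP carries no signal
    h_s = sum(2 * p * (1 - p) for p in freqs) / k      # mean within-population value
    return (h_t - h_s) / h_t

def informativeness(freqs):
    """Informativeness for assignment for one biallelic SNP (natural log).

    Sums, over both alleles, -p_bar*log(p_bar) + sum_i (p_i / K)*log(p_i),
    treating 0*log(0) as 0.
    """
    k = len(freqs)
    total = 0.0
    for allele in (freqs, [1 - p for p in freqs]):
        p_bar = sum(allele) / k
        if p_bar > 0:
            total -= p_bar * math.log(p_bar)
        for p in allele:
            if p > 0:
                total += (p / k) * math.log(p)
    return total

# Hypothetical frequencies of one allele in four populations (illustrative only).
example = [0.90, 0.10, 0.15, 0.40]
scores = {
    "max pairwise delta": max(delta(a, b) for a in example for b in example),
    "F_ST": fst(example),
    "I_n": informativeness(example),
}
```

Ranking candidate SNPs by any one of these scores and keeping the top-scoring markers is one simple way a panel could be assembled under each measure. A SNP fixed for different alleles in two populations gives the maximum value of each measure for that pair (δ = 1, F_ST = 1, I_n = ln 2).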
Metadata
Publisher
Springer Berlin Heidelberg
Print ISSN: 0937-9827
Electronic ISSN: 1437-1596
DOI
https://doi.org/10.1007/s00414-015-1297-9
