Published in: Journal of Translational Medicine 1/2019

Open Access 01-12-2019 | Research

The phylogeny of 48 alleles, experimentally verified at 21 kb, and its application to clinical allele detection

Authors: Kshitij Srivastava, Kurt R. Wollenberg, Willy A. Flegel

Abstract

Background

Sequence information generated by next-generation sequencing is often phased computationally with haplotype-phasing algorithms. Using experimentally derived allele or haplotype information improves this prediction, as is routinely done in HLA typing. We recently established a large dataset of long ERMAP alleles, which code for protein variants in the Scianna blood group system. Here we propose a phylogeny for this set of 48 alleles and identify the evolutionary steps that gave rise to the observed alleles.

Methods

We used the nucleotide sequences, each longer than 21 kb, of all 48 physically confirmed ERMAP alleles that we previously published. Full-length sequences were aligned, and variant sites were extracted manually. The Bayesian coalescent algorithm implemented in BEAST v1.8.3 was used to estimate a coalescent phylogeny for these variants and the allelic ancestral states at the internal nodes of the phylogeny.
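
The variant-site extraction described above was done manually in the study. As an illustration only, the same step can be sketched in Python, assuming the alignment is already available as a dictionary of equal-length strings. The function name `variant_sites` and the toy sequences are hypothetical and are not taken from the paper.

```python
def variant_sites(alignment):
    """Return the 0-based variable columns and the alleles reduced to those columns.

    `alignment` maps an allele name to its aligned sequence (hypothetical input format).
    """
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("sequences must be aligned to equal length")
    # A column is a variant site if more than one character occurs in it.
    positions = [i for i in range(length) if len({seq[i] for seq in seqs}) > 1]
    reduced = {name: "".join(alignment[name][i] for i in positions) for name in names}
    return positions, reduced


# Toy 8-nucleotide alignment, not real ERMAP data.
toy = {
    "allele_A": "ACGTACGT",
    "allele_B": "ACGTACGA",
    "allele_C": "ACCTACGA",
}
positions, reduced = variant_sites(toy)
print(positions)  # [2, 7]
print(reduced)    # {'allele_A': 'GT', 'allele_B': 'GA', 'allele_C': 'CA'}
```

Reducing each allele to its variable columns keeps the input to a phylogenetic analysis small without discarding any site that distinguishes the alleles.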

Results

The phylogenetic analysis allowed us to identify the evolutionary relationships among the 48 ERMAP alleles, predict 4243 potential ancestral alleles and calculate a posterior probability for each of these unobserved alleles. Some of them coincide with observed alleles that are extant in the population.
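
The abstract does not state how the posterior probability of each predicted allele was obtained. One common way to estimate such a value in Bayesian MCMC analyses is the fraction of posterior samples in which a given state appears. The sketch below shows that calculation on toy data. The function `posterior_probabilities` and the sample list are hypothetical, and the code does not read BEAST output.

```python
from collections import Counter


def posterior_probabilities(sampled_states):
    """Estimate the posterior probability of each state as its sample frequency."""
    if not sampled_states:
        raise ValueError("at least one posterior sample is required")
    counts = Counter(sampled_states)
    total = len(sampled_states)
    return {state: n / total for state, n in counts.most_common()}


# Toy posterior samples of the allele reconstructed at one internal node,
# written as the bases at two variant sites (not real data).
samples = ["GT", "GT", "GA", "GT", "CA", "GA", "GT", "GT"]
post = posterior_probabilities(samples)
print(post)  # {'GT': 0.625, 'GA': 0.25, 'CA': 0.125}
```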

Conclusions

Our proposed strategy places known alleles in a phylogenetic framework, allowing us to describe as-yet-undiscovered alleles. This new approach relies heavily on the accuracy of the alleles used for the phylogenetic analysis. The expanded set of predicted alleles can be used to infer alleles when large genotype datasets are analyzed, as typically generated by high-throughput sequencing. The alleles identified by studies like ours may be used in designing microarray technologies, imputing genotypes, and mapping next-generation sequencing data.
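
To make the idea of inferring alleles from genotype data concrete, the following sketch lists every pair of alleles from a panel that would explain an unphased genotype. It is an assumption-laden illustration, not the authors' procedure: the function `compatible_pairs`, the panel, and the genotype encoding (one string holding the two observed bases per variant site) are all hypothetical.

```python
from itertools import combinations_with_replacement


def compatible_pairs(genotype, panel):
    """Return allele pairs whose combined bases match the genotype at every site.

    `genotype` is a list with one two-character string per variant site;
    `panel` maps allele names to their bases at the same sites.
    """
    hits = []
    for (name1, hap1), (name2, hap2) in combinations_with_replacement(
        sorted(panel.items()), 2
    ):
        if len(hap1) != len(genotype) or len(hap2) != len(genotype):
            continue  # skip alleles that do not cover the same sites
        if all({a, b} == set(g) for a, b, g in zip(hap1, hap2, genotype)):
            hits.append((name1, name2))
    return hits


# Toy panel of observed or predicted alleles at two variant sites (not real data).
panel = {"allele_A": "GT", "allele_B": "GA", "allele_C": "CA"}

pairs_het = compatible_pairs(["CG", "AT"], panel)
pairs_mixed = compatible_pairs(["GG", "AT"], panel)
print(pairs_het)    # [('allele_A', 'allele_C')]
print(pairs_mixed)  # [('allele_A', 'allele_B')]
```

A genotype that matches no pair in the panel returns an empty list, which is where a larger set of predicted alleles could help.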
Metadata
Title
The phylogeny of 48 alleles, experimentally verified at 21 kb, and its application to clinical allele detection
Authors
Kshitij Srivastava
Kurt R. Wollenberg
Willy A. Flegel
Publication date
01-12-2019
Publisher
BioMed Central
Published in
Journal of Translational Medicine / Issue 1/2019
Electronic ISSN: 1479-5876
DOI
https://doi.org/10.1186/s12967-019-1791-9
