Published in: Malaria Journal 1/2021

Open Access 01-12-2021 | Malaria | Methodology

Markov chain Monte Carlo Gibbs sampler approach for estimating haplotype frequencies among multiple malaria infected human blood samples

Authors: Gie Ken-Dror, Pankaj Sharma

Abstract

Background

Malaria patients can carry two or more haplotypes in a single blood sample, which makes it difficult to identify which haplotypes are present. Measuring the type and frequency of resistant haplotypes in a population is also challenging. This study presents a novel statistical method, a Gibbs sampler algorithm, to address both problems.
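To make the idea of a Gibbs sampler concrete, the following is a minimal sketch and not the authors' algorithm. It is a toy that assumes a single SNP, a known MOI for every patient, perfect detection, and a Beta(1, 1) prior. The function names and the "R"/"S"/"mixed" call labels are hypothetical. Each iteration alternates between drawing the hidden number of resistant clones in every mixed sample and drawing the resistant allele frequency given those counts.

```python
import random
from math import comb


def sample_resistant_count(moi, p, rng):
    """Draw the hidden number of resistant clones in a mixed sample.

    A mixed sample holds at least one resistant and one sensitive clone,
    so the count follows a binomial truncated to 1..moi-1.
    """
    if moi < 2:
        raise ValueError("a mixed sample needs an MOI of at least 2")
    counts = list(range(1, moi))
    weights = [comb(moi, k) * p**k * (1 - p) ** (moi - k) for k in counts]
    return rng.choices(counts, weights=weights)[0]


def gibbs_allele_frequency(samples, n_iter=2000, burn_in=500, seed=0):
    """Posterior mean of the resistant allele frequency (toy model).

    samples: list of (moi, call) pairs, call in {"R", "S", "mixed"}.
    Assumes a Beta(1, 1) prior and known MOI (illustrative assumptions).
    """
    rng = random.Random(seed)
    p = 0.5  # arbitrary starting value
    draws = []
    for iteration in range(n_iter):
        resistant = sensitive = 0
        # Step 1: impute clone counts for every sample given the current p.
        for moi, call in samples:
            if call == "R":
                k = moi
            elif call == "S":
                k = 0
            else:
                k = sample_resistant_count(moi, p, rng)
            resistant += k
            sensitive += moi - k
        # Step 2: draw p from its Beta full conditional given the counts.
        p = rng.betavariate(1 + resistant, 1 + sensitive)
        p = min(max(p, 1e-9), 1 - 1e-9)  # keep the weights in step 1 positive
        if iteration >= burn_in:
            draws.append(p)
    return sum(draws) / len(draws)
```

For example, `gibbs_allele_frequency([(1, "R"), (2, "mixed"), (1, "S")])` returns an estimate of the resistant allele frequency for three patients. Extending the toy to multi-SNP haplotypes, detection limits, or unknown MOI would require a richer latent state than this sketch carries.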

Results

The performance of the algorithm was evaluated on simulated datasets of patient blood samples, each characterized by its multiplicity of infection (MOI) and malaria genotype. The simulations varied the resistance allele frequency (RAF) at each single nucleotide polymorphism (SNP) and the limit of detection (LoD) of the SNPs and of the MOI. The Gibbs sampler algorithm was validated and, compared with previous related statistical approaches, was more accurate when the LoD of the SNPs or the MOI was high and could handle missing MOI data.
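A simulated blood sample of the kind described above might be generated as in the following sketch. It is illustrative only and is not the simulation design of the study. It assumes that haplotypes are strings of "R" (resistant) and "S" (sensitive) alleles, that clone proportions within a patient are random, and that an allele is detected at a SNP only when the clones carrying it make up more than the LoD fraction of the sample. The function name and these rules are hypothetical, and an LoD on the MOI itself is not modelled.

```python
import random


def simulate_sample(hap_freqs, moi, lod, rng):
    """Simulate the per-SNP genotype calls of one blood sample.

    hap_freqs: dict mapping a haplotype string such as "RS" to its
               population frequency (illustrative input format).
    moi:       number of clones in the patient.
    lod:       minimum within-sample share for an allele to be detected;
               restricted to 0 <= lod < 0.5 so one allele is always seen.
    Returns a list with one call per SNP: "R", "S" or "mixed".
    """
    if not 0 <= lod < 0.5:
        raise ValueError("lod must satisfy 0 <= lod < 0.5")
    haplotypes = list(hap_freqs)
    weights = [hap_freqs[h] for h in haplotypes]
    clones = rng.choices(haplotypes, weights=weights, k=moi)
    # Assumed clone proportions: normalised draws from (0, 1].
    raw = [1.0 - rng.random() for _ in clones]
    total = sum(raw)
    shares = [r / total for r in raw]
    calls = []
    for snp in range(len(haplotypes[0])):
        r_share = sum(s for c, s in zip(clones, shares) if c[snp] == "R")
        s_share = sum(s for c, s in zip(clones, shares) if c[snp] == "S")
        if r_share > lod and s_share > lod:
            calls.append("mixed")
        else:
            # A minority allele below the LoD goes unobserved.
            calls.append("R" if r_share > s_share else "S")
    return calls
```

Calling `simulate_sample({"RR": 0.2, "RS": 0.3, "SS": 0.5}, moi=3, lod=0.1, rng=random.Random(1))` yields two calls, one per SNP. Under these assumptions, raising the LoD can only turn "mixed" calls into single-allele calls, which is one way the genotyping errors mentioned in the abstract can arise.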

Conclusions

The Gibbs sampler algorithm provided robust results when faced with genotyping errors caused by LoDs and functioned well even in the absence of MOI data on individual patients.
Metadata
Title
Markov chain Monte Carlo Gibbs sampler approach for estimating haplotype frequencies among multiple malaria infected human blood samples
Authors
Gie Ken-Dror
Pankaj Sharma
Publication date
01-12-2021
Publisher
BioMed Central
Keyword
Malaria
Published in
Malaria Journal / Issue 1/2021
Electronic ISSN: 1475-2875
DOI
https://doi.org/10.1186/s12936-021-03841-9
