Published in: International Journal of Legal Medicine 5/2015

01-09-2015 | Original Article

AncesTrees: ancestry estimation with randomized decision trees

Authors: David Navega, Catarina Coelho, Ricardo Vicente, Maria Teresa Ferreira, Sofia Wasterlain, Eugénia Cunha



Abstract

In forensic anthropology, ancestry estimation is essential for establishing an individual’s biological profile. This study presents a new program, AncesTrees, developed for assessing ancestry based on metric analysis. AncesTrees relies on random forest, a machine learning ensemble algorithm, to classify the human skull. In the ensemble learning paradigm, several models are generated and used jointly to arrive at the final decision. The random forest algorithm creates ensembles of decision tree classifiers, a non-linear and non-parametric classification technique. The database used in AncesTrees comprises 23 craniometric variables from 1,734 individuals, representative of six major ancestral groups and selected from Howells’ craniometric series. The program was tested on 128 adult crania from the following collections: the African slaves’ skeletal collection of Valle da Gafaria, and the Medical School Skull Collection and the Identified Skeletal Collection of the 21st Century, both curated at the University of Coimbra. The first stage of the test analysis was ancestry estimation including all the ancestral groups of the database. The second stage was ancestry estimation including only the European and African ancestral groups. In the first stage, 75% of the individuals of African ancestry and 79.2% of the individuals of European ancestry were correctly identified. The model involving only the African and European ancestral groups performed better: 93.8% of all individuals were correctly classified. These results show that AncesTrees can be a valuable tool in forensic anthropology.
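The ensemble idea described in the abstract can be illustrated with a minimal sketch. It is not the AncesTrees implementation. It assumes a simplified random forest in which each tree is grown on a bootstrap sample, each node considers a random subset of variables, and the final class is the majority vote. The data are synthetic (two hypothetical groups, four made-up variables), and all function names and parameter values (`grow_forest`, `n_try`, `max_depth`, and so on) are illustrative assumptions rather than anything taken from the article.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a collection of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, features):
    """Return the (feature, threshold) that most reduces weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def grow_tree(rows, labels, n_try, rng, depth=0, max_depth=5):
    """Grow one randomized tree; each node looks at n_try randomly chosen variables."""
    features = rng.sample(range(len(rows[0])), n_try)
    split = best_split(rows, labels, features) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class label
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            grow_tree(*zip(*left), n_try, rng, depth + 1, max_depth),
            grow_tree(*zip(*right), n_try, rng, depth + 1, max_depth))

def predict_tree(tree, row):
    """Follow the splits until a leaf (a plain label) is reached."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def grow_forest(rows, labels, n_trees=25, n_try=2, seed=0):
    """Grow n_trees trees, each on a bootstrap sample of the training data."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        forest.append(grow_tree([rows[i] for i in idx],
                                [labels[i] for i in idx], n_try, rng))
    return forest

def predict_forest(forest, row):
    """Majority vote across trees; also return the share of trees that agree."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    label, count = votes.most_common(1)[0]
    return label, count / len(forest)

def make_synthetic(n_per_group=40, seed=1):
    """Synthetic stand-in data: two hypothetical groups, four made-up variables."""
    rng = random.Random(seed)
    rows, labels = [], []
    for name, mean in (("group_A", 0.0), ("group_B", 4.0)):
        for _ in range(n_per_group):
            rows.append([rng.gauss(mean, 1.0) for _ in range(4)])
            labels.append(name)
    return rows, labels

if __name__ == "__main__":
    rows, labels = make_synthetic()
    forest = grow_forest(rows, labels)
    print(predict_forest(forest, [0.0, 0.0, 0.0, 0.0]))
```

The vote share returned by `predict_forest` gives a rough sense of how strongly the ensemble agrees on a case. A production tool would normally rely on an established random forest library and on real reference data instead of this toy implementation.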
Metadata
Title
AncesTrees: ancestry estimation with randomized decision trees
Authors
David Navega
Catarina Coelho
Ricardo Vicente
Maria Teresa Ferreira
Sofia Wasterlain
Eugénia Cunha
Publication date
01-09-2015
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Legal Medicine / Issue 5/2015
Print ISSN: 0937-9827
Electronic ISSN: 1437-1596
DOI
https://doi.org/10.1007/s00414-014-1050-9