Published in: European Archives of Oto-Rhino-Laryngology 11/2015

01-11-2015 | Laryngology

Exploring the feasibility of smart phone microphone for measurement of acoustic voice parameters and voice pathology screening

Authors: Virgilijus Uloza, Evaldas Padervinskis, Aurelija Vegiene, Ruta Pribuisiene, Viktoras Saferis, Evaldas Vaiciukynas, Adas Gelzinis, Antanas Verikas



Abstract

The objective of this study was to evaluate the reliability of acoustic voice parameters obtained using smart phone (SP) microphones and to investigate the utility of SP voice recordings for voice screening. Voice samples of the sustained vowel /a/ obtained from 118 subjects (34 normal and 84 pathological voices) were recorded simultaneously through two microphones: an oral AKG Perception 220 microphone and the microphone of a Samsung Galaxy Note3 SP. Acoustic voice signal data were measured for fundamental frequency, jitter, shimmer, normalized noise energy (NNE), signal-to-noise ratio and harmonic-to-noise ratio using Dr. Speech software. Discriminant analysis-based Correct Classification Rate (CCR) and Random Forest Classifier (RFC)-based Equal Error Rate (EER) were used to evaluate how well the acoustic voice parameters separate normal and pathological voice classes. The Lithuanian version of the Glottal Function Index (LT_GFI) questionnaire was used for self-assessment of the severity of the voice disorder. The correlations between acoustic voice parameters obtained with the two types of microphones were statistically significant and strong (r = 0.73–1.0) across all measurements. When classifying into normal/pathological voice classes, Oral-NNE yielded a CCR of 73.7 %, and the pair of SP-NNE and SP-shimmer parameters yielded a CCR of 79.5 %. Moreover, fusion of the results obtained from SP voice recordings and GFI data provided a CCR of 84.60 %, and the RFC yielded an EER of 7.9 %. In conclusion, measurements of acoustic voice parameters using the SP microphone were shown to be reliable in clinical settings, demonstrating high CCR and low EER when distinguishing normal and pathological voice classes, and the results validated the suitability of the SP microphone signal for the task of automatic voice analysis and screening.
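
The three evaluation measures named in the abstract (the correlation between paired microphone measurements, the CCR and the EER) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the study used Dr. Speech, discriminant analysis and a Random Forest Classifier, none of which is reproduced here. The function names, the threshold sweep used to approximate the EER and the toy numbers are all assumptions made for illustration only.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two paired measurement series,
    e.g. the same parameter measured through two microphones."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correct_classification_rate(predicted, actual):
    """Fraction of samples whose predicted class equals the true class."""
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)


def equal_error_rate(scores, labels):
    """Approximate EER by sweeping a decision threshold over the scores.

    Assumption: a higher score means "more likely pathological" and
    labels are 1 (pathological) or 0 (normal). The EER is taken at the
    threshold where the false positive and false negative rates are
    closest, as the mean of the two rates at that threshold.
    """
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    # Extra threshold above the maximum so "reject everything" is covered.
    thresholds = sorted(set(scores)) + [max(scores) + 1.0]
    best_gap, eer = None, None
    for t in thresholds:
        fpr = sum(1 for s in negatives if s >= t) / len(negatives)
        fnr = sum(1 for s in positives if s < t) / len(positives)
        gap = abs(fpr - fnr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (fpr + fnr) / 2
    return eer


if __name__ == "__main__":
    # Hypothetical toy values, NOT data from the study.
    oral_mic = [1.2, 2.5, 3.1, 4.8, 5.0]
    phone_mic = [1.0, 2.7, 3.0, 4.5, 5.3]
    print("r   =", pearson_r(oral_mic, phone_mic))
    print("CCR =", correct_classification_rate([1, 0, 1, 1], [1, 0, 0, 1]))
    print("EER =", equal_error_rate([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

With so few samples the threshold sweep is coarse; on real classifier outputs an interpolated ROC-based estimate would normally be preferred.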
Metadata
Publisher
Springer Berlin Heidelberg
Published in
European Archives of Oto-Rhino-Laryngology / Issue 11/2015
Print ISSN: 0937-4477
Electronic ISSN: 1434-4726
DOI
https://doi.org/10.1007/s00405-015-3708-4
