
11-04-2023 | Dysarthria | Research

Uncertainty of Vowel Predictions as a Digital Biomarker for Ataxic Dysarthria

Authors: Dmitry Yu. Isaev, Roza M. Vlasova, J. Matias Di Martino, Christopher D. Stephen, Jeremy D. Schmahmann, Guillermo Sapiro, Anoopum S. Gupta

Published in: The Cerebellum | Issue 2/2024


Abstract

Dysarthria is a common manifestation across cerebellar ataxias, leading to impaired communication, reduced social connection, and decreased quality of life. While dysarthria symptoms may be present in other neurological conditions, ataxic dysarthria is a perceptually distinct motor speech disorder whose most prominent characteristics are articulation and prosody abnormalities along with distorted vowels. We hypothesized that the uncertainty of vowel predictions made by an automatic speech recognition system can capture the speech changes present in cerebellar ataxia. Speech of participants with ataxia (N=61) and healthy controls (N=25) was recorded during the “picture description” task. Participants’ dysarthric speech and overall ataxia severity were also assessed with the Brief Ataxia Rating Scale (BARS). Eight participants with ataxia had speech and BARS data at two timepoints. A neural network trained for phoneme prediction was applied to the speech recordings, and the average entropy of vowel token predictions (AVE) was computed for each participant’s recording, together with the mean pitch and intensity standard deviations (MPSD and MISD) in the vowel segments. AVE and MISD were associated with the BARS speech score (Spearman’s rho=0.45 and 0.51, respectively), and AVE was associated with the BARS total score (rho=0.39). In the longitudinal cohort, a Wilcoxon signed-rank test on the paired timepoints demonstrated an increase in BARS total and AVE, while BARS speech and the acoustic measures did not increase significantly. The relationship of AVE to both BARS speech and BARS total, together with its ability to capture disease progression even in the absence of measured speech decline, indicates the potential of AVE as a digital biomarker for cerebellar ataxia.
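To make the AVE measure concrete, the sketch below computes the average Shannon entropy of per-frame phoneme posteriors restricted to vowel-aligned frames, and then shows the style of group statistics reported in the abstract (Spearman correlation with BARS, Wilcoxon signed-rank test on paired timepoints). This is a minimal illustration under stated assumptions: the posterior matrix, the vowel-masking step, and all numbers are synthetic placeholders and do not reproduce the authors' model, alignment procedure, or data.

```python
# Minimal sketch (not the authors' pipeline): AVE from per-frame phoneme
# posteriors, plus the style of group statistics reported in the abstract.
import numpy as np
from scipy.stats import spearmanr, wilcoxon


def frame_entropy(posteriors: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Shannon entropy (nats) of each row of a (frames x phonemes) posterior matrix."""
    p = np.clip(posteriors, eps, 1.0)
    return -(p * np.log(p)).sum(axis=1)


def average_vowel_entropy(posteriors: np.ndarray, vowel_mask: np.ndarray) -> float:
    """AVE: mean prediction entropy over the frames aligned to vowel tokens.

    `posteriors` would come from a phoneme-recognition network (e.g., a
    wav2vec 2.0-style model); `vowel_mask` marks vowel-aligned frames.
    """
    return float(frame_entropy(posteriors)[vowel_mask].mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)

    # Synthetic per-participant AVE values and BARS speech scores (0-4 scale),
    # used only to demonstrate the analysis calls; not the study data.
    ave = rng.uniform(0.5, 2.0, size=61)
    bars_speech = np.clip(np.round(2 * (ave + rng.normal(0, 0.3, 61))) / 2, 0, 4)
    rho, p = spearmanr(ave, bars_speech)
    print(f"Spearman rho={rho:.2f}, p={p:.3g}")

    # Synthetic paired timepoints for the longitudinal comparison.
    ave_t1 = rng.uniform(0.5, 2.0, size=8)
    ave_t2 = ave_t1 + rng.normal(0.1, 0.05, size=8)
    stat, p_w = wilcoxon(ave_t1, ave_t2)
    print(f"Wilcoxon statistic={stat:.1f}, p={p_w:.3g}")
```

With real recordings, the posteriors would be obtained by running the phoneme model over each recording and deriving the vowel mask from the vowel tokens in its output.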
Metadata
Publisher: Springer US
Print ISSN: 1473-4222
Electronic ISSN: 1473-4230
DOI: https://doi.org/10.1007/s12311-023-01539-z
