
16-04-2024 | Mood Disorders

Multimodal mental state analysis

Authors: Bipin Kumar Rai, Ishika Jain, Baibhav Tiwari, Abhay Saxena

Published in: Health Services and Outcomes Research Methodology


Abstract

Depression has typically been diagnosed through self-reports or professional interviews, although these methods often miss significant behavioral signals. People with depression may not express their feelings accurately, which can make a correct diagnosis difficult for psychologists. We believe that paying attention to how people speak and behave can help identify depression more reliably: in practice, psychologists draw on cues such as a person's manner of speaking, their body language, and shifts in their emotions during conversation. To detect signs of depression more accurately, the authors present MANOBAL, a system that analyzes voice, text, and facial expressions to detect depression. We use the DAIC-WoZ dataset, obtained on request from the University of Southern California (USC), to build the multimodal depression detection model. Deep learning is challenged by such complicated data, so MANOBAL adopts a multimodal approach: it uses features from audio recordings, text, and facial expressions to predict both the presence of depression and its severity. This fusion has two advantages. First, uncertain data in one modality (such as voice) can be compensated for by input from another (text, facial expressions). Second, more dependable data sources can be given greater weight, which improves accuracy. Small datasets make it difficult to assess the accuracy of fusion models, but MANOBAL overcomes this by exploiting the DAIC-WoZ dataset's transfer characteristics and increasing the number of training labels. The initial results are encouraging, with a root mean square error of 0.168 for predicting depression severity. Experiments show the effectiveness of combining modalities: high-level features based on Mel Frequency Cepstral Coefficients (MFCC) provide useful information on depression, and adding further audio characteristics and facial action units increases accuracy by 10% and 20%, respectively.
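The abstract does not specify MANOBAL's exact fusion architecture, but the two advantages it names (substituting for an unreliable modality and weighting dependable sources more heavily) correspond to confidence-weighted late fusion. The following is a minimal sketch of that idea, together with the root-mean-square-error metric reported for severity prediction; the function names and weight values are hypothetical, not taken from the paper.

```python
import math

def fuse_predictions(preds, weights):
    """Confidence-weighted late fusion of per-modality severity scores.

    preds:   modality -> predicted severity score, or None if that
             modality produced no usable signal for this subject
    weights: modality -> reliability weight (hypothetical values)

    Modalities with no prediction are skipped, so the remaining ones
    substitute for the missing input; renormalizing by the weights of
    the available modalities keeps the result on the same scale.
    """
    available = {m: p for m, p in preds.items() if p is not None}
    if not available:
        raise ValueError("no modality produced a prediction")
    total = sum(weights[m] for m in available)
    return sum(weights[m] * p for m, p in available.items()) / total

def rmse(y_true, y_pred):
    """Root mean square error, the severity metric reported in the abstract."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

# Example: the facial-expression modality is missing, so the audio and
# text scores are reweighted among themselves.
score = fuse_predictions(
    {"audio": 0.4, "text": 0.2, "face": None},
    {"audio": 1.0, "text": 2.0, "face": 1.5},
)
```

Here `score` is (1.0 * 0.4 + 2.0 * 0.2) / 3.0 ≈ 0.267: the text modality, assumed more dependable, pulls the fused estimate toward its own prediction.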
Metadata
Title
Multimodal mental state analysis
Authors
Bipin Kumar Rai
Ishika Jain
Baibhav Tiwari
Abhay Saxena
Publication date
16-04-2024
Publisher
Springer US
Published in
Health Services and Outcomes Research Methodology
Print ISSN: 1387-3741
Electronic ISSN: 1572-9400
DOI
https://doi.org/10.1007/s10742-024-00329-2