
16-04-2024 | Mood Disorders

Multimodal mental state analysis

Authors: Bipin Kumar Rai, Ishika Jain, Baibhav Tiwari, Abhay Saxena

Published in: Health Services and Outcomes Research Methodology


Abstract

Depression has typically been diagnosed through self-reports or professional interviews, although these methods often miss significant behavioral signals. People with depression may not express their feelings accurately, which can make a correct diagnosis difficult for psychologists. We believe that paying attention to how people speak and behave can help identify depression more reliably: in practice, psychologists draw on cues such as a person's manner of speaking, their body language, and shifts in their emotions during conversation. To detect signs of depression more accurately, the authors present MANOBAL, a system that analyzes voice, text, and facial expressions to detect depression. We use the DAIC-WoZ dataset, obtained on request from the University of Southern California (USC), to build the multimodal depression detection model. Deep learning is challenged by such complicated data, so MANOBAL adopts a multimodal approach: it uses features from audio recordings, text, and facial expressions to predict both the presence of depression and its severity. This fusion has two advantages. First, uncertain data in one modality (such as voice) can be compensated for by input from another (text, facial expressions). Second, more dependable data sources can be given greater weight, which improves accuracy. Small datasets make it difficult to assess the accuracy of fusion models, but MANOBAL overcomes this by exploiting the DAIC-WoZ dataset's transfer characteristics and increasing the number of training labels. The initial results are encouraging, with a root mean square error of 0.168 for predicting depression severity. Experiments show the effectiveness of combining modalities: high-level features based on Mel Frequency Cepstral Coefficients (MFCC) provide useful information on depression, and adding further audio characteristics and facial action units increases accuracy by 10% and 20%, respectively.
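The abstract does not specify MANOBAL's exact fusion architecture, but the two advantages it names (substituting for an unreliable modality and weighting dependable sources more heavily) correspond to confidence-weighted late fusion. The following is a minimal sketch of that idea, together with the root-mean-square-error metric reported for severity prediction; the function names and weight values are hypothetical, not taken from the paper.

```python
import math

def fuse_predictions(preds, weights):
    """Confidence-weighted late fusion of per-modality severity scores.

    preds:   modality -> predicted severity score, or None if that
             modality produced no usable signal for this subject
    weights: modality -> reliability weight (hypothetical values)

    Modalities with no prediction are skipped, so the remaining ones
    substitute for the missing input; renormalizing by the weights of
    the available modalities keeps the result on the same scale.
    """
    available = {m: p for m, p in preds.items() if p is not None}
    if not available:
        raise ValueError("no modality produced a prediction")
    total = sum(weights[m] for m in available)
    return sum(weights[m] * p for m, p in available.items()) / total

def rmse(y_true, y_pred):
    """Root mean square error, the severity metric reported in the abstract."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

# Example: the facial-expression modality is missing, so the audio and
# text scores are reweighted among themselves.
score = fuse_predictions(
    {"audio": 0.4, "text": 0.2, "face": None},
    {"audio": 1.0, "text": 2.0, "face": 1.5},
)
```

Here `score` is (1.0 * 0.4 + 2.0 * 0.2) / 3.0 ≈ 0.267: the text modality, assumed more dependable, pulls the fused estimate toward its own prediction.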
Metadata
Title
Multimodal mental state analysis
Authors
Bipin Kumar Rai
Ishika Jain
Baibhav Tiwari
Abhay Saxena
Publication date
16-04-2024
Publisher
Springer US
Published in
Health Services and Outcomes Research Methodology
Print ISSN: 1387-3741
Electronic ISSN: 1572-9400
DOI
https://doi.org/10.1007/s10742-024-00329-2