Integrating BERT Representations and Psycholinguistic Features for Emotion Recognition in Clinical Texts

Authors

  • Yiqing Xu

    School of Information Engineering, Changzhou Vocational Institute of Industry Technology, Changzhou 213164, China

  • Zalizah Awang Long

    Malaysian Institute of Information Technology, Universiti Kuala Lumpur, Kuala Lumpur 50250, Malaysia

  • Djoko Budiyanto Setyohadi

    Department of Informatics, University of Atma Jaya Yogyakarta, Yogyakarta 55281, Indonesia

DOI:

https://doi.org/10.30564/fls.v7i5.9660
Received: 23 April 2025 | Revised: 7 May 2025 | Accepted: 13 May 2025 | Published Online: 15 May 2025

Abstract

Recognizing emotions in clinical texts is crucial for monitoring mental health, yet it remains difficult because of how language is used in this setting and its specialized terminology. This study proposes a hybrid framework that combines contextual representations from ClinicalBERT with psycholinguistic features from LIWC and the NRC Emotion Lexicon to improve multi-label emotion classification in clinical narratives. The dataset was de-identified and annotated for anger, anxiety, sadness, joy, fear, and neutral, with good inter-annotator agreement (Cohen’s κ = 0.81). Three approaches were compared: a Random Forest trained on psycholinguistic features, a Multilayer Perceptron (MLP) trained on ClinicalBERT representations, and a hybrid MLP that combines both feature sets. The hybrid model outperformed both baselines, achieving mean scores of 0.884 (±0.011) accuracy, 0.854 (±0.012) Micro-F1, 0.814 (±0.013) Macro-F1, and 0.924 (±0.011) AUC; the differences were statistically significant (ANOVA p < 0.005; Cohen’s d = 1.24–2.89). SHAP analysis indicated that ClinicalBERT features accounted for more than two-thirds of the predictive contribution and psycholinguistic features for the remainder, which makes the model's behavior easier to interpret. The approach addresses two central concerns in healthcare AI: predictive accuracy and interpretability. It supports trustworthy clinical use by providing transparent and reliable emotion predictions that can inform decision support, risk monitoring, and digital mental health services. The results suggest that combining deep learning with established psycholinguistic tools improves emotion detection in healthcare.
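The abstract reports both Micro-F1 and Macro-F1 for a multi-label task, and the two can diverge sharply when some emotions are rare. The following is a minimal sketch of how the two averages differ, not the authors' evaluation code. The toy label matrices are invented for illustration, the function names are hypothetical, and scoring a label with no true or predicted positives as 0 is an assumed convention.

```python
from typing import List, Sequence, Tuple

# Label order follows the six categories named in the abstract.
LABELS = ["anger", "anxiety", "sadness", "joy", "fear", "neutral"]


def label_counts(y_true: Sequence[Sequence[int]],
                 y_pred: Sequence[Sequence[int]],
                 j: int) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives) for label j."""
    tp = fp = fn = 0
    for t_row, p_row in zip(y_true, y_pred):
        if t_row[j] and p_row[j]:
            tp += 1
        elif p_row[j]:
            fp += 1
        elif t_row[j]:
            fn += 1
    return tp, fp, fn


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2TP / (2TP + FP + FN); assumed to be 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(y_true: Sequence[Sequence[int]],
                   y_pred: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Micro-F1 pools counts over labels; Macro-F1 averages per-label F1."""
    n_labels = len(y_true[0])
    counts: List[Tuple[int, int, int]] = [
        label_counts(y_true, y_pred, j) for j in range(n_labels)
    ]
    macro = sum(f1_from_counts(*c) for c in counts) / n_labels
    micro = f1_from_counts(sum(c[0] for c in counts),
                           sum(c[1] for c in counts),
                           sum(c[2] for c in counts))
    return micro, macro


# Hypothetical toy data: four notes, one row per note, one column per label.
y_true = [
    [1, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
]
y_pred = [
    [1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
]

micro, macro = micro_macro_f1(y_true, y_pred)
print(f"Micro-F1 = {micro:.3f}, Macro-F1 = {macro:.3f}")
```

On this toy data Micro-F1 is about 0.667 and Macro-F1 about 0.444: the three labels that are never predicted correctly pull the macro average down while barely affecting the pooled counts. The same effect would explain why a reported Macro-F1 sits below Micro-F1 when some emotion classes are harder or less frequent.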

Keywords:

Emotion Recognition; Clinical Text; BERT; Psycholinguistic Features; LIWC; NRC Lexicon; Deep Learning; NLP

How to Cite

Xu, Y., Awang Long, Z., & Setyohadi, D. B. (2025). Integrating BERT Representations and Psycholinguistic Features for Emotion Recognition in Clinical Texts. Forum for Linguistic Studies, 7(5), 976–989. https://doi.org/10.30564/fls.v7i5.9660

Article Type

Article