Detection of Alzheimer's Disease Using Fine-Tuned Large Language Models

Authors

  • Baha Ihnaini

    The College of Science, Mathematics and Technology, Wenzhou-Kean University, Zhejiang 325027, China

  • Yongxin Deng

    The College of Science, Mathematics and Technology, Wenzhou-Kean University, Zhejiang 325027, China

  • Yujie He

    The College of Science, Mathematics and Technology, Wenzhou-Kean University, Zhejiang 325027, China

  • Le Geng

    The College of Science, Mathematics and Technology, Wenzhou-Kean University, Zhejiang 325027, China

  • Jiyai Xu

    The College of Liberal Arts, Wenzhou-Kean University, Zhejiang 325027, China

DOI:

https://doi.org/10.30564/fls.v7i8.9899
Received: 6 May 2025 | Revised: 26 May 2025 | Accepted: 10 June 2025 | Published Online: 1 August 2025

Abstract

Since there is no known cure for Alzheimer's disease (AD), early detection is essential to controlling its progression. Because traditional diagnostic techniques such as MRI and pathological testing are costly and invasive, researchers are exploring less expensive alternatives based on machine learning (ML) and natural language processing (NLP). This study investigates the potential of fine-tuned open-source large language models (LLMs) to identify AD through linguistic analysis, evaluating their performance against traditional ML and deep learning (DL) techniques. To optimize models such as Qwen1.5–7B and OLMo1.7–7B, we applied supervised fine-tuning (SFT) with parameter-efficient techniques, including LoRA and QLoRA, on the Pitt Corpus dataset, which consists of speech transcripts from the "Cookie Theft" picture description task. The findings showed that LLMs performed noticeably better than conventional techniques: Qwen1.5–7B achieved an F1-score of 0.8824, higher than CNN (0.7987), LSTM (0.7689), and logistic regression (0.83). The study demonstrates how LLMs can detect subtle linguistic impairments in AD that are difficult for traditional models to identify, such as syntactic errors and repetitions. The comparatively small dataset size and exclusive reliance on textual data remain limitations, and future studies are recommended to include multimodal inputs and more varied datasets. Despite these limitations, the results highlight the potential of optimized LLMs as scalable, non-invasive methods for early AD detection, offering a way to enhance patient care and diagnostic precision. This study thus provides a novel, accurate, and reliable approach to the early diagnosis of Alzheimer's disease.
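To illustrate the parameter-efficient fine-tuning technique the abstract names, the following is a minimal NumPy sketch of the LoRA idea (Hu et al., reference [26]): rather than updating a full weight matrix W, two small trainable matrices A and B provide a low-rank update scaled by alpha/r. This is not the authors' code; the shapes, zero-initialization of B, and scaling constant follow the conventions of the LoRA paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 64, 64, 8, 16

W = rng.normal(size=(d_out, d_in))     # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01  # trainable, small random init
B = np.zeros((d_out, r))               # trainable, zero init

def lora_forward(x):
    # Base path plus low-rank adapter; during SFT only A and B
    # receive gradient updates, leaving W untouched.
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

x = rng.normal(size=(4, d_in))         # a batch of 4 inputs
y = lora_forward(x)
print(y.shape)                         # (4, 64)

# Because B starts at zero, the adapter is an exact no-op at
# initialization, so fine-tuning begins from the pretrained model:
print(np.allclose(y, x @ W.T))         # True
```

In practice this reduces the number of trainable parameters from d_out × d_in to r × (d_in + d_out) per adapted matrix, which is what makes fine-tuning 7B-parameter models such as Qwen1.5–7B feasible on modest hardware; QLoRA (reference [29]) further quantizes the frozen base weights.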

Keywords:

Alzheimer's Disease; Large Language Models; Natural Language Processing; Supervised Fine-Tuning

References

[1] Lane, C.A., Hardy, J., Schott, J.M., 2018. Alzheimer’s disease. European Journal of Neurology. 25(1), 59–70. DOI: https://doi.org/10.1111/ene.13439

[2] Lynch, C., 2020. World Alzheimer Report 2019: Attitudes to dementia, a global survey: Public health: Engaging people in ADRD research. Alzheimer's & Dementia. 16(S10), e038255. DOI: https://doi.org/10.1002/alz.038255

[3] Jack, C.R., et al., 2015. Magnetic resonance imaging in Alzheimer’s Disease Neuroimaging Initiative 2. Alzheimer's & Dementia. 11(7), 740–756. DOI: https://doi.org/10.1016/j.jalz.2015.05.002

[4] Liu, N., Yuan, Z., Tang, Q., 2022. Improving Alzheimer’s Disease Detection for Speech Based on Feature Purification Network. Frontiers in Public Health. 9, 835960. DOI: https://doi.org/10.3389/fpubh.2021.835960

[5] Wu, J., Yang, S., Zhan, R., et al., 2025. A Survey on LLM-Generated Text Detection: Necessity, Methods, and Future Directions. Computational Linguistics. 51(1), 275–338. DOI: https://doi.org/10.1162/coli_a_00549

[6] Yuan, J., Bian, Y., Cai, X., et al., 2020. Disfluencies and fine-tuning pre-trained language models for detection of Alzheimer’s disease. In Proceedings of the 21st Annual Conference of the International Speech Communication Association (Interspeech 2020), Shanghai, China, 25–29 October 2020; pp. 2162–2166. DOI: https://doi.org/10.21437/Interspeech.2020-2516

[7] Matosevic, L., Jovic, A., 2022. Accurate Detection of Dementia from Speech Transcripts Using RoBERTa Model. In Proceedings of the 2022 45th Jubilee International Convention on Information, Communication and Electronic Technology (MIPRO), Opatija, Croatia, 23–27 May 2022; pp. 1478–1484. DOI: https://doi.org/10.23919/MIPRO55190.2022.9803462

[8] Liu, N., Luo, K., Yuan, Z., et al., 2022. A Transfer Learning Method for Detecting Alzheimer’s Disease Based on Speech and Natural Language Processing. Frontiers in Public Health. 10, 772592. DOI: https://doi.org/10.3389/fpubh.2022.772592

[9] Liu, N., Yuan, Z., 2022. Spontaneous Language Analysis in Alzheimer’s Disease: Evaluation of Natural Language Processing Technique for Analyzing Lexical Performance. Journal of Shanghai Jiaotong University Science. 27(2), 160–167. DOI: https://doi.org/10.1007/s12204-021-2384-3

[10] Liu, N., Wang, L., 2023. An approach for assisting diagnosis of Alzheimer’s disease based on natural language processing. Frontiers in Aging Neuroscience. 15, 1281726. DOI: https://doi.org/10.3389/fnagi.2023.1281726

[11] Goodglass, H., Kaplan, E., 1996. The assessment of aphasia and related disorders, 2nd ed. Lea & Febiger: Philadelphia, PA, USA.

[12] Christodoulou, E., Ma, J., Collins, G.S., et al., 2019. A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. Journal of Clinical Epidemiology. 110, 12–22. DOI: https://doi.org/10.1016/j.jclinepi.2019.02.004

[13] Dudzik, W., Nalepa, J., Kawulok, M., 2021. Evolving data-adaptive support vector machines for binary classification. Knowledge-Based Systems. 227, 107221. DOI: https://doi.org/10.1016/j.knosys.2021.107221

[14] Nusinovici, S., Tham, Y.C., Yan, M.Y.C., et al., 2020. Logistic regression was as good as machine learning for predicting major chronic diseases. Journal of Clinical Epidemiology. 122, 56–69. DOI: https://doi.org/10.1016/j.jclinepi.2020.03.002

[15] Hsu, B.M., 2020. Comparison of Supervised Classification Models on Textual Data. Mathematics. 8(5), 851. DOI: https://doi.org/10.3390/math8050851

[16] Abdullah, D.M., Abdulazeez, A.M., 2021. Machine Learning Applications based on SVM Classification A Review. Qubahan Academic Journal. 1(2), 81–90. DOI: https://doi.org/10.48161/qaj.v1n2a50

[17] Salehi, W., Baglat, P., Gupta, G., et al., 2023. An Approach to Binary Classification of Alzheimer’s Disease Using LSTM. Bioengineering. 10(8), 950. DOI: https://doi.org/10.3390/bioengineering10080950

[18] Wu, M., Chen, L., 2015. Image recognition based on deep learning. In Proceedings of the 2015 Chinese Automation Congress (CAC), Wuhan, China, 27–29 November 2015; pp. 542–546. DOI: https://doi.org/10.1109/CAC.2015.7382560

[19] Torfi, A., Shirvani, R.A., Keneshloo, Y., et al., 2021. Natural Language Processing Advancements by Deep Learning: A Survey. arXiv:2003.01200v4. DOI: https://doi.org/10.48550/arXiv.2003.01200

[20] Wang, F., Wang, H., Zhou, X., et al., 2022. A Driving Fatigue Feature Detection Method Based on Multifractal Theory. IEEE Sensors Journal. 22(19), 19046–19059. DOI: https://doi.org/10.1109/JSEN.2022.3201015

[21] Li, C., Zhang, C., Fu, Q., 2020. Research on CNN + LSTM user intention classification based on multi-granularity features of texts. The Journal of Engineering. 2020(13), 486–490. DOI: https://doi.org/10.1049/joe.2019.1175

[22] Howard, J., Ruder, S., 2018. Universal Language Model Fine-tuning for Text Classification. arXiv:1801.06146v5. DOI: https://doi.org/10.48550/arXiv.1801.06146

[23] Zhu, C., Ni, R., Xu, Z., et al., 2021. GradInit: Learning to Initialize Neural Networks for Stable and Efficient Training. Advances in Neural Information Processing Systems, 34, 16410–16422.

[24] Rios, J.O., Armejach, A., Petit, E., et al., 2021. Dynamically Adapting Floating-Point Precision to Accelerate Deep Neural Network Training. In Proceedings of the 2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA), Pasadena, CA, USA, 13–16 December 2021; pp. 980–987. DOI: https://doi.org/10.1109/ICMLA52953.2021.00161

[25] Wu, Y., Liu, J., Bae, J., et al., 2019. Demystifying Learning Rate Policies for High Accuracy Training of Deep Neural Networks. In Proceedings of the 2019 IEEE International Conference on Big Data (Big Data), Los Angeles, CA, USA, 9–12 December 2019; pp. 1971–1980. DOI: https://doi.org/10.1109/BigData47090.2019.9006104

[26] Hu, E.J., Shen, Y., Wallis, P., et al., 2021. LoRA: Low-Rank Adaptation of Large Language Models. arXiv:2106.09685v2. DOI: https://doi.org/10.48550/arXiv.2106.09685

[27] Zeng, Y., Lee, K., 2024. The Expressive Power of Low-Rank Adaptation. arXiv:2310.17513v3. DOI: https://doi.org/10.48550/arXiv.2310.17513

[28] Zhang, B., Liu, Z., Cherry, C., et al., 2024. When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method. arXiv:2402.17193v1. DOI: https://doi.org/10.48550/arXiv.2402.17193

[29] Dettmers, T., Pagnoni, A., Holtzman, A., et al., 2023. QLoRA: Efficient Finetuning of Quantized LLMs. Advances in Neural Information Processing Systems. 36, 10088–10115.

[30] Zhang, X., Rajabi, N., Duh, K., Koehn, P., 2023. Machine Translation with Large Language Models: Prompting, Few-shot Learning, and Fine-tuning with QLoRA. In Proceedings of the Eighth Conference on Machine Translation, Singapore, 06–07 December, 2023; pp. 468–481. DOI: https://doi.org/10.18653/v1/2023.wmt-1.43

[31] Thirunavukarasu, A.J., Ting, D.S.J., Elangovan, K., et al., 2023. Large language models in medicine. Nature Medicine. 29(8), 1930–1940. DOI: https://doi.org/10.1038/s41591-023-02448-8

How to Cite

Ihnaini, B., Deng, Y., He, Y., Geng, L., & Xu, J. (2025). Detection of Alzheimer’s Disease Using Fine-Tuned Large Language Models. Forum for Linguistic Studies, 7(8), 373–384. https://doi.org/10.30564/fls.v7i8.9899

Article Type

Article