Design of Intelligent Educational Mobile Apps with an Original Dataset for Chinese-Portuguese Translators

Authors

  • Lap Man Hoi

    Faculty of Applied Sciences, Macao Polytechnic University, Macao SAR, China

  • Yuqi Sun

    Faculty of Applied Sciences, Macao Polytechnic University, Macao SAR, China

  • Manlin Lin

    Engineering Research Centre of Applied Technology on Machine Translation and Artificial Intelligence, Ministry of Education, Macao Polytechnic University, Macao SAR, China

  • Sio Kei Im

    Engineering Research Centre of Applied Technology on Machine Translation and Artificial Intelligence, Ministry of Education, Macao Polytechnic University, Macao SAR, China

DOI:

https://doi.org/10.30564/fls.v7i7.9583
Received: 21 April 2025 | Revised: 26 May 2025 | Accepted: 9 June 2025 | Published Online: 16 July 2025

Abstract

Translation remains a vital process in many culturally diverse countries. Despite significant advances in artificial intelligence (AI), machine translation cannot yet fully replace human expertise, so translation workflows still require human intervention and review. This article introduces an innovative mobile education application (app) designed to train translators, with a particular focus on Chinese-Portuguese translation. The app uses a set of practice data, the Chinese-Portuguese Translation Exercise Corpus (CPTEC), developed by our corpus team, to autonomously assess translations and identify quality defects, thereby promoting skill improvement. We also propose a novel hybrid grading system, based on different translation quality assessment (TQA) dimensions, that evaluates translations automatically in a way that imitates human assessors. In addition, the article demonstrates the design of challenging exercises within the mobile app to reinforce translation proficiency. To optimize the functionality of the mobile app, we use a large language model (LLM) to validate the solution, confirm that the model learns the training material provided, and track its performance. Experimental results show that the fine-tuned LLM improves on multiple dimensions (including accuracy, fidelity, fluency, readability, acceptability, and usability) compared with its initial state, confirming the effectiveness of the developed practice data in improving translation performance. To promote access to research, the practice data (CPTEC) will be distributed within the relevant AI community to inspire the creation of innovative software applications that support translators.
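As a rough illustration of how a grade built from several TQA dimensions might be computed, the following Python sketch combines per-dimension scores into a single weighted grade and lists the dimensions that fall below a threshold. Only the six dimension names come from the abstract. The 0 to 1 score scale, the equal weights, the 0.6 threshold, and the function names `hybrid_grade` and `weak_dimensions` are assumptions made for illustration, not the article's actual scoring method.

```python
# The six dimensions named in the abstract.
DIMENSIONS = (
    "accuracy", "fidelity", "fluency",
    "readability", "acceptability", "usability",
)

# Hypothetical equal weights; the article's real weighting is not stated here.
EQUAL_WEIGHTS = {name: 1.0 for name in DIMENSIONS}


def hybrid_grade(scores, weights=EQUAL_WEIGHTS):
    """Return the weighted mean of per-dimension scores, each assumed in [0, 1]."""
    for name in DIMENSIONS:
        if name not in scores:
            raise ValueError(f"missing score for dimension: {name}")
        if not 0.0 <= scores[name] <= 1.0:
            raise ValueError(f"score out of range for dimension: {name}")
    total_weight = sum(weights[name] for name in DIMENSIONS)
    weighted_sum = sum(weights[name] * scores[name] for name in DIMENSIONS)
    return weighted_sum / total_weight


def weak_dimensions(scores, threshold=0.6):
    """List dimensions scoring below the (assumed) threshold, lowest first."""
    below = [name for name in DIMENSIONS if scores[name] < threshold]
    return sorted(below, key=lambda name: scores[name])


if __name__ == "__main__":
    # Made-up scores for a single learner translation.
    example = {
        "accuracy": 0.8, "fidelity": 0.7, "fluency": 0.5,
        "readability": 0.9, "acceptability": 0.4, "usability": 0.6,
    }
    print(hybrid_grade(example))
    print(weak_dimensions(example))
```

Reporting the weakest dimensions next to the overall grade reflects the abstract's stated aim of identifying quality defects rather than only scoring a translation.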

Keywords:

Automated Writing Evaluation; Chinese-Portuguese Translation Exercise Corpus; Common European Framework of Reference for Languages; Fine-Tuning; Generative Artificial Intelligence; Large Language Models; Portuguese as a Foreign Language

How to Cite

Hoi, L. M., Sun, Y., Lin, M., & Im, S. K. (2025). Design of Intelligent Educational Mobile Apps with an Original Dataset for Chinese-Portuguese Translators. Forum for Linguistic Studies, 7(7), 696–718. https://doi.org/10.30564/fls.v7i7.9583