Revolutionizing Harmonized System (HS) Code Search with Semantic Search and Word Embeddings: Empowering Trade Classifications

Authors

  • Supamas Sitisara

    School of Engineering, University of the Thai Chamber of Commerce, Bangkok 10400, Thailand

  • Supakpong Jinarat

    College of Engineering and Technology, Dhurakij Pundit University, Bangkok 10210, Thailand

  • Witchayut Ngamsaard

    School of Engineering, University of the Thai Chamber of Commerce, Bangkok 10400, Thailand

  • Nanthi Suthikarnnarunai

    School of Engineering, University of the Thai Chamber of Commerce, Bangkok 10400, Thailand

DOI:

https://doi.org/10.30564/fls.v7i10.10822
Received: 1 July 2025 | Revised: 9 July 2025 | Accepted: 30 July 2025 | Published Online: 25 September 2025

Abstract

The Harmonized System (HS) code is a crucial component of global trade: it classifies goods so that taxes and duties can be applied fairly and consistently across countries. However, many current HS code search tools rely on exact keyword matching. Users who do not know the exact terms to search for often get wrong or confusing results, which causes delays and frustration and can lead to incorrect tax charges and trade issues. This study introduces a new approach to HS code search that uses semantic search and word embedding models, techniques from natural language processing (NLP), to capture the meaning of a user's query even when it does not contain the exact terms. The approach makes search more accurate, faster, and easier to use. The study includes real examples, testing, and comparisons with traditional methods to show where the new system performs better. The results show that it improves both speed and accuracy, helping customs officers, brokers, traders, and regulators work more efficiently and correctly. By reducing errors and streamlining the process, the system represents a significant step forward in trade technology and shows how artificial intelligence can make international trade more reliable and user-friendly.
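
The contrast between exact keyword matching and embedding-based semantic search can be illustrated with a minimal sketch. This is not the authors' implementation: the three-dimensional vectors in TOY_VECTORS are invented for illustration (real word embedding models learn vectors with hundreds of dimensions), the catalog descriptions are abbreviated, and the function names embed, keyword_search, and semantic_search are hypothetical.

```python
import math

# Hypothetical toy vectors (axes: electronics, food, clothing), invented for illustration.
TOY_VECTORS = {
    "laptop":     [0.9, 0.0, 0.1],
    "portable":   [0.7, 0.0, 0.3],
    "automatic":  [0.8, 0.0, 0.2],
    "data":       [1.0, 0.0, 0.0],
    "processing": [0.9, 0.1, 0.0],
    "machines":   [0.8, 0.1, 0.1],
    "apples":     [0.0, 1.0, 0.0],
    "fresh":      [0.1, 0.9, 0.0],
    "t-shirts":   [0.0, 0.0, 1.0],
    "knitted":    [0.0, 0.0, 1.0],
    "cotton":     [0.0, 0.1, 0.9],
}

# Abbreviated, illustrative catalog entries keyed by HS subheading.
CATALOG = {
    "8471.30": "portable automatic data processing machines",
    "0808.10": "apples fresh",
    "6109.10": "t-shirts knitted cotton",
}


def embed(text):
    """Average the toy vectors of the known words in `text`."""
    vectors = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not vectors:
        return [0.0, 0.0, 0.0]  # no known words: fall back to the zero vector
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_search(query, catalog):
    """Return codes whose description contains every query word exactly."""
    words = query.lower().split()
    return [code for code, desc in catalog.items()
            if all(w in desc.lower().split() for w in words)]


def semantic_search(query, catalog):
    """Rank all codes by cosine similarity between query and description."""
    q = embed(query)
    scored = [(code, cosine(q, embed(desc))) for code, desc in catalog.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    print(keyword_search("laptop", CATALOG))          # exact matching finds nothing
    print(semantic_search("laptop", CATALOG)[0][0])   # best semantic match
```

In this toy setup, the query "laptop" appears in no catalog description, so the keyword search returns no results, while the semantic ranking still places the portable data processing machines entry first because its averaged vector points in nearly the same direction as the query vector.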

Keywords:

Harmonized System (HS) Code; Semantic; Word Embeddings; Natural Language Processing (NLP); Customs Broker; Machine Learning

How to Cite

Sitisara, S., Jinarat, S., Ngamsaard, W., & Suthikarnnarunai, N. (2025). Revolutionizing Harmonized System (HS) Code Search with Semantic Search and Word Embeddings: Empowering Trade Classifications. Forum for Linguistic Studies, 7(10), 356–371. https://doi.org/10.30564/fls.v7i10.10822