Research and Model Library Construction in Teacher-Student Learning Architectures for Knowledge Transfer

Authors

  • Jiaxiang Chen

    College of Electrical Engineering, Zhejiang University, Hangzhou 310027, China

  • Yuhang Ouyang

    College of Electrical Engineering, Zhejiang University, Hangzhou 310027, China

  • Zheyu Li

    College of Electrical Engineering, Zhejiang University, Hangzhou 310027, China

DOI:

https://doi.org/10.30564/jcsr.v6i4.7344
Received: 24 September 2024 | Revised: 15 October 2024 | Accepted: 18 October 2024 | Published Online: 30 October 2024

Abstract

This paper summarizes and replicates several classical and cutting-edge knowledge transfer methods, including Factor Transfer (FT), Knowledge Distillation (KD), Deep Mutual Learning (DML), Contrastive Representation Distillation (CRD), and Born-Again Self-Distillation (BSS). In addition, we studied three advanced knowledge transfer methods, Relational Knowledge Distillation (RKD), Similarity-Preserving distillation (SP), and Attention-based Feature Distillation (AFD), and successfully replicated RKD, a refinement of classical KD. Based on these methods, a flexible model library was constructed in PyCharm that allows multiple knowledge transfer strategies to be integrated quickly. The experimental results are visualized through a user-friendly interface, enabling intuitive comparison of training speed and performance across the different methods. This research offers practical guidance on the challenge of building a reusable framework that efficiently integrates diverse knowledge transfer strategies into deep neural networks.
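
The abstract gives no implementation details, so the following is only a minimal sketch, assuming a PyTorch backend, of two of the objectives mentioned above: the classical KD loss of Hinton et al. [1] and the distance-wise relational loss of RKD (Park et al. [6]). The function names, the temperature T, and the weight alpha are illustrative assumptions, not the API of the authors' model library.

```python
# Minimal sketch (assumed PyTorch backend; names are illustrative, not the authors' library API).
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Classical knowledge distillation (Hinton et al. [1]):
    KL divergence between temperature-softened teacher and student
    distributions, blended with the hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)                                   # rescale soft-target gradients by T^2
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

def rkd_distance_loss(student_feat, teacher_feat):
    """Distance-wise Relational Knowledge Distillation (Park et al. [6]):
    match the normalized pairwise-distance structure of the two embeddings."""
    def pdist(e):
        d = torch.cdist(e, e, p=2)                # pairwise Euclidean distances
        mu = d[d > 0].mean()                      # mean non-zero distance
        return d / (mu + 1e-12)                   # normalized relation matrix
    return F.smooth_l1_loss(pdist(student_feat), pdist(teacher_feat))
```

A registry along the lines of LOSSES = {"kd": kd_loss, "rkd": rkd_distance_loss} would then let such a library select a transfer strategy by name, which is consistent with the plug-and-play integration described in the abstract.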

Keywords:

Knowledge transfer; Teacher-student architecture; Model library; Visualization; Educational frameworks

References

[1] Hinton, G., Vinyals, O., Dean, J., 2015. Distilling the knowledge in a neural network. Advances in Neural Information Processing Systems. 28, 1–9.

[2] Yosinski, J., 2014. Transfer learning via sparse fine-tuning. Proceedings of the 31st International Conference on Machine Learning (ICML); Beijing, China; 22–24 June 2014. pp. 370–378.

[3] Zhang, Y., Xiang, T., Hospedales, T. M., et al., 2018. Deep mutual learning. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR); Salt Lake City, UT, USA; 18–22 June 2018. pp. 1–9.

[4] Tian, Y., Krishnan, D., Isola, P., 2020. Contrastive representation distillation. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); Seattle, WA, USA; 14–19 June 2020. pp. 1–10.

[5] Furlanello, T., Lipton, Z., Tschannen, M., et al., 2018. Born-again neural networks. Proceedings of the 35th International Conference on Machine Learning (ICML); Stockholm, Sweden; 10–15 July 2018. pp. 1603–1612.

[6] Park, W., Kim, D., Lu, Y., et al., 2019. Relational knowledge distillation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR); Long Beach, CA, USA; 16–20 June 2019. pp. 1–10.

[7] Wang, Q., 2021. Attention-based feature distillation for image classification. IEEE Transactions on Neural Networks and Learning Systems. 32(5), 2040–2050.

How to Cite

Chen, J., Ouyang, Y., & Li, Z. (2024). Research and Model Library Construction in Teacher-Student Learning Architectures for Knowledge Transfer. Journal of Computer Science Research, 6(4), 73–81. https://doi.org/10.30564/jcsr.v6i4.7344

Article Type

Article