Research and Model Library Construction in Teacher-Student Learning Architectures for Knowledge Transfer
DOI: https://doi.org/10.30564/jcsr.v6i4.7344

Abstract
This paper summarizes and replicates several classical and cutting-edge knowledge transfer methods, including Factor Transfer (FT), Knowledge Distillation (KD), Deep Mutual Learning (DML), Contrastive Representation Distillation (CRD), and Born-Again Self-Distillation (BSS). We further study three advanced knowledge transfer methods, Relational Knowledge Distillation (RKD), Similarity-Preserving distillation (SP), and Attention-based Feature Distillation (AFD), and successfully replicate RKD, an optimized variant of KD. Building on these methods, a flexible model library was constructed in PyCharm that allows multiple knowledge transfer strategies to be integrated quickly. The experimental results are presented through a user-friendly visualization interface, enabling intuitive comparison of training speed and performance across methods. This work offers practical insight into the challenge of building a reusable framework that efficiently integrates diverse knowledge transfer strategies into deep neural networks.
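To make two of the objectives named above concrete, the following is a minimal, hypothetical PyTorch sketch of the vanilla KD loss and the distance-wise RKD loss as interchangeable strategies, in the spirit of the model library described. It is not the authors' released code; the function names, the temperature T, and the weighting factor alpha are illustrative assumptions rather than the paper's actual interface.

    # Hypothetical sketch (not the paper's library code): minimal losses for
    # vanilla knowledge distillation (Hinton et al.) and distance-wise
    # relational knowledge distillation (Park et al.).
    import torch
    import torch.nn.functional as F

    def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
        """Vanilla KD: soften both logit sets with temperature T and mix the
        KL term with the ordinary cross-entropy term."""
        soft_targets = F.softmax(teacher_logits / T, dim=1)
        log_probs = F.log_softmax(student_logits / T, dim=1)
        kl = F.kl_div(log_probs, soft_targets, reduction="batchmean") * (T * T)
        ce = F.cross_entropy(student_logits, labels)
        return alpha * kl + (1.0 - alpha) * ce

    def rkd_distance_loss(student_feat, teacher_feat):
        """Distance-wise RKD: match the pairwise distance structure of the
        student's embedding space to the teacher's (Huber loss on
        mean-normalized pairwise distances)."""
        def pdist_normalized(x):
            d = torch.cdist(x, x, p=2)        # pairwise L2 distances
            mean = d[d > 0].mean()            # mean of the off-diagonal distances
            return d / (mean + 1e-8)
        return F.smooth_l1_loss(pdist_normalized(student_feat),
                                pdist_normalized(teacher_feat))

    # Example usage inside a training step (tensor names are assumptions):
    #   loss = kd_loss(s_logits, t_logits.detach(), labels) \
    #        + 0.1 * rkd_distance_loss(s_embed, t_embed.detach())

Because both losses take only tensors produced by an ordinary forward pass, a library can register them behind a common interface and swap strategies without changing the training loop, which is the kind of quick integration the abstract describes.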
Keywords:
Knowledge transfer; Teacher-student architecture; Model library; Visualization; Educational frameworks
License
Copyright © 2024 Author(s)
This is an open access article under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) License.