Knowledge Distillation With Differentiable Optimal Transport on Graph Neural Networks
Authors
Li, Mengyao
Liu, Yanbin
Chen, Ling
Item type
Conference Contribution
Publisher
Springer Nature
Abstract
Knowledge distillation (KD) transfers knowledge from a large, well-trained teacher network to a smaller student network, improving the student's performance without extra computational cost. Traditional KD methods usually focus on logits or intermediate features; however, they may overlook the inherent correlations between samples and suffer from capacity gaps caused by the distinct architectures of the student and teacher. Relation-based distillation methods try to bridge these correlations but usually require a large memory bank for loss computation, making them less efficient. To overcome these limitations, we propose a novel and efficient Optimal Transport-based Graph Distillation (OTGD) method. First, OTGD constructs attributed graphs for the teacher and student respectively, which are then used to capture both individual and relational knowledge through graph neural networks (GNNs). Then, we devise an innovative differentiable optimal transport objective to distill the teacher's knowledge before and after GNN learning, effectively incorporating both feature-level and correlation-level knowledge. Specifically, our optimal transport objective is solved by the Sinkhorn algorithm without relying on an extra memory bank. This design makes our method efficient and numerically stable. Comprehensive experiments conducted on two benchmark datasets with diverse network architectures demonstrate that OTGD outperforms state-of-the-art methods.
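The abstract states that the optimal transport objective is solved by the Sinkhorn algorithm. A minimal NumPy sketch of the standard entropy-regularized Sinkhorn iterations is shown below; the cost matrix, marginals, and hyperparameters are illustrative placeholders, not values from the paper.

```python
import numpy as np

def sinkhorn(cost, a, b, eps=0.1, n_iters=500):
    """Entropy-regularized optimal transport via Sinkhorn iterations.

    cost: (m, n) pairwise cost matrix between student and teacher features.
    a: (m,) source marginal, b: (n,) target marginal (both sum to 1).
    Returns transport plan P whose row sums match a and column sums match b.
    """
    K = np.exp(-cost / eps)          # Gibbs kernel from the cost matrix
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)            # rescale columns toward marginal b
        u = a / (K @ v)              # rescale rows toward marginal a
    return u[:, None] * K * v[None, :]

# Toy example: match 3 student features to 3 teacher features
# (uniform marginals; a real distillation loss would use learned features).
rng = np.random.default_rng(0)
C = rng.random((3, 3))               # placeholder cost matrix
a = np.ones(3) / 3
b = np.ones(3) / 3
P = sinkhorn(C, a, b)
ot_cost = (P * C).sum()              # transport cost used as a loss term
```

Because every step is a differentiable matrix operation, the iterations can be unrolled inside an autodiff framework, which is what makes a Sinkhorn-based objective usable as a training loss.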
Keywords
Knowledge distillation, Graph neural network, Optimal transport
Source
In: Mahmud, M., Doborjeh, M., Wong, K., Leung, A.C.S., Doborjeh, Z., Tanveer, M. (eds) Neural Information Processing (ICONIP 2024) Part of the book series: Lecture Notes in Computer Science (LNCS, vol 15292). pp 194–209. Proceedings of the International Conference on Neural Information Processing, December 2-6, 2024, Auckland, New Zealand. https://iconip2024.org/
Rights statement
This is the Author's Accepted Manuscript version of a conference paper published in the Proceedings of the International Conference on Neural Information Processing (ICONIP 2024) © 2026 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. The Version of Record is available at DOI: 10.1007/978-981-96-6594-5_15
