Knowledge Distillation With Differentiable Optimal Transport on Graph Neural Networks

Authors

Li, Mengyao
Liu, Yanbin
Chen, Ling

Item type

Conference Contribution

Publisher

Springer Nature

Abstract

Knowledge distillation (KD) transfers knowledge from a large, well-trained teacher network to a smaller student network, improving the student's performance without extra computational cost. Traditional KD methods usually focus on logits or intermediate features; however, they may overlook the inherent correlations among samples and suffer from capacity gaps caused by the distinct architectures of the student and teacher. Relation-based distillation methods try to bridge these correlations but usually require a large memory bank for loss computation, making them less efficient. To overcome these limitations, we propose a novel and efficient Optimal Transport-based Graph Distillation (OTGD) method. First, OTGD constructs attributed graphs for the teacher and student respectively, which are then used to capture both individual and relational knowledge through graph neural networks (GNNs). Then, we devise an innovative differentiable optimal transport objective to distill the teacher's knowledge before and after GNN learning, effectively incorporating both feature-level and correlation-level knowledge. Specifically, our optimal transport objective is solved by the Sinkhorn algorithm without relying on an extra memory bank, which makes our method efficient and numerically stable. Comprehensive experiments on two benchmark datasets with diverse network architectures demonstrate that OTGD outperforms state-of-the-art methods.
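The Sinkhorn-based optimal transport objective mentioned in the abstract can be sketched as follows. This is a minimal log-domain illustration in NumPy, not the paper's implementation: the function name `sinkhorn_plan`, the uniform marginals, the regularization value, and the toy teacher/student embeddings are all assumptions made for the example.

```python
import numpy as np

def _lse(x, axis):
    # Numerically stable log-sum-exp along the given axis.
    m = x.max(axis=axis)
    return m + np.log(np.exp(x - np.expand_dims(m, axis)).sum(axis=axis))

def sinkhorn_plan(cost, eps=0.05, n_iters=500):
    """Entropic-regularized OT solved by log-domain Sinkhorn iterations.

    cost : (n, m) pairwise cost matrix, e.g. squared distances between
           student and teacher node embeddings.
    Returns the (n, m) transport plan with uniform marginals; no memory
    bank is needed, only the dual potentials f and g.
    """
    n, m = cost.shape
    log_a = np.full(n, -np.log(n))   # uniform source marginal (log-space)
    log_b = np.full(m, -np.log(m))   # uniform target marginal (log-space)
    f = np.zeros(n)                  # dual potential for rows
    g = np.zeros(m)                  # dual potential for columns
    for _ in range(n_iters):
        f = eps * (log_a - _lse((g[None, :] - cost) / eps, axis=1))
        g = eps * (log_b - _lse((f[:, None] - cost) / eps, axis=0))
    return np.exp((f[:, None] + g[None, :] - cost) / eps)

# Toy usage: align 4 "student" vectors with 4 nearby "teacher" vectors.
rng = np.random.default_rng(0)
s = rng.normal(size=(4, 8))                      # student embeddings
t = s + 0.01 * rng.normal(size=(4, 8))           # teacher embeddings
cost = ((s[:, None, :] - t[None, :, :]) ** 2).sum(-1)
P = sinkhorn_plan(cost)
ot_loss = (P * cost).sum()                       # scalar transport cost
```

Because every step (matrix products, `exp`, `log`) is differentiable, the same iterations can be unrolled in an autodiff framework so the transport cost serves directly as a distillation loss.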

Keywords

Knowledge distillation, Graph neural network, Optimal transport

Source

In: Mahmud, M., Doborjeh, M., Wong, K., Leung, A.C.S., Doborjeh, Z., Tanveer, M. (eds) Neural Information Processing (ICONIP 2024). Lecture Notes in Computer Science, vol 15292, pp 194–209. Proceedings of the International Conference on Neural Information Processing, December 2–6, 2024, Auckland, New Zealand. https://iconip2024.org/

Rights statement

This is the Author's Accepted Manuscript version of a conference paper published in the Proceedings of the International Conference on Neural Information Processing (ICONIP 2024) © 2026 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. The Version of Record is available at DOI: 10.1007/978-981-96-6594-5_15