Knowledge Distillation With Differentiable Optimal Transport on Graph Neural Networks

Authors

Li, Mengyao
Liu, Yanbin
Chen, Ling

Item type

Conference Contribution

Publisher

Springer Nature

Abstract

Knowledge distillation (KD) transfers knowledge from a large, well-trained teacher network to a smaller student network, improving the student's performance without extra computational cost. Traditional KD methods usually focus on logits or intermediate features; however, they may overlook the inherent correlations among samples and suffer from capacity gaps caused by the distinct architectures of the student and teacher. Relation-based distillation methods try to bridge these correlations but usually require a large memory bank for loss computation, making them less efficient. To overcome these limitations, we propose a novel and efficient Optimal Transport-based Graph Distillation (OTGD) method. First, OTGD constructs attributed graphs for the teacher and student respectively, which are then used to capture both individual and relational knowledge through graph neural networks (GNNs). Then, we devise an innovative differentiable optimal transport objective to distill the teacher's knowledge before and after GNN learning, effectively incorporating both feature-level and correlation-level knowledge. Specifically, our optimal transport objective is solved by the Sinkhorn algorithm without relying on an extra memory bank, which makes our method efficient and numerically stable. Comprehensive experiments on two benchmark datasets with diverse network architectures demonstrate that OTGD outperforms state-of-the-art methods.
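The Sinkhorn-based optimal transport objective mentioned in the abstract can be sketched as follows. This is a minimal log-domain illustration in NumPy, not the paper's implementation: the function name `sinkhorn_plan`, the uniform marginals, the regularization value, and the toy teacher/student embeddings are all assumptions made for the example.

```python
import numpy as np

def _lse(x, axis):
    # Numerically stable log-sum-exp along the given axis.
    m = x.max(axis=axis)
    return m + np.log(np.exp(x - np.expand_dims(m, axis)).sum(axis=axis))

def sinkhorn_plan(cost, eps=0.05, n_iters=500):
    """Entropic-regularized OT solved by log-domain Sinkhorn iterations.

    cost : (n, m) pairwise cost matrix, e.g. squared distances between
           student and teacher node embeddings.
    Returns the (n, m) transport plan with uniform marginals; no memory
    bank is needed, only the dual potentials f and g.
    """
    n, m = cost.shape
    log_a = np.full(n, -np.log(n))   # uniform source marginal (log-space)
    log_b = np.full(m, -np.log(m))   # uniform target marginal (log-space)
    f = np.zeros(n)                  # dual potential for rows
    g = np.zeros(m)                  # dual potential for columns
    for _ in range(n_iters):
        f = eps * (log_a - _lse((g[None, :] - cost) / eps, axis=1))
        g = eps * (log_b - _lse((f[:, None] - cost) / eps, axis=0))
    return np.exp((f[:, None] + g[None, :] - cost) / eps)

# Toy usage: align 4 "student" vectors with 4 nearby "teacher" vectors.
rng = np.random.default_rng(0)
s = rng.normal(size=(4, 8))                      # student embeddings
t = s + 0.01 * rng.normal(size=(4, 8))           # teacher embeddings
cost = ((s[:, None, :] - t[None, :, :]) ** 2).sum(-1)
P = sinkhorn_plan(cost)
ot_loss = (P * cost).sum()                       # scalar transport cost
```

Because every step (matrix products, `exp`, `log`) is differentiable, the same iterations can be unrolled in an autodiff framework so the transport cost serves directly as a distillation loss.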

Keywords

Knowledge distillation, Graph neural network, Optimal transport

Source

In: Mahmud, M., Doborjeh, M., Wong, K., Leung, A.C.S., Doborjeh, Z., Tanveer, M. (eds) Neural Information Processing (ICONIP 2024). Lecture Notes in Computer Science, vol 15292, pp 194–209. Proceedings of the International Conference on Neural Information Processing, December 2–6, 2024, Auckland, New Zealand. https://iconip2024.org/

Rights statement

This is the Author's Accepted Manuscript version of a conference paper published in the Proceedings of the International Conference on Neural Information Processing (ICONIP 2024) © 2026 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. The Version of Record is available at DOI: 10.1007/978-981-96-6594-5_15