Knowledge Distillation With Differentiable Optimal Transport on Graph Neural Networks

aut.event.date: 2024-12-02 to 2024-12-06
aut.event.place: Auckland
dc.contributor.author: Li, Mengyao
dc.contributor.author: Liu, Yanbin
dc.contributor.author: Chen, Ling
dc.date.accessioned: 2026-03-12T19:15:13Z
dc.date.available: 2026-03-12T19:15:13Z
dc.date.issued: 2025-07-22
dc.description.abstract: Knowledge distillation (KD) transfers knowledge from a large, well-trained teacher network to a smaller student network, improving the student's performance without extra computational cost. Traditional KD methods usually focus on the logits or intermediate features. However, they might overlook the inherent correlations and suffer from capacity gaps due to the distinct architectures of the student and teacher. Relation-based distillation methods try to bridge these correlations but usually require a large memory bank for loss computation, making them less efficient. To overcome these limitations, we propose a novel and efficient Optimal Transport-based Graph Distillation (OTGD) method. First, OTGD constructs attributed graphs for the teacher and student respectively, which are then utilized to capture both individual and relational knowledge through graph neural networks (GNNs). Then, we devise an innovative differentiable optimal transport objective to distill the teacher knowledge before and after GNN learning, effectively incorporating both feature-level and correlation-level knowledge. Specifically, our optimal transport objective is solved by the Sinkhorn algorithm without relying on an extra memory bank. This design makes our method efficient and numerically stable. Comprehensive experiments conducted on two benchmark datasets with diverse network architectures demonstrate that OTGD outperforms state-of-the-art methods.
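The abstract describes solving a differentiable optimal transport objective with the Sinkhorn algorithm. As a rough illustration only (not the authors' OTGD implementation), an entropic OT loss between teacher and student feature matrices could be sketched in PyTorch as follows; the tensor names and the hyperparameters eps and n_iters are assumptions.

import torch

def sinkhorn_ot_loss(teacher_feats, student_feats, eps=0.1, n_iters=50):
    # Hypothetical sketch: entropic OT distance between two feature sets,
    # solved by Sinkhorn iterations; differentiable, with no memory bank.
    t = torch.nn.functional.normalize(teacher_feats, dim=1)
    s = torch.nn.functional.normalize(student_feats, dim=1)
    cost = torch.cdist(t, s, p=2) ** 2            # pairwise squared distances, shape (n, m)

    n, m = cost.shape
    mu = torch.full((n,), 1.0 / n, device=cost.device)   # uniform source marginal
    nu = torch.full((m,), 1.0 / m, device=cost.device)   # uniform target marginal

    K = torch.exp(-cost / eps)                    # Gibbs kernel from the cost matrix
    u = torch.ones_like(mu)
    for _ in range(n_iters):                      # alternating Sinkhorn scaling updates
        v = nu / (K.t() @ u + 1e-9)
        u = mu / (K @ v + 1e-9)

    plan = u.unsqueeze(1) * K * v.unsqueeze(0)    # approximate transport plan
    return (plan * cost).sum()                    # OT loss, usable as a distillation term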
dc.identifier.citation: In: Mahmud, M., Doborjeh, M., Wong, K., Leung, A.C.S., Doborjeh, Z., Tanveer, M. (eds) Neural Information Processing (ICONIP 2024). Lecture Notes in Computer Science, vol 15292, pp. 194–209. Proceedings of the International Conference on Neural Information Processing, December 2–6, 2024, Auckland, New Zealand. https://iconip2024.org/
dc.identifier.doi: 10.1007/978-981-96-6594-5_15
dc.identifier.uri: http://hdl.handle.net/10292/20761
dc.publisher: Springer Nature
dc.relation.uri: https://link.springer.com/chapter/10.1007/978-981-96-6594-5_15
dc.rights: This is the Author's Accepted Manuscript version of a conference paper published in the Proceedings of the International Conference on Neural Information Processing (ICONIP 2024). © 2026 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. The Version of Record is available at DOI: 10.1007/978-981-96-6594-5_15
dc.rights.accessrights: OpenAccess
dc.subject: Knowledge distillation
dc.subject: Graph neural network
dc.subject: Optimal transport
dc.title: Knowledge Distillation With Differentiable Optimal Transport on Graph Neural Networks
dc.type: Conference Contribution
pubs.elements-id: 593387

Files

Original bundle

Name: ICONIP24___KD_with_OT_loss.pdf
Size: 3.06 MB
Format: Adobe Portable Document Format
Description: Author Accepted Manuscript under publisher's embargo until 22 July 2026

Name: ICONIP 2024 programme.pdf
Size: 3.8 MB
Format: Adobe Portable Document Format
Description: Conference programme

License bundle

Name: license.txt
Size: 1.37 KB
Format: Plain Text