Dual Sparsity Transformer with Contour Loss for Real-Time UAV Image Segmentation
Item type
Journal Article
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Abstract
Integrating a semantic segmentation network into an Unmanned Aerial Vehicle (UAV) improves situational awareness and facilitates autonomous operations in dynamic environments. However, designing such a network for onboard deployment is challenging, as it must achieve high performance while maintaining low computational and memory requirements and ensuring real-time processing capabilities. UAVs typically operate at high altitudes, providing broad ground coverage; however, this results in key objects (such as humans, vehicles, and obstacles) appearing smaller in the imagery, thereby complicating their accurate identification. To address these challenges, we propose a lightweight semantic segmentation network and a network-agnostic loss specifically designed for UAV imagery. The Dual Sparsity Transformer (DST) incorporates two forms of sparsity: data-based sparsity, which reduces computational complexity; and content-based sparsity, which filters out irrelevant information to generate more refined aggregated features. The novel loss leverages predicted contours to capture complex patterns, boundaries, and small objects, imposing a higher penalty for misclassifications in these areas. This encourages the network to prioritize the accurate detection of challenging-to-distinguish objects. Our approach exhibits remarkable accuracy and real-time throughput for 4K resolution images on a mobile GPU, highlighting its effectiveness for onboard deployment in UAV systems.
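The abstract does not give the formulation of the contour loss, so the following is only an illustrative sketch of the general idea it describes: pixels lying on contours of the predicted label map receive a larger weight in a cross-entropy loss. The function names `contour_mask` and `contour_weighted_ce`, the 4-neighbour contour definition, the `contour_weight` parameter, and the normalisation by the sum of weights are all assumptions made for illustration, not the authors' method.

```python
import math


def contour_mask(labels):
    """Mark pixels whose 4-neighbourhood contains a different class label.

    Assumption: this simple neighbour test stands in for whatever contour
    extraction the paper actually uses.
    """
    h, w = len(labels), len(labels[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and labels[ny][nx] != labels[y][x]:
                    mask[y][x] = True
                    break
    return mask


def contour_weighted_ce(probs, target, contour_weight=2.0):
    """Weighted mean cross-entropy with contour pixels up-weighted.

    probs[y][x] is a list of per-class probabilities and target[y][x] is the
    true class index. Contours are taken from the predicted label map
    (argmax of probs). contour_weight is a hypothetical hyperparameter.
    """
    pred = [[max(range(len(p)), key=p.__getitem__) for p in row] for row in probs]
    mask = contour_mask(pred)
    total, weight_sum = 0.0, 0.0
    for y, row in enumerate(probs):
        for x, p in enumerate(row):
            w = contour_weight if mask[y][x] else 1.0
            # Clamp the probability to avoid log(0).
            total += -w * math.log(max(p[target[y][x]], 1e-12))
            weight_sum += w
    return total / weight_sum


# Toy 1x4 image with two classes; the third pixel is misclassified and
# sits on a predicted contour, so up-weighting contours raises the loss.
probs = [[[0.9, 0.1], [0.9, 0.1], [0.4, 0.6], [0.1, 0.9]]]
target = [[0, 0, 0, 1]]
plain = contour_weighted_ce(probs, target, contour_weight=1.0)
weighted = contour_weighted_ce(probs, target, contour_weight=2.0)
```

In this toy case the error falls on a predicted boundary, so `weighted` exceeds `plain`, which is the behaviour the abstract attributes to the loss: mistakes near contours cost more than mistakes in homogeneous regions. Because the sketch depends only on per-pixel probabilities and labels, it could in principle be attached to any segmentation network, in line with the "network-agnostic" description.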
Source
IEEE Transactions on Geoscience and Remote Sensing, ISSN: 0196-2892 (Print); 1558-0644 (Online), Institute of Electrical and Electronics Engineers (IEEE), 1-1. doi: 10.1109/tgrs.2025.3603946
Rights statement
© 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
