A Lightweight Underwater Fish Image Semantic Segmentation Model Based on U-Net

Zhang, Zhenkai; Li, Wanghua; Seet, Boon-Chong

Files

Journal article (1.68 MB)

Date

2024-06-25

Authors

Zhang, Zhenkai

Li, Wanghua

Seet, Boon-Chong

Item type

Journal Article

Publisher

Wiley

Abstract

Semantic segmentation of underwater fish images is vital for monitoring fish stocks, assessing marine resources, and sustaining fisheries. To address the low segmentation accuracy, inadequate real-time performance, and imprecise localization of current methods, a novel lightweight U-Net model is proposed. The proposed model captures more segmentation detail by applying a multiple-input approach at the first four encoder levels. To keep the model lightweight while preserving accuracy, a multi-scale residual structure (MRS) module is proposed, which reduces the parameter count and compensates for the accuracy lost when channels are reduced. To further improve segmentation accuracy, a multi-scale skip connection (MSC) structure is proposed, and the convolutional block attention module (CBAM) is introduced at the end of each decoder level for weight adjustment. Experimental results show that, compared with the original model, model size, parameter count, and floating-point operations are reduced by 94.20%, 94.39%, and 51.52%, respectively. The proposed model achieves a mean intersection over union (mIOU) of 94.44%, a mean pixel accuracy (mPA) of 97.03%, and a frame rate of 43.62 frames per second (FPS). With its high precision and small parameter count, the model balances accuracy and speed, making it well suited to underwater image segmentation.
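For readers unfamiliar with the two accuracy metrics quoted in the abstract, the sketch below shows how mIOU and mPA are commonly computed from a pixel-level confusion matrix. It assumes the standard definitions (per-class IoU = TP / (TP + FP + FN), per-class pixel accuracy = TP / (TP + FN), each averaged over classes); the abstract does not state the authors' exact evaluation protocol, and the function names and the tiny two-class masks are hypothetical, chosen only for illustration.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels; rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou(cm):
    """Average of per-class IoU = TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                          # class c pixels missed
        fp = sum(cm[r][c] for r in range(n)) - tp     # other pixels labelled c
        denom = tp + fp + fn
        if denom:                                     # skip classes absent from both masks
            ious.append(tp / denom)
    return sum(ious) / len(ious)


def mean_pixel_accuracy(cm):
    """Average of per-class accuracy = TP / (all ground-truth pixels of the class)."""
    accs = [cm[c][c] / sum(cm[c]) for c in range(len(cm)) if sum(cm[c])]
    return sum(accs) / len(accs)


# Hypothetical flattened 8-pixel masks (0 = background, 1 = fish); not data from the paper.
truth = [0, 0, 0, 0, 1, 1, 1, 1]
pred = [0, 0, 0, 1, 1, 1, 1, 0]

cm = confusion_matrix(truth, pred, num_classes=2)
print("mIOU:", mean_iou(cm))
print("mPA:", mean_pixel_accuracy(cm))
```

In practice the confusion matrix would be accumulated over every image in the test set before the two averages are taken, so that large and small images contribute in proportion to their pixel counts.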

Keywords

0801 Artificial Intelligence and Image Processing; 0906 Electrical and Electronic Engineering; Artificial Intelligence & Image Processing; 4603 Computer vision and multimedia computation; 4607 Graphics, augmented reality and games

Source

IET Image Processing, ISSN: 1751-9659 (Print); 1751-9667 (Online), Wiley. doi: 10.1049/ipr2.13161

DOI

10.1049/ipr2.13161

Publisher's version

https://ietresearch.onlinelibrary.wiley.com/doi/10.1049/ipr2.13161

Rights statement

© 2024 The Author(s). IET Image Processing published by John Wiley & Sons Ltd on behalf of The Institution of Engineering and Technology. This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.

Permanent link

http://hdl.handle.net/10292/17705

Collections

School of Engineering, Computer and Mathematical Sciences - Te Kura Mātai Pūhanga, Rorohiko, Pāngarau