Journal of Tropical Oceanography

Epibenthos target detection improved model based on RT-DETR

DENG Jianzhi1, TANG Zhenghao2, 3, LI Yun1

  1. Guilin University of Technology, College of Physics and Electronic Information Engineering, Guilin 541006, China

    2. Guilin University of Technology, Engineering Research Center of Optoelectronic Information and Intelligent Communication Technology, Guilin 541006, China

    3. Guilin University of Technology, Key Laboratory of Low-dimensional Structural Physics and Application, Education Department of Guangxi Zhuang Autonomous Region, Guilin 541006, China

  • Received: 2025-07-07 Revised: 2025-10-11 Accepted: 2025-10-30
  • Corresponding author: LI Yun
  • Supported by:

    Guangxi Science and Technology Major Project (Guike AA23062035-2, Guike AA24263038)

Abstract: Detecting epibenthos (shallow-sea benthic organisms) is important for marine ecological monitoring and resource management, but low illumination, blur, and complex backgrounds in underwater images degrade the performance of traditional detection algorithms. This paper proposes MEIE-RTDETR, an improved RT-DETR (real-time detection transformer) model. A multiscale edge information enhance (MEIE) module strengthens feature extraction; adaptive sparse self-attention (ASSA) reduces computational redundancy; and an intensity enhance layer (IELC3) module, which enhances luminance information, improves the feature pyramid and small-target detection. Finally, a combined P-IoU (Powerful-IoU) + NWD (Normalized Wasserstein Distance) loss function improves detection of targets with blurred bounding boxes and of multi-scale targets. Experiments on the DUO (detecting underwater objects) and RUOD (rethinking general underwater object detection) datasets show that the improved model reaches mAP@50 of 85.0% and 85.4%, respectively, outperforming Faster R-CNN (region-based convolutional neural networks), the YOLO (you only look once) series, and the original RT-DETR, while significantly reducing the parameter count (15.83M) and computational cost (49.5 GFLOPs). It offers an effective solution for lightweight, high-precision underwater detection.

Key words: epibenthos, target detection, RT-DETR, multi-scale feature enhancement, adaptive sparse self-attention, lightweight model
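
The abstract pairs an IoU-type loss with NWD but does not give the formulation or weights, so the following is only a minimal sketch of why such a pairing may help with small targets. It uses the commonly published NWD definition, in which each box (cx, cy, w, h) is modeled as a 2D Gaussian. Plain IoU stands in for P-IoU here, and the constant `c`, the weight `alpha`, and all function names are hypothetical choices for illustration, not the authors' implementation.

```python
import math


def iou(a, b):
    """Plain IoU of two boxes given as (cx, cy, w, h). Stand-in for P-IoU."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nwd(a, b, c=12.8):
    """Normalized Wasserstein distance similarity in (0, 1].

    Each box is treated as a Gaussian with mean (cx, cy) and standard
    deviations (w/2, h/2). c is a dataset-dependent scale (assumed value).
    """
    w2 = math.sqrt(
        (a[0] - b[0]) ** 2
        + (a[1] - b[1]) ** 2
        + ((a[2] - b[2]) / 2) ** 2
        + ((a[3] - b[3]) / 2) ** 2
    )
    return math.exp(-w2 / c)


def combined_loss(pred, gt, alpha=0.5, c=12.8):
    """Weighted sum of an IoU loss and an NWD loss (alpha is hypothetical)."""
    return alpha * (1.0 - iou(pred, gt)) + (1.0 - alpha) * (1.0 - nwd(pred, gt, c))


# Two small 4x4 boxes whose centers are 5 px apart do not overlap at all.
pred = (10.0, 10.0, 4.0, 4.0)
gt = (15.0, 10.0, 4.0, 4.0)
print(iou(pred, gt), nwd(pred, gt), combined_loss(pred, gt))
```

In this example the IoU is zero, so an IoU-only loss cannot tell a near miss from a distant one, whereas the NWD term still varies smoothly with the distance between the boxes.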