Journal of Tropical Oceanography

  • Corresponding author: Xing Haihua

Research on Sea-Land Segmentation Method for Remote Sensing Images Based on Visual State Space Model and Attention Mechanism

Wu Jiawei¹, Liu Zijian¹, He Hui², Xing Haihua¹

  1. School of Information Science and Technology, Hainan Normal University, Haikou 571158, China;

  2. Faculty of Arts and Sciences, Beijing Normal University, Zhuhai 519000, China

  • Received: 2025-07-10; Revised: 2025-08-11; Accepted: 2025-08-18
  • Supported by:
    National Natural Science Foundation of China (62066013, 62277007); Hainan Provincial Natural Science Foundation of China (622RC674, 623RC480)

Abstract: Automatic coastline extraction from remote sensing imagery is of great significance for the monitoring, assessment, and management of coastal zone environmental resources. However, complex coastal terrain, large differences in spatial scale, and blurred boundaries make it difficult to achieve sea-land segmentation that is both highly accurate and generalizable. To address these issues, this paper proposes a new sea-land segmentation method, VMA-Net. The method uses a visual state space model as the encoder to accurately model long-range spatial dependencies in remote sensing images, and combines it with an ASPP (atrous spatial pyramid pooling) module and a CBAM (convolutional block attention module), which together improve multi-scale context perception and the representation of key regions. Extensive experiments on three remote sensing coastline datasets, the Benchmark Sea-land Dataset (BSD), GF-HNCD, and sea-land segmentation V1.1, show that VMA-Net outperforms a range of mainstream deep learning methods on quantitative metrics such as mF1 and MIoU. Specifically, on the BSD, GF-HNCD, and sea-land segmentation V1.1 datasets, mF1 reaches 98.35%, 98.38%, and 99.26%, and MIoU reaches 96.75%, 96.81%, and 98.53%, respectively. With 35.46 M parameters and 25.41 GFLOPs, the model achieves a good balance between accuracy and efficiency, providing strong technical support for the intelligent monitoring and scientific management of coastal zone environmental resources.

Key words: Vision Mamba, Attention Mechanism, Remote Sensing Imagery, Coastal Resource Monitoring, Sea-Land Segmentation
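
The abstract reports mF1 and MIoU without stating how they are computed. The sketch below shows one common reading, assuming both are per-class F1 and IoU averaged over the two classes (sea and land). The function names, the 0 = sea / 1 = land label encoding, and the flattened-mask input are illustrative assumptions, not details taken from the paper.

```python
from typing import Sequence, Tuple


def confusion_counts(pred: Sequence[int], truth: Sequence[int], cls: int) -> Tuple[int, int, int]:
    """Count true positives, false positives and false negatives for one class."""
    tp = fp = fn = 0
    for p, t in zip(pred, truth):
        if p == cls and t == cls:
            tp += 1
        elif p == cls:
            fp += 1
        elif t == cls:
            fn += 1
    return tp, fp, fn


def mean_f1_and_miou(pred: Sequence[int], truth: Sequence[int],
                     classes: Sequence[int] = (0, 1)) -> Tuple[float, float]:
    """Return (mF1, MIoU), each averaged over the given classes.

    Assumed encoding: 0 = sea, 1 = land; masks are flattened to 1-D.
    A class that is absent from both masks contributes 0.0 to the mean.
    """
    f1_scores, ious = [], []
    for cls in classes:
        tp, fp, fn = confusion_counts(pred, truth, cls)
        f1_den = 2 * tp + fp + fn   # F1  = 2TP / (2TP + FP + FN)
        iou_den = tp + fp + fn      # IoU = TP / (TP + FP + FN)
        f1_scores.append(2 * tp / f1_den if f1_den else 0.0)
        ious.append(tp / iou_den if iou_den else 0.0)
    return sum(f1_scores) / len(f1_scores), sum(ious) / len(ious)


if __name__ == "__main__":
    # Toy 8-pixel example: two pixels are misclassified.
    predicted = [1, 1, 0, 0, 1, 0, 0, 0]
    reference = [1, 1, 1, 0, 0, 0, 0, 0]
    mf1, miou = mean_f1_and_miou(predicted, reference)
    print(f"mF1 = {mf1:.4f}, MIoU = {miou:.4f}")
```

Because IoU penalizes each error more heavily than F1 does, MIoU is never higher than mF1 for the same prediction, which is consistent with the ordering of the reported scores.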