Contrast Prior and Fluid Pyramid Integration for RGBD Salient Object Detection

Jia-Xing Zhao*1 Yang Cao*1 Deng-Ping Fan*1 Ming-Ming Cheng1 Xuan-Yi Li Le Zhang2

1CS, Nankai University      2A*STAR

Samples from RGBD saliency datasets.

Abstract

The wide availability of depth sensors provides valuable complementary information for salient object detection (SOD) in RGBD images. However, due to the inherent difference between RGB and depth information, extracting features from the depth channel with ImageNet pre-trained backbone models and fusing them directly with RGB features is suboptimal. In this paper, we integrate the contrast prior, which used to be a dominant cue in non-deep-learning-based SOD approaches, into a CNN-based architecture to enhance the depth information. The enhanced depth cues are further integrated with RGB features for SOD using a novel fluid pyramid integration, which makes better use of multi-scale cross-modal features. Comprehensive experiments on 5 challenging benchmark datasets demonstrate the superiority of the CPFP architecture over 9 state-of-the-art alternative methods.

Paper

  • Contrast Prior and Fluid Pyramid Integration for RGBD Salient Object Detection, J Zhao*, Y Cao*, DP Fan*, XY Li, L Zhang, Ming-Ming Cheng, IEEE CVPR, 2019 (*Equal contribution). [bib | pdf | code | dataset [xdvf] | evaluation results]

Method

Overview

Architecture of CPFP. The architecture contains two kinds of modules: the feature-enhanced module (FEM) and the fluid pyramid integration module. The FEM contains two submodules: the contrast-enhanced net and cross-modal fusion. In the contrast-enhanced net, we use a novel contrast loss to leverage the contrast prior in the deep network and generate the enhanced map; cross-modal fusion then produces the enhanced features at all 5 stages of VGG-16. The fluid pyramid integration method is designed to fuse the multi-scale cross-modal features.
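
The cross-modal fusion step can be illustrated with a small sketch. It assumes the fusion is a residual multiplicative modulation, i.e. each RGB feature channel F is combined with the one-channel enhanced map D as F + F * D; this form, the function name `cross_modal_fusion`, and the toy values are assumptions made for illustration and may differ from the released code.

```python
# Hypothetical sketch of cross-modal fusion, NOT the released implementation.
# Assumption: enhanced features = F + F * D, where D is the one-channel
# enhanced depth map with values in [0, 1].

def cross_modal_fusion(features, enhanced_map):
    """Modulate every feature channel by the enhanced map.

    features:     list of C channels, each an H x W nested list of floats.
    enhanced_map: H x W nested list of floats in [0, 1].
    Returns a list of C channels with the same shape as the input.
    """
    fused = []
    for channel in features:
        fused_channel = [
            [f * (1.0 + d) for f, d in zip(f_row, d_row)]
            for f_row, d_row in zip(channel, enhanced_map)
        ]
        fused.append(fused_channel)
    return fused


# Toy example: 2 channels of size 2 x 2.
features = [
    [[1.0, 2.0], [3.0, 4.0]],  # channel 0
    [[0.5, 0.5], [0.5, 0.5]],  # channel 1
]
enhanced_map = [[0.0, 1.0], [0.5, 0.0]]
fused = cross_modal_fusion(features, enhanced_map)
```

Under this assumption, locations where the enhanced map is 0 keep the original RGB features unchanged, while locations with high enhanced-depth response are amplified.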

Qualitative comparisons

Visualization results on SSB1000, NJU2000, LFSD, RGBD135 and NLPR.

Quantitative comparisons

Quantitative comparison results including S-measure, mean F-measure, maximum F-measure and MAE on 5 popular datasets.
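
For readers unfamiliar with these metrics, the sketch below shows how MAE and the F-measure are commonly computed for a single saliency map. It is not the evaluation code behind the reported numbers: the function names are hypothetical, beta^2 = 0.3 is the value conventionally used in the SOD literature, and "mean" is assumed here to average over thresholds (dataset-level scores would further average over images). S-measure is omitted because it is more involved.

```python
# Illustrative per-image metrics on flattened maps; names are hypothetical.

def mae(pred, gt):
    """Mean absolute error between a saliency map and the ground truth.

    pred: flat list of floats in [0, 1]; gt: flat list of 0/1 labels.
    """
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(pred)


def f_measure(pred, gt, threshold, beta2=0.3):
    """F-measure of the map binarized at `threshold` (beta2 is beta squared)."""
    binary = [1 if p >= threshold else 0 for p in pred]
    true_pos = sum(b * g for b, g in zip(binary, gt))
    if true_pos == 0:
        return 0.0  # avoids division by zero when nothing is detected
    precision = true_pos / sum(binary)
    recall = true_pos / sum(gt)
    return (1 + beta2) * precision * recall / (beta2 * precision + recall)


def max_and_mean_f_measure(pred, gt, num_thresholds=255):
    """Sweep thresholds in (0, 1] and return (maximum, mean) F-measure."""
    scores = [
        f_measure(pred, gt, t / num_thresholds)
        for t in range(1, num_thresholds + 1)
    ]
    return max(scores), sum(scores) / len(scores)


# Toy example with four pixels.
pred = [0.9, 0.8, 0.2, 0.1]
gt = [1, 1, 0, 0]
error = mae(pred, gt)
max_f, mean_f = max_and_mean_f_measure(pred, gt)
```

Lower MAE is better, while higher F-measure is better; the maximum F-measure reports the best threshold, whereas the mean is less sensitive to a single favorable threshold.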

We provide all the available datasets (NJU2K, DES, GIT, LFSD, NLPR, SIP, SSD, STERE), along with the training set and the list we used, on the code page.

If you find our work helpful, please cite:

@inproceedings{zhao2019Contrast,

title={Contrast Prior and Fluid Pyramid Integration for RGBD Salient Object Detection},

author={Zhao, Jia-Xing and Cao, Yang and Fan, Deng-Ping and Cheng, Ming-Ming and Li, Xuan-Yi and Zhang, Le},

booktitle={CVPR},

year={2019}

}

@inproceedings{fan2017structure,

title={{Structure-measure: A New Way to Evaluate Foreground Maps}},

author={Fan, Deng-Ping and Cheng, Ming-Ming and Liu, Yun and Li, Tao and Borji, Ali},

booktitle={IEEE International Conference on Computer Vision (ICCV)},

pages={4548--4557},

year={2017},

note={\url{http://dpfan.net/smeasure/}},

organization={IEEE}

}

Further Related Work

We also provide a novel and simple state-of-the-art architecture for salient object detection in ICCV 2019. For more details, refer to

Contact

zhaojiaxing AT mail.nankai.edu.cn

yangcao.cs AT gmail DOT com

dengpingfan AT mail.nankai.edu.cn
