Deeply Supervised Salient Object Detection with Short Connections

Qibin Hou1  Ming-Ming Cheng1  Xiaowei Hu1  Ali Borji2  Zhuowen Tu3  Philip H. S. Torr4

1CCCE, Nankai University      2CRCV, UCF     3UCSD     4The University of Oxford

Abstract

Recent progress on salient object detection is substantial, benefiting mostly from the explosive development of Convolutional Neural Networks (CNNs). Semantic segmentation and salient object detection algorithms developed lately have been mostly based on Fully Convolutional Neural Networks (FCNs). There is still much room for improvement over the generic FCN models that do not explicitly deal with the scale-space problem. The Holistically-Nested Edge Detector (HED) provides a skip-layer structure with deep supervision for edge and boundary detection, but the performance gain of HED on saliency detection is not obvious. In this paper, we propose a new salient object detection method by introducing short connections to the skip-layer structures within the HED architecture. Our framework takes full advantage of multi-level and multi-scale features extracted from FCNs, providing more advanced representations at each layer, a property that is critically needed to perform segment detection. Our method produces state-of-the-art results on 5 widely tested salient object detection benchmarks, with advantages in terms of efficiency (0.08 seconds per image), effectiveness, and simplicity over the existing algorithms. Beyond that, we conduct an exhaustive analysis of the role of training data in performance. Our experimental results provide a more reasonable and powerful training set for future research and fair comparisons.

Paper

Source Code

You can find our code here. We have uploaded the Caffe and CRF packages used in our paper.

If you find our work helpful, please cite:

@inproceedings{hou2016deeply,
  title={Deeply supervised salient object detection with short connections},
  author={Hou, Qibin and Cheng, Ming-Ming and Hu, Xiaowei and Borji, Ali and Tu, Zhuowen and Torr, Philip},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2017}
}

Contact

andrewhoux AT gmail DOT com

Applications

This algorithm is used in flagship products such as the Huawei Mate 10 and Huawei Honor V10 to create the “AI Selfie: Brilliant Bokeh, perfect portraits” effects, as demonstrated at the Mate 10 launch show in Munich, Germany.

 

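noname
Guest

I have a question about the implementation details in the paper, which state that "each image is trained for ten times." Does this mean each image is trained on 10 times in total, or that each image is trained 10 times in a row? A later note mentions that iter_size is set to 10, but that seems to be a way of increasing the effective batch_size, and I cannot connect it to my reading of that sentence.

Also, if the deconvolution layer uses a fixed kernel, can it be replaced with an ordinary resize operation without affecting back propagation? In TensorFlow, the resize operation also takes part in back propagation.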
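
On the iter_size point: in Caffe, the solver's `iter_size` accumulates gradients over that many forward/backward passes before a single weight update, so the effective batch size is batch_size × iter_size. The sketch below is a toy illustration in plain Python, not the paper's training code; the one-parameter linear model, the data, and all function names are assumptions made up for the example. It only shows that averaging ten single-image gradients yields the same update as one batch of ten, and it does not settle how "each image is trained for ten times" should be read.

```python
# Toy model: prediction = w * x, per-sample loss = 0.5 * (w * x - y) ** 2.
def grad_single(w, x, y):
    """Gradient of the per-sample squared loss with respect to w."""
    return (w * x - y) * x

def grad_batch(w, xs, ys):
    """Gradient of the mean loss over one batch containing all samples."""
    return sum(grad_single(w, x, y) for x, y in zip(xs, ys)) / len(xs)

def grad_accumulated(w, xs, ys, iter_size):
    """Accumulate gradients over iter_size passes of batch size 1, then average."""
    assert len(xs) == iter_size
    acc = 0.0
    for x, y in zip(xs, ys):      # one forward/backward pass per image
        acc += grad_single(w, x, y)
    return acc / iter_size        # normalized gradient used for one update

xs = [0.5 * i for i in range(1, 11)]   # ten hypothetical inputs
ys = [2.0 * x + 1.0 for x in xs]       # ten hypothetical targets
w = 0.3

g_batch = grad_batch(w, xs, ys)
g_accum = grad_accumulated(w, xs, ys, iter_size=10)
print(abs(g_batch - g_accum) < 1e-12)
```

With batch size 1 and iter_size 10, each weight update in this toy setup therefore behaves like an update computed from a batch of ten images.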

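张守东
Guest

A question for everyone: for saliency detection, how should the training set fed into Caffe be labeled? (Image classification I understand: different images are labeled with different classes.)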

MM Cheng
Admin

As with semantic segmentation methods such as fully convolutional neural networks, the entire label map serves as the ground-truth annotation.
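
To make this concrete: instead of one class label per image, each training image is paired with a mask of the same height and width in which every pixel is marked as salient or background. The following is a minimal sketch in plain Python, assuming the annotation is a grayscale mask with values from 0 to 255 and using a hypothetical threshold of 128; neither the function name nor the threshold is taken from the paper or its released code.

```python
def to_label_map(mask, threshold=128):
    """Binarize a grayscale mask (rows of 0..255 ints) into a 0/1 label map."""
    return [[1 if v >= threshold else 0 for v in row] for row in mask]

# A hypothetical 4x4 annotation: the bright block in the middle is the salient object.
mask = [
    [0,   0,   0,   0],
    [0, 255, 250,   0],
    [0, 240, 255,  10],
    [0,   0,   0,   0],
]

label_map = to_label_map(mask)
for row in label_map:
    print(row)
```

The resulting label map has one entry per pixel, so the loss can be computed pixel by pixel against the network's saliency output.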

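inkfish
Guest

Hello, I would like to retrain your network, but I cannot find the MSRA-B dataset; the download link on Microsoft's site no longer works. Could you provide a download link for MSRA-B?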

MM Cheng
Admin

Download links for all the related datasets (Baidu Netdisk) can be found on the project page of our 2015 IEEE TIP benchmark paper.

Wan Yuqi
Guest

Hi, Qibin. When I train the model, I cannot get past the error "Unknown layer type: ImageLabelmapData". If you know how to solve this, please help me. Thank you very much!

Chen
Guest

I ran into the same problem. Have you found a way to solve it?

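wsw
Guest

Hello, I have read your paper and there is one point I do not understand. In the definition of the unary term in Section 3.3 (Inference), the denominator contains a sigmoid function. Does x take values in {0,1}? If so, the range of h(x) would be {0.5, e/(e+1)}. Is that the right way to understand it?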
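
The arithmetic in this question can be checked directly. The sketch below assumes h is the standard sigmoid, h(x) = 1 / (1 + e^(-x)), and only evaluates it at x = 0 and x = 1; it does not settle which values x actually takes in the paper's unary term.

```python
import math

def sigmoid(x):
    """Standard logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))

h0 = sigmoid(0)   # 1 / (1 + 1)
h1 = sigmoid(1)   # 1 / (1 + e^-1), which equals e / (e + 1)

print(h0)
print(abs(h1 - math.e / (math.e + 1)) < 1e-12)
```

So under that assumption the two values are 0.5 and e/(e+1), roughly 0.731, as stated in the question.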

flyer
Guest

Did you manage to compile HED successfully?

flyer
Guest

Could you tell me how you compiled the Caffe provided with HED?

flyer
Guest

Hello, when compiling the Caffe provided with HED, I commented out USE_CUDNN=1 (another page said that compiling it requires cuda4, but I have cudnn5). Running make all then fails with: cublas.h_v2.h:No such file or directory.