Deeply Supervised Salient Object Detection with Short Connections

14/03/2018 Qibin Hou

Qibin Hou¹ Ming-Ming Cheng¹ Xiaowei Hu¹ Ali Borji² Zhuowen Tu³ Philip H. S. Torr⁴

¹CCCE, Nankai University ²CRCV, UCF ³UCSD ⁴The University of Oxford

Online Demo

Abstract

Recent progress on salient object detection is substantial, benefiting mostly from the explosive development of Convolutional Neural Networks (CNNs). Semantic segmentation and salient object detection algorithms developed lately have been mostly based on Fully Convolutional Neural Networks (FCNs). There is still a large room for improvement over the generic FCN models that do not explicitly deal with the scale-space problem. Holistically-Nested Edge Detector (HED) provides a skip-layer structure with deep supervision for edge and boundary detection, but the performance gain of HED on saliency detection is not obvious. In this paper, we propose a new salient object detection method by introducing short connections to the skip-layer structures within the HED architecture. Our framework takes full advantage of multi-level and multi-scale features extracted from FCNs, providing more advanced representations at each layer, a property that is critically needed to perform segment detection. Our method produces state-of-the-art results on 5 widely tested salient object detection benchmarks, with advantages in terms of efficiency (0.08 seconds per image), effectiveness, and simplicity over the existing algorithms. Beyond that, we conduct an exhaustive analysis of the role of training data on performance. Our experimental results provide a more reasonable and powerful training set for future research and fair comparisons.

Paper

Deeply supervised salient object detection with short connections, Qibin Hou, Ming-Ming Cheng, Xiaowei Hu, Ali Borji, Zhuowen Tu, Philip Torr, IEEE TPAMI (CVPR 2017), 2019. [pdf] [Project Page] [bib] [source code & data] [official version] [poster] [poster (Chinese version)] [LaTeX]

Source Code

You can find our code here. We have uploaded the Caffe and CRF packages used in our paper.

If you find our work helpful, please cite:

@article{HouPami19Dss,
  title={Deeply supervised salient object detection with short connections},
  author={Hou, Qibin and Cheng, Ming-Ming and Hu, Xiaowei and Borji, Ali and Tu, Zhuowen and Torr, Philip},
  year  = {2019},
  volume={41}, 
  number={4}, 
  pages={815-828}, 
  journal= {IEEE TPAMI},
  doi = {10.1109/TPAMI.2018.2815688},
}

Contact

andrewhoux AT gmail DOT com

Applications

This algorithm is used in flagship products such as the Huawei Mate 10 and the Huawei Honor V10 to create the “AI Selfie: Brilliant Bokeh, perfect portraits” effects, as demonstrated at the Mate 10 launch show in Munich, Germany.

A report in Nature: link.

40 thoughts on “Deeply Supervised Salient Object Detection with Short Connections”

Yi Liu

18/05/2018 at 14:50

Hello! Only 1447 saliency maps exist in your published results for the HKU-IS dataset. However, as far as I know, HKU-IS contains 4447 images. Why?
- Qibin Hou (Post author)
  
  26/07/2018 at 11:20
  
  The rest are used for training and validation (see the paper).
Mr.Xiang

15/05/2018 at 11:14

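Hello! When 2000 test images are selected from MSRA-B, are these 2000 images chosen at random? If they are, the test sets differ from paper to paper, so how can the results be compared?
- Qibin Hou (Post author)
  
  26/07/2018 at 11:20
  
  The split is the one provided by the DRFI paper.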
yangliang

05/05/2018 at 21:42

In your TPAMI publication “Deeply Supervised Salient Object Detection with Short Connections”, you said that “We use full resolution images to train our network, and the mini batch size is set to 10.” Does that mean you use the original training images and set the batch size to 10? I found that some images in the MSRA-B training set have different sizes, so how do you put images of different sizes in one batch? In addition, could you please give the details of the learning rate, decay parameters, and step size? Thank you so much!
- Qibin Hou (Post author)
  
  07/05/2018 at 17:36
  
  You can set iter_size to 10, which solves the problem of images with different resolutions. Learning rate: 0.001 if you normalize the loss. Weight decay: 5e-4. Step size: 8000 (12,000 iterations in total).
  - yangliang
    
    07/05/2018 at 18:12
    
    So which optimization method do you use: Adam or Momentum?
  - yangliang
    
    07/05/2018 at 22:06
    
    And what do you mean by “normalize the loss”? Thank you so much!
    - Qibin Hou (Post author)
      
      08/05/2018 at 16:36
      
      Please refer to our source code for more details: https://github.com/Andrew-Qibin/DSS
  - Nan
    
    05/11/2018 at 16:33
    
    Do you mean that we should train the model with an initial learning rate of 0.001? The base learning rate specified in the paper and in the open-source code is 1e-8. When we tried to train the model with a learning rate of 1e-8 using the Momentum optimizer, the side-output layers did not seem to learn features properly. Some of the side-output layers always output completely white images regardless of the input image. What do you think may cause this? Thanks for your time.
    - Qibin Hou (Post author)
      
      05/11/2018 at 18:23
      
      The choice of the initial lr depends on whether the reduction parameter (PyTorch) in the loss layer is activated (the ‘normalize’ parameter in Caffe). That is, if your total loss is divided by the number of pixels, you can set lr to 1e-3. If not, you need to set it to 1e-8 or less.
      - Nan
        
        06/11/2018 at 10:14
        
        Thanks a lot for your timely reply.
      - Nan
        
        08/11/2018 at 10:22
        
        By the way, is it necessary to set different learning rates for the backbone network layers and the side-output layers?
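The exchange above ties the learning rate to whether the loss is normalized, and suggests iter_size = 10 for training images of different sizes. The sketch below illustrates both points under loud assumptions; it is not the authors' code. A hypothetical one-weight per-pixel classifier stands in for the network, gradient accumulation is assumed to mean "run one image per pass and average the gradients over the passes", and the toy images and their sizes are made up.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def image_gradient(pixels, labels, w, normalize):
    """Gradient of the per-pixel cross-entropy loss w.r.t. the weight w.

    The toy model predicts sigmoid(w * x) for every pixel value x.
    With normalize=True the summed loss (and so its gradient) is divided
    by the number of pixels in the image.
    """
    grad = sum((sigmoid(w * x) - y) * x for x, y in zip(pixels, labels))
    return grad / len(pixels) if normalize else grad

def accumulated_gradient(batch, w, normalize):
    """Process one image per pass, so sizes may differ, then average."""
    total = 0.0
    for pixels, labels in batch:
        total += image_gradient(pixels, labels, w, normalize)
    return total / len(batch)

# Three made-up "images" with 100, 200 and 300 pixels, all labeled salient.
batch = [([1.0] * n, [1.0] * n) for n in (100, 200, 300)]

g_norm = accumulated_gradient(batch, w=0.0, normalize=True)
g_sum = accumulated_gradient(batch, w=0.0, normalize=False)
print(g_norm, g_sum, g_sum / g_norm)
```

In this toy setup the unnormalized gradient is larger than the normalized one by the average pixel count (200). By the same reasoning, for an image of roughly 400 × 300 pixels (an assumed size), 1e-3 / 120000 is about 8e-9, which is on the order of the 1e-8 mentioned in the reply.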
Xiaowei

25/03/2018 at 21:27

I find that your TPAMI publication “Deeply Supervised Salient Object Detection with Short Connections” improves considerably on the original conference version. The main change is replacing VGG with ResNet-101. I followed your paper and replaced the base model (VGG) with ResNet-101, but I cannot reproduce the results reported in your paper (on some datasets, the results are even worse than with VGG). Would you please share the detailed network parameters for ResNet-101 (such as train_val.prototxt and solver.prototxt)?
- Qibin Hou (Post author)
  
  25/03/2018 at 22:24
  
  Thanks for your interest. I will update my GitHub repo soon.
noname

28/12/2017 at 16:58

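I would like to ask about the implementation details of this paper, which mention that “each image is trained for ten times”. Does this mean that each image goes through training 10 times in total, or that each image is trained 10 times in a row? A later note says iter_size is set to 10, but that seems to be used to increase the effective batch_size, and I cannot reconcile it with my understanding of the sentence above.

Also, if the deconvolution layer uses a fixed kernel, can it be replaced by an ordinary resize operation without affecting back propagation? In TensorFlow, the resize operation also takes part in back propagation.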
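The second question above can be explored with a small 1-D experiment. This is a sketch under stated assumptions, not an answer from the authors: the fixed kernel is assumed to be the bilinear kernel commonly used to initialize FCN-style deconvolution layers, and the resize is assumed to use half-pixel centers with edge clamping. The helper names `bilinear_kernel`, `deconv_upsample`, and `resize_linear` are hypothetical.

```python
import math

def bilinear_kernel(factor):
    """1-D bilinear kernel for upsampling by an integer factor."""
    size = 2 * factor - factor % 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    return [1 - abs(i - center) / factor for i in range(size)]

def deconv_upsample(signal, factor):
    """Transposed convolution with stride=factor and the fixed kernel."""
    kernel = bilinear_kernel(factor)
    full = [0.0] * ((len(signal) - 1) * factor + len(kernel))
    for i, value in enumerate(signal):
        for j, weight in enumerate(kernel):
            full[i * factor + j] += value * weight
    # Crop the borders so the output has factor * len(signal) samples.
    pad = (len(kernel) - factor) // 2
    return full[pad:pad + factor * len(signal)]

def resize_linear(signal, factor):
    """Linear resize with half-pixel centers and edge clamping."""
    last = len(signal) - 1
    out = []
    for o in range(factor * len(signal)):
        src = (o + 0.5) / factor - 0.5
        lo = math.floor(src)
        frac = src - lo
        left = signal[min(max(lo, 0), last)]
        right = signal[min(max(lo + 1, 0), last)]
        out.append((1 - frac) * left + frac * right)
    return out

signal = [1.0, 2.0, 4.0, 8.0]
up_deconv = deconv_upsample(signal, 2)
up_resize = resize_linear(signal, 2)
print(up_deconv)
print(up_resize)
```

In this setting the two outputs agree on every interior sample and differ only at the two border samples, where the transposed convolution implicitly pads with zeros while the resize clamps to the edge value. Both operations are linear in the input and have no learned parameters here, so gradients reach the input through either of them; whether the border difference matters for training is not something this sketch settles.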
张守东

11/09/2017 at 15:00

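A question for everyone here: for saliency detection, how should the training set fed into Caffe be annotated? (Image classification is easy for me to understand: different images are labeled with different classes.)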
- MM Cheng
  
  27/12/2017 at 17:25
  
  As in semantic segmentation methods such as fully convolutional neural networks, the whole label map serves as the ground-truth annotation.
  - Nan
    
    05/11/2018 at 15:14
    
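    Hello Prof. Cheng, I really appreciate that you open-source your excellent work and share it with everyone. I have one question: when annotating data, is there a unified standard to reduce subjective bias? I noticed that in some images in the dataset both the head and the body of an animal are labeled as salient, while in others only the head or face is labeled as salient and the neck and body are labeled as non-salient. Thank you for your answer!
    - Qibin Hou (Post author)
      
      05/11/2018 at 18:25
      
      This depends entirely on the annotator's understanding of the concept of saliency. Perfect consistency is very hard to achieve.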
      - Nan
        
        06/11/2018 at 10:15
        
        Understood, thank you very much!
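The reply in this thread says that the whole label map is the ground truth, as in semantic segmentation. A minimal sketch of what that could look like for a single image follows. It assumes the annotation is stored as an 8-bit grayscale mask (0 for background, 255 for salient) and binarizes it at a hypothetical threshold of 128; the function name `to_label_map` and the toy mask are illustrative only.

```python
def to_label_map(mask, threshold=128):
    """Binarize an 8-bit grayscale mask into a per-pixel {0, 1} label map."""
    return [[1 if value >= threshold else 0 for value in row] for row in mask]

# Made-up 3x4 annotation: bright values mark salient pixels.
mask = [
    [0,   0,   0, 0],
    [0, 255, 255, 0],
    [0, 255, 250, 0],
]

label_map = to_label_map(mask)
salient_pixels = sum(map(sum, label_map))
print(label_map, salient_pixels)
```

Each training sample is then an (image, label map) pair of the same spatial size, instead of an (image, class index) pair as in image classification.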
inkfish

03/08/2017 at 10:21

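Hello, I would like to retrain your network, but I cannot find the MSRA-B dataset; the download link on Microsoft's site no longer works. Could you provide a download link for MSRA-B?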
- MM Cheng
  
  26/08/2017 at 04:38
  
  Downloads for all the related datasets (Baidu Netdisk) are available on the project page of our 2015 IEEE TIP benchmark paper.
Wan Yuqi

22/07/2017 at 03:00

Hi, Qibin. When I train the model, I cannot resolve the error “Unknown layer type: ImageLabelmapData”. If you know how to solve it, please help me. Thank you very much!
- Chen
  
  09/10/2017 at 16:13
  
  I ran into the same problem. Do you have a way to solve it now?
wsw

26/06/2017 at 15:53

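Hello, I read your paper, and there is one part I did not understand. In the definition of the unary term in Section 3.3 (Inference), the denominator contains a sigmoid function. Does x take values in {0, 1}? If so, the range of h(x) would be {0.5, e/(e+1)}. Is this understanding correct?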
- flyer
  
  29/06/2017 at 02:59
  
  Did you manage to compile HED successfully?
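The arithmetic in the question at the top of this thread can be checked directly. The snippet only evaluates the standard sigmoid at the two values named there; whether x is restricted to {0, 1} in the paper is not confirmed by anything on this page.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Values of the sigmoid for a binary input, as assumed in the question.
binary_values = [sigmoid(0), sigmoid(1)]
print(binary_values, math.e / (math.e + 1))
```

Since the sigmoid is increasing, an input that instead varied continuously over [0, 1] would give values anywhere in the interval between these two numbers.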
flyer

22/06/2017 at 14:28

Could you tell me how you compiled the Caffe provided by HED?
- Qibin Hou (Post author)
  
  29/06/2017 at 14:25
  
  I will tidy up our Caffe over the next few days and then upload it.
  - flyer
    
    30/06/2017 at 03:54
    
    Thank you very much!
    - Qibin Hou (Post author)
      
      30/06/2017 at 08:53
      
      You may find it here: https://github.com/Andrew-Qibin/caffe_dss
      - flyer
        
        03/07/2017 at 13:24
        
        Thank you! However, when I build ‘caffe_dss’ with the command ‘make test’, it fails with ‘caffe/layers/hybrid_cross_entropy_loss_layer.hpp: No such file or directory compilation terminated’. Indeed, I cannot find the file ‘caffe/layers/hybrid_cross_entropy_loss_layer.hpp’.
      - flyer
        
        03/07/2017 at 14:51
        
        The file ‘caffe/layers/soft_iou_loss_layer.hpp’ also can’t be found.
      - flyer
        
        03/07/2017 at 15:19
        
        I checked the files in ‘src/caffe/test/’. Some header files included by the files in ‘src/caffe/test/’ exist, such as ‘channel_wise_cross_entropy_loss_layer.hpp’, ‘channel_wise_scale_layer.hpp’, ‘cross_entropy_loss_layer.hpp’, ‘full_cross_entropy_loss_layer.hpp’, ‘hybrid_cross_entropy_loss_layer.hpp’, and ‘iou_loss_layer.hpp’. Please tell us how to get these files. Thank you very much.
      - flyer
        
        04/07/2017 at 00:54
        
        I checked the files in ‘src/caffe/test/’. Some header files included by the files in ‘src/caffe/test/’ do not exist, such as ‘channel_wise_cross_entropy_loss_layer.hpp’, ‘channel_wise_scale_layer.hpp’, ‘cross_entropy_loss_layer.hpp’, ‘full_cross_entropy_loss_layer.hpp’, ‘hybrid_cross_entropy_loss_layer.hpp’, and ‘iou_loss_layer.hpp’. Please tell us how to get these files. Thank you very much!
flyer

22/06/2017 at 14:27

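Hello, when compiling the Caffe provided by HED, I commented out USE_CUDNN=1 (I read on another page that building it requires cuda4, but mine is cudnn5). Then ‘make all’ fails with: cublas.h_v2.h: No such file or directory.
- Qibin Hou (Post author)
  
  29/06/2017 at 14:25
  
  Set USE_CUDNN=0.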
  - flyer
    
    30/06/2017 at 03:54
    
    Thank you very much!
  - jiao
    
    14/09/2018 at 12:00
    
    Is it possible to build without cuDNN?
    - Qibin Hou (Post author)
      
      18/09/2018 at 10:58
      
      Sure!
