
Deeply Supervised Salient Object Detection with Short Connections

Qibin Hou1, Ming-Ming Cheng1, Xiaowei Hu1, Ali Borji2, Zhuowen Tu3, Philip H. S. Torr4

1CCCE, Nankai University      2CRCV, UCF     3UCSD     4The University of Oxford

Online Demo

Abstract

Recent progress on salient object detection is substantial, benefiting mostly from the explosive development of Convolutional Neural Networks (CNNs). Semantic segmentation and salient object detection algorithms developed lately have been mostly based on Fully Convolutional Neural Networks (FCNs). There is still a large room for improvement over the generic FCN models that do not explicitly deal with the scale-space problem. Holistically-Nested Edge Detector (HED) provides a skip-layer structure with deep supervision for edge and boundary detection, but the performance gain of HED on saliency detection is not obvious. In this paper, we propose a new salient object detection method by introducing short connections to the skip-layer structures within the HED architecture. Our framework takes full advantage of multi-level and multi-scale features extracted from FCNs, providing more advanced representations at each layer, a property that is critically needed to perform segment detection. Our method produces state-of-the-art results on 5 widely tested salient object detection benchmarks, with advantages in terms of efficiency (0.08 seconds per image), effectiveness, and simplicity over the existing algorithms. Beyond that, we conduct an exhaustive analysis of the role of training data on performance. Our experimental results provide a more reasonable and powerful training set for future research and fair comparisons.
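As a rough illustration of the idea described in the abstract, the sketch below fuses a deeper, coarser side output into a shallower, finer one through a short connection. It is a toy example and not the released Caffe implementation: the function names, the nearest-neighbour upsampling, the map sizes, and the fixed fusion weight are all assumptions made for illustration.

```python
# Toy sketch of a "short connection": a deep (coarse) side output is
# upsampled and added to a shallow (fine) side output before the sigmoid.
# All names, sizes, and weights here are illustrative assumptions.
import math


def upsample_nearest(feature_map, factor):
    """Repeat every value `factor` times along both axes."""
    rows = []
    for row in feature_map:
        wide = [value for value in row for _ in range(factor)]
        rows.extend([list(wide) for _ in range(factor)])
    return rows


def fuse_short_connection(shallow, deep, factor, weight=1.0):
    """Add the upsampled deep activation to the shallow activation."""
    deep_up = upsample_nearest(deep, factor)
    return [
        [s + weight * d for s, d in zip(s_row, d_row)]
        for s_row, d_row in zip(shallow, deep_up)
    ]


def sigmoid_map(activation):
    """Turn raw activations into per-pixel saliency probabilities."""
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in activation]


# A 2x2 deep side output (strong object evidence in the top-left cell)
# and a 4x4 shallow side output with finer but weaker responses.
deep = [[4.0, -4.0], [-4.0, -4.0]]
shallow = [[0.5, -0.5, 0.0, 0.0],
           [0.5, 0.5, 0.0, 0.0],
           [0.0, 0.0, 0.0, 0.0],
           [0.0, 0.0, 0.0, 0.0]]

fused = fuse_short_connection(shallow, deep, factor=2)
saliency = sigmoid_map(fused)
```

In this toy setting the deep map tells the shallow layer where the object is, while the shallow map keeps the finer spatial detail; in a real network the fusion weights would be learned rather than fixed.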

Paper

Source Code

You can find our code here. We have uploaded the Caffe and CRF packages used in our paper.

If you find our work helpful, please cite:

@article{HouPami19Dss,
  title={Deeply supervised salient object detection with short connections},
  author={Hou, Qibin and Cheng, Ming-Ming and Hu, Xiaowei and Borji, Ali and Tu, Zhuowen and Torr, Philip},
  journal={IEEE TPAMI},
  year={2019},
  volume={41},
  number={4},
  pages={815-828},
  doi={10.1109/TPAMI.2018.2815688},
}

Contact

andrewhoux AT gmail DOT com

Applications

This algorithm is used in flagship products such as the Huawei Mate 10 and Huawei Honor V10 to create the “AI Selfie: Brilliant Bokeh, perfect portraits” effects, as demonstrated at the Mate 10 launch event in Munich, Germany.

A report in Nature: link.


40 Comments
Yi Liu

Hello! Only 1447 saliency maps exist in your published results for the HKU-IS dataset. However, as far as I know, HKU-IS contains 4447 images. Why?

Mr.Xiang

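Hello! Regarding the 2000 test images selected from MSRA-B: are these 2000 images chosen at random? If they are, the test sets used in different papers would all differ, so how can the results be compared?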

yangliang

In your TPAMI publication “Deeply Supervised Salient Object Detection with Short Connections”, you state: “We use full resolution images to train our network, and the mini batch size is set to 10.” Does that mean you use the original training images and set the batch size to 10? I found that some images in the MSRA-B training set have different sizes, so how do you put images of different sizes in one batch? In addition, could you please give details on the learning rate, decay parameters, and step size? Thank you so much!

yangliang

So which optimization method do you use: Adam or Momentum?

yangliang

And what do you mean by “normalize the loss”? Thank you so much!

Nan

Do you mean that we should train the model with an initial learning rate of 0.001? The base learning rate specified in the paper as well as in the open-source code is 1e-8. When we tried to train the model with a learning rate of 1e-8 using the Momentum optimizer, the side-output layers did not seem to learn features correctly. Some of the side-output layers always output completely white images regardless of the input image. What do you think might cause this? Thanks for your time.

Nan

Thanks a lot for your timely reply.

Nan

By the way, is it necessary to set different learning rates for the backbone network layers and the side-output layers?

Xiaowei

I find that your TPAMI publication “Deeply Supervised Salient Object Detection with Short Connections” is a great improvement over the original conference version. The main change is the use of ResNet-101 in place of VGG. I followed your paper and replaced the base model (VGG) with ResNet-101, but I cannot reproduce the results reported in the paper (on some datasets, the results are even worse than with VGG). Would you please share the detailed network parameters for ResNet-101 (such as train_val.prototxt and solver.prototxt)?

noname

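I have a question. The implementation details of this paper mention that “each image is trained for ten times”. Does this mean that each image goes through training 10 times in total, or that each image is trained 10 times in a row? A later note mentions that iter_size is set to 10, but that setting seems to be for increasing the batch size, and I cannot reconcile it with my reading of the sentence above.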
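For context on the iter_size part of this question: in Caffe, gradients from iter_size consecutive forward/backward passes are accumulated and averaged before a single weight update, so the effective batch size is batch_size × iter_size. A minimal Python sketch of that accumulation follows; the one-parameter model and the helper names are hypothetical and stand in for a real network, and the sketch does not settle what the paper's sentence means.

```python
# Sketch of gradient accumulation, the mechanism behind an `iter_size`
# style setting. The one-parameter model y = w * x and the helper names
# are hypothetical.

def mean_squared_error_grad(w, batch):
    """Gradient of mean((w*x - y)^2) over the batch with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, mini_batches):
    """Average the gradients of several equally sized mini-batches."""
    total = 0.0
    for batch in mini_batches:
        total += mean_squared_error_grad(w, batch)
    return total / len(mini_batches)


data = [(1.0, 2.0), (2.0, 4.5), (3.0, 5.5), (4.0, 8.5)]
w = 0.5

# One pass over a batch of 4 ...
full_batch_grad = mean_squared_error_grad(w, data)
# ... versus two accumulated passes over batches of 2 (iter_size = 2).
two_step_grad = accumulated_grad(w, [data[:2], data[2:]])
```

Both gradients are the same here, which is why such a setting is usually described as a way to enlarge the batch when memory is limited. It changes how many images contribute to one update, not how often an image is visited.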

Also, if the deconvolution layer uses a fixed kernel, can it be replaced by an ordinary resize operation without affecting back propagation? In TensorFlow, the resize operation also takes part in back propagation.
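For context on this second question: a deconvolution layer whose kernel is fixed to bilinear weights computes bilinear interpolation, which is why a resize operation is often treated as a substitute. The sketch below builds such a kernel with the formula commonly used to initialise upsampling layers in FCN-style Caffe models. Whether the released DSS model fixes its deconvolution kernels in exactly this way is an assumption, not something stated on this page.

```python
import math


def bilinear_kernel_1d(kernel_size):
    """1-D bilinear interpolation weights for a deconvolution kernel."""
    f = math.ceil(kernel_size / 2.0)
    c = (2 * f - 1 - f % 2) / (2.0 * f)
    return [1.0 - abs(x / f - c) for x in range(kernel_size)]


def bilinear_kernel_2d(kernel_size):
    """Outer product of the 1-D kernel with itself."""
    k = bilinear_kernel_1d(kernel_size)
    return [[a * b for b in k] for a in k]


# A 2x upsampling layer typically uses kernel_size = 4 and stride = 2.
kernel_1d = bilinear_kernel_1d(4)
kernel_2d = bilinear_kernel_2d(4)
```

Because these weights are constants rather than learned parameters, gradients still flow through the layer to earlier layers in either formulation. The border handling and pixel alignment of a framework's resize operation may differ slightly from the deconvolution, so outputs are not guaranteed to match exactly.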

张守东

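A question for the authors: for saliency detection, how should the training set fed into Caffe be labeled? (For image classification it is easy to understand: different images are labeled with different classes.)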

MM Cheng

As in semantic segmentation methods such as fully convolutional neural networks, the entire label map serves as the ground-truth annotation.
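To make this reply concrete: each training sample pairs an image with a binary mask of the same size, and the loss is computed per pixel. The following is a minimal sketch, assuming a plain (unweighted) cross-entropy averaged over pixels. The function name and the tiny 2×3 maps are made up for illustration, and the released code may weight or normalise the loss differently.

```python
import math


def pixelwise_cross_entropy(prediction, label_map):
    """Mean binary cross-entropy between a saliency map and its mask.

    prediction: 2-D list of probabilities in (0, 1)
    label_map:  2-D list of 0/1 ground-truth labels, same shape
    """
    total, count = 0.0, 0
    for pred_row, label_row in zip(prediction, label_map):
        for p, y in zip(pred_row, label_row):
            total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
            count += 1
    return total / count


# Ground truth: 1 marks salient pixels, 0 marks background.
label_map = [[0, 1, 1],
             [0, 1, 0]]
good_prediction = [[0.1, 0.9, 0.8],
                   [0.2, 0.9, 0.1]]
poor_prediction = [[0.9, 0.1, 0.2],
                   [0.8, 0.1, 0.9]]

good_loss = pixelwise_cross_entropy(good_prediction, label_map)
poor_loss = pixelwise_cross_entropy(poor_prediction, label_map)
```

The prediction that agrees with the mask gets the lower loss, so no per-image class label is needed: the mask itself is the annotation.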

Nan

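Hello Prof. Cheng, I really appreciate that you open-source your excellent work and share it with everyone. I have a question: when annotating the data, is there a unified standard to reduce subjective bias? I have noticed that in some images in the dataset both the head and the body of an animal are labeled as salient, while in others only the head or face is labeled as salient and the neck and body are labeled as non-salient. Thank you for your answer!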

Nan

Understood, thank you very much!

inkfish

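Hello, I would like to retrain your network, but I cannot find the MSRA-B dataset; the download link on the Microsoft site no longer works. Could you provide a download link for MSRA-B?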

MM Cheng

Download links (Baidu Netdisk) for all the related datasets can be found on the project page of our 2015 IEEE TIP benchmark paper.

Wan Yuqi

Hi, Qibin. When I train the model, I cannot get past the error “Unknown layer type: ImageLabelmapData”. If you know how to solve this problem, please help me. Thank you very much!

Chen

I ran into the same problem. Have you found a way to solve it?

wsw

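Hello, I have read your paper, and there is one point I do not understand. In the definition of the unary term in Section 3.3 (Inference), the denominator contains a sigmoid function. Is the range of x {0, 1}? In that case the range of h(x) would be {0.5, e/(e+1)}. Is this understanding correct?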
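The arithmetic in this question can be checked directly. Assuming, as the commenter does, that h is the standard logistic sigmoid and that x takes only the values 0 and 1 (neither assumption is confirmed on this page), the two possible values are computed below.

```python
import math


def sigmoid(x):
    """Standard logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


h0 = sigmoid(0)   # exactly 0.5
h1 = sigmoid(1)   # equals e / (e + 1), roughly 0.731
```

Under those assumptions the commenter's two values are correct; whether x is restricted to {0, 1} in the paper's formulation is a separate question for the authors.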

flyer

Did you manage to compile HED successfully?

flyer

Could you tell me how you compiled the Caffe provided by HED?

flyer

Thank you very much!

flyer

Thank you! However, when I build ‘caffe_dss’ with the command ‘make test’, it reports ‘caffe/layers/hybrid_cross_entropy_loss_layer.hpp: No such file or directory compilation terminated’. Indeed, I cannot find the file ‘caffe/layers/hybrid_cross_entropy_loss_layer.hpp’.

flyer

The file ‘caffe/layers/soft_iou_loss_layer.hpp’ cannot be found either.

flyer

I checked the files under ‘src/caffe/test/’. Some header files included by the files in that directory do not exist, such as ‘channel_wise_cross_entropy_loss_layer.hpp’, ‘channel_wise_scale_layer.hpp’, ‘cross_entropy_loss_layer.hpp’, ‘full_cross_entropy_loss_layer.hpp’, ‘hybrid_cross_entropy_loss_layer.hpp’, and ‘iou_loss_layer.hpp’. Please tell us how to get these files. Thank you very much!

flyer

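Hello, when compiling the Caffe provided by HED, I commented out USE_CUDNN=1 (I read on another page that compiling it requires cuda4, but mine is cudnn5). Then make all fails with the message: cublas.h_v2.h: No such file or directory.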

flyer

Thank you very much!

jiao

Is it possible to build without cuDNN?