
S4Net: Single Stage Salient-Instance Segmentation

Ruochen Fan1 Ming-Ming Cheng2 Qibin Hou2 Tai-Jiang Mu1 Jingdong Wang3 Shi-Min Hu1

1Tsinghua University    2Nankai University    3Microsoft Research Asia

Abstract

In this paper, we consider an interesting problem: salient instance segmentation. Besides producing bounding boxes, our network also outputs high-quality instance-level segments. Taking into account the category-independent property of each target, we design a single-stage salient instance segmentation framework with a novel segmentation branch. Our new branch considers not only the local context inside each detection window but also its surrounding context, enabling us to distinguish instances in the same scope even under occlusion. Our network is end-to-end trainable and runs fast (40 fps when processing an image with resolution 320 × 320). We evaluate our approach on a publicly available benchmark and show that it outperforms alternative solutions. We also provide a thorough analysis of the design choices to help readers better understand the function of each part of our network. The source code can be found at https://github.com/RuochenFan/S4Net.

Paper

  • S4Net: Single Stage Salient-Instance Segmentation, Ruochen Fan, Ming-Ming Cheng, Qibin Hou, Tai-Jiang Mu, Jingdong Wang, Shi-Min Hu, CVPR, 2019. [code] [pdf] [Project Page] [bib]

If you find our work helpful, please cite:

@inproceedings{Fan2019S4Net,
  title={S4Net: Single Stage Salient-Instance Segmentation},
  author={Ruochen Fan and Ming-Ming Cheng and Qibin Hou and Tai-Jiang Mu and Jingdong Wang and Shi-Min Hu},
  booktitle={IEEE CVPR},
  year={2019},
}

Contact

644142239 AT qq DOT com  (Ruochen Fan)

RoIMasking

We propose RoIMasking to explicitly incorporate foreground/background separation into salient instance segmentation. We mark the region surrounding each object proposal as the initial background, and our segmentation branch exploits the resulting foreground/background feature separation. More specifically, we flip the signs of the feature values in the region surrounding each proposal.
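The sign flip can be illustrated with a small NumPy sketch. This is not the released implementation; it only shows the idea under stated assumptions. The function name `roi_masking`, the `margin` parameter (width of the surrounding ring in feature-map cells), the box convention, and the choice to zero out everything beyond the ring are all assumptions made for illustration.

```python
import numpy as np


def roi_masking(features, box, margin):
    """Illustrative sketch of a sign-flipping mask (names are hypothetical).

    features: array of shape (C, H, W).
    box:      (x0, y0, x1, y1) in feature-map coordinates, end-exclusive.
    margin:   assumed width of the surrounding ring, in feature-map cells.
    """
    _, h, w = features.shape
    x0, y0, x1, y1 = box

    # Assumption: features outside the ring are zeroed out.
    mask = np.zeros((h, w), dtype=features.dtype)

    # Expanded box, clipped to the feature map: the surrounding context.
    ex0, ey0 = max(x0 - margin, 0), max(y0 - margin, 0)
    ex1, ey1 = min(x1 + margin, w), min(y1 + margin, h)
    mask[ey0:ey1, ex0:ex1] = -1  # flip the sign around the proposal

    # Features inside the proposal keep their sign.
    mask[y0:y1, x0:x1] = 1

    # Broadcast the (H, W) mask over all channels.
    return features * mask


features = np.ones((2, 8, 8), dtype=np.float32)
masked = roi_masking(features, box=(3, 3, 5, 5), margin=2)
```

With an all-ones feature map, cells inside the box stay positive, cells in the ring become negative, and the rest become zero, so later layers can tell the proposal from its immediate context by sign alone.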

Network Structure

(a) A brief illustration of our framework. (b) The segmentation branch proposed in Mask R-CNN, which is composed of a stack of consecutive convolutional layers. (c) Our proposed segmentation branch which further enlarges the size of the receptive field.

Visualization Results
