PoolNet+: Exploring the Potential of Pooling for Salient Object Detection
Jiang-Jiang Liu, Qibin Hou, Zhi-Ang Liu, Ming-Ming Cheng
TKLNDST, CS, Nankai University
Online Demo
Abstract
We solve the problem of salient object detection by investigating how to expand the role of pooling in convolutional neural networks. Based on the U-shape architecture, we first build a global guidance module (GGM) upon the bottom-up pathway, aiming to provide layers at different feature levels with the location information of potential salient objects. We further design a feature aggregation module (FAM) to fuse the coarse-level semantic information with the fine-level features from the top-down pathway. By adding FAMs after the fusion operations in the top-down pathway, coarse-level features from the GGM can be seamlessly merged with features at various scales. These two pooling-based modules allow the high-level semantic features to be progressively refined, yielding detail-enriched saliency maps. Experimental results show that our proposed approach locates salient objects more accurately, with sharper details, and hence substantially improves performance over previous state-of-the-art methods. Our approach is also fast, running at more than 30 FPS when processing a 300×400 image.
Paper
- PoolNet+: Exploring the Potential of Pooling for Salient Object Detection, Jiang-Jiang Liu, Qibin Hou, Zhi-Ang Liu, Ming-Ming Cheng*, IEEE TPAMI, 2021. [project | code | bib | pdf | Chinese translation]
- A Simple Pooling-Based Design for Real-Time Salient Object Detection, Jiang-Jiang Liu*, Qibin Hou*, Ming-Ming Cheng, Jiashi Feng, Jianmin Jiang. In CVPR 2019. (*Equal contribution) [pdf][code]
Source Code
A new PyTorch-based framework is available, which includes the source code for PoolNet (continuously updated)!
Method
Pipeline
We build our architecture on feature pyramid networks (FPNs) [22], a classic U-shape architecture designed in a bottom-up and top-down manner. We introduce a global guidance module (GGM) built on top of the bottom-up pathway. By aggregating the high-level information extracted by the GGM into the feature maps at each feature level, we aim to explicitly inform the layers at different feature levels of where the salient objects are. After the guidance information from the GGM is merged with the features at different levels, we further introduce a feature aggregation module (FAM) to ensure that feature maps at different scales can be merged seamlessly.
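To make the design concrete, below is a minimal PyTorch sketch of the two modules. The channel sizes, pooling scales, and module internals here are illustrative assumptions rather than the exact configuration of the released code; please refer to the paper and repository for the precise settings.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GGM(nn.Module):
    """Global guidance module (sketch): pools the top backbone feature at
    several adaptive scales, projects each branch with a 1x1 conv,
    upsamples back, and fuses everything into a feature carrying global
    location cues for salient objects."""

    def __init__(self, in_ch, out_ch, scales=(1, 3, 5)):  # scales are assumed
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.AdaptiveAvgPool2d(s),
                nn.Conv2d(in_ch, out_ch, 1, bias=False),
                nn.ReLU(inplace=True),
            )
            for s in scales
        )
        self.fuse = nn.Conv2d(in_ch + len(scales) * out_ch, out_ch, 3, padding=1)

    def forward(self, x):
        h, w = x.shape[2:]
        feats = [x] + [
            F.interpolate(b(x), size=(h, w), mode='bilinear', align_corners=False)
            for b in self.branches
        ]
        return self.fuse(torch.cat(feats, dim=1))


class FAM(nn.Module):
    """Feature aggregation module (sketch): average-pools the merged
    feature at several rates, upsamples each result back, and sums them
    with the identity before a 3x3 conv, smoothing upsampling artifacts."""

    def __init__(self, ch, rates=(2, 4, 8)):  # rates are assumed
        super().__init__()
        self.rates = rates
        self.conv = nn.Conv2d(ch, ch, 3, padding=1)

    def forward(self, x):
        h, w = x.shape[2:]
        out = x
        for r in self.rates:
            y = F.avg_pool2d(x, kernel_size=r, stride=r)
            out = out + F.interpolate(y, size=(h, w), mode='bilinear', align_corners=False)
        return self.conv(out)
```

In a full model one might, for example, apply the GGM to the top backbone stage and follow each top-down fusion with a FAM; the paper and released code give the exact placement and channel settings.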
Qualitative comparisons
Quantitative comparisons
PR Curves
Joint Training with Edge Detection
(a) Source image; (b) Ground truth; (c-d) Edge maps and saliency maps obtained by using the boundaries of salient objects as the ground truth for the edge branch; (e-f) Edge maps and saliency maps obtained by joint training with the edge dataset [1, 29].
Speed Comparisons
If you find our work helpful, please cite
@article{Liu21PamiPoolNet,
  title   = {PoolNet+: Exploring the Potential of Pooling for Salient Object Detection},
  author  = {Jiang-Jiang Liu and Qibin Hou and Zhi-Ang Liu and Ming-Ming Cheng},
  journal = {IEEE TPAMI},
  year    = {2021},
  volume  = {},
  number  = {},
  pages   = {-}
}

@article{HouPami19Dss,
  title   = {Deeply Supervised Salient Object Detection with Short Connections},
  author  = {Hou, Qibin and Cheng, Ming-Ming and Hu, Xiaowei and Borji, Ali and Tu, Zhuowen and Torr, Philip},
  journal = {IEEE TPAMI},
  year    = {2019},
  volume  = {41},
  number  = {4},
  pages   = {815-828}
}

@inproceedings{Liu19PoolNet,
  title     = {A Simple Pooling-Based Design for Real-Time Salient Object Detection},
  author    = {Jiang-Jiang Liu and Hou, Qibin and Cheng, Ming-Ming and Feng, Jiashi and Jiang, Jianmin},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition},
  year      = {2019}
}
Contact
{j04.liu, andrewhoux} AT gmail DOT com
Has the official source code for PoolNet+ been open-sourced?
The paper was accepted right around my graduation, and I have been busy with work since then, so I have not had time to clean up the code. I will try to organize and release it as soon as possible. Sorry for the delay!
Has the code for PoolNet+ been released? Clicking the "code" link does not lead anywhere.
Hello, how is the feature map visualization in Figure 4 of the paper implemented? Looking forward to your reply, thank you!
Hi, it is obtained by simply averaging the feature map at the corresponding position over the channel dimension for the input image.
Got it, thank you!
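For reference, the channel-mean visualization described in the reply above can be sketched as follows; the function name, normalization, and output handling are assumptions for illustration.

```python
import torch
import matplotlib.pyplot as plt


def visualize_feature_map(feat: torch.Tensor, save_path: str) -> None:
    """Average a (C, H, W) feature map over the channel dimension and
    save the result as a grayscale heat map."""
    fmap = feat.detach().float().mean(dim=0)                       # (H, W): channel-wise mean
    fmap = (fmap - fmap.min()) / (fmap.max() - fmap.min() + 1e-8)  # scale to [0, 1]
    plt.imsave(save_path, fmap.cpu().numpy(), cmap='gray')
```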