Richer Convolutional Features for Edge Detection
Yun Liu1 Ming-Ming Cheng1 Xiaowei Hu1 Jia-Wang Bian1 Le Zhang2 Xiang Bai3 Jinhui Tang4
1Nankai University 2ADSC 3HUST 4NUST

Online demo at https://mc.nankai.edu.cn/edge
Abstract
Edge detection is a fundamental problem in computer vision. Recently, convolutional neural networks (CNNs) have pushed this field forward significantly. Existing methods that adopt only specific layers of deep CNNs may fail to capture complex data structures caused by variations in scale and aspect ratio. In this paper, we propose an accurate edge detector using richer convolutional features (RCF). RCF encapsulates all convolutional features into a more discriminative representation, which makes good use of rich feature hierarchies and is amenable to training via backpropagation. RCF fully exploits the multiscale and multilevel information of objects to perform image-to-image prediction holistically. Using the VGG16 network, we achieve state-of-the-art performance on several available datasets. When evaluated on the well-known BSDS500 benchmark, we achieve an ODS F-measure of 0.811 while running at a fast speed (8 FPS). In addition, a fast version of RCF achieves an ODS F-measure of 0.806 at 30 FPS. We also demonstrate the versatility of the proposed method by applying RCF edges to classical image segmentation.
Papers
- Richer Convolutional Features for Edge Detection, Yun Liu, Ming-Ming Cheng, Xiaowei Hu, Jia-Wang Bian, Le Zhang, Xiang Bai, Jinhui Tang, IEEE TPAMI, 2019. [pdf] [Project Page] [bib] [source code] [official version] [latex]
- Richer Convolutional Features for Edge Detection, Yun Liu, Ming-Ming Cheng, Xiaowei Hu, Kai Wang, Xiang Bai, IEEE CVPR, 2017. [pdf] [Project Page] [bib] [source code, pre-trained models, evaluation results, etc]
We have released the code and data for plotting the edge PR curves of many existing edge detectors here.
Motivation

Method
Our RCF network architecture. The input is an image of arbitrary size, and our network outputs an edge probability map of the same size. We combine hierarchical features from all conv layers into a holistic framework, in which all parameters are learned automatically. Since the receptive field sizes of the conv layers in VGG16 differ from each other, our network can learn multiscale information, from low-level to object-level, that is helpful for edge detection.
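For readers who want a concrete picture of this design, below is a minimal PyTorch-style sketch of the idea (the released code is Caffe-based, so layer names, channel counts, and pooling placement here are illustrative assumptions, not the official implementation): every conv layer of a VGG16-like backbone contributes a 1×1-conv response, responses are summed within each stage, upsampled to the input size, and fused by a learned 1×1 conv.

```python
# Minimal, illustrative RCF-style network sketch (assumption: PyTorch, not the released Caffe code).
import torch
import torch.nn as nn
import torch.nn.functional as F


def vgg_stage(in_ch, out_ch, n_convs):
    """One VGG16-like stage: n_convs 3x3 conv + ReLU layers (pooling handled outside)."""
    layers, ch = [], in_ch
    for _ in range(n_convs):
        layers += [nn.Conv2d(ch, out_ch, 3, padding=1), nn.ReLU(inplace=True)]
        ch = out_ch
    return nn.ModuleList(layers)


class RCFSketch(nn.Module):
    def __init__(self):
        super().__init__()
        # (in_channels, out_channels, number of conv layers) per VGG16 stage
        cfg = [(3, 64, 2), (64, 128, 2), (128, 256, 3), (256, 512, 3), (512, 512, 3)]
        self.stages = nn.ModuleList([vgg_stage(i, o, n) for i, o, n in cfg])
        # one 1x1 "score" conv per conv layer, collected per stage
        self.scores = nn.ModuleList([
            nn.ModuleList([nn.Conv2d(o, 21, 1) for _ in range(n)]) for _, o, n in cfg
        ])
        self.stage_out = nn.ModuleList([nn.Conv2d(21, 1, 1) for _ in cfg])
        self.fuse = nn.Conv2d(len(cfg), 1, 1)  # fuse the 5 upsampled side outputs

    def forward(self, x):
        h, w = x.shape[2:]
        side_outputs = []
        for s, (stage, scores, out_conv) in enumerate(
                zip(self.stages, self.scores, self.stage_out)):
            acc, k = 0, 0
            for layer in stage:
                x = layer(x)
                if isinstance(layer, nn.ReLU):
                    acc = acc + scores[k](x)  # every conv layer contributes a response
                    k += 1
            side = F.interpolate(out_conv(acc), size=(h, w),
                                 mode='bilinear', align_corners=False)
            side_outputs.append(side)
            if s < len(self.stages) - 1:
                x = F.max_pool2d(x, 2)  # downsample between stages, as in VGG16
        fused = self.fuse(torch.cat(side_outputs, dim=1))
        return [torch.sigmoid(o) for o in side_outputs + [fused]]
```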
The pipeline of our multiscale algorithm. The original image is resized to construct an image pyramid, and these multiscale images are fed into the RCF network in separate forward passes. We then use bilinear interpolation to restore the resulting edge response maps to the original size. A simple average of these edge maps yields high-quality edges.
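The sketch below illustrates this test-time procedure, reusing the `RCFSketch` model from the previous snippet; the scale set (0.5×, 1.0×, 1.5×) and function name are assumptions for illustration only.

```python
# Hedged sketch of the multiscale test-time procedure described above.
import torch
import torch.nn.functional as F


@torch.no_grad()
def multiscale_edges(model, image, scales=(0.5, 1.0, 1.5)):
    """image: 1x3xHxW tensor; returns an HxW averaged edge probability map."""
    h, w = image.shape[2:]
    edge_sum = torch.zeros_like(image[:, :1])  # 1x1xHxW accumulator
    for s in scales:
        resized = F.interpolate(image, scale_factor=s, mode='bilinear',
                                align_corners=False)
        fused = model(resized)[-1]  # fused side output of the network
        edge_sum += F.interpolate(fused, size=(h, w),  # restore original size
                                  mode='bilinear', align_corners=False)
    return (edge_sum / len(scales)).squeeze()
```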
Evaluation on BSDS500 dataset
Performance summary of 50+ years of edge detection history. Our method is the first real-time system that achieves a better F-measure than average human annotators. (Data for this figure can be found here.)

FAQs:
1. How is your system able to outperform humans, whose annotations are used as ground truth?
We do not think our method outperforms humans in general. It only achieves a better F-measure than the average human annotator on the BSDS500 benchmark. Given more time and careful training, human annotators could do better.
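For reference, the reported numbers are F-measures, i.e. the harmonic mean of boundary precision and recall; ODS uses a single threshold for the whole dataset, while OIS picks the best threshold per image. A minimal illustration (the example precision/recall values below are made up, not results from the paper):

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall, as used in edge benchmarks."""
    return 2 * precision * recall / (precision + recall)

# e.g. precision = 0.82, recall = 0.80  ->  f_measure(0.82, 0.80) ~= 0.810
```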
Related Papers
- A Simple Pooling-Based Design for Real-Time Salient Object Detection, Jiang-Jiang Liu#, Qibin Hou#, Ming-Ming Cheng*, Jiashi Feng, Jianmin Jiang, IEEE CVPR, 2019. [project|bib|pdf|poster]