Richer Convolutional Features for Edge Detection

Yun Liu1      Ming-Ming Cheng1     Xiaowei Hu1      Jia-Wang Bian1        Le Zhang2       Xiang Bai3      Jinhui Tang4

1Nankai University        2ADSC     3HUST       4NUST

Online demo at https://mc.nankai.edu.cn/edge

A simple demo captured with my phone (if your connection is slow, you can watch it on Xigua Video).

Abstract

Edge detection is a fundamental problem in computer vision. Recently, convolutional neural networks (CNNs) have pushed forward this field significantly. Existing methods which adopt specific layers of deep CNNs may fail to capture complex data structures caused by variations of scales and aspect ratios. In this paper, we propose an accurate edge detector using richer convolutional features (RCF). RCF encapsulates all convolutional features into more discriminative representation, which makes good usage of rich feature hierarchies, and is amenable to training via backpropagation. RCF fully exploits multiscale and multilevel information of objects to perform the image-to-image prediction holistically. Using VGG16 network, we achieve state-of-the-art performance on several available datasets. When evaluating on the well-known BSDS500 benchmark, we achieve ODS F-measure of 0.811 while retaining a fast speed (8 FPS). Besides, our fast version of RCF achieves ODS F-measure of 0.806 with 30 FPS. We also demonstrate the versatility of the proposed method by applying RCF edges for classical image segmentation.

Papers

We have released the code and data for plotting the edge PR curves of many existing edge detectors here.

Motivation

We build a simple network based on VGG16 to produce side outputs of conv3_1, conv3_2, conv3_3, conv4_1, conv4_2 and conv4_3. One can clearly see that convolutional features become coarser gradually, and the intermediate layers conv3_1, conv3_2, conv4_1, and conv4_2 contain lots of useful fine details that do not appear in other layers.

Method

Our RCF network architecture. The input is an image of arbitrary size, and our network outputs an edge probability map of the same size. We combine hierarchical features from all the conv layers into a holistic framework, in which all of the parameters are learned automatically. Since the receptive field sizes of the conv layers in VGG16 differ from each other, our network can learn multiscale information, including low-level and object-level information, that is helpful for edge detection.

The pipeline of our multiscale algorithm. The original image is resized to construct an image pyramid, and these multiscale images are fed to the RCF network for a forward pass. We then use bilinear interpolation to restore the resulting edge response maps to the original size. A simple average of these edge maps yields high-quality edges.
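The multiscale pipeline described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the released Caffe code: `resize_bilinear`, `multiscale_edges` and `toy_edge_net` are hypothetical names, the scales `(0.5, 1.0, 1.5)` are an assumption, and `toy_edge_net` is a gradient-magnitude stand-in for the real RCF forward pass.

```python
import numpy as np


def resize_bilinear(img, out_h, out_w):
    """Bilinear resize of an (H, W) or (H, W, C) float array."""
    in_h, in_w = img.shape[:2]
    # Map output pixel centres back into input coordinates.
    ys = np.clip((np.arange(out_h) + 0.5) * in_h / out_h - 0.5, 0, in_h - 1)
    xs = np.clip((np.arange(out_w) + 0.5) * in_w / out_w - 0.5, 0, in_w - 1)
    y0, x0 = np.floor(ys).astype(int), np.floor(xs).astype(int)
    y1, x1 = np.minimum(y0 + 1, in_h - 1), np.minimum(x0 + 1, in_w - 1)
    wy, wx = ys - y0, xs - x0
    if img.ndim == 3:
        wy, wx = wy[:, None, None], wx[None, :, None]
    else:
        wy, wx = wy[:, None], wx[None, :]
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy


def toy_edge_net(img):
    """Stand-in for the RCF forward pass (assumption): gradient magnitude in [0, 1]."""
    gray = img.mean(axis=2) if img.ndim == 3 else img
    gy, gx = np.gradient(gray)
    mag = np.hypot(gx, gy)
    return mag / (mag.max() + 1e-12)


def multiscale_edges(image, edge_net, scales=(0.5, 1.0, 1.5)):
    """Average the edge maps predicted on an image pyramid."""
    h, w = image.shape[:2]
    acc = np.zeros((h, w), dtype=np.float64)
    for s in scales:
        sh, sw = max(2, int(round(h * s))), max(2, int(round(w * s)))
        scaled = resize_bilinear(image.astype(np.float64), sh, sw)
        edge = edge_net(scaled)                # (sh, sw) edge response map
        acc += resize_bilinear(edge, h, w)     # restore to the original size
    return acc / len(scales)                   # simple average over scales


image = np.zeros((40, 60, 3))
image[:, 30:] = 1.0                            # synthetic vertical step edge
edges = multiscale_edges(image, toy_edge_net)
print(edges.shape)
```

In a real setup `edge_net` would wrap the network's forward pass; everything else in the loop stays the same.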

Evaluation on BSDS500 dataset

Performance summary of 50+ years of edge detection history. Our method is the first real-time system to achieve a better F-measure than human annotators. (Data for this figure can be found here)

The comparison with some competitors on BSDS500 dataset. The top three results are highlighted in red, green and blue respectively.
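For readers unfamiliar with the metric, the ODS F-measure can be sketched as follows. The F-measure is the harmonic mean of precision and recall, and ODS (optimal dataset scale) reports the best F-measure obtained with a single threshold shared by the whole dataset. This is a simplified sketch, not the official benchmark code: the function names and the per-image, per-threshold count arrays are hypothetical, and the edge matching that produces those counts is assumed to have been done already.

```python
import numpy as np


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0 where both are 0)."""
    precision = np.asarray(precision, dtype=np.float64)
    recall = np.asarray(recall, dtype=np.float64)
    denom = precision + recall
    return np.where(denom > 0, 2 * precision * recall / np.maximum(denom, 1e-12), 0.0)


def ods_f_measure(tp_pred, n_pred, tp_gt, n_gt):
    """ODS F-measure from counts of shape (num_images, num_thresholds).

    tp_pred: predicted edge pixels matched to ground truth
    n_pred:  all predicted edge pixels
    tp_gt:   ground-truth edge pixels matched by the prediction
    n_gt:    all ground-truth edge pixels
    """
    # Pool the counts over the dataset first, then score each threshold.
    precision = tp_pred.sum(axis=0) / np.maximum(n_pred.sum(axis=0), 1)
    recall = tp_gt.sum(axis=0) / np.maximum(n_gt.sum(axis=0), 1)
    f = f_measure(precision, recall)
    best = int(np.argmax(f))
    return float(f[best]), best


# Two images, three thresholds (made-up counts for illustration only).
tp_pred = np.array([[8, 6, 3], [9, 7, 2]])
n_pred = np.array([[20, 8, 3], [18, 9, 2]])
tp_gt = np.array([[9, 6, 3], [9, 7, 2]])
n_gt = np.array([[10, 10, 10], [10, 10, 10]])
ods, best_threshold_index = ods_f_measure(tp_pred, n_pred, tp_gt, n_gt)
print(ods, best_threshold_index)
```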

FAQs:

1. How is your system able to outperform humans, whose annotations are used as the ground truth?

We don’t think our method outperforms humans in general. It only achieves a better F-measure than the average human annotator on the BSDS500 benchmark. Given more time and careful training, human annotators could do better.

Related Papers

  • A Simple Pooling-Based Design for Real-Time Salient Object Detection, Jiang-Jiang Liu#, Qibin Hou#, Ming-Ming Cheng*, Jiashi Feng, Jianmin Jiang, IEEE CVPR, 2019. [project|bib|pdf|poster]

207 Comments
21AT

Hello, has the code for the ResNet implementation of RCF been released?

XuYun

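Hello Professor, my GPU has only 1 GB of memory, so it ran out of memory with this error: F1227 14:17:22.056977 1292 syncedmem.cpp:56] Check failed: error == cudaSuccess (2 vs. 0) out of memory. Would the results still be correct if I change batch_size? I'm a complete beginner.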

XuYun

The problem occurred while testing on BSDS500.

MM Cheng

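Your GPU memory is too small. It probably won't be enough no matter how you change batch_size. For model training, I'd recommend buying a GPU with 11 GB of memory or more.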

XuYun

OK, thank you, Professor!

Zjm

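Hello Professor! At the very start of training I get "Iteration 0, Testing net (#0)" and "Iteration 0, loss = -nan", and every subsequent loss is also -nan. What is going on? Running the test code with your pretrained model works fine.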

Zjm

OK, got it, thanks!

Fan Yang

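Hello Professor, I have a couple of questions: 1. Does the side output of stage 1 have a deconv layer that is not drawn in the figure? 2. What are the padding and stride of each layer? Thank you!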

chunzhe

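Hello Professor, I have read your paper. In trainval.prototxt, every deconvolution is followed by an autocrop layer. Why is that, and can the autocrop layer be omitted? Second question: the data layer of the VGG16 network crops inputs to 224*224, but your files do no cropping and feed in the whole image. Why? Third question: the label fed to the network here is a whole image rather than the coordinates of objects as in detection. If the data layer crops the input, does the label image need to be cropped as well? I don't quite understand these points and look forward to your reply. Thank you, Professor!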

Rufeng Zhang

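Hello, I have read your paper. For BSDS there is a threshold x used to select the labels to ignore, but the paper says that because NYUD is annotated by a single person, the threshold x is not needed for NYUD. I looked at the GT of both BSDS and NYUD and found that their labels are both in the range [0, 255]. Is the same operation applied here? Can the (0, 128) part still be ignored? Looking forward to your reply! Also, is NYUD tested with the same script as BSDS? I could not find any test code on its official website.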

Ming

May I ask how you implemented RCF on ResNet?

Ming

OK, thank you very much. Will the code to be released in a few days be a PyTorch version?

huangjie

I1105 20:11:10.021775 26811 layer_factory.hpp:77] Creating layer data

F1105 20:11:10.021813 26811 layer_factory.hpp:81] Check failed: registry.count(type) == 1 (0 vs. 1) Unknown layer type: ImageLabelmapData
(known types: AbsVal, Accuracy, ArgMax, BNLL, BatchNorm, BatchReindex, Bias, Clip, Concat, ContrastiveLoss, Convolution, Crop, Data,
Deconvolution, Dropout, DummyData, ELU, Eltwise, Embed, EuclideanLoss, Exp, Filter, Flatten, HDF5Data, HDF5Output, HingeLoss, Im2col,
ImageData, InfogainLoss, InnerProduct, Input, LRN, LSTM, LSTMUnit, Log, MVN, MemoryData, MultinomialLogisticLoss, PReLU, Parameter,
Pooling, Power, Python, RNN, ReLU, Reduction, Reshape, SPP, Scale, Sigmoid, SigmoidCrossEntropyLoss, Silence, Slice, Softmax,
SoftmaxWithLoss, Split, Swish, TanH, Threshold, Tile, WindowData)
*** Check failure stack trace: ***
Aborted (core dumped)

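Hello Professor, I ran into this problem while training the RCF network. I have tried many approaches without solving it. Looking forward to your reply.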

Chun Xie

Hello Professor, with BVLC Caffe I get the error "unknown layer type: autocrop". Where can I find the Caffe that you used for RCF?

Chen Y

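Hello Professor, how do I use the Caffe for RCF? Do I just build it with make? After running make, RCF-multiscale fails because it cannot find the Caffe root directory, even though my caffe_root does point to the freshly built Caffe root directory. If I point caffe_root at my own Caffe root directory it runs, but then I get the error "unknown layer type: autocrop".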

Wenlu

Hello Professor! My program fails with "Check failed: registry.count(type) == 1 (0 vs. 1) Unknown layer type: AutoCrop (known types: AbsVal, Accuracy, ArgMax, BNLL, BatchNorm, BatchReindex, Bias, Concat, ContrastiveLoss, Convolution, Crop, Data, Deconvolution, Dropout, DummyData, ELU, Eltwise, Embed, EuclideanLoss, Exp, Filter, Flatten, HDF5Data, HDF5Output, HingeLoss, Im2col, ImageData, InfogainLoss, InnerProduct, Input, LRN, LSTM, LSTMUnit, Log, MVN, MemoryData, MultinomialLogisticLoss, PReLU, Parameter, Pooling, Power, Python, RNN, ReLU, Reduction, Reshape, SPP, Scale, Sigmoid, SigmoidCrossEntropyLoss, Silence, Slice, Softmax, SoftmaxWithLoss, Split, TanH, Threshold, Tile, WindowData)". I noticed that the Caffe copied directly from GitHub has no "AutoCrop" layer in /include/layers, whereas the rcf-master files you released do include the file for this layer. When setting up Caffe, must I build it with cmake from the code you released?

Wenlu

Thank you very much, Professor! I wish you all the best in your work!

ziyue

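Hello Professor, in the paper the features within each stage are fused by direct summation. Have you tried concatenation instead, and how well does it work?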

shenyong

Note: Before evaluating the predicted edges, you should do the standard non-maximum suppression (NMS) and edge thinning. We used Piotr’s Structured Forest matlab toolbox available here.
Structured Edge Detection Toolbox V3.0
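Hello Professor, after the prediction maps are produced, the processing uses this toolbox. Which file performs the non-maximum suppression and edge thinning? Could you give a detailed explanation?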

MM Cheng

Please work out how to use these toolboxes yourself.

HuYAOHUI

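Hello, I ran the code you provided, but the total loss printed during training is not the sum of the losses described in your paper. What is going on?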

HuYAOHUI

Thank you very much!

HuYAOHUI

Hello, I ran the code you provided, but the total loss during training is not the sum of the 6 losses described in your paper. Why is that?

HuYAOHUI

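Hello Professor, from the paper I understand that the final loss is the sum of several losses, but the paper also mentions the ratio of positive to negative pixels used when computing the loss function. Where can I modify it? I would like to train on my own dataset. Thank you.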

HuYAOHUI

Hello Professor, I looked at the cpp file and it seems the ratio is counted automatically. Is that right?
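For readers following this exchange, a class-balanced cross-entropy in which the positive/negative ratio is counted from each label map can be sketched as below. This is a NumPy illustration under stated assumptions, not the Caffe layer from the released code: the function name `balanced_cross_entropy` is hypothetical, labels are assumed to be scaled to [0, 1], pixels with a label in (0, eta] are assumed to be ignored, and the default values `eta=0.5` and `lam=1.1` are assumptions.

```python
import numpy as np


def balanced_cross_entropy(prob, label, eta=0.5, lam=1.1, eps=1e-12):
    """Class-balanced cross-entropy for one side output (illustrative sketch).

    prob:  predicted edge probabilities in [0, 1]
    label: annotation map in [0, 1]; 0 is non-edge, values above eta are edge,
           values in (0, eta] are treated as ambiguous and ignored (assumption)
    """
    pos = label > eta
    neg = label == 0
    n_pos, n_neg = int(pos.sum()), int(neg.sum())
    total = max(n_pos + n_neg, 1)
    # The weights are derived from the pixel counts, so nothing is set by hand
    # apart from lam: negatives are down-weighted because they dominate.
    alpha = lam * n_pos / total    # weight applied to negative pixels
    beta = n_neg / total           # weight applied to positive pixels
    p = np.clip(prob, eps, 1 - eps)
    loss = -(beta * np.log(p[pos]).sum() + alpha * np.log(1 - p[neg]).sum())
    return float(loss), alpha, beta


label = np.array([[0.0, 0.0, 0.0, 1.0],
                  [0.0, 0.0, 0.3, 1.0]])   # one ambiguous pixel (0.3)
prob = np.full(label.shape, 0.5)
loss, alpha, beta = balanced_cross_entropy(prob, label)
print(loss, alpha, beta)
```

Under these assumptions, the total training loss would be the sum of this quantity over every side output plus the fused output.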

Kai Li

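Thank you for your guidance, Professor; my previous problem has been solved. I have a few more questions. The experiment sections of both the RCF and HED papers mention weight decay (0.0002), but, perhaps because I am careless, I could not find where weight decay is used in the Caffe code. Would leaving out weight decay hurt generalization? I have recently been reproducing your experiments in TensorFlow (version 1.8), and my best ODS so far is only about 0.72. I wonder whether this is because I did not use weight decay. Looking forward to your reply.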

Kai Li

Thank you very much for your guidance, Professor. I wish you a happy life and success in your work.

humm

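Hello Professor! Are the ground-truth images in your paper binary images? Some papers show ground truth as white edges on a black background and others as black edges on a white background. Which should be used in a journal paper?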

MM Cheng

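It doesn't matter; these are just different ways of displaying the same thing. I usually prefer black edges on a white background, simply because it saves printer ink 🙂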

sss

Hello, what does this mean when I run RCF-singlescale.ipynb? "The kernel appears to have died. It will restart automatically." Also, why can't I see my test results? So frustrating.

sss

The command line shows this error: Check failed: registry.count(type) == 1 (0 vs. 1) Unknown layer type: AutoCrop (known types: AbsVal, Accuracy, AnnotatedData, ArgMax, BNLL, BatchNorm, BatchReindex, Bias, Concat, ContrastiveLoss, Convolution, Crop, Data, Deconvolution, DetectionEvaluate, DetectionOutput, Dropout, DummyData, ELU, Eltwise, Embed, EuclideanLoss, Exp, Filter, Flatten, HDF5Data, HDF5Output, HingeLoss, Im2col, ImageData, InfogainLoss, InnerProduct, Input, LRN, LSTM, LSTMUnit, Log, MVN, MemoryData, MultiBoxLoss, MultinomialLogisticLoss, Normalize, PReLU, Parameter, Permute, Pooling, Power, PriorBox, Python, RNN, ReLU, Reduction, Reshape, SPP, Scale, Sigmoid, SigmoidCrossEntropyLoss, Silence, Slice, SmoothL1Loss, Softmax, SoftmaxWithLoss, Split, TanH, Threshold, Tile, VideoData, WindowData)
*** Check failure stack trace: ***
What is the cause of this?

sss

OK, I'll give it a try, thanks.

feifei

Hello, I ran into the same problem. Have you solved it? If so, could you help me and give me some pointers? I would be very grateful!

sss

Why not try rebuilding Caffe? As I recall, I rebuilt Caffe at the time.

feifei

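From other people's exchanges with the professor, it seems you have to use the professor's Caffe. But the professor's Caffe is presumably already compiled, so how should I build it to get the professor's Caffe? Could I have your email address so we can discuss it? I need your help! Please! I'm a complete beginner; sorry for the trouble!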

sss

My QQ is 919066765. As for the build, it is as mentioned above: you need to install Caffe using the most basic method.

王思思

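Hello, do you have detailed configuration instructions for this code? I want to run it, but it fails with an error at runtime. I think I probably haven't configured it properly. I'm a beginner and would appreciate patient guidance.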

sss

It's installed now. I have solved this problem, thanks.

王钰涵

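Hello, I'm a beginner too. I have only done simple image recognition with Caffe and have little background, and I am now starting on edge detection. Could we add each other on QQ and discuss the workflow privately? My QQ is 867243772, email 867243772@qq.com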

feifei

Hello, have you solved it? Could I add you on QQ or contact you by email to discuss?

feifei

I'm a beginner, and I think we ran into the same problem. Could we add each other on QQ and discuss the workflow privately? My QQ: 2681674725, email xxiaofee@163.com

Jiantong Chen

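Hello, RCF uses a sigmoid layer with cross-entropy loss at every stage. What is the physical interpretation of this design? For example, is it meant to address the final classification loss being too small and gradients vanishing, or to make the features learned at each stage more robust?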

Kai Li

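Hello, I have recently been studying edge detection and saw that your 2017 paper currently gives the best results. As a newcomer to CV, I'd like to ask about a detail: in Pdollar's MATLAB toolbox, the ground truth has size 321*481. When evaluating on the test set, should I follow this size? If I set the test images to this size I keep getting errors, whereas a size of 320*480 works fine. As an authority in edge detection, how do you handle this? Please forgive any inappropriate wording.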

Kai Li

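Thank you for your guidance, Professor; my previous problem has been solved. I have one more question. The experiment sections of both the RCF and HED papers mention weight decay (0.0002), but, perhaps because I am careless, I could not find where weight decay is used in the Caffe code. Would leaving out weight decay hurt generalization? Looking forward to your reply.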