BING: Binarized Normed Gradients for Objectness Estimation at 300fps
Ming-Ming Cheng1 Ziming Zhang2 Wen-Yan Lin3 Philip Torr1
1The University of Oxford 2Boston University 3Brookes Vision Group
Abstract
Training a generic objectness measure to produce a small set of candidate object windows has been shown to speed up the classical sliding-window object detection paradigm. We observe that generic objects with a well-defined closed boundary can be discriminated by looking at the norm of gradients, after resizing their corresponding image windows to a small fixed size. Based on this observation, and for computational reasons, we propose to resize windows to 8 × 8 and use the norm of the gradients as a simple 64D feature to describe them, for explicitly training a generic objectness measure.
We further show how the binarized version of this feature, namely binarized normed gradients (BING), can be used for efficient objectness estimation, which requires only a few atomic operations (e.g. ADD, BITWISE SHIFT, etc.). Experiments on the challenging PASCAL VOC 2007 dataset show that our method efficiently (300fps on a single laptop CPU) generates a small set of category-independent, high-quality object windows, yielding a 96.2% object detection rate (DR) with 1,000 proposals. By increasing the number of proposals and the color spaces used for computing BING features, the performance can be further improved to 99.5% DR.
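The 64D normed-gradient feature described above can be sketched in a few lines. This is an illustrative reimplementation, not the released code: the finite-difference gradient and the block-averaging resize are assumptions standing in for the OpenCV-based original.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Sketch of the 64D normed-gradient (NG) feature: gradient magnitude
// approximated as min(|gx| + |gy|, 255), then the window is shrunk to
// 8x8 by block averaging (the resize method is an assumption).
std::array<uint8_t, 64> ngFeature(const std::vector<uint8_t>& img, int w, int h) {
    std::vector<uint8_t> mag(w * h, 0);
    for (int y = 1; y + 1 < h; ++y)
        for (int x = 1; x + 1 < w; ++x) {
            int gx = std::abs(img[y * w + x + 1] - img[y * w + x - 1]);
            int gy = std::abs(img[(y + 1) * w + x] - img[(y - 1) * w + x]);
            mag[y * w + x] = (uint8_t)std::min(gx + gy, 255);
        }
    std::array<uint8_t, 64> feat{};
    for (int by = 0; by < 8; ++by)        // block-average down to 8x8 cells
        for (int bx = 0; bx < 8; ++bx) {
            int x0 = bx * w / 8, x1 = (bx + 1) * w / 8;
            int y0 = by * h / 8, y1 = (by + 1) * h / 8;
            int sum = 0, n = 0;
            for (int y = y0; y < y1; ++y)
                for (int x = x0; x < x1; ++x) { sum += mag[y * w + x]; ++n; }
            feat[by * 8 + bx] = (uint8_t)(n ? sum / n : 0);
        }
    return feat;
}
```

Feeding it a window containing a bright square yields nonzero cells only along the square's boundary, matching the intuition that objects with closed boundaries stand out clearly at this tiny scale.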
Papers
- BING: Binarized Normed Gradients for Objectness Estimation at 300fps, Ming-Ming Cheng, Yun Liu, Wen-Yan Lin, Ziming Zhang, Paul L. Rosin, Philip H. S. Torr, Computational Visual Media 5(1):3-20, 2019. [Project page][pdf][bib] (Extension of CVPR 2014 Oral)
- BING: Binarized Normed Gradients for Objectness Estimation at 300fps. Ming-Ming Cheng, Ziming Zhang, Wen-Yan Lin, Philip Torr, IEEE CVPR, 2014. [Project page][pdf][bib][C++][Latex][PPT, 12 min] [Seminar report, 50 min] [Poster] [Spotlight, 1 min] (Oral, Accept rate: 5.75%)
Most related projects on this website
- SalientShape: Group Saliency in Image Collections. Ming-Ming Cheng, Niloy J. Mitra, Xiaolei Huang, Shi-Min Hu. The Visual Computer 30 (4), 443-453, 2014. [pdf] [Project page] [bib] [latex] [Official version]
- Efficient Salient Region Detection with Soft Image Abstraction. Ming-Ming Cheng, Jonathan Warrell, Wen-Yan Lin, Shuai Zheng, Vibhav Vineet, Nigel Crook. IEEE International Conference on Computer Vision (IEEE ICCV), 2013. [pdf] [Project page] [bib] [latex] [official version]
- Global Contrast based Salient Region Detection. Ming-Ming Cheng, Niloy J. Mitra, Xiaolei Huang, Philip Torr, Shi-Min Hu. IEEE TPAMI, 2014. [Project page] [Bib] [Official version] (2nd most cited paper in CVPR 2011)
Spotlights Video (17MB Video, pptx)
Figure. Tradeoff between #WIN and DR (see [3] for more comparisons with other methods [6, 12, 16, 20, 25, 28, 30, 42] on the same benchmark). Our method achieves 96.2% DR using 1,000 proposals, and 99.5% DR using 5,000 proposals.
Table 1. Average computational time on VOC2007.
Table 2. Average number of atomic operations for computing objectness of each image window at different stages: calculate normed gradients, extract BING features, and get objectness score.
Figure. Illustration of the true positive object proposals for VOC2007 test images.
Downloads
The C++ source code of our method is publicly available for download. OpenCV-compatible VOC 2007 annotations can be found here. Since the VOC website is blocked in mainland China, we provide mirror download links: Baidu Pan download, mirror download. Matlab file for making the figure plots in the paper. Results for VOC 2007 (75MB). We have not applied for any patent on this system, and we encourage free use by both academic and commercial users.
Links to most related works:
- Measuring the objectness of image windows. Alexe, B., Deselaers, T. and Ferrari, V. PAMI 2012.
- Selective Search for Object Recognition, Jasper R. R. Uijlings, Koen E. A. van de Sande, Theo Gevers, Arnold W. M. Smeulders, International Journal of Computer Vision, Volume 104 (2), page 154-171, 2013
- Category-Independent Object Proposals With Diverse Ranking, Ian Endres, and Derek Hoiem, PAMI February 2014.
- Proposal Generation for Object Detection using Cascaded Ranking SVMs. Ziming Zhang, Jonathan Warrell and Philip H.S. Torr, IEEE CVPR, 2011: 1497-1504.
- Learning a Category Independent Object Detection Cascade. E. Rahtu, J. Kannala, M. B. Blaschko, IEEE ICCV, 2011.
- Generating object segmentation proposals using global and local search, Pekka Rantalankila, Juho Kannala, Esa Rahtu, CVPR 2014.
- Efficient Salient Region Detection with Soft Image Abstraction. Ming-Ming Cheng, Jonathan Warrell, Wen-Yan Lin, Shuai Zheng, Vibhav Vineet, Nigel Crook. IEEE ICCV, 2013.
- Global Contrast based Salient Region Detection. Ming-Ming Cheng, Niloy J. Mitra, Xiaolei Huang, Philip Torr, Shi-Min Hu. IEEE TPAMI, 2014. (2nd most cited paper in CVPR 2011).
- Geodesic Object Proposals. Philipp Krähenbühl and Vladlen Koltun, ECCV, 2014.
Suggested detectors:
The proposals need to be verified by a detector before they can be used in real applications. Our proposal method matches well the major speed limitation of the following state-of-the-art detectors (please email me if you have other suggestions as well):
- Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, R. Girshick, J. Donahue, T. Darrell, J. Malik, IEEE CVPR (Oral), 2014. (Code; achieves best ever reported performance on PASCAL VOC)
- Fast, Accurate Detection of 100,000 Object Classes on a Single Machine, CVPR 2013 (best paper).
- Regionlets for Generic Object Detection, ICCV 2013 oral. (Runner up Winner in the ImageNet large scale object detection challenge)
Recent methods
- Data-driven Objectness, IEEE TPAMI, in print.
Applications
If you have developed some exciting new extensions, applications, etc, please send a link to me via email. I will add a link here:
- CNN: Single-label to Multi-label, Yunchao Wei, Wei Xia, Junshi Huang, Bingbing Ni, Jian Dong, Yao Zhao, Shuicheng Yan, arXiv, 2014
Third party resources.
If you have made a version running on other platforms (e.g. Mac, Linux, vs2010, makefile projects) and want to share it with others, please send me an email containing the URL and I will add a link here. Notice that these third-party versions may or may not contain the updates and bug fixes provided in the next section of this webpage.
- Linux version of this work provided by Shuai Zheng from the University of Oxford.
- Linux version of this work provided by Dr. Ankur Handa from the University of Cambridge.
- Unix version of this work provided by Varun from University of Maryland.
- OpenCV version (doc) of this work by Francesco Puja et al.
- Matlab version of this work by Tianfei Zhou from Beijing Institute of Technology
- Matlab version (work with 64 bit Win7 & visual studio 2012) provided by Jiaming Li from University of Electronic Science and Technology of China(UESTC).
Bug fix
- 2014-4-11: There was a bug in the Objectness::evaluatePerImgRecall(..) function. After the update, the DR-#WIN curve looks slightly better for high values of #WIN. Thanks to YongLong Tian and WangLong Wu for reporting the bug.
FAQs
Since the release of the source code 2 days ago, 500+ students and researchers have downloaded it (according to email records). Here are some frequently asked questions from users. Please read the FAQs before sending me new emails; questions already answered in the FAQs will not be replied to.
1. I download your code but can’t compile it in visual studio 2008 or 2010. Why?
I use Visual Studio 2012 for development. The shared source code is guaranteed to work under Visual Studio 2012. The algorithm itself doesn't rely on any Visual Studio 2012 specific features. Some users have already reported that they successfully made a Linux version that achieves 1000fps on a desktop machine (my 300fps was tested on a laptop). If you get the code running on other platforms and want to share it with others, I'm very happy to add links from this page; please contact me via email.
2. I run the code but the results are empty. Why?
Please check whether you have downloaded the PASCAL VOC data (2 zip files, for training and testing) and put them in ./VOC2007/. The original VOC annotations cannot be read directly by OpenCV; I have shared a compatible version (https://mmcheng.net/code-data/). After unzipping all 3 data packages, put them in the same folder and run the source code.
3. What’s the password for unzip your source code?
Please read the notice in the download page. You can get it automatically by supplying your name and institute information.
4. I got different testing speed than 300fps. Why?
If you are using 64-bit Windows and Visual Studio 2012, the default settings should be fine. Otherwise, please make sure to enable OpenMP and native SSE instructions. In any case, speed should be tested in release mode rather than debug mode. Don't uncomment commands for showing progress, e.g. printf("Processing image: %s", imageName). When the algorithm runs at hundreds of fps, printf, image reading (an SSD hard disk would help here), etc. might become the speed bottleneck. The running speed will also differ depending on the hardware. To eliminate the influence of hard-disk reading speed, I preload all testing images before starting the timer and running prediction. Only 64-bit machines support such a large memory footprint for a single program; if your RAM is small, the preloading might cause hard-disk paging, which also results in slow running times. Typical speeds people report range from 100fps (typical laptop) to 1000fps (pretty powerful desktop).
5. After increasing the number of proposals to 5,000, I get only a 96.5% detection rate. Why?
Please read through the paper before using the source code. As explained in the abstract, 'With increase of the numbers of proposals and color spaces ... improved to 99.5% DR'. Using three different color spaces can be enabled by calling "getObjBndBoxesForTests" rather than the default "getObjBndBoxesForTestsFast" in the demo code.
6. I got compilation or linking errors like: can't find "opencv2/opencv.hpp", error C1083: can't find "atlstr.h".
These are all standard libraries. Please copy the error message and search Google for answers.
7. Why linear SVMs and gradient magnitudes? These are so simple; alternatives like *** could be better, and I got some improvements by trying them. Some implementation details could be improved as well.
Yes, there are many possibilities for improvement, and I'm glad to hear people have already obtained some improvements (it is nice to receive these emails). Our major focus is the very simple observation about the things vs. stuff distinction (see Section 3.1 in our CVPR 2014 paper), which we try to model as simply and efficiently as possible. The implementation details are also not guaranteed to be optimal, and there is room to improve (I'm glad to receive such suggestions via email as well).
8. Like many other proposal methods, the BING method also generates many proposal windows. How can I distinguish the windows I expect from the others?
Like many other proposal methods (PAMI 2012, IJCV 2013, PAMI 2014, etc.), the number of proposals typically goes up to a few thousand. To get real detection results, you still need to apply a detector. A major advantage of proposal methods is that the detector can ignore most (up to 99%) of the image windows in the traditional sliding-window pipeline, while still checking 90+% of the object windows. See the 'Suggested detectors' section on this webpage for more details.
9. Is there any step by step guidance of using the source code?
Please see the read me document for details about where to download data, where to put the files, and advice for getting maximal speed.
10. Could you give a detailed step by step example of how to get binary normed gradient map from normed gradient map?
The simple method of getting binarized normed gradients (binary values) from normed gradients (BYTE values) is described in detail in Sec. 3.3 of our CVPR 2014 paper (the paragraph above Equation 5). Here is a simple example to aid understanding: the binary representation of the BYTE value 233 is 11101001. We take its top 4 bits, 1110, to approximate the original BYTE value. Recovering a BYTE value from the 4 binary bits 1110 gives the approximate value 224.
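The example above can be written out in code. This sketch reproduces the described top-bits approximation (it is not taken from the released source):

```cpp
#include <cassert>
#include <cstdint>

// Keep only the top `nb` bits of a BYTE value, i.e. represent the NG map
// by its nb highest binary bit planes.  E.g. 233 = 11101001b; its top 4
// bits 1110 reconstruct to 11100000b = 224.
uint8_t topBitsApprox(uint8_t v, int nb) {
    uint8_t out = 0;
    for (int k = 0; k < nb; ++k)                    // k-th highest bit plane
        out |= (uint8_t)(((v >> (7 - k)) & 1) << (7 - k));
    return out;
}
```

With 4 bit planes the maximum approximation error per pixel is 15, which is why the binarization costs so little recall.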
11. Is there any intuitive explanation of the objectness scores, i.e. s_l in equation (1) and O_l in equation (3) ?
The bigger these scores are, the more likely the window is to be an object window. Although the BING feature is good for getting object proposals, it is still not good enough to produce object detection results (see also FAQ 8). We can consider the number of object windows as a computation budget, within which we want high recall. Thus we typically select the top n proposals according to these scores, even when a score is negative (which does not necessarily mean a non-object window). The value s_l measures how well the window matches the template. The score o_l is obtained after calibration, in order to rank proposals from more likely sizes (e.g. 160×160) higher than proposals from less likely sizes (e.g. 10×320). The calibration parameters can be considered per-size bias terms.
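A minimal sketch of this per-size calibration (Equation (3)); the coefficient values below are made up for illustration, whereas in BING they come from the stage-II SVMs:

```cpp
#include <cassert>

// Hypothetical per-size calibration: each quantized window size i has
// learned coefficients (v_i, t_i), and the raw filter score s_l is mapped
// to o_l = v_i * s_l + t_i so that scores from different sizes become
// comparable.  t_i acts as a per-size bias term.
struct SizeCalib { float v, t; };

float calibrate(float s, const SizeCalib& c) { return c.v * s + c.t; }
```

With a larger bias for a plausible size (e.g. 160×160) than for an implausible one (e.g. 10×320), a window from the plausible size can outrank one with a higher raw score from the implausible size.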
12. Typos in the project page, imperfect post reply, miss-spelled English words in the C++ source code, email not replied, etc.
I apologize for my limited language ability. Please report such typos to me via personal email. It would also be more than welcome if you simply repost when I have missed replying to some important information.
I am careless and quite often forget to reply to some emails. If you think your query or suggestion is important but has not been replied to within 5 working days, please simply resend the email.
13. Problem when running to the function format().
Some users suffered from errors caused by the format() function in the source code. This is a standard API function of OpenCV; note that the proper version of OpenCV needs to be linked. std::string is not binary-compatible across different versions of Visual Studio, so you must link to the appropriate version. Be careful with the confusing name mapping in Visual Studio: Visual Studio 2005 (VC8), Visual Studio 2008 (VC9), Visual Studio 2010 (VC10), Visual Studio 2012 (VC11), Visual Studio 2013 (VC12).
14. What’s the format of the returned bounding boxes and how to illustrate the boxes as in the paper.
We follow the standard PASCAL VOC bounding-box definition, i.e. [minX, minY, maxX, maxY]. You can refer to the Objectness::illuTestReults() function for how the illustration was done.
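For reference, here is a hypothetical helper (not the original Objectness::illuTestReults()) that interprets such boxes and computes the intersection-over-union overlap used when counting a proposal as a true positive; the inclusive-coordinate convention (width = maxX - minX + 1) follows PASCAL VOC:

```cpp
#include <algorithm>
#include <cassert>

// PASCAL VOC boxes are [minX, minY, maxX, maxY], with inclusive
// coordinates, so width = maxX - minX + 1.
struct VocBox { int minX, minY, maxX, maxY; };

float iou(const VocBox& a, const VocBox& b) {
    int iw = std::min(a.maxX, b.maxX) - std::max(a.minX, b.minX) + 1;
    int ih = std::min(a.maxY, b.maxY) - std::max(a.minY, b.minY) + 1;
    if (iw <= 0 || ih <= 0) return 0.f;              // no overlap
    float inter = (float)iw * ih;
    float areaA = (float)(a.maxX - a.minX + 1) * (a.maxY - a.minY + 1);
    float areaB = (float)(b.maxX - b.minX + 1) * (b.maxY - b.minY + 1);
    return inter / (areaA + areaB - inter);
}
```

A proposal is counted as covering a ground-truth object when this overlap exceeds the evaluation threshold (0.5 in the standard VOC protocol).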
15. Discussions in CvChina
There are 400+ discussions about this project at http://www.cvchina.info/2014/02/25/14cvprbing/ (in Chinese). You may find answers to your problems there.
Dear Prof. Cheng,
I have read your CVPR 2014 paper (BING: Binarized Normed Gradients for Objectness Estimation at 300fps) and successfully ran the code published on this website on my machine. However, I ran into a problem while using it: even when I set the number of training samples to 0, the program can still detect objects in the test samples. It seems the program loads an already-trained classifier in the training stage, rather than performing the training itself. I need to train on new samples, but I could not find the code for training new samples, perhaps because I have not fully understood the program. Could you tell me whether there is a part that trains new samples, and if so, in which file?
Many thanks.
The training part is in Objectness.cpp:
// Get potential bounding boxes for all test images
void Objectness::getObjBndBoxesForTestsFast(vector<ValStructVec<float, Vec4i>> &_boxesTests, int numDetPerSize)
{
//setColorSpace(HSV);
trainObjectness(numDetPerSize);
loadTrainedModel();
…
}
Dear Prof. Cheng, I have now reproduced your experimental results. How can I display them on the images?
The results can be illustrated with the illustrate function in the shared code. I suggest reading the FAQ section on the difference between proposals and detections.
Dear Prof. Cheng, I tried running your code, but the txt files generated in the results/BBoxesB2W8MAXBGR folder are all zeros, even though the test and training data were read in. I don't know where the problem is...
Hi, I ran into the same problem. How did you solve it in the end?
Dear Prof. Cheng, I am running the program with VS2012 + OpenCV 2.4.8. The OpenCV configuration should be fine, but at runtime I always get:
'Objectness.exe' (Win32): Loaded 'D:\BingObjectnessCVPR14\x64\Debug\Objectness.exe'. Symbols loaded.
'Objectness.exe' (Win32): Loaded 'C:\Windows\System32\ntdll.dll'. Symbol loading disabled by Include/Exclude setting.
'Objectness.exe' (Win32): Loaded 'C:\Windows\System32\kernel32.dll'. Symbol loading disabled by Include/Exclude setting.
'Objectness.exe' (Win32): Loaded 'C:\Windows\System32\KernelBase.dll'. Symbol loading disabled by Include/Exclude setting.
'Objectness.exe' (Win32): Loaded 'D:\opencv\build\x64\vc12\bin\opencv_core248.dll'. Symbol loading disabled by Include/Exclude setting.
'Objectness.exe' (Win32): Loaded 'C:\Windows\System32\msvcp120.dll'. Symbol loading disabled by Include/Exclude setting.
'Objectness.exe' (Win32): Unloaded 'C:\Windows\System32\msvcp120.dll'
'Objectness.exe' (Win32): Loaded 'C:\Windows\System32\msvcp120.dll'. Symbol loading disabled by Include/Exclude setting.
'Objectness.exe' (Win32): Unloaded 'C:\Windows\System32\msvcp120.dll'
The program '[0x1FA8] Objectness.exe' has exited with code -1073741701 (0xc000007b).
I don't know what the problem is; could you take a look?
I cannot tell the problem from your description, but it looks like you are using a 32-bit build; try compiling for 64-bit.
Dear Prof. Cheng, when I run the program it says: cannot open input file 'LibLinear.lib', and I cannot find 'LibLinear.lib' in the downloaded folder. How do I solve this problem?
Set the Objectness project, not the LibLinear project, as the startup project.
Dear Prof. Cheng, is the 'w' in Section 3.2 the margin weight vector in the SVM? And how are v and t in stage II obtained? I didn't understand this part.
Yes, the w in Section 3 is the w in the standard SVM formulation. In stage II, the stage-I score is used as a one-dimensional feature, and a new linear SVM is trained for each window size, mainly to obtain a bias term. Stage II is in fact rather weak; there should be better approaches.
"Download the VOC 2007 data": the tar file from the provided link does not contain two zip files. Many people seem to run into file-path problems; I spent a whole afternoon without getting it to run. It would be much better to provide a pre-configured package for download, plus a separate code-only download link for people who just want to read the code.
Dr. Cheng, hello! My current project is binary foreground object extraction based on a spatial grid. Looking at the result images of this project, how are partially occluded objects segmented effectively, e.g. in car detection? I downloaded the source code, but it needs an unzip password; how do I obtain it?
Hi, if BING is used for pedestrian detection, the windows it produces vary in size; how can HOG be used to detect on them?
The HOG detection window is usually of a fixed size.
But if a BING window is resized to the HOG detection window size, won't the pedestrian features change? Thanks!
When verifying the proposals, you cannot use a fixed-size detector unchanged. You can resize the proposal region, or use a cleverer method. See the 'Suggested detectors' section on this project page; the Regionlets work there is a useful reference.
Hi, have you managed to apply BING to pedestrian detection? I have also been studying BING recently; could we discuss it? QQ 562760086
I have not worked on concrete applications yet; I am still trying to improve the proposals themselves.
May I ask a question? We tried generating candidate boxes with both the selective search algorithm and BING. With an intersection-over-union threshold of 0.5 in the evaluation, BING's recall is very high; but with a threshold of 0.8, BING's recall becomes very low, only about 1%. For the downstream classifier, a 0.5 threshold is too loose and causes many false positives. Is there any way to tune BING's parameters so that the candidate boxes are more precise? Thanks!
We are testing this as well. The currently released code indeed performs worse than selective search at very high overlap thresholds; we will focus on improving this in future versions.
Although a strict overlap threshold makes the recall very low, BING's candidate boxes and selective search's are complementary, so BING is still very helpful for our application. Thanks!
You're welcome. BING usually focuses more on the main body of an object; e.g. for a person with outstretched arms, relatively small regions such as the arms are often missed. But that is not necessarily a bad thing for the detector.
Does your code require Visual Studio 2012 Professional? Can Visual C++ 2012 Express compile it?
It should not depend on whether Visual Studio is the Professional edition, but I have not tried.
When I try to load the .sln file, it says the solution can't be loaded. How do I solve this problem?
Are you using a proper version of Visual Studio, i.e. VS 2012 or VS 2013?
VS2012
I don't know. Is your Visual Studio correctly installed? Have you opened other Visual Studio solutions with your VS2012?
I changed the version in the .sln file.
VS2012 is actually version 11.
Also, there were two main files that caused linking errors; these had to be changed.
What is the average time taken by the code to learn stage 1 and stage 2?
It took about 900 seconds?
It takes me 20 seconds to learn stage I and stage II. Are you running the code in debug mode? The project was built using Visual Studio 2012 (VC11). You shouldn't change the project settings if you use Visual Studio 2012.
I opened the .sln file in edit mode and the version given was 12. I changed it to 11 and it started working.
I am running the code in debug mode only.
If you want to know the efficiency of a program, you shouldn't test it in debug mode. The speed could be 10+ times slower.
Yes, it gets trained in 16 seconds in release mode.
Thanks!
What changes should I make to the code to test on my own dataset?
Make it have a format similar to the VOC dataset.
It is said that only 1,000 proposals are made, but the text files contain around 2,000 proposals, and the number varies. Why is that?
Having more than 1,000 proposals in the txt files simply provides additional flexibility. You can always use the first 1,000 proposals.
I'm pretty sure the SSE2 and OpenMP options are on. In VS2012, only x86 mode requires turning the options on manually.
I shut down all foreground programs (including VS2012), and the speed of BING increased to 0.008s. My laptop's CPU is an i5-4200. Since an i7 reaches 300fps, I think this speed is acceptable. Meanwhile, OpenCV's filter2D/matchTemplate takes about 0.009s, so BING is still a little faster than OpenCV.
However, BING's convolution is fast only for small images (no more than 100*100). The two methods (BING convolution and OpenCV filter2D) have similar speeds on 100*100 images, but on more typical images (such as 400*300 in my project), BING is 3 times slower than filter2D.
Moreover, I have found that the convolution results of BING and OpenCV are quite different; some even differ in sign. Is NUM_COMP, defined in class FilterBING, too small (NUM_COMP = 2)?
I used your function to reconstruct the original filter. As expected, it contains many equal values. Fortunately, BING and the exact convolution always share the same local-maximum locations, so I guess this is why the approximation leads to only a 0.2% decrease in recall.
Unfortunately, in my project (an object detection program, not objectness), the detection rate decreased dramatically. I don't think the approximation strategy can be applied in general applications. I have to find other acceleration methods (for mobile phones). Could you give me some suggestions?
My machine gives quite different speed measurements. Could you please try other machines as well? Have you modified my code in any way?
Your work is great! The idea of scaling objects to 8*8 is elegant!
I downloaded your program and tested it a few weeks ago.
Today, I wanted to use your accelerated convolution to speed up a project I'm working on. However, it did not work well; it costs even more time than the matchTemplate function provided by OpenCV.
Then I replaced FilterBING.matchTemplate with OpenCV's. The running times of the two implementations are similar, and OpenCV performs even faster (0.0098s vs 0.0101s). Since the BING filter is an approximation, its recall is a little lower than OpenCV's: 0.971 vs 0.973 for 5,000 proposals.
I am pretty sure my settings are correct because I did not change anything.
In fact, BING performs much worse in my project; it is about 2-3 times slower than OpenCV's matchTemplate!
Sorry for my poor English.
I think your work is still meaningful to me because I am developing an app for mobile phones.
The main operations in your algorithm are bitwise, while the exact convolution uses floating point, which is pretty slow on Android.
I will test the speed on mobile phones and report back to you in the coming days (maybe not soon; I have to go away for some tedious work).
Thanks for the report. I wonder whether you have configured the program so that hardware bitwise operations are properly activated, especially POPCNT for int64. If the program is linked to a software implementation of this SSE instruction, it can be much slower. In general, bitwise operations are much more efficient than, or at least comparable in speed to, floating-point operations. With hardware support, a POPCNT instruction can complete in one CPU clock cycle, whereas a software alternative behaves very differently.
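To illustrate the difference discussed above, here is a portable software population count. BING's speed relies on compiler intrinsics (e.g. __builtin_popcountll under GCC/Clang, _mm_popcnt_u64 under MSVC) mapping to the hardware POPCNT instruction; a fallback like this one merely emulates it, one loop iteration per set bit:

```cpp
#include <cassert>
#include <cstdint>

// Software 64-bit population count (Kernighan's trick: x &= x - 1 clears
// the lowest set bit, so the loop runs once per set bit).  Hardware POPCNT
// does the same in a single instruction.
int popcount64_sw(uint64_t x) {
    int n = 0;
    while (x) { x &= x - 1; ++n; }
    return n;
}
```

If the BING score loop falls back to something like this instead of the hardware instruction, the bitwise formulation can easily lose its speed advantage over floating-point convolution.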
Good work
Thanks
Hello, I don't get the results that I expect, and I don't know what is wrong. Here is an example result:
2003
-0.310501, 1, 1, 500, 406
-0.40108, 1, 1, 500, 256
-0.401742, 97, 1, 352, 406
-0.433909, 1, 161, 500, 406
-0.584675, 257, 1, 500, 256
-0.593412, 1, 1, 256, 406
-0.607349, 225, 1, 480, 406
-0.637498, 161, 1, 416, 256
-0.645096, 97, 97, 352, 352
-0.795382, 1, 161, 256, 406
-0.820196, 65, 1, 320, 256
-0.82834, 353, 1, 480, 256
-0.831464, 145, 33, 272, 288
-0.834417, 177, 1, 304, 406
-0.836466, 225, 97, 480, 352
-0.840157, 193, 1, 320, 256
-0.852335, 369, 1, 496, 406
-0.870406, 129, 1, 256, 406
-0.870617, 113, 129, 240, 384
-0.877054, 321, 1, 448, 406
-0.878936, 273, 1, 400, 256
-0.886169, 1, 49, 500, 176
-0.892839, 225, 1, 352, 406
-0.897758, 1, 113, 500, 240
-0.903951, 1, 193, 500, 320
-0.904228, 65, 1, 192, 406
-0.907919, 1, 1, 500, 128
…
Is this right? Thanks.
From your description it is difficult to tell what your question is. Could you explain why you think there is something wrong with the results you got?
Sorry for the unclear question. I have understood your algorithm; it is wonderful and has benefited me. My question is that I'm not sure whether the results are right. In the data from the results folder above I see a lot of rectangles, but I'm not sure how to pick out the correct objects in the image, and whether a rectangle's score indicates correctness. How do I predict the objects (only the few true ones), as in the images in the BING paper? Thanks.
Supplement: I mean, can I use the scores in the results data to separate correct objects from wrong ones? Thanks.
Hello, Cheng. I was mistaken about the purpose of BING; I thought it was an object detection algorithm. After looking at FAQ 8, I understand it now.
Thanks. You are a good man.
You are welcome 🙂
Hi Mingming,
This work is excellent, and I was surprised by its simplicity and effectiveness after reading your paper. I plan to try your code in my current work, and would like to clarify the following concerns with you first 🙂
Different from your original approach, which uses sliding windows to produce candidate windows, I would like to check the objectness of a given window of arbitrary size (i.e. I give you a window and you tell me its objectness). This prevents me from using the calibration stage (equation (3), because the window sizes are not predefined) and the speed-up technique (Section 3.3, because I am not using a sliding-window approach). Hence, I plan to just use equation (1) to compute the objectness score. Will the performance be affected appreciably without the calibration and gradient binarization? By the way, I am OK with the computational cost.
Best,
Jiong
I guess you can try my learned filter to test your windows; I don't know the results yet. Just a reminder that proposals are not detections: the number of proposals needs to go up to 1,000 in order to get high recall (see also the FAQs).
Hi Mingming,
I tried to do something similar to your paper, but with my own implementation. I generated training samples (object windows and random negative windows) from the PASCAL dataset, computed the gradient magnitude for each sample, and resized them to 8×8. Then I trained an SVM on this data (I tried both Matlab's SVM and LibSVM), but the training doesn't converge. Is there anything I need to pay attention to during this process?
Thanks.
Sorry, I don't know what the problem is in your implementation. Why not use mine?
Hi mister Ming-Ming Cheng,
Congratulations on your good paper. Could you please help me? I can't find the OpenCV-compatible VOC 2007 annotations.
You can download it using the link provided in readme page: https://mmcheng.net/bingreadme/
Very promising work!
I read the source code and ran it successfully. However, I see that the results are recorded in txt files. Is there an intuitive way to view the results, e.g. a picture with boxes? Besides, how can I check whether the results are correct?
Thanks for your interest in our work. The code runs the evaluation part by default, so you can check the results via the evaluation scores. The illustration of true-positive results is also in the code but not activated by default; you can find it and use it.
Helpful suggestion!
I want to test your method on a dataset other than VOC 2007; could you offer any advice?
Besides, I realize that to get correct results, loading pictures into the "JPEGImages" folder is not enough; the yml files also affect the test results, which confused me. I thought providing test images would be enough. What should I do to get boxes drawn correctly on my own pictures?
Thanks for your help!
Proposals are not detections (see the FAQ). To get proposals, you only need the jpg images. To illustrate the results, you won't want 1,000 windows overlaid on the image, which looks quite confusing; we use the yml annotations to select the true positives for illustration.