BING: Binarized Normed Gradients for Objectness Estimation at 300fps

17/02/2014 MM Cheng

Ming-Ming Cheng¹ Ziming Zhang² Wen-Yan Lin³ Philip Torr¹

¹The University of Oxford ²Boston University ³Brookes Vision Group

Abstract

Training a generic objectness measure to produce a small set of candidate object windows, has been shown to speed up the classical sliding window object detection paradigm. We observe that generic objects with well-defined closed boundary can be discriminated by looking at the norm of gradients, with a suitable resizing of their corresponding image windows in to a small fixed size. Based on this observation and computational reasons, we propose to resize the window to 8 × 8 and use the norm of the gradients as a simple 64D feature to describe it, for explicitly training a generic objectness measure.

We further show how the binarized version of this feature, namely binarized normed gradients (BING), can be used for efficient objectness estimation, which requires only a few atomic operations (e.g. ADD, BITWISE SHIFT, etc.). Experiments on the challenging PASCAL VOC 2007 dataset show that our method efficiently (300fps on a single laptop CPU) generates a small set of category-independent, high quality object windows, yielding 96.2% object detection rate (DR) with 1,000 proposals. Increasing the numbers of proposals and color spaces for computing BING features, our performance can be further improved to 99.5% DR.

Papers

BING: Binarized Normed Gradients for Objectness Estimation at 300fps, Ming-Ming Cheng, Yun Liu, Wen-Yan Lin, Ziming Zhang, Paul L. Rosin, Philip H. S. Torr, Computational Visual Media 5(1):3-20, 2019. [Project page][pdf][bib] (Extention of CVPR 2014 Oral)
BING: Binarized Normed Gradients for Objectness Estimation at 300fps. Ming-Ming Cheng, Ziming Zhang, Wen-Yan Lin, Philip Torr, IEEE CVPR, 2014. [Project page][pdf][bib][C++][Latex][PPT, 12 min] [Seminar report, 50 min] [Poster] [Spotlight, 1 min] (Oral, Accept rate: 5.75%)

Most related projects on this website

SalientShape: Group Saliency in Image Collections. Ming-Ming Cheng, Niloy J. Mitra, Xiaolei Huang, Shi-Min Hu. The Visual Computer 30 (4), 443-453, 2014. [pdf] [Project page] [bib] [latex] [Official version]
Efficient Salient Region Detection with Soft Image Abstraction. Ming-Ming Cheng, Jonathan Warrell, Wen-Yan Lin, Shuai Zheng, Vibhav Vineet, Nigel Crook. IEEE International Conference on Computer Vision (IEEE ICCV), 2013. [pdf] [Project page] [bib] [latex] [official version]
Global Contrast based Salient Region Detection. Ming-Ming Cheng, Niloy J. Mitra, Xiaolei Huang, Philip Torr, Shi-Min Hu. IEEE TPAMI, 2014. [Project page] [Bib] [Official version] (2nd most cited paper in CVPR 2011)

Spotlights Video (17MB Video, pptx)

Figure. Tradeoff between #WIN and DR (see [3] for more comparisons with other methods [6, 12, 16, 20, 25, 28, 30, 42] on the same benchmark). Our method achieves 96.2% DR using 1,000 proposals, and 99.5% DR using 5,000 proposals.

Table 1. Average computational time on VOC2007.

Table 2. Average number of atomic operations for computing objectness of each image window at different stages: calculate normed gradients, extract BING features, and get objectness score.

Figure. Illustration of the true positive object proposals for VOC2007 test images.

Downloads

The C++ source code of our method is public available for download. An OpenCV compatible VOC 2007 annotations could be found here. 由于VOC网站在中国大陆被墙，我们提供了一个镜像下载链接：百度网盘下载，镜像下载. Matlab file for making figure plot in the paper. Results for VOC 2007 (75MB). We didn’t apply any patent for this system, encouraging free use for both academic and commercial users.

Links to most related works:

Measuring the objectness of image windows. Alexe, B., Deselares, T. and Ferrari, V. PAMI 2012.
Selective Search for Object Recognition, Jasper R. R. Uijlings, Koen E. A. van de Sande, Theo Gevers, Arnold W. M. Smeulders, International Journal of Computer Vision, Volume 104 (2), page 154-171, 2013
Category-Independent Object Proposals With Diverse Ranking, Ian Endres, and Derek Hoiem, PAMI February 2014.
Proposal Generation for Object Detection using Cascaded Ranking SVMs. Ziming Zhang, Jonathan Warrell and Philip H.S. Torr, IEEE CVPR, 2011: 1497-1504.
Learning a Category Independent Object Detection Cascade. E. Rahtu, J. Kannala, M. B. Blaschko, IEEE ICCV, 2011.
Generating object segmentation proposals using global and local search, Pekka Rantalankila, Juho Kannala, Esa Rahtu, CVPR 2014.
Efficient Salient Region Detection with Soft Image Abstraction. Ming-Ming Cheng, Jonathan Warrell, Wen-Yan Lin, Shuai Zheng, Vibhav Vineet, Nigel Crook. IEEE ICCV, 2013.
Global Contrast based Salient Region Detection. Ming-Ming Cheng, Niloy J. Mitra, Xiaolei Huang, Philip Torr, Shi-Min Hu. IEEE TPAMI, 2014. (2nd most cited paper in CVPR 2011).
Geodesic Object Proposals. Philipp Krähenbühl and Vladlen Koltun, ECCV, 2014.

Suggested detectors:

The proposals needs to be verified by detector in order to be used in real applications. Our proposal method perfectly match the major speed limitation of the following stage of the art detectors (please email me if you have other suggestions as well):

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, R. Girshick, J. Donahue, T. Darrell, J. Malik, IEEE CVPR (Oral), 2014. (Code; achieves best ever reported performance on PASCAL VOC)
Fast, Accurate Detection of 100,000 Object Classes on a Single Machine, CVPR 2013 (best paper).
Regionlets for Generic Object Detection, ICCV 2013 oral. (Runner up Winner in the ImageNet large scale object detection challenge)

Recent methods

Data-driven Objectness, IEEE TPAMI, in print.

Applications

If you have developed some exciting new extensions, applications, etc, please send a link to me via email. I will add a link here:

CNN: Single-label to Multi-label, Yunchao Wei, Wei Xia, Junshi Huang, Bingbing Ni, Jian Dong, Yao Zhao, Shuicheng Yan, arXiv, 2014

Third party resources.

If you have made a version running on other platforms (Software at other platforms, e.g. Mac, Linux, vs2010, makefile projects) and want to share it with others, please send me an email containing the url and I will add a link here. Notice, these third party versions may or may not contain updates and bug fix, which I provided in the next section of this webpage for easier updates.

Linux version of this work provided by Shuai Zheng from the University of Oxford.
Linux version of this work provided by Dr. Ankur Handa from the University of Cambridge.
Unix version of this work provided by Varun from University of Maryland.
OpenCV version (doc) of this work by Francesco Puja et al.
Matlab version of this work by Tianfei Zhou from Beijing Institute of Technology
Matlab version (work with 64 bit Win7 & visual studio 2012) provided by Jiaming Li from University of Electronic Science and Technology of China(UESTC).

Bug fix

2014-4-11: There was a bug in Objectness::evaluatePerImgRecall(..) function. After update, the DR-#WIN curve looks slightly better for high value of #WIN. Thanks YongLong Tian and WangLong Wu for reporting the bug.

FAQs

Since the release of the source code 2 days ago, 500+ students and researchers has download this source code (according to email records). Here are some frequently asked questions from users. Please read the FAQs before sending me new emails. Questions already occurred in FAQs will not be replied.

1. I download your code but can’t compile it in visual studio 2008 or 2010. Why?

I use Visual Studio 2012 for develop. The shared source code guarantee working under Visual Studio 2012. The algorithm itself doesn’t rely on any visual studio 2012 specific features. Some users already reported that they successfully made a Linux version running and achieves 1000fps on a desktop machine (my 300fps was tested on a laptop machine). If users made my code running at different platforms and want to share it with others, I’m very happy to add links from this page. Please contact me via email to do this.

2. I run the code but the results are empty. Why?

Please check if you have download the PASCAL VOC data (2 zip files for training and testing and put them in ./VOC2007/). The original VOC annotations could not directly be read by OpenCV. I have shared a version which is compatible with OpenCV (https://mmcheng.net/code-data/). After unzip all the 3 data package, please put them in the same folder and run the source code.

3. What’s the password for unzip your source code?

Please read the notice in the download page. You can get it automatically by supplying your name and institute information.

4. I got different testing speed than 300fps. Why?

If you are using 64bit windows, and visual studio 2012, the default setting should be fine. Otherwise, please make sure to enable OPENMP and native SSE instructions. In any cases, speed should be tested under release mode rather than debug mode. Don’t uncomments commands for showing progress, e.g. printf(“Processing image: %s”, imageName). When the algorithm runs at hundreds fps, printf, image reading (SSD hard-disk would help in this case), etc might become bottleneck of the speed. Depending on different hardware, the running speed might be different. To eliminate influence of hard disk image reading speed, I preload all testing images before count timing and do predicting. Only 64 bit machines support such large memory for a single program. If you RAM size is small, such pre-loading might cause hard disk paging, resulting slow running time as well. Typical speed people reporting ranging from 100fps (typical laptop) ~ 1000fps (pretty powerful desktop).

5. After increase the number of proposals to 5000, I got only 96.5% detection rate. Why?

Please read through the paper before using the source code. As explained in the abstract, ‘With increase of the numbers of proposals and color spaces … improved to 99:5% DR’. Using three different color space can be enabled by calling “getObjBndBoxesForTests” rather than the default one in the demo code “getObjBndBoxesForTestsFast”.

6. I got compilation or linking errors like: can’t find “opencv2/opencv.hpp”, error C1083: can’t fine “atlstr.h”.

These are all standard libraries. Please copy the error message and search at Google for answers.

7. Why linear SVMs, gradient magnitudes? These are so simple and alternatives like *** could be better and I got some improvements by doing so. Some implementation details could be improve as well.

Yes, there are many possibilities for improvement and I’m glad to hear people got some improvements already (it is nice to receive these emails). Our major focus is the very simple observation about things vs. stuff distinction (see section 3.1 in our CVPR14 paper). We try to model it as simple and as efficient as possible. Implementation details are also not guaranteed to be optimal and there are space to improve (I’m glad to receive such suggestions via email as well).

8. Like many other proposal methods, the BING method also generates many proposal windows. How can I distinguish between the windows I expect from others.

Like many other proposal methods (PMAI 2012, IJCV 2013, PAMI 2014, etc.), the number of proposals typically goes to a few thousands. To get the real detection results, you still need to apply a detector. A major advantage of the proposal methods is that the detector can ignore most (up to 99%) image windows in traditional sliding window pipeline, but still be able to check 90+% object windows. See the ‘Suggested detectors‘ section on this webpage for more details.

9. Is there any step by step guidance of using the source code?

Please see the read me document for details about where to download data, where to put the files, and advice for getting maximal speed.

10. Could you give a detailed step by step example of how to get binary normed gradient map from normed gradient map?

The simple method of getting binary normed gradients (binary values) from normed gradients (BYTE values) is described in detail in Sec. 3.3 of our CVPR 2014 paper (the paragraph above equation 5). Here is a simple example to help understanding. E.g. the binary representation of a BYTE value 233 is 11101001. We can take its top 4 bits 1110 to approximate the original BYTE values. If you want to recover the BYTE value from the 4 binary bits 1110, you will get an approximate value 224.

11. Is there any intuitive explanation of the objectness scores, i.e. s_l in equation (1) and O_l in equation (3) ?

The bigger value these scores are, it is more likely to be an object window. Although BING feature is a good feature for getting object proposals, its still not good enough to produce object detection results (see also FAQ 8). We can consider the number of object windows as a computation budget, and we want high recall within this budget. Thus we typically select top n proposals according to these scores, even the score might be negative value (not necessary means a non-object window). The value s_l means how good the window match with the template. The o_l is the score after calibration in order to rank proposals from more likely size (e.g. 160*160) higher than proposals from less likely size (e.g 10*320). The calibration parameters can be considered as a per size bias terms.

12. Typos in the project page, imperfect post reply, miss-spelled English words in the C++ source code, email not replied, etc.

I apologies for my limited language ability. Please report to me via personal emails if you found such typos, etc. It would also be more than welcome if you can simply repost if I missed to reply some of the important information.

I’m a careless boy and forgot to reply some of the emails quite often. If you think your queries or suggestions are important but not get replied in 5 working days, please simply resent the email.

13. Problem when running to the function format().

Some user suffered from error caused by not be able to correctly format() function in the source code. This is an standard API function of OpenCV. Notice that proper version of OpenCV needs to be linked. It seems that the std::string is not compatible with each other across different versions of Visual studio. You must link to appropriate version of it. Be care with the strange name mapping in visual studio: Visual studio 2005 (VC8), Visual studio 2008 (VC9), Visual studio 2010 (VC10), Visual studio 2012 (VC11), Visual studio 2013 (VC13).

14. What’s the format of the returned bounding boxes and how to illustrate the boxes as in the paper.

We follow the PASCAL VOC standard bounding boxes definition, i.e. [minX, minY, maxX, maxY]. You can refer the Objectness::illuTestReults() function for how the illustration was done.

15. Discussions in CvChina

There are 400+ disscusions about this projects in http://www.cvchina.info/2014/02/25/14cvprbing/ (in Chinese). You may find answers to your problems there.

329 thoughts on “BING: Binarized Normed Gradients for Objectness Estimation at 300fps”

Nick Sergievskiy

15/04/2014 at 14:41

Hi Ming-Ming
I found your work very interesting and i whant to implement in my (test) task. I try to implement BING on avss2007 bag detection. It is most stand alone thing that imagine. But there is no learning data (small dataset) and try to run voc2007-learned obj detector. I found that recall and accuracy not so good as in voc2007. Boundary box have low intersection accuracy to groubd truth. Then i run simple test: black ellipse and bbox. Accuracy for ellipse is not good too and score hve same distribution as difficult scene with many proposals.
How can I increase accuracy of BING? If i choose more proposals what method is effective to rerate my data with low calculations?
- MM ChengPost author
  
  16/04/2014 at 07:52
  
  In the second stage of the SVM learning, we train 36 individual SVMs. Each training sample can only be used in one of them. If the number of training set is small, you might end up with cases where the number of training data for many sizes are less than 1, which means you won’t be able to get proper training. We are planning to solve this issue in the journal version, but not finished yet.
  
  I don’t quite get you description “It is most stand alone thing that imagine”. From my understanding, your images are mostly with a single clear object in it. If this is the case, I will refer you to try https://mmcheng.net/SalObj/ and https://mmcheng.net/effisalobj/ first. You would probabaly get nice segmentation of the target object as well.
  - Nick Sergievskiy
    
    16/04/2014 at 12:04
    
    Thank you.
    I try to learn deep inside the code.
    Under the “accuracy “I mean measure of rectangle intersection in Selective Search it calls Best Overlap (or Mean Average Best Overlap (MABO)).
    In “avss2007 bag detection” there is many objects like people, chairs, trains and etc. But some times frame contains isolated object like a bag. This problem often solves by simple video analysis, but I want use some kind of detector on single frame.
    In my test I have several proposals with Best Overlap score 0.4-0.6 with 700×700 image and 60×50 object ( isolated rectangular or blob object with good gradient). Maybe 36 scales not enough for accurate Best Overlap?
    - MM ChengPost author
      
      17/04/2014 at 07:21
      
      I didn’t have chance to see you sample images. From you description, I suggest you to try this one as well to see the results: https://mmcheng.net/SalObj/
  - Alan
    
    14/12/2019 at 14:45
    
    你好程老师，请问该算法代码的解压密码是多少呀？谢谢。
    - MM ChengPost author
      
      16/12/2019 at 10:17
      
      代码下载页面红色字体注意事项中有写：https://mmcheng.net/code-data/
Lye

11/04/2014 at 11:47

Hi Ming-Ming

Many thanks for the code. I am experiencing buffer overflow problem in the following part of the code , in loadBBoxes() . Any recommended solutions? What is the computer ram size that you use?
void DataSetVOC::loadAnnotations()
{
for (int i = 0; i < testNum; i++)
if(!loadBBoxes(testSet[i], gtTestBoxes[i], gtTestClsIdx[i]))
}

Thank you
- MM ChengPost author
  
  12/04/2014 at 16:26
  
  This part is not limited by the ram size. As it’s simply loading txt annotations, which only requires a few KB memory.
  However, in order to discount image reading time, I load all the VOC testing images into RAM before any processing, which requires a few GB RAM. Try to re-download a new version, in which the default option is to not do pre-loading. Reported testing time might be slightly longer but the code requires much less memory.
- Li
  
  10/08/2014 at 09:20
  
  我运行的时候怎么出现，2501 training and 0 testing? 谢谢？
Dodo

10/04/2014 at 13:14

Hi, Ming-Ming. I’m a newcomer of this field. I have read your perfect work and recently I’m reading the source code. I can successfully run it and obtain trained filter and test results. What I’m trying to do next is applying it to other images. However, after loading an image and generating thousands of proposal windows, I don’t know how to distinguish the windows I expect from others. Is there any methods in this project can help me finishing this job? Hope to get your reply soon and I would be grateful.
- MM ChengPost author
  
  10/04/2014 at 14:04
  
  Thanks for your good questions. I have put the question and my suggestion to the FAQs 8.
Bo Luo

04/04/2014 at 07:38

Hi, I got an error when running your code. Can you give me a favor? There is nothing wrong when debugging, but when running, an error is occurred when calling the function “__popcnt64” in function “dot” in FilterTIG.h. The error information is “0x00007FF6B8361DE0 处有未经处理的异常(在 Objectness.exe 中): 0xC000001D: Illegal Instruction。”. I have checked the input of the function, the input is alright(it is 1) and searched the funcation “__popcnt64” in google, but still have no idea how to deal with it. I will really be appreciated if you can help sovle the problem.
Looking for your reply!
- MM ChengPost author
  
  05/04/2014 at 11:18
  
  Sorry I don’t have any idea about this problem. You might be able to get resutls by searching the error code in Google yourself. Most people can run it correctly in both debug and release mode.
- qqfbleach
  
  08/04/2014 at 08:51
  
  I have the same problem.How do you solve that?
  - MM ChengPost author
    
    08/04/2014 at 14:00
    
    I don’t know what’s the exactly problem you meet, but thousands of people have download it and run correctly. It might be helpful to check if your Visual studio 2012 and OpenCV settings are correctly. There is also a forum where people disusing their experience of using this method. Currently there are 270+ comments there (in Chinese) http://www.cvchina.info/2014/02/25/14cvprbing/ . You might find helpful information there.
    - qqfbleach
      
      09/04/2014 at 03:09
      
      Thank you very much..I’ve read that before.User “lxy“ has the same problem.The error happens in “DataSetVOC::loadBBoxes” function.This problem was caused by using FileStorage type .In my experiment,when I use FileStorage to read yml files,an error occurred.I think it’s the opencv’s problem.
      - MM ChengPost author
        
        09/04/2014 at 09:32
        
        It’s good to locate the issue. Regarding the opencv yml reading, it might be helpful if you update to the latest version of 64bit opencv.
- DrBalthar
  
  07/05/2014 at 10:01
  
  The problem is your running on an architecture that doesn’t seem to support the _popcnt64 intrinsic. Which was introduced along side SSE4.2. Obviously windows doesn’t supply an alternative implementation for it. But it is very easy to write one yourself. popcnt64 basically just counts the number of bits set in a 64bit integer.
  - MM ChengPost author
    
    08/05/2014 at 08:10
    
    Yes. There are many _popcnt64 software implementation available in the internet. It might be a bit slower the hardware implementation.
Bill

01/04/2014 at 06:39

Hi Ming-Ming,

Any requirement of location for OpenCV installation?

Thanks,
- MM ChengPost author
  
  01/04/2014 at 11:11
  
  No specific requirement. You need to configure your OpenCV as how it was with other opencv dependent projects. You can search Google for related tips.
lao6

18/03/2014 at 13:30

Thanks for your patience again. I want to apply your method in my application,but still can not get the right results. There is the following procedure:(1) using getFeature function to get feature of both negative and positive samples, the corresponding y is -1 and 1. (2) Using trainSVM fuction to get the learned weights w. (3) Get the feature of the test image i.e. ft (4) get the sum of ft.mul(w) score if the score0 values. I wonder if there miss some important information or I misuse your algorithm.
looking forward to your reply
- MM ChengPost author
  
  19/03/2014 at 10:31
  
  I don’t know what do you mean by
  
  can’t get the right results
  
  . You should be able to download the source code and VOC data, then open it in Visual studio 2012, and click Ctrl+F5 to get the right results.
  - lao6
    
    19/03/2014 at 11:19
    
    I can get the right result of your work . I want to apply it in my study but get the wrong results.I don’t know if the procedure metioned above is accurate.
    - MM ChengPost author
      
      19/03/2014 at 20:24
      
      From you description, it seems correct. But the description is quite sparse.
lao6

18/03/2014 at 03:01

Thanks for your reply. I still comfused : if we get the NG feature of a negative sample and multiply the learned filter and sum them up, will it get a negative value ? But in my experiment I still get a positive value.
- MM ChengPost author
  
  18/03/2014 at 09:02
  
  Linear SVM is not pow-full enough to deal with every negative examples. You might get some positive predictions for an negative examples, but statistically top 1000 proposals could give you 96+% recall.
lao6

17/03/2014 at 08:40

Hi,I have read your perfect work about BING and read the source code recently. I have some questions unsoloved:(1) in the generateTrianData part I don’t find the quantized window size. (2) if just use NG feature instead of BING, does it affect a lot? (3) when we get the BING feature ,how to find the object on the image
- MM ChengPost author
  
  17/03/2014 at 16:54
  
  Thanks for your interest. In generateTrianData, i.e. stage I, we learn a single 8×8 filter, regardless window size. See the paper Sec. 3.2 state I. for more details. If use NG feature, you need 64 float multiply, 64 float add, and a few other operations for calculating the score for each window, where for BING the number of operations reduces to what is shown in Table 2. You should refer to how I get bounding boxes in predictBBoxSI() for your third question.
Li Hang

16/03/2014 at 15:43

Thanks a lot to share the idea and the source code !!
Kai Wang

06/03/2014 at 02:24

Hi,I have read your perfect work about BING. The most highlight is that it just use 1000 proposals to get almost the highest detection rate. It is very exciting for this huge innovation in object detection.

My one question is how did you get this 1000 bounding boxes (object windows) around the object? You resize input image into 36 quantized target window . So, from each size , e.g. 80*160 target window, you directly further resized it into a 8*8 object window to get NG feature or anything else?

In addition, I notice that you adopt non-maximal suppression (NMS) to select a set of proposals (object window) for each size of target window, it means that you totally get 1000 proposals from 36 quantized target window and each target window generates several to dozens of proposals by using NMS? I am right? Would you like to recommend me some References NMS used in you paper? I can not find it in your paper.

At last, why NG features use the ground truth object windows as positive training samples to learn a linear model w in Equation (1), How can we deal with some training samples which have not ground truth object windows?

Hope to receive your answer in your conveniences and help me deeply understand this perfect work. Thanks!

best regards,
Kai
- MM ChengPost author
  
  06/03/2014 at 08:14
  
  Thank you for your interest in our work. Instead of resizing each image window, we resize the whole image (as described in the paper). For example, if the original image is 600*800 and we want to find a roughly 40*40 object in the image, we simply resize the original image to 120*160 and find 8*8 patches in this 120*160 image.
  
  Your understand of NMS is correct. I’m not familiar with any reference for NMS, or else I will put it in the paper. I consider it as a standard tips and just use it.
  
  In order to train objectness, you need ground truth annotation, the same as you train any other object detection system. Having ground truth is important for training, that why people spend so much effort to build ground truth datasets.
- Shuai Bing
  
  13/03/2014 at 04:00
  
  Hi, Kai and Ming-Ming
  
  First, very lucky that I share the same name with Ming-Ming’s fantastic feature, :-).
  
  I didn’ t check the code of BING,so I’m not sure what specific NMS technique he uses in his implementation. However, NMS is a just a usual postprocessing step to suppress the same firing windows. There are some papers that you could read to get some idea: the first one is DPM paper from Felzenswalb (PAMI 2010) or you can read the code of DPM directly, and the other one is Dalals PhD thesis (INRIA 2006). Hopefully my information can help you.
  
  Best regards,
  Bing
  - MM ChengPost author
    
    13/03/2014 at 10:01
    
    Thanks 🙂
  - Kai Wang
    
    17/03/2014 at 12:38
    
    Hi, Shuai Bing
    
    Thank you for supplying me the references about NMS. I got the idea of NMS from Wikipedia. However, I will read the DPM paper and further learn the NMS in object detection due to that I am not familiar with object detection.
    
    Thanks for your helpful answer! Also special thanks to Mingming Chen for sharing your paper’s idea to all of us.
    
    best regards,
    Kai
- cong geng
  
  16/04/2014 at 21:20
  
  I think you could refer to this paper Proposal Generation for Object Detection using Cascaded Ranking SVMs https://docs.google.com/a/brookes.ac.uk/viewer?a=v&pid=sites&srcid=YnJvb2tlcy5hYy51a3x6aW1pbmd6aGFuZ3xneDoxMmY3NDE4OGE1MzM2NGI1&pli=1, maybe you can get the answer. Or you could email ziming zhang instead, whose website is https://sites.google.com/a/brookes.ac.uk/zimingzhang/Welcome. He is the author of the idea you are interested.
- cong geng
  
  16/04/2014 at 21:30
  
  you should refer to this paper Proposal Generation for Object Detection using Cascaded Ranking SVMs https://docs.google.com/a/brookes.ac.uk/viewer?a=v&pid=sites&srcid=YnJvb2tlcy5hYy51a3x6aW1pbmd6aGFuZ3xneDoxMmY3NDE4OGE1MzM2NGI1&pli=1. Regarding your questions, I think Ziming Zhang should be more helpful, who is the author of this idea.
- MM ChengPost author
  
  20/04/2014 at 17:19
  
  For the learning part, we follow the CVPR 2011 paper as introduced in our paper. You might find more details about them in that paper.
bo

06/03/2014 at 01:45

Whats ur verson of opencv? Is there any difference if I use another copy of opencv?
- MM ChengPost author
  
  06/03/2014 at 08:05
  
  I guess there is no difference of using other versions. My personal habit is using the latest version, because in every version there are some bug fix.
bohu

05/03/2014 at 13:29

Your work is perfect.I saw a file in your code named FilterTIG,can you tell me the full name of TIG.
- MM ChengPost author
  
  05/03/2014 at 17:13
  
  Thanks. Initially we name the feature Tiny Image Gradient because it’s the image gradient (magnitude only) within a tiny (8×8) range. Later on, we changed the name to the current one, but only change the name in the paper, not in the code 🙂
Dong Zhang

04/03/2014 at 03:06

Your work is great.
- MM ChengPost author
  
  04/03/2014 at 07:50
  
  Thank you 🙂
Xintian Cheng

03/03/2014 at 07:25

Hello, I am reading your paper about objects detection,so I need the passwd of the code.
- MM ChengPost author
  
  03/03/2014 at 08:22
  
  Please read the notice (red text) in the code sharing part and get the password for unzip automatically.
CV

02/03/2014 at 12:08

Nice work!
You can try to plug this into convolutional neural networks and speed up generic object detection.
E.g. instead of usual “pixel input + convolution filter” , change to “binarised gradient + approximate fast filter”
- MM ChengPost author
  
  02/03/2014 at 15:11
  
  Nice suggestions.
- Nick Sergievskiy
  
  16/04/2014 at 08:01
  
  Sounds interesting.
  I only read about laplacian approximation by Haar-like filters.
  Can you give some links on papers with this (“binarised gradient + approximate fast filter”) ideas.
whoyoung

27/02/2014 at 08:56

Meaningful work~
- MM ChengPost author
  
  16/05/2014 at 12:21
  
  Thanks
Ronghang Hu

27/02/2014 at 04:12

Your result is inspiring.
- MM ChengPost author
  
  27/02/2014 at 07:21
  
  Thank you for your interest in it.
YangPan

26/02/2014 at 12:01

How to get the passwd
- MM ChengPost author
  
  26/02/2014 at 15:33
  
  Please see the FAQs 3 in the bottom of this page.
  - gillyamylee
    
    16/05/2014 at 07:46
    
    cool good！
Zhaowei Cai

26/02/2014 at 05:43

cool stuff
- MM ChengPost author
  
  27/02/2014 at 07:21
  
  Thank you for your interest in it.
caijiyuan

26/02/2014 at 05:36

perfect
- MM ChengPost author
  
  27/02/2014 at 07:21
  
  Thank you for your interest in it.