SalientShape: Group Saliency in Image Collections

Ming-Ming Cheng1,4   Niloy J. Mitra2  Xiaolei Huang3  Shi-Min Hu1

1TNList, Tsinghua University, Beijing     2UCL     3Lehigh University     4Oxford Brookes University

Figure. Our system explicitly extracts salient object regions from a group of related images with heterogeneous quality offline (a-d, f) to enable efficient online (e) shape based query.


Efficiently identifying salient objects in large image collections is essential for many applications including image retrieval, surveillance, image annotation, and object recognition. We propose a simple, fast, and effective algorithm for locating and segmenting salient objects by analysing image collections. As a key novelty, we introduce group saliency to achieve superior unsupervised salient object segmentation by extracting salient objects (in collections of pre-filtered images) that maximize between-image similarities and within-image distinctness. To evaluate our method, we construct a large benchmark dataset consisting of 15K images across multiple categories with 6000+ pixel-accurate ground truth annotations for salient object regions where applicable. In all our tests, group saliency consistently outperforms state-of-the-art single-image saliency algorithms, resulting in both higher precision and better recall. Our algorithm successfully handles image collections that are an order of magnitude larger than any existing benchmark dataset and that consist of diverse and heterogeneous images from various internet sources.


SalientShape: Group Saliency in Image Collections. Ming-Ming Cheng, Niloy J. Mitra, Xiaolei Huang, Shi-Min Hu. The Visual Computer, 2013. [pdf] [supplementary] [bib]


We introduce a labeled dataset of categorized images for evaluating sketch based image retrieval. Using Flickr, we downloaded about 3000 images for each of the 5 keywords: “butterfly”, “coffee mug”, “dog jump”, “giraffe”, and “plane”, together comprising about 15000 images. For each image, if there is an unambiguous object whose content matches the query keyword and most of the object is visible, we mark that object region. Similar to MSRA10K, the salient regions are marked at the pixel level. We only label salient object regions for objects that are almost fully visible, since partially occluded objects are less useful for shape matching. Unlike MSRA10K, the THUR15K dataset does not contain a labeled salient region for every image, i.e., some images may not have any salient region. This dataset is used to evaluate shape based image retrieval performance.

Please read the notice first to see how to automatically get the password for unzipping the dataset.

Comparisons with state of the art methods

Figure. Evaluation results on our benchmark dataset. (a) Precision-recall curves for naive thresholding of saliency maps. S, G1, G2 represent single image saliency, group saliency after the 1st and 2nd iterations, respectively. Subscripts B, C, D, G, P represent groups of ‘butterfly’, ‘coffee mug’, ‘dog jump’, ‘giraffe’, ‘plane’, respectively. (b) Comparison of F-Measure for image groups using single image saliency segmentation methods (FT [1], SEG [37], RC [14]) vs. group saliency (GS) segmentation.
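The precision-recall curves in (a) come from thresholding each saliency map at a series of fixed levels and comparing the resulting binary mask with the pixel-accurate ground truth. The following is a minimal sketch of that evaluation, not the authors' evaluation code. The function names, the assumption that saliency values lie in [0, 1], the conventions for empty masks, and the weighting β² = 0.3 (a value commonly used in the salient object literature, not stated on this page) are all assumptions made for illustration.

```python
import numpy as np


def pr_curve(saliency, gt_mask, num_thresholds=256):
    """Precision and recall of a saliency map under naive thresholding.

    saliency: 2-D array with values assumed to lie in [0, 1].
    gt_mask:  2-D boolean array, True for salient-object pixels.
    """
    saliency = np.asarray(saliency, dtype=np.float64)
    gt = np.asarray(gt_mask, dtype=bool)
    thresholds = np.linspace(0.0, 1.0, num_thresholds)
    precision = np.zeros(num_thresholds)
    recall = np.zeros(num_thresholds)
    gt_count = gt.sum()
    for i, t in enumerate(thresholds):
        pred = saliency >= t                      # binary mask at this level
        pred_count = pred.sum()
        tp = np.logical_and(pred, gt).sum()       # correctly detected pixels
        # Convention (assumed): an empty prediction has precision 1.
        precision[i] = tp / pred_count if pred_count > 0 else 1.0
        # Images without a labeled salient region get recall 0 here.
        recall[i] = tp / gt_count if gt_count > 0 else 0.0
    return thresholds, precision, recall


def f_measure(precision, recall, beta2=0.3):
    """Weighted harmonic mean of precision and recall (beta2 is assumed)."""
    denom = beta2 * precision + recall
    return (1.0 + beta2) * precision * recall / denom if denom > 0 else 0.0


if __name__ == "__main__":
    sal = [[0.9, 0.6], [0.7, 0.1]]
    gt = [[True, True], [False, False]]
    _, p, r = pr_curve(sal, gt, num_thresholds=3)
    for pi, ri in zip(p, r):
        print(f"precision={pi:.3f} recall={ri:.3f} F={f_measure(pi, ri):.3f}")
```

A curve that lies closer to the top-right corner (high precision at high recall) indicates a better saliency map. To compare methods on a whole image group, the per-image precision and recall values would be averaged at each threshold before plotting.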

Figure. SBIR comparison. In each group from left to right, first column shows images downloaded from Flickr using the corresponding keyword; second column shows our retrieval results obtained by comparing user-input sketch with group saliency segmentation results; third column shows corresponding sketch based retrieval results using SHoG [20]. Two input sketches with their retrieval results are shown in (e).

Learned appearance from Flickr images (without time consuming image annotation)




  1. Global Contrast based Salient Region Detection. Ming-Ming Cheng, Guo-Xin Zhang, Niloy J. Mitra, Xiaolei Huang, Shi-Min Hu. IEEE CVPR, 2011, p. 409-416.
  2. Unsupervised Joint Object Discovery and Segmentation in Internet Images. Michael Rubinstein, Armand Joulin, Johannes Kopf, Ce Liu. IEEE CVPR, 2013, p. 1939-1946. (Used the proposed saliency measure and showed that saliency-based segmentation produces state-of-the-art results on co-segmentation benchmarks, without using co-segmentation!)
  3. Unsupervised Object Discovery via Saliency-Guided Multiple Class Learning. Jun-Yan Zhu, Jiajun Wu, Yichen Wei, Eric Chang, Zhuowen Tu. IEEE CVPR, 2012.



After refinement we are left with a few top-quality images. How can we evaluate the prior value?
Is it done by a linear operation on the GMM of each image, or by something else?
Thanks in advance.


Hello Professor, I recently read this paper of yours and would like to ask: how do precision-recall curves evaluate how good or bad an algorithm is?

Lei Bao



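[…] Regarding salient object detection: if only one saliency map is generated per image, there is not much room left for improving saliency maps computed from a single image. My 2011 PAMI submission reached an F-measure of about 93% on MSRA1000, and since then I have not seen anything with higher accuracy than the segmentation results in my CVPR11 paper (F = 90%). Using multiple images, especially images randomly downloaded from the internet, to extract useful salient objects and automatically discard the errors produced by single-image analysis should still leave plenty to do. For details, see: […]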