Foreground map evaluation is crucial for gauging the progress of object segmentation algorithms, in particular in the field of salient object detection, where the purpose is to accurately detect and segment the most salient object in a scene. Several widely used measures such as Area Under the Curve (AUC), Average Precision (AP) and the recently proposed Fωβ (Fbw) have been utilized to evaluate the similarity between a non-binary saliency map (SM) and a ground-truth (GT) map. These measures are based on pixel-wise errors and often ignore structural similarities. Behavioral vision studies, however, have shown that the human visual system is highly sensitive to structures in scenes. Here, we propose a novel, efficient, and easy-to-calculate measure, called the structural similarity measure (Structure-measure), to evaluate non-binary foreground maps. Our new measure simultaneously evaluates region-aware and object-aware structural similarity between an SM and a GT map. We demonstrate the superiority of our measure over existing ones using 5 meta-measures on 5 benchmark datasets.
Structure-measure: A new way to evaluate foreground maps IEEE ICCV, 2017 ( Spotlight Oral, Accept rate: % )
- Region perspectives: Although it is difficult to describe the object structure of a foreground map, we notice that the entire structure of an object can be well illustrated by combining structures of constituent object-parts (regions).
- Object perspectives: In high-quality SMs, the foreground region of the map contrasts sharply with the background region, and each of these regions usually has an approximately uniform distribution.
- The AP, AUC and Fbw evaluation measures are computed in a similar way. They all operate pixel by pixel and do not evaluate structural similarity.
- Because the current evaluation measures (AP, AUC, Fbw) treat each pixel as independent, they ignore the structure of the foreground maps, so structurally different maps can receive the same score.
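The point about pixel independence can be illustrated with a small sketch. It assumes a toy 16x16 GT map and two hypothetical saliency maps that contain the same four errors arranged differently: a compact hole inside the object versus four isolated corner pixels. The `pixelwise_auc` helper is a plain NumPy rank-based AUC written for this illustration, not the evaluation code used in the paper, and mean absolute error (MAE) is added only as a second pixel-wise stand-in.

```python
import numpy as np

def pixelwise_auc(sm, gt):
    """ROC AUC over pixels, treating every pixel independently.

    Counts (foreground, background) pixel pairs in which the foreground
    pixel has the higher saliency; ties count as one half.
    """
    pos = sm[gt]    # saliency values on GT foreground pixels
    neg = sm[~gt]   # saliency values on GT background pixels
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)

# Toy ground truth: an 8x8 square object in a 16x16 image.
gt = np.zeros((16, 16), dtype=bool)
gt[4:12, 4:12] = True

# A near-perfect non-binary map: 0.9 on the object, 0.1 elsewhere.
base = np.where(gt, 0.9, 0.1)

# Map A: four missed pixels forming a 2x2 hole in the middle of the object.
map_hole = base.copy()
map_hole[7:9, 7:9] = 0.1

# Map B: four missed pixels, one at each corner of the object.
map_corners = base.copy()
for r, c in [(4, 4), (4, 11), (11, 4), (11, 11)]:
    map_corners[r, c] = 0.1

auc_hole = pixelwise_auc(map_hole, gt)
auc_corners = pixelwise_auc(map_corners, gt)
mae_hole = np.abs(map_hole - gt.astype(float)).mean()
mae_corners = np.abs(map_corners - gt.astype(float)).mean()

print(auc_hole, auc_corners)   # identical AUC values
print(mae_hole, mae_corners)   # identical MAE values (up to rounding)
```

Both maps contain the same multiset of (saliency, label) pairs, so any score computed from those pairs alone cannot distinguish a hole punched through the object from a few clipped corners.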
(a) Image (b) GT (c) state-of-the-art map (d) generic map
Meta-measure 2: Generic vs. state-of-the-art. An evaluation measure should give the foreground map (FM) generated by a state-of-the-art method (c) a higher score than a generic map (d) that does not consider the content of the image. Unfortunately, all of the current evaluation measures give the map in (d) a higher score than the map in (c). Only our measure correctly ranks the state-of-the-art result higher.
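The bookkeeping behind this meta-measure can be sketched as follows. Everything named here is an assumption made for illustration: `centered_gaussian` is a hypothetical content-independent generic map, `mae_similarity` is a simple pixel-wise stand-in for the measure under test, and `mm2_failure_rate` is a hypothetical helper that counts how often the generic map scores at least as high as the method's map. The toy data only exercises the procedure and does not reproduce the results reported on this page.

```python
import numpy as np

def mae_similarity(fm, gt):
    """Stand-in measure under test: 1 - mean absolute error (higher is better)."""
    return 1.0 - np.abs(fm - gt.astype(float)).mean()

def centered_gaussian(shape, sigma_frac=0.25):
    """Hypothetical generic map: a Gaussian blob centred in the image.

    It depends only on the image size, never on the image content.
    """
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    sy, sx = sigma_frac * h, sigma_frac * w
    return np.exp(-(((ys - cy) / sy) ** 2 + ((xs - cx) / sx) ** 2) / 2.0)

def mm2_failure_rate(measure, gts, method_maps, generic_map_fn=centered_gaussian):
    """Fraction of images where the generic map is NOT ranked below the method's map."""
    failures = 0
    for gt, fm in zip(gts, method_maps):
        generic = generic_map_fn(gt.shape)
        if measure(generic, gt) >= measure(fm, gt):
            failures += 1
    return failures / len(gts)

# Toy data: one 32x32 image with an off-centre object and a near-perfect map.
gt = np.zeros((32, 32), dtype=bool)
gt[2:12, 2:12] = True
method_map = np.where(gt, 0.95, 0.05)

rate = mm2_failure_rate(mae_similarity, [gt], [method_map])
print(rate)  # 0.0 means the measure ranked the method's map higher on every image
```

A lower failure rate means the measure agrees more often with the expected ranking; any other measure with the same `(fm, gt)` signature can be passed in place of `mae_similarity`.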
- Table 1. Quantitative comparison with current measures on 3 meta-measures. The best result is highlighted in blue. MM: meta-measure.