
Scoot: A Perceptual Metric for Facial Sketches

Deng-Ping Fan1,2, ShengChuan Zhang3, Yu-Huan Wu1, Yun Liu1, Ming-Ming Cheng1, Bo Ren1, Paul L. Rosin4, Rongrong Ji3

1TKLNDST, CS, Nankai University      2Inception Institute of Artificial Intelligence (IIAI)       3Xiamen University      4Cardiff University

Abstract

The human visual system can quickly assess the perceptual similarity between two facial sketches. However, two widely used facial sketch metrics, FSIM and SSIM, fail to capture this perceptual similarity. A recent study in the facial modeling area has verified that including both structure and texture significantly benefits face sketch synthesis (FSS). But which statistics are more important, and which contribute to that success? In this paper, we design a perceptual metric, called Structure Co-Occurrence Texture (Scoot), which simultaneously considers block-level spatial structure and co-occurrence texture statistics. To test the quality of metrics, we propose three novel meta-measures based on various reliable properties. Extensive experiments verify that our Scoot metric exceeds the performance of prior work. In addition, we built the first large-scale (152k judgments) human-perception-based sketch database, which can evaluate how consistent a metric is with human perception. Our results suggest that “spatial structure” and “co-occurrence texture” are two generally applicable perceptual features in face sketch synthesis.

Figure 1: Which synthesized sketch is more similar to the middle sketch? For the middle case, sketch 0 is more similar to the reference than sketch 1 in terms of structure and texture; sketch 1 almost completely destroys the structure of the hair. The widely used (SSIM [65], FSIM [79]), classic (IFC [40], VIF [39]), and recently released (GMSD [74]) metrics disagree with humans. Only our Scoot metric agrees well with humans.

Publication

Deng-Ping Fan, ShengChuan Zhang, Yu-Huan Wu, Yun Liu, Ming-Ming Cheng, Bo Ren, Paul L. Rosin, Rongrong Ji

Scoot: A Perceptual Metric for Facial Sketches, ICCV, 2019

[project page][bib][pdf][supp][official version][code][Dataset (77M)]

Motivation

A good perceptual metric for facial sketch comparison should take human perception into account. Specifically, it should:

  • closely reflect human visual perception, so that sketches it rates highly can be used directly in various subjective applications.
  • be insensitive to slight mismatches (i.e., resize, rotation), since real-world sketches drawn by artists do not match the original photos pixel by pixel.
  • be capable of capturing holistic content, that is, prefer a complete sketch over one that contains only strokes (and has lost some facial components).

What did we do?

  • Firstly, we propose a Structure Co-Occurrence Texture (Scoot) perceptual metric for FSS that provides a unified evaluation considering both structure and texture.
  • Secondly, we design three meta-measures based on the above three reliable properties. Extensive experiments on these meta-measures verify that our Scoot metric exceeds the performance of prior work. Our experiments indicate that “spatial structure” and “co-occurrence texture” are two generally applicable perceptual features in FSS.
  • Thirdly, we explore different ways of exploiting texture statistics (e.g., Gabor, Sobel, and Canny). We find that the simple texture feature [14, 15] performs far better than the metrics commonly used in the literature [39, 40, 65, 74, 79]. Based on our findings, we construct the first large-scale human-perception-based sketch database, which can evaluate how well a metric agrees with human perception.

Our three contributions presented above offer a complete metric benchmark suite, which provides a novel view and practical tools (e.g., metric, meta-measures, and database) for analyzing data similarity from the perspective of human perception.
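
For readers unfamiliar with co-occurrence texture statistics, the following is a minimal sketch of a generic gray-level co-occurrence matrix and two statistics commonly derived from it. It is an illustration under stated assumptions, not the Scoot implementation: the function names, the quantization into `levels` gray levels, the single (dx, dy) offset, and the choice of contrast and energy are all assumptions that this page does not specify.

```python
from collections import Counter


def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for one pixel offset.

    image  : list of rows, each a list of ints in [0, levels)
    dx, dy : hypothetical offset parameters (column and row displacement)
    """
    height, width = len(image), len(image[0])
    counts = Counter()
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            # Count only pairs whose neighbor lies inside the image.
            if 0 <= ny < height and 0 <= nx < width:
                counts[(image[y][x], image[ny][nx])] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("image is too small for the requested offset")
    return [[counts[(i, j)] / total for j in range(levels)]
            for i in range(levels)]


def contrast(glcm):
    """Weights each co-occurring pair by its squared gray-level difference."""
    return sum(p * (i - j) ** 2
               for i, row in enumerate(glcm)
               for j, p in enumerate(row))


def energy(glcm):
    """Sum of squared probabilities; equals 1 for a perfectly uniform patch."""
    return sum(p * p for row in glcm for p in row)


# Toy 3x3 patch with three gray levels (illustrative data only).
patch = [[0, 0, 1],
         [0, 0, 1],
         [2, 2, 2]]
glcm = cooccurrence_matrix(patch, levels=3)
patch_contrast = contrast(glcm)
patch_energy = energy(glcm)
```

In a block-level setting, statistics like these would be computed per block and then compared between the synthesized sketch and the reference; how Scoot combines them is described in the paper rather than on this page.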

Meta-Measure 1: Stability to Slight Resizing

The first meta-measure specifies that the rankings of synthetic sketches should not change much with slight changes in the GT sketch. Therefore, we downsize the GT slightly, by 5 pixels, using nearest-neighbor interpolation.
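
The check described above can be sketched roughly as follows. This is a hedged illustration, not the authors' evaluation code: it assumes that "5 pixels" means shrinking both height and width by 5, that ranking stability is summarized with Spearman's rank correlation (the page does not name the statistic), and that `metric` is a caller-supplied function that already copes with any size mismatch. All function names are hypothetical.

```python
def resize_nearest(image, new_height, new_width):
    """Nearest-neighbor resize of a 2D list-of-lists image."""
    height, width = len(image), len(image[0])
    return [[image[(r * height) // new_height][(c * width) // new_width]
             for c in range(new_width)]
            for r in range(new_height)]


def downsize_by(image, pixels=5):
    """Shrink both dimensions by `pixels` (an assumed reading of the text)."""
    new_height, new_width = len(image) - pixels, len(image[0]) - pixels
    if new_height <= 0 or new_width <= 0:
        raise ValueError("image is too small to downsize by that many pixels")
    return resize_nearest(image, new_height, new_width)


def ranks(scores):
    """Rank positions, 1 = highest score; ties broken by index (assumption)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman(scores_a, scores_b):
    """Spearman rank correlation for tie-free score lists of equal length."""
    n = len(scores_a)
    if n < 2 or n != len(scores_b):
        raise ValueError("need two equally long lists with at least 2 scores")
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(scores_a), ranks(scores_b)))
    return 1 - 6 * d2 / (n * (n * n - 1))


def ranking_stability(metric, candidates, gt, perturb=downsize_by):
    """Correlation between rankings before and after perturbing the GT."""
    perturbed = perturb(gt)
    before = [metric(c, gt) for c in candidates]
    after = [metric(c, perturbed) for c in candidates]
    return spearman(before, after)
```

A value near 1 means the metric ranks the synthesized sketches almost identically before and after the GT is resized, which is the behavior this meta-measure rewards.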

Figure 2: Visual comparison of existing widely used FSS measures (SSIM [8], FSIM [10], and VIF [4]) on meta-measure 1. The experiment clearly shows that the proposed Scoot measure is more stable under slight resizing.

Meta-Measure 2: Rotation Sensitivity

In real-world situations, sketches drawn by artists may also be slightly rotated compared to the original photographs. Thus, the second meta-measure tests how sensitive an evaluation measure is to rotation of the GT. We apply a slight counter-clockwise rotation (5°) to each GT sketch.
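
As an illustration of the perturbation itself, the sketch below rotates a grayscale image counter-clockwise about its center. The interpolation scheme and the handling of pixels that rotate in from outside the frame are not stated on this page, so nearest-neighbor sampling and a white `fill` value are assumptions, and `rotate_nearest` is a hypothetical helper name.

```python
import math


def rotate_nearest(image, degrees, fill=255):
    """Rotate a 2D list-of-lists image counter-clockwise about its center.

    Uses inverse mapping: for every output pixel, look up the source pixel
    it came from. Pixels with no source inside the frame get `fill`.
    """
    height, width = len(image), len(image[0])
    cy, cx = (height - 1) / 2, (width - 1) / 2
    theta = math.radians(degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = []
    for y in range(height):
        row = []
        for x in range(width):
            dx, dy = x - cx, y - cy
            # Image rows grow downward, so this is the inverse of a
            # visually counter-clockwise rotation.
            sx = int(math.floor(cos_t * dx - sin_t * dy + cx + 0.5))
            sy = int(math.floor(sin_t * dx + cos_t * dy + cy + 0.5))
            if 0 <= sx < width and 0 <= sy < height:
                row.append(image[sy][sx])
            else:
                row.append(fill)
        rotated.append(row)
    return rotated


# Sanity check on a tiny grid: a 90-degree turn moves the right column
# to the top row.
grid = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
quarter_turn = rotate_nearest(grid, 90)
```

For the meta-measure, the call would be `rotate_nearest(gt, 5)`, and the rankings produced with the rotated GT would be compared against those produced with the original GT.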

Figure 3: Visual comparison of existing widely used FSS measures (SSIM [8], FSIM [10], and VIF [4]) on meta-measure 2. The experiment clearly demonstrates that the proposed Scoot measure is less sensitive to minor rotation.

Meta-Measure 4: Human Judgment

The fourth meta-measure (Jug) specifies that the ranking result according to an evaluation measure should agree with human judgment.
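
One simple way to quantify such agreement is sketched below. It assumes each database entry can be reduced to a triple of GT sketch, the synthesis humans ranked first, and the synthesis humans ranked second, and it scores a metric by the fraction of triples where it prefers the human-preferred sketch. The triple layout, the function names, and the toy metric are hypothetical; the page does not describe the exact scoring protocol.

```python
def judgment_agreement(metric, trials):
    """Fraction of trials where `metric` agrees with the human ranking.

    trials : iterable of (gt, human_first, human_second) triples
    metric : callable(sketch, gt) -> score, higher meaning more similar
    """
    agree = 0
    total = 0
    for gt, human_first, human_second in trials:
        total += 1
        if metric(human_first, gt) > metric(human_second, gt):
            agree += 1
    if total == 0:
        raise ValueError("no trials given")
    return agree / total


def toy_metric(sketch, gt):
    """Placeholder metric: negative mean absolute pixel difference."""
    return -sum(abs(a - b) for a, b in zip(sketch, gt)) / len(gt)


# Two made-up trials on flattened "images" (illustrative data only).
toy_trials = [
    ([0, 0, 0, 0], [0, 0, 0, 1], [1, 1, 1, 1]),  # toy metric agrees
    ([5, 5], [0, 0], [5, 5]),                    # toy metric disagrees
]
agreement = judgment_agreement(toy_metric, toy_trials)
```

An agreement of 1.0 would mean the metric always prefers the sketch that human raters preferred, while 0.5 is what a coin flip would achieve on two-way choices.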

Figure 4: Meta-measure 4. Sample images from our human-ranked database. The first row is the GT sketch, followed by the first- and second-ranked synthesis results. We refer the reader to the accompanying attachment (“Proposed Datasets”) for more details.

  • Download: Perceptual Similarity Dataset

Performance

Table 1: Benchmarking results of classical and alternative texture/edge-based metrics. The best result is highlighted in bold. These differences are all statistically significant at the α < 0.05 level. ↑ indicates that a higher score means the metric performs better; ↓ indicates the opposite.

..waiting update…
