Visual saliency: Fundamentals, Applications, and Recent Progress

ICIP 2015 Tutorial, September 27th, Morning Session (09:00 – 12:30)


  • Ali BORJI, University of Wisconsin-Milwaukee, USA
  • Neil D. B. BRUCE, University of Manitoba, Canada
  • Ming-Ming CHENG, Nankai University, China
  • Jian LI, National University of Defense Technology, China

Course Motivation and Description

Recently, visual saliency has received rapidly growing attention across many disciplines, including cognitive psychology, neurobiology, image processing, and computer vision. Based on observed reaction times and estimated signal transmission times along biological pathways, theories of human attention hypothesize that the human visual system processes only parts of an image in detail, with only limited processing of areas outside the focus of attention. From an engineering perspective, such visual attention mechanisms have inspired a series of key research topics over the last few decades. One of the key forces behind these rapid developments is the large number of successful applications. These applications, marked by different requirements and points of emphasis, have resulted in a rich kinship between fixation prediction, salient object detection, and objectness proposal generation.

Many papers on visual saliency have appeared at ICIP consistently over the past decade. While there are still many open issues and challenges (and sometimes diverging arguments and debates) that need to be addressed in this area, the field of saliency computing continues to grow very rapidly. In this tutorial, we will introduce the basic ideas, important models, and applications of visual attention and saliency. We will discuss some key research issues, including top-down vs. bottom-up attention and the relationship between fixation prediction, salient object detection, and object proposal generation. Recent advances in fixation prediction, salient object detection, and objectness proposals will be introduced in detail, with a significant emphasis on their respective potential applications. Finally, we will discuss the fairness of model evaluation criteria, model benchmarking, divergent opinions, open challenges, and potential future work.

Course Outline

This tutorial will consist of four talks of about 45 minutes each. It begins with the fundamental knowledge and important classical models. Then, we discuss the differences and correlations among the subareas (fixation prediction, salient object detection, and objectness proposals), followed by a detailed introduction to each subarea. Finally, we discuss topics relating to model evaluation and benchmarking. The contents of the tutorial are as follows.

Course Prerequisites

Attendees need only basic knowledge of digital image processing to follow the course.

Distributed Material

All materials will be distributed to the attendees electronically via webpage downloads. No physical materials will be distributed.


Ali BORJI received his B.S. and M.S. degrees in computer engineering from the Petroleum University of Technology, Tehran, Iran, in 2001 and Shiraz University, Shiraz, Iran, in 2004, respectively. He received his Ph.D. degree in computational neurosciences from the Institute for Studies in Fundamental Sciences (IPM) in Tehran in 2009. He then spent a year at the University of Bonn as a postdoc. Before coming to the University of Wisconsin-Milwaukee in the fall of 2014, Dr. Borji was a postdoctoral scholar at iLab, University of Southern California, Los Angeles, for four years.

Neil D. B. BRUCE is an Assistant Professor at the University of Manitoba in Canada. His research interests span computer vision and human vision, image processing, visual attention, machine learning, computational neuroscience, information theory, sparse coding, 3D modeling and reconstruction, natural image statistics, and statistical and graphical models. Prior to joining the University of Manitoba, he completed two post-doctoral fellowships, one at the Centre for Vision Research at York University and the other at INRIA Sophia Antipolis. Previously, he completed a Ph.D. in the Department of Computer Science and Engineering in 2008 as a member of the Centre for Vision Research at York University, Toronto, Canada. In 2003, he completed an M.A.Sc. in System Design Engineering at the University of Waterloo, and he received an Honors B.Sc. with a double major in Computer Science and Mathematics from the University of Guelph in 2001.

Ming-Ming CHENG is an associate professor with the College of Computer and Control Engineering, Nankai University. He received his Ph.D. degree from Tsinghua University in 2012 under the guidance of Prof. Shi-Min Hu, working closely with Prof. Niloy Mitra. He then spent two years as a research fellow working with Prof. Philip Torr in Oxford. Dr. Cheng’s research primarily centers on algorithmic issues in image understanding and processing, including image segmentation, editing, and retrieval. During the past five years, he has published a series of influential papers in several sub-areas of visual saliency modeling, including salient object detection (e.g. his CVPR 2011 paper has received 900+ citations), objectness estimation (e.g. his CVPR 2014 oral paper has received 70+ citations and 3000+ source code downloads), and visual saliency based applications (e.g. his SIGGRAPH Asia 2009 paper ‘Sketch2Photo’ has received 250+ citations and was reported by ‘The Telegraph’ in the UK and ‘Spiegel’ in Germany).

Jian LI is an assistant professor with the National University of Defense Technology (NUDT). He received his B.E., M.E., and Ph.D. degrees from NUDT, Changsha, P.R. China. From January 2010 to January 2011, he was a visiting Ph.D. student (Academic Trainee) at the Center for Intelligent Machines (CIM) at McGill University under the supervision of Prof. Martin Levine.