How Many Glances? Modeling Multi-duration Saliency

Workshop on Shared Visual Representations in Human and Machine Intelligence at NeurIPS (SVRHM)

Published December 13, 2019

Camilo Fosco, Anelise Newman, Patr Sukhum, Yun Bin Zhang, Aude Oliva, Zoya Bylinskii

Traditional models of visual saliency have ignored the temporal aspect of visual attention and have produced prediction maps at fixed viewing durations. As a result, current applications of saliency are rigidly tailored for a fixed viewing duration. To incorporate knowledge of viewing duration into saliency modeling, we collect the CodeCharts1K dataset, which contains viewing data at three durations on 1000 images from diverse computer vision datasets. Our analysis shows distinct differences in gaze locations at these time points and exposes recurring temporal patterns about which objects attract attention. We use these insights to develop a lightweight saliency model that simultaneously trains on data from multiple time points. Our Multi-Duration Saliency Excited Model (MD-SEM) achieves state-of-the-art performance on the LSUN 2017 Challenge with 57% fewer parameters than comparable architectures.
