Editing Images with Natural Language

Gierad Laput

University of Michigan

Mira Dontcheva

Adobe Research

Gregg Wilensky

Adobe Research

Walter Chang

Adobe Research

Aseem Agarwala

Adobe Research

Jason Linder

Adobe Research

Eytan Adar

University of Michigan

Photo editing can be a challenging task, and it becomes even more difficult on the small, portable screens of mobile devices that are now frequently used to capture and edit images. To address this problem, we present PixelTone, a multimodal photo editing interface that combines speech and direct manipulation. We observe existing image editing practices and derive a set of principles that guide our design. In particular, we use natural language to express desired changes to an image, and sketching to localize these changes to specific regions. To support the language commonly used in photo editing, we develop a customized natural language interpreter that maps user phrases to specific image processing operations. Finally, we perform a user study that evaluates the interface and demonstrates its effectiveness.
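To make the idea of mapping user phrases to image processing operations concrete, here is a minimal sketch in Python. It is not PixelTone's interpreter, which this page does not describe in detail. It assumes a simple keyword lookup, and every name in it (`OPERATIONS`, `MODIFIERS`, `interpret`, `base_amount`) and every vocabulary entry is hypothetical.

```python
import re

# Hypothetical vocabulary: these keywords and operation names are
# illustrative assumptions, not PixelTone's actual lexicon.
OPERATIONS = {
    "brighten": ("brightness", +1),
    "brighter": ("brightness", +1),
    "darken": ("brightness", -1),
    "darker": ("brightness", -1),
    "sharpen": ("sharpness", +1),
    "sharper": ("sharpness", +1),
    "blur": ("sharpness", -1),
}

# Hypothetical modifier words that scale the size of the adjustment.
MODIFIERS = {"slightly": 0.5, "little": 0.5, "much": 2.0, "lot": 2.0}


def interpret(phrase, base_amount=0.2):
    """Map a phrase to a list of (operation, signed amount) pairs."""
    words = re.findall(r"[a-z]+", phrase.lower())

    # Combine all modifier words in the phrase into one scale factor.
    scale = 1.0
    for word in words:
        scale *= MODIFIERS.get(word, 1.0)

    # Emit one operation per recognized keyword, in phrase order.
    return [
        (OPERATIONS[w][0], OPERATIONS[w][1] * base_amount * scale)
        for w in words
        if w in OPERATIONS
    ]
```

Under these assumptions, `interpret("Make it slightly brighter")` yields a single `brightness` operation at half the base amount, and a phrase with no recognized keyword yields an empty list. A real interpreter would also need to handle synonyms, negation, and the sketched region that localizes the change, none of which this lookup attempts.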

Check out this video for a demonstration.

Project Publications

PixelTone: A Multimodal Interface for Image Editing

Laput, G., Dontcheva, M., Wilensky, G., Chang, W., Agarwala, A., Linder, J., Adar, E. (Apr. 27, 2013)
Proceedings of the ACM Conference on Human Factors in Computing Systems (SIGCHI), 2185-2194