Adobe Researcher Aaron Hertzmann has a new theory of perception that breaks the rules

July 24, 2024

Tags: Computer Vision, Imaging & Video, Conferences, Graphics (2D & 3D), Researcher Spotlights

Photo by @kasselmancreative

Adobe Researchers spend their days at the intersection of big ideas and practical applications. That’s one reason why Aaron Hertzmann, Principal Scientist for Adobe Research, thinks so much about how theories of art can help explain what we see and how we create—and how those same theories sometimes fall short. He’s also working on developing new theories of perception that could change the future of creative tools.

Hertzmann will receive one of the highest honors in the field, the Computer Graphics Achievement Award, later this summer from ACM SIGGRAPH, the premier international community for computer graphics research and art. He talked with us about his insights on how we see pictures, and he shared some of the most interesting lessons he’s learned from life at the intersection of art and tech.

What have you been working on lately? We hear you have a new theory about art that’s just coming together.

Ever since I studied art back in high school and college, I’ve been interested in how humans perceive pictures. I was always confused because it seemed like there were contradictions between things like the rules of perspective and what artists actually do in their work.

So now I have the outline of a theory for understanding these contradictions and making sense of how shape in pictures works. My theory integrates recent developments in human vision science, computer graphics, and art history to explain how, when we look at a picture, we get a sense of perspective.

To start thinking about it, try looking at a word in some text. Without moving your eyes, see how much text you can read around the word. You probably can’t read much more than the word that you’re looking at, and a few words around it. That eye “fixation” gives you fine detail. Everything else around that is peripheral vision, which is really important, but we’re not really using it for understanding details or shapes.

What’s more, when you move your eyes from one thing to another, you don’t remember most of the tiny details you saw in the first fixation—most of it is gone. Try remembering the precise appearance of something that you looked at just moments earlier—you probably can’t recall how many leaves were on a plant or how many buttons on a shirt, unless you had counted. In a similar way, when you look at things in the real world, you’re moving from one fixation to the next, and each one has its own perspective for you. Artists often create this effect of multiple perspectives in their work, giving different fixations in a painting or drawing their own perspective, shape, and notion of how the eye relates to the context of the picture.

When we view a picture, our eyes don’t take it all in at once. We notice details in each moment only for the parts that we look at. Different parts of a picture can have their own perspectives, as shown in this artwork by Aaron Hertzmann.

But this isn’t how we talk about perspective in conventional theories of art. The old rules say there’s one perspective for an entire picture. In our new theory, there’s no global, overarching rule that defines how everything in an image has to relate. This illuminates the contradiction between the rigid rules of how we’re sometimes told we’re supposed to make art, and the more free-form, abstract arrangements that artists actually create.

The same thing happens in photography: our smartphones use the conventional rules, and they create “distortions” just like the ones known since the Renaissance. For example, the faces of people in the corners of a photo look stretched out. But there are some clever multi-perspective photography techniques that create pictures without these distortions, and they’re not widely known. These techniques inspired my theory, which treats a picture as a set of perspectives instead of one single perspective.

This new theory started as a way to think about the rules of perspective, but it’s turning out to be more than that. At the moment, I’m also working on follow-up theories to explain visual illusions, tones, and colors, and what those things say about perception in general.

One of my hopes is that we can use these theories to build better creative tools that allow us to take photographs that look more like what we really see in the world, or more like the way we’d want to paint or draw something. For example, in a photo, the building I was looking at may seem too small, or it might not capture an aspect of my experience in the world. The photography technology doesn’t allow me to depict the world as I experienced it. And it creates distortions, like stretched faces. So there’s work to be done here. With a new theory of how pictures work, I think we can build tools that will let people take the photographs they really want. 

When we talked a couple of years ago, you explained why AI can’t take the place of human creativity. What do you think about the newest developments in AI for creativity?

I really like the way the generative tools are being incorporated into products rather than just being stand-alone tools where you type in a text prompt and hope for the best. These things will keep getting more useful as they become part of an artist’s creative workflow. They’ll allow artists to solve problems along the way.

At this year’s conference, you will be honored by ACM SIGGRAPH for pioneering work in non-photorealistic animation and rendering, image synthesis, character animation, computational photography, and the interplay between computer-generated and traditional art. What does the award mean to you? Any thoughts on the things you’ve learned along the way?

I went to my first SIGGRAPH right after college and I knew it was a community I wanted to be part of. They’re an amazing group of people who are excited about making technologies for art, so it’s a wonderful place to be creative and inventive. Being recognized by them is immensely gratifying.

At the meeting this summer I get to give a talk. So I’ve been thinking a lot about what I’d like to say about my path. When I graduated with art and computer science degrees, I thought they were separate, and I had a lot of questions. Like, what does it mean to be an artist? What are the rules for which things you should include in a picture and which things you shouldn’t? Why is it okay to be abstract, and how do you know if an abstract painting is good?

Much of what I’ve been doing since then is trying to come up with answers using the tools of science and computer graphics, while avoiding rigid rules that don’t allow for creativity, subjectivity, or humanity. Along the way, I’ve realized that interdisciplinary work, and finding connections between different fields, is a hard and rare thing to do, but that it’s also extremely valuable.

Drawings by Aaron Hertzmann using Adobe Fresco

I’ve also been thinking about the ways creativity and innovation in art and research are a lot alike. They’re open-ended processes, and the initial intention is just a starting point. You don’t just execute on your goals—you figure out your goals while working.

When I draw pictures or write, I often don’t succeed at my initial intention, but I end up doing something that’s surprising and more interesting. And with research, the best projects start with one goal and end up being something totally different because of the things we discover as we go. I think it’s really important for beginning artists and researchers to develop the skills and intuition to explore ideas without a fixed plan. That’s how creativity happens.

Wondering what else is happening inside Adobe Research? Check out our latest news here. You can also learn more about Aaron Hertzmann’s theories here.
