Humans are good at figuring out where sounds come from. “We can do that with up to one degree of accuracy,” says Timothy Langlois, research scientist at Adobe Research.
What we are not yet so good at is recording the sounds coming from the environment around us, especially for 360 video.
The type of audio recording needed for immersive video, known as ambisonic sound, can represent where in space sounds emanate from, that is, their directionality. This kind of rich sound is now playable on many formats, including on the popular video platform YouTube, and is in demand for virtual and augmented reality.
But there’s a problem. When immersive videos are being filmed, traditional microphones are unable to capture the directional nature of the sound. In post-production, it becomes very hard to make the sound seem to come from the right place. Newer 360 cameras can record ambisonic sound, but it remains difficult to edit. “There aren’t many good tools to help with editing spatialized sounds currently,” Langlois says.
Now, Adobe Research scientists and colleagues—including Stephen DiVerdi, Yaniv de Ridder, and Langlois—have found a promising experimental solution.
Their approach, Sonic Scape, helps editors “see” ambisonic sounds in a video using color particles, enabling a unique style of visual editing of the sound’s origin in space and time. In this system, colored dots help show where a sound is coming from. These dots can be moved around the video to match where they actually originated.
“Sonic Scape visualizes the directionality of the sound—which is much more than just listening to it—while you are editing a video,” Langlois explains. The colors portrayed depend directly on the sound’s origin. “It draws a heatmap of the magnitude of the sounds coming from each direction,” he says. The visualizing element was developed jointly by DiVerdi, principal scientist at Adobe Research, and de Ridder, experience developer lead, Adobe Design.
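The article doesn’t describe how the heatmap is computed. As a rough sketch of the general idea, assuming a first-order ambisonic (B-format) recording in ACN channel order (W, Y, Z, X), one way to estimate the magnitude of sound arriving from each direction is to point a virtual cardioid microphone around the horizontal plane and measure the energy each direction picks up; the function name and parameters below are illustrative, not from Sonic Scape:

```python
import numpy as np

def directional_magnitudes(bformat, n_azimuths=36):
    """Estimate sound magnitude arriving from each horizontal direction.

    bformat : (4, n) first-order ambisonic frame, ACN order (W, Y, Z, X).
    Returns (azimuths, magnitudes), where magnitudes[i] is the RMS level
    picked up by a virtual cardioid microphone aimed at azimuths[i].
    """
    w, y, z, x = bformat
    azimuths = np.linspace(0.0, 2.0 * np.pi, n_azimuths, endpoint=False)
    mags = []
    for az in azimuths:
        # Cardioid virtual mic: half omni (W) plus half figure-eight
        # steered toward this azimuth in the horizontal plane.
        signal = 0.5 * w + 0.5 * (x * np.cos(az) + y * np.sin(az))
        mags.append(np.sqrt(np.mean(signal ** 2)))
    return azimuths, np.array(mags)
```

Mapping each azimuth’s magnitude to a color intensity at the corresponding screen position would then yield a heatmap like the one Langlois describes.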
Langlois’ “ambisonics library” played a major role in the project. “It’s a bit of code that can process these ambisonic formats,” he explains. “The library can take ambisonic sound and play it correctly on headphones. It can also take individual sounds, along with their desired positions, and turn them into the ambisonic format.”
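The library itself isn’t public, but the second capability Langlois mentions, turning an individual sound plus a desired position into the ambisonic format, is standard first-order ambisonic encoding. A minimal sketch, assuming ACN channel ordering with SN3D normalization (the convention YouTube uses); the function name is hypothetical:

```python
import numpy as np

def encode_first_order(mono, azimuth, elevation):
    """Encode a mono signal at a given direction into first-order
    ambisonics (B-format), ACN order with SN3D normalization.

    mono      : 1-D array of audio samples
    azimuth   : radians, counterclockwise from straight ahead
    elevation : radians, upward from the horizontal plane
    Returns a (4, n) array of channels: W, Y, Z, X.
    """
    w = mono                                         # omnidirectional
    y = mono * np.sin(azimuth) * np.cos(elevation)   # left-right
    z = mono * np.sin(elevation)                     # up-down
    x = mono * np.cos(azimuth) * np.cos(elevation)   # front-back
    return np.stack([w, y, z, x])
```

Dragging a colored dot in Sonic Scape’s interface could then amount to re-encoding that sound with a new azimuth and elevation.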
The focus of a MAX Sneak in fall 2017, Sonic Scape drew excited cheers from the audience when the screen showed visuals of red and yellow colors representing bird calls in a seashore scene. The editor was able to position the sounds to align with the birds themselves, creating a whole new auditory experience.
The team continues to pursue this research pathway. In the future, their system may even be able to take the geometry of the scene into account, allowing truly realistic sound in immersive environments.
Yaniv de Ridder (center) and colleagues try out new 360 editing techniques.