A new feature in Adobe Firefly built in collaboration with Adobe Research is making text-to-image generation more accessible for blind and low-vision (BLV) creators. It automatically produces detailed alt text descriptions of each generated or edited image, giving creators the information they need to choose images for their presentations, articles, social posts, and more.
The challenge: Making generated images explain themselves
Alt text is a brief description attached to a digital image and read aloud by screen readers to make visual content accessible. The Adobe team wanted to use alt text to solve a basic problem: When users enter a prompt to generate an image, they get back a description that is “exactly the same as their prompt, which is not very useful for blind or low-vision users,” says Josh Myers-Dean, an Adobe Applied Scientist who has worked on the Firefly Services team and as an Adobe Research intern. He spearheaded the design and implementation of the new feature.
Instead, BLV users “need to know if an image is faithful to their prompt, and they want details that help them choose the image they prefer,” he says.
Here’s how the new feature addresses this challenge: When a creator enters a prompt to edit or generate an image in Firefly, each image now comes with a detailed description of its elements and tone. For example, when Myers-Dean tested a prompt for an image of a mushroom in a forest in the Pacific Northwest, the alt text described the damp moss on the forest floor, the nearby pinecone, ferns, and fallen cedar branch, and the towering conifers in the background.
“Now in Firefly, we’re able to generate alt text in a way that’s fairer, with higher fidelity to the generated images,” Myers-Dean says.
The feature will empower more BLV users to create and choose images that match their creative ideas. And, Myers-Dean notes, providing automatic alt text in Firefly could help encourage more people to include alt text when they publish their work online, increasing image accessibility across the internet.
From Research internships to a finished feature
About three years ago, Adobe Senior Research Scientist Dingzeyu Li became interested in making creative authoring tools more accessible. He was inspired by a collaboration with his then-intern Mina Huh, who focused her summer’s research on accessibility. Li and Huh’s work kickstarted the idea that custom alt text could help BLV creators get more out of the power of generative AI.
“We have all of these generative models that are designed to make creation more equitable and accessible, but they haven’t always been designed to give opportunities to blind and low-vision users,” says Li. “So we started thinking about projects like alt text in Firefly that could help BLV users do the same thing we all love to do: compare and contrast generated images and pick the ones we like.”
Li and Huh launched an Adobe Research Slack channel for brainstorming about accessibility, which attracted another intern at the time, Myers-Dean. Meanwhile, generative models grew more robust and it became more feasible to provide tailored alt text for each generated image—and Myers-Dean finished up his PhD and joined Adobe full-time.
One of Myers-Dean’s first projects in his new role at Adobe was to help lead the collaboration to bring alt text to life in Firefly. The work included deep contributions from Adobe Research and the Applied Science and Machine Learning Group inside the Firefly team. “It was a really cool, holistic effort,” says Myers-Dean.
Li hopes the new alt text feature will make image generation in Firefly accessible for many more users, and he believes that accessibility is a useful metric for thinking about the progress we’re making with AI. “One way to test if AI is working well is to look at it in the real world,” he explains. “I think it’s important for AI to address the day-to-day needs for all of our users, and give them more tools to express their creativity.”
To see this feature in action, take a look at the video at the top of this post: When Myers-Dean wrote a prompt for an image of a mushroom in a forest in the Pacific Northwest, the alt text also described the scene’s tone and the mushroom’s surroundings.