Helping people create accessible PDFs and tap into the intelligence of their documents

December 19, 2024

Tags: AI & Machine Learning, Document Intelligence, Researcher Spotlights

When Research Scientist Alexa Siu was an undergrad, she thought she might like to study medicine. But looking at cells in a lab didn’t interest her nearly as much as working with people, understanding their needs, and figuring out how technology could help.

Siu found her true passion when she discovered human-computer interaction and human-centered design. Since then, she’s been working on building smarter and more accessible technology. That work includes developing creative tools tailored to the disability community and a project that makes it easier to create accessible documents. She has also collaborated on the AI Assistant for Acrobat, which helps everyone do more with the information inside their PDFs. Siu’s work has been recognized with a Tech Excellence Award from Adobe, and the AI Assistant for Acrobat was recently named to Time magazine’s list of the best inventions of 2024.

Can you tell us a bit about your graduate research and how that sparked your interest in accessibility?

When I started my PhD program at Stanford, I was also doing volunteer work in the disability community in the Bay Area. I observed that people who are visually impaired often depend on others to be able to create. That got me thinking about interfaces, and how you can communicate information in different, multimodal ways, including haptics and audio.

With this in mind, I focused my research on applying human-computer interaction and design principles to help improve creative design workflows for people who are blind or visually impaired. I also wanted to improve how people access complex information.

One of my projects helped improve access to 3D design tools using multimodal feedback, including haptics and audio feedback. We used an array of pins to render shapes as people design them, allowing users to interact multimodally as they create an object in 3D. With this work, I wanted to empower people to create on their own. And when creation is accessible to more people, we also reduce the cycle of people creating inaccessible content. I also worked on a project to make data visualizations more accessible by using audio. For example, let’s say you’re visualizing population growth. The technology would give context about the graph and then it would include sounds to indicate the progression of the data, along with commentary about important data points.
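To make the audio idea concrete, here is a minimal sketch of one common sonification technique: mapping each data value linearly onto a pitch range, so rising data is heard as rising tones. This is not the project's actual implementation. The function name, the frequency range, and the population figures are all assumptions chosen for illustration.

```python
def values_to_pitches(values, low_hz=220.0, high_hz=880.0):
    """Map each data value linearly onto a frequency range (in Hz).

    Hypothetical helper: the 220-880 Hz range is an assumed default,
    not a value taken from the research described above.
    """
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        # Flat data: play every point at the midpoint of the range.
        return [(low_hz + high_hz) / 2.0 for _ in values]
    span = hi - lo
    return [low_hz + (v - lo) / span * (high_hz - low_hz) for v in values]


# Made-up population figures (in millions), for illustration only.
population = [100, 150, 200, 300]
pitches = values_to_pitches(population)
print(pitches)
```

The smallest value plays at the bottom of the range and the largest at the top. A real system would also need to synthesize the tones and add the spoken context and commentary described above.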

How have you brought your expertise in accessibility to your work with Adobe Research?

One project we’re working on now, in partnership with the University of Maryland, is about accessible authoring for Adobe Express documents. Usually, after you’ve created a PDF, you have to go back and add tagging or semantic data to make the document accessible. That means accessibility is an extra step after the fact—and many people don’t even know about it. As a result, there are a lot of inaccessible documents out there.

In our project, we’re rethinking the document creation workflow by adding prompts and smart suggestions for accessibility as you’re authoring. It’s just like when you start a list with one bullet point, and then all you have to do is press enter to add another bulleted item. In this same way, accessibility becomes part of the authoring process, so it’s more efficient and the end result is more accessible.

You’re also working on document intelligence. Can you tell us more about that?

Yes, we’re working on AI technologies and new experiences to help knowledge workers consume, organize, and synthesize information from documents. This provides a new dimension to accessible technology: knowledge accessibility through generative AI. Knowledge accessibility offers the means to access and consume information in a style that matches people’s expectations and individual cognitive strengths and expertise.

When we look at knowledge workers’ current workflows, the work happens document by document. For example, if you’re a financial analyst reviewing hundreds or thousands of contracts to make recommendations or summarize patterns, or if you’re an HR person reviewing and extracting information from hundreds of resumes, or a procurement specialist looking at vendor contracts, there’s a lot of manual work to open each document and find information. But there’s a higher end goal: you’re creating a report or a presentation or making a decision, and that’s where your expertise is really needed.

So we looked at the time it takes to forage for information, which turns out to be tedious and very time-consuming. From there, we created AI technology that extracts information from a document or even a whole collection of documents. This allows people to spend less time finding information and more time thinking about how they want to use the information.

We have created different ways for users to leverage AI when they work with information. One is asking questions about documents. When our AI provides answers, we also give users the source so that they can verify or complement their understanding. Other times, when a user wants to be closer to the text, they can request and read relevant snippets. It really depends on the tasks they’re doing and what level of AI collaboration they need. We’re also thinking about how AI can more proactively provide insights and support to users in the future. In cases where you want deep comprehension, an AI conversation could go deep into a topic to help a user really understand it.

This research is behind the new AI Assistant for Acrobat, which was recently recognized by Time magazine as one of the best inventions of 2024. Can you tell us about that process of going from research to product?

The process, and the recognition of our work, have both been very exciting. When I first started working at Adobe in 2021, right after I completed my PhD, using AI to extract insights from documents seemed like a very far-off goal. Then AI began moving much more quickly and there was a lot of interest in our prototype. We started working closely with a small group of people from the product team, narrowing down to the specific things we wanted to move to production. From there, we got to work with design, product managers, and engineering. It was such a great experience to take our ideas from research prototypes and make them robust enough for production.

You recently received a Tech Excellence Award from Adobe. Congratulations! Can you tell us about the honor?

The award was related to my contributions to the AI Assistant for Acrobat, especially on the evaluation side. One thing that’s challenging with AI is that the output is not necessarily predictable or deterministic, so you can’t easily write a test for it. The way I think about evaluation is that there are different layers of understanding. You can do user research where you observe people using the technology and how it’s helping them. But that can be very time-consuming and you can’t do it continuously.

At the other end of the spectrum, there’s evaluating text with very specific metrics like accuracy, readability, and repetitiveness. So, I helped develop an approach with different layers of evaluation, which allows us to do some evaluations quickly during development, and then use different evaluations at later stages to give us a more complete picture.
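As an illustration of the fast, metric-based end of that spectrum, here is a minimal sketch of one possible repetitiveness check: the fraction of word n-grams in a text that duplicate an earlier n-gram. This is an assumed example of such a metric, not the evaluation approach used for the AI Assistant. The function name and the choice of three-word n-grams are hypothetical.

```python
def repeated_ngram_ratio(text, n=3):
    """Return the share of word n-grams that repeat an earlier n-gram.

    Hypothetical metric for illustration: 0.0 means no n-gram repeats,
    and values closer to 1.0 mean the text is highly repetitive.
    """
    words = text.lower().split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        # Text shorter than n words has nothing to compare.
        return 0.0
    return 1.0 - len(set(ngrams)) / len(ngrams)


# A deliberately repetitive sample and a non-repetitive one.
repetitive_score = repeated_ngram_ratio("the cat sat the cat sat")
clean_score = repeated_ngram_ratio("a b c d")
print(repetitive_score, clean_score)
```

A check like this runs in milliseconds, so it could be applied to every output during development, while slower layers such as user research are reserved for later stages.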

What do you enjoy most about working at Adobe Research?

I enjoy constantly learning from people with diverse expertise, whether in research, product, design, or engineering. And seeing how the research shapes and translates into real impact makes the work both meaningful and motivating.

