Research Scientist David Arbour is using machine learning, along with causal inference, to change the way companies do research. With causal inference methods, researchers can look at data from whole systems and infer the effects of the changes they make — which allows them to answer questions that would otherwise require time-consuming controlled experiments. The tools Arbour works on help Adobe and its customers understand the impact of product changes and get information more quickly so they can make informed business decisions.
We talked to Arbour about how he got interested in causal inference, the kinds of questions he hopes to answer, and how data beats intuition when it comes to planning for what’s next.
You’ve described your research as an intersection between experimentation, causal inference, and machine learning. Can you tell us what sparked your interest, and what kinds of problems you’d like to solve?
Back when I was a master’s student, I was taking a machine learning course and my professor — who happens to be one of my bosses now — brought up causality. We were in the middle of talking about predictive models, and he mentioned that people tend to reason over causes and the actions they should take rather than looking at data and learning from patterns. From then on, I was fascinated with finding better ways to use data to make decisions.
What kinds of questions do you help people answer?
Causality problems tend to pop up in several ways. One example is A/B testing. Most online businesses — including Adobe — are constantly running A/B tests to understand how new features and other product changes impact users. Our research is changing how people run this kind of experimentation: they can now monitor an A/B test continuously, rather than waiting until the end of the experiment to look at the data, while still maintaining the statistical guarantees of the classical fixed-horizon setting.
So, say you’re doing an experiment where you’ve changed something about your webpage and half of your traffic sees the new site and half doesn’t. If you’re monitoring continuously, then you can stop an experiment early if the impact of your change is really good — for example, you can let everyone access a new feature right away if you know that it works well and makes customers happy. On the other hand, if your results show that a change isn’t working well, you can stop the experiment quickly. Sometimes, the effect of a change is smaller than you expected. In those cases, you can keep the experiment running longer and see what you find.
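The always-valid monitoring described above can be sketched with a mixture sequential probability ratio test (mSPRT), one standard way to make continuous peeking safe. This is a minimal illustration, not Adobe's actual implementation; the function names, the known-variance normal approximation, and the mixture variance `tau2` are all assumptions made for the sketch.

```python
import math

def msprt_lambda(mean_diff, var, n, tau2):
    """Mixture likelihood ratio for a normal-approximation mSPRT.

    mean_diff: observed difference in sample means after n observations
    var:       (assumed known) variance of a single pairwise difference
    n:         number of paired observations seen so far
    tau2:      variance of the normal mixing distribution over effect sizes
    """
    scale = var / (var + n * tau2)
    exponent = (n ** 2) * tau2 * mean_diff ** 2 / (2 * var * (var + n * tau2))
    return math.sqrt(scale) * math.exp(exponent)

def monitor(stream_a, stream_b, var, alpha=0.05, tau2=0.01):
    """Watch paired treatment/control observations as they arrive and stop
    as soon as the mixture likelihood ratio crosses 1/alpha.
    Returns (stopped_early, observations_used)."""
    total_diff, n = 0.0, 0
    for a, b in zip(stream_a, stream_b):
        n += 1
        total_diff += a - b
        if msprt_lambda(total_diff / n, var, n, tau2) >= 1 / alpha:
            return True, n  # evidence of a real difference; safe to stop now
    return False, n
```

The key property is that, when there is truly no difference, the probability that the likelihood ratio ever crosses 1/alpha is at most alpha — so checking after every single observation does not inflate the false-positive rate, which is exactly what lets you stop a clearly winning (or clearly failing) experiment early.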
Another example is monitoring the impact of a change when an A/B test doesn’t make sense. Maybe you’re making a sensitive update and you don’t want to disrupt the user experience, or perhaps it would be prohibitively expensive to run your experiment. In this case, we can use our tools and observational data to compare users’ behavior before and after the change and then infer the impact.
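One simple way to make that before/after comparison concrete is an interrupted-time-series style estimate: fit the trend on pre-change data, extrapolate it past the change as a rough counterfactual, and average the post-change deviation from it. This is a hedged sketch of the general idea, not the tooling the article describes; the function names are invented for illustration, and real observational analyses would also need to account for seasonality and confounding.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x, in closed form."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b  # intercept, slope

def before_after_effect(series, change_at):
    """Estimate the impact of a change made at index `change_at`:
    extrapolate the pre-change trend forward and average how far the
    post-change observations deviate from that counterfactual."""
    pre_x = list(range(change_at))
    a, b = fit_line(pre_x, series[:change_at])
    deviations = [series[t] - (a + b * t) for t in range(change_at, len(series))]
    return sum(deviations) / len(deviations)
```

On a synthetic series with a steady trend plus a jump of 5 at the change point, the estimate recovers the jump exactly, because the pre-change fit matches the trend.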
Causal inference also lets us look at a set of data that might be confounded by outside factors, account for these, and then infer the effect of a change. This takes everything one step further — we’re discovering causal relationships from the whole system so we can act more intelligently.
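The simplest version of "accounting for outside factors" is backdoor adjustment by stratification: split the data by the confounder, compare treated and untreated outcomes within each stratum, and average the per-stratum differences weighted by stratum size. A minimal sketch, assuming a single discrete confounder; the function name and data layout are hypothetical, not the article's tooling.

```python
from collections import defaultdict

def adjusted_effect(records):
    """Stratified (backdoor-adjusted) treatment-effect estimate.

    records: iterable of (confounder, treated, outcome) tuples with a
    single discrete confounder.
    """
    strata = defaultdict(lambda: {True: [], False: []})
    for z, treated, y in records:
        strata[z][bool(treated)].append(y)
    total = sum(len(g[True]) + len(g[False]) for g in strata.values())
    effect = 0.0
    for g in strata.values():
        if not g[True] or not g[False]:
            continue  # no overlap in this stratum; its weight is dropped
        diff = sum(g[True]) / len(g[True]) - sum(g[False]) / len(g[False])
        effect += diff * (len(g[True]) + len(g[False])) / total
    return effect
```

The weighting is what prevents a confounded naive comparison (e.g., a Simpson's-paradox reversal) from misleading you: each stratum's treated-vs-untreated difference is computed among comparable users before being averaged.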
How are these new methods of research different from the ways we’re used to looking at data and making decisions?
If you think of the classic data analytics framework, people take statistics and combine them with business reasoning and intuition and say, ‘This is what I think this means.’ With causal inference and machine learning, we can take that intuition, make an explicit hypothesis, and then test it. From there, we can model the data and patterns in a really rich way and get precise answers.
How is your research impacting Adobe and Adobe’s customers?
One of the great things about working in causal inference at Adobe Research is that it’s relevant for a broad set of our company’s applications. We run a lot of experiments to help make Adobe products better, whether that’s measuring the impact of product changes or monitoring systems to find the root cause of problems so we can fix them quickly.
We’ve also built our technology for continuous monitoring of experiments into Adobe’s Customer Journey Analytics. It’s so rewarding to help customers ask questions that let them understand their businesses better — and then distill complex and nuanced data into answers they can use to make decisions.
Where do you think causal inference research is headed next? What are you most excited about?
I think we’ll be using ideas from causal inference and machine learning to answer questions that used to be impossible. For example, maybe your data is in text or images. Or perhaps, instead of focusing on short-term impacts, you want to know how happy a customer is over several years. That’s not something we can model well yet, so people have to let their intuition do all the work. Eventually we’ll be able to do these things better and better with machine learning. And that opens the door to answering even more questions — questions we haven’t even thought to ask before.