MAGICK: A Large-scale Captioned Dataset from Matting Generated Images using Chroma Keying

CVPR 2024

Publication date: June 19, 2024

Ryan Burgert, Brian Price, Jason Kuen, Yijun Li, Michael Ryoo

We introduce MAGICK, a large-scale dataset of generated objects with high-quality alpha mattes. While image generation methods have produced segmentations, they cannot generate alpha mattes with accurate details in hair, fur, and transparencies. This is likely due to the small size of current alpha matting datasets and the difficulty of obtaining ground-truth alpha. We propose a scalable method for synthesizing images of objects with high-quality alpha that can be used as a ground-truth dataset. A key idea is to generate objects on a single-colored background so that chroma keying approaches can be used to extract the alpha. However, this approach faces several challenges: current text-to-image generation methods cannot create images that are easily chroma keyed, and chroma keying is an underconstrained problem that generally requires manual intervention for high-quality results. We address these challenges using a combination of generation and alpha extraction methods. Using our method, we generate a dataset of 150,000 objects with alpha. We show the utility of our dataset by training an alpha-to-RGB generation method that outperforms baselines. Our dataset will be released to the public upon publication.
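
To make concrete why chroma keying is underconstrained, the following is a minimal sketch of a naive green-screen keyer. It is not the extraction method used for MAGICK. It assumes a pure green background B = (0, 1, 0) and the compositing model I = alpha * F + (1 - alpha) * B, which gives three equations per pixel for four unknowns (alpha and the three foreground channels). The sketch closes the gap with one illustrative assumption, F_g = k * max(F_r, F_b). The function names `composite_over` and `green_key_alpha` and the parameter `k` are hypothetical and chosen only for this example.

```python
import numpy as np

# Assumed background for this sketch: pure green in [0, 1] RGB.
GREEN = np.array([0.0, 1.0, 0.0])


def composite_over(fg, alpha, bg=GREEN):
    """Compositing equation: I = alpha * F + (1 - alpha) * B."""
    a = np.asarray(alpha, dtype=float)[..., None]
    return a * np.asarray(fg, dtype=float) + (1.0 - a) * bg


def green_key_alpha(image, k=1.0):
    """Naive alpha estimate for an image composited over pure green.

    Illustrative assumption (not from the paper): the foreground green
    channel satisfies F_g = k * max(F_r, F_b). Substituting this into the
    compositing equation gives alpha = 1 - I_g + k * max(I_r, I_b).
    """
    image = np.asarray(image, dtype=float)
    r, g, b = image[..., 0], image[..., 1], image[..., 2]
    alpha = 1.0 - g + k * np.maximum(r, b)
    return np.clip(alpha, 0.0, 1.0)


# A half-transparent gray pixel satisfies the assumption, so alpha is recovered.
gray_pixel = composite_over([0.5, 0.5, 0.5], 0.5)
alpha_gray = green_key_alpha(gray_pixel)

# A half-transparent pure red pixel violates the assumption (F_g = 0),
# so the same keyer misreads it as fully opaque.
red_pixel = composite_over([1.0, 0.0, 0.0], 0.5)
alpha_red = green_key_alpha(red_pixel)

# A bare background pixel is keyed out completely.
alpha_bg = green_key_alpha(GREEN)
```

The gray and red pixels share the same true alpha of 0.5, yet only the gray one is recovered correctly. Any fixed prior on the foreground color fails for some inputs in this way, which is consistent with the abstract's remark that high-quality keying generally requires manual intervention.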