Generating Realistic Binarization Data with Generative Adversarial Networks

International Conference on Document Analysis and Recognition (ICDAR)

Published November 11, 2019

Chris Tensmeyer, Mike Brodie, Daniel Saunders, Tony Martinez

One limitation of using deep learning models for binarization is the lack of large quantities of labeled training data. Existing efforts to create synthetic data for binarization mostly rely on heuristic image processing techniques, and the resulting images generally lack realism. In this work, we propose a method to produce realistic synthetic data using an adversarially trained image translation model. We extend the popular CycleGAN model to be conditioned on the ground truth binarization mask as it translates images from the domain of synthetic images to the domain of real images. For evaluation, we train deep networks on synthetic datasets produced in different ways and measure their performance on the DIBCO datasets. Compared to not pretraining, we reduce error by 13% on average, and compared to pretraining on unrealistic data, we reduce error by 6%. Visually, we show that the DGT-CycleGAN model produces more realistic synthetic data than other models.
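
The abstract does not say how the generator is conditioned on the ground truth mask. As a minimal sketch of one common option, the example below assumes conditioning by channel concatenation: the synthetic image and its binarization mask are stacked into a single array that a generator could take as input. The function name `make_conditioned_input` and the array layout (height, width, channels) are hypothetical and are not taken from the paper.

```python
import numpy as np


def make_conditioned_input(synthetic_image, gt_mask):
    """Stack an image and its binarization mask along the channel axis.

    Assumption: conditioning is done by channel concatenation. The abstract
    does not specify the mechanism, so this is illustrative only.

    synthetic_image: array of shape (H, W) or (H, W, C)
    gt_mask:         array of shape (H, W); nonzero marks foreground
    """
    if gt_mask.ndim != 2:
        raise ValueError("gt_mask must be a 2-D array")
    if synthetic_image.shape[:2] != gt_mask.shape:
        raise ValueError("image and mask must share height and width")

    image = synthetic_image.astype(np.float32)
    if image.ndim == 2:
        # Give grayscale images an explicit channel axis.
        image = image[..., np.newaxis]

    # Binarize the mask to {0, 1} and add a channel axis.
    mask = (gt_mask > 0).astype(np.float32)[..., np.newaxis]

    # The result has C + 1 channels; the mask is the last one.
    return np.concatenate([image, mask], axis=-1)
```

Under this assumption, an RGB image of shape (H, W, 3) becomes a 4-channel input, and the same mask can later serve as the training label for the translated image.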
