Historical Document Image Binarization – A Review

Springer Nature Computer Science

Publication date: May 16, 2020

Chris Tensmeyer, Tony Martinez

This review provides a comprehensive view of the field of historical document image binarization with a focus on the contributions made in the last decade. After the introduction of a standard benchmark dataset with the 2009 Document Image Binarization Contest, research in the field accelerated. Besides the standard methods for image thresholding, preprocessing, and post-processing, we review the literature on methods such as statistical models, pixel classification with learning algorithms, and parameter tuning. In addition to reviewing binarization algorithms, we discuss the available public datasets and evaluation metrics, including those that require pixel-level ground truth and those that do not. We conclude with recommendations for future work.

Learn More