Deep Splitting and Merging for Table Structure Decomposition

Given the large variety and complexity of tables, table structure extraction is a challenging task in automated document analysis systems. We present a pair of novel deep learning models (the Split and Merge models) that, given an input image, 1) predict the basic table grid pattern and 2) predict which grid elements should be merged to recover cells that span multiple rows or columns. We propose projection pooling as a novel component of the Split model and grid pooling as a novel component of the Merge model. While most Fully Convolutional Networks rely on local evidence, these unique pooling regions allow our models to take advantage of the global table structure. We achieve state-of-the-art performance on the public ICDAR 2013 Table Competition dataset of PDF documents. On a much larger private dataset, which we used to train the models, we significantly outperform both a state-of-the-art deep model and a major commercial software system.
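The abstract names projection pooling and grid pooling without defining them. The sketch below shows one plausible reading, under stated assumptions: projection pooling replaces each value in a feature map with the mean of its row (or column), and grid pooling replaces each value with the mean of the grid cell that contains it. The function names, the plain-list feature map, and the `row_seps`/`col_seps` separator indices are all hypothetical and chosen for illustration; this is not the authors' implementation.

```python
from typing import List

Matrix = List[List[float]]


def row_projection_pool(features: Matrix) -> Matrix:
    """Replace every value with the mean of its row (output keeps the input shape)."""
    return [[sum(row) / len(row)] * len(row) for row in features]


def column_projection_pool(features: Matrix) -> Matrix:
    """Replace every value with the mean of its column (output keeps the input shape)."""
    n_rows = len(features)
    col_means = [sum(col) / n_rows for col in zip(*features)]
    return [list(col_means) for _ in features]


def grid_pool(features: Matrix, row_seps: List[int], col_seps: List[int]) -> Matrix:
    """Replace every value with the mean of the grid cell that contains it.

    Assumption: a separator at index i means a new grid row (or column)
    begins at index i. These separators are hypothetical inputs standing in
    for a predicted grid.
    """
    n_rows, n_cols = len(features), len(features[0])
    row_bounds = [0] + sorted(row_seps) + [n_rows]
    col_bounds = [0] + sorted(col_seps) + [n_cols]
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for r0, r1 in zip(row_bounds, row_bounds[1:]):
        for c0, c1 in zip(col_bounds, col_bounds[1:]):
            cell = [features[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            if not cell:
                continue  # skip empty regions caused by duplicate separators
            mean = sum(cell) / len(cell)
            for r in range(r0, r1):
                for c in range(c0, c1):
                    out[r][c] = mean
    return out


if __name__ == "__main__":
    fmap = [[1.0, 2.0, 3.0, 4.0],
            [5.0, 6.0, 7.0, 8.0]]
    print(row_projection_pool(fmap))
    print(column_projection_pool(fmap))
    print(grid_pool(fmap, row_seps=[1], col_seps=[2]))
```

In each case the output has the same shape as the input, so every location carries a summary of an entire row, column, or cell. That is one way a convolutional model could see evidence beyond its local receptive field, which is the motivation the abstract gives for these pooling regions.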

International Conference on Document Analysis and Recognition (ICDAR)

Publication date: September 23, 2019

Chris Tensmeyer, Vlad Morariu, Brian Price, Scott Cohen, Tony Martinez

Oral

Research Areas: AI & Machine Learning; Computer Vision, Imaging & Video; Document Intelligence