Tables and Figures Detection Using Layout Parser
View the complete implementation in Google Colab: Tables and Figures Detection Notebook
Introduction
Layout Parser is a toolkit for Document Layout Analysis that helps detect and extract various elements from documents, including tables, figures, text blocks, and more. It uses deep learning models trained on large datasets like PubLayNet to identify different components in document images.
Installation
Install LayoutParser and its dependencies:
pip install layoutparser
# If you encounter any problems with this installation, refer to readme.md
pip install "detectron2@git+https://github.com/facebookresearch/detectron2.git@v0.5#egg=detectron2"
pip install "layoutparser[layoutmodels]"
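After installing, it can help to confirm that the packages are importable before loading a model. The following is a minimal sketch: the helper name `missing_packages` is hypothetical (it is not part of LayoutParser), and the module names it checks are assumed to match the packages installed above.

```python
import importlib.util


def missing_packages(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed top-level module names for the packages installed above.
    missing = missing_packages(["layoutparser", "detectron2"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

If `detectron2` shows up as missing, the installation note above about readme.md is the place to look first.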
Components
Model Initialization
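import layoutparser as lp

# The model-level cutoff uses the lower of the two thresholds so detections of either type are kept
table_threshold, figure_threshold = 0.3, 0.8
model = lp.Detectron2LayoutModel(
    'lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config',
    extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", min(table_threshold, figure_threshold)],
    label_map={0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"}
)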
For customization:
- Modify label_map to detect different elements
- Adjust thresholds for detection sensitivity
- Use different pre-trained models (e.g., lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config, trained on the PRImA layout analysis dataset)
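Because Detectron2's `MODEL.ROI_HEADS.SCORE_THRESH_TEST` is a single cutoff shared by all classes, the model initialization above passes the smaller of the two per-type thresholds. A minimal sketch of that idea follows, assuming a hypothetical helper `build_extra_config` that accepts any number of per-type thresholds; it is an illustration, not code from the notebook.

```python
def build_extra_config(thresholds):
    """Build an extra_config list from a dict of per-type thresholds.

    The model-level cutoff is set to the lowest per-type threshold so that
    every candidate that could pass a per-type check is still returned.
    """
    if not thresholds:
        raise ValueError("at least one threshold is required")
    for name, value in thresholds.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"threshold for {name!r} must be in [0, 1], got {value}")
    return ["MODEL.ROI_HEADS.SCORE_THRESH_TEST", min(thresholds.values())]


extra_config = build_extra_config({"Table": 0.3, "Figure": 0.8})
print(extra_config)  # ['MODEL.ROI_HEADS.SCORE_THRESH_TEST', 0.3]
```

The resulting list can be passed as `extra_config` when constructing the model.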
Block Type Detection
def get_block_type(block):
    """Helper function to safely get block type from layout detection"""
    if hasattr(block, 'type'):
        return block.type
    if hasattr(block, 'label'):
        if isinstance(block.label, str):
            return block.label
        if isinstance(block.label, (int, float)):
            type_mapping = {
                0: 'Text',
                1: 'Title',
                2: 'List',
                3: 'Table',
                4: 'Figure'
            }
            return type_mapping.get(int(block.label), 'Unknown')
    return 'Unknown'
For customization:
- Add new types to type_mapping
- Modify return values for different classification needs
- Add custom type detection logic
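One way to add new types without editing the function body is to pass the mapping in as an argument. The sketch below is an illustration only: `get_block_type_with_mapping` and the extra label `5: 'Equation'` are hypothetical, and the stand-in block objects are built with `SimpleNamespace` rather than real LayoutParser blocks.

```python
from types import SimpleNamespace

DEFAULT_TYPE_MAPPING = {0: 'Text', 1: 'Title', 2: 'List', 3: 'Table', 4: 'Figure'}


def get_block_type_with_mapping(block, type_mapping=None):
    """Resolve a block's type, using a caller-supplied label mapping if given."""
    mapping = DEFAULT_TYPE_MAPPING if type_mapping is None else type_mapping
    if hasattr(block, 'type'):
        return block.type
    label = getattr(block, 'label', None)
    if isinstance(label, str):
        return label
    if isinstance(label, (int, float)):
        return mapping.get(int(label), 'Unknown')
    return 'Unknown'


# Hypothetical mapping with one extra class (not part of PubLayNet).
custom_mapping = {**DEFAULT_TYPE_MAPPING, 5: 'Equation'}

print(get_block_type_with_mapping(SimpleNamespace(label=3)))                  # Table
print(get_block_type_with_mapping(SimpleNamespace(label=5), custom_mapping))  # Equation
print(get_block_type_with_mapping(SimpleNamespace(label=5)))                  # Unknown
```

Keeping the default mapping as a module-level constant means existing callers behave as before, while a model trained with a different label set can supply its own mapping.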
Visualization
from PIL import ImageDraw

def create_visualization(image, detected_elements, show_plot=True):
    """Create visualization of detected tables and figures"""
    viz_image = image.copy()  # Draw on a copy so the original image is untouched
    draw = ImageDraw.Draw(viz_image)
    # Customize colors and labels for different element types
    element_styles = {
        'tables': {'color': 'red', 'label': 'Table'},
        'figures': {'color': 'green', 'label': 'Figure'}
    }
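The excerpt above stops after defining the styles. A minimal sketch of how the drawing step might continue is shown here. The structure of `detected_elements` (a dict with 'tables' and 'figures' lists, each item holding a `bbox` tuple `(x1, y1, x2, y2)` and a `score`) is an assumption made for illustration, not the notebook's actual format; the function name `draw_detections` is hypothetical, and the `show_plot` option is left out.

```python
from PIL import Image, ImageDraw

ELEMENT_STYLES = {
    'tables': {'color': 'red', 'label': 'Table'},
    'figures': {'color': 'green', 'label': 'Figure'},
}


def draw_detections(image, detected_elements):
    """Return a copy of `image` with a labeled box drawn around each detection."""
    viz_image = image.copy()
    draw = ImageDraw.Draw(viz_image)
    for kind, style in ELEMENT_STYLES.items():
        for index, element in enumerate(detected_elements.get(kind, []), start=1):
            x1, y1, x2, y2 = element['bbox']  # assumed (x1, y1, x2, y2) in pixels
            draw.rectangle([x1, y1, x2, y2], outline=style['color'], width=3)
            caption = f"{style['label']} {index} ({element['score']:.2f})"
            # Place the caption just above the box, clamped to the image top.
            draw.text((x1, max(0, y1 - 12)), caption, fill=style['color'])
    return viz_image


# Usage with a blank page and one made-up table detection.
page = Image.new('RGB', (200, 120), 'white')
annotated = draw_detections(page, {'tables': [{'bbox': (20, 30, 180, 100), 'score': 0.91}]})
```

Returning the annotated copy rather than displaying it keeps the function usable in scripts as well as notebooks; the caller can decide whether to show or save the result.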
Detection Processing
def process_single_page(image_path, table_threshold=0.3, figure_threshold=0.8):
    """Process a single page to detect tables and figures"""
Parameters to adjust:
- table_threshold: Lower values detect more tables but may increase false positives
- figure_threshold: Higher values ensure more confident figure detection
- Add new threshold parameters to support more element types
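Since the two thresholds differ, one plausible way to apply them is to filter the model's detections per type after inference. The sketch below assumes each detected block exposes `type` and `score` attributes (as LayoutParser's `TextBlock` does); `filter_by_threshold` and the stand-in blocks are hypothetical, and the notebook's actual logic may differ.

```python
from types import SimpleNamespace


def filter_by_threshold(blocks, thresholds):
    """Group blocks by type, keeping those whose score meets that type's threshold.

    Types without an entry in `thresholds` are ignored.
    """
    kept = {block_type: [] for block_type in thresholds}
    for block in blocks:
        threshold = thresholds.get(block.type)
        if threshold is not None and block.score >= threshold:
            kept[block.type].append(block)
    return kept


# Stand-in detections; real ones would come from the layout model.
blocks = [
    SimpleNamespace(type='Table', score=0.45),
    SimpleNamespace(type='Figure', score=0.45),
    SimpleNamespace(type='Figure', score=0.92),
    SimpleNamespace(type='Text', score=0.99),
]
result = filter_by_threshold(blocks, {'Table': 0.3, 'Figure': 0.8})
print({k: [b.score for b in v] for k, v in result.items()})
# {'Table': [0.45], 'Figure': [0.92]}
```

Supporting another element type then only requires adding an entry to the thresholds dict, for example `'List': 0.5`.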
Usage Examples
Basic usage with default thresholds:
result = process_single_page("path/to/document.png")
Adjust detection sensitivity:
# More lenient detection
result_lenient = process_single_page(
    "path/to/document.png",
    table_threshold=0.1,
    figure_threshold=0.6
)
# Stricter detection
result_strict = process_single_page(
    "path/to/document.png",
    table_threshold=0.5,
    figure_threshold=0.9
)