Graph Neural Networks for Document Understanding

Document layout analysis remains a foundational task in computer vision and natural-language processing. Traditional CNN-based pipelines treat pages as flat pixel grids and miss the relational structure between visual regions.

2. Methodology

We model each page as a graph in which nodes represent visual regions detected by a backbone CNN, and edges encode spatial proximity and reading-order relationships. A two-layer Graph Attention Network (GAT) then propagates context between connected regions.
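The graph construction and context propagation described above can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation. The `Region` type, the k-nearest-neighbour rule for spatial proximity, the top-to-bottom ordering used for reading-order edges, and the untrained dot-product scoring (swapped in for the learned scoring function of a real Graph Attention Network) are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical detected region: (x0, y0) top-left, (x1, y1) bottom-right."""
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def center(self):
        return ((self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2)


def build_page_graph(regions, k=2):
    """Return directed edges (i, j) over region indices.

    Assumed edge rules (not taken from the source):
      - spatial proximity: each node links to its k nearest neighbours by centre distance
      - reading order: consecutive regions in top-to-bottom, left-to-right order
    """
    n = len(regions)
    edges = set()
    for i in range(n):
        nearest = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(regions[i].center, regions[j].center),
        )
        edges.update((i, j) for j in nearest[:k])
    order = sorted(range(n), key=lambda i: (regions[i].y0, regions[i].x0))
    edges.update(zip(order, order[1:]))
    return edges


def dot(u, v):
    """Assumed attention score: plain dot product, with no learned parameters."""
    return sum(a * b for a, b in zip(u, v))


def attention_layer(features, edges, score):
    """One simplified attention pass.

    Node i takes a softmax-weighted average of its own features and those of
    every node j with an edge (i, j).
    """
    out = []
    for i, h_i in enumerate(features):
        nbrs = [i] + sorted(j for (s, j) in edges if s == i)  # self-loop first
        logits = [score(h_i, features[j]) for j in nbrs]
        peak = max(logits)  # subtract the max for numerical stability
        weights = [math.exp(l - peak) for l in logits]
        total = sum(weights)
        out.append([
            sum(w / total * features[j][d] for w, j in zip(weights, nbrs))
            for d in range(len(h_i))
        ])
    return out


# Toy page: a header, a paragraph, and a table stacked vertically.
regions = [Region(0, 0, 100, 20), Region(0, 30, 100, 80), Region(0, 90, 100, 150)]
edges = build_page_graph(regions, k=1)

# One-hot features stand in for the CNN region embeddings.
h0 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
h2 = attention_layer(attention_layer(h0, edges, dot), edges, dot)  # two layers
```

With two passes, each node receives information from regions up to two hops away: in the toy page, the header has no direct edge to the table, yet its final feature vector contains a non-zero table component. A trained model would replace `dot` with learned projections and attention parameters.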

Method            F1     mAP
CNN baseline      0.891  0.842
DOCUGRAPH (ours)  0.974  0.928

[Figure 1 — Pipeline overview]

Results on the PubLayNet benchmark show consistent improvement across all region categories, with the largest gains on tables and figures.
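As a quick check on the aggregate scores in the table, the absolute gain and a relative error reduction can be computed directly from the reported values. Treating `1 - score` as the "error" is an assumption made here for illustration, and the helper name `gains` is hypothetical. Per-category numbers are not given in the source, so only the overall figures are used.

```python
def gains(baseline, ours):
    """Return (absolute gain, relative error reduction) for a score in [0, 1]."""
    absolute = ours - baseline
    # Assumption: the remaining error is 1 - score.
    error_reduction = absolute / (1.0 - baseline)
    return absolute, error_reduction


# Values copied from the results table: (CNN baseline, DOCUGRAPH).
scores = {"F1": (0.891, 0.974), "mAP": (0.842, 0.928)}

for metric, (base, ours) in scores.items():
    abs_gain, err_red = gains(base, ours)
    print(f"{metric}: +{abs_gain:.3f} absolute, {err_red:.1%} relative error reduction")

# Prints:
# F1: +0.083 absolute, 76.1% relative error reduction
# mAP: +0.086 absolute, 54.4% relative error reduction
```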