Plan2Map: A Multimodal Benchmark for Document-Grounded Geospatial Boundary Reconstruction from Planning Records
Summary
Plan2Map is a 208-case multimodal benchmark for document-grounded geospatial boundary reconstruction from UK planning records. A system must reconstruct a valid geospatial boundary in GeoJSON format solely from the source planning document, which includes notice text, schedules, map plates, map labels, and boundary annotations. The authors introduce GeoPlanAgent, a document-grounded system with geospatial tools in the loop that breaks the task into evidence extraction, localisation, map registration, boundary segmentation, projection, and verification. GeoPlanAgent achieves a mean intersection-over-union (IoU) of 0.736 and a median IoU of 0.904 on Plan2Map, with 67.8% of predictions at or above 0.8 IoU, outperforming baselines that ask a vision-language model (VLM) to emit GeoJSON directly. Diagnostic analysis indicates that direct VLM prediction remains unreliable, that GeoPlanAgent's remaining errors are concentrated in localisation and map registration, and that supervised boundary segmentation improves pixel-level mask quality.
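To make the reported figures concrete, the sketch below shows how an IoU score and the three summary statistics (mean, median, share at or above 0.8) could be computed. It is a minimal illustration, not the benchmark's scoring code: it assumes the standard IoU definition and, for simplicity, scores small raster masks rather than geospatial polygons. The function names `mask_iou` and `summarise` and the toy scores are hypothetical.

```python
from statistics import mean, median


def mask_iou(pred, truth):
    """IoU of two equally sized binary masks given as nested lists of 0/1."""
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Convention chosen for this sketch: two empty masks score 0.0.
    return inter / union if union else 0.0


def summarise(ious, threshold=0.8):
    """Mean IoU, median IoU, and the fraction of cases at or above threshold."""
    return {
        "mean": mean(ious),
        "median": median(ious),
        "frac_at_or_above": sum(i >= threshold for i in ious) / len(ious),
    }


if __name__ == "__main__":
    # Made-up scores: three good reconstructions and one near-total miss.
    toy_scores = [0.1, 0.9, 0.95, 0.85]
    print(summarise(toy_scores))
```

In the toy list a single near-total miss pulls the mean well below the median. A similar gap between the reported mean (0.736) and median (0.904) would be consistent with a minority of badly misplaced boundaries, although this summary does not give the full score distribution.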
Key takeaway
Computer vision engineers building systems for geospatial boundary reconstruction from planning records should prioritize multimodal approaches that decompose the task into distinct stages. Direct VLM-to-GeoJSON methods are unreliable; focus instead on the localisation and map registration components, where the remaining errors are concentrated. Consider supervised boundary segmentation to improve pixel-level mask quality.
Key insights
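Plan2Map evaluates whether a system can reconstruct a geospatial boundary from the mixed text and map content of a planning document. The strongest reported results come from a structured, staged pipeline rather than from direct prediction.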
Principles
- Direct VLM prediction is unreliable for geospatial boundaries.
- Decompose complex multimodal tasks for better results.
- Supervised segmentation improves pixel-level mask quality.
Method
GeoPlanAgent reconstructs boundaries by decomposing the task into evidence extraction, localisation, map registration, boundary segmentation, projection, and verification, integrating geospatial tools.
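The six-stage decomposition can be pictured as a simple sequential runner that records where a case stops. The sketch below is an assumption-laden illustration, not GeoPlanAgent's implementation: only the stage names come from the summary, while `CaseState`, `run_pipeline`, and the stage-function interface are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Stage names as listed in the summary; everything else here is hypothetical.
STAGES = [
    "evidence_extraction",
    "localisation",
    "map_registration",
    "boundary_segmentation",
    "projection",
    "verification",
]


@dataclass
class CaseState:
    document_id: str
    artefacts: Dict[str, object] = field(default_factory=dict)
    completed: List[str] = field(default_factory=list)
    failed_stage: Optional[str] = None


def run_pipeline(state: CaseState,
                 stage_fns: Dict[str, Callable[[CaseState], object]]) -> CaseState:
    """Run the stages in order and stop at the first one that raises."""
    for name in STAGES:
        try:
            state.artefacts[name] = stage_fns[name](state)
        except Exception as exc:
            # Remember which stage failed so errors can be attributed per stage.
            state.failed_stage = name
            state.artefacts["error"] = str(exc)
            break
        state.completed.append(name)
    return state
```

Recording the failing stage per case is one way to produce the kind of per-stage error breakdown the diagnostic analysis describes, such as errors concentrating in localisation and map registration.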
In practice
- Integrate diverse document elements for geospatial tasks.
- Adopt a multi-stage pipeline for document-grounded reconstruction.
- Prioritize improving localisation and map registration steps.
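The summary does not describe what the verification stage checks. As one illustration of a check that a multi-stage pipeline could run before accepting an output, the sketch below tests a GeoJSON Polygon geometry against basic structural rules from the GeoJSON specification (RFC 7946): each linear ring has at least four positions, is closed, and uses longitude/latitude values in range. The function name `polygon_problems` is hypothetical, and the sketch assumes each position is a list of numbers.

```python
def polygon_problems(geometry):
    """Return a list of structural problems; an empty list means the checks passed."""
    if not isinstance(geometry, dict) or geometry.get("type") != "Polygon":
        return ["geometry is not a GeoJSON Polygon"]
    rings = geometry.get("coordinates")
    if not isinstance(rings, list) or not rings:
        return ["Polygon has no rings"]

    problems = []
    for i, ring in enumerate(rings):
        # A linear ring needs at least four positions, first equal to last.
        if not isinstance(ring, list) or len(ring) < 4:
            problems.append(f"ring {i} has fewer than 4 positions")
            continue
        if ring[0] != ring[-1]:
            problems.append(f"ring {i} is not closed")
        # Positions are [longitude, latitude, optional extras] in degrees.
        for lon, lat, *_ in ring:
            if not (-180 <= lon <= 180 and -90 <= lat <= 90):
                problems.append(f"ring {i} has a position outside lon/lat range")
                break
    return problems
```

A check like this only catches malformed output. It says nothing about whether the boundary is in the right place, which is where the summary reports the remaining errors.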
Topics
- Plan2Map Benchmark
- Geospatial Boundary Reconstruction
- Multimodal AI
- Document Analysis
- Computer Vision
- GeoJSON
Best for: AI Scientist, Computer Vision Engineer, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.