Plan2Map: A Multimodal Benchmark for Document-Grounded Geospatial Boundary Reconstruction from Planning Records

Source: Artificial Intelligence · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Geospatial AI & Computer Vision) · Depth: Expert, quick

Summary

Plan2Map is a 208-case multimodal benchmark for document-grounded geospatial boundary reconstruction from UK planning records. Given only a source planning document (notice text, schedules, map plates, map labels, and boundary annotations), a system must produce a valid geospatial boundary in GeoJSON. The authors introduce GeoPlanAgent, a document-grounded system with geospatial tools in the loop that decomposes the task into evidence extraction, localisation, map registration, boundary segmentation, projection, and verification. On Plan2Map, GeoPlanAgent reaches a mean IoU of 0.736 and a median IoU of 0.904, with 67.8% of predictions at or above 0.8 IoU, well ahead of direct VLM-to-GeoJSON baselines. Diagnostic analysis indicates that direct VLM prediction remains unreliable, that the remaining errors concentrate in localisation and map registration, and that supervised boundary segmentation improves pixel-level mask quality.
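The three headline statistics (mean IoU, median IoU, and the share of predictions at or above a threshold) can be computed from per-case IoU scores in a few lines of Python. The sketch below is illustrative only: the function name `summarize_iou` and the scores are made up for this example and are not the paper's evaluation code or data.

```python
from statistics import mean, median


def summarize_iou(scores, threshold=0.8):
    """Aggregate per-case IoU scores into mean, median, and share >= threshold."""
    if not scores:
        raise ValueError("need at least one IoU score")
    hits = sum(1 for s in scores if s >= threshold)
    return {
        "mean_iou": mean(scores),
        "median_iou": median(scores),
        "share_at_or_above": hits / len(scores),
    }


# Made-up scores for illustration only, NOT benchmark results.
# One outright failure (0.0) drags the mean down but barely moves the median.
example_scores = [0.9, 0.8, 0.5, 0.0]
summary = summarize_iou(example_scores)
print(summary)
```

A mean that sits well below the median, as in the reported 0.736 against 0.904, is consistent with a small number of badly failed cases, although this summary does not give the per-case distribution.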

Key takeaway

Computer vision engineers building geospatial boundary reconstruction from planning records should prefer multimodal pipelines that decompose the task into distinct stages over direct VLM-to-GeoJSON prediction, which remains unreliable. Focus improvement effort on localisation and map registration, where the remaining errors concentrate, and consider supervised boundary segmentation to improve pixel-level mask quality.

Key insights

Plan2Map tests whether a system can reconstruct a valid GeoJSON boundary from the planning document alone, drawing on its notice text, schedules, map plates, map labels, and boundary annotations.

Method

GeoPlanAgent reconstructs boundaries by decomposing the task into six stages, with geospatial tools integrated into the loop: evidence extraction, localisation, map registration, boundary segmentation, projection, and verification.
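The staged decomposition can be pictured as a chain of functions that pass a shared state along and record which stage failed. The following is a minimal sketch under stated assumptions, not GeoPlanAgent's implementation: every function name, the placeholder values, and the rotation-free affine used for registration are hypothetical. Only the final check reflects a known rule, namely that a GeoJSON polygon ring must be closed and have at least four positions.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, object]
Stage = Tuple[str, Callable[[State], State]]


def extract_evidence(state: State) -> State:
    # Placeholder: a real system would parse notice text, schedules, map labels.
    return {**state, "place_name": "Example Road"}


def localise(state: State) -> State:
    # Placeholder: pretend a lookup returned a coarse anchor as (lon, lat).
    return {**state, "anchor": (-0.1, 51.5)}


def register_map(state: State) -> State:
    # Assumed rotation-free affine: lon = a*x + b, lat = c*y + d.
    lon0, lat0 = state["anchor"]
    return {**state, "affine": (0.0001, lon0, -0.0001, lat0)}


def segment_boundary(state: State) -> State:
    # Placeholder: closed pixel-space outline of the site on the map plate.
    ring = [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)]
    return {**state, "pixel_ring": ring}


def project(state: State) -> State:
    # Apply the registration transform to move the outline into lon/lat.
    a, b, c, d = state["affine"]
    ring = [[a * x + b, c * y + d] for x, y in state["pixel_ring"]]
    return {**state, "geometry": {"type": "Polygon", "coordinates": [ring]}}


def verify(state: State) -> State:
    # Structural check: each ring is closed and has at least four positions.
    geom = state["geometry"]
    if geom.get("type") != "Polygon":
        raise ValueError("expected a Polygon geometry")
    for ring in geom["coordinates"]:
        if len(ring) < 4 or ring[0] != ring[-1]:
            raise ValueError("ring must be closed with at least 4 positions")
    return state


def run_pipeline(stages: List[Stage], state: State) -> State:
    """Run stages in order; on error, report the stage that failed."""
    for name, fn in stages:
        try:
            state = fn(state)
        except Exception as exc:
            return {**state, "failed_stage": name, "error": str(exc)}
    return {**state, "failed_stage": None}


STAGES: List[Stage] = [
    ("evidence extraction", extract_evidence),
    ("localisation", localise),
    ("map registration", register_map),
    ("boundary segmentation", segment_boundary),
    ("projection", project),
    ("verification", verify),
]

result = run_pipeline(STAGES, {"document_id": "hypothetical-case"})
print(result["failed_stage"], result["geometry"]["coordinates"][0][0])
```

Recording the failing stage by name is what makes a per-stage diagnostic possible, such as attributing errors to localisation or map registration instead of to the pipeline as a whole.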

Best for: AI Scientist, Computer Vision Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.