Text region detection in historical astronomical diagrams

· Source: Computer Vision and Pattern Recognition · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

A new large-scale, open-access dataset targets text region detection in historical astronomical diagrams, a setting that existing benchmarks for mathematical diagrams do not cover. It comprises 948 diagrams from the 8th to 18th centuries, with 10,940 oriented polygonal text regions across seven linguistic traditions, including Arabic, Chinese, Byzantine, Latin, Hebrew, and Sanskrit. Each text instance is annotated with an ordered polygon whose vertex order encodes reading direction, and 2,293 Latin regions are additionally annotated using 20 class labels. The researchers evaluated strong baselines such as TESTR and DeepSolo++ alongside Poly-DETR, a DINO-DETR extension designed for ordered polygon vertex prediction. Poly-DETR achieved state-of-the-art performance on the MTHv2 and cBAD2019 benchmarks and provides a solid baseline for the new dataset.
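To make the idea of an ordered polygon that encodes reading direction concrete, here is a minimal Python sketch. The TextRegion class, its field names, and the convention that the edge from the first to the second vertex follows the reading direction are assumptions made for illustration; the dataset's actual annotation schema is not described in this summary.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (x, y) in image pixels, y grows downward


@dataclass
class TextRegion:
    """Hypothetical container for one annotated text region."""
    # Ordered vertices. Assumed convention (not from the dataset):
    # the edge vertices[0] -> vertices[1] runs along the reading direction.
    vertices: List[Point]
    # Optional class label, e.g. for the labeled Latin subset.
    label: Optional[str] = None

    def reading_angle_deg(self) -> float:
        """Angle of the first edge in degrees, in [0, 360)."""
        (x0, y0), (x1, y1) = self.vertices[0], self.vertices[1]
        return math.degrees(math.atan2(y1 - y0, x1 - x0)) % 360.0

    def area(self) -> float:
        """Polygon area via the shoelace formula (independent of vertex order)."""
        n = len(self.vertices)
        twice_area = 0.0
        for i in range(n):
            x0, y0 = self.vertices[i]
            x1, y1 = self.vertices[(i + 1) % n]
            twice_area += x0 * y1 - x1 * y0
        return abs(twice_area) / 2.0


# The same rectangle, annotated left-to-right and right-to-left.
ltr = TextRegion([(0, 0), (10, 0), (10, 4), (0, 4)])
rtl = TextRegion([(10, 0), (0, 0), (0, 4), (10, 4)])
```

Both regions cover the same pixels and have the same area, but their reading angles differ by 180 degrees. An axis-aligned box or an unordered polygon cannot carry that distinction.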

Key takeaway

For computer vision engineers building text detection for historical documents, the dataset and the Poly-DETR baseline are a significant resource. Consider adding this open-access dataset to your training pipelines to improve model robustness across diverse historical diagram styles and linguistic traditions. Using Poly-DETR, or a similar model that predicts oriented polygonal text regions, may improve accuracy on challenging historical document analysis tasks.

Key insights

A new large-scale dataset and Poly-DETR baseline advance text detection in historical astronomical diagrams.

Method

Poly-DETR, an extension of DINO-DETR, predicts ordered polygon vertices to delineate text regions, achieving state-of-the-art performance on relevant benchmarks.
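The following Python sketch illustrates why vertex order matters when a model predicts ordered polygon vertices. It is not Poly-DETR's implementation: the function names, the normalized [0, 1] coordinate output, and the mean L1 distance are assumptions chosen for illustration only.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def to_pixels(norm_poly: List[Point], width: int, height: int) -> List[Point]:
    """Scale vertices from normalized [0, 1] coordinates to image pixels."""
    return [(x * width, y * height) for x, y in norm_poly]


def ordered_l1(pred: List[Point], target: List[Point]) -> float:
    """Mean L1 distance between vertices compared position by position.

    Because vertices are paired by index, the same shape listed from a
    different starting vertex yields a non-zero distance.
    """
    if len(pred) != len(target):
        raise ValueError("polygons must have the same number of vertices")
    total = sum(
        abs(px - tx) + abs(py - ty)
        for (px, py), (tx, ty) in zip(pred, target)
    )
    return total / len(pred)


# Ground-truth polygon in pixels, read left to right along its top edge.
gt = [(0.0, 0.0), (100.0, 0.0), (100.0, 40.0), (0.0, 40.0)]

# Hypothetical model output in normalized coordinates for a 200 x 160 image.
pred = to_pixels([(0.0, 0.0), (0.5, 0.0), (0.5, 0.25), (0.0, 0.25)], 200, 160)

# Same shape, but the vertex list starts at the opposite corner.
flipped = gt[2:] + gt[:2]

match_cost = ordered_l1(pred, gt)
flipped_cost = ordered_l1(flipped, gt)
```

Here match_cost is zero while flipped_cost is large, even though both polygons outline the same region. An order-sensitive comparison of this kind penalizes a prediction that gets the outline right but the reading direction wrong.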

Best for: AI Scientist, Computer Vision Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.