Text Detection and OCR with Microsoft Cognitive Services

· Source: Adrian Rosebrock, Author at PyImageSearch · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cloud Computing & IT Infrastructure, Software Development & Engineering · Depth: Intermediate, long

Summary

This tutorial shows how to implement text detection and Optical Character Recognition (OCR) with the Microsoft Cognitive Services (MCS) API, which is part of Microsoft Azure. It is the second installment in a three-part series on text detection and OCR: the first covers Amazon Rekognition and the third covers the Google Cloud Vision API. The guide covers obtaining MCS API keys, configuring a Python development environment with OpenCV and the Azure Computer Vision library, and structuring a project around a configuration file that holds the API credentials. A Python script is provided that calls the MCS OCR API, processes the image data, and displays the OCR results with bounding box annotations. In the tutorial's tests, the MCS API accurately OCRs text from challenging images such as warning signs, low-quality bus timetables, and pixelated text, and it also handles rotated text bounding boxes.
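The rotated bounding boxes mentioned above are easiest to understand with a small example. The following is a minimal sketch, assuming each detected line's box arrives as a flat list of eight numbers (the x and y of four corners), which is how the Azure Computer Vision Python SDK exposes `bounding_box` on Read results. The helper names `to_polygon` and `draw_polygon` are hypothetical and are not taken from the original tutorial.

```python
def to_polygon(bounding_box):
    """Group a flat [x1, y1, ..., x4, y4] list into four integer (x, y) corners."""
    if len(bounding_box) != 8:
        raise ValueError("expected 8 values (4 corners), got %d" % len(bounding_box))
    coords = [int(round(v)) for v in bounding_box]
    # Even indices are x values, odd indices are y values.
    return list(zip(coords[0::2], coords[1::2]))


def draw_polygon(image, polygon, color=(0, 255, 0), thickness=2):
    """Draw the four-corner polygon on an OpenCV image (NumPy array), in place."""
    # Imported lazily so to_polygon can be used without OpenCV installed.
    import cv2
    import numpy as np

    pts = np.array(polygon, dtype=np.int32).reshape((-1, 1, 2))
    cv2.polylines(image, [pts], isClosed=True, color=color, thickness=thickness)
    return image
```

Drawing a closed polygon rather than an axis-aligned rectangle is what lets the annotation follow text that is tilted in the image.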

Key takeaway

AI engineers evaluating cloud OCR solutions should consider the Microsoft Cognitive Services (MCS) OCR API, especially for projects that involve low-quality or otherwise challenging images. Its implementation is slightly more complex than Amazon Rekognition's, but MCS shows strong accuracy across diverse scenarios. If you already work within the Azure ecosystem, staying with MCS could streamline your workflow, even though results must be retrieved by polling.

Key insights

The Microsoft Cognitive Services OCR API offers robust text detection, even on challenging, low-quality images.

Principles

Method

Obtain MCS API keys, configure a Python environment with OpenCV and the Azure Computer Vision library, and create a config file for the credentials. Then use a Python script to send images to the MCS OCR API, poll for the results, and annotate the output.
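The submit-then-poll flow described above can be sketched as follows. This is a minimal illustration rather than the tutorial's script: it assumes the `azure-cognitiveservices-vision-computervision` package is installed, and it reads credentials from two environment variables with hypothetical names (`MCS_ENDPOINT_URL`, `MCS_SUBSCRIPTION_KEY`) instead of the tutorial's config file. The helper names `operation_id_from`, `poll`, and `ocr_image` are also hypothetical.

```python
import os
import time


def operation_id_from(operation_location):
    """Return the last path segment of the Operation-Location URL."""
    return operation_location.rstrip("/").split("/")[-1]


def poll(fetch, is_done, interval=1.0, max_tries=30, sleep=time.sleep):
    """Call fetch() until is_done(result) is true or max_tries is exhausted."""
    for _ in range(max_tries):
        result = fetch()
        if is_done(result):
            return result
        sleep(interval)
    raise TimeoutError("OCR result not ready after %d tries" % max_tries)


def ocr_image(image_path):
    """Submit an image to the Read API and return (text, bounding_box) pairs."""
    # Lazy imports so the helpers above can be used without the Azure SDK.
    from azure.cognitiveservices.vision.computervision import ComputerVisionClient
    from azure.cognitiveservices.vision.computervision.models import (
        OperationStatusCodes,
    )
    from msrest.authentication import CognitiveServicesCredentials

    # Assumption: hypothetical variable names; the tutorial uses a config file.
    endpoint = os.environ["MCS_ENDPOINT_URL"]
    key = os.environ["MCS_SUBSCRIPTION_KEY"]
    client = ComputerVisionClient(endpoint, CognitiveServicesCredentials(key))

    # Submitting the image only starts the job; the response headers carry
    # the URL of the operation to poll.
    with open(image_path, "rb") as stream:
        response = client.read_in_stream(stream, raw=True)
    operation_id = operation_id_from(response.headers["Operation-Location"])

    pending = (OperationStatusCodes.not_started, OperationStatusCodes.running)
    result = poll(
        lambda: client.get_read_result(operation_id),
        lambda r: r.status not in pending,
    )
    if result.status != OperationStatusCodes.succeeded:
        raise RuntimeError("OCR failed with status %s" % result.status)

    return [
        (line.text, line.bounding_box)
        for page in result.analyze_result.read_results
        for line in page.lines
    ]
```

Keeping the polling loop in its own function, with the sleep call passed in as a parameter, makes the retry limit explicit and lets the loop be exercised without network access.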

Best for: Machine Learning Engineer, AI Engineer, Software Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Adrian Rosebrock, Author at PyImageSearch.