PaddleOCR 3.5: Running OCR and Document Parsing Tasks with a Transformers Backend

2026-05-18 · Source: Hugging Face - Blog · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Intermediate, short

Summary

PaddleOCR 3.5, released on May 18, 2026, integrates its OCR and document parsing models with the Hugging Face Transformers ecosystem, allowing supported PaddleOCR models like PP-OCRv5 and PaddleOCR-VL 1.5 to use Transformers as an inference backend. This update introduces a flexible `engine` parameter, enabling developers to select "transformers" as the backend and configure options like `dtype`, device placement, and attention implementation via `engine_config`. While PaddleOCR continues to manage the underlying OCR and document parsing pipelines, this integration simplifies connecting these capabilities with existing PyTorch/Transformers infrastructure, particularly for applications involving RAG, Document AI, and document agents. A live demo is available on Hugging Face Spaces.

Key takeaway

For AI engineers building RAG, Document AI, or agent applications on a Hugging Face-centered stack, PaddleOCR 3.5 reduces integration friction. PaddleOCR's OCR and document parsing models, such as PP-OCRv5, can now run through a familiar Transformers backend, so document ingestion and LLM processing can share the same PyTorch/Transformers infrastructure. If raw throughput is your primary concern, consider the `paddle_static` backend instead.

Key insights

PaddleOCR 3.5 enables using Hugging Face Transformers as an inference backend for its OCR and document parsing models.

Principles

Flexible inference backends enhance developer choice.
Integration friction impedes downstream AI workflows.

Method

Install PaddleOCR 3.5, PaddleX, Transformers, and PyTorch. Initialize PaddleOCR with `engine="transformers"` and optionally configure `engine_config` for backend-specific settings like `dtype` or `attn_implementation`.
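
As a rough illustration of the initialization step, the sketch below assembles the constructor arguments and only then hands them to PaddleOCR. The helper names `build_ocr_kwargs` and `create_ocr` are hypothetical, and the exact shape of `engine_config` (a plain dict holding `dtype` and `attn_implementation`) is an assumption based on this summary, not a confirmed signature; check the PaddleOCR 3.5 documentation before relying on it.

```python
from typing import Any, Dict, Optional


def build_ocr_kwargs(
    engine: str = "transformers",
    dtype: Optional[str] = "bfloat16",
    attn_implementation: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: collect keyword arguments for the PaddleOCR constructor."""
    kwargs: Dict[str, Any] = {"engine": engine}

    # Only pass backend-specific options that were explicitly requested.
    engine_config: Dict[str, Any] = {}
    if dtype is not None:
        engine_config["dtype"] = dtype
    if attn_implementation is not None:
        engine_config["attn_implementation"] = attn_implementation

    # Assumption: engine_config is accepted as a plain dict.
    if engine_config:
        kwargs["engine_config"] = engine_config
    return kwargs


def create_ocr(**options: Any):
    """Hypothetical helper: build a PaddleOCR instance from the options above."""
    # Imported lazily so build_ocr_kwargs can be used without paddleocr installed.
    from paddleocr import PaddleOCR

    return PaddleOCR(**build_ocr_kwargs(**options))
```

Calling `create_ocr(dtype="bfloat16")` would then request the Transformers backend with that dtype, provided the installed release accepts these arguments as described.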

In practice

Use `engine="transformers"` for Hugging Face-centered stacks.
Configure `engine_config` for `dtype` (e.g., "bfloat16").
Prioritize `paddle_static` for maximum throughput.
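
The backend choice in the list above can be written as a small decision rule. This is only a sketch: `choose_engine` is a hypothetical helper, and it assumes that `"paddle_static"` and `"transformers"` are the literal values passed to the `engine` parameter.

```python
def choose_engine(throughput_critical: bool) -> str:
    """Hypothetical helper: pick an engine name from a single deployment priority."""
    # Assumption: both strings are valid values for PaddleOCR's `engine` parameter.
    if throughput_critical:
        return "paddle_static"  # favored when raw throughput matters most
    return "transformers"  # favored for Hugging Face-centered stacks
```

A real deployment would likely weigh more than one flag (hardware, existing infrastructure, model support), so treat this as a starting point and not a recommendation.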

Topics

PaddleOCR 3.5
Hugging Face Transformers
Document Parsing
Optical Character Recognition
Inference Backend

Code references

PaddlePaddle/PaddleOCR

Best for: AI Engineer, Machine Learning Engineer, Software Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Hugging Face - Blog.