Benchmarking RAG Architectures Locally on a Real Financial PDF

· Source: Artificial Intelligence on Medium · Field: Technology & Digital — Artificial Intelligence & Machine Learning, FinTech & Digital Financial Services · Depth: Advanced, long

Summary

This article, the first in a three-part series, benchmarks retrieval-augmented generation (RAG) architectures that run entirely on local, open models with no external endpoints. The test document is Akbank's 4Q2025 earnings presentation, a 38-page financial PDF whose critical data sits in charts and tables, which makes extraction difficult. A custom pipeline built on Docling with EasyOCR and TableFormer converts the image-only PDF content into Markdown. Evaluation uses 57 questions categorized by content type (20 text, 19 table, 18 chart) and RAGAS metrics, with a local qwen3:14b model as the judge. Part 1 compares three text-retrieval pipelines: Naive RAG, Hybrid RAG, and Hybrid RAG with ColBERT reranking. Answer correctness rises from 0.415 for Naive RAG to 0.544 for Hybrid RAG with ColBERT reranking, an absolute gain of 0.129 (about 31% relative), which shows that retrieval architecture still pays off when everything runs on local models.
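
To make the reranking stage concrete, here is a minimal sketch of ColBERT-style late-interaction scoring (often called MaxSim), assuming each query and each chunk has already been encoded into per-token vectors. The function names, the chunk IDs, and the toy 2-D embeddings are hypothetical and are not taken from the original article; a real reranker would get its token vectors from a ColBERT-type model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: for every query token, take its best match
    among the document tokens, then sum those maxima."""
    return sum(max(cosine(q, d) for d in doc_vecs) for q in query_vecs)


def rerank(query_vecs, candidates):
    """Order candidate chunk IDs by MaxSim score, best first.

    candidates maps a chunk ID to its list of token vectors.
    """
    return sorted(
        candidates,
        key=lambda cid: maxsim_score(query_vecs, candidates[cid]),
        reverse=True,
    )


# Toy 2-D token embeddings (hypothetical, for illustration only).
query = [[1.0, 0.0], [0.0, 1.0]]
candidates = {
    "chunk_a": [[1.0, 0.0], [1.0, 0.0]],  # covers only the first query token
    "chunk_b": [[1.0, 0.0], [0.0, 1.0]],  # covers both query tokens
}
order = rerank(query, candidates)
```

Because each query token is matched independently, a chunk that covers every query term outranks one that matches a single term very well, which is the property that makes this kind of scoring useful as a second pass over a first-stage candidate list.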

Key takeaway

AI engineers building RAG systems for regulated financial documents that must run entirely in-house should prioritize robust data extraction and stronger retrieval architectures. Standard PDF text parsing is insufficient for image-heavy content; use an OCR-based pipeline such as Docling with EasyOCR and TableFormer instead. Combining Hybrid RAG, fused with Reciprocal Rank Fusion, with a ColBERT-style reranker raised answer correctness from 0.415 to 0.544 in this benchmark, even with local open models, which makes the approach a practical route to reliable performance in sensitive, offline environments.
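
Reciprocal Rank Fusion can be sketched in a few lines. The version below assumes the hybrid pipeline merges one dense (embedding) ranking with one sparse (keyword) ranking, and it uses the commonly cited constant k = 60; both are assumptions for illustration, as are the chunk IDs, since the summary does not state the article's exact settings.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical chunk IDs returned by a dense and a sparse retriever.
dense_hits = ["c3", "c1", "c7"]
sparse_hits = ["c1", "c9", "c3"]
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
```

RRF works on ranks rather than raw scores, so the dense and sparse retrievers do not need comparable score scales. Chunks that appear near the top of both lists rise above chunks that only one retriever found.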

Key insights

Local RAG on complex financial PDFs requires specialized extraction and architectural choices to achieve meaningful accuracy.

Method

Convert image-only financial PDFs to Markdown using Docling, EasyOCR (full-page), and TableFormer for structured table reconstruction. Evaluate with RAGAS metrics.
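
A minimal sketch of the conversion step, assuming a recent Docling release with the EasyOCR backend installed. The import paths and option names below follow Docling's Python API as documented at the time of writing and may differ between versions; the accurate TableFormer mode, the function name, and the file name in the usage comment are assumptions, not settings confirmed by the article.

```python
def pdf_to_markdown(pdf_path: str) -> str:
    """Convert an image-only PDF to Markdown with full-page OCR and
    TableFormer table reconstruction (a sketch, not the author's exact code)."""
    # Imported lazily so this module still loads where Docling is absent.
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import (
        EasyOcrOptions,
        PdfPipelineOptions,
        TableFormerMode,
    )
    from docling.document_converter import DocumentConverter, PdfFormatOption

    options = PdfPipelineOptions()
    # OCR every page in full, since the PDF has no usable embedded text layer.
    options.do_ocr = True
    options.ocr_options = EasyOcrOptions(force_full_page_ocr=True)
    # Rebuild table structure with TableFormer; ACCURATE mode is an assumption.
    options.do_table_structure = True
    options.table_structure_options.mode = TableFormerMode.ACCURATE

    converter = DocumentConverter(
        format_options={InputFormat.PDF: PdfFormatOption(pipeline_options=options)}
    )
    result = converter.convert(pdf_path)
    return result.document.export_to_markdown()


# Usage (hypothetical file name):
#   markdown = pdf_to_markdown("earnings_presentation.pdf")
```

Forcing full-page OCR trades speed for coverage: every page is rasterized and read, which is what an image-only presentation needs, while a PDF with a clean text layer would usually not need it.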

Best for: AI Engineer, Machine Learning Engineer, AI Architect

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence on Medium.