Build 100% Local Advanced RAG System for Financial PDFs with Qwen 3.5 | Docling, LangGraph & Ollama
Summary
This article details the construction of a fully local Retrieval-Augmented Generation (RAG) system for analyzing financial documents. The system uses Docling for PDF processing, Ollama for local LLM inference, and LangChain with LangGraph for workflow orchestration. A Streamlit front end handles document upload and chat, and communicates with a FastAPI backend that manages the ingestion, retrieval, and inference layers. The ingestion pipeline uses a custom chunker and stores document chunks in a Supabase database, with pgvector holding the embeddings. The inference layer serves LLMs through a local runtime such as Ollama, while the retrieval layer applies query expansion and reranks results with FlashRank. The entire architecture runs locally, enabling private, offline document analysis, and is demonstrated on Apple and Nvidia earnings reports.
Key takeaway
For AI engineers building secure, offline document analysis solutions, this local RAG architecture provides a practical blueprint. Consider Docling for accurate PDF parsing and Supabase with pgvector for chunk storage and retrieval. Ollama for local LLMs and LangGraph for workflow orchestration give you control over the entire pipeline and keep sensitive financial data within your local environment.
Key insights
A local RAG system can analyze financial PDFs using open-source tools for privacy and offline capability.
Principles
- Local RAG ensures data privacy.
- Hybrid search improves retrieval accuracy.
- Modular architecture supports component swapping.
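One common way to combine keyword and vector results in a hybrid search is reciprocal rank fusion (RRF). The summary does not say which fusion method the original system uses, so the sketch below is an assumption, and the chunk ids and the constant k=60 are illustrative only.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk ids into a single ranking.

    Each list contributes 1 / (k + rank) to the score of every id it contains,
    so ids that rank well in more than one list rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id to keep output stable.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical results from a keyword search and a vector search.
keyword_hits = ["c3", "c1", "c7"]
vector_hits = ["c1", "c4", "c3"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Because RRF works on ranks rather than raw scores, it does not need the keyword and vector scores to be on the same scale.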
Method
The system converts PDFs to markdown, splits the markdown into chunks, enriches each chunk with an LLM, and stores the results in a Supabase database with pgvector enabled. At query time, a LangGraph workflow runs query expansion, hybrid retrieval, and reranking.
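The chunking step can be sketched as a heading-aware splitter over the converted markdown. This is a minimal illustration and not the article's custom chunker: the function name, the max_chars limit, and the dictionary layout are all assumptions.

```python
import re


def chunk_markdown(markdown, max_chars=1000):
    """Split markdown at headings, then cap each section near max_chars.

    Sections are broken on blank lines only, so a single paragraph longer
    than max_chars is kept whole rather than cut mid-sentence.
    """
    # Zero-width split just before every line that starts a heading.
    sections = re.split(r"(?m)^(?=#{1,6} )", markdown)
    chunks = []
    for section in sections:
        section = section.strip()
        if not section:
            continue
        heading = section.splitlines()[0] if section.startswith("#") else ""
        current = ""
        for para in section.split("\n\n"):
            candidate = f"{current}\n\n{para}" if current else para
            if len(candidate) > max_chars and current:
                # Adding this paragraph would overflow: close the chunk.
                chunks.append({"heading": heading, "text": current})
                current = para
            else:
                current = candidate
        if current:
            chunks.append({"heading": heading, "text": current})
    return chunks


# Hypothetical markdown as it might come out of a PDF converter.
sample = "# Revenue\n\nRevenue grew.\n\n## Segments\n\nData center led.\n"
chunks = chunk_markdown(sample)
for chunk in chunks:
    print(chunk["heading"], "|", len(chunk["text"]), "chars")
```

Keeping the heading alongside each chunk gives the later enrichment and retrieval steps some context about where the text came from in the report.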
In practice
- Use Docling for PDF-to-markdown conversion.
- Use pgvector for embedding storage.
- Employ FlashRank for document reranking.
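To make the embedding-storage step concrete, the sketch below reproduces in plain Python what a pgvector cosine-distance query computes. In pgvector, the `<=>` operator returns cosine distance, so a query shaped like `ORDER BY embedding <=> :query LIMIT k` returns the k nearest chunks. The chunk ids, the three-dimensional vectors, and the column name `embedding` are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query, rows, k=2):
    """Return the k (chunk_id, embedding) rows closest to the query."""
    return sorted(rows, key=lambda row: cosine_distance(query, row[1]))[:k]


# Hypothetical stored chunks with toy embeddings.
rows = [
    ("apple_q3", [0.9, 0.1, 0.0]),
    ("nvidia_q2", [0.1, 0.9, 0.0]),
    ("misc", [0.0, 0.0, 1.0]),
]
nearest = [chunk_id for chunk_id, _ in top_k([1.0, 0.2, 0.0], rows, k=2)]
print(nearest)
```

In the real system the database performs this ordering, usually with an index, and a reranker such as FlashRank then reorders the returned candidates against the query text.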
Topics
- Local RAG System
- Financial Document Analysis
- Docling
- Ollama
- LangGraph
Best for: AI Engineer, Software Engineer, AI Student
Editorial summary, takeaway, and curation by AIssential. Original article published by Venelin Valkov.