Proxy-Pointer RAG: Achieving Vectorless Accuracy at Vector RAG Scale and Cost

2026-04-05 · Source: Towards Data Science · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Intermediate, long

Summary

The article introduces Proxy-Pointer RAG, an ingestion and retrieval pipeline that aims to combine the structural awareness and accuracy of "Vectorless RAG" systems such as PageIndex with the scalability and cost-efficiency of traditional Vector RAG. PageIndex reaches 98.7% accuracy on financial benchmarks by building a hierarchical "Smart Table of Contents" and letting an LLM navigate it, but because it needs LLM calls for both indexing and retrieval, it is slow and expensive in multi-document scenarios. Proxy-Pointer RAG instead builds a "Skeleton Tree" without LLM summarization, injects structural metadata ("breadcrumbs") into the embeddings, chunks along the document's structure, and filters out noise sections. A standard vector store such as FAISS can then perform structurally aware retrieval. In a benchmark on a 131-page World Bank report, the approach matched or exceeded PageIndex's quality on 8 of 10 queries while keeping indexing and retrieval costs low.

Key takeaway

For AI and ML engineers building RAG systems over complex, structured documents, Proxy-Pointer RAG is an alternative to both traditional vector RAG and LLM-heavy vectorless approaches. Consider adopting its five zero-cost engineering techniques (skeleton trees, metadata pointers, breadcrumb injection, structure-guided chunking, and noise filtering) to improve retrieval quality and explainability on enterprise knowledge bases without high LLM API costs or a loss of scalability.

Key insights

Proxy-Pointer RAG combines structural awareness with vector search scalability for superior document retrieval.

Principles

Structure enhances RAG quality
Embed structure, don't just summarize
Pointers are superior to raw chunks
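
One plausible reading of the pointer principle, sketched under assumptions: each embedded chunk carries the ID of the section it came from, and retrieval hands the generator the full parent sections instead of the matched fragments. The function name `resolve_pointers`, the `(node_id, score)` hit format, and the `sections` mapping are hypothetical and are not taken from the article.

```python
def resolve_pointers(hits, sections, top_k=3):
    """Turn ranked chunk hits into a short list of full section texts.

    hits     -- list of (node_id, score) pairs, best match first (assumed format)
    sections -- dict mapping node_id to the full text of that section
    """
    seen = set()
    resolved = []
    for node_id, _score in hits:
        if node_id in seen:
            continue  # several chunks may point at the same section
        seen.add(node_id)
        resolved.append(sections[node_id])
        if len(resolved) == top_k:
            break
    return resolved


if __name__ == "__main__":
    sections = {1: "Executive summary text", 2: "Inflation outlook text"}
    hits = [(2, 0.91), (2, 0.88), (1, 0.75)]
    print(resolve_pointers(hits, sections, top_k=2))
```

Deduplicating by section keeps the context window from filling up with overlapping fragments of the same passage.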

Method

Proxy-Pointer RAG builds a skeleton tree, injects breadcrumbs into embeddings, uses structure-guided chunking, and filters noise to enable scalable, structurally aware vector retrieval.
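
As a rough illustration of structure-guided chunking with breadcrumb injection, the sketch below splits one section at paragraph boundaries and prefixes every chunk with the section's ancestry path before it would be embedded. The `Chunk` class, the `chunk_section` function, the bracketed breadcrumb format, and the 800-character budget are all assumptions made for this example, not details from the article.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    node_id: int     # pointer back to the section in the skeleton tree
    embed_text: str  # breadcrumb plus body; this string is what gets embedded


def chunk_section(node_id, breadcrumb, body, max_chars=800):
    """Split a single section into chunks without crossing its boundary.

    Paragraphs are packed greedily up to max_chars. A paragraph longer than
    max_chars is kept whole rather than cut mid-sentence.
    """
    chunks = []
    buf = ""
    for para in (p.strip() for p in body.split("\n\n")):
        if not para:
            continue
        if buf and len(buf) + len(para) + 2 > max_chars:
            chunks.append(Chunk(node_id, f"[{breadcrumb}]\n{buf}"))
            buf = para
        else:
            buf = f"{buf}\n\n{para}" if buf else para
    if buf:
        chunks.append(Chunk(node_id, f"[{breadcrumb}]\n{buf}"))
    return chunks


if __name__ == "__main__":
    body = "Prices rose in 2023.\n\nFood inflation led the increase."
    for c in chunk_section(4, "Report > Findings > Inflation", body, max_chars=40):
        print(repr(c.embed_text))
```

Because the breadcrumb is part of the embedded string, two chunks with similar wording but different parent sections end up with different vectors.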

In practice

Use regex for skeleton tree generation
Prepend ancestry paths to chunks
Filter common noise sections like "Table of Contents"
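
The practices above can be sketched in a few lines, assuming the document has already been converted to Markdown with `#`-style headings. The `Node` class, the `build_skeleton` and `drop_noise` functions, the `" > "` separator, and every entry in `NOISE_TITLES` other than "Table of Contents" are hypothetical choices for illustration.

```python
import re
from dataclasses import dataclass, field

# Matches Markdown ATX headings such as "## Findings".
HEADING_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")

# Assumed noise list; only "Table of Contents" is named in the summary.
NOISE_TITLES = {"table of contents", "contents", "acknowledgments"}


@dataclass
class Node:
    level: int
    path: tuple                       # titles from the root down to this node
    lines: list = field(default_factory=list)

    @property
    def breadcrumb(self):
        return " > ".join(self.path)


def build_skeleton(markdown):
    """Build a flat list of section nodes using regex only, with no LLM calls.

    Text that appears before the first heading is ignored in this sketch.
    """
    nodes = []
    stack = []  # open ancestors of the current position
    for line in markdown.splitlines():
        match = HEADING_RE.match(line)
        if match:
            level = len(match.group(1))
            # Close sections at the same or a deeper level.
            while stack and stack[-1].level >= level:
                stack.pop()
            parent_path = stack[-1].path if stack else ()
            node = Node(level, parent_path + (match.group(2),))
            nodes.append(node)
            stack.append(node)
        elif stack:
            stack[-1].lines.append(line)
    return nodes


def drop_noise(nodes):
    """Remove noise sections and anything nested under them."""
    return [
        n for n in nodes
        if not any(title.lower() in NOISE_TITLES for title in n.path)
    ]


if __name__ == "__main__":
    doc = "# Report\n## Table of Contents\n1. Findings\n## Findings\n### Inflation\nPrices rose.\n"
    for node in drop_noise(build_skeleton(doc)):
        print(node.breadcrumb)
```

A regex pass like this will also treat `#` comment lines inside fenced code as headings, so real documents may need that case handled first.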

Topics

Proxy-Pointer RAG
Vectorless RAG
Retrieval-Augmented Generation
Structural Awareness
LLM Cost Optimization

Code references

VectifyAI/PageIndex

Best for: AI Engineer, Machine Learning Engineer, AI Architect

Editorial summary, takeaway, and curation by AIssential. Original article published by Towards Data Science.