Increase Recommendation Systems’ Precision with LLMs, Using Python

2026-06-08 · Source: Towards Data Science · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering, Data Science & Analytics · Depth: Intermediate, medium

Summary

This article details a two-stage recommendation system that uses Large Language Models (LLMs) to improve precision while keeping computational costs under control. Stage 1 applies a cheap, rule-based distance filter that narrows a dataset of approximately 10,000 restaurants across eight cities to the 50 geographically closest candidates. Stage 2 uses a powerful LLM to rerank those 50 candidates against the user's natural language query, returning 5 to 10 recommendations, each with a "fit_score" and an explanation. This hybrid approach balances scalability and intelligence: the LLM runs only on a pre-filtered, relevant subset, which keeps cost and latency down.

Key takeaway

AI Engineers and Machine Learning Engineers designing intelligent recommendation systems should adopt a two-stage funnel to balance LLM cost against precision. A cheap, rule-based filter first generates a high-recall candidate set; an LLM then performs high-precision reranking on that much smaller set. This keeps the system both scalable and intelligent, and spends expensive LLM calls only where they add value.

Key insights

Optimize LLM-powered recommendation systems by combining cheap, high-recall filtering with expensive, high-precision reranking.

Principles

System design involves trade-offs, often described by the Accuracy-Scale-Time triangle.
Initial candidate generation should prioritize high recall, accepting low precision at this stage.
Final selection benefits from high precision models applied to a reduced candidate set.
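
To make the recall and precision trade-off in these principles concrete, here is a small worked sketch. The item ids and the "relevant" set are hypothetical toy values chosen for illustration, not data from the article, and `precision_recall` is a helper name introduced here.

```python
def precision_recall(selected, relevant):
    """Return (precision, recall) of `selected` against the `relevant` ground truth."""
    selected, relevant = set(selected), set(relevant)
    hits = len(selected & relevant)
    precision = hits / len(selected) if selected else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ground truth: the 5 restaurants that truly fit the user's query.
relevant = {3, 7, 12, 18, 25}

# Stage 1 (cheap filter): 50 candidates that happen to contain all 5 relevant items.
stage1 = set(range(50))

# Stage 2 (reranker): 5 final picks, 4 of which are relevant.
stage2 = {3, 7, 12, 18, 40}

stage1_precision, stage1_recall = precision_recall(stage1, relevant)
stage2_precision, stage2_recall = precision_recall(stage2, relevant)

print(f"stage 1: precision={stage1_precision:.2f} recall={stage1_recall:.2f}")
print(f"stage 2: precision={stage2_precision:.2f} recall={stage2_recall:.2f}")
```

In this toy setup the first stage keeps every relevant item but is mostly noise (precision 0.10, recall 1.00), while the second stage trades a little recall for much higher precision (0.80 and 0.80). The second stage can only pick from what the first stage kept, which is why recall matters most early in the funnel.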

Method

Generate synthetic restaurant data. Implement a rule-based filter that keeps the N_DISTANCE_CANDIDATES (e.g., 50) nearest restaurants. Use an LLM with structured output (e.g., enforced with Pydantic) to rerank these candidates, returning a fit score and a reason for each.
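
The first two steps of the method could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the article's code: the haversine formula, the dictionary fields, the helper names `make_synthetic_restaurants` and `distance_filter`, the single-city dataset of 1,250 points, and the user coordinates are all hypothetical. Only the name `N_DISTANCE_CANDIDATES` and its example value of 50 come from the summary above.

```python
import math
import random

N_DISTANCE_CANDIDATES = 50  # name and example value taken from the summary


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    earth_radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def make_synthetic_restaurants(n, center, seed=0):
    """Scatter n fake restaurants around a city centre (assumed generation scheme)."""
    rng = random.Random(seed)
    lat0, lon0 = center
    return [
        {
            "id": i,
            "name": f"Restaurant {i}",
            "lat": lat0 + rng.uniform(-0.2, 0.2),
            "lon": lon0 + rng.uniform(-0.2, 0.2),
        }
        for i in range(n)
    ]


def distance_filter(restaurants, user_lat, user_lon, k=N_DISTANCE_CANDIDATES):
    """Stage 1: keep the k restaurants closest to the user, nearest first."""
    by_distance = sorted(
        restaurants,
        key=lambda r: haversine_km(user_lat, user_lon, r["lat"], r["lon"]),
    )
    return by_distance[:k]


# Hypothetical user location and dataset size, for illustration only.
user_lat, user_lon = 40.7128, -74.0060
restaurants = make_synthetic_restaurants(1250, center=(user_lat, user_lon))
candidates = distance_filter(restaurants, user_lat, user_lon)
print(len(candidates), "candidates; nearest is", candidates[0]["name"])
```

The filter involves no model call at all, so its cost grows only with the size of the dataset, and the LLM in the next step sees a fixed number of candidates regardless of how many restaurants exist.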

In practice

Use simple, rule-based methods for initial data reduction to minimize LLM API calls.
Apply LLMs for nuanced interpretation and precise ranking on smaller, pre-filtered datasets.
Enforce structured LLM responses using tools like Pydantic for reliable parsing.
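
The structured-response point might look something like the sketch below, which assumes Pydantic v2. The model and field names other than `fit_score`, the 0 to 1 score range, and the sample payload are assumptions made for illustration. The LLM call itself is left out; `raw_reply` stands in for whatever JSON text the model returns.

```python
import json
from typing import List, Optional

from pydantic import BaseModel, Field, ValidationError


class RankedRestaurant(BaseModel):
    restaurant_id: int
    # Assumed range: the summary names "fit_score" but does not give its scale.
    fit_score: float = Field(ge=0.0, le=1.0)
    reason: str


class RerankResponse(BaseModel):
    # The summary says the reranker returns 5 to 10 recommendations.
    recommendations: List[RankedRestaurant] = Field(min_length=5, max_length=10)


def parse_rerank(raw_json: str) -> Optional[RerankResponse]:
    """Validate the LLM's JSON reply; return None if it does not match the schema."""
    try:
        return RerankResponse.model_validate_json(raw_json)
    except ValidationError:
        return None


# Stand-in for a model reply (hypothetical values).
raw_reply = json.dumps(
    {
        "recommendations": [
            {"restaurant_id": i, "fit_score": round(0.5 + 0.1 * i, 1), "reason": f"Sample reason {i}"}
            for i in range(5)
        ]
    }
)

parsed = parse_rerank(raw_reply)
if parsed is not None:
    # Show the best-scoring recommendations first.
    ranked = sorted(parsed.recommendations, key=lambda r: r.fit_score, reverse=True)
    for rec in ranked:
        print(rec.restaurant_id, rec.fit_score, rec.reason)
```

Returning `None` on a schema violation is one possible design choice; a caller could instead retry the request or fall back to the distance-ordered candidates.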

Topics

Recommendation Systems
Large Language Models
System Design
Hybrid AI Architectures
Data Filtering
Pydantic
OpenAI API

Code references

PieroPaialungaAI/RestaurantLLM

Best for: AI Engineer, Machine Learning Engineer, AI Student

Editorial summary, takeaway, and curation by AIssential. Original article published by Towards Data Science.