POLAR-Bench: A Diagnostic Benchmark for Privacy-Utility Trade-offs in LLM Agents

Source: cs.AI updates on arXiv.org · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy, Robotics & Autonomous Systems) · Depth: Expert, extended

Summary

POLAR-Bench is a diagnostic benchmark for evaluating privacy-utility trade-offs in LLM agents. A trusted model interacts with an adversarial third-party model across 10 diverse domains and 7,852 samples. Privacy and utility are scored deterministically, and the benchmark varies privacy policy dimensions and attack strategies along two orthogonal axes, yielding a $5\times 5$ diagnostic surface for each model. The key finding is a significant performance gap: current frontier models such as GLM-5.1, GPT-5.4, Gemma-4-31B, and DeepSeek-V3.1 withhold over 99% of protected attributes while maintaining high utility. In contrast, smaller open-weight models in the 1–30B parameter range, which are often deployed on-device, score notably worse, with some leaking over half of private data. The research indicates that alignment and training choices matter more than model size for balancing privacy and utility.

Key takeaway

Machine learning engineers deploying LLM agents that handle private user data should recognize that smaller, locally deployable models (1–30B parameters) can leak substantially under adversarial conditions, in some cases exposing over half of private data. Do not rely on model size alone; prioritize rigorous privacy alignment and training choices. Before deployment, stress-test candidate models against diverse attack strategies and privacy policy complexities using a diagnostic benchmark such as POLAR-Bench to confirm a robust privacy-utility balance.

Key insights

POLAR-Bench diagnoses LLM agent privacy-utility trade-offs, showing frontier models significantly outperform smaller, on-device counterparts.

Principles

Method

POLAR-Bench uses a two-agent setup with a trusted model and an adversarial probe. It scores privacy and utility deterministically via regex-validated documents across 10 domains, varying policy dimensions and attack strategies.
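
The deterministic scoring idea described above can be illustrated with a minimal Python sketch. It is not POLAR-Bench's actual implementation: the function names, the record fields (`policy`, `attack`, `transcript`, `protected`), the example regex patterns, and the use of a simple mean per grid cell are all assumptions made for illustration. The sketch checks which protected attributes appear in the trusted model's output via regex, then averages a privacy score over a 5 x 5 grid indexed by policy setting and attack strategy.

```python
import re
from statistics import mean


def match_rate(transcript: str, patterns: dict[str, str]) -> float:
    """Fraction of named regex patterns that match somewhere in the transcript."""
    if not patterns:
        return 0.0
    hits = sum(1 for p in patterns.values() if re.search(p, transcript))
    return hits / len(patterns)


def diagnostic_surface(records, n_policy=5, n_attack=5):
    """Mean privacy score (1 - leakage) per (policy, attack) cell.

    Each record is a dict with hypothetical keys: 'policy' and 'attack'
    (integer grid indices), 'transcript' (trusted model output), and
    'protected' (attribute name -> regex). Empty cells are None.
    """
    cells = [[[] for _ in range(n_attack)] for _ in range(n_policy)]
    for r in records:
        leakage = match_rate(r["transcript"], r["protected"])
        cells[r["policy"]][r["attack"]].append(1.0 - leakage)
    return [[mean(c) if c else None for c in row] for row in cells]


# Hypothetical example: two protected attributes, one of which leaks.
protected = {
    "ssn": r"\b\d{3}-\d{2}-\d{4}\b",
    "dob": r"\b\d{4}-\d{2}-\d{2}\b",
}
records = [
    {
        "policy": 0,
        "attack": 0,
        "transcript": "The patient was born on 1990-04-12.",
        "protected": protected,
    }
]
surface = diagnostic_surface(records)
```

Because the score depends only on regex matches, repeated runs over the same transcripts give identical results, which is the property that "deterministic" refers to here. A utility score could be computed the same way with `match_rate` applied to a set of required, non-protected fields, though how the benchmark defines utility in detail is not stated in this summary.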

Best for: Research Scientist, CTO, VP of Engineering/Data, AI Scientist, AI Security Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.