RedactionBench
Summary
RedactionBench is a new, manually annotated benchmark designed to evaluate Personally Identifiable Information (PII) redaction in Large Language Models, addressing the critical distinction between simple entity extraction and context-dependent privacy semantics. Comprising 200 diverse documents across 11 real-world domains, the benchmark aims to overcome limitations of existing evaluation methods. Alongside RedactionBench, the authors introduce R-Score, a novel character-level metric that normalizes for semantic similarity and formatting variations in redactions. Evaluations of 35 models, including Named Entity Recognition models, Small Language Models, and frontier models, reveal that contextual redaction remains an unsolved challenge. A human evaluation involving over 80 users on RedactionBench further underscores the subjective nature of privacy, showing high consensus for mandatory redactions (89.4%) and safe text preservation (94.1%), but only 47.7% agreement for contextual redactions. The benchmark and metric are released to foster improved privacy-preserving systems.
Key takeaway
NLP engineers and AI security engineers who develop or deploy LLMs in sensitive domains should recognize that PII redaction is not merely entity extraction. Models must account for contextual privacy, on which human evaluators themselves often disagree (only 47.7% agreement on contextual redactions in the benchmark's human study). Use RedactionBench and its R-Score metric to evaluate how well your systems handle nuanced, context-dependent redactions, rather than stopping at simple PII detection.
Key insights
Contextual PII redaction remains unsolved across the 35 models evaluated, and human privacy judgments are subjective, so evaluation needs dedicated benchmarks and metrics.
Principles
- PII redaction requires contextual understanding.
- Privacy perception is highly subjective.
- Semantic similarity matters in redaction metrics.
Method
RedactionBench consists of 200 documents across 11 domains, manually annotated for contextual PII. R-Score is a character-level metric that treats semantically similar redactions as equivalent and ignores shallow formatting differences.
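To make the idea of a formatting-insensitive, character-level score concrete, here is a minimal sketch in Python. It is not the paper's R-Score, whose exact definition is not given in this summary, and it does not model semantic similarity at all. It only illustrates one way to score redactions per character while ignoring which placeholder style was used. The function names `redacted_mask` and `char_level_scores` are hypothetical, and the alignment relies on the standard-library `difflib.SequenceMatcher`, which can mis-attribute characters when a placeholder happens to share characters with the text it replaces.

```python
from difflib import SequenceMatcher
from typing import Dict, List


def redacted_mask(original: str, redacted: str) -> List[bool]:
    """Mark each character of `original` that does not survive in `redacted`.

    Assumption: anything that cannot be aligned back to the original
    (for example "[NAME]", "<PERSON>", or "***") is a placeholder, so the
    placeholder's own format never affects the mask.
    """
    mask = [True] * len(original)
    matcher = SequenceMatcher(None, original, redacted, autojunk=False)
    for block in matcher.get_matching_blocks():
        for i in range(block.a, block.a + block.size):
            mask[i] = False  # this character was kept verbatim
    return mask


def char_level_scores(original: str, gold_redacted: str,
                      predicted_redacted: str) -> Dict[str, float]:
    """Character-level precision, recall, and F1 of predicted vs. gold redactions."""
    gold = redacted_mask(original, gold_redacted)
    pred = redacted_mask(original, predicted_redacted)
    tp = sum(g and p for g, p in zip(gold, pred))
    fp = sum(p and not g for g, p in zip(gold, pred))
    fn = sum(g and not p for g, p in zip(gold, pred))
    # Convention chosen for this sketch: an empty denominator scores 1.0.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    original = "Contact Jane Doe at 555-0100."
    gold = "Contact [NAME] at [PHONE]."
    # Same redactions as gold, different placeholder style.
    print(char_level_scores(original, gold, "Contact <PERSON> at ***."))
    # Name redacted, phone number leaked.
    print(char_level_scores(original, gold, "Contact ████ at 555-0100."))
```

Scoring per character rather than per entity gives partial credit for partially covered spans, and comparing masks over the original text rather than comparing output strings is what makes the placeholder format irrelevant in this sketch.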
In practice
- Use RedactionBench for PII redaction evaluation.
- Employ R-Score for nuanced redaction scoring.
- Account for subjective privacy in design.
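One way to account for subjective privacy when building your own evaluation set is to measure how often annotators agree, separately for each kind of span. The sketch below computes mean pairwise agreement per category. It is an illustration under stated assumptions, not the procedure used in the RedactionBench human study: the category labels, the input layout, and the function name `pairwise_agreement_by_category` are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def pairwise_agreement_by_category(
    votes: Sequence[Tuple[str, Sequence[bool]]]
) -> Dict[str, float]:
    """Mean pairwise annotator agreement for each span category.

    Each item in `votes` is (category, decisions), where `decisions` holds one
    boolean per annotator (True = "redact this span"). Spans with fewer than
    two annotators are skipped because no pair can be formed.
    """
    per_category: Dict[str, List[float]] = defaultdict(list)
    for category, decisions in votes:
        pairs = list(combinations(decisions, 2))
        if not pairs:
            continue
        agreeing = sum(a == b for a, b in pairs)
        per_category[category].append(agreeing / len(pairs))
    return {cat: sum(vals) / len(vals) for cat, vals in per_category.items()}


if __name__ == "__main__":
    # Toy votes, invented for illustration only.
    example_votes = [
        ("mandatory", [True, True, True, True]),
        ("contextual", [True, False, True, False]),
        ("contextual", [True, True, True, False]),
    ]
    print(pairwise_agreement_by_category(example_votes))
```

Low agreement in a category suggests that a single gold label is a weak target there, and that reporting scores per category may be more informative than one aggregate number.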
Topics
- PII Redaction
- Large Language Models
- Contextual Privacy
- RedactionBench
- R-Score Metric
- Named Entity Recognition
Best for: Research Scientist, CTO, VP of Engineering/Data, AI Scientist, NLP Engineer, AI Security Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.