Locale-Conditioned Few-Shot Prompting Mitigates Demonstration Regurgitation in On-Device PII Substitution with Small Language Models

2026-05-13 · Source: Takara TLDR - Daily AI Papers · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy, Data Science & Analytics · Depth: Expert, medium

Summary

Researchers propose an on-device pipeline for Personally Identifiable Information (PII) substitution that replaces detected entities with consistent, type-preserving fake values instead of generic placeholders. The system uses a 1.5B mixture-of-experts token classifier (openai/privacy-filter) for span detection, a 1-bit Bonsai-1.7B Small Language Model (SLM) to generate contextual surrogates (names, addresses, dates), and a rule-based generator (faker) for patterned fields. A key finding is that naive, fixed three-shot demonstrations caused the 1-bit SLM to regurgitate demonstration outputs verbatim. The authors address this with locale-conditioned rotating few-shot demonstrations, which use a character-range heuristic and a per-input MD5 hash to sample three unique, locale-pure demonstrations. The fix yields unique, locale-correct surrogates, but the SLM still copies from the small same-locale pool. In downstream NER evaluation, SLM surrogates produced more natural text but a less varied training distribution: on a matched subset, faker (0.506 F1) outperformed the hybrid approach (0.346 F1).
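The demonstration-selection step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the locale labels, the Unicode ranges, the demonstration pool `DEMO_POOL`, and the function names `detect_locale` and `select_demos` are all assumptions introduced here, and the summary does not specify how the hash is turned into a sample.

```python
import hashlib
import random

# Hypothetical demonstration pool: (original, surrogate) pairs grouped by locale.
# A real pool would be far larger; these entries are illustrative only.
DEMO_POOL = {
    "latin": [("John Smith", "Peter Hall"), ("Mary Jones", "Laura Reed"),
              ("Paul Brown", "Simon Clark"), ("Emma Davis", "Alice Ward"),
              ("Luke Evans", "Henry Cole")],
    "cyrillic": [("Иван Петров", "Олег Смирнов"), ("Анна Волкова", "Мария Орлова"),
                 ("Пётр Соколов", "Игорь Лебедев"), ("Ольга Козлова", "Елена Новикова"),
                 ("Дмитрий Попов", "Сергей Морозов")],
    "cjk": [("佐藤 健", "田中 誠"), ("鈴木 花子", "高橋 美咲"),
            ("山本 翔", "中村 大輔"), ("小林 愛", "加藤 結衣"),
            ("渡辺 徹", "伊藤 直樹")],
}

def detect_locale(text: str) -> str:
    """Assumed character-range heuristic: the script with the most characters wins."""
    counts = {"latin": 0, "cjk": 0, "cyrillic": 0}
    for ch in text:
        cp = ord(ch)
        if 0x4E00 <= cp <= 0x9FFF or 0x3040 <= cp <= 0x30FF:
            counts["cjk"] += 1          # CJK ideographs, hiragana, katakana
        elif 0x0400 <= cp <= 0x04FF:
            counts["cyrillic"] += 1     # basic Cyrillic block
        elif (ch.isascii() and ch.isalpha()) or 0x00C0 <= cp <= 0x024F:
            counts["latin"] += 1        # ASCII letters plus Latin supplements
    # Ties (including all-zero counts) fall back to the first key, "latin".
    return max(counts, key=counts.get)

def select_demos(text: str, k: int = 3):
    """Pick k distinct same-locale demonstrations, seeded by an MD5 hash of the input."""
    locale = detect_locale(text)
    digest = hashlib.md5(text.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return locale, rng.sample(DEMO_POOL[locale], k)
```

Because the seed depends only on the input text, the same input always receives the same three demonstrations, while different inputs rotate through the pool. With a pool this small, the rotation is limited, which is consistent with the observation that the SLM still copies from the same-locale pool.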

Key takeaway

AI engineers building on-device PII handling should prefer dynamic, locale-conditioned few-shot prompting to keep small language models from regurgitating demonstration data verbatim. SLM-generated surrogates read more naturally, but they can yield a less varied training distribution for downstream tasks such as NER, which may reduce performance relative to rule-based generators. Weigh text naturalness against data diversity for your specific application.

Key insights

Locale-conditioned rotating few-shot prompting mitigates verbatim regurgitation of PII substitution demonstrations by small language models.

Principles

Fixed few-shot demonstrations cause verbatim regurgitation.
Downstream NER benefits more from data variety than naturalness.
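Both principles can be checked on generated surrogates with two simple measurements. The sketch below is an assumption-laden illustration rather than the paper's evaluation: `regurgitation_rate` counts outputs that match a demonstration output after light normalization, and `distinct_ratio` is a crude proxy for variety. Both function names and the normalization rule are hypothetical.

```python
def _normalize(s: str) -> str:
    """Collapse whitespace and ignore case so trivial variations still count as copies."""
    return " ".join(s.split()).casefold()

def regurgitation_rate(outputs, demo_outputs) -> float:
    """Fraction of generated surrogates that reproduce a demonstration output."""
    if not outputs:
        return 0.0
    demo_set = {_normalize(d) for d in demo_outputs}
    copied = sum(1 for o in outputs if _normalize(o) in demo_set)
    return copied / len(outputs)

def distinct_ratio(outputs) -> float:
    """Share of unique surrogates among all generated surrogates (1.0 = all different)."""
    if not outputs:
        return 0.0
    return len({_normalize(o) for o in outputs}) / len(outputs)
```

A high regurgitation rate under fixed demonstrations and a low distinct ratio for SLM surrogates would both point to the reduced training variety described above.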

Method

The pipeline combines a 1.5B token classifier for span detection, a 1-bit SLM for contextual surrogates, and a rule-based generator for patterned fields. The SLM is prompted with locale-conditioned rotating few-shot demonstrations.
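One way to picture the routing between the two generators is the sketch below. It assumes spans arrive as non-overlapping `(start, end, entity_type)` tuples and that both generators are plain callables; the type labels in `CONTEXTUAL_TYPES`, the `substitute` function, and the caching scheme are hypothetical and stand in for the real classifier, SLM, and faker calls.

```python
from typing import Callable, Dict, List, Tuple

Span = Tuple[int, int, str]            # (start, end, entity_type), end exclusive
Generator = Callable[[str, str], str]  # (entity_type, original) -> surrogate

# Assumed labels for entity types routed to the SLM; everything else is "patterned".
CONTEXTUAL_TYPES = {"NAME", "ADDRESS", "DATE"}

def substitute(text: str, spans: List[Span],
               contextual_gen: Generator, patterned_gen: Generator) -> str:
    """Replace each detected span with a surrogate, reusing it for repeated entities."""
    cache: Dict[Tuple[str, str], str] = {}
    pieces: List[str] = []
    cursor = 0
    for start, end, etype in sorted(spans):    # assumes spans do not overlap
        original = text[start:end]
        key = (etype, original)
        if key not in cache:
            gen = contextual_gen if etype in CONTEXTUAL_TYPES else patterned_gen
            cache[key] = gen(etype, original)  # generate once per distinct entity
        pieces.append(text[cursor:start])
        pieces.append(cache[key])
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)
```

Caching by `(entity_type, original)` is what keeps substitutions consistent within a document: every mention of the same name maps to the same surrogate, and the type-based routing keeps the replacement type-preserving.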

In practice

Implement locale-conditioned rotating few-shot prompting to reduce SLM regurgitation.
Prioritize data variety over naturalness when generating downstream NER training data.
Consider hybrid (SLM plus rule-based) PII substitution when natural-sounding text is the priority.

Topics

PII Substitution
On-Device Processing
Small Language Models
Few-Shot Prompting
Demonstration Regurgitation

Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.