Locale-Conditioned Few-Shot Prompting Mitigates Demonstration Regurgitation in On-Device PII Substitution with Small Language Models
Summary
The authors propose an on-device pipeline for Personally Identifiable Information (PII) substitution that replaces detected entities with consistent, type-preserving fake values instead of generic placeholders. The system uses a 1.5B mixture-of-experts token classifier (openai/privacy-filter) for span detection, a 1-bit Bonsai-1.7B Small Language Model (SLM) to generate contextual surrogates (names, addresses, dates), and a rule-based generator (faker) for patterned fields. A central finding is that a naive, fixed set of three demonstrations caused the 1-bit SLM to regurgitate demonstration outputs verbatim. The authors addressed this with locale-conditioned rotating few-shot demonstrations: a character-range heuristic determines the input's locale, and a per-input MD5 hash is used to sample three unique, locale-pure demonstrations. Although this fix yielded unique, locale-correct surrogates, the SLM still copied from the small same-locale demonstration pool. In the downstream NER evaluation, SLM surrogates produced more natural text but a less varied training distribution: on a matched subset, faker (0.506 F1) outperformed the hybrid approach (0.346 F1).
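The demonstration-selection step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the pool contents, the three locale buckets, the Unicode ranges, and the function names `detect_locale` and `pick_demos` are all assumptions made for the example.

```python
import hashlib
import random

# Hypothetical per-locale pools of (original, surrogate) demonstration pairs.
DEMO_POOLS = {
    "latin": [
        ("John Smith", "Peter Hall"),
        ("Maria Lopez", "Elena Ruiz"),
        ("12 Oak Street", "48 Elm Avenue"),
        ("Anna Berg", "Clara Holm"),
    ],
    "cyrillic": [
        ("Иван Петров", "Олег Смирнов"),
        ("Анна Иванова", "Мария Козлова"),
        ("улица Ленина, 5", "улица Мира, 12"),
        ("Сергей Волков", "Дмитрий Орлов"),
    ],
    "cjk": [
        ("山田太郎", "佐藤健"),
        ("王伟", "李强"),
        ("東京都港区", "大阪市北区"),
        ("鈴木花子", "高橋美咲"),
    ],
}


def detect_locale(text: str) -> str:
    """Character-range heuristic (assumed ranges): first matching script wins."""
    for ch in text:
        cp = ord(ch)
        if 0x0400 <= cp <= 0x04FF:  # Cyrillic block
            return "cyrillic"
        if 0x3040 <= cp <= 0x30FF or 0x4E00 <= cp <= 0x9FFF:  # kana, CJK ideographs
            return "cjk"
    return "latin"  # default bucket when no other script is found


def pick_demos(text: str, k: int = 3) -> list:
    """Sample k distinct demonstrations from the input's locale pool.

    The MD5 digest of the input seeds the sampler, so the same input always
    gets the same demonstrations while different inputs rotate through the pool.
    """
    pool = DEMO_POOLS[detect_locale(text)]
    seed = int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)
    return random.Random(seed).sample(pool, k)
```

Seeding from a hash of the input keeps the selection reproducible without storing state on the device. Because the pools here are tiny, the sketch also makes the summary's caveat easy to see: rotation only helps as far as the same-locale pool is large.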
Key takeaway
AI engineers building on-device PII handling should use dynamic, locale-conditioned few-shot prompting to keep small language models from regurgitating demonstration data verbatim. SLM-generated surrogates read more naturally, but they can narrow the training distribution for downstream tasks such as NER and reduce performance relative to rule-based generators. Weigh text naturalness against data diversity according to your application's requirements.
Key insights
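Locale-conditioned rotating few-shot prompting stops the SLM from regurgitating a fixed demonstration set verbatim during PII substitution, although the model still copies from the small same-locale pool.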
Principles
- A fixed set of few-shot demonstrations can cause a 1-bit SLM to regurgitate demonstration outputs verbatim.
- In the reported evaluation, downstream NER benefited more from surrogate variety than from naturalness.
Method
The pipeline combines a 1.5B token classifier for span detection, a 1-bit SLM for contextual surrogates, and a rule-based generator for patterned fields. The SLM is prompted with locale-conditioned rotating few-shot demonstrations.
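As a rough sketch of how such a hybrid substitution step might be wired together, the example below routes each detected span to one of two generators and caches the result so that repeated entities receive the same surrogate. The entity-type labels, the `Span` tuple layout, and the `slm_generate` and `rule_generate` callables are hypothetical stand-ins; the real classifier's label set and the real SLM and faker calls are not specified in this summary.

```python
from typing import Callable, Dict, List, Tuple

# Assumed labels: types routed to the SLM; everything else goes to the rule-based generator.
CONTEXTUAL_TYPES = {"NAME", "ADDRESS", "DATE"}

Span = Tuple[int, int, str]  # (start, end, entity_type), end exclusive
Generator = Callable[[str, str], str]  # (entity_type, original_text) -> surrogate


def substitute(
    text: str,
    spans: List[Span],
    slm_generate: Generator,
    rule_generate: Generator,
) -> str:
    """Replace each non-overlapping span with a consistent, type-preserving surrogate."""
    cache: Dict[Tuple[str, str], str] = {}
    pieces: List[str] = []
    cursor = 0
    for start, end, etype in sorted(spans):
        original = text[start:end]
        key = (etype, original)
        if key not in cache:
            # Route by entity type, generating at most once per distinct entity.
            generate = slm_generate if etype in CONTEXTUAL_TYPES else rule_generate
            cache[key] = generate(etype, original)
        pieces.append(text[cursor:start])  # untouched text before the span
        pieces.append(cache[key])
        cursor = end
    pieces.append(text[cursor:])  # trailing text after the last span
    return "".join(pieces)
```

With stub generators that return "Eva" for names and "555-0199" for phone numbers, the input "Ann called Ann at 555-0100." becomes "Eva called Eva at 555-0199.", and the name generator is invoked only once because the second "Ann" is served from the cache.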
In practice
- Use locale-conditioned, rotating few-shot demonstrations to avoid verbatim SLM regurgitation.
- Prioritize surrogate variety over naturalness when the output will train a downstream NER model.
- Consider hybrid PII substitution when natural-reading text is the priority.
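One way to monitor the regurgitation problem in practice is to measure how often a generated surrogate exactly matches a demonstration output. The check below is an illustrative sketch, not a metric taken from the paper; the function names and the normalization (case folding and whitespace collapsing) are assumptions.

```python
from typing import Iterable, List


def _normalize(value: str) -> str:
    # Fold case and collapse runs of whitespace so trivial variants still match.
    return " ".join(value.casefold().split())


def is_regurgitated(surrogate: str, demo_outputs: Iterable[str]) -> bool:
    """True if the surrogate is a verbatim (normalized) copy of a demonstration output."""
    return _normalize(surrogate) in {_normalize(d) for d in demo_outputs}


def regurgitation_rate(surrogates: List[str], demo_outputs: List[str]) -> float:
    """Fraction of surrogates copied verbatim from the demonstrations."""
    if not surrogates:
        return 0.0
    hits = sum(is_regurgitated(s, demo_outputs) for s in surrogates)
    return hits / len(surrogates)
```

Tracking this rate separately for the fixed demonstration set and for the full same-locale pool would distinguish the two failure modes the summary describes: copying the prompt's own examples versus copying from a pool that is simply too small.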
Topics
- PII Substitution
- On-Device Processing
- Small Language Models
- Few-Shot Prompting
- Demonstration Regurgitation
Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.