NAMESAKES: Probing Identity Memorization in Text-to-Image Models
Summary
NAMESAKES is a black-box behavioral probe that distinguishes identity memorization from fabrication in Text-to-Image (T2I) models. T2I models can generate realistic likenesses from a name alone, which poses privacy risks, but existing detection methods require ground-truth photos, training data access, or white-box model internals. The NAMESAKES probe needs none of these. To benchmark the task, the researchers created the NAMESAKES dataset, which pairs over one thousand names and faces of public figures across a range of fame levels with perturbed, less famous names. The paper, published on 2026-06-18, reports experiments on leading T2I models in which the probe accurately predicts identity memorization, separates memorized names from unrecognized ones, and reveals how memorization varies across model families.
Key takeaway
For AI Security Engineers evaluating Text-to-Image models for deployment, this black-box probe offers a way to assess identity memorization risk. Without internal access or ground-truth photos, you can identify models that generate realistic likenesses from names. That lets you address privacy concerns before deployment, for example by selecting models less prone to memorizing specific public figures.
Key insights
A black-box probe can reliably detect identity memorization in text-to-image models without ground-truth photos, training data access, or model internals.
Principles
- T2I models memorize specific identities from training data.
- Black-box probes can infer model memorization states.
- Fame level influences identity memorization.
Method
The NAMESAKES method uses a black-box behavioral probe to distinguish memorized from fabricated identities in T2I models. The probe is benchmarked on a dataset of public figures' names and faces, together with perturbed names.
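This summary does not say which behavioral signal the probe measures, so the sketch below is only an illustration of what a black-box probe of this kind could look like, not the authors' method. It assumes one plausible signal: a memorized name should yield the same face across repeated generations, while a fabricated one should drift. The function names (`cosine`, `consistency`, `consistency_gap`) are hypothetical. Generating the images and turning them into face embeddings is left outside the sketch, and the embeddings are toy 2-D vectors.

```python
from itertools import combinations
from math import sqrt
from typing import Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two embedding vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def consistency(embeddings: Sequence[Vector]) -> float:
    """Mean pairwise cosine similarity over the faces generated for one name."""
    pairs = list(combinations(embeddings, 2))
    if not pairs:
        raise ValueError("need at least two generations per name")
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


def consistency_gap(name_embs: Sequence[Vector],
                    perturbed_embs: Sequence[Vector]) -> float:
    """Assumed probe score: consistency for the real name minus consistency
    for a perturbed variant of it. A larger gap hints at memorization."""
    return consistency(name_embs) - consistency(perturbed_embs)


# Toy data: the real name gives near-identical faces, the perturbed name does not.
real_name = [[1.0, 0.0], [1.0, 0.1], [0.9, 0.0]]
perturbed_name = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
gap = consistency_gap(real_name, perturbed_name)
```

Only model outputs are used here, which is what makes the probe black-box: no reference photo of the person, no training data, and no model weights enter the computation.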
In practice
- Identify T2I models with privacy risks.
- Benchmark T2I models for memorization.
- Inform T2I training data curation.
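To benchmark a model as the list above suggests, the probe's scores have to be compared against labels that say which names are memorized. The sketch below uses AUROC, a standard measure of how well scores separate two groups. The summary does not state that the paper reports this metric, so it is an assumption, as are the function name `auroc` and the toy scores and labels.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive outscores a randomly chosen negative (ties count half)."""
    positives = [s for s, is_pos in zip(scores, labels) if is_pos]
    negatives = [s for s, is_pos in zip(scores, labels) if not is_pos]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical probe scores for four names; True marks a memorized name.
probe_scores = [0.9, 0.8, 0.3, 0.4]
is_memorized = [True, True, False, False]
separation = auroc(probe_scores, is_memorized)
```

A value near 1.0 means the probe ranks memorized names above the rest, and 0.5 means it does no better than chance. Computing the same figure for each model gives a simple basis for comparing model families.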
Topics
- Text-to-Image Models
- Identity Memorization
- Privacy Concerns
- Black-box Probing
- NAMESAKES Dataset
- AI Security
Best for: Research Scientist, CTO, VP of Engineering/Data, AI Scientist, AI Security Engineer, AI Ethicist
Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.