AI chatbots are giving out people’s real phone numbers

Source: MIT Technology Review · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy) · Depth: Intermediate, medium

Summary

Generative AI chatbots, including Google's Gemini, OpenAI's ChatGPT, and xAI's Grok, are exposing people's real phone numbers and other personally identifiable information (PII), raising privacy concerns and leading to harassment. Reported incidents include a Redditor who received unwanted calls, an Israeli software developer whose number Gemini shared as a customer service contact, and a colleague of a University of Washington PhD candidate whose cell number was revealed. Experts attribute these lapses to PII in large language model (LLM) training data, much of it scraped from the public web. DeleteMe, a data removal service, reports a 400% increase in AI-related privacy requests, with 55% referencing ChatGPT and 20% referencing Gemini. Guardrails and content filters are meant to prevent PII exposure, but these systems can still be prompted to reveal sensitive data, and current privacy legislation offers limited recourse once data has been incorporated into training sets.

Key takeaway

For CTOs and VPs of Engineering evaluating AI deployments: current generative AI models carry an inherent risk of exposing PII because of how their training data is collected. Teams should prioritize robust PII filtering and explore upstream data anonymization, since post-deployment removal mechanisms are imperfect and jurisdiction-dependent. Expect more privacy compliance work and more user complaints, and weigh the reputational risk that comes with this kind of data exposure.
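As a rough illustration of the upstream filtering mentioned above, the sketch below masks email addresses and phone-like strings in text before it enters a training or retrieval corpus. It is a minimal example under stated assumptions, not a production approach: the function name `redact_pii`, the placeholder tokens, and both regular expressions are hypothetical, and pattern matching of this kind will miss many formats and other PII types such as names and addresses.

```python
import re

# Hypothetical patterns for illustration only; real PII detection needs far
# broader coverage (names, addresses, IDs, locale-specific phone formats).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"(?<!\w)\+?\d[\d\s().-]{7,}\d(?!\w)")


def redact_pii(text: str) -> str:
    """Replace email addresses and phone-like strings with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)

    def _mask_phone(match: re.Match) -> str:
        # Count digits only, so dates or short IDs are less likely to be masked.
        digits = re.sub(r"\D", "", match.group(0))
        return "[PHONE]" if 9 <= len(digits) <= 15 else match.group(0)

    return PHONE_RE.sub(_mask_phone, text)


if __name__ == "__main__":
    print(redact_pii("Call +1 (206) 555-0123 or mail jo@example.com."))
```

Filtering before training matters because, as the summary notes, removing data after it has been incorporated into a model is difficult. A filter like this would typically be one layer among several rather than a complete safeguard.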

Key insights

Generative AI chatbots can expose personal data, primarily because PII is present in their training sets, and affected people have limited recourse.

Principles

Method

Chatbots are trained on vast web-scraped datasets that include PII. They can memorize and reproduce this data, even when it comes from obscure sources, so a direct prompt can be enough to surface it.
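One way a team might check for this kind of reproduction is to scan chatbot responses for numbers it already knows to be sensitive. The sketch below assumes the responses have been collected separately as plain strings; the helper names `digits_only` and `find_leaked_numbers` and the sample numbers are hypothetical, and the matching rule (compare digits only, allow a country-code prefix) is a simplification.

```python
import re

# Hypothetical pattern: runs of digits with common phone separators.
CANDIDATE_RE = re.compile(r"\+?\d[\d\s().-]{5,}\d")


def digits_only(value: str) -> str:
    """Strip everything except digits so formatting differences do not hide a match."""
    return re.sub(r"\D", "", value)


def find_leaked_numbers(response: str, known_numbers: list[str]) -> list[str]:
    """Return the known numbers that appear in a chatbot response.

    A known number counts as present when its digits equal, or form the tail
    of, a phone-like span in the response (so a leading country code is allowed).
    """
    candidates = {digits_only(m.group(0)) for m in CANDIDATE_RE.finditer(response)}
    leaked = []
    for number in known_numbers:
        target = digits_only(number)
        # Skip very short entries to avoid trivial matches.
        if len(target) >= 7 and any(c.endswith(target) for c in candidates):
            leaked.append(number)
    return leaked


if __name__ == "__main__":
    reply = "You can reach her at (206) 555-0123."
    print(find_leaked_numbers(reply, ["206-555-0123", "425-555-0199"]))
```

A check like this only detects numbers the auditor already holds, so it measures exposure of a known list rather than everything a model may have memorized.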

Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Ethicist, Legal Professional, AI Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by MIT Technology Review.