Thousands of people are selling their identities to train AI – but at what cost?

· Source: AI (artificial intelligence) | The Guardian · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy, Emerging Technologies & Innovation · Depth: Novice, medium

Summary

A new global industry of data marketplaces has emerged to supply high-quality, human-grade data for training AI models, as traditional web-scraping sources face restrictions and synthetic data leads to model degradation. Thousands of individuals, particularly in developing countries, are selling their biometric identities and intimate data, such as videos, audio, and private conversations, to platforms like Kled AI, Silencio, and Neon Mobile. These "gig AI trainers" earn quick cash, often significantly more than local wages, by micro-licensing their personal data. However, this practice comes with substantial risks, including irrevocable data licenses, potential misuse in deepfakes or predatory advertisements, lack of transparency regarding data deployment, and the precarious nature of the work, which offers no transferable skills or safety net.

Key takeaway

CTOs and VPs of Engineering evaluating data sourcing strategies for AI development should recognize that relying on human-sourced data from marketplaces introduces significant ethical, privacy, and long-term liability risks. Such data may be high quality, but the lack of transparency about how it is deployed and the irrevocable licensing terms can lead to reputational damage and legal challenges. Prioritize robust data governance and explore alternative, ethically sound data generation methods to reduce the risk of exploitation and support sustainable AI development.

Key insights

A new gig economy monetizes personal data for AI training, raising significant ethical and privacy concerns.

Best for: CTO, VP of Engineering/Data, Executive, AI Ethicist, Policy Maker, Tech Journalist

Editorial summary, takeaway, and curation by AIssential. Original article published by AI (artificial intelligence) | The Guardian.