AI is replacing humans in responding to some surveys – but simulated opinions are not the same as public opinion
Summary
AI models are increasingly being used to generate "synthetic surveys" or "silicon sampling," replacing human respondents to reduce the considerable cost of traditional polling, which can reach tens of thousands of dollars for a 10-minute survey of 1,000 people. This practice involves prompting large language models like ChatGPT with demographic contexts to simulate 10,000 different responses. However, this approach raises significant concerns about trustworthiness, as AI simulations are not actual public opinion measurements. AI models inherit biases from their training data, potentially oversimplifying or distorting opinions from underrepresented groups, and their internal workings are often opaque in proprietary models. Unlike synthetic data used in AI training for self-driving cars, which is rigorously checked against reality, synthetic survey responses risk shaping public policy with flawed conclusions if distortions go unnoticed. Despite these issues, AI can assist survey research by improving question clarity, adapting surveys across languages, and efficiently analyzing open-ended human responses.
Key takeaway
If you are a social science researcher or pollster considering AI for survey responses, understand that synthetic respondents simulate, rather than measure, public opinion. Relying solely on AI-generated data risks embedding biases and distorting reality, potentially leading to flawed policy or business decisions. Instead, use AI to improve survey design and to analyze human responses, keeping the work transparent and scientifically defensible while preserving the critical connection to actual human voices.
Key insights
Simulating public opinion with AI models is not equivalent to measuring actual public opinion due to inherent biases and lack of real-world validation.
Principles
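- Surveys measure what people actually think; they do not merely predict it.
- AI models inherit biases from their training data and can distort opinions, especially those of underrepresented groups.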
- Synthetic data needs real-world validation.
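
One way to make the validation principle concrete is to compare the answer distribution of synthetic responses against a human benchmark for the same question. The sketch below is only an illustration: the article does not prescribe a metric, so total variation distance is an assumed choice, and the response counts are made up for the example rather than taken from any real survey.

```python
from collections import Counter


def answer_shares(answers):
    """Return the share of each answer option in a list of responses."""
    counts = Counter(answers)
    total = len(answers)
    return {option: count / total for option, count in counts.items()}


def total_variation_distance(p, q):
    """Half the summed absolute difference between two answer distributions.

    0.0 means identical distributions; 1.0 means no overlap at all.
    """
    options = set(p) | set(q)
    return 0.5 * sum(abs(p.get(o, 0.0) - q.get(o, 0.0)) for o in options)


# Made-up illustrative data (an assumption, not real survey results).
human_answers = ["yes"] * 50 + ["no"] * 40 + ["unsure"] * 10
synthetic_answers = ["yes"] * 70 + ["no"] * 25 + ["unsure"] * 5

gap = total_variation_distance(
    answer_shares(human_answers), answer_shares(synthetic_answers)
)
print(f"Distribution gap between human and synthetic answers: {gap:.2f}")
```

A large gap on questions where human data exists is a warning that the synthetic responses may be equally distorted on questions where no human benchmark is available.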
Method
Pollsters prompt LLMs with demographic contexts and leverage internal randomness to generate thousands of diverse synthetic responses for survey questions.
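
The method above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the procedure used by any particular pollster: the demographic profiles, the survey question, and the `stub_model` function are all hypothetical. `stub_model` stands in for a real language model call and simply draws a random answer, which mimics the sampling randomness the method relies on without depending on any specific vendor API.

```python
import random

# Hypothetical demographic profiles; these values are illustrative assumptions.
PROFILES = [
    {"age": "18-29", "region": "urban", "education": "college degree"},
    {"age": "65+", "region": "rural", "education": "high school diploma"},
]

# Hypothetical survey question.
QUESTION = "Do you support expanding public transit funding? Answer yes, no, or unsure."


def build_prompt(profile, question):
    """Turn a demographic profile into a persona prompt for a language model."""
    return (
        f"You are a survey respondent aged {profile['age']}, living in a "
        f"{profile['region']} area, with a {profile['education']}. "
        f"Question: {question}"
    )


def stub_model(prompt, rng):
    """Placeholder for a real LLM call (an assumption for this sketch).

    It ignores the prompt and picks an answer at random, standing in for
    the model's internal sampling randomness.
    """
    return rng.choice(["yes", "no", "unsure"])


def simulate(profiles, question, n_per_profile, seed=0):
    """Generate n_per_profile synthetic responses for each profile."""
    rng = random.Random(seed)
    responses = []
    for profile in profiles:
        prompt = build_prompt(profile, question)
        for _ in range(n_per_profile):
            responses.append({"profile": profile, "answer": stub_model(prompt, rng)})
    return responses


synthetic = simulate(PROFILES, QUESTION, n_per_profile=5)
print(len(synthetic), "synthetic responses generated")
```

Swapping `stub_model` for a real model would change where the answers come from, but not the key limitation: every response is still a simulation conditioned on a prompt, not a measurement of what a person thinks.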
In practice
- Use AI to refine survey questions for clarity.
- Employ AI for efficient analysis of open-ended human responses.
- Explore hybrid human-AI survey approaches.
Topics
- Synthetic Surveys
- Large Language Models
- Public Opinion Polling
- AI Bias
- Survey Research
- Data Validation
Best for: AI Scientist, Research Scientist, AI Ethicist
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial intelligence (AI) – The Conversation.