When Do LLMs Generate Realistic Social Networks? A Multi-Dimensional Study of Culture, Language, Scale, and Method
Summary
This study investigates how Large Language Models (LLMs) generate synthetic social networks, focusing on the effects of prompt design, cultural framing, prompt language, and model scale. The researchers formalized four LLM-based tie-formation mechanisms (sequential, global, local, and iterative) and treated them as distinct conditional distributions over edge sets. Using a fixed roster of 50 demographically grounded personas, they generated 192 verified directed networks across four cultural contexts, four prompt languages, three GPT-4.1 variants, and four prompting architectures, with two seeds per condition. Key findings: cultural framing alters inbreeding homophily and largest-component connectivity; political affiliation often dominates tie formation, although under the global method age takes its place; model scale shows a stable divergence ranking, with the smallest variant behaving qualitatively differently; and prompt language, especially Hindi, sharply shifts religion homophily while leaving political homophily invariant. LLM-generated networks matched real social graphs on clustering and modularity better than standard baselines but encoded demographic biases above empirical levels.
Key takeaway
If you are an AI Scientist or Research Scientist building or using LLMs for social simulation, treat prompt design and model choice as sociological decisions. Your prompt architecture, cultural framing, and even prompt language can substantially change properties of the generated network, such as homophily and connectivity. LLM-generated networks can be structurally similar to real graphs and still embed demographic biases, so validate them against empirical levels before relying on them as realistic, unbiased simulations.
Key insights
Prompt design, cultural framing, prompt language, and model scale each change the structure of LLM-generated social networks.
Principles
- Prompt architecture functions as a sociological variable.
- Model scale produces a stable divergence ranking, with the smallest variant behaving qualitatively differently.
- Prompt choices encode substantive sociological assumptions.
Method
The study formalized four LLM tie-formation mechanisms (sequential, global, local, iterative) and generated 192 networks using 50 personas across varied cultural, linguistic, and model conditions.
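One of the structural measures reported for the generated networks is largest-component connectivity. The sketch below shows one way to compute it for a directed network, under the assumption that components are weakly connected (edge direction ignored); the summary does not state which definition the study used. The function name `largest_component_share` is illustrative and not taken from the paper.

```python
def largest_component_share(nodes, edges):
    """Share of nodes in the largest weakly connected component.

    nodes: list of node identifiers (e.g. the persona roster).
    edges: iterable of (source, target) directed ties.
    Assumption: direction is ignored when deciding connectivity.
    """
    if not nodes:
        return 0.0

    # Build an undirected adjacency map from the directed ties.
    neighbors = {v: set() for v in nodes}
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)

    seen = set()
    best = 0
    for start in nodes:
        if start in seen:
            continue
        # Depth-first traversal of one component.
        stack = [start]
        seen.add(start)
        size = 0
        while stack:
            v = stack.pop()
            size += 1
            for u in neighbors[v]:
                if u not in seen:
                    seen.add(u)
                    stack.append(u)
        best = max(best, size)

    return best / len(nodes)
```

Passing the full roster as `nodes` matters: personas that receive and send no ties still count in the denominator, so isolated personas lower the share.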
In practice
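- Treat prompt architecture as an experimental variable: compare the sequential, global, local, and iterative methods rather than relying on a single one.
- Check the effect of prompt language on each homophily type separately; in the study, Hindi prompts shifted religion homophily while political homophily stayed invariant.
- Validate generated networks against empirical levels, since the study found demographic biases above them.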
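To check a generated network for demographic bias, one option is to compute inbreeding homophily per group and compare it with values from an empirical network. The sketch below uses the standard Coleman-style index, IH = (H - w) / (1 - w), where H is the share of a group's outgoing ties that stay within the group and w is the group's share of the population. This is a minimal illustration, not the paper's code; the function name and the choice to use outgoing ties are assumptions.

```python
from collections import Counter


def inbreeding_homophily(edges, attribute):
    """Coleman-style inbreeding homophily per group.

    edges: iterable of (source, target) directed ties.
    attribute: dict mapping every node to its group label
               (e.g. political affiliation or religion).
    Returns {group: IH}; IH is None when it is undefined
    (the group sends no ties, or it is the whole population).
    """
    population = Counter(attribute.values())
    n = len(attribute)

    out_total = Counter()  # outgoing ties per group
    out_same = Counter()   # outgoing ties that stay in-group
    for src, dst in edges:
        group = attribute[src]
        out_total[group] += 1
        if attribute[dst] == group:
            out_same[group] += 1

    result = {}
    for group, size in population.items():
        w = size / n
        if out_total[group] == 0 or size == n:
            result[group] = None
            continue
        h = out_same[group] / out_total[group]
        result[group] = (h - w) / (1 - w)
    return result
```

A value of 0 means the group forms in-group ties at the rate its population share alone would predict, 1 means all of its ties stay in-group, and negative values indicate a preference for out-group ties. Running the same function on a generated network and on an empirical reference gives directly comparable numbers.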
Topics
- Large Language Models
- Social Network Generation
- Prompt Engineering
- Cultural Framing
- Homophily Theory
Best for: AI Scientist, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.