Assessing Capabilities of Large Language Models in Social Media Analytics: A Multi-task Quest

Source: Computation and Language · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Data Science & Analytics) · Depth: Expert, quick

Summary

A new study evaluates modern Large Language Models (LLMs), including GPT-4, GPT-4o, GPT-3.5-Turbo, Gemini 1.5 Pro, DeepSeek-V3, Llama 3.2, and BERT, on three social media analytics tasks using a Twitter (X) dataset: Social Media Authorship Verification, Social Media Post Generation, and User Attribute Inference. For authorship verification, the study introduces a systematic sampling framework and tests generalization on new tweets from January 2024 to avoid data bias. Post generation is assessed for authenticity and user-like content using extensive metrics, including a user study in which real users compared LLM-generated posts with their own writing. For user attribute inference, interests and occupations are annotated with the IAB Tech Lab 2023 and 2018 U.S. SOC taxonomies, respectively, and LLMs are benchmarked against existing baselines. The unified evaluation offers new insights and establishes reproducible benchmarks for LLM applications in social media analytics.

Key takeaway

For AI Engineers and Research Scientists building social media analytics tools, this study provides benchmarks for selecting and applying LLMs. Compare how models such as GPT-4o and Gemini 1.5 Pro perform on the specific task you care about, whether authorship verification, post generation, or user attribute inference, and reuse the study's evaluation methodology when testing models on real-world social media data.

Key insights

Modern LLMs were systematically evaluated on three social media analytics tasks (authorship verification, post generation, and user attribute inference) using a Twitter (X) dataset.

Principles

Method

The study used a systematic sampling framework for authorship verification, comprehensive metrics for post generation, and standardized taxonomies (IAB Tech Lab 2023, 2018 U.S. SOC) for attribute inference.
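To make the authorship-verification setup more concrete, here is a minimal sketch of a time-based hold-out combined with balanced pair sampling. This summary does not describe the study's actual sampling framework, so everything below is an assumption for illustration: the function names `temporal_split` and `balanced_pairs`, the `(user, day, text)` record layout, the 1 January 2024 cutoff, and the toy data are all hypothetical.

```python
import random
from datetime import date
from itertools import combinations


def temporal_split(tweets, cutoff):
    """Split (user, day, text) records into those before the cutoff
    and those on or after it (the held-out, newer tweets)."""
    seen = [t for t in tweets if t[1] < cutoff]
    held_out = [t for t in tweets if t[1] >= cutoff]
    return seen, held_out


def balanced_pairs(tweets, seed=0):
    """Return (text_a, text_b, label) triples with equal numbers of
    same-author (label 1) and different-author (label 0) pairs."""
    rng = random.Random(seed)
    same, diff = [], []
    for a, b in combinations(tweets, 2):
        (same if a[0] == b[0] else diff).append((a[2], b[2]))
    n = min(len(same), len(diff))  # downsample the larger class
    pairs = [(x, y, 1) for x, y in rng.sample(same, n)]
    pairs += [(x, y, 0) for x, y in rng.sample(diff, n)]
    rng.shuffle(pairs)
    return pairs


# Hypothetical toy data, not taken from the study.
sample = [
    ("alice", date(2023, 11, 2), "a1"),
    ("alice", date(2023, 12, 9), "a2"),
    ("bob", date(2023, 10, 5), "b1"),
    ("bob", date(2023, 12, 20), "b2"),
    ("alice", date(2024, 1, 3), "a3"),
    ("bob", date(2024, 1, 7), "b3"),
]

seen, held_out = temporal_split(sample, date(2024, 1, 1))
pairs = balanced_pairs(seen)
```

Each resulting pair could then be put to a model as a yes/no question ("were these two posts written by the same user?"), with the newer `held_out` tweets reserved for checking generalization. Balancing the two classes keeps plain accuracy interpretable, though a real benchmark may sample pairs differently.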

Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.