LLMs Evaluated on Social Media Analytics Tasks

ai-technology · 2026-05-01

A recent study evaluated how well large language models (LLMs) perform social media analytics tasks on Twitter, now rebranded as X. The models assessed included GPT-4, Gemini 1.5 Pro, and BERT, among others. The researchers focused on three tasks: authorship verification, post generation, and user attribute inference. To mitigate seen-data bias (the risk that a model already encountered the evaluation data during training), they developed a systematic sampling approach for user posts and used new tweets collected from January 2024 onward. A user study also compared LLM-generated posts with those written by actual users, assessing both authenticity and engagement.
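
The date-based sampling described above can be illustrated with a minimal sketch. The summary does not give the study's actual procedure, so everything here is an assumption made for illustration: the post dictionary layout, the helper names `filter_recent` and `every_kth`, the exact cutoff timestamp, and the choice of "every k-th post" as one common meaning of systematic sampling.

```python
from datetime import datetime, timezone

# Assumed cutoff: the summary only says "January 2024 onward".
CUTOFF = datetime(2024, 1, 1, tzinfo=timezone.utc)


def filter_recent(posts, cutoff=CUTOFF):
    """Keep only posts created at or after the cutoff date."""
    return [p for p in posts if p["created_at"] >= cutoff]


def every_kth(posts, k):
    """Hypothetical systematic sample: sort by time, take every k-th post."""
    ordered = sorted(posts, key=lambda p: p["created_at"])
    return ordered[::k]


def _utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


# Made-up example posts, not data from the study.
posts = [
    {"id": 1, "created_at": _utc(2023, 12, 31)},
    {"id": 2, "created_at": _utc(2024, 1, 5)},
    {"id": 3, "created_at": _utc(2024, 2, 10)},
    {"id": 4, "created_at": _utc(2024, 3, 15)},
    {"id": 5, "created_at": _utc(2023, 6, 1)},
]

recent = filter_recent(posts)
sample = every_kth(recent, 2)
```

Filtering before sampling means posts older than the cutoff never reach the evaluation set, which is the point of using recent data: such posts are less likely to have appeared in a model's training corpus.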

Key facts

First comprehensive evaluation of modern LLMs on social media analytics tasks
Models evaluated: GPT-4, GPT-4o, GPT-3.5-Turbo, Gemini 1.5 Pro, DeepSeek-V3, Llama 3.2, BERT
Three tasks: authorship verification, post generation, user attribute inference
New tweets from January 2024 onward used to mitigate seen-data bias
User study conducted to measure perceptions of LLM-generated posts

Entities

—

Sources

arXiv cs.AI — 2026-04-22