Rating–Text Mismatch in Brazilian Portuguese Reviews: How Reliable Are Zero-Shot LLMs?
Summary
A study evaluated how well large language models (LLMs) identify inconsistencies between the text of a product review and its 1-star or 5-star rating in Brazilian Portuguese. The researchers used general-purpose LLMs (GPT-5, Llama-4, and DeepSeek-3.2) alongside two models optimized for Brazilian Portuguese, Sabiá-3.1 and Bode-3.1. Some LLMs achieved F1 scores above 90% in a zero-shot protocol for detecting these rating-text mismatches. The models' predictions were also stable, with low variability across multiple rounds (Fleiss' κ > 0.95). Approximately 10% of comments across all product categories exhibited this incoherence. Taken together, the results suggest that LLMs are promising for complex semantic interpretation tasks and useful for online monitoring and recommendation systems.
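To make the agreement figure concrete, the following is a minimal sketch of how Fleiss' κ can be computed when each repeated round is treated as one "rater" labelling the same reviews. The function name, label strings, and toy data are illustrative assumptions; the summary does not describe the study's exact computation.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each a list of labels (one per round).

    Assumes every item was labelled in the same number of rounds (>= 2).
    """
    n_items = len(ratings)
    n_rounds = len(ratings[0])
    if n_rounds < 2 or any(len(item) != n_rounds for item in ratings):
        raise ValueError("every item needs the same number of labels (>= 2)")

    categories = sorted({label for item in ratings for label in item})
    counts = [Counter(item) for item in ratings]

    # Observed agreement: share of agreeing label pairs, averaged over items.
    per_item = [
        (sum(c[cat] ** 2 for cat in categories) - n_rounds)
        / (n_rounds * (n_rounds - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement from the overall proportion of each category.
    proportions = [
        sum(c[cat] for c in counts) / (n_items * n_rounds) for cat in categories
    ]
    p_expected = sum(p ** 2 for p in proportions)

    if p_expected == 1.0:
        # Only one category was ever used; treated here as perfect agreement.
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical labels from three rounds over two reviews.
stable_rounds = [
    ["coherent", "coherent", "coherent"],
    ["incoherent", "incoherent", "incoherent"],
]
kappa_stable = fleiss_kappa(stable_rounds)
```

Values close to 1 indicate that the rounds almost always produced the same label for the same review, which is what a threshold such as κ > 0.95 expresses.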
Key takeaway
For AI engineers building content moderation or recommendation systems, these findings suggest that some zero-shot LLMs can identify rating-text inconsistencies in 1-star and 5-star Brazilian Portuguese reviews with F1 scores above 90%. Consider integrating models such as GPT-5 or Sabiá-3.1 to automatically flag potentially misleading reviews, which can improve data quality and user trust. Filtering out such reviews can also make sentiment analysis and product insights more accurate.
Key insights
Some LLMs detect rating-text incoherence in Brazilian Portuguese reviews with F1 scores above 90% and highly consistent predictions across repeated rounds (Fleiss' κ > 0.95).
Principles
- Zero-shot LLMs can perform well on semantic interpretation tasks without task-specific training.
- Rating-text incoherence appears across all product categories, in approximately 10% of comments.
Method
The study evaluated five LLMs (GPT-5, Llama-4, DeepSeek-3.2, Sabiá-3.1, Bode-3.1) on detecting incoherence between review text and 1-star or 5-star ratings in Brazilian Portuguese reviews, using a zero-shot protocol.
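A zero-shot protocol of this kind can be sketched as a prompt that contains only the task instruction, the rating, and the review, with no labelled examples. The prompt wording, the label names, and the `complete` callable (a stand-in for whatever model client is used) are assumptions for illustration, not the prompt or API used in the study.

```python
from typing import Callable, Optional


def build_prompt(review_text: str, stars: int) -> str:
    """Build a zero-shot prompt: instruction plus input, no labelled examples."""
    if stars not in (1, 5):
        raise ValueError("this sketch only covers 1-star and 5-star reviews")
    return (
        "You will receive a product review written in Brazilian Portuguese "
        "and the star rating the customer gave.\n"
        "Answer with exactly one word: 'coherent' if the text matches the "
        "rating, 'incoherent' if it does not.\n\n"
        f"Rating: {stars} star(s)\n"
        f"Review: {review_text}\n"
        "Answer:"
    )


def parse_label(raw: str) -> Optional[str]:
    """Map a raw model reply to a label, or None if it cannot be parsed."""
    cleaned = raw.strip().lower().strip(".!\"'")
    # Check 'incoherent' first because 'coherent' is a substring of it.
    if cleaned.startswith("incoherent"):
        return "incoherent"
    if cleaned.startswith("coherent"):
        return "coherent"
    return None


def classify(review_text: str, stars: int,
             complete: Callable[[str], str]) -> Optional[str]:
    """Run one zero-shot classification with a caller-supplied model function."""
    return parse_label(complete(build_prompt(review_text, stars)))


# Hypothetical stub standing in for a real model call.
def fake_model(prompt: str) -> str:
    return " Incoherent."


label = classify("Produto excelente, chegou antes do prazo!", 1, fake_model)
```

Returning `None` for unparseable replies keeps them out of the metrics instead of silently counting them as one class.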
In practice
- Use LLMs for online content monitoring.
- Integrate LLMs into recommendation systems.
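For monitoring, one simple use of the flags is to track the share of mismatched reviews per product category. The sketch below assumes each review has already been labelled by a model. The record layout, function names, and the 0.10 default threshold (chosen only to echo the roughly 10% rate reported above) are illustrative assumptions.

```python
from collections import defaultdict


def mismatch_rate_by_category(records):
    """records: iterable of (category, is_mismatch) pairs -> {category: rate}."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for category, is_mismatch in records:
        totals[category] += 1
        if is_mismatch:
            flagged[category] += 1
    return {category: flagged[category] / totals[category] for category in totals}


def categories_over_threshold(rates, threshold=0.10):
    """Return categories whose mismatch rate exceeds the threshold, sorted."""
    return sorted(cat for cat, rate in rates.items() if rate > threshold)


# Hypothetical flagged reviews.
sample = [
    ("books", True),
    ("books", False),
    ("toys", False),
    ("toys", False),
]
rates = mismatch_rate_by_category(sample)
to_review = categories_over_threshold(rates)
```

A recommendation system could use the same flags to exclude mismatched reviews before aggregating ratings.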
Topics
- Rating-Text Mismatch
- Zero-Shot LLMs
- Brazilian Portuguese
- Product Reviews
- Semantic Interpretation
Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Paper Index on ACL Anthology.