Portuguese Sentiment Analysis with Open-Source LLMs: Models, Prompts, and Efficient Deployment
Summary
A comparative study evaluated 29 open-source Large Language Models (LLMs) and two proprietary models on Portuguese sentiment classification under four prompting strategies: Zero-Shot, Few-Shot, Chain-of-Thought (CoT), and CoT with Few-Shot (CoT+FS). The experiments comprised approximately 372,000 inferences on a unified three-class benchmark of about 3,000 instances drawn from three public review corpora, generating roughly 150M input tokens and 65M output tokens. CoT+FS generally achieved the best performance for larger models. Several compact open-source models reached competitive F1-scores at significantly lower computational cost, which makes them practical for real-world deployment. The study also identified specific teacher–student configurations suitable for knowledge distillation in Portuguese sentiment analysis.
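The reported figures are mutually consistent, which a quick back-of-the-envelope check makes visible. The sketch below assumes that every model was run with every strategy on every benchmark instance exactly once; the summary does not state this decomposition, but it reproduces the reported inference count. The per-inference token averages are derived values, not numbers from the study.

```python
# Figures taken from the summary (all approximate in the source).
n_models = 29 + 2            # open-source + proprietary
n_strategies = 4             # Zero-Shot, Few-Shot, CoT, CoT+FS
n_instances = 3_000          # size of the unified benchmark
input_tokens = 150_000_000
output_tokens = 65_000_000

# Assumption: a full grid, one inference per (model, strategy, instance).
total_inferences = n_models * n_strategies * n_instances

# Derived averages, useful for rough cost estimates per request.
avg_input = input_tokens / total_inferences
avg_output = output_tokens / total_inferences

print(total_inferences)                     # 372000
print(round(avg_input), round(avg_output))  # 403 175
```

Averages of roughly 400 input and 175 output tokens per call give a starting point for estimating the cost of a similar evaluation, although the real per-strategy distribution is likely uneven, since CoT prompts tend to produce longer outputs.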
Key takeaway
AI engineers building sentiment analysis systems for Lusophone markets can achieve robust performance with compact open-source LLMs. Prefer CoT+FS prompting with larger models, and evaluate smaller models as well, since several reach competitive F1-scores with lower computational overhead. This allows efficient deployment without relying solely on proprietary solutions, which can reduce operational costs and improve scalability.
Key insights
Open-source LLMs are competitive for Portuguese sentiment analysis: CoT+FS prompting generally works best with larger models, while several compact models reach comparable F1-scores at lower computational cost.
Principles
- CoT+FS generally yields the best performance with larger LLMs.
- Compact open-source LLMs can reach competitive F1-scores at significantly lower computational cost.
- Specific teacher–student configurations make knowledge distillation viable for Portuguese sentiment analysis.
Method
The study compared 29 open-source and 2 proprietary LLMs on Portuguese sentiment classification using Zero-Shot, Few-Shot, CoT, and CoT+FS prompting strategies across a unified 3,000-instance benchmark.
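To make the four prompting strategies concrete, here is a minimal sketch of how such prompts might be assembled. It is not the study's prompt template: the label names (`positive`, `negative`, `neutral`), the instruction wording, the `build_prompt` helper, and the demonstration reviews are all assumptions invented for illustration. For brevity, the CoT+FS variant reuses demonstrations without written rationales, whereas a fuller version would typically include a short rationale with each example.

```python
# Assumed label set; the source only says the benchmark has three classes.
LABELS = ("positive", "negative", "neutral")

# Hypothetical demonstrations, invented for this sketch.
FEW_SHOT_EXAMPLES = [
    ("Produto excelente, chegou antes do prazo.", "positive"),
    ("Péssimo atendimento, não recomendo.", "negative"),
    ("O produto chegou ontem.", "neutral"),
]

STRATEGIES = {"zero_shot", "few_shot", "cot", "cot_fs"}


def build_prompt(review: str, strategy: str) -> str:
    """Assemble a classification prompt for one of the four strategies."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy!r}")

    use_cot = strategy in {"cot", "cot_fs"}
    use_few_shot = strategy in {"few_shot", "cot_fs"}

    lines = [
        "Classify the sentiment of the Portuguese review as one of: "
        + ", ".join(LABELS) + "."
    ]
    if use_cot:
        # CoT: ask for reasoning first, then a parseable final line.
        lines.append(
            "Think step by step, then give the final answer on a last "
            "line starting with 'Label:'."
        )
    else:
        lines.append("Answer with the label only.")

    if use_few_shot:
        # Few-Shot: prepend labeled demonstrations before the target review.
        for text, label in FEW_SHOT_EXAMPLES:
            lines.append(f"Review: {text}")
            lines.append(f"Label: {label}")

    lines.append(f"Review: {review}")
    lines.append("Reasoning:" if use_cot else "Label:")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("A entrega atrasou, mas o produto é bom.", "cot_fs"))
```

Keeping the final answer on a fixed `Label:` line makes the output easy to parse regardless of how much reasoning text a CoT prompt produces.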
In practice
- Prefer CoT+FS prompting with larger LLMs, where it generally performs best.
- Consider compact open-source LLMs for cost-effective deployment.
- Explore teacher–student configurations for knowledge distillation.
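When comparing a compact model against a larger one on your own data, a per-class F1 average is a common yardstick for three-class sentiment. The sketch below computes macro-averaged F1 in plain Python. The summary reports F1-scores without specifying the averaging scheme, so macro averaging is an assumption here, and the labels and predictions are toy values, not results from the study.

```python
def macro_f1(gold, pred, labels):
    """Unweighted mean of per-class F1 over the given labels."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    scores = []
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        denom = 2 * tp + fp + fn
        # A class with no gold instances and no predictions scores 0.0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data for illustration only.
labels = ["positive", "negative", "neutral"]
gold = ["positive", "negative", "neutral", "positive"]
pred = ["positive", "negative", "positive", "positive"]

score = macro_f1(gold, pred, labels)
print(round(score, 2))  # 0.6
```

Running the same function over the outputs of each candidate model and prompting strategy gives comparable scores to weigh against each model's inference cost.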
Topics
- Portuguese Sentiment Analysis
- Open-Source LLMs
- Prompting Strategies
- Knowledge Distillation
- Computational Efficiency
Best for: AI Engineer, Machine Learning Engineer, Research Scientist, AI Scientist, NLP Engineer, Prompt Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Paper Index on ACL Anthology.