CURUPIRA: Clever guard for harm and linguistic prompt mitigation in Brazilian Portuguese
Summary
Curupira is a Brazilian Portuguese-language guard model developed to mitigate harmful prompt exploitation in Large Language Models, addressing challenges in multilingual safety. This model was created using a three-step methodology involving adaptation, data generation, and fine-tuning. Researchers evaluated Curupira against two open guardrail architectures, demonstrating that targeted fine-tuning significantly improves safety classification for Portuguese prompts. The evaluation also revealed favorable efficiency-performance trade-offs for compact models and minimal degradation during cross-lingual assessment. This work, presented at the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) in Salvador, Brazil, highlights the importance of language-specific safety measures for LLMs.
Key takeaway
For research scientists developing LLMs for diverse linguistic contexts, you should consider implementing language-specific guard models like Curupira. This approach, particularly with targeted fine-tuning, can significantly enhance safety classification for underrepresented languages such as Brazilian Portuguese, ensuring more robust and culturally appropriate model deployment. Prioritize evaluating both efficiency and cross-lingual performance to maintain broad applicability.
Key insights
Targeted fine-tuning improves LLM safety classification for underrepresented languages like Brazilian Portuguese.
Principles
- Multilingual LLM safety is challenging.
- Language-specific guard models enhance safety.
Method
The methodology involves adaptation, data generation, and fine-tuning to create language-specific guard models for prompt mitigation.
In practice
- Develop language-specific guard models.
- Utilize targeted fine-tuning for safety.
- Evaluate cross-lingually for robustness.
Topics
- CURUPIRA
- Large Language Models
- Brazilian Portuguese
- Prompt Mitigation
- Safety Classification
Best for: Research Scientist, AI Scientist, NLP Engineer, AI Security Engineer
Related on AIssential
Editorial summary, takeaway, and curation by AIssential. Original article published by Paper Index on ACL Anthology.