Lexical and Syntactic Characterization of Fake News in Portuguese Produced by Humans and by Machines
Summary
The study analyzes Portuguese fake news written by humans and generated by machines, using the Fake.br and FakeTrueBR corpora augmented with automatically generated fake news. It characterizes these texts lexically and syntactically to distinguish human from machine authorship. Machine-generated fake news contains significantly longer words, uses adjectival modifiers more often, and shows lower syntactic diversity even though it employs more syntactic rules per sentence. Human-produced fake news, by contrast, shows greater stylistic variability across all examined lexical and syntactic dimensions.
Key takeaway
For NLP engineers building fake news detection systems for Portuguese, word length, adjectival modifier usage, and syntactic diversity are candidate features worth testing. The study reports that these dimensions differ between human and machine-generated deceptive content, so they may help a model tell the two apart.
Key insights
Machine-generated Portuguese fake news shows distinct lexical and syntactic patterns compared to human-produced content.
Principles
- Machine texts use longer words.
- Human texts show greater stylistic variability.
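One way to make "stylistic variability" concrete is to measure how much a feature spreads across the documents in each group. The sketch below uses the coefficient of variation (standard deviation divided by the mean). This is an illustrative choice, not necessarily the measure used in the study, and the per-document values are made-up placeholders rather than results from the corpora.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Spread of a feature relative to its mean (0.0 for an empty or zero-mean list)."""
    if not values:
        return 0.0
    m = mean(values)
    return pstdev(values) / m if m else 0.0

# Placeholder per-document mean word lengths (NOT data from the study).
human_docs = [4.0, 5.0, 6.0]
machine_docs = [4.9, 5.0, 5.1]

human_cv = coefficient_of_variation(human_docs)
machine_cv = coefficient_of_variation(machine_docs)
print(f"human CV={human_cv:.3f}, machine CV={machine_cv:.3f}")
```

A higher value for one group means that its documents differ more from one another on that feature. The same function can be applied to any per-document feature.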
Method
The study expanded the Fake.br and FakeTrueBR corpora with automatically generated fake news, then characterized the texts lexically and syntactically to differentiate human from machine authorship.
In practice
- Analyze word length for machine detection.
- Check adjectival modifier frequency.
- Assess syntactic diversity in content.
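The three checks above can be sketched as simple feature functions. This is a minimal sketch under stated assumptions: the input is already parsed into (token, dependency relation) pairs by some Universal Dependencies parser (the parser itself is not shown), `amod` is the UD label for adjectival modifiers, and the diversity score is a distinct-over-total ratio of relation labels used here as a rough proxy. The study's own definitions of syntactic diversity may differ, and all function names are hypothetical.

```python
# Assumed input: each sentence is a list of (token, deprel) pairs
# produced beforehand by a Universal Dependencies parser.

def mean_word_length(sentences):
    """Average character length of alphabetic tokens (punctuation is skipped)."""
    words = [tok for sent in sentences for tok, _ in sent if tok.isalpha()]
    return sum(len(w) for w in words) / len(words) if words else 0.0

def amod_rate(sentences):
    """Share of dependency relations labeled 'amod' (adjectival modifier)."""
    rels = [rel for sent in sentences for _, rel in sent]
    return rels.count("amod") / len(rels) if rels else 0.0

def relation_diversity(sentences):
    """Distinct relation labels divided by total relations (a rough proxy)."""
    rels = [rel for sent in sentences for _, rel in sent]
    return len(set(rels)) / len(rels) if rels else 0.0

# Hand-annotated toy sentence for illustration only.
sample = [[
    ("Governo", "nsubj"), ("anuncia", "root"), ("nova", "amod"),
    ("medida", "obj"), ("importante", "amod"), (".", "punct"),
]]

features = {
    "mean_word_length": mean_word_length(sample),
    "amod_rate": amod_rate(sample),
    "relation_diversity": relation_diversity(sample),
}
print(features)
```

In a detector, values like these would be computed per document and fed to a classifier alongside other features. Whether they improve accuracy on a given dataset is something to verify empirically.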
Topics
- Fake News Detection
- Generative AI
- Lexical Characterization
- Syntactic Characterization
- Portuguese Language
Best for: AI Scientist, NLP Engineer, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Paper Index on ACL Anthology.