Lexical and Syntactic Characterization of Fake News in Portuguese Produced by Humans and by Machines

Source: Paper Index on ACL Anthology · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Natural Language Processing) · Depth: Expert, quick

Summary

The study characterizes Portuguese fake news written by humans and generated by machines. It uses the Fake.br and FakeTrueBR corpora, augmented with automatically generated fake news, and compares the two sources along lexical and syntactic dimensions. Machine-generated fake news contains significantly longer words and a higher frequency of adjectival modifiers. It also shows lower syntactic diversity, even though it uses more syntactic rules per sentence. Human-produced fake news, by contrast, shows greater stylistic variability across all examined lexical and syntactic dimensions.
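
One way to make "greater stylistic variability" concrete is to compare how widely a per-document feature spreads within each group. The sketch below uses the coefficient of variation (standard deviation divided by the mean); this measure is an assumption chosen for illustration, not necessarily the one the paper uses, and the numbers are made-up placeholders, not values from Fake.br or FakeTrueBR.

```python
from statistics import mean, pstdev


def dispersion(values):
    """Coefficient of variation: population std dev divided by the mean."""
    m = mean(values)
    return pstdev(values) / m if m else 0.0


# Hypothetical per-document mean word lengths (placeholders, NOT corpus results).
human_docs = [4.1, 5.6, 4.8, 6.2, 3.9]
machine_docs = [5.4, 5.6, 5.5, 5.7, 5.5]

for name, docs in (("human", human_docs), ("machine", machine_docs)):
    print(f"{name}: mean={mean(docs):.2f} dispersion={dispersion(docs):.3f}")
```

A higher dispersion for one group means its documents differ more from one another on that feature, which is the sense of "variability" in the summary above.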

Key takeaway

For NLP engineers building fake news detection systems for Portuguese, the findings point to word length, adjectival modifier frequency, and syntactic diversity as candidate features. Given the stylistic differences observed, these lexical and syntactic characteristics may help separate human-written from machine-generated deceptive content, though any accuracy gain should be validated empirically.

Key insights

Machine-generated Portuguese fake news shows distinct lexical and syntactic patterns compared to human-produced content.

Method

The study expanded the Fake.br and FakeTrueBR corpora with automatically generated fake news, then characterized the texts lexically and syntactically to distinguish human from machine authorship.
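
As a rough illustration of what such a characterization can look like, the sketch below computes mean word length, adjectival modifier rate, and two rule-based measures from text already parsed into CoNLL-U format. Several things here are assumptions rather than the paper's method: the use of Universal Dependencies annotation (where `amod` marks adjectival modifiers), the approximation of a "syntactic rule" as a (head POS, relation, dependent POS) triple, and the helper names `read_sentences` and `characterize`. The sample sentence is hand-written, not taken from the corpora.

```python
from collections import Counter

# Hand-written CoNLL-U sample (illustrative only, not from Fake.br or FakeTrueBR).
SAMPLE = """\
1\tO\to\tDET\t_\t_\t2\tdet\t_\t_
2\tgoverno\tgoverno\tNOUN\t_\t_\t4\tnsubj\t_\t_
3\tfederal\tfederal\tADJ\t_\t_\t2\tamod\t_\t_
4\tanunciou\tanunciar\tVERB\t_\t_\t0\troot\t_\t_
5\tmedidas\tmedida\tNOUN\t_\t_\t4\tobj\t_\t_
6\turgentes\turgente\tADJ\t_\t_\t5\tamod\t_\t_
7\t.\t.\tPUNCT\t_\t_\t4\tpunct\t_\t_
"""


def read_sentences(conllu):
    """Split CoNLL-U text into sentences, each a list of token dicts."""
    sentences, current = [], []
    for line in conllu.splitlines():
        if not line.strip():          # a blank line ends a sentence
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):      # sentence-level comment
            continue
        cols = line.split("\t")
        if not cols[0].isdigit():     # skip multiword ranges (1-2) and empty nodes (1.1)
            continue
        current.append({"id": int(cols[0]), "form": cols[1], "upos": cols[3],
                        "head": int(cols[6]), "deprel": cols[7]})
    if current:
        sentences.append(current)
    return sentences


def characterize(sentences):
    """Compute simple lexical and syntactic features for parsed sentences."""
    words = [t for s in sentences for t in s if t["upos"] != "PUNCT"]
    rules = Counter()
    for s in sentences:
        upos_by_id = {t["id"]: t["upos"] for t in s}
        for t in s:
            # Assumed stand-in for a "syntactic rule": (head POS, relation, dependent POS).
            head_upos = upos_by_id.get(t["head"], "ROOT")
            rules[(head_upos, t["deprel"], t["upos"])] += 1
    total_rules = sum(rules.values())
    return {
        "mean_word_length": sum(len(t["form"]) for t in words) / len(words),
        "amod_rate": sum(t["deprel"] == "amod" for t in words) / len(words),
        "rules_per_sentence": total_rules / len(sentences),
        "rule_diversity": len(rules) / total_rules,  # distinct rules / rule uses
    }


feats = characterize(read_sentences(SAMPLE))
for name, value in feats.items():
    print(f"{name}: {value:.3f}")
```

Keeping `rules_per_sentence` and `rule_diversity` separate reflects the distinction in the findings: a text can apply many rules per sentence while drawing on a small set of distinct rules.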

Best for: AI Scientist, NLP Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Paper Index on ACL Anthology.