Towards a Linguistic Evaluation of Narratives: A Quantitative Stylistic Framework

Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Natural Language Processing · Depth: Expert, quick

Summary

This work introduces a quantitative framework that evaluates narrative quality from linguistic features. The method extracts 33 quantitative features, grouped into lexical, syntactic, and semantic categories, and uses them to assess narratives automatically. In an experiment on a corpus of 23 books, spanning canonical masterpieces and self-published works, a similarity matrix built from these features clustered the narratives and separated professionally edited texts from self-published ones with high accuracy. The method was also validated against a human-annotated dataset, where it significantly outperformed traditional story-level evaluation metrics, supporting the use of linguistic features as indicators of narrative quality.

Key takeaway

For research scientists building automated content analysis tools, this framework offers a quantitative, feature-based way to evaluate narrative quality. Consider adding quantitative linguistic features, such as those proposed here, to your models to sharpen narrative assessment and to distinguish professionally edited text from self-published text.

Key insights

Linguistic features can quantitatively assess narrative quality, distinguishing professional from self-published works.

Method

The method extracts 33 quantitative linguistic features (lexical, syntactic, semantic) from each narrative, then builds a similarity matrix over those features to cluster and evaluate the texts. The results are validated against human annotations.
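
As a rough illustration of the feature-then-similarity pipeline described above, the following Python sketch computes three simple lexical features per text and compares texts pairwise. It is not the paper's implementation: the three features (type-token ratio, mean word length, mean sentence length) are illustrative stand-ins for the 33 features, cosine similarity is an assumed choice because the summary does not say which similarity measure is used, and all function names are hypothetical.

```python
import math
import re


def lexical_features(text):
    """Return three illustrative features (assumed, not the paper's 33):
    type-token ratio, mean word length, mean sentence length in words."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    type_token_ratio = len(set(words)) / len(words)
    mean_word_length = sum(len(w) for w in words) / len(words)
    mean_sentence_length = len(words) / len(sentences)
    return [type_token_ratio, mean_word_length, mean_sentence_length]


def standardize(rows):
    """Z-score each feature column so no single scale dominates the similarity."""
    columns = list(zip(*rows))
    means = [sum(c) / len(c) for c in columns]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(columns, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_matrix(texts):
    """Pairwise cosine similarity over standardized feature vectors.
    The choice of cosine similarity is an assumption for this sketch."""
    vectors = standardize([lexical_features(t) for t in texts])
    return [[cosine(u, v) for v in vectors] for u in vectors]
```

Calling `similarity_matrix` on a list of book texts returns a square, symmetric matrix in which stylistically similar texts score closer to 1. Such a matrix could then be passed to any standard clustering routine to group the texts.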

Best for: Research Scientist, AI Scientist, NLP Engineer, Data Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.