Em dash
Summary
The author addresses accusations of using Large Language Models (LLMs) to generate blog content, asserting that their writing style generally lacks an "LLM smell." One exception is noted: em dashes are inserted into posts automatically. The Python snippet responsible, `s = s.replace(' - ', u'\u2014')`, has been part of the blog's infrastructure since at least 2015, when the blog was ported from an older version of Django to a new GitHub repository. That date predates the widespread public awareness and use of advanced LLMs.
Key takeaway
If you are concerned that your writing may be mistaken for AI-generated text, review your publishing pipeline for automated stylistic modifications. Knowing when such a feature was introduced, as with the em dash replacement dating to at least 2015, can serve as evidence against accusations of LLM use.
Key insights
Automated text processing, such as replacing spaced hyphens with em dashes, can inadvertently give human-written text characteristics associated with LLM output.
Principles
- Code history can show when a feature was introduced.
- A stylistic habit associated with LLMs may predate their prevalence.
Method
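Em dashes are inserted with a simple string replacement, `s = s.replace(' - ', u'\u2014')`, in the blog's code, which dates back to at least 2015 and a port from an older Django version. The result must be assigned back to `s` because Python strings are immutable. Because the replacement string contains no spaces, a hyphen surrounded by spaces becomes an em dash with no spaces around it.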
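The replacement can be sketched in modern Python as follows. This is a minimal illustration, not the blog's actual code: the function name `add_em_dashes` is hypothetical, and only the `replace` call comes from the source.

```python
def add_em_dashes(s: str) -> str:
    """Replace each space-delimited hyphen with an em dash (U+2014).

    Hypothetical helper name; only the replace() call is from the source.
    """
    # The u'' prefix in the original snippet is a Python 2 holdover. In
    # Python 3 every str literal is already Unicode, so it can be dropped.
    # str.replace returns a new string, so the result must be returned
    # (or assigned) rather than discarded.
    return s.replace(" - ", "\u2014")


print(add_em_dashes("one notable exception - em dashes"))
```

Hyphens without surrounding spaces, as in "well-known", are left untouched because the pattern only matches a hyphen with a space on each side.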
In practice
- Review code for legacy text processing.
- Document stylistic automation.
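One way to approach the review step is to scan a codebase for lines that write an em dash, either as the literal character or as the `\u2014` escape. The sketch below rests on several assumptions: the function name `find_em_dash_lines` is hypothetical, and restricting the scan to `.py` files is an arbitrary choice for illustration.

```python
from pathlib import Path

# Markers that suggest a line emits an em dash: the literal character,
# or the escape sequence as it appears in source text.
EM_DASH_MARKERS = ("\u2014", "\\u2014")


def find_em_dash_lines(root: str) -> list[tuple[str, int, str]]:
    """Return (path, line number, stripped line) for each match under root."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if any(marker in line for marker in EM_DASH_MARKERS):
                hits.append((str(path), number, line.strip()))
    return hits
```

Each hit can then be checked against version control history to find out when the line was introduced.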
Topics
- LLM Content Generation
- Blog Development
- Python String Manipulation
- Django Framework
- Em Dash Formatting
Best for: Software Engineer, General Interest
Editorial summary, takeaway, and curation by AIssential. Original article published by Simon Willison's Weblog.