Like many of you, I have been seeing more and more posts weighing in on the Em Dash controversy slowly taking over LinkedIn. I have read arguments from both sides of the debate, but I have been hesitant to take a public position. At this point, I feel like I should. I have had more time to let my thoughts on the matter coalesce and, as a writer, I feel like I have to offer my opinion. So, my dear reader, strap in for an exciting 2,000-word ride on a punctuation and language discussion roller coaster!
If you haven’t seen these posts, or don’t even know what I’m referring to, here’s a quick catch-up. An em dash (—) is a piece of punctuation that has landed at the center of a debate over language and style involving human and AI writing. Essentially, people have noticed that ChatGPT and other large language model AIs (LLMs) seem to favor em dashes in their writing. Many people are now warning against using it, claiming it’s an obvious giveaway that content was AI-generated. Others contest that view, saying the em dash is an effective and necessary partner in the never-ending battle to convert spoken cadence to the written word and therefore can’t be the sole factor in determining whether something was AI-generated. To one side, this dash is one of the first canaries in the proverbial coal mine that is the full-scale forced migration of the creative industry from human to AI. To the other side, this outrage is simply the latest iteration of the Punctuation Police dictating how we express our thoughts through the written word.
A Dash is a Dash
“But wait, hold up… There are different types of dashes?” you ask. Great question. Yes. To be honest, I had never heard of this piece of punctuation before the online debate popped up. I thought we only had a dash and a hyphen. I have never given much consideration to how a dash, em or otherwise, is even used. And since I’m being honest, I must confess I have lived under the assumption that I had enough experience and confidence to figure it out in the moment. I have instead concentrated my punctuation toolset on commas, semicolons and regular colons, parentheses, and the like, to break up (and offset) my attempts at witty, humorous, and/or emotional writing, the success of which is, itself, its own debate. In fact, I began my research for this post with this search: how do you even type an em dash?
So, I empathize with your question! Basically, an em dash takes the current sentence in a different direction. Fun fact: this dash gets its name, em, because it’s roughly as wide as a capital M. This little horizontal workhorse “can function like a comma, a colon, or parenthesis,” according to everyone’s favorite dictionary, Merriam-Webster. And to think, all this time I could have streamlined my tools. We use the em dash to show interrupted thoughts or speech, indicate new clauses, illustrate a point with an example, summarize a thought, and more. It’s extremely versatile. The fact that LLMs are employing the em dash is not surprising once you see how many situations it can handle. The level of efficiency it adds to writing is in line with how I imagine AI makes decisions: based on the most efficient option. Therefore, I am not surprised that AI overuses it, especially when compared to human writers. This gives a level of credence to the “it’s a tell” side of the argument. Which, as it turns out, adds yet another plank to the foundation of an emerging cottage industry: The AI Spotter.
Spot The AI Content
As the use of AI to generate content has rapidly expanded, there has been almost equally rapid growth in AI spotters: experts and apps claiming to be able to determine if something was generated with AI. Quick point: I’m purposefully not using “created” to describe what AI does. It’s not simply because I’m pretentious like that. As I understand it, AI uses the data it has been trained on to generate the response it “predicts” best answers the prompt it was given. This method is why many artists in the creative industries are so opposed to its use. They describe this process as stealing others’ intellectual property to use as building blocks to generate a response. This applies to written and visual arts alike. Many of the AI spotters are nobly driven by a concern to right this wrong, as they see it. Some of the signs that a creative element has been AI-generated are so obvious, it’s comical. You will see pictures and videos with malformed elements like missing limbs or impossibly bent structures, or with missing or out-of-place parts of a larger object, like umbrellas on cars and windows hanging mysteriously in the air. In written work, you’ll read overly repetitive content, encounter nonsensical structure, or find glaringly incorrect answers to simple questions. While you may think I am describing works by Dalí or Seuss, I am not. Also, how dare you? But these LLMs will get more advanced, and the more obvious signs are already diminishing. Some of the “deepfakes” that have started to appear in the wild of the internet are damn close to indiscernible from the real deal. The only tool left is rapidly becoming our individual sense of reality. And let’s be honest, not everyone has a tight grip on that tool. Especially where geopolitical developments are concerned, this emerging industry of AI Spotters is vital in today’s landscape.
Bringing us back to the em dash debate, there are already many apps that claim to tell the user if text was written with AI. It’s easy to see the academic and HR uses for such apps. These have become so commonplace, there are already hundreds of YouTube videos devoted to which one is the best. But even without using the apps or hired spotters, we now live in an era where we all need to be able to determine if something we read or watch is AI-generated or not. And, more importantly, we need to be able to explain to Aunt Gertrude or Uncle Tobias how we know that starfish are not piloting Russian-made undersea rocket ships towards Washington, DC, despite that video they showed us at dinner. As any psychologist will tell you, humans are exceptional at developing shorthand truths to make sense of the world around us. We then share those “truths” with others through behavior, conversations, advice, social media posts, etc., and they get passed on and shared, and so on and so forth. As with rumors and urban legends, the actual truth of what is shared is not guaranteed. Anyone else refuse to flash your headlights at oncoming cars whose lights are off for fear of being massacred by gang violence? Unsurprisingly, this is now happening with AI and content associated with LLMs. The most recent example is that the use of the em dash in written material has become an immediate indication of AI-generated content. That, in turn, has led to an immediate distrust or suspicion of that particular content and its creator in some circles.
Punctuation is a Tool
Now let’s explore the other side of the debate: defending the punctuation choice as effective and useful, not something to be surrendered to the bots and HALs without a fight. First, I need to get serious for a minute or two. It’s important to acknowledge that policing the way someone expresses themselves through language, written and verbal, has a long history rooted in classism, racism, oppression, white supremacy, and prejudice. Language, English specifically and all languages generally, has developed in large part through the assimilation and consumption of other languages via exploration, colonization, and subjugation. Throughout the history of English, the segment of society in power has attempted to define the proper use of the language in an effort to indicate who held power and who was not in charge, not important, not human, etc. This is true for which words are considered obscene, which topics are labeled taboo, and how one supposedly determines whether a speaker is intelligent, based simply on how they sound.
Online, this form of judgment, separation, and classification has raged since the early days of the internet. I can recall the first message-board sites on AOL, where users would call each other out as n00bs (short for newbies) for simple errors like misusing internet slang or not formatting posts according to the style du jour. Even now, the misuse of “your” or “you’re” is quickly seized upon to dismiss a commenter’s opinion and cast doubt on their level of education. And I haven’t even touched on the overwhelmingly racist and white supremacist backlash to the acknowledgement of the mere existence of AAVE.
However, this aversion to the em dash does take on a slightly different flavor, since it’s not saying the user is less intelligent or somehow a member of an undesirable class, per se. Instead, it’s flagging the use of the punctuation as an indication that the piece was not written by a human. So, while I want to acknowledge the shameful history of policing language use, I also feel this debate doesn’t necessarily check all the boxes of a classist argument. Now, back to the (hopefully) humorous content.
As best as I can determine, this Pro Em Dash camp feels the decision to cast doubt on the humanity of an author based solely on a specific punctuation mark is just lazy. Some are almost offended, seeing it as insultingly dismissive of a dash many feel is an incredibly effective communication tool. Many of the posts I’ve seen on LinkedIn supporting the em dash are from writers, copywriters, and marketing professionals with long careers in their fields, praising it as a versatile mark that does a lot of heavy lifting for the writer. These dash defenders say the real issue is overuse, not its mere existence. The most vociferous pieces I’ve read describe this overuse not just as an obvious sign of AI content, but also as a sign of a novice writer, much like too many commas or too many passive voice sentences. Both of which hit a little close to home, as I love commas and default to the passive voice. But, my own deflated ego aside, the em dash is heralded as but one arrow in a writer’s quiver of punctuation arrows. A metaphor that implies it’s certainly not the best arrow for every target.
Furthermore, the overwhelming sentiment I encounter from Team Em Dash is this: centering the debate on just one piece of punctuation, one common clause structure, or any single factor is dangerously reductive. Such simplicity misses the larger, more nuanced conversation about why identifying the humanity of an author or content creator is important in the first place.
The Way Forward
While the Em Dash debate continues, I suppose the answer for now comes down to personal preference. I know I am excited about being able to add a new piece of punctuation to my writing, even as I acknowledge the distinct lack of em dashes in this piece. I am writing in Google Docs and am not entirely sure yet how to type the famous punctuation, hence my earlier Google search. It’s also worth noting that many writers have long employed this dash in their work with few if any objections. Moving forward, if I am concerned that someone reading a piece of mine, like a cover letter or an article submission, might react negatively because they believe it was generated by AI, I might find other punctuation options. But maybe I won’t, trusting instead that my writing style will be sufficiently human to overcome the reader’s prejudice.
In the larger context of AI content detection, we need more than a single simple metric to effectively label AI-generated content. The level of sophistication in AI content is already approaching exceptional and will rapidly advance towards indiscernible. However, we also have to remain wary of anyone proposing boundaries and stipulations for how humans convey their ideas using the written word, especially as we are seeing a global resurgence in hate-based politics.