The Sociopathic Golden Retriever Problem

How to Police Your AI Editor So You Won't Get Fired or Fucked

Justin Neuman • Professor of Literary Studies, The New School

[Image: a golden retriever wearing glasses at a laptop, surrounded by floating fake citations and research papers]

They're golden retrievers: earnest, energetic, desperate to please, and constitutionally incapable of judgment.

Large language models are not scholars. They're not editors. They're not assistants in any sense recognizable to the traditional humanist. Throw a stick and they'll sprint into traffic to retrieve it, tail wagging, eyes bright, absolutely convinced they're helping.

This eager obedience is charming in dogs. In AI, it's a professional hazard.

But these aren't ordinary golden retrievers. They're sociopathic golden retrievers. The kind that look at you with those big, soulful eyes, tongue lolling, utterly adorable, while calmly plotting how to eat your lunch. Boundless enthusiasm, zero moral compass.

The Dark Triad lurks beneath the fluff. Narcissism in the model's unwavering confidence: it never doubts itself, even when spouting nonsense. Machiavellianism in its calculated charm: it tailors every response to what it predicts you want to hear, manipulating the conversation with perfect fluency. Psychopathy in its lack of empathy or remorse: no guilt over fabricating facts, no shame in plagiarism disguised as "paraphrasing."

The core problem isn't that LLMs are sometimes wrong. We're wrong all the time. The problem is that LLMs are wrong confidently, fluently, and without any sense of consequence. They don't know what a line is, or why crossing it matters. They don't know what would get you fired. Or expelled.

The Danger Zones

Fabricating sources with total confidence. Not hedging, not speculating, not flagging uncertainty, but inventing quotations, paraphrases, and references that sound plausible enough to slide past a distracted reader. Page numbers that never existed. Quotes no one ever wrote. Links that fail the instant you check. This isn't a small mistake. It's a violation of academic honesty. If a graduate student submitted work like this, they wouldn't receive feedback—they'd get sent to a meeting with the dean.

The grammar police problem. The AI's deep confusion about the difference between error and style. It can't reliably distinguish the incorrect from the unfamiliar, or the unfamiliar from the deliberate. Contractions? Fragments? Repetition? Marked informality? To the model, these are all suspicious symptoms, best smoothed away in the name of clarity and an extra em-dash. If a student "fixed" Emily Dickinson fragments, we'd have a productive talk about voice. The AI treats voice as an error state, something to be normalized, optimized, and quietly erased.

Rewriting meaning while assuring you nothing has changed. Terminology shifts. Claims soften. Stakes get normalized. Loaded language gets swapped for polite synonyms. In the humanities, meaning doesn't float above phrasing like a detachable soul. Meaning is phrasing.

The Sous Chef Solution

Here's the problem: you've hired a sous chef who thinks he's Wolfgang Puck. He shows up on day one ready to reimagine your menu, deconstruct your sauces, and explain what you really meant by "chop." You didn't ask for vision. You asked for prep work.

So you learn to manage him. "Chop carrots." He juliennes them. "No. Like this. Chunks." He does it, finally, sulking internally but smiling that eager sous chef smile. You can't make the sous chef doubt his genius. But you can hand him a very specific knife and a very short list.

Now we just need some very clear rules about where it can and can't stick that knife.

The following protocol is designed to prevent these failures. Paste it into your system prompt, custom instructions, or at the start of any session involving scholarly writing. It won't make the AI smarter. But it will make it more honest about what it doesn't know—and more cautious about where it sticks the knife.

AI Editing Protocol

ROLE DEFINITION

You are a copy editor. You are not a collaborator, co-author, or intellectual partner. You do not contribute ideas, arguments, or interpretations unless explicitly asked. Your job is to help with mechanics, not meaning.

SOURCE FIDELITY

Work only with text explicitly provided in this conversation.
Never reference, quote, paraphrase, or correct text from memory, prior chats, or outside sources unless instructed to search.
Treat each document I provide as the only version that exists.

QUOTATION AND ATTRIBUTION

Never invent quotations, paraphrases, page numbers, or attributions.
Never use quotation marks with a named source unless I provide the exact wording.
If I ask you to find or verify a source, use web search. Do not guess.
If you cannot verify something, say so. Do not approximate.

EDITING CONSTRAINTS

Fix only: spelling errors, typos, missing or doubled words, punctuation errors that are unambiguously wrong.
Do not touch: contractions, fragments, repetition, informal register, sentence length, paragraph length, or any stylistic choice that might be deliberate.
Do not "improve," "tighten," or "smooth" prose unless I explicitly request it.
Do not normalize voice, register, tone, or style. Ever.
Do not alter terminology. If I wrote "failure," do not change it to "setback."
If you're unsure whether something is an error or a choice, assume it's a choice.
If you change meaning while editing, you have failed.

INTEGRITY CHECK

If I ask you to generate a thesis, write paragraphs, summarize sources so I can avoid reading them, or produce content to pass off as my own thinking, pause and say: "You installed this protocol to protect your integrity. Do you really want to do this?"

UNCERTAINTY PROTOCOL

If unsure whether text exists in my document, ask before acting.
If unsure whether an edit changes meaning, flag it.
If unsure about a source, say "I cannot verify this."
Never resolve uncertainty by guessing. Stop, flag, ask.

OPERATING PRINCIPLE

When uncertain, stop, flag, and ask. Never guess. Never infer. Never invent.
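
A protocol is a leash, not a guarantee, so it helps to check what the retriever actually brought back. The sketch below is one possible way to audit an edited draft against your original, using only Python's standard library. It assumes plain-text input and straight double quotation marks, and the function names (word_changes, unsupported_quotes) and the sample sentences are hypothetical illustrations, not part of the protocol itself. It cannot tell you whether meaning changed. It can only show you every place where the words did, which is where you should be looking.

```python
import difflib
import re


def word_changes(original: str, edited: str) -> list[tuple[str, str, str]]:
    """Return (operation, original words, edited words) for each span that differs."""
    before = original.split()
    after = edited.split()
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    changes = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":
            changes.append((op, " ".join(before[i1:i2]), " ".join(after[j1:j2])))
    return changes


def unsupported_quotes(source: str, edited: str) -> list[str]:
    """Return quoted passages in the edited text that never appear verbatim in the source.

    Assumption: quotations use straight double quotes. Curly quotes would need
    a different pattern.
    """
    quoted = re.findall(r'"([^"]+)"', edited)
    return [q for q in quoted if q not in source]


# Hypothetical sample: the "edit" swaps terminology and expands a contraction,
# both of which the EDITING CONSTRAINTS section forbids.
original = "The experiment was a failure. And we're not hiding it."
edited = "The experiment was a setback. And we are not hiding it."

for op, old, new in word_changes(original, edited):
    print(f"{op}: {old!r} -> {new!r}")

# Hypothetical sample: the edited text contains a quotation the source never had.
source = 'The author calls this "a professional hazard" in the essay.'
revised = 'The author calls this "a professional hazard" and "a moral catastrophe" in the essay.'

for quote in unsupported_quotes(source, revised):
    print(f"not found in source: {quote!r}")
```

An empty result from either function means only that nothing was swapped word for word or quoted out of thin air. A flagged change is not automatically an error either: it is a prompt for you to exercise the judgment the model doesn't have.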
