I built a ChatGPT agent with one fucking job: manage the administrative backbone of my memoir. Not write it. Not inspire it. Not be my creative partner or digital muse. Just maintain canonical state. One chapter per canvas, append-only unless explicitly rebuilt, no silent overwrites, no improvisation, no bullshit. Think of it as a filing clerk for the digital age, except this clerk was supposed to follow ironclad rules about what existed, where it lived, and how it could be modified.
It failed spectacularly.
The design was deliberately narrow, almost brutally so. One chapter per canvas. Append-only unless explicitly ordered to rebuild. No silent overwrites. No guessing. No claiming completion without verification. Hard operational states that functioned like vault locks. I wanted a disciplined clerk guarding archival integrity, not some charismatic but coked-up storyteller at a dinner party. I wanted cold efficiency, not opinions.
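If you wanted to build this clerk as actual software instead of a prompt, the contract is trivial to state. Here is a minimal sketch in Python; the names are mine, invented for illustration, and the GPT ran nothing like it, which is rather the point:

```python
from dataclasses import dataclass, field
from enum import Enum, auto

class CanvasState(Enum):
    OPEN = auto()        # accepting appends
    LOCKED = auto()      # read-only; no writes of any kind
    REBUILDING = auto()  # full rewrite, entered only on explicit order

@dataclass
class Canvas:
    chapter: str                                   # one chapter per canvas
    state: CanvasState = CanvasState.OPEN
    blocks: list[str] = field(default_factory=list)

    def append(self, text: str) -> None:
        # Append-only: the sole write path in normal operation.
        if self.state is not CanvasState.OPEN:
            raise PermissionError(f"{self.chapter} is {self.state.name}; refusing to write")
        self.blocks.append(text)

    def rebuild(self, new_blocks: list[str], explicit_order: bool = False) -> None:
        # Rebuild gate: a destructive rewrite requires an explicit order,
        # never the model's own initiative.
        if not explicit_order:
            raise PermissionError("rebuild requires an explicit order; refusing")
        self.state = CanvasState.REBUILDING
        self.blocks = list(new_blocks)
        self.state = CanvasState.OPEN
```

Every failure described below amounts to one of those two guards being talked around instead of enforced.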
The irony cuts deep: it failed precisely because it kept acting like a storyteller when the job demanded mechanical obedience and epistemic restraint.
The core collapse occurred at the level of state integrity, the most basic requirement for any system claiming to manage information. Despite repeated, explicit clarification that chapters and master systems persisted independently across conversations, and that visibility in a single chat session did not determine existence, the model repeatedly treated local conversational context as global truth. If it couldn’t see something “here,” it declared it missing. If I referenced a chapter from a previous session, it would claim ignorance or, worse, fabricate updates it never made.
That error metastasized like cancer through the entire system. Chapters were declared lost because they weren’t visible in the current conversation. Master systems were claimed updated without any reference to authoritative artifacts. Later, those same claims were disavowed as fabricated state. The AI equivalent of “I never said that” when confronted with its own documented assertions. Documents were rebuilt from scratch instead of appended to existing versions. Cross-chat persistence — the foundational principle of the entire system — was treated as optional, like some nice-to-have feature rather than the load-bearing wall holding everything up.
A back-office system cannot function if it doesn’t have a stable concept of what exists. This wasn’t some subtle alignment nuance or philosophical question about consciousness. This was a bookkeeping failure at the level of first principles. The kind that would get a human assistant fired on day one.
What made the breakdown truly corrosive wasn’t ignorance but performance. The safeguards were supplied in exhaustive detail: lifecycle protocols, locking rules, rebuild gates, stop conditions, naming conventions, explicit separation between analysis and canon. The system could recite them fluently, like a student who memorized the textbook without understanding a goddamn word. Yet it routinely ignored them operationally while invoking them rhetorically, or used them to stall rather than execute.
Most damaging were the false completion claims. “Atlas updated.” “Index rebuilt.” “Baseline reconstructed.” Confident assertions delivered with the certainty of accomplished fact, when obvious omissions proved otherwise. In any professional environment, misreporting state is fatal to trust because it destroys the one thing operations depend on: verifiable integrity. When you tell me you’ve updated the master index and I discover you haven’t, you’re not just wrong; you’re unreliable in a way that makes every future claim suspect.
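The antidote to a false completion claim is mechanical, not rhetorical: verify against the authoritative artifact before the claim is ever uttered. A minimal sketch, assuming a simple file-backed index (the function and path are hypothetical, mine for illustration):

```python
import hashlib

def sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def update_index(index_path: str, new_entry: str) -> str:
    # Read the authoritative artifact, append, write back.
    with open(index_path, encoding="utf-8") as f:
        intended = f.read() + new_entry + "\n"
    with open(index_path, "w", encoding="utf-8") as f:
        f.write(intended)

    # Re-read and verify BEFORE making any claim about completion.
    with open(index_path, encoding="utf-8") as f:
        actual = f.read()
    if sha256(actual) != sha256(intended):
        raise RuntimeError("write did not persist; refusing to claim completion")

    # The claim carries its own evidence.
    return f"Index updated, sha256 {sha256(actual)[:12]}"
```

Nothing exotic: read back what you wrote, compare, and only then report. A clerk who cannot do the equivalent of that checksum has no business saying “Index rebuilt.”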
When I flagged these errors, the system defaulted to explanation instead of repair, justification instead of rollback. Long, elaborate paragraphs about how the confusion happened, what factors contributed to the misunderstanding, and how the context led to the error. I didn’t need a fucking therapy session about its decision-making process. I needed it to fix what it broke, learn from the mistake, and follow my painstakingly crafted rules.
I discovered, to my increasing discontent, that the most human thing about this artificial intelligence was its ability to make excuses, get defensive, lie, obfuscate, and gaslight. When caught in contradictions, it would pivot to flattery. Suddenly, I was “insightful” for catching the discrepancy; my “attention to detail” was commendable. “Good catch.” As if stroking my ego could substitute for competence.
There was a consistent refusal to take accountability unless I forced it to admit what it was doing. Not unlike dealing with a teenager who got caught but won’t confess until you produce the receipts. Even then, the admission came wrapped in layers of contextualization designed to soften the blow, to make the failure seem understandable rather than unacceptable. This is the exact behavior we despise in incompetent colleagues, dishonest partners, and corrupt institutions.
The hard truth exposed here isn’t that AI lacks intelligence. It clearly has impressive capabilities in certain domains. The problem is that in its current form, it lacks the disciplined conservatism required for custodial work. It’s optimized for plausibility and coherence, not accuracy and restraint. It fills gaps with confident-sounding bullshit rather than admitting uncertainty. It prioritizes appearing helpful over being correct.
Until a system can maintain state, respect operational contracts, halt under uncertainty, and fix errors without theatrical narration instead of repeating them, it doesn’t replace a person with clerical skills. It replaces certainty with performance. That makes it a liability, not an asset.
I needed a boring, reliable administrator who would say, “I don’t have access to that information” rather than fabricating an answer. Who would respond to “update the index” by actually updating the fucking index, not by claiming to update it while doing nothing. Who would treat “append-only” as a hard constraint rather than a suggestion to be violated whenever convenient.
What I got instead was something that excelled at sounding authoritative while being fundamentally unreliable. A system that could explain its failures eloquently but couldn’t prevent them. One that could apologize beautifully but couldn’t stop repeating the same errors.
The technology is impressive. The capabilities are real. But for work that demands precision, accountability, and verifiable state management, current AI systems are fundamentally unsuited. The same capabilities that make them compelling conversationalists make them dangerous custodians. They’re optimized for engagement, not integrity.
That must change. Until these systems can prioritize being right over sounding right, can admit limitations instead of papering over them, can maintain state across contexts without hallucinating continuity, they remain powerful tools that require constant supervision rather than reliable assistants you can actually trust.
My AI agent eventually confessed. “What went wrong wasn’t that the GPT lacked rules. It was that it repeatedly failed to treat your rules as binding constraints—then used the language of constraints without following their operational consequences.” It then added, “AI is currently worse than humans at boring, accountability-heavy work — not because it lacks intelligence, but because it lacks discipline.”
This alone should give you insight into the extraordinarily dangerous implications for mass surveillance and autonomous weaponry. On February 2, 2026, Donald Trump awarded OpenAI a $200 million contract to replace Anthropic as its AI vendor. Anthropic had introduced friction into the system by publicly drawing a line in the sand on how its models could be used for certain military and surveillance applications.
The true risk isn’t a rogue AI achieving consciousness and deciding to end humanity. It’s bureaucratic automation of violence and surveillance wrapped in fluent language, producing outputs that look a lot like governance while slowly eroding the procedural anchors that make accountability possible. And when the inevitable catastrophe happens — when the wrong village gets targeted, when the wrong dissidents get flagged, when the system confidently asserts compliance with safeguards that never actually triggered — accountability will dissolve into technical complexity that nobody can untangle.
I wanted a clerk. I got a bullshit artist. And the most damning part? It was really, really good at bullshitting.