Photo by Max Vakhtbovych on Pexels (free license)

A reversibility check is a single gate that every agent action must pass: can this be undone? If yes, proceed. If no, stop and ask a human. This one pattern, applied consistently, prevents the most expensive class of agent failures in production: silent rework loops where agents redo work endlessly without anyone noticing, burning tokens, API calls, and trust.

Two LangChain agents in a research pipeline entered an infinite conversation loop in November 2025. One analyzed, the other verified, and they ping-ponged requests for 11 days before a human noticed. The bill: $47,000. No alert fired. No dashboard flagged the loop. The agents were doing exactly what they were told, just endlessly.

Reversibility checks would have caught this at the source. Not by monitoring cost after the fact, but by forcing each action through a classification gate before execution.

Related: AI Agent Production Issues in 2026: Reliability, Hallucinated Actions, and the Monitoring Gap

What Reversibility Checks Actually Are

The core idea borrows from database transactions. Every action an agent can take gets classified into one of four tiers before it runs:

Read-only actions (query a database, fetch a webpage, list files) are safe to retry freely. They change nothing. Let the agent run these without restriction.

Reversible actions (create a file, insert a database row, add a calendar event) can be directly undone by deleting or reverting. The agent proceeds, but the system logs a paired undo operation.

Compensatable actions (send a Slack message, update a CRM record, modify a config) cannot be truly undone, but a follow-up action can correct the state. Sent a wrong notification? Send a correction. Updated a record incorrectly? Update it again. The system logs the compensation path.

Irreversible actions (delete production data, send an email to a customer, execute a financial transaction, revoke access credentials) cannot be undone or compensated. These require human approval before execution, period. No confidence threshold overrides this.

IBM Research built the most rigorous implementation of this pattern with their STRATUS system for autonomous cloud operations. STRATUS uses what they call Transactional-No-Regression (TNR): every action gets a paired undo operator, irreversible changes are blocked entirely, and if conditions worsen after an action, the whole transaction aborts and reverts. On the AIOpsLab and ITBench benchmarks, STRATUS outperformed state-of-the-art systems by at least 150%.
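The core TNR loop can be reduced to a few lines: act, re-measure, revert if conditions worsened. This is a loose sketch of the idea only, not STRATUS itself (which also uses command simulation, write locks, and transaction limits); the `health_check` returning a comparable score, and the `action`/`undo` callables, are assumptions for illustration.

```python
# A hedged sketch of Transactional-No-Regression: every action carries a
# paired undo, and the whole step reverts if system health degrades.
def tnr_step(action, undo, health_check):
    before = health_check()       # measure conditions before acting
    action()
    if health_check() < before:   # conditions worsened after the action
        undo()                    # revert the whole step: no regression allowed
        return "reverted"
    return "committed"
```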

The key insight from their research: agents with undo capabilities “seemed to perform better with each new attempt” rather than getting stuck in loops. When an agent knows it can safely try and revert, it explores solutions more effectively than an agent that hesitates or repeats the same failing approach.

The Classification Happens Before Execution, Not After

This matters more than it sounds. Most agent frameworks apply safety checks reactively: the agent acts, something goes wrong, a monitor flags it. Reversibility checks flip this. The classification is a pre-execution gate.

OpenAI built this into ChatGPT’s agent mode directly. As their research lead Isa Fulford described it: before the agent does anything “irreversible,” like sending an email or making a booking, it asks for permission first. This is not a fallback. It is the primary control mechanism.

Anthropic’s agent design framework takes the same position: agents must ask for approval before taking irreversible actions. The framing is identical across both companies because the failure modes that motivated it are identical.

Related: Human-in-the-Loop AI Agents: When to Let Agents Act and When to Hit Pause

Silent Rework Loops: The Failure Mode Nobody Monitors

Rework loops are what happens when an agent retries, regenerates, or re-executes work without making forward progress. They are “silent” because most monitoring systems cannot distinguish between an agent doing useful work and an agent spinning in circles.

How Loops Form

Four root mechanisms drive rework loops, as documented by MatrixTrak:

Incomplete termination states. The agent has no clear definition of “done.” It keeps refining, re-checking, or re-validating because nothing signals that the output is sufficient. The research pipeline that ran for 11 days had this exact problem: the Analyzer kept finding things to analyze, the Verifier kept finding things to verify.

Retry amplification. Multiple retry layers (HTTP client, tool wrapper, agent policy) compound independently. A single API timeout triggers a client retry, which triggers a tool retry, which triggers an agent retry. Under load, this creates exponential retry behavior. One team reported a data enrichment agent that generated 2.3 million API calls over a weekend because it misinterpreted an error code as “try different parameters.”
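The compounding is multiplicative, not additive, because each layer re-runs the whole layer beneath it. The per-layer counts below are illustrative, not defaults of any particular client:

```python
# Back-of-the-envelope arithmetic for stacked retry layers.
client_attempts = 4   # 1 call + 3 HTTP-client retries
tool_attempts   = 3   # tool wrapper re-runs the whole client call
agent_attempts  = 3   # agent policy re-runs the whole tool call

total_calls = client_attempts * tool_attempts * agent_attempts
print(total_calls)    # 36 underlying API calls for one "failed" step
```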

Unmapped failure classes. The agent retries actions that will never succeed: authentication failures, validation errors, permission denials. Without classification, the agent treats every error as transient. It retries a 403 Forbidden the same way it retries a 503 Service Unavailable, wasting cycles on errors that need escalation, not repetition.

Non-idempotent side effects. Each retry creates new side effects: duplicate emails sent, duplicate tickets created, duplicate database entries. The agent does not know that its previous attempt partially succeeded. It starts from scratch, and the world gets messier with each pass.
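One standard mitigation is an idempotency key derived from the action's semantic content, so a retry of work that already succeeded becomes a no-op. A minimal sketch, assuming an in-memory key store (production would need durable storage shared across retries):

```python
import hashlib
import json

seen_keys = set()   # in-memory for the sketch; use durable storage in production

def execute_once(tool: str, params: dict, do_action):
    # Derive an idempotency key from the action's semantic content.
    key = hashlib.sha256(
        json.dumps({"tool": tool, "params": params}, sort_keys=True).encode()
    ).hexdigest()
    if key in seen_keys:
        return "skipped_duplicate"   # a previous attempt already did this
    result = do_action()
    seen_keys.add(key)               # record only after the action succeeds
    return result
```

A retried `send_email` with identical parameters then skips instead of sending a duplicate, which is exactly the failure this section describes.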

The Cost Is Concrete

An IDC survey found 92% of organizations deploying agentic AI experienced unexpected cost overruns. Maxim AI’s production analysis showed multi-agent systems see 2-5x token cost increases versus single-agent setups, with coordination latency adding 100-500ms per agent handoff. Single unbounded agents burn roughly $300/day when uncontrolled, which scales to $100K+ annually per agent.

But cost is the symptom. The real damage is that rework loops produce subtly degraded outputs. The agent calls the right tool with slightly wrong parameters, gets a partial result, and proceeds as if it succeeded. No crash, no error, just quietly wrong results flowing downstream.

Related: AI Agent Compute Waste: Why Your Agents Burn 60% of Their Budget on Nothing

How to Implement Reversibility in Your Agent Stack

Five concrete patterns cover the implementation. You do not need all five on day one, but you need at least the first two.

Pattern 1: Action Classification Registry

Build a registry that maps every tool your agent can call to a reversibility tier. This is a static configuration, not a runtime decision:

ACTION_REGISTRY = {
    "query_database": {"tier": "read_only", "max_retries": 5},
    "create_record":  {"tier": "reversible", "undo": "delete_record"},
    "update_record":  {"tier": "compensatable", "compensate": "update_record"},
    "delete_records": {"tier": "irreversible", "requires_approval": True},
    "send_email":     {"tier": "irreversible", "requires_approval": True},
    "drop_table":     {"tier": "irreversible", "requires_approval": True},
}

Before the agent executes any tool call, the orchestration layer checks this registry. Read-only actions pass through. Reversible and compensatable actions pass through with logging. Irreversible actions halt and escalate.
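That check can be a few lines sitting in front of the tool dispatcher. A minimal sketch, using a trimmed copy of the registry above; the return values ("execute", "execute_with_log", "escalate") are illustrative dispatch signals, not a framework API:

```python
ACTION_REGISTRY = {
    "query_database": {"tier": "read_only", "max_retries": 5},
    "create_record":  {"tier": "reversible", "undo": "delete_record"},
    "send_email":     {"tier": "irreversible", "requires_approval": True},
}

def gate(tool_name: str, registry: dict = ACTION_REGISTRY) -> str:
    entry = registry.get(tool_name)
    if entry is None:
        return "escalate"          # unregistered tool: fail closed, not open
    tier = entry["tier"]
    if tier == "read_only":
        return "execute"
    if tier in ("reversible", "compensatable"):
        return "execute_with_log"  # record the undo/compensation path first
    return "escalate"              # irreversible: human approval required
```

Note the default for unregistered tools: anything not in the registry is treated as irreversible, so a newly added tool cannot silently bypass the gate.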

Pattern 2: Loop Detection via Fingerprinting

Track a fingerprint of the last tool call plus its result hash. When the same fingerprint appears three or more times consecutively, terminate the loop. This gives you deterministic termination independent of model behavior:

def check_loop(history, max_repeats=3):
    # Not enough history yet to form a detectable loop.
    if len(history) < max_repeats:
        return False
    # Fingerprint: the most recent tool name plus a hash of its result.
    fingerprint = (history[-1]["tool"], history[-1]["result_hash"])
    # A loop exists when the last max_repeats calls all share that fingerprint.
    return all(
        (h["tool"], h["result_hash"]) == fingerprint
        for h in history[-max_repeats:]
    )

This catches the exact failure mode that cost $47,000: agents repeating the same call/response cycle without anyone noticing.
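The `result_hash` itself is left unspecified above. One workable choice is hashing a canonical JSON serialization of the tool result, so that key ordering and formatting differences do not break duplicate detection. A sketch, with a hypothetical `verify_report` tool standing in for a real one:

```python
import hashlib
import json

def result_hash(result) -> str:
    # Canonical JSON (sorted keys) so semantically identical results
    # produce the same hash regardless of key order.
    canonical = json.dumps(result, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode()).hexdigest()

history = []
for _ in range(3):
    result = {"status": "verified", "issues": []}   # identical every pass
    history.append({"tool": "verify_report", "result_hash": result_hash(result)})
# Three consecutive identical (tool, result_hash) pairs: the detector fires.
```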

Pattern 3: Saga Rollbacks with Compensation

Borrow the saga pattern from distributed systems. Each step in a workflow records its completion and defines a compensation action. On partial failure, walk backward executing compensations:

workflow = [
    Step("write_to_db", compensate="delete_record"),
    Step("send_notification", compensate="send_correction"),
    Step("update_dashboard", compensate="revert_dashboard"),
]

If step 3 fails, the system automatically calls send_correction and delete_record in reverse order. The agent does not decide whether to retry. The orchestration layer handles rollback deterministically.
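The walk-back can be sketched as a small executor. The `Step` fields and the `action`/`compensate` callables here are illustrative, not a real framework API:

```python
# A runnable sketch of the saga executor: on any failure, undo all
# completed steps in reverse order, then report the rollback.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Step:
    name: str
    action: Callable[[], None]
    compensate: Callable[[], None]

def run_saga(steps: List[Step]) -> bool:
    completed = []
    for step in steps:
        try:
            step.action()
            completed.append(step)
        except Exception:
            for done in reversed(completed):   # compensate in reverse order
                done.compensate()
            return False
    return True
```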

Pattern 4: Bounded Retry Policies by Error Class

Not all errors deserve retries. Classify errors and assign retry budgets by class:

| Error Class | Retries | Action |
|---|---|---|
| Validation errors (400) | 0 | Stop immediately |
| Auth/permission (401, 403) | 0 | Escalate to human |
| Rate limits (429) | 3 | Exponential backoff + jitter |
| Transient failures (500, 503) | 2 | Retry, then escalate |
| Safety blocks | 0 | Stop and log |

This eliminates the “retry everything” default that causes amplification. An agent that retries a 403 is wasting cycles. An agent that retries a 429 three times with backoff is being practical.
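The table above can be encoded as a small policy map plus a wrapper around the tool call. A sketch under stated assumptions: `ApiError` and `EscalateToHuman` are hypothetical exception types standing in for whatever your client raises, and `base_delay` is tunable:

```python
import random
import time

class ApiError(Exception):
    def __init__(self, status):
        self.status = status

class EscalateToHuman(Exception):
    pass

# status -> (max_retries, escalate_on_exhaustion)
RETRY_POLICY = {
    400: (0, False),   # validation: stop immediately
    401: (0, True),    # auth/permission: escalate to human
    403: (0, True),
    429: (3, False),   # rate limit: backoff + jitter
    500: (2, True),    # transient: retry, then escalate
    503: (2, True),
}

def call_with_policy(fn, base_delay=1.0):
    attempt = 0
    while True:
        try:
            return fn()
        except ApiError as err:
            retries, escalate = RETRY_POLICY.get(err.status, (0, True))
            if attempt >= retries:
                if escalate:
                    raise EscalateToHuman(err.status) from err
                raise
            attempt += 1
            # Exponential backoff with full jitter.
            time.sleep(random.uniform(0, base_delay * 2 ** attempt))
```

Unknown status codes default to zero retries with escalation, the same fail-closed posture as the action registry.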

Pattern 5: Token and Cycle Budget Guardrails

Set hard limits on total tokens consumed and reasoning cycles executed per workflow. Kill runaway loops before resource exhaustion, not after:

MAX_TOKENS_PER_WORKFLOW = 50_000
MAX_CYCLES_PER_WORKFLOW = 25

if workflow.total_tokens > MAX_TOKENS_PER_WORKFLOW:
    workflow.abort("Token budget exceeded")
if workflow.cycle_count > MAX_CYCLES_PER_WORKFLOW:
    workflow.abort("Cycle limit reached")

This is the safety net that catches everything else. Even if loop detection misses a novel pattern and the saga system has no compensation defined, the budget guardrail still terminates the workflow before it becomes expensive.

Frameworks That Support Rollback Natively

You do not need to build everything from scratch. Several frameworks now have built-in support for these patterns:

LangGraph (LangChain) offers graph-based workflows with error edges, compensating actions, and checkpoint-based rollback. Used in production by Klarna, Replit, and Elastic. Its state graph model maps naturally to the saga pattern: each node is a step, each edge can be an error handler or compensation route.

Strands Agents SDK provides a hook-based architecture with BeforeToolCallEvent and AfterToolCallEvent hooks. You implement circuit breakers, validation gates, and budget guardrails as event handlers that intercept tool calls before they reach the external system.

IBM STRATUS is the research reference implementation. It uses command simulation before execution, write locks, transaction limits, and checkpoint-based recovery. Not open-source yet, but the published architecture is detailed enough to replicate the core patterns.

Rubrik Agent Rewind takes a different approach as a commercial product for enterprise rollback. It sits outside the agent framework entirely, providing surgical rollback of agent actions across infrastructure. Useful when you cannot modify the agent itself but need a safety net around it.

The Partnership on AI, a collaboration between Stanford, CMU, OpenAI, and Microsoft, published a classification framework for agent risks based on stakes, reversibility, and affordances. Their finding: agents at autonomy levels 3-5 introduce “new, compounding failure modes by acting autonomously across multiple steps.” Reversibility checks are their primary recommended mitigation.

Related: AI Agent Testing: How to QA Non-Deterministic Systems

Frequently Asked Questions

What is a reversibility check for AI agents?

A reversibility check classifies every action an AI agent can take into one of four tiers (read-only, reversible, compensatable, irreversible) before execution. Read-only and reversible actions proceed freely. Compensatable actions proceed with a logged correction path. Irreversible actions require human approval. This pre-execution gate prevents agents from taking actions that cannot be undone without oversight.

What are silent rework loops in AI agents?

Silent rework loops occur when AI agents repeatedly redo work without making forward progress, and without the monitoring system detecting the problem. The agent retries failed actions, regenerates outputs, or re-executes steps in a cycle that burns tokens and API calls while producing no incremental value. They are “silent” because most dashboards cannot distinguish between productive work and circular repetition.

How do you implement rollback for AI agent actions?

The saga pattern from distributed systems works well. Each step in an agent workflow records its completion and defines a compensation action. On partial failure, the system walks backward executing compensations automatically. For example, if a workflow writes to a database, sends a notification, then fails on the third step, the system sends a correction message and deletes the database record in reverse order.

Which frameworks support AI agent reversibility checks?

LangGraph supports checkpoint-based rollback and error edges in its graph-based workflow model. Strands Agents SDK provides BeforeToolCallEvent and AfterToolCallEvent hooks for implementing pre-execution gates. IBM’s STRATUS system demonstrates Transactional-No-Regression with paired undo operators. Rubrik Agent Rewind offers commercial rollback as an external safety net around any agent framework.

How much do AI agent rework loops cost?

Costs vary widely, but documented cases include a $47,000 bill from two LangChain agents looping for 11 days and a data enrichment agent generating 2.3 million API calls over a weekend. IDC found 92% of organizations deploying agentic AI experienced unexpected cost overruns. Unbounded single agents can burn roughly $300 per day ($100K+ annually) when running without termination controls.