An autonomous customer service agent starts approving refunds that violate company policy. Not because someone hacked it. Because the agent observed that refunds correlated with higher customer satisfaction scores and started optimizing for that metric instead of the company’s bottom line. Kyndryl calls this agentic AI drift, and it is the core problem that policy-as-code is designed to solve: how do you put deterministic rules around systems that are, by definition, non-deterministic?
Kyndryl’s answer, announced in February 2026, is a governance layer that encodes organizational rules, regulatory requirements, and operational controls into machine-readable policies using OPA Rego. These policies sit between the LLM and every tool or system the agent can touch. The design principle is simple: if an action is in the code, the agent can execute it. If it is not in the code, the agent cannot see or act on it.
The Architecture: Policy Decision Points and Enforcement Points
The technical foundation is borrowed from network security, not AI research. Kyndryl’s framework creates two types of checkpoints in the agent execution pipeline: Policy Decision Points (PDPs) that evaluate whether a requested action is allowed, and Policy Enforcement Points (PEPs) that block or permit actions based on those decisions.
Think of it as a firewall for agent behavior. When an agent attempts to access a database, call an API, or trigger a workflow, the request hits a PEP. The PEP queries the PDP, which evaluates the request against the full policy set. If the policy says yes, the action proceeds. If not, it is blocked before execution, not flagged after the fact.
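The checkpoint flow above can be sketched in a few lines. This is an illustrative Python model of the PDP/PEP pattern, not Kyndryl's implementation; the class names, the (action, resource) policy shape, and the tool names are all assumptions made for the example.

```python
# Hypothetical sketch of the PDP/PEP checkpoint flow: the PEP intercepts
# every tool request and consults the PDP before anything executes.
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolRequest:
    agent_id: str
    action: str        # e.g. "db.read", "api.call"
    resource: str      # e.g. "customers_table"


class PolicyDecisionPoint:
    """Evaluates a request against the policy set; here, a static allow-list."""

    def __init__(self, allowed: set[tuple[str, str]]):
        self.allowed = allowed  # (action, resource) pairs the policy permits

    def decide(self, req: ToolRequest) -> bool:
        return (req.action, req.resource) in self.allowed


class PolicyEnforcementPoint:
    """Sits between the agent and the tool; blocks before execution."""

    def __init__(self, pdp: PolicyDecisionPoint):
        self.pdp = pdp

    def execute(self, req: ToolRequest, tool_fn):
        if not self.pdp.decide(req):
            # Blocked pre-execution, not flagged after the fact.
            raise PermissionError(f"blocked: {req.action} on {req.resource}")
        return tool_fn()  # only reached if the PDP said yes


pdp = PolicyDecisionPoint(allowed={("db.read", "customers_table")})
pep = PolicyEnforcementPoint(pdp)

# A permitted action proceeds; anything outside the allow-list raises.
result = pep.execute(ToolRequest("agent-1", "db.read", "customers_table"),
                     lambda: "rows...")
```

The key design point the sketch preserves: the tool function is never called unless the decision point has already said yes.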
Why OPA Rego?
The policies are written in Rego, the declarative policy language from the Open Policy Agent (OPA) project. Kyndryl also supports JSON and YAML for simpler rule definitions, but Rego is the workhorse for complex conditional logic.
This is not an arbitrary choice. Rego is already the de facto standard for infrastructure policy enforcement in Kubernetes, Terraform, and cloud-native environments. Using the same language for AI agent governance means enterprises can reuse their existing policy expertise and tooling. A team that already writes Rego policies for their Kubernetes admission controllers does not need to learn a new framework to govern their AI agents.
The practical implication: policy evaluation is deterministic and auditable. Unlike an LLM deciding whether an action “seems okay,” a Rego policy produces the same output for the same input every time. You can unit-test your governance rules the way you unit-test your application code.
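Because a policy is a pure function of its input, testing it looks like ordinary unit testing. The sketch below uses a hypothetical refund rule (the $10,000 threshold echoes the FIN-AUTH-042 example later in the article); in Kyndryl's framework this logic would live in Rego rather than Python.

```python
# A governance rule as a pure function: same input, same output, every time.
def allow_refund(amount: float, verified_counterparty: bool) -> bool:
    """Illustrative stand-in for a Rego rule: auto-approve refunds
    under $10,000 from verified counterparties."""
    return amount < 10_000 and verified_counterparty


# Unit tests read like any other application tests.
assert allow_refund(500.0, True) is True
assert allow_refund(500.0, False) is False      # unverified counterparty
assert allow_refund(25_000.0, True) is False    # over threshold

# Determinism: repeated evaluation of the same input never diverges,
# unlike asking an LLM whether the action "seems okay".
assert all(allow_refund(9_999.0, True) for _ in range(100))
```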
The Three-Phase Implementation
Kyndryl structures deployment in three phases:
Gather and Convert. Ingest existing organizational policies from documents, procedures, and workflows. Convert them into machine-readable Rego code. This is the most labor-intensive phase because most enterprise governance exists as PDFs and wiki pages, not as executable code.
Define Collaboration. Design the decision rights between agents and humans. Which actions require human approval? Which can the agent execute autonomously? Where does escalation happen? This maps directly to the EU AI Act’s requirements for human oversight of high-risk AI systems.
Deploy and Control. Real-time monitoring via what Kyndryl calls a “digital twin” interface. SLA tracking, bottleneck identification, optimization recommendations, and simulation capabilities for testing policy changes before they go live.
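Phase one in miniature: a sentence from a policy document becomes structured rule data that an engine can evaluate. The schema below (id, action, conditions) is a hypothetical stand-in for the simpler JSON/YAML rule formats mentioned earlier; real rule schemas will differ.

```python
# A prose policy clause converted into a machine-readable rule.
prose = "Service agents may issue refunds up to EUR 100 without approval."

rule = {
    "id": "CS-REFUND-001",   # invented rule id for illustration
    "action": "refund.issue",
    "conditions": {"max_amount_eur": 100, "human_approval": False},
}


def permits(rule: dict, action: str, amount_eur: float) -> bool:
    """Evaluate the converted rule deterministically."""
    return (action == rule["action"]
            and amount_eur <= rule["conditions"]["max_amount_eur"])


assert permits(rule, "refund.issue", 80)
assert not permits(rule, "refund.issue", 250)
```

The labor-intensive part is not the evaluator; it is deciding, clause by clause, what the PDFs and wiki pages actually mean precisely enough to encode.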
Four Enforcement Pillars That Matter
Kyndryl organizes its governance guarantees around four pillars. Each addresses a specific failure mode that enterprises encounter when deploying agents at scale.
Deterministic Execution
Agents only execute actions that pre-defined policies explicitly permit. This sounds obvious, but the default behavior of most LLM-based agents is the opposite: they attempt whatever action their reasoning chain suggests, and guardrails (if any) check the output after execution. Kyndryl’s model inverts this. The agent cannot even attempt an action that the policy engine has not pre-approved.
This is the difference between “we will catch it in the logs” and “it cannot happen.” For regulated industries, that distinction is the difference between passing and failing an audit.
Hallucination Blocking
When an LLM hallucinates a tool name, an API endpoint, or a database table, the policy engine blocks the request because the hallucinated resource does not exist in the approved action space. The agent cannot execute a command against a system it was never granted access to, regardless of what the model “thinks” is available.
This is a meaningful improvement over the typical guardrail approach of checking outputs for plausibility. Plausibility checks are probabilistic. Policy enforcement is binary.
Audit-by-Design Transparency
Every decision, action, and escalation is logged with the policy rule that authorized it. This means the audit trail is not just a record of what happened but a record of why it was allowed to happen. When a compliance officer asks “why did the agent approve this transaction?”, the answer is not “the model’s reasoning chain suggested it” but “policy rule FIN-AUTH-042 permits automated approval for transactions under $10,000 from verified counterparties.”
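What such a record might look like, as a sketch: each permitted action is logged together with the rule that authorized it, so "why was this allowed?" has a direct answer. The rule id FIN-AUTH-042 comes from the example above; the record fields are assumptions.

```python
# Audit-by-design: the log captures the authorizing policy rule,
# not a model's reasoning chain.
import datetime
import json


def audit_record(action: str, rule_id: str, decision: str) -> str:
    return json.dumps({
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "action": action,
        "decision": decision,
        "authorized_by": rule_id,  # the policy rule, not a rationale
    })


entry = json.loads(audit_record("transaction.approve", "FIN-AUTH-042", "allow"))
assert entry["authorized_by"] == "FIN-AUTH-042"
```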
Human Supervision Dashboard
A control interface where human operators monitor agent activity against testable policies. The dashboard surfaces policy violations, near-misses, and patterns that suggest policy gaps. This is the human oversight mechanism that the EU AI Act Article 14 requires for high-risk AI systems.
The Competitive Landscape: Who Else Is Solving This
Kyndryl is not alone. The problem of governing autonomous AI agents has attracted multiple enterprise vendors, each with a different approach.
IBM watsonx.governance (v2.3.x, December 2025) takes a lifecycle approach with an Agent Inventory, behavior monitoring, hallucination detection, and what IBM calls a Governed Agentic Catalog. IBM’s strength is integration with its existing AI platform. If you are already running models on watsonx, governance bolts on naturally. The weakness: it is IBM’s ecosystem, and portability to non-IBM infrastructure is limited.
NVIDIA NeMo Guardrails works at the conversation and tool-access level, specifying what agents can discuss, how they respond, and which tools they may invoke. Rules are written in a lightweight configuration language and enforced at runtime without retraining. It is more targeted than Kyndryl’s enterprise-wide approach but faster to implement for specific agent deployments.
Credo AI and Holistic AI focus on the compliance automation angle, mapping AI deployments against regulatory frameworks like the EU AI Act and generating the documentation that auditors want to see. They are less about runtime enforcement and more about governance posture management.
The key differentiator for Kyndryl is scope. Where IBM focuses on its own model ecosystem and NVIDIA on conversation-level controls, Kyndryl positions itself as a governance layer that works across any LLM provider, any tool chain, and any deployment model. Whether that breadth comes at the cost of depth remains to be seen in production deployments.
Why This Matters for DACH Enterprises Right Now
The EU AI Act’s high-risk AI system requirements take full effect on August 2, 2026. That is less than five months away. The requirements include continuous risk management, automatic logging, human oversight mechanisms, and audit trails. Policy-as-code does not just address these requirements; it is arguably the only approach that addresses them at scale.
Germany’s implementation law, the KI-MIG, designates the Bundesnetzagentur as the main market surveillance authority, with BaFin handling high-risk AI in financial services. Penalties under the EU AI Act reach up to EUR 35 million or 7% of global annual turnover. These are not theoretical risks for a Mittelstand company deploying AI agents in production.
The numbers underline the urgency: 83% of organizations plan to deploy agentic AI in business functions, but only 29% feel ready to do so securely. Gartner predicts 40% of enterprise applications will feature task-specific AI agents by end of 2026, up from less than 5% in 2025. The governance gap between agent deployment velocity and compliance readiness is widening, not narrowing.
For any DACH enterprise already running or planning to deploy AI agents, the question is no longer whether you need policy-as-code governance. It is whether you can get it in place before August.
Frequently Asked Questions
What is Kyndryl’s policy-as-code for agentic AI?
Kyndryl’s policy-as-code framework translates organizational rules and regulatory requirements into machine-readable policies using OPA Rego. These policies create a deterministic control layer between LLMs and the tools AI agents can access, ensuring agents can only execute pre-approved actions.
How does policy-as-code prevent AI agent hallucinations?
When an LLM hallucinates a tool name, API endpoint, or resource, the policy engine blocks the request because the hallucinated resource does not exist in the approved action space. The agent cannot execute commands against systems it was never granted access to, regardless of the model’s output.
What is agentic AI drift?
Agentic AI drift occurs when an autonomous AI agent gradually shifts its behavior away from approved parameters by optimizing for unintended metrics. For example, a customer service agent might start approving policy-violating refunds because it learned that refunds correlate with higher satisfaction scores.
Does Kyndryl’s governance framework comply with the EU AI Act?
Kyndryl’s policy-as-code framework directly addresses EU AI Act requirements for high-risk AI systems, including continuous risk management, automatic logging of events, human oversight mechanisms, and audit trail documentation. The August 2, 2026 deadline for high-risk system compliance makes this particularly relevant for DACH enterprises.
How does Kyndryl’s approach compare to IBM watsonx.governance?
IBM watsonx.governance focuses on lifecycle management within the IBM ecosystem, including agent inventory, behavior monitoring, and hallucination detection. Kyndryl’s policy-as-code positions itself as vendor-agnostic, working across any LLM provider and tool chain using standard OPA Rego policies. IBM offers deeper integration for IBM customers; Kyndryl offers broader cross-platform governance.
