NVIDIA’s GTC 2026 keynote announced the Agent Toolkit, but the two components that matter most for developers building production agents got buried under the GPU headlines: NemoClaw and AgentIQ. NemoClaw takes OpenClaw, the open-source agent runtime that had 24,478 internet-exposed instances and a critical RCE vulnerability, and wraps it in kernel-level sandboxing, policy enforcement, and a privacy router. AgentIQ is an open-source profiler and observability toolkit that traces every tool call, token, and latency spike across multi-agent workflows, regardless of which framework you use. Together, they form an open agent development platform that solves the two problems that have blocked enterprise agent adoption: security and visibility.
NemoClaw: OpenClaw with Enterprise Security Baked In
OpenClaw became the default open-source agent runtime in 2025, but its security story was a disaster: self-evolving agents ran with full network access, no filesystem isolation, and API keys stored on disk. The CVSS 8.8 RCE vulnerability discovered in early 2026 was not a surprise to anyone paying attention.
NemoClaw fixes this at the infrastructure layer, not the application layer. That distinction matters. Application-level guardrails can be bypassed through prompt injection. Infrastructure-level containment cannot, because the agent process literally lacks the system capabilities to escape its sandbox.
Three Security Layers That Actually Work
NemoClaw wraps every OpenClaw agent in three controls, each enforced below the application:
Kernel-level sandbox (deny-by-default). The OpenShell runtime creates an isolated container for each agent. Filesystem access is locked at container creation. No agent can modify the host filesystem, read other agents’ data, or escalate privileges. This is not a Docker container with default permissions; it is a purpose-built sandbox with least-privilege access controls enforced at the kernel level.
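NemoClaw's OpenShell sandbox is purpose-built, but the deny-by-default idea maps onto standard Linux sandboxing primitives. As a rough illustration (this is not NemoClaw's actual mechanism or API), a supervisor could construct a bubblewrap launch command that grants an agent nothing beyond a read-only workspace:

```python
# Illustrative only: sketches the deny-by-default idea using bubblewrap,
# a generic Linux sandboxing tool. NemoClaw's OpenShell runtime is a
# purpose-built sandbox; the function and paths here are hypothetical.

def sandbox_cmd(agent_binary: str, workspace: str) -> list[str]:
    """Build a launch command granting least-privilege access only."""
    return [
        "bwrap",
        "--unshare-all",                       # fresh PID/net/IPC namespaces: no network
        "--ro-bind", workspace, "/workspace",  # only the workspace, read-only
        "--tmpfs", "/tmp",                     # private scratch space, discarded on exit
        "--die-with-parent",                   # agent cannot outlive its supervisor
        agent_binary,
    ]

print(sandbox_cmd("/opt/agent/run", "/srv/agents/a1"))
```

Everything not explicitly bound into the container simply does not exist for the agent process, which is why prompt injection cannot talk its way out.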
Out-of-process policy engine. Security policies are written in YAML and enforced by a separate process that compromised agents cannot override. An administrator can permit a specific agent to call api.openai.com while blocking every other network endpoint. Policies are hot-swappable, meaning you can tighten or loosen constraints without redeploying agents. A typical policy looks like this:
```yaml
network:
  default: deny
  allow:
    - api.openai.com:443
    - api.anthropic.com:443

filesystem:
  default: deny
  allow:
    - /workspace/data:read
    - /workspace/output:write
```
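To make the deny-by-default semantics concrete, here is a minimal sketch of how such a policy could be evaluated (illustrative Python; NemoClaw's actual engine runs out-of-process so a compromised agent cannot patch it):

```python
# Sketch of deny-by-default policy evaluation (illustrative, not NemoClaw's
# actual engine). The policy mirrors the YAML above: anything not explicitly
# on the allow list is denied.

POLICY = {
    "network": {"default": "deny",
                "allow": ["api.openai.com:443", "api.anthropic.com:443"]},
    "filesystem": {"default": "deny",
                   "allow": ["/workspace/data:read", "/workspace/output:write"]},
}

def allowed(domain: str, request: str) -> bool:
    """Return True only if the request appears on the allow list."""
    rules = POLICY[domain]
    if request in rules["allow"]:
        return True
    return rules["default"] == "allow"  # deny-by-default: falls through to False

print(allowed("network", "api.openai.com:443"))    # True
print(allowed("network", "evil.example.com:443"))  # False
```

Because the check runs outside the agent process, "hot-swapping" a policy is just replacing the table the enforcer reads, with no agent redeploy.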
Privacy router. This is the most architecturally interesting piece. The router sits between the agent and its model backends, deciding which queries go to local Nemotron models running on your hardware and which route to cloud frontier models (Claude, GPT-5). Sensitive data (customer records, financial data, PII) stays on the local model. Complex reasoning tasks that need frontier-level capabilities are routed to the cloud, with the sensitive context stripped first.
This is not just a security feature. It is a cost optimization feature. NVIDIA reports that hybrid routing between local Nemotron models and cloud frontier models cuts query costs by more than 50% compared to routing everything through cloud APIs.
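The routing decision itself can be pictured as a sensitivity classifier in front of two backends. A crude sketch (illustrative only; NemoClaw's real classifier is certainly more sophisticated than keyword and pattern matching):

```python
import re

# Sketch of a privacy router's decision logic (illustrative, not NemoClaw's
# implementation). Queries that look sensitive stay on the local Nemotron
# model; everything else may be sent to a cloud frontier model.

SENSITIVE_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),           # SSN-like identifiers
    re.compile(r"\b\d{13,16}\b"),                   # card-number-like digit runs
    re.compile(r"customer|account|salary", re.I),   # crude keyword screen
]

def route(query: str) -> str:
    """Return 'local' for sensitive-looking queries, 'cloud' otherwise."""
    if any(p.search(query) for p in SENSITIVE_PATTERNS):
        return "local"  # PII never leaves your hardware
    return "cloud"      # frontier-level reasoning, sensitive context stripped upstream

print(route("Summarize customer 4111111111111111's account"))  # local
print(route("Prove this graph-coloring bound"))                # cloud
```

The cost savings fall out of the same decision: every query the classifier keeps local is a query you do not pay cloud API rates for.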
Getting Started with NemoClaw
Installation is a single command:
```bash
curl -fsSL https://nvidia.com/nemoclaw.sh | bash
```
The script installs the OpenShell runtime and Nemotron models, then runs a guided setup wizard to create your first sandboxed agent environment. NemoClaw is currently in early-access alpha, so expect rough edges. But the architecture is sound, and the fact that it is hardware-agnostic (runs on NVIDIA, AMD, and Intel hardware) removes the vendor lock-in concern.
AgentIQ: Profiling and Observability for Multi-Agent Pipelines
Security gets you to “safe enough to deploy.” But production agent systems also need observability. When a multi-agent pipeline takes 45 seconds instead of 5, where is the bottleneck? When your monthly API bill spikes 3x, which agent is responsible? When an agent chain produces a wrong answer, which step in the chain failed?
AgentIQ (now part of the NeMo Agent Toolkit) answers all three questions. It is an open-source library that treats agents, tools, and workflows as composable function calls and instruments every one of them.
The End-to-End Profiler
The AgentIQ profiler tracks input/output tokens and execution timings at every node in your agent pipeline. If you have a five-agent chain where Agent 3 calls two tools and Agent 5 calls a sub-agent, the profiler captures latency, token usage, and cost for every single step, regardless of nesting depth.
The profiler also generates predictions about future token and tool usage. This means you can stress-test your workflow in pre-production to get sizing guidance before you hit real traffic. If the profiler predicts that Agent 3’s document retrieval tool will consume 80% of your token budget, you know to optimize that tool before shipping.
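The core of that accounting is simple once every node reports its usage. A sketch of the per-node breakdown (illustrative; the real toolkit collects these counts automatically through instrumentation):

```python
# Sketch of per-node token accounting, the kind of breakdown a profiler
# like AgentIQ produces (illustrative; node names and counts are invented).

def token_shares(node_tokens: dict[str, int]) -> dict[str, float]:
    """Fraction of the total token budget consumed by each pipeline node."""
    total = sum(node_tokens.values())
    return {node: count / total for node, count in node_tokens.items()}

usage = {
    "agent_1": 1200,
    "agent_2": 900,
    "agent_3.doc_retrieval": 16000,
    "agent_5.sub_agent": 1900,
}
shares = token_shares(usage)
worst = max(shares, key=shares.get)
print(f"{worst} consumes {shares[worst]:.0%} of the token budget")
# -> agent_3.doc_retrieval consumes 80% of the token budget
```

Seeing that one tool dominates the budget in pre-production is exactly the signal that tells you where to optimize before shipping.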
This is the kind of tooling that separates “works in a notebook” from “runs in production.” Most teams building multi-agent systems today discover performance problems after deployment. AgentIQ lets you discover them before.
Framework-Agnostic Observability
AgentIQ does not care which framework you use. It works with LangGraph, CrewAI, OpenAI Agents SDK, custom Python agents, or any combination. Workflows are defined in YAML, and every component (agent, tool, or sub-workflow) exposes a standard function_call interface that AgentIQ can instrument.
The observability layer uses an event-driven architecture that exports telemetry to whatever platform you already run: Phoenix, Langfuse, Weave, or any OpenTelemetry-compatible backend. You get traces, spans, and metrics without changing your agent code. Just plug in the AgentIQ middleware and your existing observability stack picks up agent telemetry alongside your regular application metrics.
```python
from agentic_toolkit import AgentIQProfiler

profiler = AgentIQProfiler(
    export_to="langfuse",
    track_tokens=True,
    track_latency=True,
    track_cost=True,
)

# Wrap any agent workflow
result = profiler.run(my_agent_pipeline, input="Analyze Q1 revenue")
profiler.report()  # Generates per-node breakdown
```
The UI for Debugging Workflows
AgentIQ ships with a chat-based UI that lets you interact with your agents while visualizing the execution trace in real time. You can see which tools were called, what data flowed between agents, and where errors occurred. For teams that have been debugging multi-agent systems by reading log files, this is a significant improvement.
How NemoClaw and AgentIQ Work Together
The real value is not either tool in isolation. It is the combination.
NemoClaw runs your agents in secure, policy-governed sandboxes. AgentIQ instruments every action those agents take. Together, you get a development loop that looks like this:
- Build your agent pipeline using any framework (LangGraph, CrewAI, custom).
- Profile the pipeline with AgentIQ to find latency bottlenecks and token waste.
- Deploy the pipeline inside NemoClaw sandboxes with YAML-defined security policies.
- Monitor production behavior through AgentIQ’s telemetry, exported to your existing observability stack.
- Optimize by using AgentIQ’s profiler to test policy changes and model routing adjustments before deploying them.
This is the workflow that has been missing from the open-source agent ecosystem. Individual pieces existed (sandboxing via Docker, observability via Langfuse, profiling via custom scripts), but nobody had integrated them into a single open platform. NVIDIA has done that, and made it hardware-agnostic.
What This Means for Agent Builders
Three things stand out.
The security model is right. Enforcing security at the infrastructure layer rather than the application layer is the correct architectural decision. NemoClaw’s approach means that even if an agent is compromised through prompt injection, it cannot escape its sandbox. This is the same security model that cloud providers use for multi-tenant compute, applied to AI agents.
Observability is no longer optional. AgentIQ’s profiler makes it obvious how much money you are wasting on poorly optimized agent pipelines. When you can see that 80% of your tokens go to a single sub-agent that could run on a cheaper local model, the ROI of the tooling pays for itself immediately.
Hardware agnosticism is strategic. NemoClaw runs on NVIDIA, AMD, and Intel hardware. NVIDIA is betting that the value is in the software layer, not GPU lock-in. This is the right bet. Enterprise teams will not adopt an agent platform that only runs on one vendor’s hardware.
The alpha status is the main risk. NemoClaw and AgentIQ are both early-stage, and NVIDIA warns developers to expect rough edges. But the architecture is sound, the code is open source (Apache 2.0), and partners like Adobe, Salesforce, SAP, CrowdStrike, and Dell are already integrating. If you are building production agents today, these tools are worth evaluating now, even in alpha.
Frequently Asked Questions
What is NVIDIA NemoClaw?
NemoClaw is NVIDIA’s open-source enterprise security layer for OpenClaw, the popular AI agent runtime. It adds kernel-level sandboxing via the OpenShell runtime, an out-of-process YAML-based policy engine, and a privacy router that keeps sensitive data on local Nemotron models while routing complex reasoning to cloud frontier models. It was announced at GTC 2026 and is currently in early-access alpha.
What is NVIDIA AgentIQ and what does it do?
AgentIQ (part of the NeMo Agent Toolkit) is an open-source library for profiling, debugging, and observing multi-agent AI pipelines. It tracks tokens, latency, and cost at every node in an agent workflow, works with any framework (LangGraph, CrewAI, OpenAI Agents SDK), and exports telemetry to platforms like Langfuse, Phoenix, and any OpenTelemetry-compatible backend.
Does NemoClaw only work on NVIDIA GPUs?
No. NemoClaw is hardware-agnostic and runs on NVIDIA, AMD, and Intel hardware. The local Nemotron models benefit from GPU acceleration, but the security sandbox, policy engine, and privacy router work on any hardware. NVIDIA designed the platform to avoid vendor lock-in.
How does NemoClaw’s privacy router work?
The privacy router sits between agents and their model backends. It classifies each query’s sensitivity level and routes sensitive queries (containing PII, financial data, customer records) to local Nemotron models running on your own hardware. Complex reasoning tasks that need frontier-model capabilities route to cloud APIs (Claude, GPT-5) with sensitive context stripped. NVIDIA reports this hybrid approach cuts query costs by over 50%.
Can AgentIQ work with LangGraph and CrewAI?
Yes. AgentIQ is framework-agnostic and works with LangGraph, CrewAI, OpenAI Agents SDK, custom Python agents, or any combination. It uses a standard function_call interface for instrumentation and exports telemetry through an event-driven architecture compatible with OpenTelemetry, Langfuse, Phoenix, and Weave.
