Salesforce had delivered 2.4 billion Agentic Work Units (AWUs) across Agentforce and Slack as of Q4 FY2026, with 771 million in the quarter alone, up 57% from Q3. An AWU is one discrete task that an AI agent completes: a CRM record updated, a support case resolved, a workflow triggered. CEO Marc Benioff introduced the metric during the February 2026 earnings call alongside $11.2 billion in quarterly revenue and Agentforce ARR of roughly $800 million (169% year-over-year growth).
The pitch is straightforward: tokens measure how much an AI talks, AWUs measure how much it works. But the metric has drawn sharp criticism from analysts who argue it conflates doing work with achieving outcomes. Whether AWU becomes the standard unit for measuring agent productivity or a vanity metric that obscures more than it reveals depends on what enterprises actually do with the number.
What an Agentic Work Unit Actually Measures
An AWU represents a single, discrete task executed by an AI agent in production. Not a prompt processed in a sandbox. Not a reasoning chain that leads nowhere. A completed action that changed something in a business system. Salesforce’s official definition lists specific examples: updating a customer record, triggering an automated workflow, resolving a support ticket, calling an external API, or making a routed decision.
The distinction from tokens matters for a practical reason. A single customer service interaction might consume 15,000 tokens but produce one AWU (the resolved ticket). A data enrichment agent might use 3,000 tokens but generate 50 AWUs (50 records updated). Token consumption tells you about inference cost. AWU tells you about work completed.
The Token-to-AWU Ratio
Salesforce tracks what it calls the inference efficiency ratio: AWUs produced per token consumed. The company expects this ratio to improve over time as agents get better at completing tasks with fewer reasoning steps. Constellation Research notes that the relationship between tokens and AWUs is “elastic,” not fixed, meaning that platform improvements should let agents do more work per inference dollar.
This is the metric Salesforce actually cares about: divergence between token consumption and AWU output. If tokens stay flat while AWUs climb, Agentforce is getting more efficient. If both climb proportionally, it is just doing more work at the same efficiency.
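The arithmetic behind the ratio is simple enough to sketch. The numbers below come from the two examples earlier in this piece (a support agent at 15,000 tokens per resolved ticket, an enrichment agent at 3,000 tokens for 50 record updates); the function and field names are illustrative, not Salesforce's actual reporting schema.

```python
# Illustrative sketch: an AWU-per-token efficiency ratio, the quantity
# Salesforce calls "inference efficiency." Numbers are the hypothetical
# examples from the text, not reported figures.

def efficiency_ratio(awus: int, tokens: int) -> float:
    """AWUs produced per 1,000 tokens consumed."""
    return awus / tokens * 1000

# Support agent: one resolved ticket for 15,000 tokens.
support = efficiency_ratio(awus=1, tokens=15_000)
# Enrichment agent: 50 record updates for 3,000 tokens.
enrichment = efficiency_ratio(awus=50, tokens=3_000)

print(f"support:    {support:.3f} AWUs per 1k tokens")
print(f"enrichment: {enrichment:.3f} AWUs per 1k tokens")
```

Tracked over time, a rising ratio is the "divergence" the text describes: more completed work per unit of inference spend.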
What AWU Does Not Measure
AWU counts execution, not quality. A workflow that triggers incorrectly still counts as one AWU. A CRM record updated with wrong data still counts. A support ticket “resolved” that gets reopened the next day still counts. CIO magazine called this out directly: the metric tells CIOs little of value on its own because it does not distinguish between useful work and busywork.
This is the core gap. An agent caught in a retry loop that updates the same record 50 times generates 50 AWUs. An agent that resolves a complex case on the first try generates one. Without a quality overlay, raw AWU numbers can be gamed, inflated, or simply misleading.
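The retry-loop problem can be made concrete with a toy quality overlay. This is a hypothetical sketch, not anything Salesforce ships: it assumes a simple action log of (action, target) pairs and collapses duplicates to show how far a raw AWU count can drift from distinct work done.

```python
# Hypothetical sketch of the gap described above: raw AWU count versus
# a deduplicated "effective" count that collapses retries against the
# same target. The log format is an assumption for illustration.

def effective_awus(actions: list[tuple[str, str]]) -> tuple[int, int]:
    """Return (raw AWU count, count after collapsing duplicate
    action/target pairs, e.g. an agent retrying one record)."""
    return len(actions), len(set(actions))

# An agent stuck retrying one record 50 times, plus one clean resolution:
log = [("update_record", "acct-001")] * 50 + [("resolve_case", "case-9")]
raw, effective = effective_awus(log)
print(f"{raw} raw AWUs, {effective} distinct actions")
```

Even this crude overlay separates the 50-retry agent from the one-shot agent; a real quality layer would also need downstream signals like reopen rates.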
Why Salesforce Needed a New Metric
The token economy works well for chat interfaces and copilots where the output is text. You consume tokens, you generate text, and you can price it per thousand tokens. But when agents take autonomous actions, token consumption stops correlating with value delivered. A chatbot that writes a long, unhelpful response consumes more tokens (and costs more) than a terse, correct one. An agent that resolves a billing dispute in three API calls creates more value than one that reasons through 20 steps and then escalates anyway.
Salesforce has cycled through three pricing models in roughly eighteen months. The original $2-per-conversation model launched with Agentforce in late 2024. In May 2025, the company introduced Flex Credits at $0.10 per action. AWUs are the latest evolution, an attempt to tie the metric to completions rather than conversations or raw token consumption.
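A back-of-the-envelope comparison shows why the unit of pricing matters. The workload below is hypothetical (1,000 conversations at 8 agent actions each); only the $2-per-conversation and $0.10-per-action rates come from the text, and no flat per-AWU price is assumed because none has been published.

```python
# Hypothetical workload: 1,000 conversations, 8 agent actions each.
conversations = 1_000
actions_per_conversation = 8
actions = conversations * actions_per_conversation

per_conversation_bill = conversations * 2.00  # late-2024 model: $2/conversation
flex_credits_bill = actions * 0.10            # May 2025 model: $0.10/action

print(f"$2/conversation model: ${per_conversation_bill:,.0f}")
print(f"Flex Credits model:    ${flex_credits_bill:,.0f}")
```

Same workload, very different bills: per-action pricing rewards agents that finish in fewer steps, which is the same pressure the AWU metric is meant to make visible.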
The SaaS-to-Agent Pricing Shift
The traditional SaaS metric is seats: users times price per user per month. Salesforce’s own revenue ($41.5 billion in FY2026) was built on that model. But AI agents do not sit in seats. They do not log in. They run autonomously, often handling work that no human previously did, or handling it at 3 AM when no one is watching.
AWU is Salesforce’s answer to the question every enterprise AI vendor faces: if you cannot count seats, what do you count? Microsoft counts Copilot actions and consumption-based credits. OpenAI charges per token. Intercom charges $0.99 per resolved ticket. Each model encodes different assumptions about where value lives.
The Analyst Criticism: Activity vs. Outcomes
The harshest critique comes from CustomerThink’s analysis, which calls AWU “the new bad query of the AI era,” comparing it to the early web analytics days when companies measured pageviews without understanding whether visitors actually bought anything.
The argument has three parts.
AWU conflates execution with outcomes. An API call that fails downstream still counts as an AWU. A workflow that triggers but produces the wrong result still counts. The metric tracks activity, not accuracy or business impact.
AWU is vendor-controlled, not independently auditable. There is no third-party standard for what constitutes a “discrete task.” Salesforce defines it, Salesforce counts it, Salesforce reports it. Marc Bara on Medium notes that this makes it structurally similar to engagement metrics that social media platforms used to self-report before advertisers demanded independent verification.
AWU conflates volume with efficiency. Reporting 771 million AWUs in a quarter sounds impressive. But without knowing the success rate, error rate, or rework rate, the number is ambiguous. If 30% of those AWUs were retries, error corrections, or redundant operations, the actual productive output is much lower.
The Counterargument
Diginomica’s analysis offers a more measured take: AWU is not meant to stand alone. Salesforce tracks it alongside token consumption, resolution rates, and customer satisfaction scores. The inference efficiency ratio (AWUs per token) is the more meaningful metric because it captures whether agents are getting better at turning compute into completed work.
Patrick Stokes, Salesforce’s CMO, positioned AWU explicitly as a complement to existing metrics, not a replacement. The idea is that enterprises should track AWU alongside outcome metrics specific to their use case: ticket resolution time, customer satisfaction, revenue per interaction, or whatever KPI the agent is supposed to move.
What AWU Means for Enterprise AI Measurement
If you are running Agentforce or evaluating it, AWU gives you one useful and one dangerous thing.
The useful part: a standardized count of agent actions across your Salesforce environment. Before AWU, comparing agent activity across Service Cloud, Sales Cloud, and Slack required stitching together different logs. AWU normalizes that into a single number. For capacity planning and trend analysis, that has real value.
The dangerous part: treating AWU as a success metric. 771 million AWUs means nothing if you do not know the percentage that achieved their intended result, the cost per successful AWU, or the human time saved per AWU. Without those layers, AWU is a speedometer on a car that might be driving in circles.
Building a Real Agent Measurement Stack
For enterprises building their own agent measurement frameworks, AWU points to a useful structure even if the specific metric has gaps. The Salesforce Agentforce Metrics page breaks this into three tiers:
Activity metrics (AWU-equivalent): How many tasks did agents complete? What types? How is volume trending? This tells you about throughput.
Efficiency metrics (token-to-AWU ratio): How much compute does each task consume? Is efficiency improving over time? This tells you about cost-effectiveness.
Outcome metrics (resolution rate, CSAT, rework rate): Did the work actually achieve the intended result? This tells you about value.
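The three tiers above can be sketched as a single per-agent scorecard. This is a minimal illustration under assumed inputs; the field names are not a Salesforce API, and a real outcome tier would draw on CRM and ticketing data rather than a single success count.

```python
# Minimal sketch of the three-tier stack: activity, efficiency, outcome.
# All field names and figures are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class AgentScorecard:
    awus: int       # tier 1 (activity): tasks completed
    tokens: int     # tier 2 input: compute consumed
    successes: int  # tier 3 input: tasks that achieved the intended result

    @property
    def efficiency(self) -> float:
        """Tier 2: AWUs per 1,000 tokens."""
        return self.awus / self.tokens * 1000

    @property
    def success_rate(self) -> float:
        """Tier 3: share of AWUs that actually delivered the outcome."""
        return self.successes / self.awus

card = AgentScorecard(awus=500, tokens=2_000_000, successes=420)
print(f"{card.awus} AWUs, {card.efficiency:.2f} AWUs/1k tokens, "
      f"{card.success_rate:.0%} success rate")
```

Reading the three numbers together is the point: 500 AWUs is throughput, 0.25 AWUs per 1k tokens is cost-effectiveness, and the success rate is the value layer that raw AWU counts omit.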
Most enterprises have robust outcome metrics from their existing CRM and ticketing systems. What they lack is the first two tiers. AWU, whatever its flaws, pushes the conversation toward building those layers.
Frequently Asked Questions
What is a Salesforce Agentic Work Unit (AWU)?
An Agentic Work Unit (AWU) is Salesforce’s metric for measuring discrete tasks completed by AI agents in production. Each AWU represents one completed action such as updating a CRM record, triggering a workflow, resolving a support case, or calling an external API. Salesforce introduced the metric during its Q4 FY2026 earnings call in February 2026.
How is AWU different from token-based AI metrics?
Tokens measure how much text an AI model processes (input and output). AWUs measure completed work: tasks that changed something in a business system. A single interaction might consume thousands of tokens but produce only one AWU (one resolved ticket), or it might consume few tokens but produce many AWUs (batch record updates). Salesforce tracks the ratio between the two as an efficiency indicator.
How many AWUs has Salesforce reported?
As of Q4 FY2026, Salesforce reported 2.4 billion cumulative AWUs across Agentforce and Slack, with 771 million in Q4 alone. That represents 57% quarter-over-quarter growth.
Does AWU measure the quality of AI agent work?
No. AWU counts completed actions regardless of whether the outcome was correct or valuable. A workflow that triggers incorrectly, a record updated with wrong data, or a ticket resolved that gets reopened all count as AWUs. Salesforce positions AWU as a complement to outcome metrics like resolution rate and customer satisfaction, not a standalone quality measure.
Should enterprises use AWU to measure AI agent ROI?
AWU alone is not sufficient for ROI measurement. It provides useful throughput data (how many tasks agents complete) and, combined with token consumption, an efficiency ratio. But ROI requires outcome metrics: was the task completed correctly, did it save human time, and did it produce measurable business value? Use AWU as one layer in a three-tier measurement stack: activity, efficiency, and outcomes.
