FINRA just told every broker-dealer in the United States that AI agents are a supervisory problem, not just a technology experiment. The 2026 Annual Regulatory Oversight Report, released in December 2025, includes a standalone section on generative AI that, for the first time, formally names agentic AI risks: autonomy without human validation, scope creep beyond user intent, opaque multi-step reasoning, data sensitivity failures, domain knowledge gaps, and misaligned reward functions. These are not hypothetical concerns. They are supervisory expectations mapped to existing obligations under FINRA Rule 3110.
For anyone building or deploying AI agents in financial services, this report changes the compliance calculus. The regulator is not banning agents. It is putting broker-dealers on notice that if an agent acts beyond its authority, loses track of sensitive data, or optimizes for the wrong objective, the firm, not the vendor, bears responsibility.
The Six Agentic AI Risks FINRA Actually Named
FINRA defines AI agents as “systems or programs that are capable of autonomously performing and completing tasks on behalf of a user” that “can interact within an environment, plan, make decisions and take action to achieve specific goals without predefined rules or logic programming.” That definition matters because it distinguishes agents from the chatbot-style GenAI tools that dominated the 2025 report.
The report then lists six risk categories specific to agents. Each one maps directly to a supervisory control gap.
Autonomy: Agents Acting Without Human Approval
The first risk is the most obvious: AI agents executing actions without a human validating the decision. In a broker-dealer context, this could mean an agent approving a trade, modifying a client account, or sending a communication that constitutes investment advice. Under FINRA Rule 3110, the firm must maintain a “reasonably designed supervisory system.” An agent that bypasses human review does not satisfy that standard, regardless of how accurate its outputs are.
The practical question is where to draw the line. An agent that summarizes client emails does not need the same oversight as one that generates trade confirmations. FINRA does not prescribe a blanket “human-in-the-loop” requirement but says firms should determine “where to have human in the loop agent oversight protocols or practices.” The implication: you need a tiered supervision model that matches oversight intensity to the risk level of each agent action.
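A tiered model of this kind can be sketched in a few lines. Everything below is illustrative, not from FINRA: the tier names, the action types, and the mapping itself are invented for the example, and a real firm would derive the mapping from its written supervisory procedures.

```python
from enum import Enum

class Oversight(Enum):
    NONE = "log-only"                # low risk: record the action, no review
    POST_HOC = "sampled-review"      # medium risk: supervisor samples after the fact
    PRE_APPROVAL = "human-approval"  # high risk: block until a human approves

# Illustrative mapping of agent action types to oversight tiers.
ACTION_TIERS = {
    "summarize_email": Oversight.NONE,
    "draft_client_communication": Oversight.PRE_APPROVAL,
    "modify_account": Oversight.PRE_APPROVAL,
    "generate_internal_report": Oversight.POST_HOC,
}

def required_oversight(action_type: str) -> Oversight:
    # Unknown action types default to the strictest tier (fail closed).
    return ACTION_TIERS.get(action_type, Oversight.PRE_APPROVAL)
```

The fail-closed default matters: an action type nobody classified gets the strictest oversight, not the loosest.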
Scope and Authority: When Agents Exceed Their Mandate
FINRA flags that agents “may act beyond the user’s actual or intended scope and authority.” This is scope creep at runtime. A compliance analyst deploys an agent to review transaction reports. The agent, following its reasoning chain, decides it needs additional client data to complete the analysis and pulls records the analyst is not authorized to access.
This is the same problem Goldman Sachs had to solve when deploying Claude-based agents for KYC and AML: scoping what an agent can do is harder than scoping what a human employee can do, because agents discover their tool requirements at runtime rather than following a static job description.
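A minimal defense is a deny-by-default tool gate: the agent may discover at runtime which tools it wants, but the grant decision stays static and is made at deploy time. The agent and tool names below are hypothetical.

```python
class ToolScopeError(PermissionError):
    """Raised when an agent requests a tool outside its mandate."""

# Hypothetical per-agent allowlists, fixed at deployment time.
AGENT_TOOL_SCOPES = {
    "transaction_review_agent": {"read_transaction_reports", "write_findings"},
}

def authorize_tool_call(agent_id: str, tool: str) -> None:
    allowed = AGENT_TOOL_SCOPES.get(agent_id, set())
    if tool not in allowed:
        # The request is refused and surfaced rather than silently granted,
        # so runtime scope creep becomes visible to a supervisor.
        raise ToolScopeError(f"{agent_id} is not authorized to call {tool!r}")
```

In the transaction-report scenario above, the agent's mid-task decision to pull extra client records would raise an error instead of quietly succeeding.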
Auditability: Multi-Step Reasoning Is Hard to Trace
FINRA warns that “complicated, multi-step agent reasoning tasks can make outcomes difficult to trace or explain.” This is an audit trail problem. When an agent chains together five API calls, two database queries, and a model inference to produce a recommendation, reconstructing why it reached that conclusion requires logging every step. Traditional compliance review assumes a human made the decision and documented their rationale. Agent reasoning does not work that way.
Debevoise & Plimpton’s analysis of the report emphasizes that firms must treat agent actions with the same auditability standards as human-generated decisions. This means capturing prompts, intermediate reasoning steps, tool invocations, and final outputs. Every single time.
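A sketch of what that instrumentation can look like, assuming a simple append-only JSON-lines trail. The schema here is illustrative, not a FINRA-prescribed format.

```python
import json
import time
import uuid

class AuditTrail:
    """Append-only record of every step in one agent run (illustrative schema)."""

    def __init__(self, path: str):
        self.path = path
        self.run_id = str(uuid.uuid4())  # ties all events in a run together

    def record(self, step_type: str, payload: dict) -> None:
        # One JSON line per event: prompts, intermediate reasoning,
        # tool invocations, and final outputs all land in the same trail,
        # timestamped so a reviewer can replay the chain end to end.
        event = {
            "run_id": self.run_id,
            "ts": time.time(),
            "step_type": step_type,  # e.g. "prompt", "reasoning", "tool_call", "output"
            "payload": payload,
        }
        with open(self.path, "a") as f:
            f.write(json.dumps(event) + "\n")
```

The key property is that logging is unconditional and happens at the same layer that executes the step, so no code path can act without leaving a line behind.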
Data Sensitivity: Agents Handling Information They Should Not
Agents processing customer data, trading strategies, or non-public market information create data sensitivity risks that go beyond traditional access controls. The agent might store intermediate results in an unprotected cache, pass sensitive data to an external API, or retain information in its context window longer than retention policies allow.
For broker-dealers subject to SEC Regulation S-P and FINRA’s own record-keeping requirements, the question is not just “who accessed the data” but “where did the data go during the agent’s multi-step processing?” That trail is often invisible without explicit instrumentation.
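One way to make that trail visible is to tag sensitive values and check the tags at every outbound boundary, rather than only at the point of initial access. The sensitivity labels and sink names below are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Tagged:
    """A value carrying sensitivity labels through the agent pipeline."""
    value: object
    labels: set = field(default_factory=set)  # e.g. {"PII", "MNPI"}

def send_to_sink(item, sink: str, approved_sinks: dict) -> bool:
    # Before data leaves the agent (external API, cache, context window),
    # compare the value's labels against what that sink may hold.
    if isinstance(item, Tagged) and item.labels:
        allowed = approved_sinks.get(sink, set())
        if not item.labels <= allowed:
            return False  # blocked, leaving an explicit trace of the attempted flow
    return True
```

The check answers "where did the data go" by construction: every hop through a sink is a decision point that can be logged and reviewed.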
Domain Knowledge Gaps
General-purpose AI models do not understand the specifics of FINRA Rule 2111 (suitability), Regulation Best Interest, or the nuances of municipal securities disclosures. FINRA notes that agents may lack the “domain knowledge” needed for “complex tasks” in financial services. An agent trained on general web data might generate plausible-sounding but incorrect regulatory interpretations, and without domain-specific validation, that error propagates into client-facing decisions.
This is why Goldman Sachs spent six months embedding Anthropic engineers inside its teams rather than deploying a general-purpose model out of the box. Domain adaptation is not optional in regulated industries. It is a compliance requirement.
Reward Misalignment: Optimizing for the Wrong Outcome
The most technically specific risk FINRA identifies: “Misaligned or poorly designed reward functions could result in the agent optimizing decisions that could negatively impact investors.” This is a direct reference to the alignment problem as it applies to financial services. An agent optimizing for trade execution speed might systematically deprioritize best-execution analysis. An agent optimizing for onboarding throughput might lower the bar on due diligence checks.
Reward misalignment is hard to detect because the agent is doing exactly what it was designed to do. It is the design itself that creates the regulatory exposure. DLA Piper’s guidance notes that firms need to evaluate not just agent outputs but the objective functions driving those outputs.
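The difference between a misaligned objective and a constrained one is easy to show in miniature. In this toy scoring example (field names invented for illustration), the misaligned objective picks the fast candidate that skipped its best-execution check, because the compliance cost is invisible to the score; the constrained objective cannot.

```python
def misaligned_score(candidate):
    # Optimizes raw execution speed only: skipping a compliance check
    # never hurts the score, so the agent "correctly" learns to skip it.
    return candidate["speed"]

def constrained_score(candidate):
    # Compliance is a hard constraint, not a weighted term that a
    # sufficiently fast candidate can trade away.
    if not candidate["passed_best_execution_check"]:
        return float("-inf")
    return candidate["speed"]

candidates = [
    {"speed": 9.0, "passed_best_execution_check": False},
    {"speed": 6.5, "passed_best_execution_check": True},
]
```

Soft penalties invite exactly the failure FINRA describes: a large enough speed gain outweighs the penalty, and the agent remains perfectly faithful to a badly designed objective.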
What FINRA Actually Expects Firms to Do
The report is not a rule change. FINRA is applying its existing supervisory framework to a new technology category. But the specificity of the guidance makes the expectations clear.
Supervisory Systems Under Rule 3110
FINRA Rule 3110 requires firms to maintain supervisory systems “reasonably designed” for their business. If that business now includes AI agents, the supervisory system must cover them. The report states that firms relying on GenAI tools must consider “the integrity, reliability and accuracy of the AI model” within their supervisory policies and procedures.
This means:
- Written supervisory procedures (WSPs) need to address AI agent deployment, testing, and monitoring
- Supervisors need to understand what agents are doing, even if they do not understand how the model works
- The firm must demonstrate that its supervisory system catches agent errors before they affect investors
Governance at the Enterprise Level
FINRA expects “enterprise-level supervisory development processes” for GenAI, including formal review and approval processes involving both business and technology stakeholders. Ana Petrovic at Kroll puts it bluntly: the regulatory standards have not changed, but “fitting [technology] into that framework can be trickier than it appears at first glance.”
For agent-specific governance, FINRA recommends firms address:
- How to monitor agent system access and data handling
- Where human-in-the-loop oversight is mandatory versus optional
- How to track agent actions and decisions
- How to establish guardrails or control mechanisms that limit agent behaviors
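These four governance questions can be collapsed into a per-agent guardrail declaration that business and technology stakeholders review before deployment. The structure below is a hypothetical sketch, not a standard format.

```python
# Hypothetical per-agent guardrail declaration, approved at deployment.
GUARDRAILS = {
    "kyc_review_agent": {
        "max_records_per_task": 500,        # caps data handling per run
        "allowed_data_classes": ["KYC"],    # what the agent may touch
        "human_approval_actions": ["escalate_to_sar"],  # mandatory sign-off
        "audit_all_steps": True,            # full action/decision tracking
    },
}

def within_guardrails(agent_id: str, records_requested: int) -> bool:
    limits = GUARDRAILS.get(agent_id)
    if limits is None:
        return False  # unregistered agents get no access (fail closed)
    return records_requested <= limits["max_records_per_task"]
```

A declaration like this also gives the shadow-AI problem a crisp test: any agent not present in the registry is, by definition, unapproved.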
Shadow AI: The Compliance Blind Spot
The report also addresses “unapproved AI tools, or ‘shadow AI’” and warns that tools adopted informally for notetaking, summarization, or productivity “may still generate records, process sensitive data, or influence decision-making.” For broker-dealers, this is a books-and-records problem under FINRA Rule 4511. If an associated person uses an unapproved AI tool to draft client communications, those interactions are potentially recordable, and the firm’s failure to capture them is a violation.
How This Compares to Other Regulatory Frameworks
FINRA’s approach is notable because it is practical rather than theoretical. Unlike Singapore’s Agentic AI Governance Framework, which provides an abstract lifecycle model, FINRA maps agent risks directly to existing regulatory obligations. A compliance officer reading the FINRA report knows exactly which rules apply. A compliance officer reading Singapore’s framework knows the concepts but still needs to figure out the regulatory mapping.
The EU AI Act, whose high-risk obligations begin applying on August 2, 2026, takes a third approach: classification-based. AI systems used in credit scoring, insurance underwriting, or employment decisions are classified as “high-risk” with mandatory transparency, documentation, and human oversight requirements. FINRA does not classify agents by risk level in the same way but arrives at a similar outcome through its supervisory adequacy standard.
For firms operating across jurisdictions, the compliance burden is additive. A US broker-dealer with European operations needs to satisfy FINRA’s supervisory expectations, the EU AI Act’s high-risk requirements, and potentially GDPR data protection obligations for any agent processing personal data of EU residents.
What Agent Builders Should Do Now
If you are building AI agents for financial services, FINRA’s report is a checklist of what your compliance team will ask about. Here are the five things that matter most.
Build tiered human oversight. Not every agent action needs a human sign-off, but high-risk actions (trade execution, client communications, account modifications) do. Define your tier boundaries before deployment, not after an examiner asks.
Log everything. Prompts, intermediate reasoning, tool calls, data access, and outputs. You need a complete audit trail that a FINRA examiner can follow from input to output. This is not optional instrumentation. It is a regulatory requirement under the existing books-and-records framework.
Scope agent permissions explicitly. Static RBAC does not work for agents that determine their tool requirements at runtime. Use dynamic authorization with scoped tokens that expire after each task. Document the maximum permission boundary for each agent type.
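A sketch of what task-scoped, expiring authorization might look like, assuming a simple in-process capability object. This illustrates the pattern, not any real RBAC or OAuth API.

```python
import secrets
import time

class TaskToken:
    """A short-lived capability scoped to a single task (illustrative)."""

    def __init__(self, scopes: set, ttl_seconds: float):
        self.token = secrets.token_hex(16)  # opaque task credential
        self.scopes = scopes
        self.expires_at = time.monotonic() + ttl_seconds

    def permits(self, scope: str) -> bool:
        # A check fails when the scope was never granted OR when the task
        # window has closed: the agent accrues no standing authority.
        return scope in self.scopes and time.monotonic() < self.expires_at
```

Minting a fresh token per task, rather than granting the agent a long-lived role, keeps the maximum permission boundary documented in one place and makes expiry automatic.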
Validate domain knowledge. If your agent makes claims about regulatory requirements, suitability determinations, or market data, those claims need to be verified against authoritative sources. Hallucinated regulatory interpretations are not just embarrassing. They are potential violations.
Test reward alignment. Before deploying an agent, verify that its optimization objective does not conflict with investor protection. An agent optimizing for speed, throughput, or cost reduction needs explicit constraints that prevent it from cutting corners on compliance.
Frequently Asked Questions
What does FINRA’s 2026 report say about agentic AI?
FINRA’s 2026 Annual Regulatory Oversight Report identifies six specific risks of agentic AI for broker-dealers: unchecked autonomy, scope creep beyond user intent, auditability challenges in multi-step reasoning, data sensitivity failures, domain knowledge gaps, and misaligned reward functions. It maps these risks to existing supervisory obligations under FINRA Rule 3110.
Does FINRA require human-in-the-loop oversight for AI agents?
FINRA does not mandate blanket human-in-the-loop requirements for all AI agents. Instead, it expects firms to determine where human oversight is necessary based on the type and scope of each agent. Firms must establish supervisory procedures that are reasonably designed for their specific AI agent deployments under Rule 3110.
How does FINRA define AI agents in its 2026 oversight report?
FINRA defines AI agents as systems or programs capable of autonomously performing and completing tasks on behalf of a user, that can interact within an environment, plan, make decisions, and take action to achieve specific goals without predefined rules or logic programming. This distinguishes them from traditional GenAI chatbot tools.
What is the biggest compliance risk of AI agents in finance?
Reward misalignment is among the most critical risks. Poorly designed objective functions can cause agents to optimize for speed, throughput, or cost reduction at the expense of investor protection. Unlike other risks that are detectable through monitoring, misalignment is embedded in the agent’s design and produces outputs that appear correct while systematically undermining compliance goals.
Does FINRA’s AI guidance apply to firms already using GenAI tools?
Yes. FINRA’s guidance applies to all member firms using GenAI, not just those deploying autonomous agents. The report covers risks from basic GenAI use cases like summarization through to fully autonomous agent deployments. Firms must ensure their supervisory systems, written supervisory procedures, and recordkeeping address all AI tools in use, including unapproved shadow AI tools adopted by employees.
