Goldman Sachs and Anthropic Build AI Agents for Wall Street's Back Office

Goldman Sachs is building autonomous AI agents with Anthropic’s Claude to handle trade accounting, client onboarding, and compliance work across $2.5 trillion in assets under supervision. The bank embedded Anthropic engineers inside its technology teams for six months to co-develop the agents. Early results show 30% faster client onboarding and over 20% developer productivity gains. More than 12,000 Goldman developers now use Claude daily.

This is not a chatbot bolted onto an internal knowledge base. Goldman is deploying agents that reason through multi-step compliance processes, parse millions of transactions, and apply regulatory judgment at a scale that would otherwise require hundreds of additional back-office hires.

What Goldman Is Actually Building with Claude

Goldman’s Chief Information Officer, Marco Argenti, told CNBC the bank was “surprised” at how capable Claude was at tasks beyond coding. The specific areas where agents are being deployed break down into two tracks.

Trade Accounting and Reconciliation

The first track targets trade reconciliation and transaction accounting. Goldman processes millions of transactions annually. Each one needs to be matched, verified, and reconciled across counterparties, custodians, and clearing houses. When records do not match, someone has to investigate, and that investigation requires understanding contracts, corporate actions, and settlement rules simultaneously.

The Claude-based agents review these transactions, flag discrepancies, and resolve straightforward mismatches without human involvement. Complex cases get escalated with a pre-built analysis of what went wrong and what the likely resolution is. Settlement delays drop because the agents work around the clock and do not wait for a morning queue.
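The match, auto-resolve, or escalate flow described above can be sketched in a few lines. This is a minimal illustration, not Goldman's implementation: the `TradeRecord` fields, the `ROUNDING_TOLERANCE` value, and the rule that a rounding-only difference counts as a "straightforward mismatch" are all assumptions made for the example.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class TradeRecord:
    trade_id: str
    quantity: int
    amount: Decimal  # settlement amount in the trade currency


# Hypothetical tolerance: differences at or below this are treated as rounding.
ROUNDING_TOLERANCE = Decimal("0.01")


def reconcile(internal: list[TradeRecord],
              external: list[TradeRecord]) -> dict[str, list[str]]:
    """Sort trade IDs into matched, auto-resolved, and escalated buckets."""
    external_by_id = {rec.trade_id: rec for rec in external}
    result: dict[str, list[str]] = {
        "matched": [], "auto_resolved": [], "escalated": []
    }

    for ours in internal:
        theirs = external_by_id.pop(ours.trade_id, None)
        if theirs is None:
            # The counterparty has no record of this trade.
            result["escalated"].append(ours.trade_id)
        elif ours == theirs:
            result["matched"].append(ours.trade_id)
        elif (ours.quantity == theirs.quantity
              and abs(ours.amount - theirs.amount) <= ROUNDING_TOLERANCE):
            # Rounding-only break: resolved without human involvement.
            result["auto_resolved"].append(ours.trade_id)
        else:
            result["escalated"].append(ours.trade_id)

    # Anything left exists only in the external feed.
    result["escalated"].extend(external_by_id.keys())
    return result
```

In a real deployment the escalated bucket would carry the pre-built analysis the article mentions; here it carries only the trade ID to keep the sketch short.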

KYC, AML, and Client Onboarding

The second track handles know-your-customer (KYC) and anti-money laundering (AML) compliance for client onboarding. This process involves collecting identity documents, verifying beneficial ownership structures, screening against sanctions lists, and applying risk-scoring frameworks that change every time a regulator updates guidance.

These are tasks that combine document parsing, rule application, and judgment. A single institutional client onboarding can involve hundreds of pages of documentation across multiple jurisdictions. The agents interpret policy language, execute multi-step verification processes, and flag anomalies, cutting the median onboarding time by 30%.
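To make the multi-step shape of that workflow concrete, here is a toy pipeline that runs the four checks named above and escalates if any of them produces a finding. Everything in it is hypothetical: the required-document names, the sanctioned name, the `XX` jurisdiction code, and the exact-match screening rule are placeholders, and real sanctions screening relies on fuzzy matching rather than string equality.

```python
from dataclasses import dataclass, field

# Every value below is invented for illustration.
REQUIRED_DOCUMENTS = {"certificate_of_incorporation", "ownership_chart"}
SANCTIONED_NAMES = {"blocked holdings ltd"}
HIGH_RISK_JURISDICTIONS = {"XX"}


@dataclass
class Applicant:
    name: str
    documents: set[str]
    owners: dict[str, float]  # beneficial owner name -> ownership percentage
    jurisdictions: list[str]


@dataclass
class Outcome:
    decision: str  # "approve" or "escalate"
    findings: list[str] = field(default_factory=list)


def onboard(applicant: Applicant) -> Outcome:
    """Run each check in order and collect every finding before deciding."""
    findings: list[str] = []

    # Step 1: document collection.
    missing = REQUIRED_DOCUMENTS - applicant.documents
    if missing:
        findings.append(f"missing documents: {', '.join(sorted(missing))}")

    # Step 2: beneficial ownership should account for the whole entity.
    total = sum(applicant.owners.values())
    if abs(total - 100.0) > 0.01:
        findings.append(f"ownership sums to {total:.2f}%, expected 100%")

    # Step 3: sanctions screening (exact match here; real screening is fuzzier).
    for name in [applicant.name, *applicant.owners]:
        if name.lower() in SANCTIONED_NAMES:
            findings.append(f"sanctions hit: {name}")

    # Step 4: jurisdiction risk.
    risky = sorted(set(applicant.jurisdictions) & HIGH_RISK_JURISDICTIONS)
    if risky:
        findings.append(f"high-risk jurisdictions: {', '.join(risky)}")

    return Outcome("escalate" if findings else "approve", findings)
```

Collecting all findings before deciding, rather than stopping at the first failure, gives the human reviewer a complete picture when a case is escalated.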

The Technical Stack

The deployment runs on Claude Opus 4.6, which includes a 1-million-token context window in beta. That context window is not a marketing number here; processing a full KYC package for a multinational corporate client with subsidiaries across 15 jurisdictions can easily exceed 500,000 tokens of documentation. The agents operate within Anthropic’s Cowork environment, which provides the sandboxed execution layer for multi-step autonomous workflows.
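A rough way to reason about whether a document package fits in that window is to estimate token counts and group documents into batches. The sketch below assumes about four characters per token, which is a common rule of thumb for English text and not a real tokenizer; the function names are made up for this example, and only the 1,000,000 figure comes from the article.

```python
CONTEXT_WINDOW_TOKENS = 1_000_000  # figure quoted in the article
CHARS_PER_TOKEN = 4  # rough rule of thumb, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate the token count by ceiling-dividing the character count."""
    return -(-len(text) // CHARS_PER_TOKEN)


def pack_documents(docs: dict[str, str],
                   budget: int = CONTEXT_WINDOW_TOKENS) -> list[list[str]]:
    """Greedily group document names into batches that each fit the budget."""
    batches: list[list[str]] = []
    current: list[str] = []
    used = 0
    for name, text in docs.items():
        cost = estimate_tokens(text)
        if cost > budget:
            raise ValueError(f"{name} alone exceeds the token budget")
        if used + cost > budget:
            # Close the current batch and start a new one.
            batches.append(current)
            current, used = [], 0
        current.append(name)
        used += cost
    if current:
        batches.append(current)
    return batches
```

A production system would use the provider's own token counting and leave headroom for instructions and model output, both of which this sketch ignores.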

Why Goldman Chose Anthropic Over OpenAI

Goldman’s choice of Anthropic was not accidental. Multiple reports indicate the bank evaluated several foundation model providers and landed on Anthropic for three reasons.

Safety and Interpretability

Financial regulators require explainability. When an AI agent flags a transaction as suspicious or approves a client through KYC, the bank needs to show auditors and regulators how that decision was reached. Anthropic’s focus on Constitutional AI and interpretability research aligns with the documentation requirements that banks face under regulations from the SEC, OCC, and Federal Reserve.

This is different from choosing a model that scores highest on benchmarks. Goldman needs a model whose reasoning chain can be inspected, documented, and defended in a regulatory examination.

The Embedded Engineering Model

Rather than buying an API and building in-house, Goldman embedded Anthropic engineers directly within its technology teams for six months. This co-development model means the agents were built with domain expertise from both sides: Goldman’s people understand trade settlement and compliance workflows, while Anthropic’s people understand how to architect reliable agent systems.

That embedded approach is expensive but reduces the most common failure mode in enterprise AI projects: building something that works in a demo but breaks on real-world edge cases. Goldman’s compliance workflows have thousands of edge cases that only surface when you process actual transactions.

Reliability for Regulated Industries

Banking is not a domain where you can tolerate a 5% error rate and iterate. A false negative in AML screening can result in regulatory fines measured in hundreds of millions of dollars. A false positive in KYC can delay revenue-generating client relationships. Goldman needs agents that are reliable enough to operate with limited human oversight on routine cases while correctly escalating unusual ones.

What This Means for Wall Street’s Workforce

CEO David Solomon has been explicit about the strategy: Goldman will not cut jobs in the near term, but it will constrain headcount growth. The bank describes the agents as “digital colleagues” rather than replacements.

The Math Behind “Constrain Headcount Growth”

Goldman has roughly 46,000 employees. Back-office operations, including accounting, compliance, and client services, represent a significant portion of that workforce. With assets under supervision growing (up from $2.0 trillion to $2.5 trillion in recent years), the traditional approach would require proportional hiring.

If AI agents can handle the incremental workload from growth without adding headcount, Goldman keeps its back-office cost base roughly flat while the business expands, which improves its cost-to-revenue ratio. That is a stronger financial argument than cutting, say, 2,000 jobs, which creates bad press and does not solve the scaling problem.
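The arithmetic behind that argument is simple enough to write down. The growth figures below come from the article; the back-office headcount of 10,000 is a made-up placeholder, since the article says only that back-office staff are "a significant portion" of roughly 46,000 employees.

```python
def growth_rate(old: float, new: float) -> float:
    """Fractional growth from old to new."""
    return (new - old) / old


def proportional_hires(staff: int, old_aus: float, new_aus: float) -> int:
    """Hires needed if headcount had to scale one-for-one with assets."""
    return round(staff * growth_rate(old_aus, new_aus))


# Assets under supervision, in trillions of dollars (from the article).
aus_growth = growth_rate(2.0, 2.5)

# Hypothetical back-office headcount, chosen only to illustrate the scale.
avoided = proportional_hires(10_000, 2.0, 2.5)

print(f"AUS growth: {aus_growth:.0%}")
print(f"Hires avoided under proportional scaling: {avoided}")
```

Under those assumptions, 25% asset growth would have implied 2,500 additional hires, which is the workload the agents are meant to absorb. Real staffing rarely scales perfectly linearly with assets, so treat this as an upper-bound illustration.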

Who Is Actually Affected

The agents target work that is repetitive, rule-heavy, and document-intensive. Junior compliance analysts who spend their days matching transactions against settlement records are the most directly affected. Senior compliance officers who make judgment calls on novel situations are less affected because the agents escalate those cases rather than resolving them.

The more interesting shift is in developer productivity. With 12,000+ developers using Claude for coding tasks and seeing 20%+ productivity gains, Goldman can ship more internal tools with the same team. That compounds over time: more automation built faster, which accelerates the shift further.

How Other Banks Compare

Goldman’s deployment does not exist in a vacuum. The broader Wall Street AI race provides context.

JPMorgan Chase spends roughly $18 billion annually on technology and has moved from pilot projects to 400+ production AI use cases through its “OmniAI” platform. JPMorgan recently replaced external proxy advisory firms with an internal AI tool called “Proxy IQ” for voting decisions on US shares.

Morgan Stanley deployed an OpenAI-powered internal tool for financial advisors to query the firm’s 70,000+ research reports. By late 2023, 98% of advisor teams were using it regularly. The tool has since expanded to investment banking and trading staff.

Citi launched “Stylus Workspaces,” an agentic platform for workflow automation across banking operations.

The difference with Goldman’s approach is the degree of autonomy. While JPMorgan and Morgan Stanley started with knowledge retrieval (search over documents), Goldman went straight to autonomous agents that execute multi-step processes. That is a riskier bet with higher potential payoff.

What Enterprise Teams Can Learn from Goldman’s Approach

Goldman’s deployment offers a playbook that applies beyond banking.

Start with rule-heavy, document-intensive processes. Goldman did not start with investment strategy or client advisory. They started where the rules are clear, the documents are structured, and the cost of getting it wrong is measurable. Trade reconciliation and KYC compliance are ideal agent targets because success is objective: either the transaction matches or it does not. Either the client passes screening or they do not.

Embed vendor engineers. The six-month embedded partnership is the single most important detail in this story. Most enterprise AI projects fail because of the gap between what the model can do in a lab and what the business actually needs. Embedding Anthropic engineers inside Goldman’s teams closed that gap by forcing both sides to confront real-world edge cases early.

Measure capacity gains, not just cost cuts. Goldman’s framing of “constrain headcount growth” instead of “reduce headcount” is strategically smart. It avoids the political costs of layoffs while capturing the financial benefit. For CFOs evaluating AI agent projects, this framing is easier to approve because it supports growth rather than cutting into the existing organization.

Invest in observability from day one. Agents operating across $2.5 trillion in assets need comprehensive audit trails. Every decision, every escalation, every resolved discrepancy must be logged in a format that regulators can review. According to PYMNTS, only 7% of enterprise CFOs have deployed agentic AI in live workflows so far, but 70% are “very or extremely interested.” The ones who succeed will be those who build the compliance and observability layer first, not after.
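One common pattern for a reviewable audit trail is an append-only log in which each entry stores a hash of the previous one, so that later tampering is detectable. The sketch below is a generic illustration of that pattern using Python's standard library; the `AuditLog` class and its field names are assumptions for this example, not a description of what Goldman or Anthropic actually built.

```python
import hashlib
import json
from datetime import datetime, timezone


class AuditLog:
    """Append-only log where every entry is chained to the one before it."""

    def __init__(self) -> None:
        self.entries: list[dict] = []

    @staticmethod
    def _digest(entry: dict) -> str:
        # Hash every field except the stored hash itself.
        body = {k: v for k, v in entry.items() if k != "hash"}
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def record(self, agent: str, action: str, case_id: str,
               rationale: str) -> dict:
        entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "agent": agent,
            "action": action,  # e.g. "resolved" or "escalated"
            "case_id": case_id,
            "rationale": rationale,
            "prev_hash": self.entries[-1]["hash"] if self.entries else "",
        }
        entry["hash"] = self._digest(entry)
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Return True only if no entry was altered, removed, or reordered."""
        prev = ""
        for entry in self.entries:
            if entry["prev_hash"] != prev or entry["hash"] != self._digest(entry):
                return False
            prev = entry["hash"]
        return True
```

Storing the rationale alongside the action is what lets a reviewer see why a case was resolved or escalated, not just that it was.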

Frequently Asked Questions

What AI agents is Goldman Sachs building with Anthropic?

Goldman Sachs is building autonomous AI agents using Anthropic’s Claude Opus 4.6 for trade accounting, transaction reconciliation, KYC (know-your-customer) verification, AML (anti-money laundering) compliance, and client onboarding. The agents manage operations across $2.5 trillion in assets under supervision.

Why did Goldman Sachs choose Anthropic over OpenAI?

Goldman selected Anthropic for three reasons: safety and interpretability features needed for regulatory compliance, the embedded engineering model where Anthropic engineers worked inside Goldman’s teams for six months, and reliability for regulated industries where error rates must be extremely low.

Will Goldman Sachs AI agents replace jobs?

Goldman CEO David Solomon has stated the bank will not cut jobs in the near term but will constrain headcount growth. The agents handle incremental workload from business growth without proportional hiring increases, focusing on rule-heavy tasks like transaction matching and compliance screening.

What results has Goldman Sachs seen from its AI agents?

Early results show 30% faster client onboarding for KYC and compliance workflows, and over 20% developer productivity gains across the 12,000+ Goldman developers using Claude for coding tasks.

How does Goldman’s AI deployment compare to JPMorgan and Morgan Stanley?

JPMorgan has 400+ production AI use cases through its OmniAI platform with an $18 billion annual tech budget. Morgan Stanley uses OpenAI-powered tools for research retrieval. Goldman’s approach differs by deploying autonomous agents that execute multi-step processes and escalate complex cases to humans, rather than knowledge retrieval tools.
