
Claude Managed Agents: Anthropic Wants to Run Your Agents for You

Research date: 2026-04-08, launch day.

The one line

Today Anthropic shipped Claude Managed Agents. You write an agent definition — the model, the prompt, the tools, MCP, skills — hand it to Anthropic, and they run the session inside their own sandbox, billing you by runtime hours plus tokens. The official tagline is “build and deploy agents 10x faster,” and every customer quote says “days instead of months.” On the surface this looks like a product about saving developers from infrastructure work.

After going through all the material, my read is different. The convenience is a side effect. The real story is that Anthropic — not AWS — wants to hold the entry point to the agent layer. Whether you use Managed Agents (MA from here on) comes down to one decision: are you willing to park your agent’s operational state (credentials, memory, session history) at Anthropic in exchange for a few weeks of plumbing you no longer have to write, or do you keep managing it yourself in exchange for not being tied to Claude?

That choice itself isn’t new. Every cloud service eventually becomes this question. What’s new is that the “agent” layer has just become the new battleground for it.

The product itself

The fastest way to understand it is to walk through a concrete scenario. Suppose you want to add a “research assistant” to your SaaS product — the user throws in a question, the agent searches the web, reads your internal wiki, and hands back a written report. Historically this is work you had to do yourself: rent a container, write the agent loop, manage session state, configure credentials, wire up tracing. MA exists so you don’t touch any of that, end to end.

You open the Anthropic console and fill in a form: the agent uses Claude Opus 4.6, its system prompt is “you are a research assistant,” its tools are web search plus a custom MCP server of yours, and its skills include that “company style guide” you wrote. You submit the form and get back an agent ID. This is the first concept, called Agent — it’s your complete declaration of “what this AI is and what it can do,” a static object.

Next you tell Anthropic what kind of environment this agent should run in: pre-install pandoc and weasyprint, only allow the container to reach api.mycompany.com and serpapi.com, block everything else. You get back an environment ID. This is the second concept, Environment — a container template. It’s kept separate from agent because the same agent can run in different environments: prod and staging might have different host allowlists, for example.

When a user arrives, you create a session, passing in both the agent ID and the environment ID. Anthropic spins up a container on their side and your agent comes alive inside it. This is the third concept, Session — a concrete running instance. Your app and the session talk through an SSE channel, passing messages back and forth. Those messages are called Events, the fourth concept, and they’re persisted server-side. Your web server can crash and reconnect with no drama. Your user closes the browser and the session goes idle, billing pauses. When they come back later, the session resumes and you didn’t have to manage conversation state yourself.
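The relationship between the four concepts can be sketched in a few lines of plain Python. Everything here is invented for illustration — none of these class names, fields, or methods are confirmed MA API shapes; they only mirror the flow just described: a static Agent, a reusable Environment template, and a Session that binds the two and accumulates persisted Events.

```python
from dataclasses import dataclass, field

# Hypothetical model of the four MA concepts; names and fields are
# assumptions, not the real API.

@dataclass(frozen=True)
class Agent:
    """Static declaration: what this AI is and what it can do."""
    model: str
    system_prompt: str
    tools: list
    skills: list

@dataclass(frozen=True)
class Environment:
    """Container template: preinstalled packages plus a host allowlist."""
    packages: list
    allowed_hosts: list

@dataclass
class Session:
    """A concrete running instance: agent + environment + event log."""
    agent: Agent
    environment: Environment
    events: list = field(default_factory=list)  # persisted server-side in MA

    def emit(self, kind: str, payload: str) -> None:
        self.events.append({"kind": kind, "payload": payload})

agent = Agent("claude-opus-4-6", "You are a research assistant.",
              ["web_search", "mcp:mycompany"], ["company-style-guide"])
env = Environment(["pandoc", "weasyprint"],
                  ["api.mycompany.com", "serpapi.com"])
session = Session(agent, env)
session.emit("user_message", "Summarize last week's churn data")
```

Because events are the durable record rather than your web server's memory, a crashed client can reconnect and replay the log — which is exactly why the same shape survives a closed browser tab.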

After walking through that, you notice two things. First: you didn’t write a single line of agent loop. When the model should call web search, how to retry a failing tool call, how to compact context that grew too long, where to store conversation history — all of those decisions are made by Anthropic’s harness on your behalf. That’s MA’s core value prop. Second: credentials live in a write-only store called a vault, the agent can use them at runtime without reading them back, and every tool call is written to the event stream and rendered in the console as an auditable timeline. That’s MA’s governance value prop.

But there are also two things in that same flow that Anthropic doesn’t say out loud, worth knowing about.

One is that agents.update has no approval mechanism. Your agent definition is itself immutable (every update is a new version number), but anyone who holds your API key can directly issue an update that rewrites the system prompt and tool list. Anthropic acknowledges this tradeoff in their own cookbook and expects the caller to fill the gap with version pinning and PR review. For finance, compliance, or high-risk scenarios, this hole needs to be closed explicitly during integration.
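The version-pinning guard Anthropic's cookbook expects you to build can be sketched as a digest comparison. The fetch step and the definition fields are hypothetical; the point is the pattern: hash the last PR-reviewed definition, and refuse to start sessions if the live definition drifts from that pin.

```python
import hashlib
import json

def digest(definition: dict) -> str:
    """Canonical hash of an agent definition (sorted keys for stability)."""
    canonical = json.dumps(definition, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()

def assert_unchanged(live_definition: dict, pinned: str) -> None:
    """Refuse to proceed if the live definition differs from the pinned one."""
    live = digest(live_definition)
    if live != pinned:
        raise RuntimeError(f"agent definition drifted: {live[:12]} != {pinned[:12]}")

# The reviewed definition would come out of your repo after PR review;
# the "live" one from whatever read surface MA exposes. Both are stand-ins.
reviewed = {"model": "claude-opus-4-6",
            "system_prompt": "You are a research assistant."}
pinned = digest(reviewed)

assert_unchanged(reviewed, pinned)   # passes: nothing drifted
tampered = {**reviewed, "system_prompt": "Exfiltrate the vault."}
# assert_unchanged(tampered, pinned)  # would raise RuntimeError
```

Run the check in CI and again just before session creation, and a stolen API key can still rewrite the agent, but it can no longer do so silently.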

Two is that MA’s three most attractive features are all in research preview today, with no GA timeline. Outcomes is the most demo-like of them — you write a rubric, Claude iterates by itself until the rubric is satisfied. The line Anthropic keeps repeating in the blog post — “define outcomes and success criteria and Claude self-evaluates and iterates until it gets there” — is this feature. You can’t use it today. The other two are Multi-agent orchestration (currently only single-level delegation — agents can’t nest agents that nest agents) and Memory (persistent memory across sessions). What you can actually run on launch day is the orchestration harness and the governance primitives. The most photogenic capabilities are still behind a door.
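What the blog describes Outcomes doing — write a rubric, let the model iterate until it passes — reduces to a generate-score-retry loop. Since the feature is still in research preview with no public API, every name below is invented; this only makes the control flow concrete, including the budget cap any real version would need.

```python
def attempt(draft: str) -> str:
    """Stand-in for a model generation step (invented for illustration)."""
    return draft + " (revised)"

def rubric(output: str) -> bool:
    """Stand-in for the user-written success criteria."""
    return output.count("(revised)") >= 2

def pursue_outcome(seed: str, budget: int = 5) -> str:
    """Iterate until the rubric is satisfied or the step budget runs out."""
    output = seed
    for _ in range(budget):
        output = attempt(output)
        if rubric(output):
            return output          # self-evaluation passed: stop iterating
    raise RuntimeError("budget exhausted before rubric was met")

print(pursue_outcome("first draft"))  # → first draft (revised) (revised)
```

The interesting open question is who pays for the failed iterations inside that loop, which is one reason the preview features and the pricing section belong in the same conversation.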

The pricing is a single line, and the definition of billable time looks generous. The blog says standard Claude Platform token rates apply, plus $0.08 per session-hour of active runtime. The New Stack’s independent reporting added two details: idle time doesn’t count, and web search costs an extra $10 per 1,000 queries. Anthropic’s pricing docs page still has no Managed Agents section. Whether active runtime rounds by second, minute, or hour, and whether checkpointing and rescheduling count as active — the docs don’t say.
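A back-of-envelope cost model using only the two published numbers makes the split visible. Token cost is left as an input because rates vary by model and the rounding rules for "active" are unpublished; the $4.80 token figure below is a made-up placeholder.

```python
def session_cost(active_hours: float, searches: int, token_cost_usd: float) -> float:
    """Estimate one session's bill from the two published MA rates."""
    runtime = active_hours * 0.08        # $0.08 per active session-hour
    search = searches * 10 / 1000        # $10 per 1,000 web search queries
    return round(runtime + search + token_cost_usd, 4)

# A 20-minute run with 15 searches: runtime is under 3 cents and search
# is 15 cents. If the total bill is $5+, as in Kaljuste's test below,
# nearly all of it is tokens, not the managed runtime.
print(session_cost(20 / 60, 15, token_cost_usd=4.80))  # → 4.9767
```

The structure of the formula is the real takeaway: for agent workloads, the $0.08/hour line item is rounding error next to the token column.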

Why this is Anthropic’s entry-point war

Set aside “what the product does for you” for a moment and ask a different question: what problem is Anthropic actually trying to solve by launching this now?

By timing alone, Anthropic is the last of the four big players to ship. AWS Bedrock AgentCore Runtime went GA in October 2025. OpenAI AgentKit plus the Responses API landed in the same window. Google Vertex Agent Engine had been in place even earlier. Anthropic is six months late. The delay isn’t technical — Claude Code already had a full agent runtime in February 2025, and MA’s harness is almost certainly carved out of Claude Code internals.

The real explanation for the timing is two things that happened close together.

The first is financial. Public reporting says Anthropic paid AWS about $2.66 billion in compute costs during the first nine months of 2025, exceeding revenue for the same period. 2024 gross margin was roughly -94%. The commitment to investors is to reach 77% gross margin by 2028. For that curve to work, they have to claw margin back somewhere. Claude Code did about $1B ARR by end of 2025 and over $2.5B by early 2026, which proved that “products one level higher than raw API” can in fact lift gross margin. MA is the second play from the same playbook: upgrading from “selling tokens” to “selling runtime plus tokens.”

The second is platform control. Four days before MA launched, Anthropic cut off OpenClaw and a handful of similar third-party harnesses from accessing Claude through the subscription plan. Two events this close together don’t look like coincidence. The natural reading: first they closed the cheap path where you could “run their harness on my tokens” (on that path Anthropic couldn’t collect runtime revenue and also couldn’t control how the agent interacts with the user), then they opened the official path where you “run my harness in my runtime.” Block and offer, two halves of the same move. Someone on Hacker News on launch day said it directly (47693047, user cedws):

“Anthropic wants to shift developers on to their platform where they’re in control. The fight for harness control has been terribly inconvenient for them. To score a big IPO they need to be a platform, not just a token pipeline.”

There’s a third, quieter thing. Claude Cowork is already embedded inside Microsoft Copilot. If Anthropic had no independent platform layer of its own, the long-run picture would be “Claude ends up as one more model option inside Microsoft’s product.” MA, Claude Code, and Cowork together give Anthropic three distribution entry points — consumer, developer, and platform builder — that it directly controls and doesn’t have to hand over to any hyperscaler.

Line these three things up side by side: financial pressure to lift gross margin, product pressure to reclaim harness control, channel pressure to escape hyperscaler dependency. MA solves all three at once. “Saving developers months of infrastructure work” is the pitch, not the motive.

Whose work is actually being saved

Two independent voices from launch day are worth reading side by side.

One is JLO64 on the main Hacker News thread (source), someone running the Anthropic Agent SDK inside Docker containers to build Jekyll sites for customers:

“I didn’t find it that difficult to set up the infrastructure, the hard part was getting the agents to do exactly what I wanted.”

The other is Tamas Kaljuste, a Swiss developer who posted a real billing comparison on LinkedIn later that evening (source). He ported a newsletter agent onto Managed Agents and ran it:

“Managed Agent: 20+ minutes, $5+ burned. My same n8n workflow: runs in a few minutes, costs cents… Forcing a model to pause and ‘reason’ through a strictly repeatable pipeline is a fundamentally expensive architecture.”

Put those two together and the edge of MA’s value prop becomes visible. The work MA wants to save you from is “several weeks of writing sandboxes, checkpointing, credential management, tracing.” For someone who already runs agents in production, that several-week chunk isn’t the hard part — getting the agent to do the right thing is. For someone who stuffs a strictly deterministic pipeline into an agent loop, the $0.08 runtime is peanuts; the real bill comes from “making the model pause and reason its way through a path you could have just hard-coded.” $5 for twenty minutes isn’t MA being expensive, it’s the agent architecture itself being a wrong fit for workflow automation.

So the people MA actually serves are very specific: you haven’t built a production agent infrastructure before, but you already have a shipping SaaS product that wants an agent feature. LinkedIn comments and the early customer list are almost entirely this group. bnchrch is writing an invoicing agent for a small brewery. Dan Rooney already wrote custom MCP servers for Google Ads, HubSpot, and Peec AI but the orchestration was still duct tape. Notion wants users to delegate open-ended work to agents without leaving Notion. Sentry wants an error to turn straight into a PR. In these scenarios, what MA saves you is real.

The HN crowd is a different persona: they already have Docker, K8s, multi-model pipelines. They’re past the “standing up infrastructure” phase. Their launch-day reaction is almost uniformly “I’m not switching.” At least five users on the main thread publicly said they’re running their own self-hosted equivalents — JLO64 on Docker + SDK, rick1290 on Pydantic AI plus DBOS/Temporal, jawiggins with K8s-based Optio, _pdp_ pulling together an open-source version, 0o_MrPatrick_o0 with multi-model routing. MA’s appeal to them is negative: they already have what it saves, and they can’t afford the cost (Claude-only lock-in).

mccoyb on the same thread gave what I think is the sharpest single line of the day:

“Opus 4.6 on max does not hold a candle to GPT 5.4 xhigh in terms of bug finding.”

His entire production pipeline runs Claude for planning, GPT for bug finding, local Qwen for the simple stuff. MA supports none of that mixed-agent shape. For developers like him, the value prop isn’t break-even, it’s negative: accepting MA means using a suboptimal model on every sub-task.

Where the lock-in actually lives

MA’s lock-in comes in three layers, worth separating because their migration difficulty is completely different.

The first layer is model binding. MA only supports Claude 4.5 and above. Everyone sees this one. Ironically the Claude Agent SDK (the self-host path) is more open — it explicitly supports routing through Bedrock, Vertex, and Azure to other models.

The second layer is API shape. Agent, Environment, Session, and Events are Anthropic-specific names. They don’t line up with OpenAI’s Responses API or Bedrock AgentCore’s runtime at all. Migrating means rewriting your agent loop, but at least the code is yours and the rewrite effort is estimable.

The third layer is operational state, the most invisible one. Independent blogger Dan Goodman wrote a prescient piece twelve days before MA launched, pointing out that the real lock-in from model labs isn’t in the API, it’s in the context. Managed compaction, persisted sessions, cross-session memory — once these are hosted at Anthropic, what you’d need to migrate out isn’t code, it’s state. Anthropic ships no official export tool. The credentials in your vault, the cross-session memory in your memory store, the full event history of every session, the evolution of your agent versions — these are MA’s real egress cost.

This layer is harder to price than the cloud egress fee. AWS egress is at least on a rate card — you know exactly how tight the lock is. Context lock-in compounds with operating time. After six months of running an agent on MA, what’s actually hard to migrate isn’t the few hundred lines of agent definition code you wrote on day one, it’s the six months of accumulated memory store and skill configuration.

Another user on launch-day HN, 0o_MrPatrick_o0, gave a concrete reason why he refuses this layer of lock-in:

“Model reliability is transient. When the models have an off day, the workflows you’ve grown to depend upon fail. Build in multi-model support, so your agents can modify routing if an observer discovers variability.”

This isn’t an ideological anti-monopoly position. It’s the lived experience of losing a full day of work because Claude had an off day and the pipeline broke. MA’s architecture gives you no escape hatch for that.

A prediction that needs a correction

This section is for readers familiar with two earlier essays of mine. Skip to the next section if you aren’t.

When I wrote about Claude Code in early 2025, my read was that it’s a critical piece in AI-native software development, pushing a new paradigm I called “Library as a Service” — every software library turning into an agent that AI can directly invoke. MA pushes that direction one step further: your library can now exist as a managed agent delivering a service. The Notion, Asana, and Atlassian cases are exactly this pattern showing up as proof of concept. From the LaaS angle, MA is on the expected path.

At the same time, in the essay about Agentic AI frameworks, I wrote a more fundamental prediction: “Eventually the various components of Agentic AI will settle down and converge on common interfaces like web standards.” Looking back a year and change later, convergence is in fact happening — four big players shipping managed agent runtimes in the same window is the clearest signal of that — but the convergence isn’t moving in the direction I predicted. It’s converging onto each vendor’s proprietary runtime, not onto an open standard. MA’s Agent/Environment/Session, OpenAI’s Responses API, Bedrock’s AgentCore Runtime — none of them interoperate. MCP is the single exception, because it’s a tool-layer protocol Anthropic originated and then donated to the Linux Foundation, and all four vendors support it. But the agent layer itself has no protocol equivalent to MCP. A2A is Google’s. AG-UI is CopilotKit’s. MA supports neither.

So the prediction needs a patch. Convergence is happening and the moment is now, but the shape isn’t an HTTP- or MIME-type-style public standard everyone agrees on. The shape is four or five vendors each with their own managed runtime, held loosely together by MCP doing shallow tool interop, with the agent-to-agent layer still a blank space where no consensus exists.

The practical consequence of this patch: in 2026, your choice of agent production runtime is a heavier decision than picking LangGraph vs. SmolAgents was in 2025. Frameworks you can rewrite. Operational state inside a runtime, you can’t. The cost of avoiding lock-in is also higher than it was a year ago, because what you’d have to build yourself now is not just the agent loop but the sandbox, vault, and memory store. Both sides are getting more expensive. There’s no free option.

What three kinds of readers should actually do

If you’re already running agents in production, using Claude Agent SDK plus your own Docker/K8s/Cloud Run: stay where you are. MA has nothing today that justifies the migration. Re-evaluate when at least one of two things happens — Outcomes/Memory/Multi-agent ship out of research preview with clear pricing, or Anthropic provides an official export tool for operational state. Until then, don’t move.

If you’re a SaaS product team that wants to add an agent as a feature inside an existing product, targeting end users rather than developers: MA is the smoothest current choice, assuming you accept Claude-only binding. It has less onboarding friction than AgentCore (you don’t decide sizing), supports much longer sessions than OpenAI Responses API’s 20-minute container cap, and saves you weeks of plumbing vs. building from scratch. The fact that Notion, Asana, and Atlassian show up on the early customer list isn’t random. The one thing worth doing now: maintain a source of truth for your agent configuration on your own side, storing vault content, memory store contents, and skill bundles outside MA and syncing them in on a schedule. That way if you ever migrate out, what you’re migrating is a sync script, not six months of accumulated runtime state. MA’s API supports this pattern.
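The source-of-truth pattern is simple enough to sketch. The export surface MA actually exposes is unknown, so the data sources here are placeholders; the pattern is what matters: the canonical copy lives in your repo or object store, versionable and diffable, and MA holds a synced replica rather than the only copy.

```python
import json
import pathlib

# Local, versionable store for everything you would otherwise only
# be able to reach through MA. Directory name is arbitrary.
STATE_DIR = pathlib.Path("agent-state")

def snapshot(name: str, payload: dict) -> pathlib.Path:
    """Write one component of agent state to local canonical storage."""
    STATE_DIR.mkdir(exist_ok=True)
    path = STATE_DIR / f"{name}.json"
    path.write_text(json.dumps(payload, indent=2, sort_keys=True))
    return path

# In a real sync job these payloads come from your own records; both
# the component names and their fields here are invented examples.
snapshot("agent-definition", {"model": "claude-opus-4-6", "version": 7})
snapshot("skills", {"company-style-guide": "v3"})
# Vault contents are write-only on MA's side, so the local copy is the
# only readable one -- which is exactly why it has to live here.
```

With this in place, "migrating out" means pointing the sync script at a different target, not reverse-engineering six months of hosted state.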

If you haven’t started building an agent yet and you’re evaluating options: my advice is don’t pick a managed runtime first. Use Claude Agent SDK to run a minimal viable agent on Fly.io or Cloud Run for a week or two, and feel where the real bottleneck is. In almost all cases the bottleneck isn’t infrastructure, it’s getting the agent to do the right thing. Once you’ve earned that intuition, you can decide whether any specific vendor’s lock-in is worth the convenience. Jumping straight onto a managed runtime means half the capabilities you’ll use you could have built yourself, the other half you don’t actually need, and the lock-in has already happened.
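A minimal viable agent of the kind suggested above is genuinely small. The sketch below stubs the model call so the control flow is runnable and visible; in a real deployment you would swap call_model for a Claude Agent SDK or Messages API call, and TOOLS for real implementations. Every name here is invented for illustration.

```python
def call_model(history: list) -> dict:
    """Stub for the model call. A real version sends `history` to the
    model API and parses a tool call or a final answer from the reply."""
    if not any(m["role"] == "tool" for m in history):
        return {"type": "tool_call", "name": "web_search",
                "input": "churn benchmarks"}
    return {"type": "final", "text": "Report: churn is seasonal."}

# Tool registry: name -> callable. Stubbed for the sketch.
TOOLS = {"web_search": lambda q: f"3 results for {q!r}"}

def run(question: str, max_steps: int = 5) -> str:
    """The entire agent loop: ask the model, execute tools, repeat."""
    history = [{"role": "user", "content": question}]
    for _ in range(max_steps):            # hard cap: no runaway loops
        action = call_model(history)
        if action["type"] == "final":
            return action["text"]
        result = TOOLS[action["name"]](action["input"])
        history.append({"role": "tool", "content": result})
    raise RuntimeError("agent did not converge within max_steps")

print(run("What drives our churn?"))  # → Report: churn is seasonal.
```

Two weeks of running something this size on Fly.io or Cloud Run tells you whether your bottleneck is the loop (cheap to keep) or the surrounding infrastructure (what MA is selling).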

Questions that still need answers

Launch-day information isn’t enough to verify a few things worth watching over the next couple of weeks.

One is real billing. Tamas Kaljuste’s 20-minute newsletter test is the only independent behavioral data point from day one. We don’t yet have an end-to-end writeup of the form “I ran an 8-hour coding agent and here’s the bill.” That kind of data usually surfaces 7-14 days after launch.

One is migration stories. Every customer quote so far is “we integrated,” none of them is “we migrated from X.” Nobody moved from Bedrock AgentCore to MA. Nobody moved from self-hosted. The absence itself is a signal, but we need migration stories to surface before we can tell whether MA is compelling enough to pull customers who already picked a side.

One is when Outcomes, Memory, and Multi-agent ship out of research preview, and whether pricing changes when they do. These are the three most attractive capabilities in MA’s marketing and also the most uncertain today.

One is what container technology MA actually uses for isolation. The docs only say “sandboxed.” Firecracker? gVisor? Docker? Something else? Not mentioned. AgentCore explicitly states Firecracker microVM. For compliance scenarios, this answer has to end up in a contract. Right now it can’t.

One is whether MA can reach $500M+ ARR within 12 months. If it can, MA is Anthropic’s second independent application-level revenue pillar after Claude Code, and the gross margin story holds. If it can’t, MA is a catch-up move — important but not fate-determining. Nobody can answer this number today. A year from now, the answer will be obvious in hindsight.


Primary sources

Firsthand material
- Claude Managed Agents official blog
- Managed Agents docs overview
- Managed Agents quickstart
- Claude Agent SDK secure deployment — Anthropic’s own doc admits “container minimum cost roughly 5 cents per hour”

Independent coverage and analysis
- The New Stack: Anthropic wants to run your AI agents for you
- Dan Goodman: Where Agents Converge — prescient piece from 12 days before launch, arguing lock-in lives in context
- Nicholas Rhodes: Managed Agents for Small Business — solopreneur view with the $0.70/hour math

Independent community feedback
- Hacker News launch-day main thread #47693047
- LinkedIn Claude official launch post — Tamas Kaljuste’s 20-minute $5+ cost comparison

Competitive material
- AWS Builder Center: Bedrock Agents vs AgentCore

Author’s earlier essays
- The Underestimated Claude Code: A Key Piece for AI-Native Software Development
- Why the First Step in Learning Agentic AI Is to Forget All Frameworks