
Introducing Claude Managed Agents

Claude Managed Agents is now available in public beta on the Claude Platform. It delivers a managed runtime and composable APIs for building and deploying cloud‑hosted agents without standing up bespoke infrastructure. The promise is concrete: shift effort from scaffolding and operations to user experience and measurable outcomes. Teams get a faster path to production, predictable governance, and a clearer debugging story.

What is Claude Managed Agents?

Claude Managed Agents is a suite of APIs and a production harness for running AI agents in the cloud. It handles authentication, secure tool execution, state and memory, and how and when the agent calls tools. In short, it removes undifferentiated engineering and gives teams a reliable execution environment with observability. The draw is not a new model trick – it is the reduction of day‑to‑day overhead: no custom permissioning layer to write, no brittle session management, no ad‑hoc retries when a tool call fails. The harness is performance‑tuned for Claude models and designed to be composable, so tools can be added or swapped without rewriting a control loop.
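The composability claim is the key design point: tools are registered against names, and the control loop only ever sees those names. As an illustrative sketch only (these classes and names are hypothetical, not the Claude Managed Agents API), the idea looks like this:

```python
# Illustrative sketch of the composability idea: a tool registry
# maps names to callables, so tools can be added or swapped without
# rewriting the control loop. Hypothetical names, not the real API.

from typing import Callable, Dict

class ToolRegistry:
    """Holds named tools; the loop calls tools by name only."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[[str], str]] = {}

    def register(self, name: str, fn: Callable[[str], str]) -> None:
        self._tools[name] = fn  # adding or replacing a tool is one call

    def call(self, name: str, arg: str) -> str:
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](arg)

registry = ToolRegistry()
registry.register("echo", lambda text: text)
registry.register("shout", lambda text: text.upper())

# Because the loop resolves tools by name, swapping "shout" for a
# different implementation later never touches the loop itself.
print(registry.call("shout", "hello"))  # HELLO
```

The same indirection is what lets a managed harness upgrade or replace tools underneath a running agent without breaking its control flow.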

Security, state, and orchestration by default

Agents run in secure sandboxes where authentication, secrets, and tool execution are handled by the platform. That matters when agents read repositories, touch internal APIs, or prepare documents that include sensitive data. Scoped permissions and identity controls reduce blast radius: each agent operates with only the rights it needs, and every decision and tool call is traceable.

Long‑running work requires durable state. Sessions can persist for hours, and progress survives disconnects. An agent that begins parsing a large CSV does not lose its place if a client drops; it can resume, complete the transformation, then hand the result to a second agent for summarization. This is the difference between a chat demo and a production workflow that runs to completion.
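The resume‑after‑disconnect behavior described for the CSV example boils down to checkpointed progress. Here is a minimal, self‑contained sketch of that pattern (it illustrates the idea, not how the platform stores state internally):

```python
# Sketch of durable progress: checkpoint the row offset so work can
# resume after a disconnect. Illustration only, not the platform's
# internal state mechanism.

import csv
import io

def process_rows(rows, checkpoint, limit=None):
    """Process rows from checkpoint['row'], updating the checkpoint
    as we go so a restart picks up exactly where it left off."""
    start = checkpoint.get("row", 0)
    end = len(rows) if limit is None else min(len(rows), start + limit)
    for i in range(start, end):
        checkpoint["total"] = checkpoint.get("total", 0) + int(rows[i][1])
        checkpoint["row"] = i + 1  # persisted after each row in real use
    return checkpoint

rows = list(csv.reader(io.StringIO("a,1\nb,2\nc,3\n")))
cp = process_rows(rows, {}, limit=2)  # "disconnect" after two rows
cp = process_rows(rows, cp)           # resume: picks up at row 2
# cp["total"] == 6, cp["row"] == 3
```

Because the checkpoint is the only state that matters, the second call neither repeats work nor loses it, which is the property that lets a session survive a dropped client.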

The orchestration harness manages context and memory across steps and recovers from predictable errors. Rather than jamming everything into a single prompt, tools, outcomes, and permissions are defined upfront. The harness coordinates the sequence, ensuring state is consistent and traceable. Multi‑agent coordination is available in a research preview, where agents can spawn and direct other agents to parallelize or modularize work – useful for pipelines like "retrieve sources," "draft," and "fact‑check," or for splitting a codebase analysis into components.
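The "retrieve sources," "draft," "fact‑check" pipeline above can be sketched as steps declared upfront over a shared state, which is the shape of work the harness coordinates. All names here are hypothetical, not the Managed Agents orchestration API:

```python
# Illustrative pipeline: each step is a function over shared state,
# and the sequence is declared upfront rather than buried in one
# prompt. Hypothetical names; not the real orchestration API.

def retrieve(state):
    state["sources"] = ["source A", "source B"]
    return state

def draft(state):
    state["draft"] = f"Summary based on {len(state['sources'])} sources."
    return state

def fact_check(state):
    # A real checker would verify claims; here we just confirm that
    # every upstream step produced its output.
    state["checked"] = all(k in state for k in ("sources", "draft"))
    return state

PIPELINE = [retrieve, draft, fact_check]  # tools and order defined upfront

def run(pipeline, state=None):
    state = state or {}
    for step in pipeline:  # the harness's job: order, state, tracing
        state = step(state)
    return state

result = run(PIPELINE)
```

Declaring the sequence separately from the steps is what makes state consistent and traceable: every transition is visible to the runner rather than implicit in a prompt.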

Outcome‑driven loops and full Console visibility

Outcomes and acceptance criteria can be defined upfront – such as "produce schema‑valid JSON" or "compile successfully with zero lint errors." In a research preview, Claude can self‑evaluate against these criteria and iterate until it reaches the goal, while still allowing prompt‑and‑response control when tighter human supervision is needed. Internal testing has reported up to a 10‑point improvement on structured file generation compared to standard prompting, with the largest gains on harder tasks. That lines up with what teams see in practice: fewer manual retries, cleaner handoffs to downstream systems, and more predictable completion. The Claude Console provides real‑time session tracing, integration analytics, and troubleshooting, shortening the time between "it failed" and "we know why" while giving stakeholders a clear audit trail.
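An outcome like "produce schema‑valid JSON" is attractive precisely because it is machine‑checkable, so the loop can iterate until the check passes. The sketch below shows that loop shape with a stub in place of the model call; `generate` and the key set are assumptions for illustration, not the platform's API:

```python
# Sketch of an outcome-driven loop: define a machine-checkable
# acceptance test and iterate until it passes or the attempt budget
# runs out. generate() is a stub standing in for a model call.

import json

REQUIRED_KEYS = {"title", "summary"}

def meets_outcome(text: str) -> bool:
    """Acceptance criterion: valid JSON object with required keys."""
    try:
        obj = json.loads(text)
    except json.JSONDecodeError:
        return False
    return isinstance(obj, dict) and REQUIRED_KEYS <= obj.keys()

def generate(attempt: int) -> str:
    # Stub: the first attempt is malformed, the second passes.
    if attempt == 0:
        return '{"title": "Report"'  # truncated, invalid JSON
    return '{"title": "Report", "summary": "All checks green."}'

def run_until_green(max_attempts: int = 3) -> str:
    for attempt in range(max_attempts):
        candidate = generate(attempt)
        if meets_outcome(candidate):  # the "green check"
            return candidate
    raise RuntimeError("outcome not met within attempt budget")

output = run_until_green()
```

The point of the pattern is that retries are driven by an objective check rather than a human eyeballing output, which is what removes the manual retry loop the paragraph describes.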

Where Managed Agents fit in the stack

Many teams begin with custom code or a lightweight orchestration library. That works for prototypes, but production brings recurring work: permissions, identity, secrets, retries, timeouts, and upgrade paths when models or tools change. Managed Agents covers this foundation while staying composable. Compared to running agents on custom infrastructure, the managed approach centers three advantages:

  • Security and governance by default: sandboxed execution, scoped permissions, identity management, and tracing for auditable runs.
  • Durability and scale: long‑running sessions, recoverable progress, and coordination across agents without writing custom state management.
  • Observability: first‑class logs and analytics that help teams tune prompts, tools, and success criteria without trawling raw traces.

Real‑world patterns from early adopters

Teams are shipping three common patterns: coding agents that read codebases, propose changes, and open reviewable pull requests; productivity agents that pick up routine tasks and report back with structured outputs; and finance and legal agents that extract, validate, and summarize information from documents with auditable trails. Several organizations are already live. Public examples include Notion's Custom Agents, Rakuten's Slack and Teams specialists (stood up in about a week), Asana's AI Teammates, and Vibecode's prompt‑to‑deploy tooling. The themes in their feedback are consistent: the managed runtime reduces infrastructure overhead, time‑to‑ship moves from months to weeks, and sandboxing with scoped permissions enables safe scaling across teams. Sentry, Atlassian, General Legal, and Blockit are among those embedding agent capabilities for debugging, project management, specialized task creation, and meeting preparation.

Predictions and what to watch next

Outcome‑first development will spread as teams define machine‑checkable goals and let agents iterate to green checks. Governance will become non‑negotiable – scoped permissions, identity per agent, and auditable traces are already moving toward table stakes for procurement and security reviews. Multi‑agent coordination will shift from novelty to norm as standard patterns for splitting and reviewing work emerge, much like build pipelines in software engineering. Cost models will tilt toward runtime plus tokens, pushing teams to invest in outcome checks and lean tool contracts.

The shift from prompts to production hinges on reliability, governance, and speed. Claude Managed Agents addresses those concerns directly: a secure, observable runtime; durable sessions; outcome‑driven loops; and a clean path to multi‑agent workflows. The net effect is fewer weeks lost to infrastructure, more time spent on user experience, and results that stand up to audits and scale. Start with a single, well‑bounded workflow, define the outcomes, and turn on tracing – the first success is not a demo that looks impressive, but a run that completes, passes its checks, and can be repeated the next day without surprises.

Mimmi Liljegren

Founder & CEO
Ayra

Let Ayra do all the work for you!

Ready to take your communication to the next level? Book a demo with the team and we will show you the power of Ayra.