Claude Code Launches Dynamic Workflow: Teaching AI to Build Teams and Get Work Done

Bitsfull2026/06/04 10:035757

概要:

Generate an execution framework on-the-fly for the task, scheduling multiple Agents to divide work, validate, and collaborate.


Editor's Note: Claude Code is evolving from a code assistant to a programmable Agent workspace.


The key value of the workflows described in this article is to transform Claude from merely "thinking before doing" in the same context window to dynamically generating an execution framework based on the task: task decomposition, sub-Agent dispatch, parallel processing, cross-validation, iterative loops, even allowing different Agents to compete with each other, and finally synthesizing the results.


This means that the use cases of Claude Code are significantly expanding. It is not only applicable to code migration, refactoring, test reproducibility, and code reviews but can also be used for in-depth research, fact-checking, resume screening, incident postmortems, rule codification, business plan reviews, name brainstorming, and other non-technical tasks. Many complex tasks are fundamentally similar to programming: they require problem decomposition, context isolation, hypothesis validation, handling of numerous details, and making choices among multiple candidate paths.


Dynamic workflows aim to address several common issues of large models in long tasks: the "agent inertia" that declares completion halfway through, the "self-preference bias" that tends to endorse its own conclusions, and the "goal drift" that gradually deviates from the original objective after multiple executions. By entrusting tasks to multiple Claudes with independent contexts, it transforms complex tasks from "single Agent marathon" to "multi-Agent collaboration."


Of course, workflows are not a universal solution. They often consume more tokens and may not necessarily be suitable for every ordinary coding task. However, they provide a crucial direction: the competition of future AI tools may lie not only in how intelligent a single model is but in its ability to organize a reliable, reusable, and auditable execution flow around a complex objective.


Below is the original text:


While the default Claude Code execution framework is built for programming, it is also applicable to many other types of tasks. Indeed, many tasks structurally resemble programming tasks. However, for certain specific task types to achieve optimal performance, we still need to build customized execution frameworks on top of Claude Code, such as research, security analysis, agent team collaboration, or code reviews.


Workflows allow you to dynamically create execution frameworks, enabling Claude to address the above issues more natively within Claude Code, as well as other types of issues. You can also share and reuse these workflows with others.


In this article, I will share my initial experience and insights using workflows to help you fully unleash its capabilities.


However, it is important to note that relevant best practices are still emerging. Dynamic workflows often consume more tokens, so you need to carefully consider when and how to use them.


Note: This article is also published on the Claude Blog.


Example Prompt


Before diving into technical details, I would like to provide some example prompts to help you understand the possibilities of workflows:


“This test fails approximately once every 50 runs. Set up a workflow to reproduce it, propose hypotheses, and conduct adversarial testing in different worktrees. /goal is not to stop until a hypothesis is confirmed.”


“Using a workflow, review my last 50 sessions to unearth the corrections I have repeatedly made, and transform these recurrent issues into CLAUDE.md rules.”


“Using a workflow, review the last six months of Slack’s #incidents channel to identify the root causes of recurring issues that no one has submitted a ticket for.”


“Run a workflow with my business proposal to deconstruct it from the perspectives of different stakeholders—investors, customers, and competitors.”


“Here is a folder containing 80 resumes. Use a workflow to sort them based on backend job requirements and review the top ten. Use the AskUserQuestion tool to ask me questions and help you establish evaluation criteria.”


“I need to name this CLI tool. Use a workflow to brainstorm a batch of options and then select the top three through a tournament mechanism.”


“Using a workflow, rename our User model to Account everywhere.”


“Review my blog draft and use the workflow to validate each technical judgment in the code repository. I don't want to publish any inaccurate content.”


How Dynamic Workflow Works


The dynamic workflow executes a JavaScript file that contains several special functions used to create and coordinate subagents.



The dynamic workflow also includes standard JavaScript functions such as JSON, Math, and Array for data manipulation.


Of particular note, the dynamic workflow can determine which model a particular agent should use and whether a subagent should run in its own worktree. This allows Claude to autonomously select the required level of intelligence and isolation based on the task at hand.


If a workflow is interrupted, such as by user intervention or termination of the session, upon session recovery, the workflow can resume execution from the interruption point.


Why Dynamic Workflow is Needed


When you task the default Claude Code execution framework with a job, it needs to perform planning and execution within the same context window. While this is very effective for many programming tasks, in long-running, large-scale parallel, or highly structured adversarial tasks, it can sometimes falter.


The reason is that the longer Claude spends processing complex tasks in a single context window, the more susceptible it is to specific failure modes:


Agentic laziness refers to Claude prematurely stopping before completing a particularly complex task composed of multiple parts and claiming completion after making partial progress. For example, in a security audit, claiming completion after reviewing only 20 out of 50 items.


Self-preferential bias indicates that Claude tends to prefer its own results or findings, especially when asked to validate or judge its output based on a specific evaluation criteria.


Goal drift refers to Claude's decreasing fidelity to the original goal over multiple execution rounds, especially after contextual compression. Each summary leads to information loss, and some detailed requirements, such as edge cases or restrictions like "do not do X," may get lost.


Creating a workflow helps alleviate these issues because it can orchestrate multiple independent Claudes, each with its own context window, focusing on isolated and clearly defined tasks.


Dynamic Workflow vs. Static Workflow


You may have previously created a static workflow using the Claude Agent SDK or claude -p to coordinate multiple Claude Code instances.


However, static workflows, due to the need to cover various edge cases, are usually more generic. With the introduction of Claude Opus 4.8 and dynamic workflows, Claude is now intelligent enough to write a tailored execution framework for your specific use case.



Practical Patterns When Using Dynamic Workflows


You can have Claude directly create a dynamic workflow or use the trigger word "ultracode" to ensure Claude Code creates a workflow.


However, if you can establish a mental model of how dynamic workflows operate, it is easier to determine when to use them and guide Claude through prompts more effectively.


When building workflows, Claude commonly uses and combines the following patterns:



Classify and Execute: Use a classifier agent to determine the task type and then route it to different agents or behaviors based on the task type. You can also use a classifier at the end of the process to determine the output.


Fan-out and Aggregate: Break down a task into multiple smaller steps, each handled by an agent, and then aggregate these results. This approach is particularly suitable for tasks with many small steps or where each step requires a clean context window to avoid interference or cross-contamination. The aggregation step acts as a "barrier": it waits for all fan-out agents to complete, then merges their structured output into one result.


Adversarial Validation: For each generated agent, run an independent agent to adversarially validate its output according to a set of evaluation criteria or guidelines.


Generate and Filter: Generate a large number of ideas around a topic, then filter them based on evaluation criteria or verification processes to remove duplicates, returning only the tested and highest-quality ideas.


Tournament: Instead of breaking down the work, have agents compete with each other. Generate N agents and have them each attempt to complete the same task using different methods. Then, have a prompt or model review the agents' results through pairwise comparisons until a winner is chosen.


Iterate Until Completion: For tasks with unknown workloads, do not set a fixed number of rounds. Instead, iterate by generating agents until a stopping condition is met, such as no new discoveries or no more errors in the logs.


Use Cases


You can think more creatively about when and how to have Claude Code create a dynamic workflow. I've found that workflows are sometimes even more useful in non-technical work.



Migration and Refactoring


Bun once used workflows to rewrite from Zig to Rust. You can read Jarred's post on X to learn about the specific process.


The key is to break the task into a series of steps to be dealt with, such as a call point, failing tests, modules, etc. Start a sub-agent in the worktree for each fix task to complete the fix; then have another agent perform an adversarial review, and finally merge the results. Consider explicitly instructing the agent not to use resource-intensive commands so that you can maximize parallelism without depleting local machine resources.


In-depth Research


We released a deep research skill (/deep-research) at Claude Code, which uses dynamic workflows. Specifically, it will fan out to perform web searches, fetch sources, conduct adversarial validation of relevant claims, and then generate a cited report.


However, such research is not limited to web searches. For example, you can also have Claude compile a status report from Slack context or delve into a codebase to study how a feature works.


Thorough Validation



On the other hand, if you have a report and want to fact-check every factual claim and source cited, you can create a workflow: first have an agent identify all factual claims, then start a sub-agent for each claim to conduct a thorough check. You can also have a validation agent inspect the sub-agents responsible for tracing back to ensure the quality of their sources is high enough.


Sorting



You may have a set of items that you want to sort based on a certain qualitative metric, and you believe Claude Code excels at evaluating this metric. For example, sorting support tickets by bug severity.


However, if you try to sort over 1000 lines of content in one go, the quality will degrade, and the context window won't accommodate everything. A better approach is to run a tournament mechanism, establishing a pipeline of pairwise comparison agents, as comparative judgment is usually more reliable than absolute scoring; or to first parallel bucket sort and then merge the results. Each comparison is done by an independent agent, so a deterministic loop can sustain the entire tournament structure, with only the current order of play needing to be kept in context.


Memory and Rule Compliance



If you have a set of specific rules, and Claude, even after seeing these rules in CLAUDE.md, still often overlooks or performs poorly on them, you can create a workflow listing out these rules and have validation agents check them one by one—each rule corresponds to a validation agent. Creating a sub-agent personality as a "skeptic" to review whether these rules are reasonable also helps avoid excessive false positives.


Conversely, you can mine your recent conversations and code review comments to identify corrections you repeatedly make; have parallel agents cluster these issues; then subject each candidate rule to adversarial validation to determine if it actually prevents a real mistake; finally, distill the filtered rules back into CLAUDE.md.


Root Cause Investigation


The most effective way to debug is to propose several mutually exclusive hypotheses and test them one by one. But if you only use a single context window, Claude may fall into confirmation bias.


Workflows can structurally prevent this scenario: they can launch multiple agents to have them generate hypotheses based on non-overlapping evidence. For example, have different agents look at logs, files, and data separately. Subsequently, each hypothesis can undergo scrutiny by a set of validators and rebutters.


This isn't exclusive to just code. Workflows can also be used for sales analysis, like "Why did sales drop in March?"; for data engineering, like "Why did this pipeline fail?"; or for any postmortem analysis.


Large-Scale Triage



Each team has a support queue, bug reports, or other backlogs that cannot be fully handled by humans. A triage workflow can help classify each item, deduplicate against known issues, and take action. This may involve attempted fixes or escalation to a human for resolution.


For the triage workflow, a useful pattern is quarantine. This means that agents that ingest untrusted public content are restricted from performing high-privileged operations; high-privileged operations should be carried out by agents dedicated to such actions.


You can combine triage workflows with /loop to have Claude continuously perform such tasks.


Exploration with Aesthetic Judgment


When you need to explore different paths for solutions, especially tasks involving design, naming, or other aesthetic judgment that can benefit from a set of criteria, workflows are valuable.


You can have Claude explore numerous solutions and provide a set of criteria to the review agent about what constitutes a "good" solution. When the review agent believes the outcome meets the criteria, the task is considered complete. Different solutions can also be ranked or filtered through a tournament-style mechanism based on this evaluation criteria.


Evals


You can run lightweight evals for specific tasks by starting a standalone agent in a worktree, then initiating a comparison agent that scores and compares specific outputs based on evaluation criteria. For example, you can evaluate and enhance a skill you created to see if it meets certain standards.


Model and Intelligence Level Routing: You can create a classification agent tailored to fine-tune your task, allowing it to decide which model to use. This approach is useful when a task involves numerous tool invocations and researching before execution can help identify the most suitable model.


For instance, for a task like "Explain how the auth module works," the most suitable model depends on how many files are in the auth module and what the codebase's structure looks like. The classification agent can conduct this preliminary research and then, based on the expected complexity, route the task to Sonnet or Opus.


When Not to Use Dynamic Workflows


Workflows are still a novelty. While in many use cases, they can bring far more than conventional ways, not every task requires them, and they may significantly increase token consumption.


It is best to use workflows for tasks that can push the boundaries of Claude Code capabilities in a new way. For typical programming tasks, you can start by asking yourself: Does this task really need more computing resources? For example, most traditional programming tasks do not require a team of five reviewers.


Tips for Building Dynamic Workflows


Prompt Design


When writing a prompt for a dynamic workflow, the more detailed, the better the outcome usually, especially when using the specific tips mentioned above.


Workflows are not only suitable for large tasks. You can also prompt the model to use a "quick workflow." For example, you could create a quick adversarial review process to test a hypothesis.


Combining with /goal and /loop


When using workflows that can be executed repeatedly, such as triage, research, or validation workflows, you can combine them with /loop to run at fixed intervals; simultaneously use /goal to set strict completion requirements.


Token Usage Budget


You can set a specific token usage budget for dynamic workflows to limit the token amount consumed by tasks. You can include a budget requirement in the prompt, such as "use 10k tokens," to set the upper limit to 10k tokens.


Saving and Sharing Dynamic Workflows


You can press "s" in the workflow menu to save workflows. You can submit them to ~/.claude/workflows, or distribute them via skills.



If you want to share them through skills, you can place JavaScript workflow files in the skill folder and reference them in SKILL.md. For greater flexibility, you can also prompt Claude to treat workflows in the skill as templates rather than scripts that must be run verbatim.



A Whole New World


Workflows are an exciting new way to extend Claude Code. I encourage you to see it as a starting point. There is much to explore about how best to use it. Please feel free to share your discoveries with us.


Thariq Shihipar and Sid Bidasaria (@sidbid) are members of the Anthropic tech team working on Claude Code.


[Original Article]



Welcome to join the official BlockBeats community:

Telegram Subscription Group: https://t.me/theblockbeats

Telegram Discussion Group: https://t.me/BlockBeats_App

Official Twitter Account: https://twitter.com/BlockBeatsAsia