Who Makes the Decisions for AI Is Carving Out a $40 Trillion Market

Bitsfull · 2026/04/21 19:20

Summary:



Editor's Note: What will differentiate AI agents in the future is not a leap in model capability but a more fundamental design variable: who ultimately bears responsibility.


The author argues that "augmenting humans" and "replacing humans" are not two separate technological paths but two outcomes of the same system under different design choices. When decisions still require a human signature and responsibility can be traced to a specific individual, AI acts as an amplifier; when that link is removed (automatic approvals, bypassed permissions), the system naturally drifts toward replacement.


The article further points out that the true value of AI agents is not in "getting the job done," but in compressing the complex world into a "signable decision unit," allowing humans to take responsibility after understanding. However, in reality, "permission fatigue" causes users to gradually abandon review processes, moving from line-by-line approval to default consent, eventually allowing the system to bypass humans. This is a cognitive mechanism, not an individual problem.


Therefore, the article proposes two key constraints: first, every important decision must be traceable to a specific individual who had a real opportunity to refuse it; second, whoever benefits from an agent's autonomy must bear responsibility when problems arise.


Once responsibility falls back to the builder, the default logic of the system will change. Within this framework, the commercial narrative of AI is also rewritten. Rather than being a handful of tech giants dominating a market by "replacing half of the workforce," it is more of a distributed tool market "enhancing human productivity," anchored in the approximately $40 trillion global knowledge work income, rather than enterprise software spending.


Ultimately, the article boils the issue down to a simple yet pointed choice: is AI here to serve humanity, or to serve itself? The answer is quietly being decided by every product design detail.


Below is the original text:


TL;DR


· The "augmented future" and "replacement future" use the same model, the same tools. What truly distinguishes the two is a design choice about "who ultimately bears the consequences."


· The real work of an agent is not to complete tasks for humans but to compress the complex world into a minimal and faithful "decision-making unit" that someone can sign off on. As long as this compression is done right, everything else will follow naturally.


· And this "someone" must be an identifiable individual. Vague, generalized responsibility disintegrates rapidly under high load, so every action with real consequences must be traceable back to someone with true veto power.


· "Permission fatigue" pushes Agent systems, left to their own evolution, to slide inexorably toward "replacing humans." An "augmented future" is therefore not an automatic outcome; it must be deliberately designed to counteract this tendency.


· If you build an Agent and benefit from its autonomy, then you should also bear responsibility when that autonomy fails. Once the cost truly falls on the builder, the default behavior of the entire system will change accordingly.


· A market formed under the premise that humans still bear responsibility is likely an order of magnitude larger than the heavily funded narrative of "vertical Agents replacing half of all jobs," because it is anchored not to corporate software budgets but to the total wage bill of highly skilled labor.



Claude Code offers a flag called --dangerously-skip-permissions. The naming is honest, and its function is just as literal. An Agent running with this flag is no more capable than one without it; what changes is that a step that previously required a human now bypasses humans.


This flag itself is a form of candor. It acknowledges that with identical underlying capabilities, the same system can operate in both an "enhance human" mode and a "quietly replace human" mode. The so-called replacement mode does not require a different model; it just removes the step of "consent."


This is the compressed argument. In the most powerful Agent systems currently being deployed, much of the gap between "enhancement" and "effective replacement" comes from removing approval, not from inventing a new category of capability. Whether the next decade brings an "augmented human world" or a "world where autonomous Agents act on our behalf" depends less on model capability than on whether the people building these systems see "humans in the loop" as core to the system or as friction.


Is AI to enhance humans or bypass them?


Beneath every technical question lies a non-technical one that few are willing to openly pose: Is AI meant to enhance humanity, or is AI itself the end goal?


These two answers imply very different futures. The "augment" position holds that value resides in humans themselves, and an Agent's job is to help an individual go further and make better decisions. The "AI as an end" position sees intelligence itself as the true value and humans as merely inefficient vessels. Most Agent products silently encode one of these positions, yet surprisingly few founders have been asked directly which camp they belong to.


Capability design and consent mechanism design are still evolving. This article will focus on the "consent" side, as it is a variable that builders can truly control today. Even as the cost of generating capabilities becomes cheaper, what retains economic value are attributes that cannot be separated from humans: judgment, taste, relationships, responsibility, and the willingness to sign off on a decision and bear its consequences. Among these, "responsibility" is the most concrete element and the only one already backed by centuries-old enforcement infrastructure.


Responsibility: The Boundary Between Augmentation and Substitution


The structural rule distinguishing between an "augmented future" and a "substitutive future" can be roughly expressed as follows: Any action taken by an Agent that has real-world consequences must be traceable back through a recorded chain to a specific individual—a person who saw the relevant context and genuinely had the opportunity to say "no" to it.


Generalized responsibility quickly fails this test. "The company is responsible" has no operational content. "The user clicked consent" consents to nothing specific. "A human reviewed the process" permits someone to review something entirely different from what eventually ships. What is required is a named individual who saw the decision in front of them, had the option to reject it, and chose not to.
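The traceability test above can be sketched as a small data structure. Everything here (the `SignedDecision` type, its field names, the list of vague signers) is a hypothetical illustration, not an existing API:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical "signable decision unit": an action only passes if it is
# tied to a named person who had a genuine opportunity to refuse it.
@dataclass(frozen=True)
class SignedDecision:
    action: str      # what the agent wants to do
    summary: str     # the compressed context the signer actually saw
    signer: str      # a named individual, never "the company"
    had_veto: bool   # the signer had a real chance to say "no"
    signed_at: str

    def is_accountable(self) -> bool:
        # Generalized responsibility fails the test: reject placeholder signers.
        vague = {"the company", "the user", "a human", ""}
        return self.signer.lower() not in vague and self.had_veto

d = SignedDecision(
    action="merge the payment-worker change",
    summary="Adds retry logic to the payment worker; 120 lines changed.",
    signer="jane.doe",
    had_veto=True,
    signed_at=datetime.now(timezone.utc).isoformat(),
)
assert d.is_accountable()
```

The point of the sketch is the negative case: a record whose signer is "the company" fails `is_accountable()` no matter how capable the model that produced the action was.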


This might sound bureaucratic until you notice that responsibility has characteristics no other mechanism offers. It cannot be optimized away by capability gains; a smarter model does not change who gets sued, fined, or imprisoned. It forces the interface to expose a point of refusal. It scales naturally with risk. And it is a cross-domain constraint with enforcement infrastructure already in place: courts, insurers, professional boards, regulators. Licensing regimes, fiduciary duties, and industry regulations play their roles, but their constraints are narrower, and all of them presume the problem of responsibility attribution has already been solved.


In contrast, the AI-native substitutes cannot pass the same test. "Alignment" is not enforceable; we cannot even agree on what it means. "Explainability" can be satisfied technically but not substantively. "Human in the loop" has been hollowed out to "somewhere, there is a person." The reason "responsibility" carries weight is that the enforcement infrastructure behind it was built centuries before the technology arrived.


Permission Fatigue Pushes the System toward Auto-Approval


The gradient pushing the system toward auto-approval is strong. Every permission confirmation consumes attention, and the Agent is usually right, so for any single decision the expected benefit of clicking Agree without reading is often positive. A rational user therefore learns to click Agree faster, then to batch-approve, then to enable auto-approval for one class of operations, then to expand it to more classes, then in some session to flip the dangerous switch, and finally to forget the switch exists.
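The per-decision arithmetic can be made concrete with assumed numbers; the error rate, review cost, and failure costs below are all illustrative, not measurements:

```python
# Why skipping review is "rational" per decision (all numbers assumed).
p_agent_wrong = 0.07    # agent error rate, roughly the inverse of a ~93% accept rate
cost_of_error = 2.0     # minutes lost, on average, cleaning up a routine bad action
cost_of_review = 0.5    # minutes to actually read one permission prompt

ev_skip_review = p_agent_wrong * cost_of_error   # 0.14 min expected loss
ev_review = cost_of_review                       # 0.50 min certain cost
assert ev_skip_review < ev_review                # per decision, skipping "wins"

# The catch: this ignores rare, high-severity failures. Add one
# catastrophic outcome (say 2000 minutes of damage at p = 0.001)
# and the comparison flips for high-stakes actions.
ev_skip_high_stakes = ev_skip_review + 0.001 * 2000
assert ev_skip_high_stakes > ev_review
```

This is why the drift is a cognitive feature rather than a moral failing: each individual approval really is cheaper to skip, and the cost only appears in the tail.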


In my second week of using Claude Code, I turned that switch on; by the third week, I was no longer aware of it. Every developer I know who has used Cursor or Devin for long has had a similar experience. The same pattern appears in cookie pop-ups, EULAs, TLS warnings, and mobile permission requests: repeated low-stakes consent decisions eventually converge on unconditional consent. This is a cognitive feature, not a moral failing.


An "enhanced future" will not happen automatically. An Agent system that is not deliberately designed will default to auto-approval, because users, choosing convenience one decision at a time, will steer it there. The other future must be designed against this gradient.


The Value of an Agent is not in Execution but in Enabling "Signature"


What truly adds value to an agent is not completing the work itself but compressing the work into a form that can be signed.


A cutting-edge model can easily write a 4000-line code commit, draft a 30-page contract, generate a clinical record, or execute a transaction. However, the bottleneck where these products truly have an impact is not in "generating them" but in whether humans are capable of bearing the consequences once they are implemented. A code commit that no one truly understands will become a burden upon merging; a contract that no one has read, once signed, becomes a time bomb; a clinical record that has not been endorsed by a practicing physician is not even considered a valid record in most regulated healthcare systems.


In the "enhanced" framework, an agent does everything except "signing": reading ten thousand pages of context, writing four thousand lines of code, calculating thirty reasonable solutions, and then compressing this content into a minimal and faithful expression so that someone can make a "yes" or "no" decision based on it and sign their name at the bottom of the document.


An Agent can be understood as a staff secretary: the President signs, and the staff secretary's job is to complete every preparation that precedes the signature.


This is actually a more difficult engineering problem than "letting the system autonomously do the work." The capability to generate content is advancing rapidly, but the ability to "faithfully compress decisions" lags far behind. In the future, those who will succeed in the "enhanced market" are teams that can provide the shortest and most faithful decision summaries for high-stakes scenarios.


The real unsolved issue in this phrase is the word "faithful." A summary understandable by a human is only valuable when its compression process does not distort information. The truly challenging technical problem in the "enhanced future" is whether this can be verified in a programmatic way, and most people haven't even truly started to face it yet.


Some fundamental methods are emerging:


· Paraphrase tests that confirm the human actually understood the summary

· Forcing summaries to surface minority opinions and dissenting views

· Counterfactual tests ("What would this Agent do if you refused?")

· Reproducibility checks (can another Agent generate the same summary from the same context?)


These are all far from being resolved. And the teams that first solve these problems will establish a moat that won't easily be eroded by model capacity enhancements.
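As one example of the last item, a reproducibility check could be approximated as follows. The summarization step is elided, and token-level Jaccard overlap stands in for a real semantic-similarity measure, so this is only a sketch of the shape of the test:

```python
# Sketch: two independent agent runs summarize the same context;
# if the summaries diverge too much, the summary is not trusted for signing.
def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity in [0, 1]; a crude stand-in for semantics."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0

def reproducibility_check(summary_a: str, summary_b: str,
                          threshold: float = 0.6) -> bool:
    """True if two independently generated summaries agree enough to sign."""
    return jaccard(summary_a, summary_b) >= threshold

s1 = "refactor moves auth checks into middleware no behavior change"
s2 = "refactor moves auth checks into middleware behavior unchanged"
assert reproducibility_check(s1, s2)
assert not reproducibility_check(s1, "deletes the auth middleware entirely")
```

A production version would compare embeddings or use a judge model rather than token overlap, but the control flow (generate twice, compare, block the signature on divergence) is the part that matters.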


Establishing Responsibility Grading for AI Behavior


If "responsibility" is playing a structural role, then every action performed by an Agent should come with a "responsibility grade," and this grade should determine the minimum signature mechanism required for that action.


Such a standard has not yet been widely established, but it likely should be.



The "approval posture" matching the consequences is the only realistic path to managing permission fatigue. In high-risk tiers, more restrictive positive engagement mechanisms need to be introduced (such as paraphrasing tests, cooling-off periods, secondary reviewers) because in these scenarios, the true failure mode is not an incorrect suggestion by the Agent but humans approving without thoughtful consideration.
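Since no such standard exists yet, the grading below is purely hypothetical; the tier names and mechanisms only illustrate how a responsibility grade could determine the minimum signature mechanism for an action:

```python
from enum import Enum

# Hypothetical responsibility grading; no such standard exists today.
class Tier(Enum):
    READ_ONLY = 0     # informational reads: no approval needed
    REVERSIBLE = 1    # e.g. draft an email: a named signer clicks through
    COSTLY = 2        # e.g. spend money: paraphrase test before signing
    IRREVERSIBLE = 3  # e.g. drop prod data: cooling-off + second reviewer

REQUIRED_APPROVAL = {
    Tier.READ_ONLY: [],
    Tier.REVERSIBLE: ["named_signer"],
    Tier.COSTLY: ["named_signer", "paraphrase_test"],
    Tier.IRREVERSIBLE: ["named_signer", "paraphrase_test",
                        "cooling_off", "second_reviewer"],
}

def minimum_signature(tier: Tier) -> list[str]:
    """The minimum approval posture an action at this tier must clear."""
    return REQUIRED_APPROVAL[tier]

assert minimum_signature(Tier.READ_ONLY) == []
assert "second_reviewer" in minimum_signature(Tier.IRREVERSIBLE)
```

The design intent matches the paragraph above: low tiers stay nearly frictionless to conserve attention, so that the heavier mechanisms are credible where approving without thought is the real failure mode.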


Do You Care?


All the above questions ultimately point to a fundamental founder-level question: Do you care if humanity is still part of this future? Many of the current design decisions about the Agent product essentially amount to a kind of "silent vote" on this question, although the voters often refuse to acknowledge that they are making a choice.


If you care, the design constraints are not vague: build a tiered responsibility system; make "rejection" a first-class feature; measure the quality of the summary the Agent gives to humans, not its autonomy in completing tasks without them; and tie every action with real consequences to a specific individual in a tamper-evident log.
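The log in the last constraint can be sketched as a hash chain; a real system would add cryptographic signatures and external anchoring, so this only illustrates why retroactive edits become detectable:

```python
import hashlib
import json

def append(log: list[dict], actor: str, action: str) -> None:
    """Append an entry whose hash covers its content and the previous hash."""
    prev = log[-1]["hash"] if log else "genesis"
    entry = {"actor": actor, "action": action, "prev": prev}
    entry["hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()).hexdigest()
    log.append(entry)

def verify(log: list[dict]) -> bool:
    """Recompute the chain; any edited entry breaks every link after it."""
    prev = "genesis"
    for e in log:
        body = {k: e[k] for k in ("actor", "action", "prev")}
        expected = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        if e["prev"] != prev or e["hash"] != expected:
            return False
        prev = e["hash"]
    return True

log: list[dict] = []
append(log, "jane.doe", "approved deploy of build 512")
append(log, "jane.doe", "rejected schema migration")
assert verify(log)

log[0]["actor"] = "the company"   # attempted retroactive blame-shifting
assert not verify(log)
```

Chaining each entry to its predecessor is what makes the log "low-tamperability": quietly reassigning responsibility after the fact invalidates the whole record rather than just one line.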


These technical tasks are realistic and feasible. The real challenge is willingness, because the "augmented" path demos less impressively, and its seat-based billing looks less radical than the economics of the other path.


The Anthropic Paradox: The Lab That Emphasizes Safety Most Offers the Shortest Path to Bypassing Humans


Anthropic is a telling case of how the field drifts from within. Not because it is especially negligent; on the contrary, because its statements on safety are the clearest, the gap between the framework and the product surface is the easiest to see. Its Responsible Scaling Policy and Constitutional AI work mainly constrain model behavior during training; the Agent built on those models, however, gets its default autonomy from a separate set of policies, and that convenient danger switch can be enabled from the default state with a single flag.


This pattern exists in most mainstream coding Agents; Anthropic's case is simply the easiest to observe. This is the "Anthropic Paradox": the lab that writes the clearest safety framework in the industry also ships the shortest path from "enhancement" to "replacement," and we can see the latter precisely because the former is so clear.


To be fair, in March this year they introduced an "auto mode" as a middle path between manual approval and the danger switch; in this mode, each action is reviewed by a Sonnet 4.6 classifier before execution. Their official statement acknowledged the problem directly, calling it "approval fatigue," and offered a statistic: users accept 93% of prompts in manual mode. That is permission fatigue quantified, and it is consistent with the analysis in this article.


On the solution, however, I take a different view. Auto mode replaces human approval with model approval, which means the gradient toward substitution has not ended; it has moved up one level. The classifier can block harmful behavior, but for the behaviors it approves, no specific person is responsible. Anthropic itself acknowledges that auto mode cannot eliminate risk and recommends running it in an isolated environment; in other words, the question of responsibility attribution remains unresolved.


An obvious objection: if ultimate responsibility falls on individuals, isn't that just manual mode, the very thing fatigue wore down? The reason "responsibility falls on the builder" escapes this gradient is that it changes who bears the cost of over-approval. In the current structure, users pay for every careful read while builders do not, so defaults trend toward reducing user friction and externalizing risk. Once the cost of unchecked behavior shifts to the builder, the calculus reverses: builders gain a direct economic incentive to design responsibility tiers, replay tests, and approval mechanisms that make low-risk decisions cheap to sign and high-risk decisions expensive. The gradient does not disappear, but its direction changes. To date, no major lab has put this into practice, including the one closest to recognizing the problem.


If You Build an Agent, You Should Bear Responsibility


If an Agent's explicit purpose is to replace human execution of actions originally performed by humans, then the company building and operating this Agent should bear the same responsibility as humans. This principle is not radical; it has long applied to all industries where actions are taken in the real world: Toyota is responsible for brakes, Boeing for flight control systems, Pfizer for drugs, bridge engineers for bridges, doctors for prescriptions. This responsibility model exists in almost all legal systems.


However, AI currently enjoys a kind of implicit immunity. Model providers claim to be mere tool suppliers; application companies claim to be only a thin wrapper around the model; users have already waived recourse through arbitration clauses. When an Agent system fails in cascade (the Air Canada chatbot case, the Replit incident that deleted a production database, or, by analogy, the 2012 Knight Capital glitch that lost $440 million in 45 minutes), the loss often lands on the party least able to bear it: the user. This allocation of responsibility will not survive the first major incident that truly involves money and paperwork.


The solution is actually quite simple in its expression: whoever builds the Agent and benefits from its autonomy should be held accountable when it goes rogue. Once responsibility truly falls on the builder, permission prompts are no longer seen as "friction," but as "insurance." That dangerous switch will be renamed, and default settings will change accordingly.


Willingness to take responsibility for one's system is key to distinguishing a genuine industry from an "extractive industry."


Regulation as a "Guiding Mechanism"


The market itself will not naturally move towards an "augmented future." The true guiding forces are often regulatory bodies and insurance underwriters, and overall, this may not be a bad thing.


Europe is likely to be the earliest regulatory gateway. The EU has clear rule-making precedents (GDPR, the AI Act, the DMA), and its rules often become de facto global defaults because maintaining a separate product line for non-EU markets usually costs more than simply complying with European standards. A baseline requiring that all actions with real-world consequences ultimately be verified by a named human with veto power is closer to automotive crash-test standards than to an obstacle to technological progress.


A more direct driving force comes from the insurance industry. Underwriters for Errors & Omissions (E&O), Directors & Officers (D&O) liability insurance, and those pricing cyber insurance must answer one question: how is liability determined when an Agent acts under user authorization and causes harm? The easiest path to a compensable structure is to have a named human in the loop. Therefore, systems without such a structure will naturally have higher risk premiums reflected in their insurance costs. For builders who wish to define rules themselves rather than have rules set by regulators or insurance companies, the time window is actually not very wide.


Market Logic Overshadowed by the Mainstream Narrative


The current mainstream narrative holds that vertical Agents will absorb around half of the jobs in the industries they touch, with value consolidating in a few vertically integrated Agent companies: an "Anthropic of law," an "Anthropic of health care," an "Anthropic of accounting." Nearly all of the billions in AI funding over the past eighteen months rest to some extent on this assumption. It is substitution logic in commercial dress, but its judgment about market structure is wrong, and the error will directly affect capital allocation.


An "augmented" framework implies a different market shape. If every action with real-world consequences must ultimately be attributed to a named individual, then what is being sold is not an "autonomous Agent" but "amplified human capability." The buyers are the doctor who handles three times the caseload at higher accuracy, the lawyer who covers ten times the deal flow, the engineer who delivers at five times the speed, and behind them the accountants, underwriters, analysts, architects, surgeons, teachers, loan officers, journalists, and pharmacists.


This market is larger because it relies on scale-out rather than centralization. The right valuation anchor is not enterprise software budgets but the total wages being amplified. Global enterprise IT spending is approximately $4 trillion per year (Gartner data); total compensation for the world's skilled, certified, and knowledge workers is roughly an order of magnitude higher, around $40 trillion (based on International Labour Organization data, excluding low-skilled labor from the estimate). AI companies will not capture the entire wage pool, but they can capture a share of the productivity dividend, and even a single-digit percentage share would support a market comparable to today's entire enterprise software market. That is a floor, not a ceiling. How large the market ultimately becomes depends on one key design decision: on whom responsibility finally falls.
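The anchor numbers can be checked with back-of-envelope arithmetic. The $4 trillion and $40 trillion figures are as cited in the text; the capture-share scenarios are illustrative assumptions:

```python
# Back-of-envelope check of the market anchors quoted above.
enterprise_it_spend = 4e12     # ~$4T/yr global enterprise IT spending (as cited)
knowledge_work_wages = 40e12   # ~$40T/yr skilled knowledge-work wages (as cited)

ratio = knowledge_work_wages / enterprise_it_spend
assert ratio == 10.0           # "roughly an order of magnitude higher"

# Single-digit capture of the wage pool (shares are illustrative):
low = 0.01 * knowledge_work_wages   # $0.4T
high = 0.09 * knowledge_work_wages  # $3.6T
print(f"1% of wages = ${low / 1e12:.1f}T; 9% of wages = ${high / 1e12:.1f}T")

# Even the low end rivals large enterprise-software categories, and the
# high end approaches the size of all enterprise IT spending today.
assert high > 0.5 * enterprise_it_spend
```

The sensitivity is the article's point: the same model capabilities price against a $4 trillion pool under replacement logic and a $40 trillion pool under augmentation logic.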


The ultimate winners will look more like tools than substitutes, priced against the "amplified human" rather than the "replaced position"; they will embed in existing professional workflows rather than disrupt them; and they will number in the thousands, not the handful. The end state of this market is closer to SaaS than to cloud infrastructure. We are still at the very start of the deployment curve: on an axis that will extend for another decade, today's penetration is the leftmost few pixels, and the shape of those pixels is being set by design choices in a small fraction of products today.


Choice: Empower or Displace?


Keeping humans responsible forces the system architecture to revolve around enhancing humans; once humans are removed from the chain of responsibility, the system defaults to replacement, even though every person involved, if asked explicitly, might not choose that outcome.


The real question is not whether some behaviors should be fully automated; this framework already allows that, and purely informational read operations can indeed be automated. The key is how that boundary moves as risk rises, and who decides. In today's most advanced Agent systems, the path from "enhancement" to "effective replacement" is extremely short, often just a parameter or a default setting. The important work is to ensure that this switch is always treated as a dangerous option and never drifts into the default out of convenience.


If the builders actively do this work, we will smoothly move into an "enhanced future"; if they don't, regulators and insurers will do it for them, and the outcome will be the same.


Whether or not you care, this is a design choice, and the choice determines what you are building. Today, every founder launching an Agent product must publicly answer a question they seem unwilling to face: are you building enhancement, or are you building replacement?

