Susana Khan

Settlement Without Proof

On provenance, accountability, and the infrastructure gap beneath autonomous execution…

Recent analysis from writers including Melody Koh has mapped the competitive geometry of the emerging agent web clearly: who controls supply, whether orchestration can be re-aggregated, and which incumbents risk tolling themselves out of relevance. Those are the right questions for the current moment. But there is a layer missing from the stack, one the industry is treating as a compliance problem rather than an infrastructure problem, and it is the layer that determines whether everything built on top of it holds. I examined the first signs of this gap in an earlier piece, You Cannot Audit a Probability: The Agentic AI Trust Wall. What has become clearer since is how structural the problem actually is.


The Agent Web Has a Missing Layer

The frameworks being written about this transition are largely correct. The disruption is real, the competitive logic is sound, and the companies being named, as either disintermediators or displaced, are generally the right ones.

What they miss is a category shift happening underneath all of it.

You Cannot Audit a Probability examined why probabilistic systems become difficult to govern once they produce consequential outputs. The agent web extends that problem from decision support into autonomous execution itself. The earlier piece was about model epistemology. This one is about infrastructure consequences.


When Routing Becomes Execution

For two decades, digital power was organized around discovery and aggregation. The interface between intent and selection was where control lived: search, feeds, platforms, marketplaces. The critical question was always who routes demand.

Agents change that. Not incrementally. Categorically.

When an agent recommends a hotel, the old logic applies. Someone still selects. There is still a human at the moment of consequence, and the system is accountable through that human. When an agent books the hotel, the category changes. The bottleneck is no longer routing; it is whether that action can be trusted, verified, and accounted for afterward, across systems that did not observe the decision process themselves.

Agents have become a new kind of actor, one that initiates, executes, and settles across multiple systems, often without a human in the loop at the moment things go wrong.

Once intent no longer requires selection, the system reorganizes, away from distribution and toward the integrity of autonomous execution. And integrity, at that level, requires infrastructure that does not yet exist at scale.


The Tollgate Incumbents Are Solving the Wrong Problem

SAP, ServiceNow, Workday: the major enterprise incumbents are choosing between two postures right now.

ServiceNow introduced Action Fabric at Knowledge 2026, opening its platform to external AI agents via a generally available MCP server. JPMorgan analyst Mark Murphy described the action-based consumption pricing as effectively a tax on customers using outside AI agents to interact with data they already store in ServiceNow’s apps. SAP published API Policy v4/2026 in April, prohibiting third-party AI agents from autonomously sequencing calls outside SAP-approved architectures. The policy drew immediate pushback from partners who called it lock-in, even as CEO Christian Klein sought to soften the message on the Q1 investor call. Workday has similarly foregrounded the financial upside of monetizing agent access.

This strategy is historically difficult to sustain once orchestration becomes portable. Tollgate incumbents signal their own vulnerability the moment they impose friction on autonomous orchestration. Enterprises charged enough for agent access eventually begin routing around the gate. That is the pattern of digital aggregation markets, and nothing about the current moment suggests it breaks here.

But the deeper issue runs underneath the question of discovery control. The incumbents’ strategy solves orchestration control before solving evidentiary trust, and those are different problems on different timelines. Once agents execute autonomously across systems, somebody must be able to prove what occurred.

Reconstruction after the fact is insufficient. The relevant standard is evidentiary integrity under adversarial scrutiny: the proof a court, regulator, auditor, insurer, or counterparty would demand once autonomous systems generate real financial and operational consequences. The tollgate incumbents are unprepared for that question, and so, as yet, are most of the disintermediators moving to replace them.


Settlement Is Not Verification

The Universal Commerce Protocol, co-developed by Google and Shopify and announced in January 2026 at the National Retail Federation conference, is probably the single clearest expression of where the market is heading. Backed by Walmart, Target, Etsy, and Wayfair as co-developers, and endorsed by more than twenty partners including Visa, Mastercard, and Stripe, UCP attempts to standardize how agents discover, negotiate, and execute transactions across merchant systems: agent-mediated discovery, agent-mediated checkout, machine-readable commerce negotiation, interoperable execution flows. It is the beginning of protocolized agent commerce infrastructure, and it formalizes a premise the market has now accepted: that agents are becoming transactional actors, not just recommendation systems.

Shopify was always merchant infrastructure rather than a consumer destination. Extending headless commerce into agentic commerce is evolutionary rather than disruptive to its identity. Google’s position is equally legible: if agents operate through Gemini and merchants speak UCP, then Google Pay becomes the credential and settlement layer underneath agent-mediated commerce.

Settlement and verification are different problems, and UCP solves only one of them.

Protocols like UCP, x402, and AWS AgentCore Payments address transactional execution between systems. They solve real and important parts of the payment coordination problem. What they leave untouched is the verification problem, which becomes increasingly consequential as agents interact recursively across organizational and protocol boundaries, each one inheriting the state produced by the last. As Christian Catalini and others have argued, cryptographic coordination systems become economically relevant when they reduce the coordination costs of establishing trust across parties that cannot rely on a single shared intermediary (see Built Before It Was Named). UCP solves coordination; trust remains unaddressed.

OpenAI’s Instant Checkout illustrated the difficulty directly. Operational issues reportedly included near-zero conversion rates, weak fraud infrastructure, sales tax complications, and merchant onboarding friction. Reading these as product immaturity misses the structural point: when the agent’s account of system state becomes the only account of system state, errors propagate in ways that surrounding infrastructure cannot reliably adjudicate afterward.

The operative question is who bears liability when autonomous execution is based on stale state, incorrect inventory, misinterpreted instructions, or records generated by the same probabilistic engine that performed the action. Clifford Chance’s analysis of agentic AI contracts found that under most current technology agreements, if an agent incorrectly authorizes a payment or misprices a product, suppliers’ standard disclaimers typically leave the customer holding liability for a system they did not design and cannot fully audit. The contractual framework wipes out meaningful recovery precisely when the stakes are highest.


This Does Not Stay Contained

The trust problem remains underestimated because agent infrastructure is still mentally divided into two categories: lightweight consumer systems and serious institutional systems. The assumption is that provenance, deterministic guarantees, and verifiable execution only matter once agents enter regulated domains: finance, healthcare, enterprise procurement.

That distinction will not hold.

The systems that contributed to the 2008 financial crisis were initially treated as distributed, localized, statistically manageable components of a larger market. The systemic risk emerged once interconnectedness, opacity, leverage, automation, and cascading dependencies reached sufficient scale. Nobody designed for systemic fragility. It emerged from composition.

Agent systems are beginning to evolve along structurally similar lines.

One hallucinated purchase or failed booking is containable. The deeper risk is recursive interaction between autonomous systems: agents consuming outputs from other agents, probabilistic state propagating across execution environments, unverifiable actions becoming composable infrastructure beneath larger systems. Once that occurs, low-trust domains stop remaining isolated. What began as a consumer-facing reliability problem becomes load-bearing infrastructure for decisions with real financial and legal consequences.

Aggregators centralized demand. Agent systems decentralize execution. But decentralized execution without verifiable state eventually recreates systemic fragility at machine speed.

McKinsey’s 2026 AI Trust Maturity Survey, drawing on approximately 500 organizations across industries and regions, found that only around 30 percent reached a maturity level of three or higher on agentic AI governance and controls. Separately, Microsoft’s Cyber Pulse report found that 80 percent of Fortune 500 companies now have active AI agents embedded in production workflows. The gap between those two figures is an infrastructure problem, one that policies and frameworks cannot close on their own, and that better training programs will not fix.

The emerging field of governed execution research reflects the same recognition from a different direction. Academic work on provenance systems for autonomous agents, proof-derived authorization architectures, and accountable multi-agent governance is converging independently on the same missing property: verifiable autonomous execution. When academic institutions, regulatory bodies, and enterprise deployments begin arriving at the same structural deficiency from different starting points, that convergence is signal, not coincidence.


What Serious Deployments Already Know

Gondola is a travel AI agent that books end-to-end itineraries autonomously: flights, hotels, transfers, dining reservations, coordinating across multiple provider systems in a single execution flow. It is one of the more complete examples of agentic commerce operating at production scale.

What made Gondola work was something less visible than the orchestration layer: the record of what the agent did had to be legible and authoritative to every downstream participant in the transaction chain. That chain includes hotel reservation systems, airline databases, loyalty programs, payment processors, and the customer, none of whom were present when the agent made its decisions.

Marriott awards loyalty points because Gondola’s bookings are recognized by downstream hotel systems as valid direct reservations eligible for loyalty accrual. That recognition depends on transaction records that can move reliably across multiple systems (hotel reservation infrastructure, payment systems, loyalty databases, and customer accounts) while preserving enough integrity for parties who never observed the original transaction to treat the resulting state as authoritative.

This is the pattern that separates serious agent deployments from impressive demos. Discovery is difficult but tractable. Payment execution is attracting multiple well-capitalized efforts. The foundational problem that keeps getting deferred is producing a record of autonomous execution that remains trustworthy to systems and institutions that were never present for it. As McKinsey Partner Rich Isenberg put it: “Agency isn’t a feature, it’s a transfer of decision rights. The question shifts from ‘Is the model accurate?’ to ‘Who is accountable when the system acts?’”

Liability, in other words, is becoming a technical problem: specifically, whether system state can be independently verified at the moment a claim is made against it. Policy cannot substitute for infrastructure. Without portable provenance, downstream systems would have no reliable way to distinguish between a valid autonomous booking flow and an unverifiable synthetic transaction narrative generated after execution.

A log generated after the fact by the same system that acted answers nothing.


The Legal Timeline Is Not Waiting

The IMF recently named the underlying structural tension: payment systems require deterministic execution and settlement finality, while agentic AI introduces probabilistic reasoning and non-deterministic execution paths. Separating AI orchestration from deterministic settlement is a coherent first step, but it is insufficient on its own. Even with deterministic settlement, the surrounding record may remain probabilistic: generated after execution by the same system that executed, coherent in narrative but independently unverifiable in fact.
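The separation of orchestration from settlement can be made concrete with a minimal sketch. Everything in it is an assumption for illustration: the `Proposal` shape, the in-memory `Ledger`, and the spending limit are hypothetical and are not drawn from the IMF analysis or from any protocol named in this piece. The agent only proposes; a deterministic function decides, and replaying the same proposal can never pay twice.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Proposal:
    """What the probabilistic agent asks for (hypothetical shape)."""
    idempotency_key: str
    payer: str
    payee: str
    amount_cents: int


@dataclass
class Ledger:
    """In-memory stand-in for a deterministic settlement system."""
    balances: dict[str, int]
    settled: dict[str, str] = field(default_factory=dict)  # key -> outcome

    def settle(self, p: Proposal, limit_cents: int) -> str:
        # A key that was already processed returns its stored outcome,
        # so a retried or duplicated proposal never moves money twice.
        if p.idempotency_key in self.settled:
            return self.settled[p.idempotency_key]
        if p.amount_cents <= 0 or p.amount_cents > limit_cents:
            outcome = "rejected:limit"
        elif self.balances.get(p.payer, 0) < p.amount_cents:
            outcome = "rejected:funds"
        else:
            self.balances[p.payer] -= p.amount_cents
            self.balances[p.payee] = self.balances.get(p.payee, 0) + p.amount_cents
            outcome = "settled"
        self.settled[p.idempotency_key] = outcome
        return outcome


ledger = Ledger({"traveler": 50_000})
hotel = Proposal("trip-1-hotel", "traveler", "hotel", 41_800)
first = ledger.settle(hotel, limit_cents=45_000)
replay = ledger.settle(hotel, limit_cents=45_000)  # same key, no second debit
upgrade = ledger.settle(
    Proposal("trip-1-upgrade", "traveler", "hotel", 60_000), limit_cents=45_000
)
print(first, replay, upgrade, ledger.balances["traveler"])
```

The ledger is final about what moved, but it says nothing about why the agent proposed the payment. That surrounding record is the part this sketch leaves open.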

That distinction matters acutely once systems become adversarial.

Litigation involving UnitedHealth demonstrates the issue in real time. The legal question cuts deeper than whether an AI-assisted decision occurred; organizations must prove, under scrutiny, that the records surrounding those decisions are accurate, complete, and independently verifiable, and produce that proof without relying on the AI system’s own account of itself.

The regulatory landscape is tightening on multiple fronts, though unevenly. California’s AB 316, which took effect January 1, 2026, forecloses the autonomous-harm defense: developers, modifiers, and users of AI systems can no longer argue that the AI acted independently as a shield against civil liability. Colorado’s original AI Act, which would have required annual impact assessments for high-risk systems by June 2026, has been stayed by a federal court and substantially replaced by the legislature; its successor, if signed, takes effect January 1, 2027, under a narrower framework. The EU Product Liability Directive classifies AI software as a product subject to strict liability, with member-state implementation deadlines in December 2026. The direction of travel across jurisdictions is consistent even where individual timelines shift.

There is also a geopolitical dimension beginning to emerge. Economies that cannot independently verify autonomous machine execution may find themselves structurally dependent on foreign orchestration layers, a sovereignty problem dressed as a technical one. Cross-jurisdiction verification, sovereign execution infrastructure, and regulatory interoperability are not yet central to the agent web conversation. Once autonomous systems begin mediating financial, commercial, and governmental coordination at scale, they will be impossible to avoid.

These are exposures accumulating now, on a timeline that is not synchronized with most organizations’ infrastructure roadmaps.


The Architecture the Market Is Converging On

The verification problem has a structural solution, and its shape is becoming clearer as the failures accumulate.

The key insight is this: a record generated separately from an action, and stored separately from an action, can always be questioned. It can be lost, altered, reconstructed, or simply wrong. A record bound cryptographically to the action at the moment of execution, one that travels with the transaction the way a bearer instrument travels with value, cannot be reconstructed after the fact, because it was never separate to begin with.

This is the architectural direction that serious provenance infrastructure takes. Rather than logging what an agent did and storing that log somewhere retrievable, the proof of what occurred becomes a structural property of the transaction itself. Any party can verify it independently. No intermediary needs to be trusted. No reconstruction is required, because nothing was ever separated.

The practical consequence is that compliance stops being a reporting layer bolted onto an execution system. It becomes a property of the execution itself, present at every transaction, verifiable by any counterparty, portable across the systems that need to trust it. Settlement and verifiable record become the same operation rather than two operations that must be reconciled.
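A minimal sketch of that binding follows, under stated assumptions. It uses only content hashing: each step's identifier is the SHA-256 digest of the action plus the previous step's identifier. The function names and record fields are hypothetical and do not describe any vendor's implementation. The sketch also assumes a counterparty already holds the final identifier from settlement time, and it omits the digital signatures a real system would need to attribute who acted.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder parent id for the first step in a chain


def canonical(obj) -> bytes:
    """Serialize deterministically so every verifier hashes identical bytes."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def execute_step(action: dict, prev_id: str = GENESIS) -> dict:
    """Return a step whose id is the digest of the action and its parent id."""
    body = {"action": action, "prev": prev_id}
    return {"id": hashlib.sha256(canonical(body)).hexdigest(), **body}


def verify_chain(steps: list[dict]) -> bool:
    """Recompute every id from its contents; needs no access to the agent."""
    prev = GENESIS
    for step in steps:
        body = {"action": step["action"], "prev": step["prev"]}
        expected = hashlib.sha256(canonical(body)).hexdigest()
        if step["prev"] != prev or step["id"] != expected:
            return False
        prev = step["id"]
    return True


booking = execute_step({"type": "book_hotel", "nights": 2, "price_cents": 41800})
payment = execute_step({"type": "pay", "amount_cents": 41800}, prev_id=booking["id"])
chain = [booking, payment]
print(verify_chain(chain))  # True

# Editing the recorded price afterward leaves a stale id, so the check fails.
tampered = [dict(booking, action={**booking["action"], "price_cents": 1}), payment]
print(verify_chain(tampered))  # False
```

Because the payment step commits to the booking's identifier, a downstream party holding only the last identifier can detect any change to an earlier step.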

The emergence of companies focused specifically on agent governance, policy enforcement, and execution accountability signals that the market is beginning to recognize this gap. Orchestration alone is insufficient once agents begin operating across institutional boundaries. That recognition is still early; the infrastructure buildout that follows it will move fast.

This is where the market for agentic infrastructure is converging, not because any single company has made it so, but because the failures accumulating across agent deployments all point to the same missing property. A small but growing set of infrastructure efforts is beginning to converge on this problem from different directions: verifiable state transfer, portable provenance, governed execution, and cryptographically attestable coordination across systems. TODAQ, through Qatom, is building this layer around execution-bound cryptographic proof designed for the phase of the agent web that follows autonomous execution at scale.


The Missing Layer

The discovery layer is being built. The payment execution layer is being built.

Beneath both of them is a layer the agent web has not yet fully recognized as infrastructure: the systems that make autonomous execution independently verifiable, portable across institutions, and resilient under adversarial scrutiny.

The problem is not observability. It is evidentiary integrity.

A log generated after the fact by the same system that acted cannot serve as the foundation for financial coordination, institutional accountability, or autonomous commerce at scale. As agents begin operating across organizational and jurisdictional boundaries, every consequential action eventually collapses into the same question:

Not simply what did the agent do, but can the system prove it independently of the agent’s own account of itself?

That requirement changes the architecture of the stack beneath autonomous execution. Verification stops being a reporting layer attached afterward and becomes a property of the transaction itself: portable, attestable, and durable across the systems that inherit its state.

This is the layer the market is beginning to converge toward, not because regulation demands it or because companies prefer it, but because autonomous systems operating without verifiable state eventually become ungovernable.

The organizations that recognize trust as infrastructure will help define the next phase of the agent web. The ones that continue treating it as compliance will discover the distinction too late.


References and Notes

  1. California Assembly Bill 316 (2026, January 1). Amends California Civil Code § 1714.46, prohibiting autonomous-harm defenses in civil actions involving AI systems. Baker Botts. (2026, January). California eliminates the ‘autonomous AI’ defense. https://ourtake.bakerbotts.com/post/102m29i/california-eliminates-the-autonomous-ai-defense-what-ab-316-means-for-ai-deplo

  2. Catalini, C. (2026, February 24). Some simple economics of AGI (MIT Sloan Research Paper). https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6298838
    Referenced in the context of cryptographic coordination systems and the economics of trust across parties without shared intermediaries.

  3. Clifford Chance. (2026, February 10). Agentic AI: The liability gap your contracts may not cover. https://www.cliffordchance.com/insights/resources/blogs/talking-tech/en/articles/2026/02/agentic-ai-and-the-liability-gap-your-contracts-may-not-cover.html
    Analysis of liability allocation in agentic AI contracts and how standard technology agreement disclaimers interact with autonomous execution failures.

  4. Colorado AI legislation update. (2026). SB 24-205 (original Colorado AI Act) stayed by federal court (April 27, 2026); replaced by SB 189 (passed May 2026, effective January 1, 2027 if signed).

  5. Dignan, L. (2026, May 5). ServiceNow Knowledge 2026: AI Control Tower, Action Fabric, Autonomous Workforce and more. Constellation Research. https://www.constellationr.com/insights/news/servicenow-knowledge-2026-ai-control-tower-action-fabric-autonomous-workforce-and

  6. European Union. (2024). Product Liability Directive (classifies AI software as a product subject to strict liability). Member-state implementation deadline: December 2026.

  7. Gondola AI. (n.d.). Production travel agent coordinating multi-system autonomous bookings (case study in provenance infrastructure requirements). https://www.gondola.ai/

  8. International Monetary Fund. (2025–2026). Analysis of structural tensions between deterministic payment settlement and probabilistic AI execution paths.

  9. Koh, M. (2026, May 13). How the agent web gets built: Why incumbents will toll themselves out of relevance. Ground Truth.
    Foundational analysis of agent-driven disintermediation, orchestration portability, and tollgate incumbent dynamics. This essay builds on Koh’s competitive framework while examining the unresolved verification and provenance layer beneath autonomous execution.

  10. McKinsey & Company. (2026, March). State of AI trust in 2026: Shifting to the agentic era. https://www.mckinsey.com/capabilities/tech-and-ai/our-insights/tech-forward/state-of-ai-trust-in-2026-shifting-to-the-agentic-era (Survey of ~500 organizations; ~30% reached maturity level 3+ on agentic AI governance and controls.)

  11. Microsoft. (2026, February). 80% of Fortune 500 use active AI Agents. Microsoft Security Blog. https://www.microsoft.com/en-us/security/blog/2026/02/10/80-of-fortune-500-use-active-ai-agents-observability-governance-and-security-shape-the-new-frontier/

  12. OpenAI Instant Checkout. (2025–2026). Reported operational challenges including near-zero conversion rates, fraud infrastructure gaps, sales tax complications, and merchant onboarding friction.

  13. Rich Isenberg (McKinsey Partner). (2026). Comments on accountability implications of agentic AI.

  14. SAP API Policy v4/2026. (2026, April). AI clause reported in: “AI clause in new SAP API policy provokes lock-in concern,” The Register. Waehner, K. (2026, May 2). Data ownership in the age of agentic AI: Why SAP’s API policy forces a data integration reckoning for every enterprise. https://www.kai-waehner.de/blog/2026/05/02/data-ownership-in-the-age-of-agentic-ai-why-saps-api-policy-forces-a-data-integration-reckoning-for-every-enterprise/

  15. ServiceNow. (2026, May 5). ServiceNow opens its full system of action to every AI Agent in the enterprise. ServiceNow Newsroom. https://newsroom.servicenow.com/press-releases/details/2026/ServiceNow-opens-its-full-system-of-action-to-every-AI-Agent-in-the-enterprise/default.aspx
    Mark Murphy quote/analysis on action-based pricing: “ServiceNow, SAP and Workday Make AI Agents Pay to Play,” PYMNTS.com (May 2026). https://www.pymnts.com/artificial-intelligence-2/2026/servicenow-sap-and-workday-make-ai-agents-pay-to-play/

  16. UnitedHealth Group litigation. (2024–2026). Ongoing litigation and regulatory scrutiny regarding use of AI in coverage decisions.

  17. Universal Commerce Protocol (UCP). (2026, January 11). Announced at National Retail Federation conference. Co-developers: Google, Shopify, Walmart, Target, Etsy, Wayfair. Google Developers Blog (2026, January). Under the Hood: Universal Commerce Protocol (UCP). Shopify Engineering (2026). Building the Universal Commerce Protocol.
