The Deck Says Agent. The Code Says Wrapper.

In Part 1, “The AI Wrapper Epidemic - What we learned,” I walked through the five patterns that keep resurfacing in pitch decks, using the same verification spine I’ve always used: what the deck claims versus what actually lives in git, infra, and data. The vocabulary has moved from “proprietary AI platform” to agent, but the shape of the gap is familiar. Part 1 also showed how that gap meets governance once software is live: more systems in the field, uneven visibility, and liability that doesn’t care what the slide promised.

What most “death of wrappers” commentary still misses is that the underlying pattern isn’t disappearing. It’s migrating.

Same pattern, new slide

The 2023–2025 wave was chatbots and content tools: thin UIs on OpenAI or Anthropic-class models, sold as platforms. That narrative is exhausted, and the analogy to the dot-com era holds: a polished front on someone else’s rails is not the same as owning the stack.

The next wave speaks agentic AI. Gartner projects heavy embedding of AI agents in enterprise applications by mid-2026, up from a tiny share in early 2025 — a genuine pull from buyers. The same body of work warns that many agentic programs will be cancelled by 2027 as costs climb and business value stays fuzzy — a reminder that demand and durability aren’t the same thing.

Into that opening come hundreds of startups pitching “autonomous agent platforms,” “multi-agent orchestration,” and “agentic workflows.” Architecturally, a large subset is what we already know how to spot: a thin orchestration layer on someone else’s intelligence, renamed for the cycle.

The tells still rhyme with Part 1 — the scheduler dressed as autonomy, architecture slides with no failure story, demos that contradict production — only now error chains are longer and reasoning is harder to inspect. RAND Corporation research puts AI initiatives failing at roughly twice the rate of traditional IT; agents make compounding mistakes easier, not harder, when the scaffolding underneath is thin.

What the decks say vs. what the code shows

The agentic wrapper is already in the decks I see week to week. The language is newer; the dodge is old.

“Autonomous agent platform” that’s really chained API calls with a scheduler. In several recent diligences the entire “agentic” layer was an ordered list of LLM calls with a simple router — no durable state, no recovery when a tool times out, no session memory you could point to in a design doc. That can still be a useful product; it is not what the word agent usually implies when you underwrite it.

When step three fails, the deck should say what the system does (retry, escalate, partial commit, human handoff), not what the founder hopes will happen. If that answer isn’t in the repo or its design docs, you’re looking at a script with an LLM, which may be fine, but then own the label.
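As a minimal sketch of what an explicit failure path can look like, the Python below retries a step on timeout and then hands off to a human instead of failing silently. Every name here (`run_step`, `Outcome`, `flaky_tool_factory`) is hypothetical and chosen for illustration; a real system would likely add backoff, partial-commit handling, and persistence.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Outcome(Enum):
    OK = "ok"
    ESCALATED = "escalated_to_human"


@dataclass
class StepResult:
    outcome: Outcome
    attempts: int
    value: object = None
    error: str = ""


def run_step(step: Callable[[], object], max_attempts: int = 3) -> StepResult:
    """Run one tool or LLM step; retry on timeout, then hand off to a human."""
    last_error = ""
    for attempt in range(1, max_attempts + 1):
        try:
            return StepResult(Outcome.OK, attempt, value=step())
        except TimeoutError as exc:  # assumption: only timeouts are retryable
            last_error = str(exc)
    # Retries exhausted: the failure path is written down, not hoped for.
    return StepResult(Outcome.ESCALATED, max_attempts, error=last_error)


def flaky_tool_factory(failures: int) -> Callable[[], str]:
    """Hypothetical tool that times out `failures` times, then succeeds."""
    state = {"calls": 0}

    def tool() -> str:
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TimeoutError("tool timed out")
        return "done"

    return tool
```

The point for diligence is not this particular policy but that some policy exists in code and can be read, tested, and challenged.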

The multi-agent architecture slide with no governance layer. Six neat boxes (Research, Planning, Execution, and so on) and arrows everywhere, but no audit trail, no rule for Agent A contradicting Agent B, no guardrail on token spend as calls multiply. It’s the same diagram problem Part 1 described, just with more actors. Gartner calls the rebranding “agent washing” when chatbots or RPA get a new badge; survey work still shows most enterprises in experiment mode and only a sliver near production-ready, a wide gap where wrappers thrive.

Production multi-agent means tracing who decided what, at what cost, and how a human steps in when two agents disagree — not only boxes on a slide. If governance is an afterthought, failures get expensive before they get visible.
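To make “who decided what, at what cost” concrete, here is a small Python sketch of an audit log that records each agent decision, enforces a per-run token budget, and flags topics where agents disagree so a human can step in. The class and field names (`AuditLog`, `topic`, `token_budget`) are assumptions for illustration, not a reference to any particular framework.

```python
from dataclasses import dataclass, field


class BudgetExceeded(Exception):
    """Raised when a decision would push the run past its token budget."""


@dataclass
class AuditLog:
    token_budget: int
    entries: list = field(default_factory=list)
    tokens_used: int = 0

    def record(self, agent: str, topic: str, decision: str, tokens: int) -> None:
        """Append one decision, refusing it if the budget would be exceeded."""
        if self.tokens_used + tokens > self.token_budget:
            raise BudgetExceeded(
                f"{agent} would exceed the budget of {self.token_budget} tokens"
            )
        self.tokens_used += tokens
        self.entries.append(
            {"agent": agent, "topic": topic, "decision": decision, "tokens": tokens}
        )

    def conflicts(self) -> list:
        """Topics where agents recorded different decisions: a cue for human review."""
        by_topic = {}
        for entry in self.entries:
            by_topic.setdefault(entry["topic"], set()).add(entry["decision"])
        return sorted(t for t, decisions in by_topic.items() if len(decisions) > 1)
```

In a production system the entries would go to durable, append-only storage rather than an in-memory list; the sketch only shows the shape of the record.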

The “fully autonomous” claim that falls apart in the logs. I’ve started asking for execution logs: not screen recordings, the real traces. Too often the “autonomous” flow needs a human every few steps. That’s not leverage; it’s staff augmentation with extra latency. For a credible Series A story on a core workflow, I’d expect human intervention to be the exception, not the rhythm, and founders who can show it from telemetry, not anecdote.

Logs are where autonomy stops being an adjective and becomes a number: interventions per hundred runs, cost per successful task, where the model had to be overridden. If that packet isn’t ready in diligence, the autonomy claim isn’t serious yet.
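Those three numbers can be computed from run records in a few lines. The Python sketch below assumes a hypothetical log schema in which each run is a dict with `success`, `human_interventions`, `cost_usd`, and `override_steps`; real telemetry will look different, but the arithmetic is the same.

```python
from collections import Counter


def autonomy_metrics(runs: list[dict]) -> dict:
    """Summarise execution logs into the numbers an autonomy claim should rest on."""
    if not runs:
        raise ValueError("no runs to measure")
    successes = sum(1 for r in runs if r["success"])
    interventions = sum(r["human_interventions"] for r in runs)
    total_cost = sum(r["cost_usd"] for r in runs)
    overrides = Counter(step for r in runs for step in r["override_steps"])
    return {
        "interventions_per_100_runs": 100 * interventions / len(runs),
        # Failed runs still cost money, so total spend is divided by successes only.
        "cost_per_successful_task": (
            total_cost / successes if successes else float("inf")
        ),
        # Steps where a human most often had to override the model.
        "override_hotspots": overrides.most_common(3),
    }
```

Dividing total spend by successful runs only is a deliberate choice in this sketch: it keeps failed attempts from hiding in the average.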

The agent demo that costs more than the work it replaces. Agentic flows stack reasoning steps, tool calls, retries, and growing context. Unit economics can invert quickly: I’ve seen a founder demo a task “worth” tens of dollars while the inference cost per run ran higher. That’s not a rounding error; it’s a business-model question teams often haven’t modeled.

Before you celebrate the demo, model tokens and tool calls per successful outcome and compare to the labor or software you’re displacing. If the spreadsheet only works at infinite scale, the deck is ahead of the product.
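A back-of-the-envelope version of that model fits in one function. The Python sketch below is illustrative only: the parameter names are hypothetical, and the prices and token counts in the usage note are made-up inputs, not any vendor’s actual rates.

```python
def cost_per_successful_outcome(
    input_tokens: int,
    output_tokens: int,
    usd_per_million_input: float,
    usd_per_million_output: float,
    tool_calls: int,
    usd_per_tool_call: float,
    success_rate: float,
) -> float:
    """Expected spend to get one successful outcome, counting failed attempts."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    model_cost = (
        input_tokens * usd_per_million_input
        + output_tokens * usd_per_million_output
    ) / 1_000_000
    cost_per_attempt = model_cost + tool_calls * usd_per_tool_call
    # On average 1 / success_rate attempts are needed per success.
    return cost_per_attempt / success_rate


def margin_per_outcome(value_displaced_usd: float, cost_usd: float) -> float:
    """Positive means the agent is cheaper than the work it replaces."""
    return value_displaced_usd - cost_usd
```

As a hypothetical example, 200,000 input tokens at 3 USD per million, 20,000 output tokens at 15 USD per million, ten tool calls at 0.01 USD each, and a 50% success rate come to roughly 2 USD per successful outcome. If the displaced work is worth 1.50 USD, the margin is negative before any scale argument starts.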

The “we use MCP/A2A” slide that substitutes protocol for product. MCP and agent-to-agent protocols matter — they’re real infrastructure. Listing them beside your logo isn’t depth any more than listing REST was in 2015. What matters is what runs on top: your data paths, your governance, your evaluation loop — not the acronym row.

Protocols lower integration cost; they don’t replace proof that your orchestration, failure handling, and data moat hold up when customers stress the system.

None of this is niche — it’s most of what crosses my desk, and the pace is up. InnMind reports dense look-alike decks at high volume; commentary on recent accelerator batches points to agents occupying a large share of new companies. High volume, low differentiation: you know how that ends.

TechTruth is where I’m encoding the same two decades of building, auditing, and investing that Part 1 described: the first-pass diligence I used to hold in my head, now systematic for others. Investors can run it as an engine under their own process and load their review method and thesis into the human-in-the-loop step, so the judgment is partnership-grade, not generic scoring. Founders can use it to stress-test agentic claims before investors do, especially where the deck says autonomy, memory, or multi-agent coordination.

On top of the baseline check against LinkedIn, sites, and real model depth, agentic deals need a ghost scan: a careful pass over codebase and infrastructure that answers what the deck dances around. Does state survive a run? Is orchestration real or a linear prompt chain with choreography? Are there hooks for logging, escalation, and rollback, or mostly retries around raw LLM calls? Those answers live in repo and infra, not in Figma.

The product is in beta, it improves with every deck, and I’m clear this is still a beginning: serious, but early, the same posture as in Part 1. Expert review stays on what ships. The first audit is free.

What actually survives

Two years ago the question was real AI or wrapper? Now it’s real agent or wrapper with a new adjective? The answer still isn’t on slide sixteen; it’s in the system behind it.

I’ve been building AI since 2003, and the durable companies still aren’t defined by the buzzword of the year. They own data that is hard to copy, sit deep in a workflow, and integrate where work actually happens — not only at demo depth. They invest in infrastructure the way Part 1 argued: monitoring, governance, feedback loops, meaningful human judgment.

For agentic stacks specifically, that infrastructure has a few extra non-negotiables: durable state and clear recovery when tools fail; propagation rules when one step poisons the next; cost and latency per successful task you can defend; and an audit trail that lets you reconstruct who decided what, for regulators and for your own debugging. Without those, “autonomous” is a slide, not an operating model.
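“Durable state” can be as simple as persisting each completed step so a crashed run resumes instead of starting over. The Python sketch below uses a local JSON file as a stand-in for whatever store a real system would use; `run_with_checkpoint` and the step names are hypothetical, and it assumes step results are JSON-serialisable.

```python
import json
import os
from pathlib import Path
from typing import Callable


def run_with_checkpoint(
    steps: list[tuple[str, Callable[[], object]]], path: Path
) -> dict:
    """Run named steps in order, skipping any already recorded in the checkpoint."""
    state = json.loads(path.read_text()) if path.exists() else {"done": {}}
    for name, fn in steps:
        if name in state["done"]:
            continue  # completed in an earlier run; do not repeat side effects
        state["done"][name] = fn()  # an exception here leaves the checkpoint intact
        # Write to a temp file, then swap it in, so a crash cannot leave a
        # half-written checkpoint behind.
        tmp = path.with_suffix(".tmp")
        tmp.write_text(json.dumps(state))
        os.replace(tmp, path)
    return state["done"]
```

If a tool call fails at step two, rerunning the same step list picks up at step two with step one’s result already on disk, which is the behaviour to look for in a repo that claims state survives a run.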

The ask

Most founders I meet are serious; the technology is just legitimately hard. For investors, the ask is unchanged in spirit, sharper in detail: stop ending diligence at the deck. Read the code, read the logs, follow the infra — especially now that “agent” invites you to hand-wave orchestration.

The agentic wave will re-teach what Part 1 already said: a label on a slide isn’t evidence. TechTruth is meant to make that first pass systematic, including a ghost scan when the claims are architectural, so you spend partner time on judgment, not on guessing whether the autonomy story is wired up. That’s the same “verify behind the curtain” standard; the curtain just got another layer of buzzwords.

We can do better than funding wrappers — even when the deck says agent.

Haven’t read Part 1 yet? It’s where I unpack the five patterns that keep resurfacing in every AI deck — the verification spine (deck vs git, infra, data) that still matters now that the word on the slide is agent instead of platform.

About the author: Bastiaan van de Rakt founded Why Commit Capital and is Venture Partner to several international funds and family offices. He is Operating Partner at Volve Capital and Venture Partner at Aenu; co-founded INIT8 (exited 2011), Enjins, and Deeploy. Twenty-five years in AI build, invest, and audit; 1,000+ decks, 100+ tech diligences. Building TechTruth.

Building something real? If you recognized your deck here, that’s a good sign. Genuine AI depth in Climate or deep tech, pre-seed–A — I want the conversation, not to audit you. Let’s talk →