The AI Subsidy Era Is Over

April 28, 2026 · Episode Links & Takeaways

Sign up for AgentOS, our latest free training course. It's a platform- and model-agnostic program to help you build a personal agentic operating system that can evolve with you over time. The program is self-directed and project-based, running over four weeks.

MAIN STORY

The AI Subsidy Era Is Over

Here's a thing you probably didn't realize: even the most expensive AI subscriptions — $200 a month plans — haven't been covering the actual cost to serve you. The labs have been subsidizing the difference with venture capital, and that era is now ending. What's forcing the reckoning is the agentic shift: token consumption has gone through the roof, stability is straining everywhere, and the business models built for a chatbot world are collapsing under the weight of coding agents running for hours at a time. This isn't a short-term blip — it's a cascade that will reshape how everyone from individual developers to large enterprises thinks about and pays for AI.

SHIFT TO USAGE-BASED PRICING

GitHub Copilot
"There could not be a clearer indicator the subsidy era is over."
GitHub's announcement Monday was the single most clarifying data point of the week. Their $39/month top-tier Copilot subscription had become an extraordinary deal as coding agents took off — pricing was based on requests rather than token usage, which caused massive distortions in the agentic era. The new model shifts to usage-based billing starting June 1, matching how Cursor already operates. The revised multiplier table tells the whole story: Claude Opus 4.7 goes from a 7.5x multiplier to 27x, and Gemini 3.1 Pro and GPT-5.3 Codex jump from 1x to 6x — roughly a 6x price hike for frontier coding models across the board. Through its own price hike, Microsoft was revealing just how deep those subsidies had run.
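To make the multiplier math concrete, here's a minimal sketch of how request-multiplier billing plays out. The multipliers are the revised figures quoted above; the monthly allowance and per-request overage price are illustrative assumptions, not GitHub's actual published numbers.

```python
# Revised multipliers quoted in the story (old values in comments).
MULTIPLIERS = {
    "claude-opus-4.7": 27.0,   # was 7.5x
    "gemini-3.1-pro": 6.0,     # was 1x
    "gpt-5.3-codex": 6.0,      # was 1x
}

ALLOWANCE = 1500        # premium requests included per month (assumed)
OVERAGE_PRICE = 0.04    # dollars per premium request beyond allowance (assumed)

def monthly_overage(usage: dict[str, int]) -> float:
    """usage maps model name -> raw request count for the month."""
    premium = sum(MULTIPLIERS[m] * n for m, n in usage.items())
    return max(0.0, premium - ALLOWANCE) * OVERAGE_PRICE

# An agentic session issuing 100 raw requests against Opus now consumes
# 2,700 premium requests where the old 7.5x rate consumed 750 — which is
# how a long-running coding agent blows through an allowance in a day.
print(monthly_overage({"claude-opus-4.7": 100}))
```

The point isn't the specific dollar figures — it's that multiplier changes compound with agentic request volume, so the effective price hike lands hardest on exactly the workloads driving adoption.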

Anthropic
Straining under the weight of its own success.
Anthropic has been tiptoeing toward the same cliff for weeks, but without ripping off the band-aid cleanly. The sequence: forcing OpenClaw users to the API, running a test that removed Claude Code from the Pro plan, withholding Mythos, and confirming last week that they did deliberately throttle model performance. The Hermes incident crystallized the tension — a developer was charged $200 because the text "Hermes.md" appeared in their git commit history, triggering third-party agent detection. Claude Code's Thariq publicly apologized and issued refunds, but Theo put it plainly: some bugs suggest the thing you're trying to do is a bad idea. A separate Reddit thread went viral when an entire 110-person organization was apparently banned without explanation, given only a Google Form for appeals. The community response to all of this has been loud, with Jason from artifacts.com summarizing: "Anthropic is still overcorrecting on security and understaffing support… I would not recommend anyone launch production services on Anthropic without an enterprise contract in place."

The Wall Street Misread
Reacting to data from two to six months ago on a dog-year timescale.
Alongside the GitHub announcement, stocks tied to OpenAI and AI broadly got hammered on a WSJ report that OpenAI missed key revenue and user targets toward the end of last year and start of this one. That's worth noting — but the report is ancient history by AI standards. Codex user growth has gone 20x this year, from around 200,000 users on January 1st to 4 million the week before GPT-5.5 launched. The broader bubble-narrative shift is similarly telling: the bears have now abandoned "revenue growth looks weak" in favor of "revenue is great but subsidized." At some point you have to wonder if they just need the bubble to exist. That said — what Wall Street thinks does matter, because financing the continued buildout of compute depends on market confidence.

The More Interesting Implication: OpEx to CapEx
This isn't a layoff. It's the transition from neurons to silicon.
The market implications of the subsidy era ending are actually the least interesting part. What matters more is the structural shift from headcount to AI spend — and what that really means for jobs. In three consecutive months of listener pulse data (January, February, March), cost savings didn't appear anywhere on the list of primary AI benefits. Time savings as the top benefit dropped from 19.7% to 12.7%; new capabilities as the top benefit rose from 21.9% to 29.3%. If AI adoption is being driven by capability unlocks rather than cost reduction, the shape of labor displacement looks very different from the doomsday framing. There's also an underappreciated irony: the physics constraints driving this cost reckoning — grid limitations, component shortages, data center buildout timelines — may end up doing more to slow AI diffusion than any open letter ever has.

Cheaper Models Are About to Have a Moment
"Intelligence per unit of cost" is the new benchmark that matters.
As companies start to feel the real cost of frontier tokens, lower-cost models are going to get a hard look. This has been happening quietly for a while — Brian Chesky said Airbnb uses Alibaba's Qwen because it's "fast and cheap," and Cursor has confirmed using Kimi K2.5 as the base for their internal model. The AEI reported that one in four YC Winter 2025 companies used AI to write 95% of their codebase, some via fine-tuned Chinese models. The Menlo VC state of AI report puts open source at about 11% of enterprise LLM API use today, with Chinese open source at roughly 1% overall — but those numbers are from Q4 and are almost certainly moving. The more interesting question for most companies isn't whether to use Qwen or Deepseek — it's whether their current systems were designed with any cost intentionality at all.
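If "intelligence per unit of cost" is the benchmark that matters, it helps to pin down what that metric actually looks like in a bake-off. Here's a minimal sketch; the model names, pass rates, costs, and quality floor are all hypothetical placeholders.

```python
from dataclasses import dataclass

@dataclass
class BakeoffResult:
    model: str
    tasks_passed: int
    tasks_total: int
    total_cost_usd: float

    @property
    def pass_rate(self) -> float:
        return self.tasks_passed / self.tasks_total

    @property
    def passes_per_dollar(self) -> float:
        # "Intelligence per unit of cost" for this task suite.
        return self.tasks_passed / self.total_cost_usd

results = [
    BakeoffResult("frontier-model", 92, 100, 38.40),      # hypothetical
    BakeoffResult("open-weights-model", 81, 100, 4.10),   # hypothetical
]

# Rank by intelligence-per-cost, but keep an absolute quality floor so a
# cheap model can't win on economics while failing the task outright.
QUALITY_FLOOR = 0.75
ranked = sorted(
    (r for r in results if r.pass_rate >= QUALITY_FLOOR),
    key=lambda r: r.passes_per_dollar,
    reverse=True,
)
for r in ranked:
    print(f"{r.model}: {r.pass_rate:.0%} pass, {r.passes_per_dollar:.1f} passes/$")
```

The quality floor is the design choice that matters: without it, a ranking by passes-per-dollar will always favor the cheapest model that does anything at all.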

Five Practical Steps for Enterprises

For companies already mid-workflow-transformation, here are five moves to get ahead of rising model bills.
1. Find the AI spending leaks — audit every use case and task for where a frontier model is doing work a cheaper one could handle.
2. Run a cheap model bake-off — build a framework to test smaller, more efficient, and open models against your key tasks and against each other.
3. Create a Model Sommelier role — give one person ownership of this ongoing selection process, tracking price changes, new releases, and task-specific performance.
4. Build escape hatch architecture — design systems that route routine work to cheaper models but can escalate to frontier models (or humans) for low-confidence, sensitive, or high-value cases.
5. Build an AI cost scoreboard — make agent economics visible across teams, integrate cost metrics with performance data, and celebrate the wins.
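Steps 4 and 5 can be sketched together: a router that defaults to a cheap model, escalates to a frontier model on low confidence or sensitivity, and records per-team spend for the scoreboard. The model names, prices, and confidence scoring below are placeholders; a real system would call actual model APIs and use a calibrated confidence signal.

```python
from collections import defaultdict

CHEAP_MODEL = "cheap-model"
FRONTIER_MODEL = "frontier-model"
PRICE_PER_CALL = {CHEAP_MODEL: 0.002, FRONTIER_MODEL: 0.060}  # assumed prices

CONFIDENCE_THRESHOLD = 0.8
spend_by_team = defaultdict(float)  # the cost scoreboard (step 5)

def call_model(model: str, task: str) -> tuple[str, float]:
    """Placeholder for a real API call; returns (answer, confidence)."""
    if model == CHEAP_MODEL:
        return f"{model} answer for {task!r}", 0.9
    return f"{model} answer for {task!r}", 0.95

def route(task: str, team: str, sensitive: bool = False) -> str:
    # Sensitive or high-value work skips the cheap tier entirely (step 4).
    if not sensitive:
        answer, confidence = call_model(CHEAP_MODEL, task)
        spend_by_team[team] += PRICE_PER_CALL[CHEAP_MODEL]
        if confidence >= CONFIDENCE_THRESHOLD:
            return answer
    # Escape hatch: low confidence, or flagged sensitive from the start.
    answer, _ = call_model(FRONTIER_MODEL, task)
    spend_by_team[team] += PRICE_PER_CALL[FRONTIER_MODEL]
    return answer

route("summarize changelog", team="platform")
route("review contract clause", team="legal", sensitive=True)
print(dict(spend_by_team))  # per-team spend feeds the scoreboard
```

Note the escalation path can just as easily hand off to a human as to a frontier model; the routing logic is the same, only the price column changes.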