Corporate America’s new AI problem is not capability. It is cost. The latest sign is that large companies are starting to ration access as usage grows and the economics get less flattering than the hype cycle suggested.
That shift matters because it marks the end of the “spray AI everywhere” era. For the first wave of enterprise adoption, companies could treat generative AI like a shiny pilot program: give employees access, watch the demos, collect a few productivity wins, and pretend the bill would sort itself out later. Later has arrived. And the math is getting uncomfortably real.
<> For most use cases, AI is either not yet good enough on its own or good enough but too expensive./>
That line captures the core tension better than any vendor pitch deck. Enterprises are discovering that AI value is not a binary yes-or-no question. It is a unit-economics question. If a tool saves 30 seconds but burns through costly inference, orchestration, security review, logging, and human oversight, the business case gets shaky fast.
The likely response is not “turn AI off.” It is far more boring and far more revealing: ration it.
- Limit who gets access.
- Cap how many prompts people can send.
- Route simple tasks to cheaper models.
- Reserve premium models for hard problems.
- Push routine work toward retrieval, caching, and smaller systems.
That is the real enterprise AI stack emerging in production: not one magical model, but a cost hierarchy. The expensive frontier model becomes the exception, not the default. That should not surprise anyone who has ever watched a promising technology move from pilot to procurement. Enthusiasm is cheap; scale is expensive.
For developers, this is the important lesson: model quality is only half the product. The other half is cost control. If you are building AI features for a company, you now need to think like a cloud architect and a finance analyst at the same time.
- Track token spend per feature.
- Set response-length ceilings.
- Cache repeated queries.
- Use smaller models for classification and extraction.
- Add routing so only difficult requests hit premium systems.
- Measure ROI at the workflow level, not the demo level.
The broader market implication is even sharper. If large customers start rationing usage, vendors selling premium inference may face tougher negotiations and slower expansion. That creates room for open-weight models, private deployments, and optimization tooling that helps companies squeeze more value out of fewer tokens.
My take: this is not a setback for AI adoption; it is a necessary correction. The first wave of enterprise AI was driven by possibility. The next wave will be driven by discipline. That is usually where technology starts becoming real.
