AI Industry

The Trillion-Dollar AI Bet: On-Premise, Not Cloud

Why Apple, enterprises, and smart builders are going local

May 8, 2026 · 8 min read

Imagine you run a business. Sixty employees. Every workflow, every automated process, every AI agent your team depends on runs through one cloud provider.

Then one morning, the account gets flagged. Locked. And the only way to reach anyone is through a contact form.

That actually happened. A Spanish entrepreneur I came across on X was in exactly that position with his Claude account.

I do not know how it was resolved. But the story is the real point: this is what happens when you build your entire AI agent operation on someone else’s infrastructure.

The industry needs to have this conversation right now.

Everyone Is Betting on the Wrong Thing

Photo by Nick Fewings on Unsplash

AI agents are going to be how the majority of businesses use AI in the future. That is not a particularly original prediction at this point.

What often goes unaddressed, however, is the question of where those agents will actually run.

Most businesses currently think about AI as a subscription service: they pick a provider, pay per token, and scale as needed.

It feels straightforward. But there is a structural problem underlying that model, and it will become very obvious over the next two to three years:

Cloud AI pricing is not real yet. It is subsidised by investor capital.

Anthropic, OpenAI, and every other frontier lab are spending far more on compute than they are collecting from customers. That gap is filled by venture funding. Investors are underwriting cheap AI access to drive adoption. That strategy has a shelf life.

When those investors start demanding returns, the AI cost equation changes. Token costs go up. Variable costs baked into your agent workflows go up with them. And if your entire operation is running through one provider’s API, you have no leverage when that happens.
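
To make that exposure concrete, here is a rough sketch of the arithmetic. Every number in it is a placeholder I made up for illustration, not a real price or a real usage figure, and the function names are hypothetical. It assumes a simple model: API spend scales linearly with tokens, while a local setup has a roughly fixed monthly cost.

```python
def monthly_api_cost(tokens_per_month: float, price_per_million_tokens: float) -> float:
    """Variable spend: scales linearly with token volume and unit price."""
    return tokens_per_month / 1_000_000 * price_per_million_tokens


def break_even_tokens(local_fixed_cost_per_month: float,
                      price_per_million_tokens: float) -> float:
    """Monthly token volume above which a fixed-cost local setup is cheaper."""
    if price_per_million_tokens <= 0:
        raise ValueError("price_per_million_tokens must be positive")
    return local_fixed_cost_per_month / price_per_million_tokens * 1_000_000


# Placeholder inputs (assumptions, not real prices or volumes).
tokens = 500_000_000    # tokens your agents consume per month
price_today = 4.0       # currency units per million tokens
price_later = 8.0       # the same API after a hypothetical doubling
local_fixed = 1000.0    # hardware amortisation + electricity per month

print(monthly_api_cost(tokens, price_today))         # 2000.0
print(monthly_api_cost(tokens, price_later))         # 4000.0
print(break_even_tokens(local_fixed, price_today))   # 250000000.0
print(break_even_tokens(local_fixed, price_later))   # 125000000.0
```

The point is not the specific numbers. When the unit price doubles, the bill doubles with it, and the volume at which local hardware pays for itself is cut in half.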

Apple Spotted This Before Anyone Else

John Ternus, Apple’s likely next CEO, whose hardware background signals the company’s strategic bet on local AI.

One detail makes all of this click into place.

Apple recently appointed two hardware executives to the very top of the organisation. That is a strange move for a company that has been behind on AI.

The move is about cornering the market on local AI compute.

Nate Jones made this argument compellingly in his video “Apple Just Positioned Itself for the Next Trillion Dollars,” and I think he is right. Apple is making the same bet it made in the 1970s and 80s.

Back then, computing lived in mainframes. Expensive, centralised, and controlled by a handful of vendors. Power users paid the cost because they had no alternative.

Then the Apple II arrived. It was not as powerful as a mainframe. But it was good enough, cheap to run, and most importantly, it was yours.

Power users left the mainframes. The personal computer era began.

Apple is betting that history repeats. Cloud AI is the mainframe. Local AI, running on personal hardware or on-premises servers, is the Apple II. And the transition is already starting.

The Mac Mini craze around open source tools like OpenClaw was an early signal: developers were running capable AI locally, not because local models beat GPT-4 on benchmarks, but because they were controllable and, once the hardware was paid for, the running cost was essentially just electricity.

What Enterprises Actually Want

Photo by Nguyen Dang Hoang Nhu on Unsplash

When the conversation gets practical, this is what businesses say.

They do not necessarily need the smartest model. They need a model that is predictable, that does not expose sensitive data to third-party infrastructure, and that does not come with variable costs that make financial planning impossible.

Maybe that model is not as sharp as Claude or GPT-5 on complex reasoning tasks. Fine. If it runs on your servers, it handles your data, and you pay for electricity rather than per-token fees, the trade-off is often worth it.

This is especially true for sectors like law, healthcare, finance, and government. These industries have compliance requirements that make cloud-everything a non-starter.

They also have large volumes of routine human labour that AI agents can absorb, but only if those agents are reliable and cost-controlled over the long term.

The Agent Stack That Actually Makes Sense

At Elephant Stripes, this is how we think about building agent systems that hold up.

You start with a single interface. Something the organisation already uses, whether that is Slack, Teams, Discord, or any other chat platform. Staff interact with agents the same way they already communicate at work. No new interface to learn.

Behind that interface sits an open source orchestration layer. Frameworks like OpenClaw and Hermes are the ones we follow closely.

They give you genuine control over how your agents work: memory management, both short-term and long-term, and the ability to route calls to different models depending on what the task actually requires.

From that layer, you can send calls to frontier model APIs when the task demands it. But you can also fall back to an open source model running on-premises.

A cheap VPS. A Mac Mini sitting in a server room somewhere. Something that is yours, that runs on your infrastructure, and that works even when everything else goes wrong.

Built-in redundancy is not a bonus feature. It is the whole point.
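
As a minimal sketch of that routing-with-fallback idea, and not how OpenClaw or Hermes actually implement it, the logic can be as small as an ordered list of backends. All names here (`Route`, `ProviderUnavailable`, `pick_routes`, `run_with_fallback`) are hypothetical, and the two backends are stand-in functions rather than real API clients.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


class ProviderUnavailable(Exception):
    """Raised by a backend when it cannot serve a request."""


@dataclass
class Route:
    name: str
    call: Callable[[str], str]  # takes a prompt, returns the model's reply


def pick_routes(needs_frontier: bool, frontier: Route, local: Route) -> List[Route]:
    # Policy choice (an assumption): hard tasks try the frontier API first,
    # routine tasks try the local model first.
    return [frontier, local] if needs_frontier else [local, frontier]


def run_with_fallback(prompt: str, routes: List[Route]) -> Tuple[str, str]:
    """Try each route in order; return (route_name, reply) from the first that works."""
    errors = []
    for route in routes:
        try:
            return route.name, route.call(prompt)
        except ProviderUnavailable as exc:
            errors.append(f"{route.name}: {exc}")
    raise RuntimeError("all routes failed: " + "; ".join(errors))


# Stand-in backends: the frontier API simulates a locked account.
def locked_frontier_api(prompt: str) -> str:
    raise ProviderUnavailable("account locked")


def local_model(prompt: str) -> str:
    return f"[local] {prompt}"


frontier = Route("frontier-api", locked_frontier_api)
local = Route("on-prem", local_model)

name, reply = run_with_fallback(
    "Summarise this contract", pick_routes(True, frontier, local)
)
print(name, reply)  # on-prem [local] Summarise this contract
```

In a regulated setting you would probably drop the frontier route from the routine list entirely, so sensitive prompts never leave your own infrastructure.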

IT Teams Are About to Have a Very Different Job

Photo by Shamin Haky on Unsplash

The traditional IT function, managing software licences, user accounts, and system security, is not disappearing. But a new layer is being added on top of it.

The IT teams that will matter most over the next five years are the ones that can manage agents. That means tracking which agents are running, how much compute each is consuming, and what models are powering them.

It also means prototyping new functionality on frontier models, then actively working to shift that functionality to smaller, cheaper, or open source models as quickly as the task allows. Frontier models for research and development. Smaller or open source models for production and everyday operations.

That cycle is where the cost discipline comes from, and it is also where the technical skill is going to sit. Demand for people who can build and optimise this kind of stack is going to be significant.
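
A rough sketch of what that bookkeeping could look like, under some loud assumptions: token counts stand in for compute, and the agent names, model names, and the `AgentRegistry` class are all hypothetical. It tracks which agents exist, what model powers each, how much each has consumed, and which production agents are still on a frontier model and are therefore candidates to move down.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Agent:
    name: str
    model: str
    tier: str   # "frontier" or "local"
    stage: str  # "prototype" or "production"


class AgentRegistry:
    def __init__(self) -> None:
        self._agents: Dict[str, Agent] = {}
        self._tokens: Dict[str, int] = defaultdict(int)

    def register(self, agent: Agent) -> None:
        self._agents[agent.name] = agent

    def record_usage(self, name: str, tokens: int) -> None:
        if name not in self._agents:
            raise KeyError(f"unknown agent: {name}")
        self._tokens[name] += tokens

    def usage_by_model(self) -> Dict[str, int]:
        """Total tokens consumed per model, across all agents."""
        totals: Dict[str, int] = defaultdict(int)
        for name, tokens in self._tokens.items():
            totals[self._agents[name].model] += tokens
        return dict(totals)

    def migration_candidates(self) -> List[str]:
        """Production agents still running on a frontier model."""
        return sorted(
            a.name for a in self._agents.values()
            if a.tier == "frontier" and a.stage == "production"
        )


registry = AgentRegistry()
registry.register(Agent("invoice-triage", "frontier-model-a", "frontier", "production"))
registry.register(Agent("faq-bot", "local-model-b", "local", "production"))
registry.register(Agent("contract-review", "frontier-model-a", "frontier", "prototype"))

registry.record_usage("invoice-triage", 120_000)
registry.record_usage("faq-bot", 300_000)
registry.record_usage("contract-review", 40_000)
registry.record_usage("invoice-triage", 30_000)

print(registry.usage_by_model())        # {'frontier-model-a': 190000, 'local-model-b': 300000}
print(registry.migration_candidates())  # ['invoice-triage']
```

A real deployment would pull usage from the orchestration layer's logs rather than manual calls, but the questions the registry answers are the ones an IT team will be asked.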

I think it is actually a great time to be a programmer who understands this space.

Why Open Source Wins the Critical Layer

Photo by Markus Winkler on Unsplash

Between enterprises and the big model providers, there is going to be a layer of open source tooling that businesses depend on deeply. That layer is where the control sits.

Proprietary cloud platforms give you convenience. Open source agent frameworks give you something more valuable: the ability to make your own decisions.

Which model gets called for which task? Yours to decide. How much memory does each agent carry? Yours to configure. What happens when a provider goes down, changes its pricing, or locks your account? You have a fallback because you built one in.

That is the core of why I am enthusiastic about open source frameworks in this space. They are well-built, and they shift the power dynamic back toward the people building with them.

The Realistic Picture

I want to be clear about something: I am not saying cloud AI is going away. Frontier models are getting better at a remarkable pace, and for genuinely complex reasoning tasks, they remain impressive tools.

But the current model, where businesses pipe everything through a single cloud provider’s API with variable AI costs and a contact form as their primary support channel, is not a stable long-term arrangement.

The Spanish entrepreneur with 60 staff and one form to submit is not an edge case. It is a preview.

The businesses that figure this out early will build agent infrastructure they control. They will use frontier models as a high-performance tier, not as a foundation. They will treat open source frameworks as their backbone, and local compute as their safety net.

AI agents are going to be how most businesses run on AI. The ones that win will make sure those agents are not someone else’s to switch off.

My thinking on all of this is still developing. But the direction feels clear.

I would genuinely like to hear your take.


This article grew out of a longer video I recorded for the Elephant Stripes YouTube channel, where you can watch it in full.

At Elephant Stripes, we help businesses build AI agent systems that they actually own and control. If that is the direction your organisation is heading, come find us at elephantstripes.ai
