Ruurd

Apr 15, 2026

Tokens are the new MIPS

I’ve been watching the AI hype cycle the same way I watch most hype cycles: from a safe distance. Mildly amused, occasionally curious, but never fully buying in.

I’m not a hype guy. I’ll sip the kool-aid, sure—but I don’t chug it.

Over the last decade, we’ve seen this pattern repeat often enough that it’s hard to get excited too early. Big promises, slick demos, VCs throwing money at anything that moves… and then reality catches up. The graveyard of “revolutionary” developer tools is already crowded.

So I waited. And then, somewhere in the last six months, something shifted.

This time it actually works

AI for development crossed a line recently. Not in a flashy keynote kind of way, but in a quieter, slightly unsettling “oh… this is actually useful now” sense.

We moved beyond autocomplete on steroids into something else entirely: agent-based workflows and reasoning models. Systems that don’t just finish your line of code, but can take a vague goal and stumble—sometimes elegantly, sometimes not—toward something that resembles a solution.

It’s far from perfect. It breaks in odd ways, hallucinates with confidence, and burns tokens like a startup burns through its first round of funding. But it works often enough that you start relying on it. And that’s where things get interesting.

The uncomfortable part: we lost something

For the last 20 years, being a developer meant a certain kind of independence.

Give me a decent laptop, some CPU, enough RAM, and an internet connection for the occasional StackOverflow search, and I could build pretty much anything. Backend, frontend, infrastructure, some weird side project at 2am—it didn’t matter.

It was local. It was mine.

Now we’re drifting back toward something that looks suspiciously like a mainframe model.

Tokens are the new MIPS.

And the actual “compute” lives somewhere else: somewhere far away, behind an API, owned by a small number of vendors. You’re no longer running your tooling—you’re renting intelligence. Metered, rate-limited, and subject to whatever policy decision gets made on a random Tuesday.

That’s a significant shift, and I’m not sure we’ve fully internalized what it means yet.

It’s not just technical. It’s geopolitical.

This is the part that tends to make people slightly uncomfortable in architecture discussions: most of these “mainframes” are operated by US-based companies.

Which means that, structurally, a growing portion of our development capability depends on a permanent connection to systems outside our control, governed by jurisdictions that don’t necessarily align with our own.

That’s not just a technical dependency. It’s a strategic one.

What happens if access changes overnight? What happens if pricing shifts in a way that breaks your unit economics? Or—less hypothetically—what happens if someone in Washington wakes up with the wrong milk in their morning coffee and decides your region is now a “risk”?
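The pricing-shift scenario is easy to make concrete. Here's a rough sketch of the arithmetic, where every number—token counts, prices, revenue—is a made-up illustration, not real vendor pricing:

```python
# Hypothetical unit economics for a token-metered agent workflow.
# Every number below is an illustrative assumption, not real pricing.

def cost_per_task(tokens_in: int, tokens_out: int,
                  price_in: float, price_out: float) -> float:
    """Cost of one agent run, with prices in dollars per million tokens."""
    return (tokens_in * price_in + tokens_out * price_out) / 1_000_000

# Suppose one agent run burns 200k input tokens and 30k output tokens.
today = cost_per_task(200_000, 30_000, price_in=3.0, price_out=15.0)

# Now the vendor doubles prices on a random Tuesday.
doubled = cost_per_task(200_000, 30_000, price_in=6.0, price_out=30.0)

revenue_per_task = 1.50  # hypothetical: what you charge per run
print(f"cost today:   ${today:.2f}, margin ${revenue_per_task - today:.2f}")
print(f"cost doubled: ${doubled:.2f}, margin ${revenue_per_task - doubled:.2f}")
```

With these made-up numbers, a single repricing decision flips the margin from positive to negative—nothing about your product changed, only someone else's policy did.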

It sounds dramatic until it isn’t. We’ve seen similar dynamics play out in other domains.

Also… what happens to developers?

There’s another angle here that doesn’t get enough attention: the day-to-day work is changing.

There’s less typing and more steering. Less “I write code”, and more “I manage a system that writes code”. That sounds efficient, even appealing, until you look at what it actually involves: you’re not just solving problems anymore—you’re orchestrating agents, debugging their failures, managing context windows, and nudging outputs back on track.

That’s a form of management. Not people management, but still management. And not everyone signed up for that.

A lot of developers genuinely enjoy building things directly. There’s a real question as to whether a portion of them will burn out trying to adapt to this shift. On the other hand, it will likely bring back seasoned engineers who got tired of writing code all day and are looking to have more impact.

Interesting times.

So how do we get some independence back?

If the trajectory points toward “mainframes,” the obvious question is whether we can rebuild some level of independence. That’s where open models, local setups, and regional infrastructure providers come into play.

I’ve been experimenting a bit on that front. My current setup is a 4090 box—solid, but not exactly a data center.

And the honest answer so far is: it’s fine. Good enough for coding assistants. Autocomplete, smaller reasoning tasks, some local workflows—it all works reasonably well. But it doesn’t come close to the frontier models.

Which raises the obvious question: why? Is it simply the VRAM limitation? Is it the models themselves? Or is it the surrounding ecosystem—the tooling, orchestration, memory layers—that actually make these systems useful? It’s probably a combination of all three.
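The VRAM part, at least, is easy to put a rough number on. A back-of-the-envelope sketch—weights only, ignoring KV cache and runtime overhead, with illustrative model sizes rather than any specific model:

```python
# Back-of-the-envelope VRAM estimate for running a model locally.
# Rule of thumb: weights dominate; KV cache and overhead add more on top.
# All figures are approximations, not exact requirements of any model.

def weights_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate VRAM needed for the model weights alone, in GB."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

budget = 24.0  # a 24 GB card, e.g. a 4090
for params, bits in [(7, 4), (14, 4), (70, 4), (70, 16)]:
    need = weights_gb(params, bits)
    # Leave ~10% headroom for KV cache and activations.
    fits = "fits" if need < budget * 0.9 else "does not fit"
    print(f"{params}B @ {bits}-bit: ~{need:.1f} GB -> {fits}")
```

A 7B or 14B model quantized to 4 bits fits comfortably; anything in the 70B-and-up range doesn't come close. So the VRAM ceiling is real—but it doesn't by itself explain the gap, which is why the models and the surrounding ecosystem are probably doing a lot of the work too.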

Next step: rent bigger toys

The next logical step is to stop pretending a single GPU is enough and run something larger: proper GPUs, multi-card setups.

Until someone (Extropic) gets thermodynamic computing working, we’ll need a lot of transistors to do artificial probabilistics. And I won’t complain: I love hardware.

But—and this part matters—do it in a way that doesn’t just reinforce the same dependency problem. There are some interesting options emerging in Europe: Nebius, Mistral AI, Axelera AI.

So the plan is straightforward: rent compute from a proper company, not from a random dude renting out their privately owned H200s unsecured (yes, that happens), run larger open models, and see how far this can go.

Can we get close to the agent-style workflows people are getting used to? Or does everything fall apart once you leave the ecosystem of the major providers? I honestly don’t know yet, but I’m going to find out.

Where this probably lands

If I had to guess, we won’t fully return to independence, and we won’t completely embrace the mainframe model either.

We’ll land somewhere in the messy middle: some local models, some cloud dependencies, and a lot of glue code and questionable architectural decisions connecting the two.

As always, the real story won’t show up in polished demos. It will show up in the failure modes. The strange edge cases. The outages. The unexpected costs. The “why did the agent just delete production” moments. That’s where things become real.

I’m still not a hype guy, but this time I’m paying attention.