Google’s Three-Frontier Theory of AI Models Changes How We Should Build With Them

February 23, 2026
5 min read
[Illustration: abstract depiction of cloud infrastructure balancing AI speed, intelligence and cost]

Google’s latest framing of AI progress quietly kills a myth that’s dominated the last two years: that “smarter models” are all that matters. In an interview with TechCrunch, Google Cloud’s AI VP Michael Gerstenhaber describes three frontiers models are pushing at once: intelligence, latency and cost-at-scale. It sounds abstract, but it’s exactly the mental model product teams — and regulators — have been missing. In this piece, we’ll unpack why this triad matters more than benchmark scores, how it reshapes the cloud AI race, and what it means for European companies trying to deploy agentic AI without blowing their budget or their GDPR risk profile.


The news in brief

According to TechCrunch, Google Cloud product VP Michael Gerstenhaber, who leads the Vertex AI platform, outlined a three-part view of how AI model capabilities are evolving. Drawing on his experience at Anthropic and now Google, he argues that models are simultaneously advancing along three axes: raw intelligence, response time and cost-effective deployment at unpredictable scale.

He uses concrete scenarios. For tasks like complex code generation, enterprises maximise intelligence even if responses take minutes. For real-time interactions such as customer support, intelligence is constrained by strict latency budgets — being right doesn’t matter if the user has already left. For high-volume workloads like content moderation across social networks, the decisive factor is cost per call at internet scale.

Gerstenhaber also explains why “agentic” systems — AI agents that act autonomously over tools and data — are slower to appear in production than hype suggests. He points to missing infrastructure for auditing, authorisation and safe deployment, contrasting that with software engineering, where mature human‑in‑the‑loop processes already exist.


Why this matters: a new playbook for AI builders

Gerstenhaber’s three-frontier model sounds simple, but it’s a major shift from the industry’s obsession with a single question: which model is smartest? For anyone actually shipping products, the more useful question is: how much intelligence do I need, at what latency, for what cost profile?

The winners here are pragmatic builders. If you’re a CTO or product manager, this framing legitimises something you probably feel intuitively: the “best” model is rarely the one with the most parameters. A procurement process that only compares accuracy benchmarks is now obviously incomplete; you need to model user patience and cost volatility just as rigorously as token quality.

Cloud providers that can offer a portfolio across these three frontiers benefit too. Google can position Gemini Ultra for intelligence-heavy work, lighter Gemini variants for low-latency tasks, and smaller distilled models for bulk workloads. OpenAI and Anthropic are already on similar trajectories with tiered model families; the fact that a Google Cloud VP is articulating this openly signals that the era of a single flagship model is over.

Who loses? Startups that pin their story solely on “we trained a slightly smarter model” without a strong angle on latency, cost, or integration will find this narrative unforgiving. Also at risk are enterprises that treat AI as a single line item, rather than a portfolio of capabilities tuned to specific operational constraints.

Perhaps most importantly, Gerstenhaber’s comments about missing audit and authorisation patterns are a warning: agentic AI isn’t stuck because of model quality; it’s stuck because organisations haven’t yet rebuilt their processes, controls and tooling around autonomous systems. That’s a socio‑technical problem, not a benchmark problem.


The bigger picture: from “one big brain” to a mesh of specialised services

This three-frontier view fits neatly into several ongoing shifts in the AI industry.

First, it explains why every major lab has moved from a monolithic flagship model to a family strategy. OpenAI split GPT‑4 into high-intelligence and “Turbo” style offerings. Anthropic positioned Claude Opus, Sonnet and Haiku along a clear quality–latency–cost spectrum. Google did the same with Gemini Ultra, Pro and Nano. They’re not just segmenting by price; they’re mapping offerings onto specific positions in Gerstenhaber’s three-dimensional space.

Second, it reframes the “agents” conversation. Many agent demos today are over-engineered: they use top-tier models to orchestrate trivial tasks, burning latency and money for marginal gains. If you accept that most business workflows are a mix of deep reasoning, quick interactions and high-volume classification, then a serious agent platform will route between different models depending on the step. The orchestration layer — not the raw model — becomes the strategic battleground.
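That routing idea can be sketched in a few lines. This is a minimal illustration, not any vendor's API: the tier names, latencies and prices are invented for the example, and a real orchestrator would consult live pricing and per-request context.

```python
# Hypothetical model tiers; latency and cost figures are illustrative only.
MODELS = {
    "deep": {"latency_ms": 20000, "cost_per_call": 0.50},    # heavy reasoning
    "fast": {"latency_ms": 300,   "cost_per_call": 0.01},    # interactive use
    "bulk": {"latency_ms": 150,   "cost_per_call": 0.0005},  # high-volume classification
}

def route(step_kind: str) -> str:
    """Map a workflow step to a model tier by its dominant constraint."""
    return {
        "reasoning": "deep",       # intelligence-bound: minutes are acceptable
        "interactive": "fast",     # latency-bound: a user is waiting
        "classification": "bulk",  # cost-bound: internet-scale volume
    }[step_kind]

# A single business workflow mixes all three frontiers:
workflow = ["classification", "interactive", "reasoning"]
plan = [route(step) for step in workflow]
print(plan)  # ['bulk', 'fast', 'deep']
```

The point of the sketch is that the routing table, not any individual model, encodes the product decision.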

Third, it highlights why infra and governance startups are suddenly interesting again. Logging what an agent did, constraining which data it can touch, simulating and testing complex toolchains before production — these are unsolved problems reminiscent of early DevOps and microservices eras. Observability, policy engines and “agent CI/CD” pipelines will be as critical as the models themselves.

Historically, we’ve been here before. When cloud computing moved from “one big server” to distributed microservices, the performance conversation shifted from CPU speed to latency budgets, SLOs and cost per request. AI is undergoing a similar transition: away from single-core IQ and towards holistic system design that balances speed, precision and spend.


The European angle: regulation meets the three frontiers

From a European perspective, Gerstenhaber’s remarks are almost a blueprint for where the EU AI Act and GDPR pressures will bite hardest.

On the intelligence frontier, high-capability models used for decision-making in areas like credit, employment or healthcare will fall into high‑risk categories under the AI Act. That implies mandatory risk management, logging and human oversight. Gerstenhaber’s call for robust auditing of agent behaviour is exactly what Brussels is about to require in law.

On the latency frontier, many European companies are experimenting with AI copilots for customer support in banking, insurance and public services. Here, you can’t just crank up model size: latency is constrained by user expectations and, in some sectors, by regulations on how quickly customer issues must be resolved. A German insurer or a Slovenian bank can’t let an ultra‑smart but slow agent keep customers waiting while it ponders the perfect answer.

On the cost/scale frontier, EU businesses dealing with user-generated content — from French marketplaces to Central European media groups moderating comments — need moderation that’s both cheap and compliant with European hate speech laws and platform regulations like the Digital Services Act. Running frontier models on every post would be financially suicidal; a layered system that uses cheaper filters first and escalates edge cases to stronger models (or humans) is far more realistic.

For European cloud and AI vendors, this is both threat and opportunity. Google’s vertical integration — chips, data centres, models, agent layer and interfaces — is hard to match. But EU players can specialise in sovereignty, compliance tooling and sector-specific agents that sit on top of US models while staying inside European data, energy and legal constraints.


Looking ahead: from model choice to system design discipline

If we take Gerstenhaber seriously, the next two to three years of AI adoption won’t be defined by one lab leaping ahead in raw IQ, but by how well organisations architect around these three frontiers.

Expect product teams to move from asking “which model?” to designing policies: for this workflow, what is the maximum acceptable latency, error rate and cost per transaction? Once that’s fixed, the choice of model (or set of models) becomes a constrained optimisation problem rather than a religious argument about benchmarks.

We should also expect more automatic model routing. Cloud platforms will increasingly offer “profiles” — cost‑optimised, latency‑optimised, quality‑optimised — and quietly select between multiple back-end models on each request. That’s great for efficiency, but dangerous for transparency: regulators and customers will want to know which systems actually made which decisions.

On the agent side, the gap Gerstenhaber mentions — missing audit and authorisation structures — is likely to be filled by a new layer of tooling. Think of agent sandboxes, simulation environments, formal policy languages and “agent ops” teams who own reliability the way SREs own uptime today. European enterprises, already used to compliance-heavy change management, may actually be better positioned than US startups to industrialise these patterns.

The main risks? First, lock‑in: vertically integrated stacks like Vertex AI make it seductively easy to build, but harder to move later if pricing or regulation shifts. Second, governance failures: if agents act across sensitive systems without proper guardrails, early scandals could trigger heavy‑handed restrictions, especially in the EU. Third, cost blowouts: treating every problem as a frontier‑model problem will quietly inflate cloud bills until CFOs slam the brakes.


The bottom line

Gerstenhaber’s three-frontier view is more than a clever metaphor; it’s a practical design brief for the next phase of AI adoption. Intelligence, latency and cost aren’t just technical metrics; they’re the levers that will decide which AI products are viable, compliant and trustworthy. Companies that embrace this systems mindset — and invest early in auditability and authorisation for agents — will build durable advantages. Those that keep chasing “the smartest model” in isolation are likely to overspend, under‑deliver and run into regulators. The real question for readers is: where on these three frontiers does your next AI project actually need to live?
