Anthropic’s “theoretical AI job impact” isn’t a forecast – it’s a mood from 2023

April 1, 2026
5 min read
[Illustration: a robotic hand overshadowing office workers at their desks]

1. Headline & intro

An ominous chart from Anthropic has been making the rounds, suggesting large language models could theoretically handle 80% of tasks in many white‑collar jobs. It looks like a preview of mass professional extinction. But the small print tells a different story: that blue "theoretical capability" band is built on speculative judgments frozen at the peak of the 2023 AI hype cycle.

In this piece, we’ll unpack what Anthropic actually did, what the underlying OpenAI study really measured, and why policymakers, managers, and workers should treat these numbers as mood indicators rather than destiny. The real risk isn’t that AI can do 80% of your job today – it’s that shaky metrics like this start steering real economic decisions.

2. The news in brief

According to Ars Technica, Anthropic recently published a report on AI and the labour market that compares two things across 22 broad occupation groups: current "observed exposure" to large language models (LLMs) and a much larger "theoretical capability" estimate.

Instead of running its own forward‑looking experiments, Anthropic based that theoretical line on a 2023 paper, "GPTs are GPTs", co‑authored by OpenAI, OpenResearch and the University of Pennsylvania. That earlier work used US O*NET data to break jobs into very fine‑grained tasks. Human annotators – assisted by GPT‑4 – labelled each task according to whether the best OpenAI model at the time could cut task time by at least 50% with equal quality, or whether future "LLM‑powered software" might do so.

The original authors openly noted the subjectivity and limitations of this approach. Anthropic nonetheless uses those estimates as its benchmark, finding that real‑world AI use remains far below that theoretical "feasible" ceiling.
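To make the methodology concrete, here is a minimal sketch of how task‑level labels of this kind roll up into a headline occupation‑level "exposure" percentage. The occupations, task labels, and numbers below are entirely hypothetical illustrations, not data from Anthropic's report or the "GPTs are GPTs" paper; the point is only that the headline figure is a share of labelled tasks, nothing more.

```python
# Hypothetical illustration of an O*NET-style exposure roll-up.
# Each task gets one label:
#   "direct"   - annotators judged the then-current model could halve
#                task time at equal quality
#   "software" - anticipated future "LLM-powered software" might do so
#   "none"     - neither
occupation_tasks = {
    "paralegal": ["direct", "direct", "software", "none", "software"],
    "plumber": ["none", "none", "direct", "none"],
}

def exposure_share(labels, include_speculative=True):
    """Fraction of an occupation's tasks counted as 'exposed'."""
    exposed = {"direct", "software"} if include_speculative else {"direct"}
    return sum(label in exposed for label in labels) / len(labels)

for job, labels in occupation_tasks.items():
    observed = exposure_share(labels, include_speculative=False)
    theoretical = exposure_share(labels, include_speculative=True)
    print(f"{job}: observed {observed:.0%}, theoretical {theoretical:.0%}")
# -> paralegal: observed 40%, theoretical 80%
# -> plumber: observed 25%, theoretical 25%
```

Note how the gap between the two numbers for the hypothetical paralegal is driven entirely by the speculative "software" labels, exactly the band the chart paints blue.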

3. Why this matters

The core issue is not that Anthropic cites an old study – it’s what this kind of chart does in public debate.

By compressing a messy, assumption‑heavy methodology into a single blue band, the graphic invites a simplistic reading: “AI could do 80% of most jobs; it just hasn’t been adopted yet.” For executives looking to cut costs, or politicians wanting to look tough on the future of work, that’s an attractive, easy‑to‑quote number – even though it is based on:

  • Tasks, not entire jobs
  • Time savings, not full automation
  • Expert guesses from 2023, not measured future performance
  • A completely open‑ended timeline

Those choices matter. A task that could be sped up 50% in theory might, in practice, become slower once you add prompt writing, result checking, compliance reviews, and integration into existing workflows. Ars Technica cites a 2025 study where open‑source coders using AI actually ended up slower overall when you include these frictions.
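A back‑of‑the‑envelope calculation shows how quickly that theoretical speedup erodes. The numbers below are invented for illustration: a task whose core work is halved by AI can still take longer end to end once fixed per‑task overhead is added back in.

```python
# Hypothetical arithmetic: theoretical 50% task speedup vs. real-world
# frictions (prompt writing, result checking, compliance review,
# workflow integration). All numbers are illustrative, not measured.

def effective_time(baseline_min, ai_speedup, overhead_min):
    """Total task time with AI: sped-up core work plus fixed overhead."""
    return baseline_min * (1 - ai_speedup) + overhead_min

baseline = 60  # minutes to complete the task unaided

no_friction = effective_time(baseline, ai_speedup=0.5, overhead_min=0)
with_friction = effective_time(baseline, ai_speedup=0.5, overhead_min=40)

print(f"theoretical: {no_friction:.0f} min, with frictions: {with_friction:.0f} min")
# -> theoretical: 30 min, with frictions: 70 min
```

With 40 minutes of overhead, the "50% faster" task ends up ten minutes slower than doing it unaided, which is the same qualitative pattern the 2025 coding study found.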

Who benefits? Vendors and consultancies selling AI transformation projects gain a powerful visual aid. Investors get another justification for sky‑high valuations. Who loses? Workers who are told their roles are "80% automatable" based on speculative labels, and policymakers who might overreact – or underreact – to the wrong risks.

The immediate implication: we risk spending the next few years optimising around a mirage, while ignoring the slower, structural ways AI actually reshapes work.

4. The bigger picture

Anthropic’s chart sits at the intersection of several trends.

First, it is part of a broader shift from measuring what models can do in the lab to speculating what that could mean for the economy. We’ve seen similar attempts from consultancies, central banks, and think tanks: all trying to turn raw model benchmarks into estimates of GDP, productivity, or job loss. The appetite is understandable; the data foundations are shaky.

Second, the underlying 2023 study was produced at the high‑water mark of AI hype. As Ars Technica reminds readers, that was the moment of open letters calling for AI moratoria and warnings about civilisation‑level risks. When you ask AI experts in that environment to imagine "anticipated LLM‑powered software" with no time horizon, you are essentially sampling the optimism and fear of that moment, not a neutral forecast.

Third, this connects to a long history of over‑promising technology‑driven job disruption. From the 1980s PC wave to early 2000s automation studies, the pattern repeats: headline numbers about potential exposure vastly exceed what shows up in real employment data over the following decade. Sometimes the jobs truly vanish (think typesetters); more often, they morph, fragment, and re‑emerge with different task mixes and titles.

Competitively, all the big AI labs – OpenAI, Google, Anthropic, Meta – now have incentives to emphasise potential scope. If you say your model might eventually touch half of all tasks in the economy, you are not just selling capability; you are justifying continued capital inflows and regulatory attention. That doesn’t make these analyses dishonest, but it does mean we should interrogate the framing.

5. The European / regional angle

For Europe, the way we measure AI’s job impact is almost as important as the impact itself.

Labour markets here are more regulated, collective bargaining is stronger, and social safety nets are thicker than in the US. Decisions on reskilling funds, university curricula, and worker‑protection rules are already being shaped by forecasts of AI‑driven disruption. The EU AI Act, GDPR, and the forthcoming implementation guidance around high‑risk AI systems will all influence how quickly firms can deploy LLMs into workflows involving people’s livelihoods.

Yet Anthropic’s benchmark rests on US O*NET data and US occupational structures. European economies have different job mixes (larger public sectors, stronger manufacturing in countries like Germany, distinct SME ecosystems in Central and Eastern Europe). Simply porting over a US exposure metric risks misjudging where European workers are actually vulnerable.

There is also a cultural dimension. European publics tend to be both more privacy‑conscious and more sceptical of opaque corporate metrics. Works councils and unions in Germany, the Nordics, and beyond will not accept "a chart from an AI lab" as evidence that roles should be restructured. They will demand task‑level, context‑specific proof.

For EU regulators and national ministries of labour, the lesson is clear: do not let a single speculative line in a US‑centric study become the de facto reference point for European policy. Invest in your own measurement, using European job taxonomies and real adoption data.

6. Looking ahead

Where does this leave workers, managers, and policymakers trying to plan for the next decade?

Expect more charts like Anthropic’s. Every major AI player and consultancy will produce its own "exposure" metrics, each with different assumptions about what counts as a task, what qualifies as "equivalent quality", and how to treat human oversight. The numbers will not be directly comparable, but they will compete for attention.

The useful signal will come from a different kind of research:

  • Field experiments inside companies that randomise AI tool access and measure actual productivity changes, error rates, and worker satisfaction.
  • Longitudinal firm‑level data that tracks how AI adoption correlates with hiring, wages, and organisational structure over years, not months.
  • Worker‑centric studies that ask employees which parts of their job LLMs genuinely help with, and where they introduce new risks or cognitive overhead.

On roughly a 3–7 year horizon, we are likely to see deep integration of LLMs into office software, customer support platforms, and internal knowledge tools. The impact will probably look less like "80% of lawyers automated" and more like "legal work is decomposed, junior roles change character, new oversight and tooling positions appear".

Key questions to watch:

  • Do we see net job loss in highly exposed occupations, or task reshuffling within them?
  • Do wage premiums accrue to those who can orchestrate AI tools, or to those whose work is hardest to formalise as text and code?
  • How do different regulatory regimes – EU vs US vs China – shape adoption speed and worker protections?

7. The bottom line

Anthropic’s "theoretical capability" curve is not a prediction of near‑term mass automation; it is a stylised snapshot of what a handful of 2023 experts imagined LLMs might eventually help with. Treating it as destiny is a category error.

AI will reshape work, but mostly by reconfiguring tasks, not by instantly erasing professions. The challenge for Europe and beyond is to build better, more grounded ways of measuring that change. Before you let any chart tell you that 80% of your job is doomed, ask: whose assumptions are hiding under that blue area?
