OpenAI is reportedly asking contractors to upload real work from past jobs

January 10, 2026
OpenAI logo on a glass office building

OpenAI is leaning even harder on human workers to train its AI — and it reportedly wants pieces of their old jobs to do it.

According to documents obtained by Wired and interviews conducted by the magazine, OpenAI and training data startup Handshake AI are asking third-party contractors to upload real work product from their past and current roles. The material is meant to feed systems like ChatGPT so they can better understand, and eventually automate, white-collar tasks.

What OpenAI is asking for

In an internal presentation cited by Wired, OpenAI tells contractors to:

  • Describe tasks they’ve performed at other jobs
  • Upload examples of “real, on‑the‑job work” they’ve “actually done”

The company isn’t looking for summaries. It explicitly asks for the original outputs, including:

  • Word documents
  • PDFs
  • PowerPoint decks
  • Excel spreadsheets
  • Images
  • Code repositories

That’s exactly the kind of material that often contains a mix of corporate strategy, internal processes and client data.

A safety net with a lot of holes

OpenAI and Handshake AI do tell contractors not to upload anything sensitive. According to Wired, the presentation instructs workers to delete proprietary and personally identifiable information (PII) before submitting files.

To help with that, OpenAI reportedly points contractors to a ChatGPT‑based tool called “Superstar Scrubbing,” positioned as a way to strip out confidential details before the data is ingested.
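Wired doesn't describe how "Superstar Scrubbing" actually works under the hood. As a rough illustration only, here is a minimal sketch of what a chat-model-based redaction pass could look like; the model choice, prompt and scrub_document helper are hypothetical assumptions, not details from OpenAI's tool:

  # Hypothetical sketch of a chat-model-based scrubber. This is NOT
  # OpenAI's "Superstar Scrubbing" tool; Wired does not publish its internals.
  from openai import OpenAI

  client = OpenAI()  # reads OPENAI_API_KEY from the environment

  SCRUB_PROMPT = (
      "Rewrite the document below, replacing every person name, company name, "
      "client identifier, email address, phone number, and anything that looks "
      "like proprietary business information with [REDACTED]. "
      "Preserve all other text exactly."
  )

  def scrub_document(text: str) -> str:
      """Ask the model to redact PII and confidential details from text."""
      response = client.chat.completions.create(
          model="gpt-4o-mini",  # illustrative model choice
          messages=[
              {"role": "system", "content": SCRUB_PROMPT},
              {"role": "user", "content": text},
          ],
      )
      return response.choices[0].message.content

  # Example: the client name and email address should come back as [REDACTED]
  print(scrub_document("Q3 pricing deck for Acme Corp, sent by jane.doe@acme.com"))

Even in a toy version like this, the core weakness is visible: the model only redacts what the prompt anticipates, and nothing in the loop verifies what slips through.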

On paper, that sounds like a compliance layer. In practice, it shifts the risk onto a group of people with:

  • Limited visibility into what their past employers consider confidential
  • Strong incentives to complete tasks quickly and keep the contract

And that’s exactly what worries legal experts.

‘Putting itself at great risk’

Intellectual property lawyer Evan Brown told Wired that any AI lab relying on this kind of pipeline is “putting itself at great risk.”

The reason: the whole system depends on contractors to make the call on what can and can’t be shared.

Brown said this approach requires “a lot of trust in its contractors to decide what is and isn’t confidential.” If a contractor misjudges — or simply doesn’t notice a stray slide, clause or client name — the lab could end up training on:

  • Trade secrets
  • Copyrighted material used outside license terms
  • Personal data that triggers privacy laws

For OpenAI, that’s not a hypothetical risk. The company is already facing lawsuits from authors, publishers and others questioning how its models were trained.

Why this data is so attractive

Behind the scenes, nearly every major AI lab is chasing the same thing Wired describes here: fresh, high‑quality, task‑level data that looks like what knowledge workers actually do all day.

That means:

  • Real reports, not synthetic examples
  • Real email threads and project plans
  • Real code and documentation

Wired’s reporting suggests OpenAI sees this as a way to train models that can:

  • Better follow complex, multi‑step instructions
  • Work across tools and formats (docs, spreadsheets, slides, repos)
  • Move closer to automating at least parts of white‑collar workflows

The more realistic the training data, the more capable the model. But realism here is tightly coupled with real organizations’ intellectual property.

A growing compliance headache

OpenAI declined to comment when Wired asked about the program.

That silence leaves a bunch of unanswered questions for:

  • Current and former employers whose documents might be uploaded
  • Regulators drafting rules on AI training data
  • Enterprise customers trying to assess vendor risk

Among them:

  • How does OpenAI audit what contractors upload?
  • Can companies detect or remove their data once it’s in the training corpus?
  • Who is liable if a contractor uploads clearly confidential material?

For now, the basic dynamic is clear: as AI companies run out of easy, web‑scale text to scrape, they’re moving deeper into the grey zone of workplace data — the slides, spreadsheets and specs that were never meant to leave the building.

And they’re betting that a mix of instructions, a “Superstar Scrubbing” tool and trust in low‑paid contractors will be enough to keep them out of legal trouble.

Wired’s reporting and Brown’s warning that OpenAI is “putting itself at great risk” both suggest that bet may not age well.
