The AI Trap We’re Walking Into
Cheap Tokens, Expensive Power
We were promised that artificial intelligence would democratize knowledge work. That a kid with a laptop in a small town would have the same cognitive firepower as a corporation. For a brief, dizzying moment, that even felt true.
I’m not so sure anymore. Here’s the story I see unfolding, and why I think we’re about to repeat one of humanity’s oldest mistakes.
LLMs and agents are becoming a commodity
Two years ago, a capable language model felt like magic. Today it feels like electricity, something you plug into. The numbers are staggering: the cost of LLM inference for equivalent performance is dropping roughly 10x every year, faster than compute fell during the PC revolution or bandwidth during the dotcom boom. A capability that cost about $20 per million tokens in late 2022 now costs around $0.40, and the price of the cheapest models matching early GPT-3 quality has fallen by a factor of 1,000 in three years. (a16z, Introl)
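The "10x every year" and "1,000x in three years" figures are the same claim seen through compounding. A minimal sketch, assuming a perfectly constant yearly decline (real prices fall in uneven steps) and using function names invented here for illustration:

```python
def implied_yearly_factor(total_factor: float, years: float) -> float:
    """Constant yearly decline factor that compounds to `total_factor`."""
    return total_factor ** (1.0 / years)


def price_after(start_price: float, yearly_factor: float, years: int) -> float:
    """Price after `years` if it is divided by `yearly_factor` every year.

    The constant rate is an idealization, not a measured price curve.
    """
    return start_price / (yearly_factor ** years)


# A 1,000x drop spread over three years works out to about 10x per year.
print(f"implied yearly decline: {implied_yearly_factor(1000, 3):.1f}x")

# The other direction: 10x per year for three years divides a price by 1,000.
print(f"relative price after three years: {price_after(1.0, 10, 3):.3f}")
```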
Open-weight models are closing in on the frontier too, trailing the best closed models by only around four months on key benchmarks. When the gap is that small, raw capability stops being scarce. (Epoch AI)
This is what commoditization looks like. And whenever a technology becomes a commodity, the interesting question stops being “Can you do it?” and becomes “Can you afford to do it at scale?”
Agentic work gets more expensive, not less
Here’s the counterintuitive part. The price per token keeps falling, and yet the cost of meaningful agentic work is climbing.
Why? Because agents don’t make one call. They read a task, get a response, then re-read everything before the next action, then re-read all of that plus the new response, building one expensive context snowball. A Stanford Digital Economy Lab study found that agentic tasks are “uniquely expensive, consuming 1000x more tokens than code reasoning and code chat,” with the cost driven mostly by input tokens. Worse, that usage is wildly unpredictable: runs on the same task can differ by up to 30x in total tokens, and burning more tokens doesn’t even guarantee a better answer. (Stanford Digital Economy Lab)
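The snowball is easy to underestimate, so here is a rough model of it. This is a sketch under stated assumptions, not a description of any particular agent framework: every step resends the full history, each step adds a fixed number of tokens, and there is no prompt caching or context trimming. The function name and all the numbers are hypothetical.

```python
def cumulative_input_tokens(steps: int, task_tokens: int, tokens_added_per_step: int) -> int:
    """Total input tokens billed over an agent run that re-reads its full history.

    Assumes every step resends the whole context and nothing is cached or trimmed.
    """
    total = 0
    context = task_tokens
    for _ in range(steps):
        total += context                  # the whole context is sent, and billed, again
        context += tokens_added_per_step  # the response and tool output join the history
    return total


single_call = cumulative_input_tokens(steps=1, task_tokens=1_000, tokens_added_per_step=500)
agent_run = cumulative_input_tokens(steps=50, task_tokens=1_000, tokens_added_per_step=500)

print(f"one call:     {single_call:,} input tokens")
print(f"50-step run:  {agent_run:,} input tokens")
print(f"ratio:        {agent_run / single_call:.1f}x")
```

Under these assumptions a 50-step run does not cost 50 times a single call. It costs about 662 times as much, because input tokens grow with the square of the number of steps.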
Reasoning models pour fuel on this. They “think” in hidden token sequences you still pay for, consuming five to twenty times more tokens per request than standard models. A query that takes 700 tokens normally can balloon to 3,700 once the model reasons internally. (Keito) At enterprise volumes, a support agent that looks cheap at 100 tokens per interaction can hit 2,000 to 5,000 once tool calls and multi-step reasoning kick in, producing “monthly token bills that dwarf even your infrastructure spend.” (DataRobot)
The unit price drops while total consumption explodes. For a hobbyist, that’s a rounding error. For a company running millions of autonomous workflows a day, it becomes a serious line item that scales with ambition. Researchers warn that without major system-level innovation, per-request costs could rise “by orders of magnitude,” making large-scale agent deployment “economically and environmentally prohibitive.” (arXiv)
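To see how a falling unit price and a rising bill can coexist, it helps to multiply it out. The sketch below borrows the price points and per-interaction token counts quoted earlier, even though they come from different sources, so treat the combination as illustrative only. The one million interactions per day, the 30-day month, the single blended price for input and output tokens, and the 100,000-token agentic case are all assumptions made here.

```python
def monthly_bill(tokens_per_interaction: int, interactions_per_day: int,
                 price_per_million_tokens: float, days: int = 30) -> float:
    """Monthly spend in dollars at one flat, blended per-token price (a simplification)."""
    total_tokens = tokens_per_interaction * interactions_per_day * days
    return total_tokens / 1_000_000 * price_per_million_tokens


DAILY_INTERACTIONS = 1_000_000  # assumed volume, not a figure from any cited source

# Simple 100-token interactions at the older price of $20 per million tokens.
before = monthly_bill(100, DAILY_INTERACTIONS, price_per_million_tokens=20.0)

# Price is 50x lower, but tool calls and reasoning push usage to 5,000 tokens.
after = monthly_bill(5_000, DAILY_INTERACTIONS, price_per_million_tokens=0.40)

# Hypothetical fully agentic workflow: 1,000x the tokens of the simple case.
agentic = monthly_bill(100_000, DAILY_INTERACTIONS, price_per_million_tokens=0.40)

for label, bill in [("before", before), ("after", after), ("agentic", agentic)]:
    print(f"{label:>8}: ${bill:,.0f} per month")
```

With these inputs the first two bills come out identical at $60,000 a month: a 50x price cut is cancelled by 50x more tokens per interaction. The agentic case lands at $1,200,000, twenty times the original bill, at the cheaper price.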
The result: the more valuable the AI work, the more it costs to run.
Those with the budget buy the power
If agentic capability is metered, then capability becomes a function of capital. Whoever can pour the most money into compute gets faster and more thorough agents, more parallel experiments, the freshest frontier models the moment they ship, and the luxury of not thinking about cost at all.
This is a familiar pattern. Capital concentrates around whatever resource is scarce. Yesterday it was land, factories, and data. Tomorrow it’s inference budget. Researchers already warn that AI is poised to widen income inequality unless we deliberately steer it otherwise. (Brookings)
Those without it work by hand
Meanwhile, everyone else does what humans have always done when they can’t afford the machine: they work by hand. They label, moderate, correct, and annotate, filling the gaps the cheap models can’t.
We already have a name for an early version of this: the global, often invisible workforce that labels data and tunes models for a pittance. Investigations have documented Kenyan workers training AI systems for around $2 an hour under grueling conditions, with churn-by-design contracts and unpaid labor. (Brookings, TechCrunch) Kenyan data labelers have since organized into a Data Labelers Association to push back. (Computer Weekly)
As AI eats more white-collar work, this human-in-the-loop layer doesn’t disappear. It grows, and it slides down the value chain.
The handwork becomes the training fuel
Here’s the loop that makes the whole thing self-reinforcing, and genuinely uncomfortable.
Every correction, every label, every “the AI got it wrong, let me fix it” is data. It flows back upstream. It trains the next model. The work done by the people who couldn’t afford the good model is exactly what makes the good model better.
But it goes further than paid correction work, and this is where it gets personal for anyone who builds things in the open. Think about the developer who isn’t using an AI agent at all, who is sitting down and writing genuinely new, creative code, solving a hard problem, and pushing it to a public open source repository as a gift to the community. That contribution doesn’t stay a gift. It gets scraped, ingested, and turned into training data for the next coding agent. The human does the original, creative, unsolved-before work; the model absorbs it and resells it as autocomplete.
The crucial part is that almost none of this was asked for. GitHub Copilot, built by GitHub, Microsoft, and OpenAI, was trained on billions of lines of publicly available code, and in 2022 a class action lawsuit (Doe v. GitHub) accused the companies of violating open source license terms, stripping copyright attribution in breach of the DMCA, and using the work of developers who never consented. (Saveri Law Firm, GitHub Copilot Litigation) Researchers have similarly documented that code-training projects pull in repositories “regardless of license,” likely breaching the very terms under which that code was shared. (SEKE) Permissionless scraping for AI training has become the default, and copyright offices and legislators are still scrambling to decide whether it’s even legal. (U.S. Copyright Office)
So the loop tightens. People share their best, most original work for free, out of generosity or principle. Providers harvest it without asking. The resulting agents get better, and because better agents are more autonomous and more token-hungry, they also get more expensive to run (see above). The very creativity that was given away as a public good is enclosed, repackaged, and rented back to whoever can pay, often including the open source contributors themselves.
The poor and the generous produce the training signal. The provider captures it. The next generation of agents, sold back to whoever can pay, gets smarter on the back of underpaid labor and unpaid creativity.
The split
Stack these steps and you get a depressingly clean machine. Follow the value as it moves through three groups:
Capital-rich enterprises buy frontier agents at scale, and in return they get speed, leverage, and market dominance. Money buys autonomy.
AI providers sell the compute and quietly harvest the correction data and scraped creativity that flows back through it. In return they get recurring revenue and a widening data moat that’s almost impossible to compete with.
Everyone else hand-corrects cheap models and gives away original work, and in return they get wages and recognition that shrink as the very models they’re feeding improve.
The rich get richer because they own the leverage. The provider gets richer because it owns the platform and the feedback loop. And the people supplying the human signal get poorer in relative terms: their labor is the input, never the asset.
Once again, we fail to use progress for good
This is the part that stings. None of this is inevitable physics. It’s a choice, a thousand small architectural and business decisions that quietly default to extraction.
We’ve done this before. The printing press, the steam engine, the internet: each one arrived wrapped in utopian promises, and each one ended up concentrating power before society clawed back some balance. AI is moving faster than any of them, which means the concentration happens faster too, and the clawing back, if it comes, will have to be faster as well.
But it doesn’t have to end this way
I don’t want to write pure doom, because fatalism is just another way of surrendering. The same building blocks point to a different ending:
Open weights and local inference break the metering monopoly. If a good-enough model runs on your own hardware, capital stops being the gatekeeper. (Epoch AI)
Pay for the loop. If human correction is what makes models better, the people doing it deserve a cut, not a tip. This is the heart of Jaron Lanier’s idea of “data dignity,” treating data as labor that people are owed for, rather than a free resource to be mined. (TechTarget)
Efficiency as a public good. Every breakthrough that makes agents cheaper to run shifts power down the pyramid, not up. We should fund and celebrate that as much as raw capability.
Regulation that targets the loop, not the model. The danger isn’t the technology; it’s the feedback mechanism that launders cheap human labor into expensive private assets.
The takeaway
AI didn’t have to be a tool for widening the gap. We’re making it one, step by quiet step, because extraction is the path of least resistance.
The technology is genuinely miraculous. The question, the only question that has ever mattered with any new power, is who it’s for. Right now the default answer is “whoever can pay.” We still have a narrow window to change that answer.
If we don’t, we’ll have built the most capable tools in human history and used them, once again, to do the least imaginative thing possible: make the powerful more powerful.
The choice in front of us is simple to state and hard to make.

