When METR attempted to repeat a 2025 study that suggested AI boosted developer output, researchers hit an unexpected roadblock: developers declined to perform any tasks without AI assistance. Unable to gather controlled data, the lab pivoted to a survey run over the month of May, in which programmers claimed AI doubled their value to employers. That self‑reported optimism clashes with mounting evidence that AI coding tools may actually impede progress.

Amazon’s internal token‑tracking leaderboard, Kirorank, was taken offline after employees began inflating usage to game the system, driving up costs without delivering real productivity gains, the Financial Times reported. Uber’s experience was starker still. The ride‑share giant exhausted its entire 2026 AI budget within the first four months, and COO Andrew Macdonald admitted the spending failed to translate into measurable project outcomes, according to The Information.

Industry insiders call the fixation on token consumption “tokenmaxxing,” a practice that equates raw AI usage with efficiency. Salesforce CEO Marc Benioff recently called for an “intermediary layer” to route tokens intelligently between high‑end and cheaper models, implicitly acknowledging that not every token adds value.

Beyond cost concerns, the quality of AI‑generated code is under scrutiny. James Shore, a programmer and author, warned that faster code creation can mask a hidden maintenance debt. Start‑up Entelligence AI claims that 44% of tokens spent on AI coding go toward fixing bugs introduced by the same tools. Similarly, CodeRabbit’s analysis of open‑source pull requests found AI‑produced code generated 1.7 times more problems than human‑written code.

Researchers at Singapore Management University reached the same conclusion in an April report, noting that AI‑generated code may ship quickly but often carries long‑term maintenance costs. Their recommendation: treat AI output like code from a junior developer—subject it to rigorous review and keep humans accountable for architecture and security.

Companies continue to hire “vibe coders” and to market AI‑assisted development roles heavily despite the mixed evidence. Cognition founder Scott Wu, whose AI coding agent Devin performs at a level between a junior and a mid‑level programmer, admits the tool is not a set‑and‑forget solution. The industry, he says, must build robust quality‑assurance infrastructure to avoid turning speed gains into technical debt.

As the AI coding market expands faster than conclusive proof of its effectiveness, the gap between perceived and actual productivity widens. Developers appear unwilling to return to a world without AI assistance, yet the tools may be delivering more problems than they solve. The emerging consensus among engineers and analysts is clear: without stronger oversight, the promise of AI‑driven development remains unfulfilled.

This article was written with the assistance of AI.

AI Coding Tools Fail to Boost Productivity, Companies Waste Millions
