When the METR research lab tried to repeat a 2025 study on AI‑assisted coding, it hit an unexpected roadblock: developers refused to take part unless they could use AI. The lab’s own statement explained that participants simply “do not wish to work without AI,” even for a limited experiment. That admission sparked a fresh survey in May, letting technical staff self‑report AI’s impact on their work.

According to the survey, developers feel AI makes them twice as valuable to their organizations. Yet that optimism clashes with a wave of industry reports questioning whether the productivity gains are real. Amazon recently shut down its internal token‑tracking leaderboard, Kirorank, after employees began gaming the system and inflating AI usage costs, the Financial Times reported. Uber burned through its AI budget within the first four months of 2026, and COO Andrew Macdonald admitted the spending produced no measurable uptick in projects or output.

Cost and Maintenance Issues

Beyond budget overruns, the quality of AI‑generated code is under scrutiny. James Shore, a veteran programmer, warned that speed gains could be offset by higher maintenance burdens, noting, “You write code twice as quick now? Better hope you’ve halved your maintenance costs.” A viral tweet from Aiswarya Sankar, CEO of Entelligence AI, claimed firms spend 44% of their AI tokens fixing bugs introduced by the very tools meant to accelerate development. CodeRabbit, a code‑review platform, found that AI‑produced pull requests contain 1.7 times more problems than human‑written ones, though as a vendor of review tools the company has a commercial interest in that finding.

Independent researchers from Singapore Management University echoed those concerns, publishing a report which found that AI‑generated code can embed long‑term maintenance costs into real software projects. Their findings suggest that while developers love AI assistants, the tools may be shifting effort from writing code to debugging it.

Industry leaders are proposing workarounds. Scott Wu, founder and CEO of Cognition, promotes Devin, an AI coding agent he rates somewhere between a junior and a mid‑level programmer. Wu acknowledges that Devin cannot replace human oversight and must be paired with robust quality‑assurance processes. Researchers recommend that developers treat AI output like work from a junior colleague: review it meticulously and reserve high‑level tasks such as architecture and security design for humans.

The emerging picture is one of mixed sentiment. Developers increasingly see AI as indispensable, yet companies grapple with hidden costs and potential code rot. As firms wrestle with token‑maxxing trends and the reality of AI‑driven bugs, the industry may need to rethink how it measures productivity, shifting focus from raw output speed to long‑term code health.

This article was written with the assistance of AI.