Just a year after companies turned AI usage into a competitive sport, the tide is turning. Executives who once bragged about the number of tokens burned by their staff now face mounting pressure to rein in spending. The change, dubbed "token‑minimizing," reflects a hard look at budgets that have ballooned to unsustainable levels.

AT&T has begun restricting access to GitHub Copilot for certain employees, according to a report from The Information. Meta is reportedly tightening the reins on staff spending for Anthropic and other AI services, reversing a period when engineers raced to consume the most tokens. Uber, which exhausted its entire 2026 AI coding budget by April, now caps each employee at $1,500 per month per tool. Walmart has placed limits on its proprietary AI agent. Amazon has dismantled an internal leaderboard that once ranked workers by AI usage, after discovering that employees were gaming the system and driving compute costs skyward.

Companies impose caps and new monitoring tools

The trigger for these moves is simple: the bills were frightening. Some of the most AI‑obsessed firms have been spending roughly $7,500 per employee each month. Agentic tools that repeatedly call a model have doubled enterprise AI expenses even as per‑token prices have fallen. Microsoft uncovered engineers whose monthly Claude Code token spend ranged from $500 to $2,000, prompting internal reviews.
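A rough sketch of the arithmetic behind that claim: if an agentic tool calls the model several times per task, the bill can double even when the per-token price is cut in half. Every number below is hypothetical and chosen only to make the calculation easy to follow; none comes from the companies named here.

```python
# Hypothetical figures, for illustration only.
def monthly_cost(tasks, calls_per_task, tokens_per_call, price_per_million_tokens):
    """Return the monthly bill in dollars for a given usage pattern."""
    total_tokens = tasks * calls_per_task * tokens_per_call
    return total_tokens * price_per_million_tokens / 1_000_000

# Chat-style use: one model call per task at an assumed $10 per million tokens.
chat_cost = monthly_cost(tasks=1_000, calls_per_task=1,
                         tokens_per_call=2_000, price_per_million_tokens=10.0)

# Agentic use: four model calls per task, at half the assumed price.
agent_cost = monthly_cost(tasks=1_000, calls_per_task=4,
                          tokens_per_call=2_000, price_per_million_tokens=5.0)

print(chat_cost, agent_cost)  # 20.0 40.0
```

In this toy case the price per token falls by half, yet the number of calls per task quadruples, so total spend still doubles.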

To keep a lid on costs without throttling productivity, firms are turning to new monitoring solutions. Microsoft and Databricks have introduced “gateway” tools that track and cap AI spend in real time. Nvidia‑backed Factory, valued at $1.5 billion, launched a model router that automatically directs simpler tasks to cheaper or open‑source models, effectively swapping expensive frontier models for more economical alternatives.
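The two ideas in this paragraph, a spend cap and a model router, can be sketched together in a few lines. This is a minimal illustration under stated assumptions, not how Microsoft, Databricks, or Factory actually build their products: the class and function names, the prices, and the word-count routing rule are all invented, and only the $1,500 cap echoes the figure reported for Uber.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Assumed prices in dollars per million tokens; not real vendor pricing.
PRICE_PER_MILLION_TOKENS = {"frontier": 15.0, "economy": 1.0}

def choose_model(prompt: str, word_threshold: int = 200) -> str:
    """Send short prompts to the cheaper model.

    Word count is a crude, hypothetical stand-in for a real task classifier.
    """
    return "economy" if len(prompt.split()) < word_threshold else "frontier"

@dataclass
class SpendGateway:
    """Hypothetical gateway tracking spend per (employee, tool) pair."""
    cap_usd: float = 1500.0  # monthly cap per employee, per tool
    spend: dict = field(default_factory=lambda: defaultdict(float))

    def authorize(self, employee: str, tool: str, prompt: str, est_tokens: int):
        """Return the model to use, or None if the request would exceed the cap."""
        model = choose_model(prompt)
        cost = est_tokens * PRICE_PER_MILLION_TOKENS[model] / 1_000_000
        key = (employee, tool)
        if self.spend[key] + cost > self.cap_usd:
            return None  # request blocked: cap would be exceeded
        self.spend[key] += cost
        return model

gateway = SpendGateway()
print(gateway.authorize("alice", "coding-assistant", "rename this variable", 1_000))
```

A production gateway would also need to reset the counters each month and reconcile estimated tokens against actual usage, both of which this sketch leaves out.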

Box CEO Aaron Levie praised the shift, saying the company never ran token‑maxxing leaderboards and therefore avoided incentivizing wasteful behavior. In contrast, an engineering leader at Databricks noted that the AI budget for engineers remains unlimited, suggesting that some teams still operate under a token‑maxxing mindset.

The tension is clear: caps can safeguard budgets, but they may also curb the productivity gains that initially justified the spend. Microsoft CEO Satya Nadella captured the mood in a weekend essay, arguing that AI models should be interchangeable rather than dominant. He warned against a future where a few models monopolize value across every sector, a concern that resonates with companies whose software depends on the very labs they now seek to limit.

Ultimately, the move toward token‑minimizing is reshaping how enterprises approach AI. By enforcing spending caps, deploying real‑time monitoring, and favoring cheaper or open‑source models for routine tasks, firms aim to preserve the benefits of AI while preventing budget overruns. The era of unchecked token consumption may be ending, but the quest for efficient, cost‑effective AI use is just beginning.

This article was written with the assistance of AI.