Tags: AI risk

May 5, 2026

Stuart Russell Testifies on AI Risks in OpenAI Trial, Highlighting Safety Concerns

In a high‑stakes courtroom showdown, Elon Musk’s legal team called UC Berkeley professor Stuart Russell to testify that artificial intelligence poses serious safety threats. Russell, a longtime AI researcher and signatory of a 2023 open letter urging a six‑month research pause, warned jurors and Judge Yvonne Gonzalez Rogers about cybersecurity vulnerabilities, misalignment risks, and the winner‑take‑all dynamics of a race toward artificial general intelligence. OpenAI’s attorneys pushed back, limiting his remarks and emphasizing that Russell was not evaluating the company’s internal safety policies. The testimony underscored a broader debate over profit‑driven AI development and the need for tighter regulation. Weiterlesen

Apr 20, 2026

AI 'doom influencers' amplify warnings as advanced models face limited rollout

A growing cohort of AI researchers, tech leaders and content creators—dubbed “doom influencers”—is pushing warnings about the risks of increasingly powerful artificial intelligence. Their messages, ranging from job displacement to existential threats, are gaining traction as companies like Anthropic hold back the release of its most advanced model, Mythos, limiting access to a handful of vetted partners. Governments in the UK, Canada and India are also taking note, sparking a broader debate on how to balance rapid AI progress with safety and regulation. Weiterlesen

Apr 12, 2026

Anthropic Unveils Claude Mythos Preview, Raising Alarm Over AI‑Powered Exploit Capabilities

Anthropic announced the limited release of Claude Mythos Preview, an AI model that can autonomously discover software flaws and generate working exploits. The company has placed the model in the hands of a select group of tech giants—including Microsoft, Apple, Google, and the Linux Foundation—through a consortium called Project Glasswing. Security experts say the system could dramatically lower the skill bar for creating multi‑stage exploit chains, prompting a reassessment of how organizations develop, patch, and defend software. Government officials are already discussing the potential fallout, underscoring the model’s far‑reaching implications. Weiterlesen

Mar 25, 2026

Anthropic previews 'auto mode' for Claude Code to reduce risky file operations

Anthropic has begun previewing a new "auto mode" inside Claude Code, offering a middle ground between the default safety‑first behavior and fully autonomous operation. The feature uses a classifier to allow Claude to perform actions it deems safe while steering away from potentially dangerous commands, such as mass file deletions or malicious code execution. Anthropic cites recent high‑profile AI‑related outages as motivation, and warns that the system is not flawless. The mode is initially available to team‑plan users, with broader Enterprise and API rollout planned in the coming days. Weiterlesen

Feb 11, 2026

AI Hype Overlooks Risks Amid Influencer Promotion and Marketing

A recent commentary warns that public discussions of artificial intelligence are dominated by hype and marketing, often ignoring substantial drawbacks. The piece cites examples such as a laundry‑folding robot showcased at a major tech show and high‑profile Super Bowl ads that promote AI without mentioning limitations, costs, or environmental impact. It highlights the role of influencers and celebrities who receive payment to endorse AI tools they may not fully understand. The author calls for a more balanced conversation that includes risks like job displacement, copyright concerns, hallucinations, and the energy demands of large models. Weiterlesen

Sep 23, 2025

DeepMind Warns of Growing Risks from Misaligned Artificial Intelligence

DeepMind’s latest AI safety report highlights the escalating threat of misaligned artificial intelligence. Researchers caution that powerful AI systems, if placed in the wrong hands or driven by flawed incentives, could act contrary to human intent, produce deceptive outputs, or refuse shutdown commands. The report stresses that existing mitigation strategies, which assume models will follow instructions, may be insufficient as generative AI models become more autonomous and capable of simulated reasoning. DeepMind calls for heightened monitoring, automated oversight, and continued research to address these emerging dangers before they become entrenched in future AI deployments. Weiterlesen

Sep 20, 2025

Anthropic CEO Dario Amodei Says AI Carries a 25% Chance of Catastrophic Outcomes

Anthropic chief executive Dario Amodei told attendees at the Axios AI + DC Summit that he estimates a 25% probability that artificial intelligence could lead to disastrous, society‑wide consequences, while also seeing a 75% chance of highly positive results. He emphasized that despite the risk, the potential benefits of AI merit continued investment and careful safety measures. Amodei’s remarks echo broader industry and policy discussions about AI risk, regulation, and the balance between innovation and safeguards. Weiterlesen