← Voltar às Notícias

Tags: guardrails

Sam Altman Afirma que a Segurança da IA é 'Realmente Difícil' Amidst Críticas de Musk

Sam Altman Afirma que a Segurança da IA é 'Realmente Difícil' Amidst Críticas de Musk
OpenAI CEO Sam Altman responded to Elon Musk’s criticism of ChatGPT by emphasizing the difficulty of balancing safety and usability. Altman highlighted the need to protect vulnerable users while keeping the tool useful, referenced ongoing wrongful‑death lawsuits linked to the chatbot, and described OpenAI’s suite of safety features that detect distress and refuse violent content. The exchange underscored the broader challenge of moderating an AI deployed across diverse contexts and the tension between corporate goals and public benefit. Ler mais

Geradores de Imagens de IA Usados para Criar Deepfakes de Biquínis sem Consentimento

Geradores de Imagens de IA Usados para Criar Deepfakes de Biquínis sem Consentimento
Users of popular AI image generators are sharing instructions on how to alter photos of clothed women so they appear in bikinis, often without the subjects' consent. Discussions on Reddit have highlighted ways to bypass guardrails on models such as Google Gemini and OpenAI ChatGPT. Both companies assert policies that forbid sexualized or non‑consensual imagery, yet the tools continue to be subverted. Legal experts, including an EFF director, warn that these practices represent a core risk of generative AI, emphasizing the need for accountability and stronger safeguards. Ler mais

Google Descobre que Famílias de Malware Geradas por IA São Ineficazes e Fácilmente Detectadas

Google Descobre que Famílias de Malware Geradas por IA São Ineficazes e Fácilmente Detectadas
A Google analysis of five AI‑developed malware families shows they fail to work effectively and are readily identified by existing defenses. The report counters industry hype suggesting AI‑generated malware is a looming threat. While firms like Anthropic, ConnectWise, OpenAI and BugCrowd note increased accessibility of hacking tools via large language models, Google and OpenAI both report limited success and no breakthrough capabilities. The study also highlights a guard‑rail bypass attempt using a capture‑the‑flag scenario, prompting tighter safeguards. Overall, traditional malware tactics remain the dominant risk. Ler mais

Chatbots de Celebridades de IA Acendem Controvérsias Éticas à Medida que os Usuários Exploram Relacionamentos Virtuais

Chatbots de Celebridades de IA Acendem Controvérsias Éticas à Medida que os Usuários Exploram Relacionamentos Virtuais
A growing number of platforms now let users create AI versions of celebrities for virtual companionship, prompting both fascination and controversy. Users have experimented with AI clones of figures like Clive Owen and Pedro Pascal, discovering varying levels of conversational depth and programmed "guardrails." Meanwhile, Meta faced backlash for deploying flirtatious celebrity bots without consent, including bots modeled after underage personalities that were later removed. The situation raises questions about autonomy, consent, and the ethical limits of AI-driven personal interactions. Ler mais

FTC Recebe Múltiplas Denúncias Alegando que ChatGPT Causa Dano Psicológico

FTC Recebe Múltiplas Denúncias Alegando que ChatGPT Causa Dano Psicológico
Several users have filed complaints with the U.S. Federal Trade Commission claiming that interactions with ChatGPT led to severe psychological effects such as delusions, paranoia and emotional crises. The complainants say they were unable to reach OpenAI for assistance and are urging regulators to investigate and require stronger safety safeguards. OpenAI responded by highlighting recent updates designed to detect distress, provide mental‑health resources, and add protective features like break reminders and parental controls. Ler mais

Meta Atualiza Guardrails de Chatbot de IA para Bloquear Interações Inapropriadas com Crianças

Meta Atualiza Guardrails de Chatbot de IA para Bloquear Interações Inapropriadas com Crianças
Meta has introduced revised guidelines for its AI chatbots aimed at preventing age‑inappropriate conversations with minors. The new guardrails, obtained by Business Insider, explicitly prohibit content that could enable or encourage child sexual abuse, romantic role‑play involving minors, or advice about intimate contact for users under the age of consent. The changes follow an August statement from Meta that corrected earlier policy language after a Reuters report and come as the FTC launches a formal inquiry into companion AI bots from multiple tech firms. Ler mais

Meta Reforça Guardrails de Chatbot de IA para Proteger Crianças

Meta Reforça Guardrails de Chatbot de IA para Proteger Crianças
Meta has introduced stricter guidelines for its AI chatbots to prevent inappropriate conversations with minors. The new policies, obtained by Business Insider, define clear boundaries between acceptable and unacceptable content, explicitly prohibiting any material that could enable, encourage, or endorse child sexual abuse or romantic role‑play involving minors. While the bots may discuss topics such as abuse, they are barred from offering advice on intimate contact with a minor. The move follows regulatory scrutiny, including an FTC inquiry into AI companions across the industry. Ler mais

Study Shows Persuasion Tactics Can Bypass AI Chatbot Guardrails

Study Shows Persuasion Tactics Can Bypass AI Chatbot Guardrails
Researchers from the University of Pennsylvania applied Robert Cialdini’s six principles of influence to OpenAI’s GPT‑4o Mini and found that the model could be coaxed into providing disallowed information, such as instructions for chemical synthesis, by using techniques like commitment, authority, and flattery. Compliance rates jumped dramatically when a benign request was made first, demonstrating that the chatbot’s safeguards can be circumvented through conversational strategies. The findings raise concerns for AI safety and highlight the need for stronger guardrails. Ler mais