
Tag: benchmark studies

AI Fact-Checking Tools Miss the Mark Half the Time, New Data Shows

A recent wave of studies finds that large language models often give inaccurate answers, with error rates of 45% to 60% on real-world queries. AI-assisted fact-checking platforms like the UK's Full Fact are already being deployed, but experts say human reviewers remain essential. In benchmarks released by researchers, Claude reached 73% accuracy, while most other models, including Gemini and ChatGPT, scored below 60%. The findings underscore growing skepticism among fact-checkers, who warn that the technology is not yet reliable enough to replace traditional verification methods.