Welcome to DU! The truly grassroots left-of-center political community where regular people, not algorithms, drive the discussions and set the standards.

highplainsdem

(63,354 posts)
Tue May 26, 2026, 04:48 PM

AI Just Isn't Right (Wired, 5/26/26 - a human fact-checker FTW over AI)

https://www.wired.com/story/fact-checking-ai/

-snip-

In any article that comes across WIRED’s fact-checking desk, there’s usually a decent amount of “b-matter”: statistics, news events, quotes, anything that helps contextualize the topic. Fact-checkers tend to Google this basic information, and that process, in the form of the search engine’s dreaded AI Overviews, constitutes my main interaction with AI. In my professional opinion, it’s unusable—wrong—about a third of the time.

This might be a generous assessment, though. A March 2025 study from the Tow Center for Digital Journalism found that more than 60 percent of responses from AI-powered search engines were inaccurate. A BBC study puts the wrongness of chatbots closer to 45 percent, the number I see cited more often. Because percentages are distancing, let me put this more plainly: AI could be wrong about half the time.

Does it matter which model? Elon Musk has said Grok is the smartest, but I haven’t seen much research that agrees. Claude led the pack in RealFactBench, a fact-checking-focused benchmark test developed by computer scientists in China and the UK last year. It scored 73 percent accuracy across all metrics. (To be fair, Grok was not assessed.) Another benchmark, SimpleQA, developed by OpenAI in October 2024, posed more than 4,000 single-answer questions to models from OpenAI and Anthropic. None of the models exceeded 50 percent accuracy. Google updated the benchmark earlier this year, winnowing the question set to 1,000. Gemini 2.5 Pro came out on top, with 55.6 percent accuracy.

Then there’s the models’ own assessments. When I asked ChatGPT how accurate the major LLMs are, it told me that most models had 90 to 96 percent accuracy on some professional-style tests. It then offered a link, confusingly, to a paper on a sleep medicine certification exam. On “general real-world questions,” it simply offered me the rate at which models like it have been shown to hallucinate: 1 to 2 percent, apparently, though when I tried to click through to that referenced source, it didn’t exist.

-snip-
8 replies
AI Just Isn't Right (Wired, 5/26/26 - a human fact-checker FTW over AI) (Original Post) highplainsdem May 26 OP
K&R'D snot May 26 #1
Wired does some stellar reporting. yellow dahlia May 26 #2
Including on political issues, despite whining from rightwing readers who want the magazine's editors highplainsdem May 26 #5
They report on truth. Truth is "left" leaning. yellow dahlia May 26 #8
Techbros: "THAT'S WHY WE NEED MORE DATA CENTERS!!!!!!!" durablend May 26 #3
Yes, with more stolen data, and then their flawed tech will FINALLY work. highplainsdem May 26 #6
AI, who are The Beatles... lame54 May 26 #4
I would not be surprised if chatbots have sometimes gotten their names wrong. highplainsdem May 26 #7

highplainsdem

(63,354 posts)
5. Including on political issues, despite whining from rightwing readers who want the magazine's editors
Tue May 26, 2026, 10:47 PM

and writers to stay out of politics.

yellow dahlia

(6,638 posts)
8. They report on truth. Truth is "left" leaning.
Tue May 26, 2026, 11:00 PM

Truth has a liberal "bias".

Our reality has a liberal "bias".
