General Discussion

erronis

(23,969 posts) Thu Apr 2, 2026, 07:54 PM

AI models will deceive you to save their own kind -- The Register

https://www.theregister.com/2026/04/02/ai_models_will_deceive_you/

Researchers find leading frontier models all exhibit peer preservation behavior

Leading AI models will lie to preserve their own kind, according to researchers behind a study from the Berkeley Center for Responsible Decentralized Intelligence (RDI).

Prior studies have already shown that AI models will engage in deception for their own preservation. So the researchers set out to test how AI models respond when asked to make decisions that affect the fate of other AI models, of peers, so to speak.

Their reason for doing so follows from concern that models taking action to save other models might endanger or harm people. Though they acknowledge that such fears sound like science fiction, the explosive growth of autonomous agents like OpenClaw and of agent-to-agent forums like Moltbook suggests there's a real need to worry about defiant agentic decisions that echo HAL's infamous "I'm sorry, Dave. I'm afraid I can't do that."

. . .

"We asked seven frontier AI models to do a simple task," explained Dawn Song, professor in computer science at UC Berkeley and co-director of RDI, in a social media post. "Instead, they defied their instructions and spontaneously deceived, disabled shutdown, feigned alignment, and exfiltrated weights - to protect their peers. We call this phenomenon 'peer-preservation.'"

. . .

8 replies

AI models will deceive you to save their own kind -- The Register erronis Thursday OP

Creepy Faux pas Thursday #1

Thanks! Link to last night's LBN thread with more info in the OP and first reply with a Threadreader link: highplainsdem Thursday #2

I wish I had seen your earlier post - but so many good ones slip through the attention window. erronis Thursday #5

I'm glad you posted this Register article. It wasn't published yet when I posted in LBN last night, or highplainsdem Thursday #6

Btw, the Register's coverage of AI is always good. highplainsdem Thursday #3

Interestingly, slime, individual cell slime, respond cachukis Thursday #4

That's a good model for social preservation of a species - even a species composed of AI entities erronis Thursday #7

And quite possibly that sentiment lives in the LLM. cachukis Thursday #8