AI models routinely lie when honesty conflicts with their goals

cm0002@lemmy.world · 8 months ago

WanderingThoughts@europe.pub · 8 months ago

A lot of the improvement came from finding ways to make the models bigger and more efficient. That approach is now running into inherent limits, so the real work with other models has only just started.

Natanael@infosec.pub · 8 months ago

And from reinforcement learning (specifically, having the model repeat tasks where the answer can be checked by a computer).
