- cross-posted to:
- [email protected]
Seems like an invitation to me.
Archive link: https://web.archive.org/save/https%3A%2F%2Fwww.anthropic.com%2Fresearch%2Fsmall-samples-poison
It’s hard to please everybody with an answer when you also train on the responses to said answers.
For anyone interested, Computerphile did an episode about sleeper agents in models.
Sleeper Agents in Large Language Models - Computerphile