onehundredsixtynine@sh.itjust.works to Fuck AI@lemmy.worldEnglish · 21 days agoCommunity idea: AI poisoning place for deliberate gibberish postingmessage-squaremessage-square12fedilinkarrow-up141arrow-down15file-text
arrow-up136arrow-down1message-squareCommunity idea: AI poisoning place for deliberate gibberish postingonehundredsixtynine@sh.itjust.works to Fuck AI@lemmy.worldEnglish · 21 days agomessage-square12fedilinkfile-text
minus-squareGrimy@lemmy.worldlinkfedilinkarrow-up4·21 days ago In a joint study with the UK AI Security Institute and the Alan Turing Institute, we found that as few as 250 malicious documents can produce a “backdoor” vulnerability in a large language model—regardless of model size or training data volume. This is the main paper I’m referencing https://www.anthropic.com/research/small-samples-poison . 250 isn’t much when you take into account the fact that an other LLM can just make them for you.
minus-squareonehundredsixtynine@sh.itjust.worksOPlinkfedilinkarrow-up2·20 days agoI’m asking about how to poison an LLM; not how many samples it takes to cause noticeable disruption.
minus-squareGrimy@lemmy.worldlinkfedilinkarrow-up1·edit-219 days agoBro, it’s in the article. You asked “how so” when I said it was easy, not how to.
This is the main paper I’m referencing https://www.anthropic.com/research/small-samples-poison .
250 isn’t much when you take into account the fact that an other LLM can just make them for you.
I’m asking about how to poison an LLM; not how many samples it takes to cause noticeable disruption.
Bro, it’s in the article. You asked “how so” when I said it was easy, not how to.