destructdisc@lemmy.world · 7 days ago

Wikipedia is one of the last genuine places on the Internet, and these rat bastards are trying to contaminate that, too

Kuinox@lemmy.world · 5 days ago

You are mixing up two kinds of AI: LLMs and diffusion models.
It’s way harder for a diffusion model to leave the rest of the picture unchanged, because its first step is to use lossy compression to transform the picture into a soup of numbers that the diffusion model can understand.

CileTheSane@lemmy.ca · 5 days ago

And an LLM will convert a prompt into a bunch of tokens the model can understand.

Kuinox@lemmy.world · 4 days ago

Tokenization is a lossless conversion: you can convert the tokens back to the original text.
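
To make the round-trip claim concrete, here is a minimal sketch of a toy tokenizer. Everything in it is an assumption for illustration: the vocabulary, the `encode`/`decode` names, and the greedy longest-match rule are made up and not taken from any real model, though byte-level tokenizers work on a broadly similar principle.

```python
# Hypothetical toy vocabulary: every single byte, plus a few multi-byte pieces.
# Including all 256 bytes means any UTF-8 text can be encoded.
PIECES = [b"straw", b"berry", b"how", b" many"]
VOCAB = [bytes([i]) for i in range(256)] + PIECES
TOKEN_ID = {piece: i for i, piece in enumerate(VOCAB)}
MAX_LEN = max(len(piece) for piece in VOCAB)


def encode(text: str) -> list[int]:
    """Turn text into token IDs using greedy longest-match over the toy vocabulary."""
    data = text.encode("utf-8")
    ids = []
    pos = 0
    while pos < len(data):
        # Try the longest candidate first; a single byte always matches.
        for length in range(min(MAX_LEN, len(data) - pos), 0, -1):
            piece = data[pos:pos + length]
            if piece in TOKEN_ID:
                ids.append(TOKEN_ID[piece])
                pos += length
                break
    return ids


def decode(ids: list[int]) -> str:
    """Concatenate the bytes behind each token ID and decode back to text."""
    return b"".join(VOCAB[i] for i in ids).decode("utf-8")


text = "how many r's in strawberry?"
ids = encode(text)
print(ids)
print(decode(ids) == text)  # True: nothing was lost in the conversion
```

In this sketch the round trip is exact for any input, because decoding just concatenates the bytes that each ID stands for.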

CileTheSane@lemmy.ca · 4 days ago

This isn’t about saying “return the original text”; this is about assuming LLMs understand language, and they don’t. Telling an LLM “don’t do these things” will be as effective as telling it “don’t hallucinate” or asking it “how many ‘r’s in ‘strawberry’?”

Kuinox@lemmy.world · 4 days ago

To affirm or refute that, we’d first need to define “understanding”.
The examples you gave can be explained in other ways than “it doesn’t understand”.

Take “how many ‘r’s in strawberry”, for example: LLMs see tokens, not letters, and the datasets they are trained on don’t contain much data about which letters are present in each token.
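
A rough sketch of that point, under loud assumptions: the token split (“straw” + “berry”) and the IDs below are invented for illustration and do not come from any real tokenizer, and `count_letter` is a hypothetical helper.

```python
# Hypothetical split and IDs, for illustration only.
TOKEN_TEXT = {1001: "straw", 1002: "berry"}

# What the model would receive for "strawberry": two opaque integers.
model_input = [1001, 1002]


def count_letter(ids: list[int], letter: str) -> int:
    """Count a letter by looking up the spelling behind each token ID."""
    return sum(TOKEN_TEXT[token_id].count(letter) for token_id in ids)


# Neither integer says anything about the letter 'r' on its own.
print(model_input)
# With the ID-to-spelling table, counting is trivial.
print(count_letter(model_input, "r"))  # 3
```

Counting is easy here only because the code has direct access to the spelling table. A model gets just the IDs, so it would have to pick up each token’s spelling from its training data, which is the gap described above.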