Hey all,
I’m losing my wits trying to track down an article I saw on either Lemmy or Reddit, and I figured this community might know it.
It was about a research team somehow cracking open an LLM and looking at the way it does calculations, and I remember there was a sort of flowchart in the article, with the LLM grouping interim results into weird-ass categories, like “between 26-ish and 34-ish”, and then using a separate process for figuring out the last digit.
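To make the idea concrete, here’s a toy Python sketch of the two-path scheme as I remember it. This is purely my own illustration and not from the article: the function names, the `noise` parameter, and the window of 10 are all made up, and a real model obviously doesn’t run code like this. One path only knows the sum “-ish” (within a window of ten), the other knows only the exact last digit, and the answer is the one number in the window that ends in that digit.

```python
def last_digit_path(a: int, b: int) -> int:
    """Exact path: only looks at the ones digits of the operands."""
    return (a % 10 + b % 10) % 10


def combine(rough_estimate: int, last_digit: int) -> int:
    """Pick the one number in [rough_estimate - 4, rough_estimate + 5]
    that ends in last_digit. Ten consecutive integers always contain
    exactly one such number."""
    low = rough_estimate - 4
    return low + (last_digit - low) % 10


def toy_add(a: int, b: int, noise: int = 0) -> int:
    """Add two non-negative integers from a fuzzy magnitude plus a last digit.

    `noise` is a hypothetical stand-in for how far off the "-ish"
    estimate is; the result stays correct while -5 <= noise <= 4.
    """
    rough_estimate = a + b + noise  # fuzzy path: "somewhere around 92"
    return combine(rough_estimate, last_digit_path(a, b))


print(toy_add(36, 59, noise=-3))  # rough path says ~92, last digit is 5 -> 95
```

In this toy version, if the last-digit path slips (say it reports 4 instead of 5), the result is 94: right ballpark, wrong final digit, which is the kind of mistake that question was about.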
I think the article might have been linked in response to a question about why LLMs mess up the last digit in number calculations.
Does any of this ring a bell for anyone? I’ve tried searching for it every way I can think of to phrase the idea, but all I get is a flood of ads and guides about “how to do math in LLMs”.
I think it’s both, but as I said, I didn’t read the papers properly. They seem to say there’s more to it. But obviously LLMs are made to regurgitate stuff, and they’re fed textbooks, homework assignments and scientific papers. There is some effort to force them not to just memorize stuff. But obviously(?) that’s the first thing they do. I don’t see why they wouldn’t just memorize and regurgitate what they can, and at the same time come up with a way to predict results that are less common in the training dataset. Whether that’s proper math or not is a different story.