Predictive mathematics is already highly accurate and useful for many kinds of problems.
As one example, we can use mathematical models to predict where the planets in the solar system will be.
The problem with LLM hallucinations is not a general limitation of mathematics or linear algebra.
The problem is that LLMs fall into bullshit, in the sense of Harry Frankfurt's On Bullshit: both truthtellers and liars care about what the truth actually is, but bullshitters simply don't care whether what they say is true. LLMs end up spouting bullshit because bullshit happens to be a pretty good solution to the natural language problem, and there's already a good amount of bullshit in the training data.
LLM proponents believed that if you throw enough compute at the problem of predicting the next token, the model will be forced to learn logic, math, and everything else in order to keep optimizing that prediction. The bullshit already present in natural language prevents this from happening, because bullshit satisfies the objective function at least as well as true content does.
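To make "the objective function" concrete, here is a minimal sketch of what next-token training minimizes, using PyTorch with made-up tensors standing in for a real model and real text. The point is that the loss only rewards assigning high probability to whatever token actually came next in the training data; nothing in it rewards being true.

```python
import torch
import torch.nn.functional as F

# Minimal sketch of the next-token training objective (hypothetical numbers,
# not a real model). For each position, the model produces a score for every
# token in the vocabulary; training minimizes the cross-entropy between those
# scores and the token that actually appeared next in the training text.
vocab_size = 50_000
seq_len = 16

logits = torch.randn(seq_len, vocab_size)             # stand-in for model outputs
next_tokens = torch.randint(vocab_size, (seq_len,))   # stand-in for the real next tokens

loss = F.cross_entropy(logits, next_tokens)
print(loss.item())  # this number is all that training pushes down:
                    # "likely under the training data", not "true"
```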
LLMs take the idea of bullshit even further: the model has no concept of truth or facts at all. It can only pick the most likely next token given the sequence it has seen so far.
A perfect illustration of this, for me personally, came early in the LLM hype cycle (sometime in 2023, maybe), when I was playing around with an autocomplete example. It completed something like “Paris is the capital of France” with a high degree of confidence, which seems impressive until you mess with it: change the wording slightly to a different city, and the completion still comes out with a high degree of confidence.
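For anyone who wants to poke at this themselves, here is a rough sketch of that kind of experiment, using GPT-2 via the Hugging Face transformers library as a stand-in model; the exact model and prompts are just for illustration, not a reconstruction of the original session.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Rough sketch of probing next-token "confidence". GPT-2 is a stand-in
# model and the prompts below are illustrative.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def top_next_token(prompt: str):
    """Return the model's most likely next token and its probability."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # scores for the next position
    probs = torch.softmax(logits, dim=-1)
    top_prob, top_id = probs.max(dim=-1)
    return tokenizer.decode(int(top_id)), top_prob.item()

# The model reports whatever continuation is most likely; it is not
# checking the claim against anything.
for prompt in ["Paris is the capital of", "Lyon is the capital of"]:
    token, prob = top_next_token(prompt)
    print(f"{prompt!r} -> {token!r} (p={prob:.2f})")
```

Whatever completions come out, the probabilities reported here are statements about the training data, not about geography.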