“Once the “hallucination problem” becomes relatively solved, the reliance on external sources for credibility will diminish.”
— I think this is where the devil lies. Hallucinations aren’t just a bug of LLMs; they’re a fundamental part of this architecture and model paradigm. At the end of the day, LLMs are trying to maximize the likelihood of whatever they output, so there will always be cases where they say something incorrect, especially when reality doesn’t match up with what the model expects.
Chain-of-thought models do try to address this by adding enough context to change what the model considers a “likely” continuation, but hallucinations are by and large here to stay.
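To make the likelihood argument concrete, here’s a toy sketch (a made-up bigram counter, not a real LLM or any actual library) of how the single most likely continuation can be wrong even when the correct fact appears in the training data:

```python
from collections import Counter, defaultdict

# Toy corpus: "born in paris" dominates the statistics,
# regardless of who the subject of the sentence is.
corpus = (
    "victor hugo was born in paris . "
    "edith piaf was born in paris . "
    "claude monet was born in paris . "
    "marie curie was born in warsaw . "
).split()

# Count bigram continuations: P(next | prev) ~ count(prev, next) / count(prev)
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def most_likely_next(word: str) -> str:
    """Greedy decoding: return the single highest-likelihood continuation."""
    return bigrams[word].most_common(1)[0][0]

# The model conditions only on "in", so the statistically likely answer wins:
# it completes "marie curie was born in" with "paris" (3/4 of the mass),
# even though the corpus itself contains the correct fact ("warsaw").
print(most_likely_next("in"))  # -> "paris"
```

Conditioning on more of the prefix (here, the subject’s name) would shift the distribution toward the correct fact, which is roughly the lever chain-of-thought pulls: extra context changes what counts as a likely continuation, but the objective is still likelihood, not truth.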
Great perspective! I also think it’s possible that “perhaps hallucination will never be fixed and remain a necessary evil in the way LLMs are structured.”
Average hallucination rates seem to be 10-20%, and they actually seem to worsen with more complex tasks. Curious whether the rates can be reduced without sacrificing the utility of LLMs in the future, especially for specific verticals.
Fascinating analysis, Dan
Appreciate that! I'd love to hear your take!
Interesting takes, Dan!