State-of-the-art (SOTA) LLMs and generative models are very powerful and can do amazing things, but it is important to know and take into account their current limitations.
Hallucination:
One of the best-known limitations is hallucination: the phenomenon of producing misleading, factually incorrect, or entirely fabricated responses. Two types of hallucination are commonly distinguished:
- Factuality hallucinations: The model generates responses that are factually incorrect or entirely fabricated.
- Faithfulness hallucinations: The model generates content that does not align with the user's instructions or the given context. (reference)
Models hallucinate, and you have to deal with that.
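One pragmatic way to deal with it is to ground the model in explicit context and to sanity-check the output against that context before using it. The sketch below illustrates the idea; `generate()` is a hypothetical stand-in for whatever LLM client you actually use, and the word-overlap check is deliberately naive (real systems use entailment checks or citations).

```python
# Sketch of grounding + a naive faithfulness check, assuming a hypothetical
# `generate()` function that wraps a real LLM call.

def generate(prompt: str) -> str:
    """Placeholder for a real LLM call; returns a canned answer so the
    sketch runs standalone."""
    return "The Eiffel Tower was completed in 1889."

GROUNDED_PROMPT = """Answer the question using ONLY the context below.
If the context does not contain the answer, reply exactly: I don't know.

Context:
{context}

Question: {question}
Answer:"""

def grounded_answer(context: str, question: str) -> str:
    answer = generate(GROUNDED_PROMPT.format(context=context, question=question))
    if answer.strip() == "I don't know.":
        return answer
    # Naive faithfulness check: refuse answers whose wording is barely
    # supported by the context at all.
    context_words = set(context.lower().split())
    answer_words = set(answer.lower().split())
    overlap = len(answer_words & context_words) / max(len(answer_words), 1)
    if overlap < 0.5:
        return "I don't know."  # rather refuse than risk a fabricated claim
    return answer

if __name__ == "__main__":
    ctx = "The Eiffel Tower is located in Paris and was completed in 1889."
    print(grounded_answer(ctx, "When was the Eiffel Tower completed?"))
```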
Task and reasoning limitations:
There are also tasks where current models do not perform well, such as complex reasoning and planning. Benchmarks like PlanBench, GAIA, WorkArena++, WebArena, or MathArena show the current limitations of LLMs and the areas where humans still outperform current models.
Apple also published a paper on this topic, "The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity", which was widely discussed in the AI community.
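Under the hood, such benchmarks follow a simple pattern: give the model a task with a known solution, parse its answer, and score it. Below is a minimal sketch of that evaluation loop, not the actual harness of any benchmark named above; `ask_model()` and the two toy tasks are assumptions for illustration.

```python
# Minimal sketch of a benchmark-style evaluation loop. It only illustrates
# the pattern of scoring model answers against known solutions.

def ask_model(prompt: str) -> str:
    """Placeholder for a real model call; returns canned output so the
    sketch runs standalone."""
    return "42"

TASKS = [
    # Each task pairs a prompt with a reference answer for exact matching.
    {"prompt": "What is 6 * 7? Answer with the number only.", "answer": "42"},
    {"prompt": "Can a block be stacked onto itself? Answer yes or no.",
     "answer": "no"},
]

def evaluate(tasks: list[dict]) -> float:
    """Return the fraction of tasks answered correctly (exact match)."""
    correct = sum(
        ask_model(t["prompt"]).strip().lower() == t["answer"].lower()
        for t in tasks
    )
    return correct / len(tasks)

if __name__ == "__main__":
    print(f"accuracy: {evaluate(TASKS):.0%}")
```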
Danger of "Average" content and false facts:
With LLMs, the generation of "average"-quality content is just a keystroke away. As a result, the internet is being flooded with masses of irrelevant and sometimes incorrect content: the shared well of knowledge gets poisoned as generated text feeds back into future training data, and studies have reported signs of declining code quality on GitHub. Glitches in AI training data can also cause quality problems, such as the nonsense term "vegetative electron microscopy", which reportedly originated from a digitization error in old papers and then propagated into newer publications.
We should not contribute to this thoughtless use with "naive" AI applications, but do everything we can to ensure high quality and relevance.
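In practice, that can start with something very simple: not publishing generated content unchecked. The sketch below shows one possible quality gate; the `Draft` structure and the specific checks are illustrative assumptions, not a complete quality process.

```python
# Sketch of a minimal human-in-the-loop quality gate for AI-generated content.
# The checks are illustrative assumptions, not a prescribed process.

from dataclasses import dataclass, field

@dataclass
class Draft:
    text: str
    sources: list = field(default_factory=list)  # references backing the claims
    human_approved: bool = False

def passes_quality_gate(draft: Draft) -> bool:
    """Only release drafts that cite sources, are non-trivial, and were
    explicitly reviewed by a human."""
    has_sources = len(draft.sources) > 0
    is_substantial = len(draft.text.split()) >= 50  # arbitrary threshold
    return has_sources and is_substantial and draft.human_approved

if __name__ == "__main__":
    draft = Draft(text="word " * 60, sources=["https://example.org/study"])
    print(passes_quality_gate(draft))   # False until a human signs off
    draft.human_approved = True
    print(passes_quality_gate(draft))   # True
```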