All major language models, including those behind ChatGPT and similar systems, are currently based on the so-called transformer architecture, which mathematically emulates the human ability to focus on relevant information while ignoring less important details, and to form associative connections. By analyzing vast amounts of data, these AI systems identify patterns and establish relationships between them.
However, according to Michael Hahn, Professor of Computational Linguistics at Saarland University, it is precisely this architecture that causes language models to reach their limits and, in some cases, leads to serious errors that cannot be eliminated even through additional training. The professor identifies three main shortcomings: “First, the models are poor at tracking changing states. They do not reliably update their internal representation when a situation evolves.” In medicine in particular, this can pose risks for patients if AI assistants, which are already being used in some cases, fail to correctly interpret the chronological order of medical test results and therefore recommend incorrect medication.
The second shortcoming of large language models is similarly problematic and can also be illustrated using an example from medicine. “If an AI system is meant to select the appropriate medication for a specific condition from a large database, it must be able to infer which symptoms correspond to that condition. The same applies to establishing a diagnosis. However, such a systematic approach based on logical rules cannot yet be adequately represented in neural networks,” explains Michael Hahn.
The third shortcoming is that language models become even less reliable when they have to process complex, deeply nested inputs in a meaningful way. “This becomes evident, for example, in legal contexts where it must be assessed on which legal basis and in what temporal sequence one person has harmed another individual or a company. These chains of reasoning, which are often already difficult for humans to follow, can so far hardly be handled without errors using neural networks,” Hahn emphasizes.
There are therefore compelling reasons to further develop and improve the transformer architecture. To pursue this goal, computational linguist Michael Hahn has now been awarded €1.4 million through the Emmy Noether Programme of the German Research Foundation (DFG). Together with five doctoral researchers, Hahn will first examine the theoretical foundations of transformer architectures in greater detail as part of the project “Understanding and Overcoming Architectural Limitations in Neural Language Models.” The aim is to gain a better understanding of how neural networks arrive at their results. In a second phase, the research will focus on exploring hybrid systems or even entirely new architectures that offer more predictable capabilities and operate more reliably and efficiently than current large language models.
Hahn’s research group is already the third Emmy Noether group in computer science to be approved at the Saarland Informatics Campus (SIC) in 2025. By comparison, only three Emmy Noether groups across Germany focused on computer science topics in the previous year.
The department warmly congratulates Michael Hahn on this prestigious distinction.
The full article is available in the Campus Magazine and on the SIC website. The Saarbrücker Zeitung also used Hahn’s DFG award as an opportunity to publish an in-depth report on his research.
