AI research: Where is it headed?

Prof. Dr. Volker Tresp, Professor of Machine Learning at Ludwig Maximilian University in Munich and working group leader at Plattform Lernende Systeme.

Developments in the field of AI are currently progressing rapidly. Generative AI continues to cause a stir and fuel international competition, but it is also reaching its limits due to its hunger for data and resources and a lack of transparency.

1. Mr Tresp, developments in generative AI and machine learning have been rapid in recent years. What has surprised you the most?

Volker Tresp: There have been two major breakthroughs in recent years. In 2012, it was demonstrated that deep neural networks with many layers are significantly more powerful than architectures with only one hidden layer. Deep neural networks immediately improved difficult benchmarks in image and text analysis, in some cases by an order of magnitude.

The second breakthrough concerns the successes of generative AI, especially large language models (LLMs). LLMs not only meet the general public's expectations of ‘intelligent’ AI, but also surprise experts with their performance. In terms of robustness and scalability, only search engines are comparable – but they merely retrieve relevant web pages and do not constitute a true dialogue system.

LLMs answer complex questions, generate high-quality text and demonstrate unexpected abilities in areas such as approximate reasoning (answers are derived from probabilities and learned patterns) and automatic code generation. These latter capabilities were never explicitly trained; they appear to be emergent properties arising from the sheer size of the models and the diversity of the training data – a phenomenon that is not yet fully understood.
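As a deliberately simplified illustration of ‘answers derived from probabilities and learned patterns’, a toy bigram model can predict the most likely next word purely from counts in a tiny corpus (the corpus and function names here are hypothetical; real LLMs perform the same kind of statistical next-token prediction, but with learned neural representations over vastly more data):

```python
# Toy sketch: predict the next word from bigram counts in a tiny corpus.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

# Learn bigram statistics: how often word b follows word a.
follows = defaultdict(Counter)
for a, b in zip(corpus, corpus[1:]):
    follows[a][b] += 1

def most_likely_next(word):
    """Greedy prediction: the highest-probability successor of `word`."""
    return follows[word].most_common(1)[0][0]

print(most_likely_next("the"))  # "cat" follows "the" most often in this corpus
```

The toy model has no understanding of cats or mats; its ‘answer’ is just the statistically most frequent continuation – which is why the emergence of reasoning-like behaviour in scaled-up versions of this idea is so surprising.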

2. The immense hunger for data and resources is pushing generative AI models to their limits. What alternatives to ever-larger model scaling do you see?

Volker Tresp: In fact, the human brain consumes less than 100 watts – about as much as a light bulb – and yet performs amazing cognitive feats. At the same time, humans learn with a fraction of the data that today's AI systems require. Generative AI is still a long way from this level of efficiency. Nevertheless, a growing number of approaches go beyond pure upscaling.

It is essential to distinguish between the training of a generative AI model and its use (inference). On the inference side, we are already seeing great progress: image generators such as Stable Diffusion now run directly on smartphones, and language models are available as smaller, compressed variants of large models.
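As a rough sketch of what such compression can look like (an illustrative toy, not the method of any particular product): symmetric int8 post-training quantisation stores weights in 8 bits plus a single scale factor, using a quarter of the memory of 32-bit floats.

```python
# Illustrative sketch: symmetric int8 post-training quantisation of weights.
import numpy as np

def quantize_int8(w):
    """Map float weights to int8 values plus one shared scale factor."""
    scale = np.abs(w).max() / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 representation."""
    return q.astype(np.float32) * scale

w = np.random.default_rng(1).normal(size=(4, 4)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# int8 storage is 4x smaller than float32; rounding error is at most scale/2.
assert q.dtype == np.int8
print("max abs error:", float(np.abs(w - w_hat).max()))
```

Production systems refine this basic idea with per-channel scales, calibration data and quantisation-aware training, but the memory saving comes from exactly this trade of precision for size.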

Alternatives to simple scaling are also emerging in training – through more efficient algorithms, specialised hardware and new model architectures. Graphics processing units (GPUs) currently remain the workhorse of generative AI in training. At the same time, however, neuromorphic chips are being developed that are based on the functioning of the human brain and are expected to be significantly more energy-efficient.

Another exciting field is federated learning, where the model learns decentrally from the data on users' end devices without that data ever being stored centrally. This not only conserves resources but also strengthens data protection. Finally, synthetic training data is being used increasingly: large language models are partly trained with AI-generated data in order to reduce dependence on laboriously hand-annotated data sets.
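The federated idea can be sketched in a few lines (a simplified simulation with synthetic data; real systems add secure aggregation, client scheduling and privacy mechanisms): each client runs a few gradient steps on its local data, and a server averages only the resulting weights – never the raw data – as in the FedAvg algorithm.

```python
# Minimal federated averaging (FedAvg) simulation with NumPy.
import numpy as np

rng = np.random.default_rng(0)

def local_update(X, y, w, lr=0.1, epochs=20):
    """A few steps of gradient descent on one client's private data."""
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

# Simulate three clients whose raw data never leaves the "device".
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    y = X @ true_w + 0.01 * rng.normal(size=50)
    clients.append((X, y))

w_global = np.zeros(2)
for _ in range(10):  # communication rounds
    local_ws, sizes = [], []
    for X, y in clients:
        local_ws.append(local_update(X, y, w_global.copy()))
        sizes.append(len(y))
    # Server aggregates: sample-size-weighted average of client weights only.
    w_global = np.average(local_ws, axis=0, weights=sizes)

print(np.round(w_global, 2))  # converges toward true_w
```

Note that only the two-dimensional weight vector crosses the network in each round, regardless of how much data each client holds – which is where both the resource saving and the privacy benefit come from.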

3. Studies suggest that generative AI models learn by generating a gigantic collection of rules of thumb. They do not think like humans, who use more efficient mental models for drawing conclusions. How do you interpret this finding?

Volker Tresp: Although there have been significant advances in cognitive neuroscience in recent years, we still know surprisingly little about how human ‘thinking’ actually works. My own work on the so-called ‘tensor brain’, for example, points to a close link between perception and the memory systems of the human brain – both of which are fundamental prerequisites for complex cognitive performance.

Cognitive neuroscience is increasingly benefiting from impulses from generative AI. Renowned universities are now specifically investigating similarities in the representation and processing of information between biological and artificial systems. A distinction must be made between the structural and functional organisation of the brain. Functionally, there are certainly similarities between large language models (LLMs) and human cognitive processing. Structurally, however, comparing the neural implementation of these mechanisms is much more difficult.

4. What concepts of human thinking can be found in modern LLMs?

Volker Tresp: Interestingly, modern LLMs are sometimes referred to as Large Reasoning Models (LRMs). These LRMs seem to be capable of forming hypotheses, rejecting them and generating new approaches in order to ultimately solve complex tasks. The DeepSeek R1 model, for example, generates such intermediate hypotheses as explicit tokens – in contrast to OpenAI o1, which works more implicitly. However, the associated humanisation (anthropomorphisation) of LRMs has been met with criticism: some researchers reject the idea that these systems ‘think’ in a human-like way.

Another concept is that of internal models: humans can imagine a bouncing ball, for example, and implicitly understand the laws of physics and boundary conditions. A car mechanic has a mental model of a car and can search for faults in a targeted manner. A doctor has a deep mental understanding of the human body. Whether LLMs are capable of forming such detailed mental models, or ever will be, is currently still open to question.

Another interesting point is that humans do not work purely rationally. They use a variety of heuristics and often make decisions intuitively. Daniel Kahneman distinguishes between System 1 (intuitive, fast) and System 2 (reflective, slow). Interestingly, people often justify their actions retrospectively with System 2 arguments, even though the actual decision was made intuitively – a form of post hoc rationalisation. Deep thinking may ultimately be more of a simulation of different scenarios than a strictly logical process.

It is not uncommon for LLMs to be confronted with tasks that require superhuman abilities – problems that a human being would hardly be able to solve without aids. This raises the question: does it make sense to talk about artificial general intelligence (AGI) before we really understand the mechanisms of human intelligence?

5. Which developments beyond generative AI are currently particularly promising?

Volker Tresp: One example is safety-critical applications such as autonomous driving or aircraft control. Here, requirements such as extreme reaction speed, robustness, reliability and energy efficiency dominate. Although generative models are increasingly being used in robotics – for example, for high-level tasks such as voice control or mission planning – classic or hybrid systems continue to lead the way in low-level, time-critical control and regulation.

Another important field is the verification of safety-critical systems, such as the formal testing of the logic of signal boxes in railway stations or flight control systems. Here, more classical, logic-based AI is used, in which every statement must be traceable and provable – hallucinations would be unacceptable here.

The situation is similar in medical statistics, where the aim is to draw reliable, causal conclusions about the effectiveness of therapies from clinical studies. Here, methodological rigour, transparency and statistical validity are what count – not creative or generative abilities. Quantum technologies could enable significant breakthroughs in the coming years – both in quantum communication and precision sensor technology. Breakthroughs are also conceivable in the field of quantum computing.

6. AI researchers in Germany face fierce international competition. What are the biggest challenges and how can they be overcome?

Volker Tresp: Germany must be attractive to excellent scientists – both from within Germany and abroad. This requires significantly more professorships, excellently equipped research centres and funding programmes that are specifically aimed at pushing the boundaries of AI research – with the overarching goal of responsible, peaceful and humane AI. There are promising initiatives, but they need to be substantially expanded. These include more intensive research funding and strategic investments in computing infrastructure. Access to high-performance GPUs is a basic prerequisite for keeping up with global cutting-edge research.

Many highly educated students and researchers leave Germany after graduating – often heading for Zurich, London or North America – because they find better career opportunities and research conditions there. This must be counteracted by expanding internationally visible research centres in Germany and offering attractive career paths in science and industry.

Germany's strength lies in its industrial excellence, particularly in areas such as mechanical engineering, automotive technology and manufacturing. This offers an opportunity to take a leading role in the development and application of industrial AI. Many companies are already innovating, but need targeted support to fully exploit their potential.

The framework conditions for AI start-ups must also be improved. Easier access to venture capital, accelerated start-up procedures and a reduction in bureaucracy are necessary. Start-ups are a key driver of innovation – they need an environment that enables dynamic growth and rewards entrepreneurial risk-taking.