Modern artificial intelligence (AI)-based tools are certainly proving themselves useful, but they guzzle enormous amounts of energy. The data centers that supply the computational resources to run these algorithms now account for a meaningful share of some nations' total electricity consumption. And since the popularity of these tools is still on the rise, with no slowdown in sight, that appetite could become a serious problem. Innovations in energy efficiency are sorely needed to keep the good times rolling in this present AI summer.
There are many potential ways to slash energy consumption, but one of the more promising involves cutting the processing time required for model training or inference. Even a power-hungry model consumes less total energy if it finishes its work sooner. Some help of this sort may be on the way, thanks to the efforts of a team of researchers at the Technical University of Munich. Their new approach makes it possible to speed up model training by up to 100 times, at least for certain types of algorithms, without appreciably impacting performance.
A 100x faster alternative to backpropagation
The team’s work presents an alternative to the traditional way AI models learn — backpropagation. Most deep learning models today, including large language models and image recognition systems, rely on iterative gradient-based optimization to adjust their parameters. This approach, while effective, is slow and power-hungry.
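To make the cost of that iterative loop concrete, here is a minimal, self-contained sketch of gradient-based optimization (the style of training that backpropagation enables), fitting a single slope parameter to noisy data. The toy problem and all values in it are illustrative, not from the paper.

```python
import numpy as np

# Fit y = w*x to noisy data by repeatedly nudging w against the
# gradient of the mean squared error -- many small steps, each one
# costing compute, which is where the time and energy go.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200)
y = 3.0 * x + 0.05 * rng.normal(size=200)  # true slope is 3.0

w = 0.0    # initial parameter guess
lr = 0.1   # learning rate
for step in range(500):                    # hundreds of iterations
    grad = np.mean(2.0 * (w * x - y) * x)  # d/dw of mean squared error
    w -= lr * grad

print(w)  # w ends up close to the true slope of 3.0
```

A full deep network does the same thing for millions of parameters at once, with backpropagation computing all the gradients, which is why training runs take so long.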
Hamiltonian Neural Networks (HNNs) offer a more structured way to learn physical and dynamical systems by incorporating Hamiltonian mechanics, which describe energy conservation in physics. HNNs are particularly useful for modeling complex systems like climate simulations, financial markets, and mechanical dynamics. However, like traditional deep learning methods, training HNNs has historically required iterative optimization via backpropagation — until now.
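The structure an HNN exploits can be shown in miniature with a known Hamiltonian rather than a learned one. For a harmonic oscillator with H(q, p) = p²/2 + q²/2, Hamilton's equations are dq/dt = ∂H/∂p and dp/dt = -∂H/∂q, and H (the total energy) stays essentially constant along every trajectory; an HNN learns H itself from data, so this conservation law is built in by design. The integrator and step size below are illustrative choices.

```python
# Simulate a harmonic oscillator via Hamilton's equations and check
# that the energy H(q, p) barely drifts over the whole trajectory.
def hamiltonian(q, p):
    return 0.5 * p**2 + 0.5 * q**2

q, p = 1.0, 0.0
dt = 0.01
energies = []
for _ in range(10_000):
    # symplectic Euler step, which respects the Hamiltonian structure
    p -= dt * q   # dp/dt = -dH/dq = -q
    q += dt * p   # dq/dt =  dH/dp =  p
    energies.append(hamiltonian(q, p))

drift = max(energies) - min(energies)
print(drift)  # small compared to the initial energy of 0.5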
The researchers have developed a new technique that eliminates the need for backpropagation when training HNNs. Instead of iteratively tuning parameters over many training cycles, their approach determines the optimal parameters directly using probability-based methods.
This probabilistic technique strategically samples parameter values at crucial points in the data — particularly where rapid changes or steep gradients occur. This allows the model to learn effectively without the computational overhead of traditional training, slashing training times dramatically. According to the team, their method is not only 100 times faster but also achieves accuracy comparable to conventionally trained networks — and sometimes much better.
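The same spirit, sampling hidden parameters instead of training them, can be sketched with a generic random-feature model: fix the hidden-layer weights by drawing them from a distribution, then obtain the output weights in a single least-squares solve, with no backpropagation at all. Note the hedge: this sketch uses plain random sampling, whereas the researchers' method samples more deliberately, near steep gradients in the data; the network size and distributions below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

# Target function to learn from 400 sample points.
x = np.linspace(-np.pi, np.pi, 400).reshape(-1, 1)
y = np.sin(3 * x).ravel()

# Hidden-layer weights are sampled once and never trained.
n_hidden = 200
W = rng.normal(scale=3.0, size=(1, n_hidden))
b = rng.uniform(-np.pi, np.pi, size=n_hidden)
features = np.tanh(x @ W + b)  # hidden activations

# One direct linear solve replaces thousands of gradient steps.
coeffs, *_ = np.linalg.lstsq(features, y, rcond=None)
pred = features @ coeffs
err = np.max(np.abs(pred - y))
print(err)  # small fit error, with no iterative training
```

The single `lstsq` call is why this family of methods is so fast: the expensive part of training collapses into one well-understood linear-algebra operation.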
In tests involving chaotic systems such as the Hénon-Heiles system, a well-known mathematical model used in physics, the new approach was found to be more than four orders of magnitude more accurate than traditional methods. The researchers also demonstrated success in modeling physical systems like single and double pendulums and the Lotka-Volterra equations, which describe predator-prey interactions in ecosystems.
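For readers unfamiliar with the last of those benchmarks, the Lotka-Volterra equations are dx/dt = αx - βxy for the prey population x and dy/dt = δxy - γy for the predator population y. The short sketch below integrates them directly so the dynamics being modeled are concrete; the parameter values, initial populations, and Euler integrator are arbitrary illustrative choices, not the researchers' setup.

```python
# Lotka-Volterra predator-prey dynamics: prey grow and are eaten,
# predators grow by eating and otherwise die off.
alpha, beta, delta, gamma = 1.0, 0.4, 0.1, 0.4

def derivs(x, y):
    dx = alpha * x - beta * x * y    # prey: growth minus predation
    dy = delta * x * y - gamma * y   # predators: food minus mortality
    return dx, dy

x, y = 5.0, 3.0
dt = 0.001
for _ in range(20_000):  # simple Euler integration over 20 time units
    dx, dy = derivs(x, y)
    x += dt * dx
    y += dt * dy

print(x, y)  # both populations oscillate but remain positive
```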
Working toward even greater AI energy efficiency
The team envisions expanding their work in the future to handle more complex real-world systems, including those with dissipative properties (where energy is lost due to friction or other factors). They also plan to explore ways to apply their method in noisy environments, making it even more versatile for real-world applications. If widely adopted, this probabilistic training approach could go a long way toward making AI more sustainable, ensuring that the rapid growth of these technologies does not come at an unmanageable cost.