Artificial intelligence (AI) is everywhere. From health to mobility, industry, education, and agriculture, it has spread into every sector in recent years, becoming a major challenge for the players in those fields.
Recent progress in the field, increasingly impressive (particularly in computer vision and natural language processing), is the result of enormous human engineering effort, colossal computational (and therefore energy) resources, and considerable publicity. This is an important path, one in which many industrial and academic actors are engaged. But is it the only path for AI research?
Behind the "commercial" AI: a daily AI
There are, in fact, two kinds of artificial intelligence:
- "frontier" AI, highly visible at conferences and in the media, which always does more, always faster;
- everyday AI, far from the headline-grabbing applications but used by scientists and engineers everywhere, often relying on simpler methods because they are stable, backed by well-established algorithms and guarantees, and less data-hungry.
Both are important, and they go hand in hand. Moving from one to the other is a serious undertaking, in which "getting the algorithm to work" or beating the state of the art on a benchmark are not the only goals. "There is a difference between writing a proof-of-concept paper and having something practical. Going from published research to actual practice is not easy. It's less in the spotlight because it's more thorough, but it's just as important in order to understand what's going on," explains Francis Bach, Inria research director, before adding:
When you apply AI to advertising, being wrong is not dramatic; it is much more dramatic when AI is applied to complex problems in critical sectors like aviation. Then you have to go back to the hypotheses and go further than showing beautiful images. You have to go back to classical science.
In other words, we need to understand these phenomena in order to control them better, and to provide the theoretical guarantees required for deployment in sensitive areas (airplanes, cars, health).
The path to frugal AI
Understanding AI also gives us the power to adjust the computing resources it needs to function properly, and therefore to make it less power-hungry.
Today, convincing results in AI rely on overparameterization, i.e. using models with far more parameters than the task strictly seems to require, in order to obtain better performance. This raises two questions: does overparameterization really lead to better results? And can similar results be reproduced with fewer resources?
"A lot of parameters means big machines and big data. Overparameterization requires huge resources," says Francis Bach, before asking: "By making large AI models, we know that it works. But why not do it with something smaller?
The challenge today is to know how far overparameterization must go to obtain convincing results, with the aim of being able to predict the resources needed for a given problem, hoping, all the while, that fewer will suffice.
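To make "overparameterization" concrete, here is a minimal sketch (not from the article; the layer sizes and dataset size are hypothetical) that counts the parameters of a fully connected network and compares that count to the number of training examples. A model is informally called overparameterized when its parameter count far exceeds the training set size.

```python
def mlp_param_count(layer_sizes):
    """Total weights + biases of a fully connected network.

    layer_sizes = [input_dim, hidden_1, ..., output_dim]
    Each layer contributes n_in * n_out weights plus n_out biases.
    """
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

n_train = 10_000  # hypothetical number of training examples

small = mlp_param_count([784, 8, 10])     # narrow hidden layer
large = mlp_param_count([784, 4096, 10])  # very wide hidden layer

print(small, large)   # 6370 vs 3256330 parameters
# Only the wide model is overparameterized relative to this dataset:
print(small > n_train, large > n_train)   # False True
```

The question raised above then becomes quantitative: how much larger than `n_train` does the parameter count need to be before the benefits of overparameterization appear?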
Frugal AI will also require the ability, once we have analyzed the resources our computations need, to make compromises. First of all, on computation time: "Running 100 machines to reduce computation time consumes a lot of energy. By accepting a slightly longer computation time, we could save energy," explains Francis Bach.
Another trade-off lies in AI's performance: "Today, it has to work as well as possible, as quickly as possible. However, we should be ready to make adjustments and lose some performance in order to have something less expensive. This is the classic benefit/risk problem. We always make compromises, but the point would be for the slider not to be set without regard for energy expenditure," he concludes.
Francis Bach, speaker at ICM 2022
Francis Bach will be a speaker at the International Congress of Mathematicians (ICM) 2022, which will be held online from July 6 to 14.
His talk, scheduled for July 13 from 14:15 to 15:00 (room 6), will focus on joint work with Lénaïc Chizat, now a professor at EPFL, in which they showed that gradient descent, the optimization algorithm most commonly used to train overparameterized neural networks, reaches the global minimum of the objective function it seeks to minimize, thus providing a partial justification for the need for overparameterization.
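The phenomenon studied in that work can be illustrated numerically. The sketch below is an assumption-laden toy, not the Chizat-Bach analysis itself: a wide two-layer ReLU network (many more hidden units than data points) is trained with plain full-batch gradient descent on a tiny synthetic dataset, and the non-convex training loss is driven close to its global minimum of zero.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny synthetic problem: n samples, d features, m hidden units, m >> n.
n, d, m = 8, 5, 1000
X = rng.normal(size=(n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # unit-norm inputs
y = rng.uniform(-1.0, 1.0, size=n)              # arbitrary real targets

W = rng.normal(size=(m, d))                     # hidden-layer weights
a = rng.choice([-1.0, 1.0], size=m)             # output-layer weights

def forward(X, W, a):
    """Two-layer ReLU network, scaled by 1/sqrt(m)."""
    H = X @ W.T                                 # pre-activations, (n, m)
    return np.maximum(H, 0.0) @ a / np.sqrt(m), H

lr, losses = 0.8, []
for _ in range(4000):
    pred, H = forward(X, W, a)
    r = pred - y                                # residuals
    losses.append(0.5 * np.mean(r ** 2))        # squared training loss
    A = np.maximum(H, 0.0)
    # Gradients of the loss with respect to both layers:
    grad_a = A.T @ r / (n * np.sqrt(m))
    grad_W = ((r[:, None] * (H > 0)) * a[None, :]).T @ X / (n * np.sqrt(m))
    a = a - lr * grad_a
    W = W - lr * grad_W

print(f"training loss: {losses[0]:.4f} -> {losses[-1]:.2e}")
```

With far fewer hidden units, gradient descent on the same data can stall at a poor local configuration; the width of the network is what makes the observed convergence to (near-)zero loss reliable, which is the empirical face of the theoretical result mentioned above.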