Appearances can be deceiving: Francis Bach only completed his PhD in computer science at the age of 30, a few years later than most PhD students. He already had a degree from the École Polytechnique, however, and studied for his PhD at Berkeley under the supervision of Michael Jordan, a leading figure in the world of artificial intelligence. It was then that the distinctions began to pile up: two ERC grants, one in 2009 and another in 2016; the leadership of an Inria team from 2011; multiple scientific prizes; a best paper award at an international conference in 2018...
Francis Bach appointed Academician!
The Academy of Sciences announced on Thursday 19 March 2020 the election of four new academicians to seats earmarked for specific fields of research. Among them was Francis Bach, for the field “Data masses, machine learning and artificial intelligence”.
Academicians are appointed for life, following a rigorous process of elections in several stages over almost a year, the result of which is ratified by official decree of the President of the Republic.
“Choosing my own research topics, taking more risks”
This impressive series of achievements came as no surprise to his former PhD supervisor: “In a career spanning 20 years both at Berkeley and at MIT, I’ve never had a better PhD student”, explains Michael Jordan.
“Francis is exceptional and multidisciplinary, capable of tackling even the most complex problems. He develops original solutions that challenge people at first before eventually winning unanimous support.”
Bach, meanwhile, has always taken a step back when looking at his career: “This has given me total freedom in choosing my research topics. I don’t have to submit ideas or seek funding. I’m able to take more risks. The ERC Consolidator Grant in 2016, for example, guaranteed five years of work for my team. That’s a rare luxury.”
2014: machine learning comes out of the shadows
Little was known about Bach’s field of research, machine learning, until 2014, when growing interest in artificial intelligence from both the media and wider society brought it centre stage: in order to become “intelligent”, an IT system must first learn. “This sudden fame changed nothing for us: we continued to study the same subjects. But we became better known to the general public, who began to understand the challenges facing us, given the possible impact on their daily lives.”
Examples? Machine learning is capable of “teaching” a vision system to recognise road signs, for use in driverless vehicles. In neuroimaging, the objective is to teach machines how to detect tumours, aneurysms, lesions, etc. In the audio world, computers are taught how to separate individual instruments in orchestra recordings, or to generate realistic sounds.
Innovations inspired by applications
“I see my job as being entirely about this back-and-forth between computing and real life”, explains Francis Bach. “The majority of interesting theoretical problems come from applications. Conversely, theoretical analysis helps us to work out why an algorithm doesn’t work and how to improve it.”
Logically, the researcher does not specialise in any one field, preferring to navigate between disciplines and choosing subjects as inspiration takes him: “I noticed a while back that we were running into the same problems in cryptography, vision, natural language processing and bioinformatics. The correct response was to create methodological tools common to multiple applications, rather than bespoke tools. I put myself at the intersection of algorithms, theory and applications, in an attempt to get the most out of all three.”
By way of an example, his research team works alongside Google Brain and Facebook on academic subjects, far removed from the daily concerns of internet users.
Looking to the past in order to predict the future
In machine learning, the challenge is as follows: to develop quick, precise algorithms capable of processing data (road signs, brain defects, musical notes, etc.) and working out rules from it, thus enabling a system to develop predictive capabilities. To put it slightly differently: machine learning looks to the past in order to predict the future, i.e. the data the system will next be confronted with.
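To make that idea concrete, here is a minimal sketch of a system learning a rule from past observations and using it to predict an unseen case. It is purely illustrative, not Bach's own code: the function names are invented, and the "rule" is a one-dimensional least-squares fit in plain Python.

```python
# Illustrative only: learn a rule from "past" data, then predict the "future".
# fit_line is a hypothetical helper performing ordinary least squares in 1D.

def fit_line(xs, ys):
    """Fit y ~ a*x + b by ordinary least squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# Past observations, generated (noise-free) by the rule y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a, b = fit_line(xs, ys)

def predict(x):
    return a * x + b

print(predict(10.0))  # 21.0: an input the system has never seen
```

Real learning systems juggle millions of observations and far richer rules, but the loop is the same: estimate parameters from past data, then apply them to new data.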
In less than ten years, two significant changes have taken place within the discipline. The first was the massive increase in the amount of data available: “In online advertising, for example, billions of items of data are available. In vision, learning processes are able to draw on millions of images each made up of millions of pixels.”
From parsimony to managing billions of pieces of data
It is worth pointing out that, in 2011, when Francis Bach set up his Inria team Sierra, one of its themes was “parsimony”, better known in English as sparsity: the art of extracting only the relevant information from a limited number of cases, e.g. determining genetic constants based on the genomes of a handful of individuals.
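The idea of parsimony can be sketched in a few lines. The example below is hypothetical (names and numbers invented): with only a handful of noisy measurements per candidate variable, keep just the coefficients whose evidence exceeds a threshold. This soft-thresholding rule is what sparse methods such as the Lasso apply in the simplest (orthonormal) setting.

```python
# Hypothetical illustration of parsimony (sparsity): discard weak evidence,
# keep only the few variables that clearly matter.

def soft_threshold(v, lam):
    """Shrink v towards zero by lam; values inside [-lam, lam] become 0."""
    if v > lam:
        return v - lam
    if v < -lam:
        return v + lam
    return 0.0

# Estimated correlation of each candidate variable (e.g. a gene) with the
# outcome, from a handful of individuals; most values are just noise.
correlations = [0.05, 2.3, -0.1, 0.02, -1.8, 0.07]
sparse = [soft_threshold(c, lam=0.5) for c in correlations]
print(sparse)  # only the 2nd and 5th variables survive
```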
Researchers were forced to adapt to this sudden inflation while retaining acceptable processing times. Nevertheless, their algorithms, which had been designed for parsimony, could end up processing the same piece of data hundreds or even thousands of times. These had to be replaced by so-called “stochastic” algorithms, in which each piece of data is only used once. Inspired by concepts from the 1950s, these were reworked to adapt to the challenges of today.
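The single-pass spirit of those stochastic algorithms can be sketched as follows; this toy example (invented for illustration) touches each data point exactly once, nudging the estimate a little at each step rather than re-scanning the whole dataset.

```python
# Toy sketch of a stochastic update: one pass, one small step per example.

def stochastic_mean(stream):
    """Estimate the mean of a stream: one stochastic gradient step on
    f(w) = 0.5 * (w - x)**2 per example, with step size 1/t."""
    w = 0.0
    for t, x in enumerate(stream, start=1):
        w -= (1.0 / t) * (w - x)  # gradient of 0.5 * (w - x)**2 at w
    return w

print(stochastic_mean([4.0, 8.0, 6.0, 2.0]))  # 5.0, the running average
```

With step size 1/t this update is exactly the running average, which is why each data point needs to be seen only once.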
Describing images, but with what level of precision?
To what extent is it necessary to label, i.e. describe, the images submitted to a vision system for learning purposes, marking their points of interest: road signs, faces, crowd movements, tumours, etc.? This is one of the subjects of interest to Francis Bach. There are several schools of thought in this field. Some have a human operator describe images pixel by pixel, which is slow and costly and generates huge volumes of annotation data. Others opt for partial, selective labelling, targeted at points of interest.
But what level of precision should be used for these labels, and how can you make sure that nothing is forgotten? “This remains an ill-posed problem, because it lacks a theoretical framework. We are seeking to define that framework and to establish theoretical guarantees. This will enable us to develop algorithms capable of quick, efficient learning, for all types of images.”
Adapting algorithms to distributed computing
Indeed, it was one such algorithm, christened the Stochastic Average Gradient, which earned Francis Bach* the Lagrange Prize for mathematical optimisation in 2018. Of all the distinctions he has received, this had the most impact on him “because it was awarded by my peers and because optimisation was not my original speciality.”
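The idea behind the Stochastic Average Gradient can be sketched in a simplified toy form (the objective and all names here are chosen for illustration, not taken from the published algorithm): remember the last gradient computed for each example, and step along the average of those remembered gradients.

```python
import random

def sag_mean(xs, steps=400, lr=0.25, seed=0):
    """Toy Stochastic Average Gradient on f_i(w) = 0.5 * (w - xs[i])**2,
    whose minimiser is the mean of xs."""
    rng = random.Random(seed)
    n = len(xs)
    w = 0.0
    stored = [0.0] * n   # last gradient seen for each example
    grad_sum = 0.0       # running sum of the stored gradients
    for _ in range(steps):
        i = rng.randrange(n)
        g = w - xs[i]                # fresh gradient for example i
        grad_sum += g - stored[i]    # replace its stored gradient
        stored[i] = g
        w -= lr * grad_sum / n       # step along the average gradient
    return w

print(sag_mean([4.0, 8.0, 6.0, 2.0]))  # approaches the data mean, 5.0
```

The trick is that each iteration is as cheap as plain stochastic gradient (one fresh gradient) while the step uses information from every example, which is what gives the method its fast convergence.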
The second fundamental change which shook up the world of machine learning was the plateau reached in computers’ processing capacities. Moore’s law, which predicted that the power of machines would double every 18 months, did not account for the fact that micro-electronics would one day run into the physical limits of circuit miniaturisation. In order to continue processing quickly and accurately, it became necessary for learning tasks to be shared between multiple machines, with new algorithms running in parallel on multiple processors. Once again, the Sierra team was up to the challenge, releasing open-source code that enables specialists to employ these methods.
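That sharing of a learning task across machines can be illustrated with a deliberately tiny sketch (an invented example, not the Sierra team's code): each “machine” fits a model on its own shard of the data, and the local models are then averaged, the simplest form of data-parallel learning.

```python
# Toy data-parallel learning: fit locally on each shard, then average.

def local_fit(shard):
    """Each machine's local model: here simply the mean of its shard."""
    return sum(shard) / len(shard)

shards = [[4.0, 8.0], [6.0, 2.0], [5.0, 5.0]]  # data split across 3 machines
local_models = [local_fit(s) for s in shards]
global_model = sum(local_models) / len(local_models)
print(global_model)  # 5.0
```

Averaging is exact here only because the shards are equal-sized and the model is linear in the data; real distributed algorithms must also manage communication costs and models where naive averaging is not enough.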
Pioneer, explorer and team director
A pioneer and an explorer in machine learning, Francis Bach has also had the pleasure of discovering the role of director of an Inria scientific team. From his years spent studying for his PhD at Berkeley, he remembered an extraordinary atmosphere of dynamism, healthy competition and mutual trust. He was keen to create a similarly stimulating environment at the Paris centre, where he shares the floor with 19 colleagues. “In fact, it's even better than at Berkeley, where everything revolved around one “boss”. Within the Sierra team, we have four researchers studying both specific and common subjects, in close collaboration with sixteen PhD students and postdoc researchers. There's a strong sense of cohesion, and there isn’t much of a hierarchy: things are good.”
Francis Bach - brief biography
After graduating from the École Polytechnique in 1997, Francis Bach was awarded a PhD in machine learning from the University of California, Berkeley, in 2005. He joined Inria in 2007, as part of a team working on the recognition of objects and scenes, and was awarded an ERC Starting Grant in 2009. In 2011 he set up his own Inria team, Sierra, specialising in machine learning and optimisation.
In recent years, he has developed an international reputation. He was awarded an ERC Consolidator Grant in 2016, the Lagrange Prize in 2018 and the Jean-Jacques Moreau Prize in 2019. Francis Bach also features in Clarivate Analytics’ global ranking of the most highly cited researchers in scientific publications.
Galvanised by this freedom which he holds so dear, Francis Bach gets fresh scientific inspiration from academic collaborations (Berkeley, MIT, British Columbia) and work trips. He prefers select symposiums to big conferences: “It's there that you get a chance to talk and to tackle the real issues.” Away from the office, new ideas often come to him when he’s out on his bike: “I go out during the week with groups in the Bois de Vincennes or the Bois de Boulogne, and at the weekend I go to the Vallée de Chevreuse. During the summer, I go to the mountain passes in the Alps. When you’re out cycling, when your legs are working away on their own, you have time to yourself and your mind starts to wander.”
* with two of his former postdoc researchers, Mark Schmidt and Nicolas Le Roux