What was the background to the creation of the pilot project Regalia?
In late 2019, following discussions with the French Directorate General for Enterprise (DGE) in Bercy, Inria came to appreciate how important the regulation of the digital giants had become for the public authorities. One thing became clear: regulating platforms “governed” by algorithms (for moderation, recommendation and pricing, to name but a few examples) is far from straightforward and goes beyond the current capacities of the regulatory authorities. Legal proceedings against major American platforms, for instance, can take years, during which the damage to competition continues to accumulate.
Inria’s expertise on subjects linked to regulation, such as detecting bias or hateful content, or the explainability of algorithms, was seen by Bercy as crucial from both a methodological perspective - helping the regulatory and legislative authorities to adapt the EU’s regulatory framework - and a technological perspective - helping to establish a technology base in support of the regulatory authorities.
It was with both of these objectives in mind that the pilot project Regalia (REGulation des ALgorithmes d’IA - Regulation of AI Algorithms) was created in 2020 within the DGDS in support of the public authorities.
What are the challenges of regulating digital platforms and how has Regalia sought to address these?
The first challenge was regulatory: in order to “make what is illegal in real life illegal online” (to paraphrase European Commissioner Margrethe Vestager), EU law had to be changed. An interministerial task force (“Regulation of Digital Platforms”) was set up by the DGE to align the visions of the different stakeholders, to assess the issue alongside ministerial staff and to support France’s position within Europe. Regalia has been part of this task force for nearly three years now. It is our belief that the two EU laws passed in July 2022 - the DSA (Digital Services Act) and the DMA (Digital Markets Act) - accurately reflect France’s position. These laws were the first in the world to grant such powers over the digital giants. We describe them as asymmetrical, in that the obligations and penalties that apply depend on the size of the company and on the risks it creates. They are very much aimed at the major US and Chinese platforms.
The second challenge was more operational: preparing the regulatory authorities for the technological changes to their role. This involved coming up with methodologies and best practices for evaluating whether platforms’ algorithms comply with the rules in place. Just as we saw with banks post-2008, the regulations are ex ante, and it is up to companies to show (and up to the regulator to verify) that they have put in place the necessary measures to guard against risk. To that end we devised typical use cases linked to compliance issues, working alongside the Pôle d'expertise de la régulation numérique (PEReN, the Centre of Expertise for Digital Regulation) in Bercy, as well as French companies interested in the idea of self-regulation. Our aim was to identify the challenges involved in devising an effective audit protocol, as well as appropriate and efficient ways of detecting bias or dishonest practices.
There are similarities between auditing a platform in production and auditing black box algorithms, a vast and thriving field of research. To make things even more difficult, whoever is carrying out the audit not only has to identify relevant examples of bias in order to assist the regulator, but they also have to make requests of the algorithm sparingly (so as not to cause any disruption) and stealthily so that the platform doesn’t identify this as unusual behaviour. This testing - similar to the discrimination testing we have grown used to offline in recent years - is tricky to devise given the complex and multidimensional nature of the input data and the behaviour we’re seeking to test.
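To give a purely illustrative flavour of what such a test can look like, here is a minimal Python sketch of a paired, budget-constrained probe. Everything in it is hypothetical: `query_platform` is a mock standing in for a real service, and the profiles, the price formula and the hidden surcharge are invented.

```python
# Minimal sketch of a paired "discrimination test" against a black box,
# under a strict query budget. The platform is mocked: query_platform is
# a hypothetical stand-in for a real, rate-limited service.
import random

random.seed(0)

def query_platform(profile: dict) -> float:
    """Hypothetical black box returning a price quote for a user profile."""
    base = 100.0 + 0.5 * profile["basket_size"]
    # Hidden behaviour the auditor is trying to detect: a surcharge for group B.
    return base * (1.10 if profile["group"] == "B" else 1.00)

def paired_test(query_budget: int) -> float:
    """Send matched pairs of profiles differing only in `group` and return
    the mean price gap observed, using at most `query_budget` calls."""
    gaps = []
    for _ in range(query_budget // 2):          # each pair costs two queries
        basket = random.randint(1, 20)
        price_a = query_platform({"group": "A", "basket_size": basket})
        price_b = query_platform({"group": "B", "basket_size": basket})
        gaps.append(price_b - price_a)
    return sum(gaps) / len(gaps)

if __name__ == "__main__":
    budget = 60   # keep requests sparse so the audit stays unobtrusive
    print(f"mean price gap over {budget} queries: {paired_test(budget):+.2f}")
```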
What is your assessment of Regalia’s achievements over the past two years or so?
Aware of their current technological limitations and under pressure from EU regulations, a number of regulatory bodies have expressed strong interest in quantitative audits. We have even seen consulting firms, always quick to act on these subjects, offering to carry out algorithm audits on behalf of companies. Research laboratories (such as the LNE) have focused on methodology with a view towards algorithm certification, while a number of startups geared towards compliance certification have begun to emerge in places like France, the UK and Germany. There is already a sense that companies fear being caught, or at least fear possible bias in their algorithms being detected, and are setting up dedicated teams in response.
But let’s be clear: progress has been very gradual, and it won't be until the DSA, the DMA and the forthcoming AI Act properly come into force that such fears lead to significant investment and the emergence of an audit ecosystem.
What do these laws say?
The DSA deals chiefly with online content sharing platforms and search engines, while the DMA is targeted at online marketplaces.
The DSA requires, for instance:
- That major online content platforms work with “trusted flaggers”
- That they explain how their recommender and advertising algorithms work
- That they refrain from using “dark patterns” (misleading interfaces)
- That they carry out risk assessments of the effects of their algorithms and submit to independent audits
- That they grant researchers access to data from their interfaces, making it easier to track any changes in risk
Should platforms fail to adhere to the DSA, the European Commission will have the power to issue fines of up to 6% of their global turnover. This is considerably higher than the penalties currently in place.
The DMA, meanwhile, states:
- That marketplaces must grant retailers access to data on their marketing or advertising performance on the platform
- That they must not show any preference to their own products and services over those of vendors using their platform (what is known as self-preferencing) or use data on vendors in order to gain a competitive advantage over them.
This law also carries significant penalties: any breach could result in the European Commission fining platforms up to 10% of their global turnover, rising to 20% for repeat offences.
From Regalia’s perspective, the first piece of good news is that our audit library is up and running and is connected to PEReN’s scraping library. This means that, within the space of a few days, we can run test campaigns that identify variables with an unexpected impact on the decisions taken by platforms. We have used it on food delivery companies and travel comparison websites.
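As an illustration only (this is not the actual Regalia or PEReN code; the variables, values and the mocked platform response are invented), a test campaign of this kind can be pictured as a one-variable-at-a-time scan: a reference query is replayed while a single input field is perturbed, and fields that move the platform’s decision despite not being declared as relevant are flagged.

```python
# Illustrative one-variable-at-a-time scan: a reference query is replayed while
# a single input field is changed, and fields that shift the decision despite
# not being declared as relevant are flagged. All names and values are invented.

def platform_decision(query: dict) -> float:
    """Hypothetical stand-in for a response scraped from a platform (a price)."""
    surcharge = 5.0 if query["device"] == "ios" else 0.0
    return 20.0 + 2.0 * query["distance_km"] + surcharge

baseline = {"distance_km": 3.0, "device": "android", "hour": 12}
perturbations = {
    "distance_km": [1.0, 5.0, 10.0],
    "device": ["android", "ios"],
    "hour": [8, 12, 20],
}
declared_relevant = {"distance_km"}   # what the platform says it uses

for variable, values in perturbations.items():
    decisions = [platform_decision({**baseline, variable: value}) for value in values]
    spread = max(decisions) - min(decisions)
    unexpected = variable not in declared_relevant and spread > 0
    print(f"{variable:12s} decision spread {spread:5.1f}"
          + ("   <-- unexpected impact" if unexpected else ""))
```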
To give you an idea of the difficulty of carrying out a black box audit while remaining frugal, let’s look at the example of searching for bias at an online travel agency. Say, for instance, that the regulator suspects (and this has happened before, so it’s a legitimate suspicion) that the travel agency isn’t respecting its commitments. The agency claims that the hotels it recommends are ranked in descending order of preference based on your reservation history. But imagine if, instead of your preferences, it was the commission each hotel pays the agency that was being used, or a combination of the two. This is something the agency would have been required to declare. If found guilty, it could face a penalty for misleading commercial practices.
But how can such practices be confirmed? One approach involves creating customer profiles (or personas) that differ considerably in their preferences, and having them click on or examine in detail very different types of hotels in order to express their interest. You would need to cast the net wide enough geographically to spot hotels that should not be preferred and yet appear to be highly recommended. Being frugal means restricting the queries to suspect locations or periods in order to demonstrate that a particular hotel is being favoured. This is a sampling problem in which you have to adapt to what you discover in real time while keeping the number of samples to a minimum. There has been a great deal of research in mathematics and AI on optimising such sampling, the aim being to gradually build a model of the agency’s recommender algorithm and, from it, a statistically sound argument demonstrating the unfair practice or self-preferencing.
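A very rough sketch of that adaptive, frugal loop is given below. The agency is simulated, and the segments, boost rates and thresholds are all invented; the statistics are kept deliberately simple, with a bandit-style (UCB) rule concentrating queries on the most suspect segments and a Hoeffding bound used to stop as soon as the evidence is comfortable.

```python
# Hedged sketch of frugal, adaptive sampling for a self-preferencing check.
# The agency is simulated; segments, rates and thresholds are all hypothetical.
import math
import random

random.seed(1)

SEGMENTS = ["paris-summer", "paris-winter", "nice-summer", "nice-winter"]
ALPHA = 0.05        # confidence level for the stopping rule
THRESHOLD = 0.5     # boost rate above which the ranking is considered suspect
BUDGET = 200        # total number of queries we allow ourselves

def observed_boost(segment: str) -> int:
    """1 if, on one simulated query, the suspect hotel outranks a hotel the
    persona clearly preferred; 0 otherwise. Stand-in for a real request."""
    true_rate = 0.75 if segment == "nice-summer" else 0.30   # hidden ground truth
    return 1 if random.random() < true_rate else 0

counts = {s: 0 for s in SEGMENTS}
hits = {s: 0 for s in SEGMENTS}

for t in range(1, BUDGET + 1):
    # Optimistic (UCB-style) choice: spend the next query on the segment whose
    # upper confidence bound on the boost rate is highest, i.e. the most suspect.
    def ucb(s: str) -> float:
        if counts[s] == 0:
            return float("inf")
        return hits[s] / counts[s] + math.sqrt(2 * math.log(t) / counts[s])

    segment = max(SEGMENTS, key=ucb)
    hits[segment] += observed_boost(segment)
    counts[segment] += 1

    n, k = counts[segment], hits[segment]
    # Stop early once a simple Hoeffding lower bound clears the threshold.
    if n >= 20 and k / n - math.sqrt(math.log(1 / ALPHA) / (2 * n)) > THRESHOLD:
        print(f"evidence of preferential ranking in '{segment}' "
              f"after {t} queries ({k}/{n} boosted results)")
        break
else:
    print("budget exhausted without conclusive evidence")
```

In a real audit the stopping rule would of course have to account for being checked repeatedly and for the platform evolving over time; the sketch only conveys the shape of the trade-off between frugality and statistical confidence.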
Our plan is to launch three PhDs with two Inria research teams working on these subjects. One started this year, supervised by Erwan Le Merrer from the University of Rennes Inria Centre, working with the WIDE project team on the black box auditability of the algorithms they study.
What projects and partnerships are being developed?
We’re keen to teach people about these subjects, and last month (October 2022) we staged a hackathon aimed at data science Master’s students, the “AI challenge”, which took the form of a Kaggle challenge. Eighty project teams audited the pricing algorithm of a fictional online travel agency, which we had deliberately biased. We are also getting ready to launch marketplace audits, this time working with students. They will play the role of the personas, adopting statistically distinguishable behaviour, which will allow us to observe the responses of recommender algorithms and detect any potential bias.
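For a flavour of the persona idea (the hotel categories, click rates and the mocked recommender below are all hypothetical), the principle is to make the browsing profiles statistically easy to tell apart and then to check whether the recommendations each persona receives actually track its declared interests:

```python
# Hypothetical sketch of the persona idea: browsing profiles are built to be
# statistically distinguishable, then we measure whether the recommendations
# each persona receives reflect its interests. The recommender is mocked.
import random
from collections import Counter

random.seed(4)

PERSONAS = {"budget": ["hostel", "guesthouse"], "luxury": ["palace", "resort"]}
ALL_CATEGORIES = [c for cats in PERSONAS.values() for c in cats]

def browse(persona: str, n_clicks: int = 50) -> list:
    """Clicks drawn overwhelmingly from the persona's preferred categories."""
    preferred = PERSONAS[persona]
    return [random.choice(preferred) if random.random() < 0.9
            else random.choice(ALL_CATEGORIES)
            for _ in range(n_clicks)]

def recommend(history: list, n: int = 10) -> list:
    """Mocked recommender under audit: it partly ignores the history and pushes
    'resort' listings (say, because they bring in more commission)."""
    favourite = Counter(history).most_common(1)[0][0]
    return [favourite if random.random() < 0.5 else "resort" for _ in range(n)]

for persona in PERSONAS:
    recommendations = recommend(browse(persona))
    match = sum(r in PERSONAS[persona] for r in recommendations) / len(recommendations)
    print(f"{persona:7s} persona: {match:.0%} of recommendations match its interests")
```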
Thanks to our scientific partnership with the PEReN team in Bercy, which is made up of around twenty data engineers and data scientists, we are able to interact with the regulatory authorities and focus on problems with significant implications for public policy. I cited the example of bias (of which self-preferencing is a specific case), which matters in commercial law but also has an impact on labour law. Some food delivery platforms, for instance, have been accused of favouring certain delivery riders over others when allocating jobs; in this case it is the “dispatching” algorithm that is thought to be at fault. Another issue that often crops up, particularly in the banking sector, is customer discrimination: are certain types of customers given offers or prices based on “sensitive” characteristics such as gender or perceived religious identity? If such bias could be detected systematically in black box algorithms, it would become possible to reduce the differential treatment that leaves certain sub-groups disadvantaged relative to others.
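As a toy illustration of what systematic detection could look like (the attributes, the decisions and the hidden gap are all made up), one can scan every candidate sensitive attribute in a set of observed decisions and flag the sub-groups whose treatment diverges:

```python
# Toy sketch of a systematic scan for differential treatment: for each candidate
# sensitive attribute, compare how often each sub-group obtains the favourable
# decision and flag large gaps. Data, attributes and rates are all invented.
import random

random.seed(3)

ATTRIBUTES = {"gender": ["F", "M"], "age_band": ["<30", "30-60", ">60"]}

def observed_decision(profile: dict) -> int:
    """Stand-in for a decision collected from a black-box system (1 = favourable)."""
    rate = 0.60
    if profile["gender"] == "F":        # hidden differential treatment to detect
        rate -= 0.15
    return 1 if random.random() < rate else 0

observations = []
for _ in range(5000):
    profile = {attr: random.choice(values) for attr, values in ATTRIBUTES.items()}
    observations.append((profile, observed_decision(profile)))

for attribute, values in ATTRIBUTES.items():
    rates = {}
    for value in values:
        decisions = [d for p, d in observations if p[attribute] == value]
        rates[value] = sum(decisions) / len(decisions)
    gap = max(rates.values()) - min(rates.values())
    flag = "   <-- investigate" if gap > 0.10 else ""
    formatted = ", ".join(f"{v}: {r:.2f}" for v, r in rates.items())
    print(f"{attribute}: {formatted} (gap {gap:.2f}){flag}")
```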
Lastly, in addition to the PhDs we have planned on subjects close to Regalia, we have entered into scientific partnerships with teams working on statistics and optimisation at Inria, the Toulouse Mathematics Institute and Côte d'Azur University. One example is the work carried out by Jean-Michel Loubes and his team at the Toulouse Mathematics Institute on minimal repairs to biased algorithms. Once bias has been identified, and if the company wants to repair its algorithm - the bias wasn’t intentional, after all - can the algorithm be modified, correcting the bias or at least keeping it to acceptable levels without affecting its performance too much? In certain cases, there are ways of carrying out such repairs while keeping the impact on performance to a minimum. More generally, we are looking beyond effective ways of detecting bias towards the possible certification and even repair of the algorithms being studied.
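To fix ideas, here is a deliberately naive, hypothetical sketch of such a post-processing repair - not the Toulouse team's actual method - in which per-group decision thresholds are moved just enough to equalise selection rates, and we then measure how few decisions change and what it costs in average score.

```python
# Naive, hypothetical sketch of a post-hoc repair: per-group thresholds are
# moved just enough to equalise selection rates, and we measure how small the
# change to the original decisions is and what it costs in average score.
import random

random.seed(2)

def biased_score(group: str) -> float:
    """Score whose distribution is shifted down for group B: the bias to repair."""
    return random.gauss(-0.8 if group == "B" else 0.0, 1.0)

population = [(g, biased_score(g)) for g in random.choices(["A", "B"], k=4000)]

def select(thresholds: dict) -> list:
    return [score >= thresholds[group] for group, score in population]

def selection_rates(decisions: list) -> dict:
    rates = {}
    for g in ("A", "B"):
        group_decisions = [d for (grp, _), d in zip(population, decisions) if grp == g]
        rates[g] = sum(group_decisions) / len(group_decisions)
    return rates

def mean_selected_score(decisions: list) -> float:
    selected = [s for (_, s), d in zip(population, decisions) if d]
    return sum(selected) / len(selected)

before = select({"A": 0.0, "B": 0.0})
target_rate = selection_rates(before)["A"]          # align B's rate on A's

# Repaired threshold for B: the (1 - target_rate) quantile of B's scores.
b_scores = sorted(score for group, score in population if group == "B")
repaired_b = b_scores[int((1 - target_rate) * len(b_scores))]
after = select({"A": 0.0, "B": repaired_b})

changed = sum(a != b for a, b in zip(before, after)) / len(population)
print("selection rates before:", selection_rates(before))
print("selection rates after: ", selection_rates(after))
print(f"decisions changed: {changed:.1%}, mean selected score: "
      f"{mean_selected_score(before):.2f} -> {mean_selected_score(after):.2f}")
```

The research question is precisely how to make such a repair minimal in a principled way, rather than by the ad hoc threshold tweaking used here, while keeping the impact on performance under control.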