A corpus with hundreds of thousands of rulings to analyse
Is the law applied equally to all? How does the Court of Cassation, the highest court in the French legal system, monitor the application of the law by lower courts? Identifying divergences in the application of the law is a Herculean task for the Court, since the mass of data grows with every judicial decision. Among the decisions rendered by its six chambers, the Court must identify contradictory interpretations of the same legal issue or law. At present, this long and tedious task is carried out manually by the Court’s lawyers. It requires both solid expertise in legal analysis and thorough knowledge of the law and case law. Until now, it has relied exclusively on human expertise (Court of Cassation judges and officers, in addition to academic analyses in legal journals). Comprehensive detection of divergences would require comparing hundreds of thousands of court decisions, an impossible task without a considerable increase in human resources. In the context of the digital transformation of administrative organisations, artificial intelligence offers solutions to facilitate such tasks.
Related rulings and “divergences”, training AI in association with the experts
Designed to facilitate the work of public service agents, the Lab IA enables public institutions to explore the possibilities offered by digital science, in collaboration with Inria researchers. Ioana Manolescu, scientific director of the Lab IA, matches scientific experts with the projects selected. For this particular project, Benoît Sagot and Rachel Bawden, experts in natural language processing (NLP), and engineer Thibault Charmet, all from the ALMAnaCH project-team, took up the challenge. Their aim was twofold: to develop tools to assist lawyers in their tasks and to enable enhanced detection of divergences. To this end, they tackled the crucial task of identifying pairs of similar (or related) documents, which could then reveal divergences. The trio also worked closely with the legal experts and data scientists of the Court of Cassation. The identification of similar legal decisions can be automated once we know how to automatically measure the degree of relatedness of two rulings. To do this, the researchers developed a predictive tool for titling, based on court briefs (titling is the task of assigning a sequence of labels, known as “titles”, to a ruling). They attributed titles to untitled decisions, and provided additional titling for all decisions, with the hypothesis that this would simplify the identification of related rulings. To produce these titles automatically, they modelled title prediction from court briefs as a machine translation task.
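To make the idea of measuring relatedness through titles more concrete, here is a minimal sketch in Python. It is not the team's method: the article does not say which similarity measure was used. The sketch assumes that each ruling carries a sequence of title labels, ordered from general to specific, and compares two rulings by the overlap of their labels (Jaccard index) and by the length of their shared prefix. The function names and the example labels are invented for illustration.

```python
from typing import Sequence


def shared_prefix_depth(a: Sequence[str], b: Sequence[str]) -> int:
    """Count how many leading title labels two rulings have in common."""
    depth = 0
    for label_a, label_b in zip(a, b):
        if label_a != label_b:
            break
        depth += 1
    return depth


def title_jaccard(a: Sequence[str], b: Sequence[str]) -> float:
    """Jaccard overlap of two label sets, in [0, 1] (an assumed measure)."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0  # two untitled rulings give no evidence of relatedness
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical title sequences, not taken from real rulings.
ruling_1 = ["employment contract", "dismissal", "cause"]
ruling_2 = ["employment contract", "dismissal", "notice period"]

print(shared_prefix_depth(ruling_1, ruling_2))  # 2
print(title_jaccard(ruling_1, ruling_2))        # 0.5
```

Under these assumptions, pairs of rulings with a high score would be candidates for expert review, since closely related rulings are where divergences are likely to surface.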
According to Rachel Bawden, “validating the adaptation of machine translation technology to other data types and to tasks corresponding to the Court’s requirements was really interesting and opens up the potential for future projects in other fields such as finance or biomedicine.”
Rachel Bawden has been a researcher at Inria since 2020 within the ALMAnaCH project-team. An expert in NLP and more specifically machine translation, her research focuses mainly on the integration of context (linguistic and non-linguistic), methods for low-resource languages and scenarios (i.e., for which only small quantities of data are available) and evaluation. She holds a springboard chair at the PRAIRIE Institute.
Lastly, to assess the relevance of their method, the scientists asked Court experts to carry out the same task as their similarity prediction algorithm. The scientists and lawyers first worked together to draw up guidelines for similarity levels. “Working alongside Court of Cassation experts required an adaptation process to allow us to understand each other and establish a common language,” Benoît Sagot explains. The experiments then carried out demonstrated not only that the automated approach produces results similar to experts’ judgements of similarity, but also that the additional (automatically produced) titles strengthen these similarities.
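One common way to check whether automatic similarity scores agree with expert judgements is a rank correlation. The sketch below computes Spearman's coefficient in plain Python. The article does not state which agreement measure the team used, so the choice of Spearman is an assumption, and the scores and ratings shown are toy values, not results from the project.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy values: model scores and expert similarity levels for four ruling pairs.
model_scores = [0.10, 0.35, 0.60, 0.90]
expert_levels = [0, 1, 1, 3]
rho = spearman(model_scores, expert_levels)
```

A coefficient close to 1 would mean that the algorithm orders pairs of rulings in much the same way as the experts do, even if the two scales differ.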
Benoît Sagot, head of the ALMAnaCH project-team since 2017, is a senior Inria researcher in Natural Language Processing (NLP) and Computational Linguistics. He has led several national and international projects and holds a chair in the PRAIRIE Institute devoted to research in artificial intelligence. He is also the co-founder of two start-ups, where he provides his expertise in NLP and text mining for the automatic analysis of employee survey results.
Cutting-edge models transferable to other fields for the digital transformation of the State
The results produced by this collaboration open the way for further improvements. The team are pursuing their work with the Court of Cassation to finalise semi-automated models aimed at facilitating the drafting of titles by experts. These final models are to be integrated into the Court’s workflows in the near future. As for the Lab IA, the ALMAnaCH project-team are ready to repeat the experience with other public bodies. Several fields with arduous and repetitive tasks, such as archiving or notarial deed searches, could benefit from automation. What is the aim? To encourage the digital transformation of the State and enable citizens to benefit from high-quality services.