BACK IN TIME: harnessing cryptography, history and AI to decipher manuscripts

Changed on 31/10/2024

The BACK IN TIME project has an ambitious aim: to uncover the secrets of documents that have gone undeciphered for centuries by combining cryptography, history and artificial intelligence. While the long-term objective is to develop automated tools for decryption and transcription, this exploratory action should also help to lay the groundwork for collaboration between three disciplines which, on the surface, might not have much in common, but have a lot to share.

Letter written by Charles Quint to Jean de Saint-Mauris. © Jean-Christophe Verhaegen / AFP

BACK IN TIME: Baliser l’Analyse CryptographiK de lettres anciennes grâce à l’INtelligence artificielle : Technique et IMplémentation Exploratoire (Delineating the cryptographic analysis of ancient texts using artificial intelligence: exploratory implementation and methods). This is the name given to one of 72 exploratory actions launched by Inria. Although the acronym makes the aims of the project clear, the ancient texts it studies are anything but: they were intentionally encrypted so that only a chosen few would be able to understand the sensitive information contained within. Deciphering them decades or even centuries later is therefore no easy task. Cécile Pierrot, research fellow with the Caramba (*) project team, found this out in 2022 with a letter written by Charles Quint. Realising that cryptography tools alone would not be up to the task, she called upon the expertise of Camille Desenclos, lecturer in Modern History at Picardie Jules-Verne University and a specialist in 16th and 17th century cryptography. By pooling their respective skill sets, they were able to break a code that had remained a secret for close to five hundred years. Buoyed by this success and excited about what they could achieve together, the cryptographer and the historian decided to prolong the experiment and try to design a tool that would make it easier to decipher encrypted documents. And so the BACK IN TIME project was born, with Cécile Pierrot as its scientific lead.

Cryptography, a millennia-old practice

Because cryptography is as old as writing itself, a significant number of encrypted documents have yet to be deciphered. “The earliest encrypted document to have been discovered is a sort of protection against industrial piracy”, explains Cécile Pierrot. “These are clay tablets produced in Mesopotamia around 1500 BCE by an artisan seeking to protect a recipe for a varnish he used when making his pottery.” Although there are potentially as many methods of encryption as there are writers, the basic principle tends to involve homophonic substitution, whereby each letter of the alphabet in the original text is replaced by one or more letters, figures or symbols. Since very few ciphers have survived, the difficulty for anyone seeking to decrypt such texts lies in identifying these correspondences and understanding the logic behind them. This requires a qualitative and quantitative classification of the symbols used. “When you have two or three letters each running to a few pages it is possible to transcribe them manually, but when you have hundreds you need a tool to automate the process.” And so Cécile Pierrot reached out to Thibault Clérice, a member of the ALMAnaCH project team whose research combines natural language processing and computational humanities. Where the Charles Quint project centred on a breakthrough partnership between computer science and history, BACK IN TIME takes things a step further by introducing a third partner: artificial intelligence.
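The homophonic substitution and symbol counting described above can be illustrated with a minimal Python sketch. The key table, the two-digit symbols and the function names are all hypothetical, chosen purely for illustration; they are not taken from the Charles Quint letter or from the BACK IN TIME prototype.

```python
import random
from collections import Counter

# Hypothetical toy key: each plaintext letter maps to one or more cipher symbols.
# Here the more common letters are given several interchangeable symbols (homophones).
TOY_KEY = {
    "a": ["12", "47"],
    "e": ["03", "58", "91"],
    "l": ["20"],
    "r": ["33"],
    "t": ["64", "75"],
}


def encrypt(plaintext, key, rng):
    """Replace each letter with one of its homophones, chosen at random."""
    return [rng.choice(key[ch]) for ch in plaintext]


def invert_key(key):
    """Build the symbol -> letter table used for decryption."""
    return {symbol: letter for letter, symbols in key.items() for symbol in symbols}


def decrypt(symbols, key):
    """Map every cipher symbol back to its plaintext letter."""
    reverse = invert_key(key)
    return "".join(reverse[s] for s in symbols)


def symbol_counts(symbols):
    """Quantitative step: count how often each cipher symbol occurs."""
    return Counter(symbols)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
ciphertext = encrypt("letter", TOY_KEY, rng)
print(ciphertext)
print(decrypt(ciphertext, TOY_KEY))
print(symbol_counts(ciphertext).most_common())
```

In this sketch the key is known, so decryption is a simple lookup. For a historical letter whose key has been lost, counts like these are one possible starting point for working out which symbols stand for the same letter.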

Learning to understand each other

Before getting to grips with the technical challenges, including devising a way of enabling AI to recognise unknown graphic elements, the team will first have to set the ground rules for their collaboration, which is the whole point of the exploratory action. This would be straightforward if each of the three partners handled their own tasks separately, based on their expertise, but this isn't the case. Instead, the collaboration is centred around permanent interaction between the three teams, which feed off each other in a virtuous circle. In this way, the historical knowledge that gives clues as to the angle of attack (decryption) is expanded upon through identification of the protection method used (encryption). “As cryptographers, we regularly use algorithms, meaning we are quite close to our colleagues from AI, but combining computer science and history is uncommon as we don’t work in the same labs, or even on the same campuses, and it is rare for us to bump into each other by chance for a discussion”, says Cécile Pierrot. But although such a collaboration might be difficult, it is certainly not impossible. And, as Camille Desenclos attests, it can certainly be fruitful: “If you want to explain your way of working to someone from a different field of research, you need to adapt how you put things and that can help to reveal elements you hadn't thought of.”

An innovative and exploratory approach

This approach is very much in keeping with the Objectives and Performance Contract for the Inria Ambition 2023 project, which encourages the creation of new project teams to tackle innovative research topics, incorporating scientific risk-taking and interdisciplinarity in order to meet the major social challenges surrounding digital sovereignty. Thanks to support from Inria, the BACK IN TIME team will have the services of two engineers until 2026.

Thus far, an initial prototype has been developed and the results have been enough to convince Thibault Clérice that it will be possible in the medium term to develop “a tool capable of selecting the right cipher to use, whether this is for recognising characters or decryption.” BACK IN TIME plans to develop this tool under a free licence. “The potential benefits are huge. This exploratory action gives us an opportunity to see what can be done and how, before then moving on to something bigger”, says Camille Desenclos. “From an AI development perspective, if we can find a way of handling these sorts of documents then this will break down existing barriers for certain languages for which no graphic system is available digitally. Examples include Mayan inscriptions that have not yet been added to Unicode, which is an issue for anyone looking to transcribe documents containing such writing”, adds Thibault Clérice. “We have already identified researchers from disciplines that are a bit further removed from our own, including linguists, who may be able to contribute to our research, potentially at a European level”, explains Cécile Pierrot. “That will be the next project.”

In the meantime, the three partners involved in BACK IN TIME have already scored a first victory: they have brought together fields of scientific expertise which wouldn’t otherwise have come into contact with one another, and they have found the keys that will enable them to decode one another's languages.

(*) The project team Caramba is a joint undertaking involving the University of Lorraine Inria Centre, the CNRS and the University of Lorraine, within the Loria laboratory.

The experts

Cécile Pierrot

After studying for a PhD at the Computer Science Lab at Sorbonne University-Pierre and Marie Curie Campus, and completing a postdoctoral placement at the Mathematical Institute of the University of Oxford and the Centrum Wiskunde & Informatica in Amsterdam, Cécile Pierrot joined the University of Lorraine Inria Centre in 2018 and studies modern cryptography as part of the Caramba project team, where she is a research fellow. A winner of the K2 “Cybersecurity” trophy in 2017, she teaches at the École des Mines Nancy and Télécom Nancy.

Camille Desenclos

After completing her PhD at the École Nationale des Chartes, Camille Desenclos was appointed as a lecturer in Modern History at Haute-Alsace University in 2015, before moving to Picardie Jules-Verne University in 2020, where she is joint head of the History, Civilisations and Heritage Master’s programme. A specialist in the 16th and 17th centuries, her areas of study are the history of cryptography, the institutions and practices of diplomacy and the relations between France and the Holy Roman Empire.

Thibault Clérice

Holder of a PhD in classics from the University of Lyon 3, Thibault Clérice was an engineer at King’s College London, head of the Master’s programme at the École Nationale des Chartes and a Junior Fellow in AI applied to the human and social sciences at Paris Sciences Lettres, before joining the ALMAnaCH project team at the Inria Paris centre in 2023. He is also the founder and co-editor of HTR United, an international catalogue for training AI to recognise handwritten text.
