In 2016, a British hacker hit the headlines by attacking the routers of the German operator Deutsche Telekom en masse, exploiting a previously undetected security vulnerability. In doing so, he knocked over 1.25 million Internet users offline and caused millions of euros of damage. Just 24 hours before this memorable cyberattack, researchers at the High Security Computing Laboratory (LHS), a joint Inria and LORIA platform with which Jérôme François collaborates closely, had identified an unusual spike in Internet dataflows. They knew there was a “problem” and that “something” was about to happen… but what? Where? And how? That was the mystery! “The ThreatPredict project is aiming to develop the tools required to make such predictions with sufficient accuracy to warn the targets and give them the opportunity to protect themselves, or at least to make preparations,” explains Jérôme François.
The RESIST team’s researchers, with the support of the LHS platform, are using a highly elaborate system to identify and analyse Internet dataflows. They operate over 4,000 IP addresses (the equivalent of postal addresses for computers or connected objects) which do not correspond to any physical machines. These IP addresses are nevertheless reachable from the entire Internet and therefore receive the same information flows and undergo the same attacks as any real machine. More specifically, each of these addresses is subject to between 8,000 and 10,000 cyberattacks per day. Storms are always brewing on the Internet!
The scientists can observe the information flows received by these 4,000 IP addresses and in this way gauge the volume of data being exchanged. A large-scale cyberattack is accompanied by a significant increase in the amount of data exchanged on the Internet, and it is this increase that tells the scientists that “something is about to happen”. They also retrieve the information sent to these “phantom” addresses for analysis. This enables them to refine their understanding of the mechanisms behind cyberattacks and gives them hope of being able to identify reliable prediction markers.
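To give a rough idea of what spotting such a spike can involve, here is a minimal sketch in Python. It is an illustration only and not the method used on the LHS platform: the function name `find_spikes`, the 24-hour window, the threshold of three standard deviations and the sample figures are all assumptions made for the example.

```python
from statistics import mean, stdev


def find_spikes(hourly_counts, window=24, threshold=3.0):
    """Return the indices of hours whose traffic count is unusually high.

    Hypothetical helper: an hour is flagged when its count exceeds the mean
    of the preceding `window` hours by more than `threshold` standard
    deviations of that same history.
    """
    spikes = []
    for i in range(window, len(hourly_counts)):
        history = hourly_counts[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: any increase at all stands out.
            if hourly_counts[i] > mu:
                spikes.append(i)
            continue
        if (hourly_counts[i] - mu) / sigma > threshold:
            spikes.append(i)
    return spikes


# Made-up hourly packet counts for one "phantom" address:
# 24 quiet hours, then a sudden surge at index 25.
counts = [100, 102, 98, 101, 99, 100, 103, 97] * 3 + [100, 400, 101]
print(find_spikes(counts))
```

Comparing each hour with its own recent history, rather than with a fixed limit, is one simple way to cope with the constant background noise described above. A real system would also have to handle daily and weekly rhythms, which this sketch ignores.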
These thousands of attacks may target a country, a region or a specific service, as in the attack on Deutsche Telekom. At present, researchers can narrow their predictions down to a geographical area or to businesses in a particular sector, but they cannot yet pinpoint a specific victim, machine or person. Similarly, the timing of attacks remains very hard to anticipate based on dataflow analyses alone.
The originality of the ThreatPredict project lies in combining these technical data with the real-time analysis of sentiment, opinions and emotions expressed on Twitter, and with contextual data such as sporting or geopolitical events or the publication of corporate economic data. This more qualitative and societal information enables the predictions to be refined and linked to social or political demands which may be expressed on the Internet in the form of cyberattacks. “Cyberattacks generally coincide with major national or international events, such as the Olympic Games and elections, and the ability to predict them accurately would be useful,” explains Ghita Mezzour, whose team has incorporated sentiment analysis of Twitter posts into its predictions.
Between now and the end of the ThreatPredict project in 2020, the scientists are therefore aiming to combine the quantitative and qualitative data derived from their analyses and use them to create and test a reliable and precise cyberattack prediction tool. “We must be absolutely sure of our predictions in order to avoid false alarms which could harm our credibility,” concludes Jérôme François. Indeed, what applies to weather forecasting also applies to this field: it is best to avoid announcing bad weather unless it really is on the way…
Although cyberattacks cannot yet be predicted with precision, the researchers on the RESIST project team have nonetheless made a number of useful tools available to the general public and experts: