22 December 2015

Digital technology is now the cornerstone of many of our economic and social activities. We rely on it for most of our communication needs, for managing industrial processes, for retail and commercial transactions, and also for our social and interpersonal interactions. This dependency is predicted to increase further with emerging areas such as the Internet of Things (IoT), Big Data, autonomous vehicles and transport systems, digital dexterity, etc. Information and Communication Technology (ICT) is at the heart of our Critical Infrastructures and as such has become a Critical Infrastructure component itself. Our dependence is such that most processes cannot be carried out once parts of the ICT infrastructure fail. Further, systems are interconnected, and the failure of one component can bring down most critical services within an entire area.

We have experienced this very poignantly here in Lancaster in the past few days. Due to the recent floods an electricity substation failed, which in turn brought down the entire digital communication infrastructure and the ICT systems that people and local businesses rely upon in their daily lives. Only the plain old telephone infrastructure remained operational. But because most households use cordless handsets, which depend on mains power, communication through it was in many cases not possible either. This incident demonstrated clearly how vital it is that our digital infrastructure is kept operational under all circumstances.

Cyber Security is one of the areas investigating how infrastructures can be kept safe and operational. However, within the Cyber Security domain this task is repeatedly described through terms such as “securing”, “protecting” and “defending”, often only in the context of single systems or even individual system parts. Frequently the analogy of a medieval castle is used to explain how to secure ICT systems. However, medieval castles and fortifications became obsolete. On the one hand, this was due to new weapons that made these defensive structures less effective. On the other hand, society as a whole changed and became more dynamic and mobile. Hence, the (reduced) safety provided by castles was outweighed by the benefits that a more open environment and society had to offer. In the post-castle era internal security has been provided through policing, whereas for external threats more dynamic and responsive defence strategies have been developed.

For today’s ICT-based infrastructures this means that systems have to be appropriately secured, but at the same time mechanisms have to be in place that dynamically detect and react to challenges caused by, for instance, cyber attacks, human or system failure, or natural disasters. This has to go hand in hand with ensuring that infrastructures are designed so that there are no single points of failure, and that they have sufficient redundancy, potential back-up capacity and the ability to selectively isolate or remove parts of the infrastructure. Thus, the elements constituting resilient infrastructures are architectural as well as managerial.
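To make the idea of a single point of failure concrete, the following is a minimal sketch in Python. It assumes the infrastructure can be modelled as an undirected graph of components and links; the topology and the names `network`, `exchange`, `mast_a` and so on are purely hypothetical and do not describe any real network. A component counts as a single point of failure here if removing it leaves the remaining components unable to reach one another.

```python
from collections import deque


def is_connected(graph, removed):
    """Return True if all nodes except `removed` can still reach each other."""
    nodes = [n for n in graph if n != removed]
    if not nodes:
        return True
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    # Breadth-first search that never passes through the removed node.
    while queue:
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour != removed and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return len(seen) == len(nodes)


def single_points_of_failure(graph):
    """List every node whose removal disconnects the rest of the graph."""
    return sorted(n for n in graph if not is_connected(graph, n))


# Hypothetical topology: every site reaches the others only via "exchange".
network = {
    "substation": {"exchange"},
    "exchange": {"substation", "mast_a", "mast_b"},
    "mast_a": {"exchange", "mast_b"},
    "mast_b": {"exchange", "mast_a"},
}

print(single_points_of_failure(network))  # ['exchange']
```

Adding one redundant link, for example between `substation` and `mast_a`, removes the single point of failure in this toy model. Real infrastructures also have one-directional dependencies, such as communication equipment depending on power, which this simple connectivity check does not capture.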

Anomaly detection and remediation mechanisms play a key role in discovering the onset of attacks early and in launching countermeasures. After the event, recovery has to take place. It should not just restore the original state, but where possible improve resilience by actively deducing the causes and trying to remove the respective vulnerabilities. Since there are interdependencies between infrastructures (as the recent events clearly showed), this needs to be carried out in a coordinated manner across different domains, systems, and Critical Infrastructures. Further, infrastructures have to become more adaptive through the use of self-learning and self-healing properties.

The benefits of infrastructure resilience investment can only be assessed by the damage it prevents and the opportunities it creates. In order to provide system and infrastructure resilience it will be necessary to review our Critical Infrastructures, assess their architectural and structural properties, and implement active resilience mechanisms. Important mechanisms are, for instance, anomaly detection (e.g. for the discovery of DDoS or spam attacks) and remediation (e.g. defending infrastructures by automatically removing malicious traffic and processes). Research has demonstrated the feasibility of the discussed concepts. The goal has to be to make our Critical Infrastructures more secure and, more importantly, to also make them more resilient and adaptive to all kinds of challenges, so that neither events such as the recent floods nor malicious attacks will result in the complete breakdown of our digital Critical Infrastructure within an entire region.
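As an illustration of the anomaly detection mentioned above, here is a minimal sketch in Python. It is not the method used in the research referred to; it simply assumes that traffic is observed as a series of request counts per second and flags any sample that lies far above the average of the preceding samples. The function name, the window size, the threshold and the sample data are all assumptions made for this example.

```python
from statistics import mean, pstdev


def detect_anomalies(samples, window=5, threshold=3.0, min_sigma=1.0):
    """Return the indices of samples that spike far above recent history.

    A sample is flagged when it exceeds the mean of the previous `window`
    samples by more than `threshold` standard deviations. `min_sigma` is a
    floor on the standard deviation so that perfectly flat history does not
    cause a division by zero.
    """
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = max(pstdev(history), min_sigma)
        z_score = (samples[i] - mu) / sigma
        # Only upward spikes are flagged, as a traffic flood would cause.
        if z_score > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical requests-per-second measurements with one sudden burst.
requests_per_second = [100, 102, 98, 101, 99, 100, 950, 103]

print(detect_anomalies(requests_per_second))  # [6]
```

In a deployed system a flagged index would trigger a remediation step such as filtering or rate-limiting the offending traffic. A practical detector would also need to keep flagged samples out of its own history, since the burst at index 6 distorts the average used for the samples that follow it.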

With thanks to Dr Andreas Mauthe, Reader in Networked Systems, Lancaster University