The following post is adapted from Jose’s byline in IoT Agenda, published in June 2017.
Earlier this year, we took a look at how the Internet of Things (IoT) industry is responding to the bevy of insecure devices being released into the market. While insecure IoT devices still pose a sizable threat, manufacturers are taking pains to address security holes, like the Chinese manufacturer that opted to recall some of its products to fix insecure configurations. The issue has gained the attention of governments as well, with U.S. Senators Mark Warner and Cory Gardner introducing a bill that would require IoT devices purchased by the federal government to meet baseline security standards.
Prior to introducing that legislation, Senator Warner wrote an op-ed piece outlining goals for addressing the cybersecurity gaps highlighted by the WannaCry ransomware attack in the spring of 2017. Senator Warner’s first point — that we need technology to provide capabilities to tackle this challenge — will be my focus for this post. I’ll also share some research objectives that industry and academia must address before we can begin solving the security issues with IoT.
A typical Linux- or Windows-based server or laptop has an expected lifetime of three to five years, and an operating system designed around that update cycle. Hardware refresh cycles mean that you not only get the latest advantages of speed and capacity, but you can update your software as well. While we extol the virtues of mainframes and various systems that have seen a decade or more of active service, those are discussed precisely because they are the exception, and they have teams of dedicated operators maintaining them. Contrast this with a typical IoT vision: thousands of low-cost devices (e.g., ZigBee network devices) embedded into the fabric of the world, scattered about, taking measurements or making adjustments in response to conditions. If you have to staff this for maintenance, or even just for replacement, your costs skyrocket, because a team can only manage so many devices at a time. This model won’t work for IoT.
The recent WannaCry ransomware highlights this gap eloquently: Microsoft had released a patch for the vulnerability exploited by EternalBlue two months earlier, yet many systems remained unpatched and were affected, including IoT devices in Europe and elsewhere. The ransom message with its red background was a stark indicator of the problem, visible at some ATMs, transit stations, and elsewhere.
IoT reliability over time affects not only security but also safety. To manage this risk, software reliability is of paramount concern, in the short term and the long term alike. These risks center on a number of aspects that require sustained research investment. UK security researcher Ross Anderson has delved deep into these topics in some of his recent research.
Patching is a mess
Patching is the security equivalent of washing your hands constantly during cold and flu season: it helps, but the analogy glosses over how disruptive patching remains for most organizations. First, the topic of reliable software updates must be tackled head on. At present, patching software incurs downtime and risks reliability. Large enterprise firms generally patch their managed fleets only at controlled, scheduled intervals and after rigorous testing; many of these firms invoke out-of-cycle updates only in emergencies such as active exploitation. As a friend recently put it, not patching immediately for every update quickly becomes a rational act: end users and administrators estimate the risk of service disruption from patching to be higher than that of a cybersecurity incident. Patches are then left to be applied in bulk at sporadic intervals, when staffing and attention can be brought to bear during the inevitable disruption and downtime. A small but growing number of applications, such as Google Chrome, patch silently in the background to address this, but that number is too small at present.
There’s been a big push in the past five years or so in formal verification of software, with possible benefits for reducing the risk introduced by patching. A formally verified system, with well-understood behaviors, can be used to detect the introduction of unspecified behaviors that lead to unreliability and its associated risk. However, very little of this work has focused on applying these methods to existing software, and most formally verified systems have remained in the academic sphere. Among the most visible industrial successes is Amazon Web Services, which uses the TLA+ specification language and its model checker to design systems, with significant reliability gains reported. These breakthroughs demonstrate that verification is possible and the rewards can be obtained, but they also illustrate how complex the process is. Only a small amount of research is working to bring these formal methods into existing software engineering; it’s easily a decade away, meaning starting sooner has real benefits.
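To give a flavor of what machine-checked verification looks like, here is a toy sketch in Lean 4; the function and theorem names are mine and purely illustrative, vastly simpler than anything in the systems mentioned above. We state a property of a function, and the proof checker refuses to accept the file unless the proof is valid.

```lean
-- Toy example: a tiny function plus a machine-checked property.
def inc (n : Nat) : Nat := n + 1

-- Lean's kernel verifies this proof. If a later "patch" changed `inc`
-- in a way that broke the property, the file would fail to check,
-- flagging the unspecified behavior before deployment.
theorem inc_strictly_increases (n : Nat) : n < inc n :=
  Nat.lt_succ_self n
```

The point is not the arithmetic but the workflow: properties travel with the code, so a patch that violates the specification is caught mechanically rather than in production.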
On the topic of patching without downtime, only a little research has gone into dynamic software updates (DSU), but more must occur, and it must target existing codebases. DSU allows a running system to be patched in place, guided by the program’s semantics. The Ginseng compiler, for example, attempts this for C code and has been applied to OpenSSH, Apache, and other real-world codebases in the lab. A similar project, the Kitsune compiler, may also prove useful. Likewise, the Linux kernel live-patching tools kpatch and Ksplice should be a focus of renewed, sustained effort given the growing popularity of Linux-based IoT devices.
While IoT devices provide the most recent motivation for this challenge, the fact is that patching remains a mess. A plethora of vendors will tell you when you should patch, and some can help you track patch status, but the complex systems shipped by application and OS vendors still incur significant downtime and risk when patched. Only when this reality is addressed will frequent and timely patching become widespread.
End of life spells doom
The topic of long-term ownership and maintenance must also be addressed. At present, when a vendor goes out of business or marks a product end of life, end users are stuck with whatever they have at that time. Consider the shutdown of the Revolv smart home hub after its acquisition by Nest, then scale the scenario to a giant with a massive installed base like GE or Siemens: how might a shutdown affect, say, driver-assistance systems in cars? Existing laws and regulations pose hurdles to anyone assuming a maintenance role outside of narrow circumstances, and the topic raises all sorts of questions. What about liability? Given a compelling business model, someone may pick up the assets, but absent a profit center, why would they? Can device owners begin patching themselves? If so, can they obtain source code, signing keys, or the like? These questions need answers; be on the lookout for policy debates in the coming years.
The path ahead looks fraught with hurdles, and, to be fair, it will be a challenge. But the results of recent research into these long-term software maintenance topics, and the prototype solutions it has produced, indicate that sustained software maintenance may now be possible. And, for the reasons outlined above, the benefits are ever more pressing and will help not only the growing IoT market, but also the existing, traditional tech market. I encourage those with the responsibility of setting the research agenda to make this their focus, in both technology and policy development.