How DNA and the NSA are slowly rendering hard drives irrelevant

February 19, 2015 Updated: April 23, 2016

It has been a rough week for the ol’ hard disk drive (HDD). Two unrelated but complementary developments have dealt a mortal blow to what we thought we knew about the ubiquitous magnetic drive. On February 15th, a study released by Switzerland’s ETH Zurich showed that information encoded onto strands of DNA can be written and retrieved without error. Techies rejoice! The era of infinite and durable storage is actually closer than you thought.

But at the same time, Moscow-based Kaspersky Labs uncovered malware of unparalleled complexity stemming from what it calls the “Equation” group, a group of unidentified hackers with roots in the early 2000s. For the first time, a malware module was able to infect the hard-coded operating system embedded in hard drives (the HDD’s firmware), making it virtually impossible to detect and remove. Techies lament! The perfect malware might have just been discovered.

From HDD to DNA?

The global amount of data has risen from 1.8 zettabytes in 2011 to an estimated 8 zettabytes in 2015. As a short reminder, there are one billion gigabytes in an exabyte and 1,000 exabytes in a zettabyte. At the same time, the durability of our storage media has barely budged in the past few decades: CDs and DVDs only have a shelf life of 25 years, and microfilm some 500 years. As a result, storing the compendium of human knowledge over the long run is becoming increasingly difficult.
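As a quick check on those units, the conversion can be written out. This is only a minimal sketch assuming decimal (SI) prefixes; the constant and function names are illustrative and not taken from any source.

```python
# Decimal (SI) storage units, as stated in the text:
# 1 exabyte = one billion gigabytes, 1 zettabyte = 1,000 exabytes.
GB_PER_EB = 10**9
EB_PER_ZB = 10**3


def zettabytes_to_gigabytes(zb):
    """Convert zettabytes to gigabytes using decimal prefixes."""
    return zb * EB_PER_ZB * GB_PER_EB


data_2011_zb = 1.8  # figure quoted in the article
data_2015_zb = 8.0  # estimate quoted in the article

growth = data_2015_zb / data_2011_zb
print(f"2015 estimate: {zettabytes_to_gigabytes(data_2015_zb):.1e} GB")
print(f"Growth 2011-2015: about {growth:.1f}x")
```

In other words, the 2015 estimate works out to about eight trillion gigabytes, a bit more than four times the 2011 figure.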

But thanks to the efforts of Robert Gross from ETH, the idea of coding information directly onto DNA just got one step closer to becoming reality. His team wrote the Swiss Federal Charter and the Archimedes Palimpsest onto a DNA sample, using the four letters of the genetic code (A, C, G, T) to represent the sequence of zeros and ones that make up the binary system underpinning digital information. The idea has been tried before: in 2013, British researchers encoded Shakespeare’s sonnets onto strands of synthetic DNA. Unfortunately, such data cannot always be retrieved error-free or stored over long periods, because of chemical changes occurring within the genetic material.
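To make the idea of mapping bits to genetic letters concrete, here is a minimal sketch. The two-bits-per-base table below is an assumption chosen for illustration, not the scheme used by the ETH team, and it includes none of the error correction a real system would need.

```python
# Hypothetical mapping: two bits per base. Real schemes add error
# correction on top of the raw mapping; this sketch does not.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of A/C/G/T, four bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Reverse encode(): every four bases become one byte again."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


message = "Swiss Federal Charter".encode("utf-8")
strand = encode(message)

# Round trip: decoding the strand must give back the original bytes.
assert decode(strand) == message
print(strand[:16])
```

With this mapping every byte becomes four bases, so the strand is four times as long as the message in bytes.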

Gross’ innovation was to treat the DNA as a fossil and simulate the chemical degradation that would occur over hundreds of years of storage. He encapsulated the strand in silica, stored it at a temperature between 60°C and 70°C for a week, and extrapolated the findings to show that, under proper conditions, information stored on DNA could endure for a whopping one million years and still be retrievable.

Since one gram of DNA can hold up to 455 exabytes of data, a teaspoon of that stuff could potentially hold the entirety of human knowledge. The only problem at this point is cost, as the 83 kilobytes of data stored by Gross carried a hefty price tag: $1,500. At this rate, storing the compressed 10GB of Wikipedia in DNA would cost roughly $190 million.
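The Wikipedia estimate can be reproduced with a back-of-the-envelope calculation. The sketch below assumes that cost scales linearly with data size and that the figures use binary units (1GB = 1,024 × 1,024 kilobytes); both are assumptions, and the variable names are illustrative.

```python
# Figures quoted in the article.
SAMPLE_KB = 83          # kilobytes stored in the experiment
SAMPLE_COST_USD = 1500  # reported cost of storing them
WIKIPEDIA_GB = 10       # compressed Wikipedia, as quoted

# Assumption: binary units and a cost that scales linearly with size.
KB_PER_GB = 1024 * 1024

cost_per_kb = SAMPLE_COST_USD / SAMPLE_KB
wikipedia_cost = cost_per_kb * WIKIPEDIA_GB * KB_PER_GB

print(f"Cost per kilobyte: ${cost_per_kb:.2f}")
print(f"Wikipedia estimate: ${wikipedia_cost / 1e6:.1f} million")
```

Under these assumptions the estimate comes to just under $190 million. With decimal units (one million kilobytes per gigabyte) the same arithmetic gives a figure closer to $180 million.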

It’s the firmware, stupid!

Figure: Kaspersky Labs overview of Equation targets

Remember Flame, Stuxnet or the now-infamous malware used to hack into Sony’s infrastructure? They all pale in comparison to the army of software packages intercepted by Kaspersky over the past ten months, all issued by the same “Equation” group. It was by pure chance that the cyberespionage operation was uncovered, as Kaspersky was able to sinkhole the communication between the implanted Trojans and their handlers (redirecting traffic away from the command-and-control server). Afterwards, the Russian team spent the better part of two weeks attempting to crack one cryptographic element, but it wasn’t until they crowdsourced the operation on Twitter on February 16th that they were able to decode the string.

According to Kaspersky, Equation is a threat actor that “surpasses anything known in terms of complexity and sophistication of techniques, and that has been active for almost two decades”, during which time it produced a series of hacking and surveillance tools. The security company tracked down the machines infected by Equation’s Trojans and found that the group had eavesdropped on computer networks in Iran, Russia, Pakistan, China and Afghanistan, leading to suspicions that the United States government was behind the operation.

So far, it sounds like just another case of cyberespionage. But the most insidious discovery was the implants’ capacity to infect the HDD’s firmware, giving them the ability to “resurrect forever”, even after a clean system install. “The best way to get rid of it is to physically destroy the hard drive,” Igor Soumenkov, principal security researcher at Kaspersky, told Mashable. Luckily, the high cost of developing this component makes it almost impossible for anyone reading this article to have been infected with this module: the type of hard drive reprogramming involved is a high-end engineering effort that requires months of development and millions in investment.

The wider implications, however, reach beyond any individual infection. For a long time, HDD firmware infections were considered the stuff of nightmares for security experts. Equation heralds an era in which malware that evades both detection and disinfection could emerge, a complete game changer for the industry. Even if that day is well into the future, it is clear that ensuring the total safety of data has just got harder.

Maybe DNA coding will provide a solution to the now-battered HDD?