Thanks to a lively discussion one evening in a pub, two scientists have come up with a way to store any type of data, from PDFs to MP3 files, in strands of DNA. The new method needs no electricity for storage and can preserve information for thousands of years. It mirrors what DNA already does in nature, where it has preserved the genetic information of long-dead species for tens of thousands of years.
“We already know that DNA is a robust way to store information, because we can extract it from wooly mammoth bones—which date back tens of thousands of years—and make sense of it,” said Nick Goldman, one of the scientists who made the discovery. “It’s also incredibly small, dense, and does not need any power for storage, so shipping and keeping it is easy.”
The two scientists, Nick Goldman and Ewen Birney, both of the European Bioinformatics Institute, admit that the idea came about entirely by accident, over a few beers in a pub.
“We realized that DNA itself is a really efficient way of storing information. So over a second beer, we started to write on napkins and sketch out some details of how that might be made to work,” Goldman says.
The method works by converting the 0s and 1s of binary computer code into the letters of the genetic code, A, C, G, and T, which stand for the four nucleotides. Combinations of those four letters can encode just about anything, from a Word document to the complete genetic blueprint of a human being, which requires “just” 3 billion of those letters arranged in a specific order and packed compactly into the cells of our bodies.
To start with something simpler than a human genome, the scientists picked Shakespeare’s sonnets as a PDF file and Martin Luther King’s speech “I Have a Dream” in MP3 format and sent them to the labs of Agilent Technologies, a biotech company, where the files were synthesized into strands of DNA.
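To make the idea of turning bits into genetic letters concrete, here is a minimal Python sketch. It assumes the simplest possible mapping, two bits per nucleotide; the mapping table and the function names `encode` and `decode` are illustrative assumptions, not the cipher Goldman and Birney actually used.

```python
# Hypothetical mapping for illustration only: two bits per nucleotide.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn raw bytes into a string of A/C/G/T letters."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Run the mapping backward to recover the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)           # CAGACGGC
print(decode(strand))   # b'Hi'
```

With this toy mapping every byte becomes four letters, so a one-megabyte file would need roughly four million nucleotides before any error protection is added.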
“We downloaded the files from the Web and used them to synthesize hundreds of thousands of pieces of DNA–the result looks like a tiny piece of dust,” explained Emily Leproust of Agilent Technologies.
“My first reaction was that they hadn’t done it properly, because they sent me these little tiny test tubes that were quite clearly empty,” Goldman says.
But to their surprise, the DNA was there, as tiny specks at the bottom of the test tube. Goldman then sequenced the DNA and ran the cipher backward, recovering both files with 100 percent accuracy.
“We’ve created a code that’s error tolerant using a molecular form we know will last in the right conditions for 10,000 years, or possibly longer,” said Goldman. “As long as someone knows what the code is, you will be able to read it back if you have a machine that can read DNA.”
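One common way to make stored data tolerant of errors is to keep several copies and take a majority vote when reading them back. The Python sketch below illustrates that general idea only; the function name `majority_vote` and the sample reads are made up for the example, and the article does not say that this is how Goldman's code handles errors.

```python
from collections import Counter


def majority_vote(reads: list[str]) -> str:
    """Return the consensus strand: the most common letter at each position.

    Assumes all reads have the same length (an assumption of this sketch).
    """
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*reads)
    )


# Three hypothetical reads of the same strand; the second has one wrong letter.
reads = ["CAGACGGC", "CAGTCGGC", "CAGACGGC"]
print(majority_vote(reads))  # CAGACGGC
```

As long as most copies agree at each position, a single misread letter does not corrupt the recovered data.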
“The data we’re being asked to be guardians of is growing exponentially,” said Goldman. “But our budgets are not growing exponentially.”
At the moment, because the technology is still new and experimental, storing data in DNA is very expensive: around $12,400 per megabyte. That is roughly a million times the cost of storing data on magnetic tape or hard drives. It must be noted, however, that magnetic tapes and hard drives degrade and have to be replaced every few years, while a DNA strand could stay intact for many thousands of years.
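The arithmetic behind those figures is easy to check. The short Python sketch below uses only the two numbers quoted in the article; the 5-megabyte file size is a hypothetical example, and the conventional storage cost is merely what the "million times" ratio implies, not a measured price.

```python
DNA_COST_PER_MB = 12_400   # US dollars per megabyte, as quoted in the article
COST_RATIO = 1_000_000     # "around a million times", as quoted in the article


def dna_cost(megabytes: float) -> float:
    """Estimated cost in dollars of storing the given amount of data in DNA."""
    return megabytes * DNA_COST_PER_MB


# Cost per megabyte of conventional storage implied by the quoted ratio.
conventional_cost_per_mb = DNA_COST_PER_MB / COST_RATIO

print(dna_cost(5))                  # hypothetical 5 MB file: 62000 dollars
print(conventional_cost_per_mb)     # about 0.0124 dollars per megabyte
```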
“There’s no problem with holding a lot of information in DNA,” Goldman says. “The problem is paying for doing that. It’s an unthinkably large amount of money … at the moment.”
Another problem with the method is that it takes a long time to read back the data from DNA into a computer. It took Goldman and Birney about two weeks to decipher the two files they had recorded into DNA strands.
At the current turnaround of two weeks, DNA storage is best suited to archiving information that does not need to be retrieved quickly.
As the technology evolves, however, reading the data back could be cut to about a day by adding more sequencing machines.
With public and private cloud storage growing, hosted server space has become vital to meeting the industry's demand to store and retrieve data on the go. If DNA storage makes a breakthrough and becomes faster and cheaper for both storing and retrieving data, it could become a major part of that industry.
The implications would be enormous for companies like Google, which maintain very large server farms in many locations across the globe. Server farms are very expensive to run, as they require a great deal of hardware, electricity, and labor.
If DNA storage evolves into an affordable and fast storage and retrieval system, Google could in principle store all of the world’s data in a device the size of a granola bar, and would not even need a power supply for it. That would cut costs substantially and benefit the environment, since the storage itself would need no electricity and produce almost no CO2 emissions. The stored data could also be kept for thousands of years.
“You can drop [the DNA strand] wherever you want, in the desert or your backyard, and it will be there 400,000 years later,” said George Church, a Harvard University scientist who is also involved in experimental research on DNA storage.