DNA for Data Storage: Storing all the World’s Information

When you want to store physical objects away for the future, the attic tends to be the go-to place. At some point, however, it becomes cramped, crowded and disorganized. Your computer, and every other electronic device with a storage component, can get even messier than your attic. A one-page paper is small and easy to manage, but imagine all the files, documents, pictures, videos and applications that pile up until your computer’s storage is full. Then the computer starts running slowly, and you need a USB drive, an external disk or an online service such as cloud storage to back up and save everything. But with approximately 7.7 billion people currently on Earth, and almost two centuries since the first computers were designed, how are we supposed to manage ALL of it?

Scientists have proposed DNA as a potential medium for long-term data storage. Its density, ease of replication and stability make it a highly desirable candidate for simpler storage and retrieval. Current archival storage relies on media such as magnetic tape to hold zettabytes of data, but given the enormous amount of data produced every day, storage is expected to consume all the world’s microchip-grade silicon by the year 2040, so the current infrastructure does not look sustainable. Researchers can use the four DNA bases, Adenine (A), Thymine (T), Cytosine (C) and Guanine (G), as an alphabet for encoding information. Assigning the binary digits 0 and 1 to the nucleotides lets a file’s bits be translated into a nucleotide sequence for writing and translated back into bits when the DNA is read.
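The idea of pairing binary digits with nucleotides can be sketched in a few lines of Python. This is only an illustration: the particular mapping (00 to A, 01 to C, 10 to G, 11 to T) and the function names `encode` and `decode` are assumptions made for this example, not the scheme used by any specific research group.

```python
# Hypothetical 2-bit mapping chosen for illustration; other assignments work too.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of A/C/G/T, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Reverse encode(): read each base back into two bits, then into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

With two bits per base, every byte of a file becomes four nucleotides, so the two-character message above turns into a strand of eight bases.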

Encoding data in DNA started as a joke in 2011, but it was soon recognized as a serious option for long-term archiving. Shakespeare’s sonnets, a snippet of Martin Luther King’s “I Have a Dream” speech and even parts of Beethoven have been successfully encoded into DNA. Unfortunately, the biggest worry is accuracy: roughly 1 mistake creeps into the nucleotide sequence for every 100 bases. When the sequence is read back, the file may be damaged or impossible to retrieve. Living cells use DNA polymerases (bacteria such as E. coli have five types) to proofread and repair their DNA sequences, so stored data needs a mathematical error-correction scheme that performs the same function. The economics of writing DNA also remain problematic, since DNA-synthesis companies charge 0.07 to 0.09 dollars per base. At that price, storing a single minute of stereo sound costs about $100,000. With expenses that high, a more cost-effective way of writing DNA needs to be found.
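One simple way to picture the mathematical proofreading described above is a majority vote over several copies of the same strand. The sketch below is an assumption-laden illustration, not the method used in the experiments mentioned here: it uses a basic repetition scheme, the function name `consensus` is made up for this example, and it only handles substituted bases, not inserted or missing ones. Practical systems would need more efficient error-correcting codes.

```python
from collections import Counter


def consensus(reads: list[str]) -> str:
    """Pick the most common base at each position across equal-length reads."""
    if len({len(read) for read in reads}) != 1:
        raise ValueError("all reads must have the same length")
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))


original = "CAGACGGC"
reads = [
    "CAGACGGC",  # error-free copy
    "CATACGGC",  # one substitution: third base G read as T
    "CAGACGAC",  # one substitution: seventh base G read as A
]
recovered = consensus(reads)
print(recovered == original)  # True
```

Because the two errors fall at different positions, the correct base still wins the vote at every position and the original strand is recovered.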

Based on bacterial genetics, digital DNA may one day rival or exceed today’s storage technology. A hard disk has a read-write speed of 3,000 to 5,000 microseconds per bit and a retention span of just over 10 years, and uses 0.04 watts per gigabyte. In contrast, bacterial DNA has a read-write speed of less than 100 microseconds per bit and a retention period of over 100 years, and uses less than 1*10^-11 watts per gigabyte. In other words, even though writing and reading synthetic DNA is still slow in practice, DNA retains data 10 times longer than a regular hard disk and uses orders of magnitude less energy, with 1*10^6 times the storage density. By some estimates, only 1 kg of DNA would be needed to store the world’s information. Once the design is proven and the economics of production are resolved, we will be able to put all the internet’s information, books and more, terabytes upon terabytes of data, into something as small as a strand of hair.

 

References:

https://www.wired.com/story/the-rise-of-dna-data-storage/

https://newclasses.nyu.edu/access/content/attachment/2f2b4893-90fc-42a5-8306-d4a486689b93/Assignments/e3667c95-30ca-4973-addc-c2180a81b26e/DNA-Datastorage.pdf

Walida Ali