The prospect of storing vast amounts of data in DNA is getting closer to reality thanks to a new data-recovery technique.
Microsoft seems interested in synthetic DNA. The company is considering using it in the future as a storage medium that could address the world's need for ever-increasing data storage.
Previous research has shown that just a few grams of DNA can store an exabyte of data and keep it intact for 2,000 years.
The disadvantage is that the method is quite expensive and extremely slow. Writing data into DNA involves converting 0s and 1s into sequences of the four DNA bases (adenine, thymine, cytosine and guanine), while retrieving data from DNA involves decoding those sequences back into 0s and 1s.
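As a rough illustration of that conversion, here is a minimal sketch in Python. The two-bits-per-base mapping and the function names `encode` and `decode` are assumptions made for this example; the article does not say which encoding the researchers use, and practical schemes add redundancy and avoid sequences that are hard to synthesize or read.

```python
# Hypothetical mapping: each pair of bits becomes one DNA base.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Turn a string of bases back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

Each byte becomes four bases here, so the two-byte input yields an eight-base strand.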
Finding and retrieving specific files stored in DNA is also a very big challenge.
As scientists from Microsoft Research and the University of Washington explain, without random access, meaning some ability to selectively retrieve files from the stored DNA, it would be necessary to decode the entire data set to find the files we want. Random access reduces the amount of processing needed for each retrieval.
To achieve random access in DNA, they created a library of "primers" that are attached to each DNA sequence. Used with polymerase chain reaction (PCR), the primers serve as targets for selecting the desired DNA fragments.
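The idea of primer-based selection can be sketched in a few lines of Python. This is a toy simulation under stated assumptions, not the researchers' method: the primer sequences, file names, and the functions `tag` and `select` are hypothetical, each file is assumed to have its own primer pair flanking every strand, and real PCR works through chemical amplification with reverse-complement matching rather than simple string comparison.

```python
import random

# Hypothetical primer pairs, one per file (real primers are longer
# and designed to avoid binding to the wrong strands).
PRIMERS = {
    "video.mp4": ("ACGTAC", "TTGCAA"),
    "photo.jpg": ("GGATCC", "CATGCA"),
}


def tag(payload: str, filename: str) -> str:
    """Attach the file's primer pair to both ends of a payload strand."""
    forward, reverse = PRIMERS[filename]
    return forward + payload + reverse


def select(pool: list[str], filename: str) -> list[str]:
    """Mimic PCR selection: keep only strands flanked by the file's
    primers, then strip the primers to leave the payload."""
    forward, reverse = PRIMERS[filename]
    return [
        strand[len(forward):-len(reverse)]
        for strand in pool
        if strand.startswith(forward) and strand.endswith(reverse)
    ]


# All strands sit mixed together in one pool, in no particular order.
pool = [
    tag("AAAA", "video.mp4"),
    tag("CCCC", "photo.jpg"),
    tag("GGGG", "video.mp4"),
]
random.shuffle(pool)

print(select(pool, "photo.jpg"))
```

Only the strands belonging to the requested file come back, so the rest of the pool never has to be decoded.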
“Before synthesizing the DNA containing the data from a file, the researchers added PCR primer targets from the primer library to both ends of each DNA sequence,” the University of Washington says.
"They then used these primers to select the desired strands via random access, and used a new algorithm designed to more efficiently decode and restore the data to its original digital state."
The researchers have also developed an algorithm for more efficient decoding and data recovery. Microsoft researcher Sergey Yekhanin said the new algorithms are more tolerant of errors in writing and reading DNA, which reduces the amount of processing required to retrieve information.
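The article does not describe how the new algorithm works. As a loose illustration of what tolerating read errors can mean, the Python sketch below takes a majority vote across several noisy reads of the same strand. This is a generic consensus technique chosen for the example, not the researchers' algorithm; the function name `consensus` and the sample reads are hypothetical, and the sketch handles only substituted bases, not inserted or deleted ones.

```python
from collections import Counter


def consensus(reads: list[str]) -> str:
    """Return the most common base at each position across
    equal-length reads of the same strand."""
    return "".join(
        Counter(column).most_common(1)[0][0]
        for column in zip(*reads)
    )


# Four reads of one strand; two of them contain a single wrong base.
reads = ["ACGTAC", "ACGTAC", "ACCTAC", "ACGTAA"]
print(consensus(reads))  # ACGTAC
```

The more errors a decoder can absorb this way, the fewer reads it needs to reconstruct the original data.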
Although this is not the first time that random access to data stored in DNA has been achieved, it is the first time it has been done on such a scale, according to the researchers.
The researchers encoded 200MB of data, made up of 35 files ranging in size from 29kB to 44MB, into synthetic DNA. The files contained high-definition video, audio, images and text.
After the release of the study describing the technique, they encoded and recovered 400MB of data in DNA.
The researchers believe that the approach they have used for random access will scale to large DNA pools containing several terabytes each.