Microsoft And The University Of Washington Researchers Introduce A Proof-Of-Concept Molecular Controller In The Form Of A Tiny DNA Storage Writing Mechanism On A Chip

According to current forecasts, data storage consumption is expected to grow by 20.4 percent year over year, reaching around nine zettabytes by 2024. To put that figure in context, Windows 11, which initially takes up roughly 64 gigabytes of storage space, would need to be installed on more than 15 billion machines to consume just one zettabyte. By comparison, it is estimated that just over 3 billion personal computers have been shipped worldwide since 2011.
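The arithmetic behind that comparison can be checked in a few lines. This is a minimal sketch that assumes decimal (SI) units, with 1 zettabyte equal to 10^21 bytes and 1 gigabyte equal to 10^9 bytes; the variable names are illustrative.

```python
# Back-of-the-envelope check of the Windows 11 comparison.
# Assumption: decimal (SI) units, 1 ZB = 10**21 bytes and 1 GB = 10**9 bytes.
ZETTABYTE = 10**21
GIGABYTE = 10**9

install_size_bytes = 64 * GIGABYTE  # approximate Windows 11 footprint quoted in the text
installs_per_zettabyte = ZETTABYTE / install_size_bytes

print(f"{installs_per_zettabyte / 1e9:.1f} billion installs per zettabyte")
```

Under these assumptions the figure comes out at roughly 15.6 billion installs, about five times the estimated number of PCs shipped since 2011.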

In the long run, available storage technologies are struggling to keep up with this rising demand. Synthetic DNA, which is essentially a microscopic data storage medium, could drastically reduce the space and material required for future archival storage. Using the growth forecast above, storing nine zettabytes of data would take millions of tape cartridges, currently the densest commercial storage medium, whereas DNA would occupy roughly the space of a small refrigerator.

With a density of nearly 1 exabyte per cubic inch, DNA not only outperforms existing storage media but also offers a potential answer to today’s archival storage challenges. Unlike tape, which needs to be rewritten every 30 years at best, DNA is extremely durable and can last thousands of years. And because the techniques for reading DNA molecules are numerous and vital to life science applications, DNA is unlikely to become obsolete as a storage format.
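The refrigerator comparison can be sanity-checked from the quoted density. The sketch below assumes decimal units (1 zettabyte = 1000 exabytes) and treats the density as exactly 1 exabyte per cubic inch; since the text says "nearly", the result is a lower bound on the volume.

```python
# Rough volume needed to hold 9 ZB at the density quoted in the text.
# Assumptions: 1 ZB = 1000 EB (decimal units) and exactly 1 EB per cubic inch.
EXABYTES_PER_ZETTABYTE = 1000
CM_PER_INCH = 2.54

data_zb = 9
density_eb_per_cubic_inch = 1.0

volume_cubic_inches = data_zb * EXABYTES_PER_ZETTABYTE / density_eb_per_cubic_inch
volume_liters = volume_cubic_inches * CM_PER_INCH**3 / 1000  # 1 liter = 1000 cm^3

print(f"{volume_cubic_inches:.0f} cubic inches, about {volume_liters:.0f} liters")
```

That works out to about 9,000 cubic inches, or roughly 147 liters, which is in the range of a compact refrigerator.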

Furthermore, evidence suggests that DNA storage could reduce greenhouse gas emissions, water use, and energy use. Despite these benefits, one major obstacle to large-scale deployment of DNA data storage has been low synthesis throughput, which results in low writing throughput and relatively high cost.

In a recent paper published in Science Advances, Microsoft researchers and their University of Washington collaborators at the Molecular Information Systems Laboratory (MISL) addressed this problem by introducing a proof-of-concept molecular controller in the form of a microscopic DNA storage writing mechanism on a chip. The chip shows that DNA-synthesis sites can be packed three orders of magnitude more densely than before, which demonstrates that substantially higher DNA writing throughput is possible.

The study outlines the team’s progress in demonstrating that writing throughput can be increased to meet more general storage demands, and describes the technology developed to do so, which includes a nanoscale electrochemical array. Using this technique, the researchers encoded a message onto four strands of synthetic DNA, demonstrating that nanoscale DNA writing is viable at the scale required for practical DNA data storage.

Recently, a lot of work has gone into improving the achievable scale of DNA storage, such as building automation systems that reduce the time-consuming work of manually pipetting DNA and other chemicals, or finding ways to protect DNA for long-term storage over thousands of years. Storing information in DNA at the scale required for commercial use involves two processes. The first uses encoding software and a DNA synthesizer to convert digital bits (ones and zeros) into strands of synthetic DNA that represent those bits. The second uses a DNA sequencer and decoding software to read the strands and decode the information back into digital bits.
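The software half of that round trip can be illustrated with a toy codec. This is only a sketch under a simple assumption, a fixed mapping of two bits per base; the `encode` and `decode` helpers are hypothetical, and practical DNA storage codecs also add addressing, error correction, and constraints on the sequences they produce.

```python
# Toy codec: 2 bits per base. Assumption: this fixed mapping is illustrative only
# and is not the encoding used by the researchers.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

def encode(data: bytes) -> str:
    """Convert bytes to a string of DNA bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

def decode(strand: str) -> bytes:
    """Convert a string of DNA bases back to the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

strand = encode(b"Hi")
print(strand)  # CAGACGGC
assert decode(strand) == b"Hi"
```

In a real system the synthesizer and sequencer sit between these two functions, which is where errors enter and why extra redundancy is needed.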

To store data on synthetic DNA, digital bits are encoded in the bases (A, T, C, and G) of a DNA sequence, so writing data means creating a DNA chain with a specified base sequence. Traditionally, DNA chains have been made using a multi-step process known as phosphoramidite chemistry, in which a chain is built by adding DNA bases one after another. Each incoming base carries a blocking group that prevents more than one base from being added to the growing chain at a time. After a blocked base is attached to the chain, acid is applied to remove the blocking group and prime the chain for the addition of the next base.
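The block-then-deblock logic described above can be sketched as a tiny state machine. This is a toy model, not a chemical simulation: the `GrowingChain` class and `synthesize` function are hypothetical names, and the other chemistry steps of the real process are left out.

```python
class GrowingChain:
    """Toy model of one DNA chain during blocked, stepwise synthesis."""

    def __init__(self) -> None:
        self.bases: list[str] = []
        self.blocked = True  # assumption: the chain starts blocked and needs acid first

    def couple(self, base: str) -> bool:
        """Try to add a blocked base; fails if the chain end is still blocked."""
        if self.blocked:
            return False
        self.bases.append(base)
        self.blocked = True  # the new base carries its own blocking group
        return True

    def deblock(self) -> None:
        """Acid step: remove the blocking group so the next base can attach."""
        self.blocked = False


def synthesize(sequence: str) -> str:
    """Build a chain one base at a time, deblocking before every coupling."""
    chain = GrowingChain()
    for base in sequence:
        chain.deblock()
        assert chain.couple(base)
        # Without another acid step, a second coupling attempt is rejected.
        assert not chain.couple(base)
    return "".join(chain.bases)


print(synthesize("GATTACA"))  # GATTACA
```

The point of the blocking group is visible in the second `couple` call: each cycle extends the chain by exactly one base, which is what lets the sequence be specified.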

DNA chains can be synthesized individually or in parallel on an array, which comprises many sites where unique DNA sequences are synthesized simultaneously. The key to boosting writing throughput and cutting costs is to increase synthesis density, that is, the number of synthesis sites on a fixed surface area. The closer together the sites are on an array, the lower the synthesis cost of each DNA chain, because the reagents required for the procedure are shared across more sequences.
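How density and per-sequence cost scale with site spacing can be shown with a little geometry. The sketch below assumes sites laid out on a square grid and reagent cost shared equally by all sites; the function names, pitches, and costs are hypothetical and not taken from the paper.

```python
import math

def sites_per_mm2(pitch_um: float) -> float:
    """Synthesis sites per square millimeter on a square grid with the given pitch."""
    return (1000.0 / pitch_um) ** 2

def reagent_cost_per_sequence(cost_per_run: float, area_mm2: float, pitch_um: float) -> float:
    """Reagent cost of one run shared equally by every site (one sequence per site)."""
    return cost_per_run / (sites_per_mm2(pitch_um) * area_mm2)

# Hypothetical pitches chosen only to show the scaling.
coarse_um, fine_um = 50.0, 5.0
density_gain = sites_per_mm2(fine_um) / sites_per_mm2(coarse_um)
print(f"10x smaller pitch gives about {density_gain:.0f}x more sites")

# Pitch reduction needed for a 1000x (three orders of magnitude) density gain.
print(f"pitch must shrink by about {math.sqrt(1000):.1f}x")
```

Because density goes with the inverse square of the pitch, a thousandfold gain in density corresponds to shrinking the spacing by a factor of roughly 32, and the reagent cost per sequence falls by the same thousandfold factor under this simple model.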

The critical problem in boosting DNA writing throughput is maintaining control of individual spots without interfering with neighboring spots. Today’s DNA synthesis arrays are designed to generate a small number of high-quality DNA sequences, each with millions of exact copies. They rely on three main array synthesis technologies: photochemistry, fluid deposition, and electrochemistry.

In photochemical DNA synthesis, a photomask or micromirror projects patterns of light onto the array, removing the blocking group from the DNA strands at the illuminated spots. In fluid deposition, methods such as acoustic or inkjet printing deliver the acid deblock to specific spots. Both approaches are limited in the synthesis densities they can achieve, whether by micromirror size, light scattering, or droplet stability.

In electrochemical DNA synthesis, each spot in the array has its own electrode. When a voltage is applied, acid is generated at the anode (working electrode) to deblock the growing DNA chains, and an equivalent base is generated at the cathode (counter electrode). The main concern when reducing the spacing between anodes is acid diffusion: the smaller the pitch, the easier it is for acid to reach neighboring electrodes and cause unintended deblocking.
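The pitch concern follows from how diffusion scales with distance. As a rough illustration, the standard one-dimensional estimate for the time to diffuse a distance L is t = L^2 / (2D). The sketch below uses a placeholder diffusivity that is not from the paper, so only the ratios between the results are meaningful.

```python
def diffusion_time_s(distance_um: float, diffusivity_um2_per_s: float) -> float:
    """Characteristic 1D diffusion time t = L^2 / (2 D)."""
    return distance_um ** 2 / (2.0 * diffusivity_um2_per_s)

# Placeholder diffusivity in um^2/s, used only to show the scaling with distance.
D_PLACEHOLDER = 1000.0

for pitch_um in (20.0, 10.0, 5.0):
    t = diffusion_time_s(pitch_um, D_PLACEHOLDER)
    print(f"pitch {pitch_um:4.1f} um -> about {t:.4f} s to reach a neighbor")
```

Halving the pitch cuts the estimated travel time by a factor of four, which is why tightly packed anodes need some way to contain or neutralize stray acid.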

According to the research team, each 650-nm electrode was sunk into a glass well surrounded by cathodes. The glass well acts both as an attachment surface on which the DNA chains grow and as a physical barrier that keeps acid from diffusing to other spots. Any acid that escapes the well is neutralized when it meets the base generated at the cathodes. The team’s model indicated that acid could be contained at this scale and at much smaller ones, motivating the researchers to design and produce chips with such small features.

These electrochemical arrays used sets of four independently addressable electrodes. The scientists used them to demonstrate control of DNA synthesis at specific sites in experiments with two fluorescently tagged bases, one green and one red. If acid diffused beyond its target, it would reach unintended locations and cause one color to bleed into the other.

First, acid was generated at one set of electrodes on an electrochemical array to deblock the DNA chains, and a green-fluorescent base was added. Then, to complete the image, acid was generated at a second set of electrodes in the same array and a red-fluorescent base was added. No bleed-over was observed, indicating that there was no unintended acid diffusion.
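The logic of the two-color test can be mimicked with a toy model. The sketch below assumes a hypothetical row of four sites split into two electrode sets, and assumes each labeled base re-blocks the chain it attaches to; the `leaky` flag stands in for acid escaping to adjacent sites and is purely illustrative.

```python
def run_two_color_experiment(leaky: bool = False) -> dict[int, list[str]]:
    """Toy model of the two-color test on four addressable sites (0 to 3)."""
    sites = {i: {"blocked": True, "labels": []} for i in range(4)}
    neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

    def deblock(selected: list[int]) -> None:
        for i in selected:
            sites[i]["blocked"] = False
            if leaky:  # acid escaping its well also deblocks adjacent sites
                for j in neighbors[i]:
                    sites[j]["blocked"] = False

    def couple(label: str) -> None:
        # The labeled base reaches every site but attaches only to deblocked chains.
        for site in sites.values():
            if not site["blocked"]:
                site["labels"].append(label)
                site["blocked"] = True

    deblock([0, 2])
    couple("green")
    deblock([1, 3])
    couple("red")
    return {i: site["labels"] for i, site in sites.items()}


print(run_two_color_experiment())            # each site carries a single color
print(run_two_color_experiment(leaky=True))  # every site picks up both colors
```

In the contained case each site ends up with exactly one color, which corresponds to the clean image the team reported; in the leaky case the colors mix, which is the bleed-over the experiment was designed to detect.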

The array’s capacity to write data was demonstrated on a second array by synthesizing four unique 100-base-long DNA strands that encoded the phrase “Empowering each person to store more!” Although the error rates were higher than those of commercial DNA synthesizers, the message was decoded without errors.
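A quick capacity estimate gives a feel for how the phrase fits on four strands. The sketch assumes an idealized two bits per base and plain UTF-8 text; the encoding actually used in the paper may differ, so the numbers are indicative only.

```python
message = "Empowering each person to store more!"
payload_bytes = len(message.encode("utf-8"))

# Assumption: an idealized 2 bits per base; the paper's encoding may differ.
BITS_PER_BASE = 2
payload_bases = payload_bytes * 8 // BITS_PER_BASE

strand_count, strand_length = 4, 100  # figures from the text
capacity_bases = strand_count * strand_length

print(payload_bytes, payload_bases, capacity_bases)  # 37 148 400
```

Under these assumptions the raw text needs 148 of the 400 available bases, which could leave room for overhead such as addressing and redundancy, the kind of margin that helps a message survive a higher synthesis error rate.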

Conclusion

This research shows that electronic-to-molecular interaction can be controlled at the nanoscale, opening new possibilities. Electrochemical control approaches, for example, allow for spatial control of enzymes at the nanoscale. Beyond DNA, this could become a tool for drug discovery, since fast combinatorial chemical synthesis could serve as a platform for determining drug-protein binding kinetics. Other possible uses include a platform for monitoring environmental toxins or a tool for assays that detect disease biomarkers. For data storage itself, higher writing throughput is expected to bring the performance and cost of DNA data storage much closer to those of tape.

Paper: https://www.science.org/doi/10.1126/sciadv.abi6714

Reference: https://www.microsoft.com/en-us/research/blog/