A high storage density strategy for digital information based on synthetic DNA

  • Original Article
  • 3 Biotech

Abstract

DNA has been recognized as a promising natural medium for information storage. Because DNA synthesis is expensive, using nucleotides efficiently and increasing the storage density is an important challenge. A novel scheme is therefore proposed for storing digital information in synthetic DNA with high storage density and perfect error correction capability. The proposed strategy uses quaternary Huffman coding to compress the binary stream of an original file before it is converted into a DNA sequence. Quaternary Huffman coding is based on the statistical properties of the source and can achieve a very high compression ratio for files whose source symbols have a non-uniform probability distribution. Consequently, the amount of information that each base can store increases, and the storage density improves. In addition, a quaternary Hamming code with low redundancy is proposed to correct errors that occur during synthesis and sequencing. We have successfully converted a total of 5.2 KB of files into 3934 bits in DNA bases. The results of the biological experiment indicate that the storage density of the proposed scheme is higher than that of state-of-the-art schemes.
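To make the idea of quaternary Huffman coding concrete, the following is a minimal sketch of a generic 4-ary Huffman coder that maps byte symbols directly to strings of bases. It is not the authors' implementation: the function names, the use of bytes as source symbols, and the digit-to-base order `ACGT` are all assumptions chosen for illustration, and biological constraints such as homopolymer runs or GC content are not handled.

```python
import heapq
from collections import Counter
from itertools import count

BASES = "ACGT"  # assumed digit-to-base mapping, not taken from the paper


def quaternary_huffman(data):
    """Build a 4-ary Huffman code: byte value -> string of bases."""
    freq = Counter(data)
    tiebreak = count()  # unique counter so heap never compares nodes directly
    # Heap entry: (weight, tiebreak, node). A node is a byte value (int),
    # None for a padding dummy, or a list of four child nodes.
    heap = [(w, next(tiebreak), sym) for sym, w in freq.items()]
    # Pad with zero-weight dummies so every merge combines exactly 4 nodes.
    while (len(heap) - 1) % 3 != 0:
        heap.append((0, next(tiebreak), None))
    heapq.heapify(heap)
    while len(heap) > 1:
        children = [heapq.heappop(heap) for _ in range(4)]
        weight = sum(c[0] for c in children)
        heapq.heappush(heap, (weight, next(tiebreak), [c[2] for c in children]))

    codes = {}

    def walk(node, prefix):
        if isinstance(node, list):
            for base, child in zip(BASES, node):
                walk(child, prefix + base)
        elif node is not None:
            # A lone symbol at the root still needs a one-base codeword.
            codes[node] = prefix or BASES[0]

    walk(heap[0][2], "")
    return codes


def encode(data, codes):
    """Concatenate the codeword of every byte into one DNA string."""
    return "".join(codes[b] for b in data)


def decode(dna, codes):
    """Invert encode(); works because Huffman codes are prefix-free."""
    inverse = {word: sym for sym, word in codes.items()}
    out, current = bytearray(), ""
    for base in dna:
        current += base
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return bytes(out)


if __name__ == "__main__":
    message = b"aaaaaaaabbbc"  # skewed symbol distribution
    table = quaternary_huffman(message)
    strand = encode(message, table)
    assert decode(strand, table) == message
    print(len(message) * 4, "bases at a fixed 2 bits per base vs", len(strand))
```

With a fixed mapping of 2 bits per base, each byte costs four bases; when the symbol distribution is skewed, frequent symbols receive shorter codewords and the sequence shrinks, which is the sense in which each base stores more information. For a uniform source the gain disappears.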


Author information

Corresponding author

Correspondence to Shufang Zhang.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

About this article

Cite this article

Zhang, S., Huang, B., Song, X. et al. A high storage density strategy for digital information based on synthetic DNA. 3 Biotech 9, 342 (2019). https://doi.org/10.1007/s13205-019-1868-4

  • DOI: https://doi.org/10.1007/s13205-019-1868-4
