
Computer Security and Privacy in DNA Sequencing

There has been rapid improvement in the cost and time necessary to sequence and analyze DNA. In the past decade, the cost to sequence a human genome has decreased 100,000-fold or more. This rapid improvement was made possible by faster, massively parallel processing. Modern sequencing techniques can sequence hundreds of millions of DNA strands simultaneously, resulting in a proliferation of new applications in domains ranging from personalized medicine and ancestry to the study of the microorganisms that live in your gut.

Computers are needed to process, analyze, and store the billions of DNA bases that can be sequenced from a single DNA sample. Even the sequencing machines themselves run on computers. New and unexpected interactions may be possible at this boundary between electronic and biological systems. As a multi-disciplinary group of researchers who study both computer security and DNA manipulation, we wanted to understand what new computer security risks are possible in the interaction between biomolecular information and the computer systems that analyze it.

Here we highlight two key examples of our research: (1) the failure of DNA sequencing software to follow best practices in computer security and (2) the possibility of encoding malware in DNA sequences. See our paper for more detailed information on our findings. This paper will appear at the peer-reviewed USENIX Security Symposium in August 2017.

Computer Security Analysis of DNA Sequencing Programs

After DNA is sequenced, it is usually processed and analyzed by a number of computer programs through what is called the DNA data processing pipeline. We analyzed the computer security practices of commonly used, open-source programs in this pipeline and found that they did not follow computer security best practices. Many were written in programming languages whose programs are known to routinely contain security problems, and we found early indicators of security problems and vulnerable code. This basic security analysis suggests that the sequencing data processing pipeline would not be sufficiently secure if or when attackers target it.

DNA Encoded Malware

DNA is made up of standard nucleotides (the basic structural units of DNA), which are represented as letters such as A, C, G, and T. After sequencing, this DNA data is processed and analyzed using many computer programs. It is well known in computer security that any data used as input into a program may contain code designed to compromise a computer. This led us to question whether it is possible to produce DNA strands containing malicious computer code that, if sequenced and analyzed, could compromise a computer.

To assess whether this is theoretically possible, we included a known security vulnerability in a DNA processing program that is similar to what we found in our earlier security analysis. We then designed and created a synthetic DNA strand that contained malicious computer code encoded in the bases of the DNA strand. When this physical strand was sequenced and processed by the vulnerable program, it gave remote control of the computer doing the processing. That is, we were able to remotely exploit and gain full control over a computer using adversarial synthetic DNA.
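
To make the idea of computer code "encoded in the bases" more concrete, here is a minimal sketch in Python. It assumes a simple mapping of two bits per base (A=00, C=01, G=10, T=11); that mapping and the function names are illustrative assumptions, not a description of the strand that was actually synthesized.

```python
# Hypothetical mapping, assumed for illustration: two bits per base.
BITS_TO_BASE = {0b00: "A", 0b01: "C", 0b10: "G", 0b11: "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BITS_TO_BASE[(byte >> shift) & 0b11])
    return "".join(bases)


def dna_to_bytes(strand: str) -> bytes:
    """Decode a strand (length a multiple of four) back into bytes."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of four")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            # An unknown letter raises KeyError rather than being guessed at.
            byte = (byte << 2) | BASE_TO_BITS[base]
        out.append(byte)
    return bytes(out)
```

Under this assumed mapping, any sequence of bytes corresponds to a strand four times as long, which is why arbitrary data, including program code, can in principle be written as a string of bases.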

No Reason for Concern

Note that there is no present cause for alarm about present-day threats. We have no evidence to believe that the security of DNA sequencing or DNA data in general is currently under attack. Instead, we view these results as a first step toward thinking about computer security in the DNA sequencing ecosystem. One theme from computer security research is that it is better to consider security threats early in emerging technologies, before the technology matures, since security issues are much easier to fix before real attacks manifest.

We again stress that there is no cause for people to be alarmed today, but we also encourage the DNA sequencing community to proactively address computer security risks before any adversaries manifest. That said, it is time to improve the state of DNA security.

We encourage the DNA sequencing community to follow secure software best practices when coding bioinformatics software, especially if it is used for commercial or sensitive purposes. Also, it is important to consider threats from all sources, including the DNA strands being sequenced, as a vector for computer attacks. See our research paper for a more detailed discussion of threats to the DNA sequencing pipeline and potential defenses.
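
As one small illustration of treating sequenced reads as untrusted input, the Python sketch below checks a read's length and alphabet before it is passed to later stages of a pipeline. The function name, the allowed alphabet (including N for an uncalled base), and the 1,000-base limit are assumptions chosen for the example, not recommendations drawn from our paper.

```python
# Assumed alphabet for the example: the four bases plus N (uncalled base).
ALLOWED_BASES = frozenset("ACGTN")
# Assumed upper bound; a real pipeline would pick a limit suited to its sequencer.
MAX_READ_LENGTH = 1000


def validate_read(read: str, max_length: int = MAX_READ_LENGTH) -> str:
    """Return the read unchanged if it passes basic checks, else raise ValueError."""
    if not read:
        raise ValueError("empty read")
    if len(read) > max_length:
        raise ValueError(f"read of {len(read)} bases exceeds limit of {max_length}")
    unexpected = set(read) - ALLOWED_BASES
    if unexpected:
        raise ValueError(f"unexpected characters in read: {sorted(unexpected)}")
    return read
```

Rejecting oversized or malformed reads at the boundary is only one layer of defense; it does not replace memory-safe handling of the data in the programs that follow.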

Paul G. Allen School of Computer Science & Engineering, University of Washington

http://dnasec.cs.washington.edu/
