The rapid growth of modern computing power may soon reach its limits. Because traditional circuitry requires at least few molecules to exist, the miniaturization required for speed, performance and computing power will reach its limitations as the components of computers continue to shrink. The possibility of a successor to the silicon chip may be found within the same lines of code that program life itself. Biochemists are currently working on ways to use DNA to process and store large amounts of data.
Due the high amount of parallelism involved the interactions between DNA and enzymes may soon substantially outperform the fastest supercomputers. Parallelism is the ability for multiple calculations to occur in a given instance in a computer. While standard computers are limited to the number of processors contained in the hardware, a test tube filled with DNA fragments and enzymes has all of the molecules exposed to each other at once. This property enables DNA computing the possibility of rapidly searching, sorting and manipulating massive amounts of data in parallel.
A typical representation of the computing power of highly parallel computing is represented by the Hamiltonian Path, commonly associated with the traveling salesman problem. A salesman is planning on traveling to several cities and must travel along certain routes (just pretend he has airfare discounts for flights between certain cities). The salesman needs to find a path taking him to all of the cities, but he doesnt want to end up going to the same city twice. In order to solve the problem a computer must generate every possible combination of paths and reject those that do not fit the needs of the salesman.
Current methods of processing become extremely slow with such problems. As the number of cities involved in the problem increases, the number of possible answers increases exponentially. If only 30 cities are used in the problem, more than 1 billion possible answers must be checked. A processor can only check one of these possible answers at a time. Even supercomputers with 100 processors working in parallel would take a long time to solve the problem. Given modern computing strategies a problem with hundreds or thousands of cities could take lifetimes to answer.
A Novel Application of DNA Technlogy
In 1994 Leonard Adleman of the University of South Carolina shocked the biochemistry and computing community alike, as he was able to solve the traveling salesman problem for seven cities using a DNA computer. Even though the process took a great deal more time to complete than modern computing, he showed it was possible to compute through a biochemical medium in which all of the processing occurred simultaneously. If the experiment had been designed to handle much more than 7 cities, the speed of DNA computing would have rapidly overtaken that of conventional computing.
Adlemans DNA computer followed the same basic algorithm as conventional computers. The traveling salesman problem was broken into 5 steps:
Step 1) All possible random paths must be created.
Step 2) Discard all paths starting or ending with the wrong city.
Step 3) Discard all paths that travel to more than a total of 7 cities.
Step 4) Discard any path that does not travel through every city.
Step 5) Decode the remaining solutions, if any are present, to find viable answers.
These steps translate into biochemical processes surprisingly well. To complete step 1 biochemically, Adleman used random sequences of 20 base pairs (oligonucleotides) to represent each city, much like electronic processors use combinations of 1s and 0s. These "cities" were then combined into a liquid medium and attached randomly using T4 DNA ligase. Ligases are enzymes that use ATP to induce the formation of bonds between their substrates. Through the random associations between these DNA fragments every possible connection between all of the cities is created. Step 2 was completed using a polymerase chain reaction (PCR). PCR amplifies DNA using primers and a DNA polymerase. Since polymerase can only begin to copy the nucleotide sequences at the site of a primers attachment, the primers were made complementary to the DNA fragments corresponding to the first and last cities. After only a few cycles this process mass-produced fragments with the first city on one end and the last city on the other. Step 3 selected the paths containing the proper number of cities by using an agarose-gel. Since each "city" is 20 base pairs long and there are seven cities, only those fragments containing 140 base pairs were extracted. The fragments remaining contained 7 oligonucleotides and began and ended with the first and last cities respectively. To ensure every city was represented in every path (step 4), Adleman used 5 different types of specific filters using magnetic beads. Each filter had a different complementary fragment of the 5 oligonucleotide fragments representing the five cities between the first and the last. Each filter is used once and only those molecules trapped by the first filter may pass on to the second and then the third and so on. It would be like a long set of rooms. Only those with keys to the first room would be allowed to continue and see if they have a key to get into the second etc. Those that made it to the last room had to have keys for all the rooms before it. Therefore those molecules held by the last filter had the oligonucleotides for all of the five cities between the first and last cities. Since there are only 7 oligonucleotide sequences represented, each city can only be present once. Step 5 revealed the answers by again amplifying them using PCR. The paths could then be determined by sequencing the answers and comparing the results with thesequences used to represent each city.
These DNA computers can operate in many other ways. Some utilize enzymes that can be added to the assortment of randomly generated nucleotide combinations. In a sorting computation a restriction endonuclease may be designed to recognize the oligonucleotide representing b next to d as a binding site. Since a restriction endonuclease is an enzyme that will break apart the bonds between specific sequences of base pairs, they will break a combination of in the wrong order. Once isolated by using an agarose-gel, the largest DNA fragment can be amplified and sequenced to find the arrangements of the smaller DNA fragments within it, thus revealing the answer to the problem. Researchers at the NEC Research Institute have even used a virus to insert the answer into a bacterium, Escherichia coli. This bacterium mass-produced the answer for easier detection. Mircoasssays and DNA chips are all possible methods of detecting the symbolic nucleotide-representation of the answer.
There are several forms of these computers under experimentation. Some fix the DNA to a polymer membrane and look much like a computer chip. Others, like Dr. Adlemans, occur in an actual test tube and take advantage of the amount of data that can be stored in a three dimensional space.
Try the following logic problem (similar to what you might see on the LSAT or GRE) and see how DNA computing could be used to solve it:
Problems to Overcome
Problems with the current models of the DNA are still abundant. DNA is relatively unstable as compared to a hardwired processor. Attempts are being made to give the DNA a peptide support to resist degradation. The reliability of the enzymatic interactions is also a bit irregular. The results often need to be checked several to insure the correct answers are provided. The amount of time currently involved in preparing a DNA processor is extremely large. Fortunately, the enzymes and DNA segments from previous problems can be reused by reassigning them to represent other data sets. This reuseability, combined with future and recent advances in automation, may make the time involved with production less of a concern. It may even be possible for mechanical processors to design and manipulate these chips automatically in the future.
To create a DNA chip to process a single event appears to be extremely inefficient. Supercomputers can process up to one million singular instructions in a second. DNA computing requires as many as a thousand seconds to complete one instruction. Because the main benefits of DNA computing appears to be in its compact size, low energy requirements, and its proficiency in parallel computing, the future of this type of processing may be simply to augment the shortcomings of todays computer chips. These traits indicate that DNA computing may only be relevant to areas of data storage, sorting, and solving multivariable equations.
Looking to the Future
The future of biochemical computing is far from certain. Current uses of the new technology are limited to solving basic math calculations, sorting data, and other simple problems. Based mainly on the potential for the DNA computer, a large amount of funding to is being directed to improve the technology. Grants from the Defense Advanced Research Projects Agency of the Pentagon and the National Science Foundation are supporting a large amount of the research. The governments interest and concerns with the new technology is its profinity for decoding data encryptions. Researchers at Stanford and Princeton have already outlined a feasible DNA processor capable of decoding the USs Data Encryption Standard. In action, this plan could decode classified information over phone lines and even intercept information being transferred to and from the Federal Reserve.
Other applications of DNA computers could included the cracking of the genetic code itself by its ability to rapidly sort and correlate massive amounts of data. The data provided by the ongoing research into the human genome is quickly providing too much data to assimilate. In an era where technology is drowning in a flood of information, it is a strange irony that the lifejacket may be tailored with the primitive threads of life itself.