Racing Ahead With Genome Sequencing

Introduction

In the last decade, whole-genome sequencing of microbes, plants and animals has become a reality, making available an ever-increasing pool of genetic information about life on this planet. The most complex of these efforts, the Human Genome Project (HGP), is now nearing completion. A "draft quality" sequence of the human genome was published simultaneously by the government-funded HGP and the privately funded Celera Genomics Corp. in February 2001, with the final, high-quality sequence expected to be finished by 2004. For more information read the Human Genome Hot Topic article.

Sequences in Progress

While work on the human genome has received the bulk of the publicity, sequencing has been proceeding apace on the genomes of a wide variety of other organisms. These include non-human vertebrates such as mice, zebrafish and pufferfish, as well as frogs and chickens. Invertebrates such as the highly studied worm Caenorhabditis elegans, its relative C. briggsae and Brugia malayi (a causal agent of filariasis) are also the subject of sequencing efforts. Among the microorganisms, in addition to E. coli and its relatives, those under intensive study include the organisms responsible for whooping cough (Bordetella pertussis and B. parapertussis), foodborne diseases (Campylobacter jejuni, Clostridium botulinum and Shigella dysenteriae), and leprosy and tuberculosis (Mycobacterium leprae and M. tuberculosis, respectively), along with a variety of Streptococcus species (rheumatic fever, scarlet fever).

Examples of Genome Sequencing in Progress

Vertebrates: humans, mice, zebrafish, pufferfish, frogs, chickens

Invertebrates: Caenorhabditis elegans, Caenorhabditis briggsae, Brugia malayi

Microorganisms: Escherichia coli, Bordetella pertussis, Bordetella parapertussis, Campylobacter jejuni, Clostridium botulinum, Shigella dysenteriae, Mycobacterium leprae, Mycobacterium tuberculosis, Streptococcus (many species), Yersinia pestis, Bacillus anthracis

Government-sponsored scientists are also working to combat bioterrorism. This research was made even more urgent by the anthrax letters found in the mail after the September 11, 2001 terrorist attacks on the U.S. Several people died of anthrax, many others had to take massive doses of antibiotics, and thousands more were tested for exposure to the bacterium. As these very real events demonstrate, fast and accurate detection of bioterrorism agents, as well as effective methods of disease prevention and treatment, are needed to protect the American public from bioterrorist attacks. Research currently underway to better understand the genomes of human pathogens such as Bacillus anthracis (anthrax, multiple strains), Yersinia pestis (plague), Variola virus (smallpox), and Clostridium botulinum (botulism) will help scientists accomplish these goals.

Very often these genome sequencing efforts are carried out by consortia rather than by research teams at a single institution. An example of such a collaborative effort is the sequencing of the genome of Plasmodium falciparum, the parasitic organism that causes human malaria. Here the studies are being carried out jointly by the Wellcome Trust Sanger Institute (U.K.), The Institute for Genomic Research (TIGR) (U.S.) and Stanford University (U.S.).

The Post-Genomic Era

The full sequencing of so many genomes represents a new beginning for research in what has been dubbed the “Post-Genomic Era.” With the completion of the human genome now close at hand, scientists are starting to turn their attention to how such sequence information can be utilized for the good of society.

Dissecting the genome

Certainly one overriding goal is to understand what is actually encoded by the four-letter “alphabet” of G, A, T and C nucleotides. This is the domain of functional genomics, a field in which scientists study the structure of the genome and work to decipher how its constituent parts contribute to a healthy human being. This work has only just begun. So far, only half of the approximately 30,000 to 40,000 genes contained on the 23 pairs of human chromosomes have a known function, and working out what the rest of the genes do presents a formidable challenge. Moreover, genes themselves are only a small part of the genomic puzzle, making up only about 2% of the genome sequence. Some of the other 98% is involved in regulating gene expression, but much of this extra sequence has been dubbed “junk DNA” because it is not clear whether it is needed for anything at all. With so little currently known, making sense of the 3 billion-letter human “book of life” will be no easy task, and will keep geneticists around the globe busy for years to come.
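To give a sense of scale, the round figures quoted above can be turned into rough counts with a few lines of arithmetic. The Python sketch below is illustrative only: it assumes the approximate values from the text (a 3 billion-letter genome, about 2% of it in genes, and 30,000 to 40,000 genes of which about half have a known function), and the variable names are our own.

```python
# Round figures taken from the text; every value is an approximation.
GENOME_LENGTH = 3_000_000_000          # letters (bases) in the human genome
GENE_PERCENT = 2                       # percent of the sequence made up of genes
GENE_COUNT_RANGE = (30_000, 40_000)    # estimated number of human genes

# Integer arithmetic keeps the results exact.
gene_bases = GENOME_LENGTH * GENE_PERCENT // 100
other_bases = GENOME_LENGTH - gene_bases

# About half of the genes have a known function.
known_function = tuple(n // 2 for n in GENE_COUNT_RANGE)

print(f"Letters within genes:      {gene_bases:,}")
print(f"Letters outside genes:     {other_bases:,}")
print(f"Genes with known function: {known_function[0]:,} to {known_function[1]:,}")
```

On these assumptions, roughly 60 million letters lie within genes, leaving nearly 3 billion letters whose role is regulatory or unknown.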

Preliminary examinations of the genome have already generated a great deal of interest in single nucleotide polymorphisms, or SNPs. These single nucleotide differences between individuals are believed to hold the key to many frustrating medical questions. Why does a drug that does wonders for one patient cause such miserable side effects in another? Why do some people develop diseases such as cancer at a young age, while others manage to live to a ripe old age? The answers are believed to lie at least partially in the slight differences in genetic code that make each one of us unique, which occur at about one base in every 2,000 to 3,000. Over one million of these SNPs have already been identified, although not all of them are expected to be associated with disease states. Still, scientists are hopeful that SNP analysis will someday lead to more sophisticated medications, and aid in the diagnosis and treatment of a wide variety of ailments.
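The idea of a single-letter difference can be made concrete with a short Python sketch. It is a simplified illustration, not a description of how SNPs are actually discovered: the two sequences are made-up 20-letter fragments, they are assumed to be already aligned, and the function name find_snps is hypothetical. The last lines apply the "one base in every 2,000 to 3,000" figure to a genome of about 3 billion letters.

```python
def find_snps(seq_a, seq_b):
    """Return (position, base_a, base_b) for each single-letter difference
    between two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]

# Hypothetical fragments from two individuals (made-up data).
person_1 = "GATTACAGATTACAGATTAC"
person_2 = "GATTACAGATCACAGATTAC"

snps = find_snps(person_1, person_2)
print(snps)  # positions are counted from 0

# Rough scale: one difference per 2,000 to 3,000 letters in 3 billion.
GENOME_LENGTH = 3_000_000_000
low_estimate = GENOME_LENGTH // 3_000
high_estimate = GENOME_LENGTH // 2_000
print(f"{low_estimate:,} to {high_estimate:,} differing positions")
```

With these round numbers, two genomes would be expected to differ at somewhere between 1 million and 1.5 million positions.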

Pharmacogenomics

The realization that genetic makeup can change the way individuals respond to medication has led to a new type of drug research. In pharmacogenomics, scientists are using genetic information to design better pharmaceuticals. It is hoped that someday, drug treatments will be tailor-made for each individual based on genetic makeup, taking much of the guesswork out of prescribing drugs. Allergic reactions to drugs could be prevented, and the correct drug and proper dosage could be determined from the individual's genetic code, rather than through the trial-and-error procedure that doctors currently use. Understanding the genetic causes of disease will allow researchers to find better drug targets and custom design drugs to act on the particular RNAs or proteins involved in a disease. An awareness of an individual's susceptibility to specific diseases should also help the doctor and patient decide on appropriate lifestyle changes or medical treatments to prevent disease or lessen its severity. Of course, it is this type of highly personal information that also has the greatest potential for abuse (denying employment or health insurance to a person because they have or may develop a disease, for instance), so finding ways to keep such information private will be paramount.

Gene therapy

Another highly anticipated yet controversial avenue of genomics research is gene therapy. In gene therapy, genes are inserted into a person’s cells to compensate for missing or mutated genes that cause disease. Such therapy holds the promise of someday curing conditions that are currently incurable. There are studies underway to treat such diseases as cancer, AIDS, cystic fibrosis, hemophilia, and diabetes. However, gene therapy is still very much in the experimental stages, and there are many hurdles to overcome before it can become a standard medical treatment.

Perhaps the most pressing problem is the need for a good gene delivery system. In other words, how can the necessary gene be safely and effectively brought to the cells in the body that need it? Most studies currently underway use modified viruses to carry the desired gene into cells. In this method, some of the viral genome is replaced by the desired human gene, and when the virus infects the person, the gene is delivered to the target cells. Unfortunately, such viral carriers can cause unwanted side effects, including strong, or even fatal, immune reactions.

To avoid such undesired responses, scientists are looking for ways to deliver genes without the use of viruses. One intriguing idea is to use a “47th chromosome” – a long piece of DNA that would supplement the 46 chromosomes in each human cell. It would be large enough to carry large amounts of genetic information, but if made properly, would evade immune system detection because it would have all the features of human chromosomes. Much more needs to be understood about how the human genome works, however, before scientists can confidently make such a sophisticated gene carrier.

Genetically modified foods and cloning

While gene therapy is still years away, our ability to manipulate DNA has already led to the production of genetically modified organisms, or GMOs. Although farmers and ranchers have manipulated the genes of domesticated plants and animals for thousands of years through selective breeding programs, this new ability to insert individual desired genes into the genomes of totally unrelated organisms has been greeted with both enthusiasm and unease. Proponents of such technology note that genetic modification can make crops resistant to insects, pesticides, even spoilage, and can make foods more nutritious and delicious. Opponents point out the possibility that such foods will be harmful to human health, and that the introduction of inappropriate genes into plants will result in “superweeds” capable of destroying entire ecosystems. The two sides also make opposing arguments regarding the impact of these technologies on the human and economic health of countries around the globe. But despite being dubbed “frankenfoods” by objectors, foods made from GMOs are already a part of our daily lives. Many farmers grow Monsanto’s “Roundup Ready” herbicide-resistant soybeans, and a significant portion of the milk consumed in this country is produced by cows injected with genetically engineered bovine growth hormone (BGH). While the debate rages on, no clear picture has emerged regarding the overall effect of these products on the lives of people around the world. The impact of GMO products on global health, as well as global economies, still needs to be more thoroughly studied.

Another aspect of genetic research is mammalian cloning, with the possibility of cloning human beings. For more information on this topic visit the Mammalian Cloning Hot Topic article.

Other potential benefits of genetic research

As noted above, one feature of the post-genomic era is the increased interest in sequencing the DNA of all types of organisms. Dozens of microbial genomes are currently under investigation as part of the US government’s “Genomes to Life” (GTL) program, and they harbor some surprising potential. Bioenergy research is focused on finding microbes that can produce energy as an alternative to foreign oil. Scientists have already identified bacteria that produce hydrogen gas or other fuels as a by-product of their metabolism as they break down biomass. Microbes may also be useful for toxic waste cleanup. There are strains of bacteria that actually thrive under conditions that are highly toxic to humans, such as in radioactive waste. Efficient decontamination of toxic materials by microbes could save the US taxpayers millions of dollars, as well as eliminate health hazards at toxic dump sites.

Proteomics

Having the genomic sequence of an organism represents an enormous achievement, but the important thing is to understand what the sequence means in terms of biological function. In fact, having the genome sequence is generally only the first step (though a crucial one) in understanding gene function and regulation. Many scientists feel that the next step is to study the proteins encoded by all those genes. In proteomics, researchers study how organisms use their genes by examining the proteins that cells actually make. Each organism expresses genetic information in a very specific way, which varies over time and from tissue to tissue. Some protein-encoding genes are heavily utilized, while others are used only in short bursts. Thus, even though the genome of an organism does not change, the proteome, or set of proteins found within a cell at any given time, can vary greatly. For example, the proteins expressed by a liver cell are very different from the proteins expressed by a brain cell, which are different again from the proteins expressed in the cells of a developing embryo.

With proteomics, scientists hope to gain a better understanding of how genetic information is translated into the growth and function of active, living cells. To provide just one example, while the sequence of the genome of the worm Caenorhabditis elegans (representing about 20,000 genes) was determined in 1998, fewer than 10% of those genes have been the subject of functional analyses using classical biochemical and genetic techniques. Only very recently have high-throughput DNA microarray (Gene Chip) methods been employed to establish gene expression and patterns of coregulation (see Kim et al., 2001). It is the immensity of the data sets deriving from genomic studies that places such a heavy emphasis on high-throughput, multidimensional analysis (see Gifford, 2001).
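One simple way to look for possible coregulation in expression data is to ask which genes rise and fall together across many experiments. The Python sketch below illustrates that idea with the Pearson correlation coefficient. It is a toy example under stated assumptions, not the method of the studies cited above: the gene names and expression values are made up, and the 0.9 cutoff is arbitrary. A high correlation suggests that two genes may be regulated together, but does not prove it.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up expression levels for three hypothetical genes in five experiments.
expression = {
    "gene_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_B": [2.1, 3.9, 6.2, 8.0, 9.8],   # rises together with gene_A
    "gene_C": [5.0, 1.0, 4.0, 2.0, 3.0],   # no clear relationship to gene_A
}

THRESHOLD = 0.9  # arbitrary cutoff chosen for this illustration

# Compare every pair of genes once and keep the strongly correlated pairs.
names = sorted(expression)
coregulated = [
    (a, b)
    for i, a in enumerate(names)
    for b in names[i + 1:]
    if pearson(expression[a], expression[b]) > THRESHOLD
]
print(coregulated)
```

With only three genes there are three pairs to compare. For a genome of about 20,000 genes the number of pairs runs to roughly 200 million, which is one reason high-throughput, automated analysis is needed.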

Conclusion

Genomic research, of the human genome and the genomes of many other organisms, is sure to bring advances in medicine and human health, energy production, toxic waste cleanup, and more. Certainly, many of these goals will not reach fruition for years, and some of the research being done is highly controversial. Still, the genie is out of the bottle, and how we deal with our newfound knowledge will be as much a test of our humanity as anything we have done in the past.

Genome Sequencing References and Web Links

D.K. Gifford, Science 293, 2049 (2001).

S.K. Kim et al., Science 293, 2087 (2001).

Sanger Institute – Genome research institute

The Institute for Genomic Research – Analysis of genomes from multiple organisms

Project Ensembl – Genome browser software

YourGenome – Information, news, and commentary on the field of genome science
