Happy 20th to the Human Genome Project

To celebrate 20 years of modern science, let’s take a look at the Human Genome Project, past and present.

Linda Bartlett (Photographer), Public domain, via Wikimedia Commons

Computers and genomics were made for each other. The precision and speed with which a computer could run through a gene sequence were unparalleled. Most important, though, was that computers made the gene sequence accessible to the entire scientific community.

April 14th, 2023 marked twenty years since the Human Genome Project was completed. Kicking off in October of 1990, it was arguably the largest-scale scientific project of all time. Involving thousands of collaborators, the project aimed to identify, map, and sequence all of the genes of the human genome. In simpler terms, it was a voyage to discover the chemical makeup and function of every single gene found in human DNA. The initial proposal suggested a 15-year project that would cost approximately three billion dollars. The project finished two years ahead of schedule, and since then its cost has been offset by the significant economic benefits that genomics has provided. The passage of two decades offers an opportunity to explore how the project has helped scientists understand and cure diseases, discover life-changing drugs, identify cancer-causing markers, and much more.

“With the introduction of computers into the lab, the process was enormously improved; not to mention the newfound readiness to share the discoveries coming in.” (National Cancer Institute / Unsplash)

When it was first proposed, the project seemed outlandish and impossible: DNA sequencing was done by hand, and computers had not yet been used in conjunction with genetics. DNA sequencing had been invented only 13 years before the project launched, and it relied on electrophoresis to separate DNA fragments for analysis. That step alone took 24 hours, and reading the results took hours after that. However, with the introduction of computers into the lab, the process was enormously improved, not to mention the newfound readiness to share the discoveries coming in.

The Human Genome Project (HGP) unlocked more than a wealth of data. It marked the beginning of a new sort of science: collaborative research. “The sooner we get it and everybody has access to it, the faster we can move in terms of understanding how life works and how disease occurs,” said Eric Green, the current director of the National Human Genome Research Institute (NHGRI) at the National Institutes of Health (NIH). In 1996, leaders of the project met in Bermuda at the first International Strategy Meeting on Human Genome Sequencing. The outcome was the Bermuda Principles, a groundbreaking agreement that all sequence data would be made public within 24 hours of being generated. This was a huge step in global collaboration, and it is the basis for how many scientific studies are conducted today, with an open approach to data sharing and open-source software.

This project helped introduce computing technology to science and medicine on a broad scale. Beyond the newfound ease of analysis and the benefits of collaboration, this had an incredible effect on healthcare. Suddenly, anyone who could operate a computer program could work in a pharmacy, doctors could easily prescribe medications, and treatments became widely available to the public. Economically speaking, this accessibility meant affordability. It also meant that pharmaceutical companies emerged as powerful corporations. This may have opened a new can of worms, but it has also meant incredible things for the economy. For instance, although struggles with regulation and government involvement have emerged, companies like Regeneron Pharmaceuticals have fostered the discovery and distribution of thousands of treatments. The collaborative approach of the HGP also saved time and resources that initiatives around the world would otherwise have spent on similar projects.

The completion of the project also revolutionized our understanding of evolution and of where we fit into the grand scheme of things. Beyond the human genome, the HGP set out to sequence the genomes of E. coli bacteria, a rat, and a fruit fly. The researchers achieved more than this, finishing the genome sequences of some multicellular organisms, another fruit fly, mice, and more. Since the project ended, over 4,000 organisms have been sequenced and placed on the genetic tree of life, ranging from microbes to primates. The reveal is incredible: all living organisms descended from a single common ancestor. With this knowledge, big questions about the implications for biology and medicine have become approachable. One fascinating insight into our own evolutionary history, for instance, is that Neanderthal DNA is present in our own Homo sapiens DNA. In other words, the two species interbred while they were diverging.

Francis Collins, the director of the National Institutes of Health and a leader of the Human Genome Project, discusses the findings and their importance after the completion of the project. (National Human Genome Research Institute (NHGRI) from Bethesda, MD, USA, CC BY 2.0 <https://creativecommons.org/licenses/by/2.0>, via Wikimedia Commons)

Before the Human Genome Project, the vast majority of drugs and treatments were found serendipitously. It was accepted as the norm that drugs’ targets, the molecules and proteins they affect, were largely unknown. Since 2001, however, nearly 100% of the drugs approved in any given year have had all of their target molecules and proteins identified. We’ve also been able to analyze genes like Tumor Necrosis Factor (TNF), which has been associated with at least 160 different known diseases. Since the completion of the HGP, fewer than three percent of genes have been left undiscussed in the scientific community. Thanks to the project’s discoveries, we are now more informed than ever about how to use drugs and treatments against specific diseases and disorders, as well as about the effects drugs may have on the rest of the body. This allows drugs to be safer and more effective than ever before. The NIH, for instance, talks about “the concept that drugs can be tailored to an individual’s genetic make-up to increase their efficacy and safety; for example by identifying how [individuals] metabolize drugs differently.”

In addition to the data from the Human Genome Project being used directly, the completion of the project has opened the door to other proposals and discoveries. Proteomics, for instance, is a field of study centered on identifying and quantifying proteomes, the entire sets of proteins produced by organisms. Proteomics as we know it today was made possible by the HGP, which provided the genetic information used to study the interactions, functions, compositions, and structures of proteins and their cellular activities. Thanks to the sequencing of protein-encoding genes, particulars such as a protein’s mass are revealed, which is essential to mass spectrometry, a tool noted as the most powerful method for analyzing proteomes.
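As a rough illustration of how a gene sequence can reveal a protein’s mass, the sketch below translates a short, made-up coding sequence into amino acids using the standard genetic code, then adds up standard monoisotopic residue masses. The function names and the example sequence are hypothetical, chosen only for illustration, and the sketch ignores the chemical modifications that real proteomics software has to account for.

```python
# Monoisotopic masses (in daltons) of amino acid residues, standard reference values.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water molecule is added for the two free ends of the chain

# Standard genetic code, listed in the conventional T, C, A, G order ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a coding DNA sequence into a protein string, stopping at a stop codon."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3].upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def peptide_mass(protein):
    """Sum the residue masses and add one water to get the monoisotopic mass."""
    return sum(RESIDUE_MASS[aa] for aa in protein) + WATER


# Hypothetical example: a made-up 12-base coding sequence.
example_protein = translate("ATGGGCGCATAA")
example_mass = peptide_mass(example_protein)
print(example_protein, round(example_mass, 4))
```

A mass spectrometer measures masses like this one, so a predicted value from the gene sequence gives researchers something to match their measurements against.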

Another example is the ENCODE (Encyclopedia Of DNA Elements) Project, launched in 2003 with the goal of understanding the functional parts of the genome. Much of the human genome was long considered “nonfunctional” because it does not code for proteins. One debate in the planning of the Human Genome Project was whether it was worth mapping out these non-coding segments of the genome. Thankfully, this so-called “junk DNA” was analyzed and found to be not so full of junk after all. Functional sequences were once defined as the protein-coding parts of DNA. In other words, they were the segments that mattered. However, after analyzing the “junk DNA,” we now know that the majority of functional sequences do not encode proteins. The ENCODE project has raised challenging questions, and offered some answers, about genome functionality. As the purposes of different genes are identified, pinpointing the origins of diseases and disorders could become simpler and more practical.

Here is one of my personal favorite by-products and real-world applications: the NHGRI (National Human Genome Research Institute) Dog Genome Project. Aside from adding to our understanding of evolution and relatedness in the animal world, it has spun off canine versions of DNA ancestry-based biotech companies. (Left: National Human Genome Research Institute, Public domain, via Wikimedia Commons; Right: The Broad Institute of MIT and Harvard, Public domain, via Wikimedia Commons)

There are plenty of resources that everyday people use that were made possible by the Human Genome Project. Ever heard of Ancestry.com? How about 23andMe? These biotech companies specialize in genetic ancestry tracing. Advertising slogans like “being able to find lost relatives” and “uncover family secrets and heroics” urge people to send in small samples of their saliva to be genetically traced. Samples are entered into a database and compared to the millions already there to uncover links and similarities. These can reveal regions of origin, relatives, and more. Over 30 million people around the world have taken DNA tests, and over half of those were ancestry kits. In addition to satisfying familial interest, these ancestry projects can reveal valuable information about your medical predispositions. 23andMe places a spotlight on this aspect of DNA sequencing: “It’s time for DNA-powered health” are the words centered and bolded on the company’s website. By getting your own genome sequenced, you can find out specifics ranging from your likelihood of developing diseases and disorders, such as diabetes, lactose intolerance, and cancers, to dietary and exercise details particular to your biological needs. Learning your carrier status is strongly advised, not only for your own sake but for that of your children, who can be detrimentally affected by unknown genetic conditions. 23andMe boasts that 76% of its clients report making healthier life choices.

Even here at Bronx Science, we learn about the importance of the project, and teachers use its findings in lessons. Dr. Khaled Mahmoud, who teaches the Post-AP Genetics class, reported that his students have many lessons that go in depth into the history and uses of the Human Genome Project. “They explore the different molecular techniques used in the project, and the modifications made in the DNA sequencing technology. They also learn about the real world applications related to genomics and the human genome project,” said Dr. Mahmoud. He went on to elaborate on these real-world applications, including 23andMe and DNA ancestry. Even though the genetics class was formed after the project was completed, the class continues to keep up with new discoveries and discuss them in town hall-type classes.

Eli Stahl, a research geneticist working at Regeneron, has been in the genetics field for his entire scientific career. Even though he worked with genetics throughout graduate school at the University of Chicago, Stahl didn’t enter the world of human genetics until 2008. When asked about his connections to the NHGRI, he responded, “I’m the primary user of Human Genome Project data,” and added that his bosses were directly involved with gathering that data. The work he does every day depends on the data produced by those pioneering scientists. Stahl’s past projects linked disorders such as schizophrenia and bipolar disorder to specific proteins and loci on the human genome. Now working on Parkinson’s disease, he agrees that the HGP was a game-changer and that its completion has been, and will continue to be, life-changing.
