This past April marked the twentieth year since the completion of the Human Genome Project (HGP), and what a project it was! Beginning in 1990, scientists from universities and research centers across six countries banded together in an international effort to map the first complete human blueprint. This global research group became known as the International Human Genome Sequencing Consortium (IHGSC), and when the project ended in 2003, they had achieved things some thought impossible before its start – including successfully mapping 92 percent of the genetic roadmap that makes us “us.” Their combined efforts within the global biomedical community led to new growth in ethics and open scientific data sharing and charted a path for the future of DNA sequencing technology.
Recent Breakthroughs in Genomics Research
In January 2022, researchers at Stanford Medicine partnered with Oxford Nanopore Technologies (ONT) to complete the first series of single-day genome mappings, and today’s genomics laboratories can routinely map a dozen human genomes in less than two weeks. This author calls that feat miraculous, but the means to swiftly and thoroughly map the human genome didn’t come about without extraordinary effort, vision, and exploration. It took over 13 years of work to build a majority map of the human genome at the turn of the century. It wasn’t until March 2022, nearly 20 years later, that the Telomere-to-Telomere (T2T) Consortium announced it had filled in the gaps left over from the HGP, creating the first fully complete human genome sequence. This achievement grew out of decades of innovation in biomedical research and the rapid development of new technology.
Next Generation Sequencing
The technology that makes today’s rapid genome sequencing possible is Next Generation Sequencing (NGS). NGS is a family of fast and cost-effective DNA/RNA sequencing technologies that play a major role across a swath of life-sciences fields including ecology, microbiology, and genetics. Nick McCooke, founding CEO of Solexa, led his team to invent the first NGS technology in 2005. In an interview with Labiotech, he explains that the idea was born from minds at Cambridge University amid the excitement of the HGP and focused on massively parallelizing the sequencing process. Before this approach, sequencing was done one DNA fragment at a time with methods like Maxam-Gilbert and Sanger. The latter was the primary sequencing method used for the HGP. NGS technology has grown exponentially in sequencing power since Solexa’s achievements. To put it into perspective, global life-sciences and genomics leader Illumina, which acquired Solexa in 2006, says, “Since 2007, NGS data output has increased at a rate that outpaces Moore’s law, more than doubling each year.”
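To get a feel for what "outpacing Moore's law" means, here is a rough back-of-the-envelope sketch in Python. It assumes Moore's law corresponds to a doubling every two years and treats NGS output as doubling exactly once a year, which is only the lower bound implied by the quote. The function name and the ten-year window are illustrative choices, not figures from Illumina.

```python
def fold_increase(years: float, doubling_time_years: float) -> float:
    """Return the multiplicative growth after `years` for a given doubling time."""
    return 2 ** (years / doubling_time_years)


years = 10  # illustrative window, not a figure from the article

# Assumption: Moore's law taken as one doubling every two years.
moore = fold_increase(years, 2.0)

# Assumption: exactly one doubling per year, the floor of "more than doubling each year".
yearly = fold_increase(years, 1.0)

print(f"After {years} years: Moore's law ~{moore:.0f}x, yearly doubling ~{yearly:.0f}x")
# prints: After 10 years: Moore's law ~32x, yearly doubling ~1024x
```

Under these assumptions, a decade of yearly doubling yields roughly a thousandfold increase, compared with about thirtyfold at the Moore's law pace.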
How Does NGS Work?
NGS methodology varies by platform, but in general, there are a few broad steps that each method entails. The variance is primarily in how the steps are performed, and it’s important to note that while the methods can overlap in application, each has its own set of strengths and weaknesses. Developers of the various NGS technologies have shaped their platforms to meet the needs of their research goals. For example, Ion Torrent, a subsidiary of ThermoFisher, is well suited for research that delves into inherited and complex disease, as well as reproductive health, whereas ONT’s MinION shines in cancer research, infectious disease study, and building new reference genomes.
One thing the range of sequencing platforms has in common is library creation. A library is a specimen group that has been prepared for sequencing. In essence, building a library converts DNA or RNA into a format that can be read by a sequencing platform. Today’s NGS technologies require library preparation as part of Whole Genome Sequencing (WGS), as well as Whole Exome Sequencing (WES) and other targeted sequencing processes.
An early step in library creation is fragmentation. Fragmentation is necessary for both short-read sequencing and long-read sequencing, the latter also called Third Generation Sequencing (TGS), as it breaks genetic material into sizes that current platforms can manage. Methods include physical, chemical, and enzymatic fragmentation.
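As a toy illustration of the idea, the Python sketch below cuts a sequence into consecutive pieces of random length. It is a simplification under stated assumptions: real fragmentation acts on many copies of the molecule at once and produces overlapping fragments, and the function name, size range, and example sequence here are all hypothetical.

```python
import random


def fragment(sequence: str, min_len: int, max_len: int, seed: int = 0) -> list[str]:
    """Cut `sequence` into consecutive pieces of random length (toy model)."""
    rng = random.Random(seed)  # seeded so the example is repeatable
    fragments = []
    pos = 0
    while pos < len(sequence):
        size = rng.randint(min_len, max_len)  # inclusive on both ends
        fragments.append(sequence[pos:pos + size])  # last piece may be shorter
        pos += size
    return fragments


genome = "ACGT" * 25  # hypothetical 100-base stand-in for real genetic material
pieces = fragment(genome, min_len=10, max_len=30)

# Every base ends up in exactly one fragment, so joining them restores the input.
print(len(pieces), "fragments;", "".join(pieces) == genome)
```

Changing `min_len` and `max_len` mimics, very loosely, the difference between preparing material for short-read and long-read platforms.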
Following fragmentation, genetic material undergoes differing processes dependent upon the platform used. These methodologies can be largely divided between platforms that require clonal amplification for sequencing and platforms that don’t. Short-read technologies rely on replicating short DNA fragments, called oligonucleotides, in the sequencing process. Two common methods for this are Sequencing by Ligation (SBL) and Sequencing by Synthesis (SBS). Both methods involve replicating DNA and recording the order of nucleotides as new phosphodiester bonds form.
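The core idea behind SBS, reading a template by recording which nucleotide is added at each step, can be sketched in a few lines of Python. This is a conceptual toy only: it ignores clonal amplification, fluorescent detection, strand direction, and read errors, and the function and variable names are hypothetical rather than taken from any platform.

```python
# Watson-Crick base pairing: each template base accepts exactly one partner.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def sequence_by_synthesis(template: str) -> str:
    """Build the complementary strand one base per cycle and record each addition."""
    read = []
    for base in template:
        incorporated = COMPLEMENT[base]  # the only nucleotide that pairs here
        read.append(incorporated)        # stand-in for detecting the added base
    return "".join(read)


template = "TACGGATC"  # hypothetical template fragment
read = sequence_by_synthesis(template)
print(read)  # prints: ATGCCTAG
```

Because base pairing is one-to-one, the recorded read determines the template: running the same function on the read returns the original fragment.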
Conversely, long-read platforms sequence much longer DNA strands, or polynucleotides, directly from native samples. These technologies don’t require amplification so long as usable genetic material exceeds a certain threshold. While long-read sequencing overcomes this major limitation of short-read technologies, it comes with its own list of drawbacks. Long-read platforms are much more expensive than their short-read counterparts. They have also historically been less accurate than short-read platforms, and slow to process data for large genomes.
That said, long-read platforms are maturing quickly. PacBio’s HiFi sequencing technology, released in April 2021, boasts an impressive 99.9 percent accuracy rate, putting it on par with other NGS platforms. Meanwhile, MinION platforms deployed to West Africa during the 2014-2016 Ebola outbreak sequenced the virus’s entire genome in an hour, and they did so again in 2018 when the Zaire strain emerged in the DRC. The platform’s portable size and real-time sequencing ability were critical in efforts to identify and study the virus for further action.
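Accuracy figures like this are often expressed on the Phred scale, where the quality score is Q = -10 × log10(error probability). The Python sketch below converts an accuracy into a Phred score, assuming the 99.9 percent figure is a per-base accuracy. The 15,000-base read length is a hypothetical value chosen for illustration, not a PacBio specification.

```python
import math


def phred_quality(accuracy: float) -> float:
    """Convert a per-base accuracy (e.g. 0.999) into a Phred quality score."""
    error_probability = 1.0 - accuracy
    return -10.0 * math.log10(error_probability)


def expected_errors(accuracy: float, read_length: int) -> float:
    """Average number of miscalled bases in a read of `read_length` bases."""
    return (1.0 - accuracy) * read_length


accuracy = 0.999      # assumption: the quoted 99.9 percent is per base
read_length = 15_000  # hypothetical long-read length, for illustration only

print(f"Phred score: Q{phred_quality(accuracy):.0f}")
print(f"Expected errors per read: {expected_errors(accuracy, read_length):.0f}")
# prints: Phred score: Q30
# prints: Expected errors per read: 15
```

On this scale, every additional 10 points means a tenfold drop in error rate, so 99 percent accuracy is Q20 and 99.99 percent is Q40.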
Looking to the Future
There’s a place for both short-reads and long-reads in today’s genomics industry, and technologies on each side of the sequencing fence are often best used in tandem. However, advancements in TGS technology are beginning to correct long-reads’ previous shortcomings while also filling in the gaps of short-read platforms, and big names like Illumina, Element Biosciences, and MGI Technology have been working on their own approaches to long-read sequencing. Will there come a time when both sequencing families are part and parcel of a greater one-box approach? That’s one goal industry leaders and researchers alike look forward to. Might we see the day short-read sequencing is entirely a thing of the past? Are we nearing an era in which cost-effective bioinformatic software can quickly and efficiently isolate and target genome segments from telomere to telomere? Whatever the future brings, the possibilities are endless.