Applying sequencing power to the smallest organisms
But what about other organisms? Plants have proven to be particularly thorny sequencing subjects, with many containing massive, highly repetitive genomes. The largest yet identified is from Paris japonica, a rare Japanese flower that has 40 chromosomes and a whopping 149 billion base pairs. On the other side of the coin are microorganisms such as bacteria, which can efficiently pack the information needed for life into as few as 130,000 base pairs.
While bacterial genomes are highly variable in size, even the largest, which can reach nearly 15 million base pairs, pose little challenge for today’s sequencers. The challenge is working out what a given sample contains, since a single sample can hold hundreds or even thousands of bacterial species. Such work is essential for microbial research, including work investigating the microbes in, on and around our own bodies, collectively called our microbiomes. And while the genomes might be vastly different from ours, recent discoveries indicate that the microbiome also plays a vital role in our health and wellbeing.
Microbiome research: From culture to shotgun to long read
While the genomes might not be large, sequencing bacterial samples to correctly identify the species within and their abundances takes some ingenuity. Ideally, bacteria and other microbes can be isolated and cultured, and the different cultures then sequenced. But many microbes are either very difficult or, thus far, impossible to culture, and the method is laborious and time-consuming, making it infeasible for many applications.
Another technique, known as 16S rRNA sequencing, focuses on a single gene, 16S rRNA, that is found across bacterial species and contains highly variable regions that differ from one species to the next. It is a fast and inexpensive way to characterize bacterial samples and identify species, but it has limitations, such as an inability to distinguish different strains of a single bacterial species. Therefore, over the past two decades researchers have developed and refined what is known as shotgun metagenomic sequencing.
Shotgun metagenomic sequencing is so called because, unlike more targeted sequencing methods, it looks to “hit” targets across entire mixed samples. It begins similarly to short-read sequencing of large genomes. DNA is extracted from the cells in the sample, the genomes are broken into short segments, and millions of the segments are sequenced in parallel. The segments are then “assembled” using reference data. Instead of a single reference genome such as a human’s, however, the assembly software matches the sequences to the many bacterial reference genomes that have been generated. There are difficulties associated with accurately reassembling all of the reads from complex samples, but the data produced generally provide a comprehensive overview of the microbes present in a given sample.
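The matching step described above can be illustrated with a small Python sketch. It is a toy under stated assumptions, not how any particular metagenomics tool works: the two reference "genomes", the reads, the species names and all function names are invented for illustration, and matching is done by counting shared short subsequences (k-mers) rather than by a real alignment or assembly algorithm.

```python
from collections import Counter

def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def build_index(references, k):
    """Map each k-mer to the set of reference names that contain it."""
    index = {}
    for name, genome in references.items():
        for kmer in kmers(genome, k):
            index.setdefault(kmer, set()).add(name)
    return index

def classify_read(read, index, k):
    """Assign a read to the reference sharing the most k-mers with it.

    Returns None when no k-mer matches or when the top two references tie.
    """
    votes = Counter()
    for kmer in kmers(read, k):
        for name in index.get(kmer, ()):
            votes[name] += 1
    if not votes:
        return None
    ranked = votes.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # ambiguous read: leave it unclassified
    return ranked[0][0]

def relative_abundance(reads, references, k=5):
    """Fraction of classified reads assigned to each reference."""
    index = build_index(references, k)
    counts = Counter(classify_read(read, index, k) for read in reads)
    counts.pop(None, None)  # drop unclassified reads
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()} if total else {}

# Hypothetical toy data: two made-up reference sequences and five short reads.
references = {
    "species_a": "ATGGCGTACGTTAGCCTAGGCTAACGT",
    "species_b": "TTGACCGGATATCCGGTTAACCGGAAT",
}
reads = [
    "GCGTACGTTAGC",  # fragment of species_a
    "CTAGGCTAACGT",  # fragment of species_a
    "ACGTTAGCCTAG",  # fragment of species_a
    "GACCGGATATCC",  # fragment of species_b
    "AAAAAAAAAAAA",  # matches neither reference
]

if __name__ == "__main__":
    print(relative_abundance(reads, references, k=5))
```

In this toy sample, three reads match the first reference, one matches the second, and one matches nothing, so the classified reads split 75 percent to 25 percent. The sketch ignores sequencing errors, differences in genome length and closely related strains that share most of their sequence, which are among the difficulties the paragraph above mentions for complex samples.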
Finally, new long-read sequencing technologies are providing important new capabilities for microbiome research. Long-read sequencing methods can be used to sequence entire microbial genomes in one pass and eliminate the need for assembly algorithms and their associated drawbacks. While still evolving, the use of long-read sequencing techniques in research is providing even more power to investigate bacterial strain properties, identify new species, construct accurate reference genomes, track gene transfer between bacteria (such as antibiotic resistance genes, an important public health threat) and more.
How JAX is learning more about what’s in us and on us
Microbiome research at The Jackson Laboratory (JAX) has contributed a great deal to revealing the what of our microbiome: which species are in and on us, and what might they tell us about our health and susceptibility to disease? Professor George Weinstock, Ph.D., a pioneer in the field of human genomic sequencing, has turned his attention to microbial samples to learn how the combinations and relative abundances of bacterial species in and on us affect our health, and how our own genetics and behavior might affect them in turn. It’s a highly complex dynamic, and the huge effort needed to obtain, sequence and analyze enough samples to identify patterns and establish cause-and-effect relationships has only just begun. Weinstock is working on many fronts to characterize which microbes are found where, as well as what that might mean for disease. One collaborative initiative is the National Institutes of Health’s Human Microbiome Project, which has cataloged the microbial communities from multiple sites on the human body and is now in a second phase, the Integrative Human Microbiome Project, which explores the biological properties of both microbiome and host across three different microbiome-associated conditions.
Assistant Professor Julia Oh, Ph.D., is also exploring the microbiome, but her research mostly addresses what’s on us: the skin microbiome. Oh combines experimental lab work with developing advanced computational methods to learn more from the microbiome sequencing data her laboratory generates. For example, she seeks to understand not only how microbiome samples from skin differ, but also which fundamental characteristics distinguish a healthy state from a diseased one. How do the different microbial communities grow and compete, and how does that affect skin health? And why do some bacterial species that are beneficial to humans on the skin also have strains that can become pathogenic if they are introduced elsewhere in the body? Oh is also looking to apply her research to potential therapies, such as engineering probiotic strains that are able to integrate into microbial communities and provide effective skin treatments.
An unknown world of microbes
Even with this powerful technology, we are just beginning to scratch the surface of knowledge about our microbial neighbors. Over the past 25 years, researchers have generated hundreds of thousands of sequences for bacteria and archaea and made them available through databases such as the Genomic Encyclopedia of Bacteria and Archaea (GEBA). But until very recently, nearly half of the sequenced genomes came from strains of just 10 species that are, not coincidentally, human pathogens. Researchers are working to expand the range of species sequenced, but a recent study estimates that only about two percent of the global variety is represented in the data so far. And while the other 98 percent includes very rare species and those that live in extreme environments, many of them are species with which we cohabit.
Microbiomics is an exciting field, and one that has profound implications for health and medicine. And as we start linking the what (which species are where, and in what combination?) with causal relationships and clinical meaning, we will gain far better insight into both our wellbeing and our own ecosystems, even the parts of them we cannot see.