After several years of exciting new genomics methods and capabilities emerging one after the other (better NGS and CRISPR being the most obvious examples), the past year hadn't produced any game changers, at least not to my knowledge. Instead, it seemed that the existing tools had matured and were being applied in interesting and unexpected ways. Also, even as substantive progress is being made on the genomic medicine front, efficiently applying genomic data in the clinic remains an exceedingly difficult task. What I heard recently at The American Society of Human Genetics' annual meeting (ASHG) bolstered those impressions.
Genome data silos are still prevalent, but large amounts of human genome data are being aggregated and annotated. Indeed, at ASHG a group from the Broad Institute announced the doubling of the number of exomes in the ExAC database to more than 125,000, plus the launch of a whole genome sequence database called gnomAD. The progress and potential are hugely exciting, but the early returns are cautionary. As happens so often in biology, especially human biology, the search for clarity is yielding layer upon layer of uncertainty and complexity. I have my own concrete example: the data from my own genome. It yielded nearly 70 times more variants of uncertain significance (VUS) than variants with robust annotation regarding pathogenicity and penetrance (2,776 versus 41).
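The imbalance above is stark enough to be worth making concrete. The short sketch below tallies variant classifications from a personal genome report; the category labels are illustrative placeholders, not a real ClinVar schema, and the counts simply mirror the figures quoted above.

```python
# Sketch: tallying variant classifications from an annotated genome report.
# Category names are hypothetical; counts mirror those cited in the text.
from collections import Counter

def summarize(classifications):
    """Count variants per clinical-significance category."""
    return Counter(classifications)

# Hypothetical per-variant labels from a personal genome report.
labels = (["uncertain_significance"] * 2776 +
          ["well_annotated"] * 41)

counts = summarize(labels)
ratio = counts["uncertain_significance"] / counts["well_annotated"]
print(counts)
print(f"VUS outnumber well-annotated variants ~{ratio:.0f}-fold")  # ~68-fold
```

Even in a single genome, in other words, the variants we can act on are a rounding error next to the ones we cannot yet interpret.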
My first session at ASHG was therefore a hands-on workshop on teaching genomic medicine, using methods originally developed to teach pathologists how to work with a sequencing report. We were provided with clinical sequencing data (mostly cancer related, for obvious reasons) and employed the online tools in the variant analysis arsenal, including COSMIC, cBioPortal, ClinVar and others. We had to call variants in the sequence and then decide whether each should be flagged as pathogenic or possibly so. The exercise was designed to showcase the thought that goes into every call, the continued existence of conflicting annotations, and the changes that occur in the databases every day. It definitely did, underscoring the reality that interpreting clinical genome data remains labor intensive and inexact.
I later attended a session of talks about interpreting VUS, including data analysis from parent-child trios in a neurodevelopmental disorder cohort, variant interpretation in healthy populations, pathogenicity lessons from analyzing the first 60,000 exomes in ExAC, variant calls for Mendelian disease, copy number variants, and more. I was excited when I saw the lineup, but two-thirds of the way through, I confess my head was spinning. All these glorious data—sequences by the thousands and tens of thousands—added up to … what? Even a simple sequence difference in a protein coding region is difficult to interpret, and many variants previously flagged as pathogenic are showing up in relatively large numbers in healthy people in ExAC. That means they almost certainly aren't pathogenic, or are pathogenic only in combination with other factors. What about copy number variants, in which there may be more or fewer than two copies of a gene with a "normal" sequence? And how do differences in post-transcriptional mRNA processing affect protein function? More sequences, perhaps millions more, must be obtained, pooled and analyzed before we can start to answer these questions for patients with less time and effort and higher success rates.
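The frequency-based reasoning described above can be sketched in a few lines: a variant once flagged as pathogenic for a rare, severe disorder is unlikely to be truly pathogenic on its own if it turns out to be common in healthy reference populations. The 0.1% threshold below is purely illustrative; real pipelines (e.g., those applying the ACMG BA1/BS1 criteria) tune cutoffs to disease prevalence and inheritance model, and this function is a hypothetical simplification.

```python
# Minimal sketch of population-frequency sanity checking for variant calls.
# The threshold and return labels are illustrative assumptions, not a
# real clinical-grade classifier.

def reassess(reported_pathogenic: bool, population_af: float,
             af_threshold: float = 0.001) -> str:
    """Downgrade a pathogenic assertion if the variant is common in
    healthy reference populations (e.g., ExAC/gnomAD allele frequency)."""
    if reported_pathogenic and population_af > af_threshold:
        # Too common in healthy people to cause a rare, severe disease alone.
        return "likely benign or context-dependent"
    if reported_pathogenic:
        return "pathogenic (assertion stands)"
    return "uncertain significance"

print(reassess(True, 0.02))     # common variant: downgrade the old call
print(reassess(True, 0.00004))  # rare variant: the assertion stands
```

This is exactly why the large healthy-population datasets matter: without them, there is no denominator against which to test old pathogenicity claims.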
Much of the buzz around CRISPR has focused on its potential use in humans and the enormous ethical implications arising from such work. Technically we can already use it to fix devastating genetic diseases. Theoretically, we could also manipulate traits such as intelligence, appearance and athletic talent. (If and when we figure out the complex genetic underpinnings of those traits, that is.) The research community is scrambling to develop guidelines that address the technology's potential for harm while not limiting the very real opportunities it presents for accelerating research. A series of excellent talks, some with real-time gathering of attendee opinions, covered the state of the discussion in both lay public and research community contexts. They frankly pushed the envelope further than I expected given the brouhaha over the 3PN embryo study just last year, boldly addressing the potential to alter human genetics, even to the point of affecting the germline and future generations. ASHG is issuing a formal policy statement in the coming weeks, and I anticipate a lively ongoing discussion.
Technical challenges also remain for even the more ethically straightforward clinical applications (e.g., delivering CRISPR components to enough liver cells to fix hemophilia), but in research using cell lines and model organisms, CRISPR-based techniques are being refined and applied with breathtaking speed. On the human side, a series of talks on genome-scale screens in specific cell lines showed how quickly researchers are now able to build genotype-phenotype associations. That is, which genes are associated with a certain cellular trait? The technique has provided valuable insight into the genes essential for survival, has revealed the genetic basis of cancer therapy resistance, and has helped map disease-relevant regulatory elements in the genome. Future applications include screens assessing gain-of-function—that is, what happens when a gene is turned on, not knocked out—using a disabled form of Cas9, simultaneously screening multiple cell types, and more. Efforts along these lines will benefit from the development of new tools like Casilio, a CRISPR technology developed at JAX that allows the simultaneous manipulation of multiple genes.
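The core analysis behind such genome-scale knockout screens can be sketched simply: guide RNAs targeting essential genes drop out of the cell population over time, so their sequencing read counts fall between the initial and final time points. The gene counts below are invented toy numbers, and real analyses use dedicated tools (MAGeCK is a common choice); this is only a hedged illustration of the depletion logic.

```python
# Toy sketch of hit calling in a CRISPR knockout screen: per-gene median
# log2 fold change of sgRNA read counts (final vs. initial). All counts
# here are invented for illustration.
import math

def gene_log2fc(initial, final, pseudocount=1.0):
    """Median log2 fold change across a gene's guides."""
    lfcs = sorted(math.log2((f + pseudocount) / (i + pseudocount))
                  for i, f in zip(initial, final))
    mid = len(lfcs) // 2
    return lfcs[mid] if len(lfcs) % 2 else (lfcs[mid - 1] + lfcs[mid]) / 2

# Four guides per gene: an essential gene depletes, a neutral one does not.
essential = gene_log2fc(initial=[500, 480, 520, 510], final=[60, 55, 70, 65])
neutral = gene_log2fc(initial=[500, 490, 505, 515], final=[495, 500, 510, 490])
print(f"essential gene log2FC: {essential:.1f}")  # strongly negative
print(f"neutral gene log2FC: {neutral:.1f}")      # near zero
```

Ranking thousands of genes by this kind of depletion score is what turns a pooled screen into a genome-wide genotype-phenotype map.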
ASHG was about human genetics, of course, but CRISPR work in model organisms also bears mention. Recent overview articles in Science and Nature showcase the speed and power of these new methods, which allow the precise engineering of human patient variants and mutations into mouse models. Advances such as this are starting to overcome some of the difficulties encountered in translating mouse data to clinical progress, as the models become more accurate in recreating disease pathology. The back-and-forth between human patients and models is relatively new, and it is already leading to a wave of discovery and potential therapeutic targets.
For much of the decade after the Human Genome Project wrapped up, gathering genomic data was news in itself. A few years ago, the question largely shifted to what to do with the mountains of data piling up. Now, it seems, we have tools to better analyze the data and to shape genomes themselves. The ability to realize the potential of genomics is in our grasp, but first we must overcome serious issues around data hoarding, security and privacy, ethics, and more. Until then, the significance of the vast majority of our variation will remain unknown, and the power of CRISPR may frighten as well as enlighten.
Mark Wanner followed graduate work in microbiology with more than 25 years of experience in book publishing and scientific writing. His work at The Jackson Laboratory focuses on making complex genetic, genomic and technical information accessible to a variety of audiences. Follow Mark on Twitter at @markgenome.