Tracking sub-genomic SARS-CoV-2 variation

JAX Professor Chia-Lin Wei, Ph.D.

The COVID-19 pandemic has put a bright spotlight on a viral family called coronaviruses, members of which can also cause cases of the common cold. Coronaviruses, including SARS-CoV-2, the specific virus that causes COVID-19, are RNA viruses, which carry their genetic code as RNA, not DNA. The processes through which they produce proteins and replicate within cells are highly error-prone, and coronaviruses can mutate rapidly within a host population.

A lot of effort has therefore gone into sequencing SARS-CoV-2 genomes and tracking the many variants that have emerged. The work is vital for surveillance, as some variants, such as delta, gained traits that have had important effects on the spread of infection and disease. But most sequencing focuses on a particular kind of mutation known as a single nucleotide variant, where a single base changes in the genome. Research led by Jackson Laboratory (JAX) Professor Chia-Lin Wei, Ph.D., and Associate Director, Genome Technology Data Science Chee Hong Wong, M.Sc., looked instead into structural variations in the viral RNA as well as separate, sub-genomic RNA (sgRNA) transcripts. They found that differences in sgRNA expression provide vital information not previously recognized regarding SARS-CoV-2 virulence and the host response to infection.

In “Reduced subgenomic RNA expression is a molecular indicator of asymptomatic SARS-CoV-2 infection,” published in Communications Medicine, the research team presents findings from 81 clinical specimens, 51 of which were collected from hospital patients with symptomatic COVID-19 infections and 30 from asymptomatic people who nonetheless tested positive for SARS-CoV-2. When SARS-CoV-2 infects a cell, it produces both full-length genomic RNAs and a distinct set of sgRNAs. The sgRNAs serve as viral RNAs for the translation of multiple viral structural and accessory proteins, including the spike surface glycoprotein (S protein) and nucleocapsid protein (N protein), which are widely used as vaccine and drug targets. Using advanced sequencing methods, the researchers identified and characterized sgRNAs in addition to full SARS-CoV-2 genomic RNAs. They found that asymptomatic people had drastically lower sgRNA amounts and expression compared with symptomatic patients, indicating a lack of active viral transcription in asymptomatic infections.

Investigating further, Wei and her team examined structural deletions in SARS-CoV-2 RNAs from the samples and found subsets of deletions specific to symptomatic versus asymptomatic samples. In symptomatic cases, the deletions were both more abundant and significantly larger than those found in asymptomatic hosts. The additional deletions may result from more active viral replication, and more errors during the process, in the symptomatic hosts. The presence of different types of deletions, some of them exclusive to one host response or the other, strongly suggests that structural variants play a functional role in SARS-CoV-2 pathogenicity. The variants also increase the complexity of the viral protein repertoire, with important implications for SARS-CoV-2 evolution under host response pressures.

The distinct structural variant signatures found in symptomatic versus asymptomatic hosts may represent viral quasispecies, with significant ramifications for viral fitness and virulence. Moving forward, the functional significance of these quasispecies for the host immune system, viral evolution, and disease transmission needs further study. In the meantime, recognizing sgRNA expression and structural diversity provides vital insight into host-viral interactions, viral evolution, and transmission. It also provides a path forward for developing better risk mitigation and testing strategies, as well as informing ongoing vaccine development.