Defining long COVID with data

JAX researcher Peter Robinson at The Jackson Laboratory for Genomic Medicine in Farmington, CT. Photo credit: Charles Camarda

JAX Professor Peter Robinson, as part of the National COVID Cohort Collaborative, is working with hospital systems across the U.S. to pool COVID patient data, better define the scope of the long COVID problem, and provide better clinical insight for patients.

By any measure, the COVID-19 pandemic has been devastating, a global march of illness and death now well into its third year.

And trouble may still lie ahead, even for those who recover from an infection seemingly unscathed. While the medical community has become better at preventing and treating acute COVID-19 through vaccines and antiviral therapies, many people face a mysterious add-on to their initial infection: Post-Acute Sequelae of SARS-CoV-2 Infection (PASC), also known as long COVID.

What is long COVID? Who is most at risk? Does prior vaccination provide protection? What are the most common symptoms? How long does it actually last? These questions, and many more, are still difficult to answer, and the situation continues to change as new variants emerge and more people are infected and re-infected. To learn more, patient data is needed. Lots of patient data. And that’s where Jackson Laboratory (JAX) Professor Peter Robinson, M.D., and the National COVID Cohort Collaborative (N3C) come in.

What is long COVID?

Anecdotal evidence began to emerge in mid-2020 that recovering from a COVID-19 infection was not necessarily the end of the story. People reported various health issues, including lingering fatigue, from minor to acute and debilitating, heart problems, cognitive disruption, new-onset diabetes, and many others. In fact, a recent article in Nature reported that, in all, more than 200 symptoms have been linked with what is now known as long COVID. At the same time, however, acquiring a clear, empirical grasp of long COVID has been very difficult. The challenges are many, with one of the most obvious being the highly variable manifestations of disease. And while some cases are easily tracked from initial infection through the onset of long COVID shortly thereafter, others are not so clear cut.

It’s no surprise, then, that studies tallying long COVID incidence following a COVID-19 infection have come up with wildly different estimates, ranging from 5% to 50%. According to the U.S. CDC, more than 40% of American adults self-reported having had COVID-19 in the past, and of those, 19% were still having long COVID symptoms. The findings, based on data collected from a federal survey in June 2022, carry the usual caveats associated with surveys, but they provide a good overview of long COVID in the U.S. And they underscore just what a pervasive health problem it is.

Integrating COVID-19 patient data

When the first COVID-19 cases emerged in the United States, it was immediately apparent that clinical data needed to be aggregated and shared quickly to determine best treatment practices. The United States’ highly siloed healthcare systems and non-interoperable electronic health record (EHR) databases presented significant challenges, however. To address the problem, a group of leading researchers formed the National COVID Cohort Collaborative (N3C) to extract and combine the data across organizations and data models. Peter Robinson joined the group and was a co-author on a paper announcing its design and deployment in August 2020. Of course, at that point the focus was on acute initial infections, but as time went by and long COVID emerged as a serious health issue, Robinson and his group shifted their attention to it.

By October 2021, N3C had data from more than 1.8 million COVID patients who presented with acute cases that needed medical intervention, and five million positive controls. Early studies from other groups had compared post-COVID patients with control patients who had influenza or other respiratory tract infections (RTIs) and found alarmingly high rates of new-onset psychiatric conditions in the post-COVID patients, who were about 50% more likely to develop them than control patients. From the N3C cohort, Robinson and his team, including UConn MD-PhD student Ben Coleman, were able to create 46,610 matched patient pairs with RTI control patients to enable apples-to-apples comparisons. The researchers compared the incidence of all new psychiatric diseases and specifically mood disorders and anxiety disorders for the periods from 21-120 days and 121-365 days after the detection of the initial infection.

In a paper published in World Psychiatry, the team presented findings that the long COVID patients were indeed more likely to be diagnosed with psychiatric diseases than post-RTI patients. The difference, however—about 25%—was much less than previously reported, and occurred relatively soon after infection, within the 21-120-day time period. The variability in the study results might reflect the evolving situation—do SARS-CoV-2 variants or vaccines play a role?—but also underscores the importance of continuing to study what long COVID is and what it is not as we move forward in the pandemic.

“Whenever I look at a patient’s medical record, I ask what story it tells about my patient,” says Coleman. “In this study, we looked at medical records across the U.S. and we told the world the story of thousands of patients suffering from psychiatric disease months after recovering from acute infection. Their story represents a significant public health concern.”

The evolution of the pandemic

Robinson’s group has continued to analyze the N3C data to further clarify the link between long COVID and psychiatric symptoms and to assess other disorders. By March 2022 the N3C platform contained data for nearly three million COVID-19 patients, increasing the power to investigate potential patterns and predictors for long COVID within the data set. By investigating the retrospective data, Robinson hopes to shed light on how long COVID manifests, what symptoms and disorders co-occur in patients, and how clinicians can best intervene on an individual level, based on data-informed predictions of how cases are most likely to progress.

“The N3C platform represents the largest ever collection of harmonized data from electronic health records,” says Robinson. “Our group is developing semantic standards, statistical analysis pipelines, and machine-learning algorithms to leverage this data to better understand the natural history of COVID-19 and long COVID.”

While the specifics around long COVID remain cloudy, one thing is crystal clear: it will be a significant public health issue for years to come. Many cases are relatively mild and resolve within weeks or months, but others can be highly debilitating, and some of the first long COVID patients have now been dealing with their symptoms for more than two years. It all adds up to a situation demanding better knowledge of the disease than we have now, and a better medical toolkit for clinicians to address it in their patients. Robinson’s efforts are vital for connecting the long COVID dots in patients and bringing structure and insight to what is, at the moment, an incomplete picture.