Computing the etiology of cancer
By Emily Powers
What if, much like screening through genetic testing, we could learn how to better treat individual patients through their unique epigenetic markers? This is the question that Assistant Professor Sheng Li, Ph.D., tackles in her lab every day.
Acute myeloid leukemia (AML) is a fast-moving cancer that develops in blood stem cells located in the bone marrow. AML is a notably hard-to-treat form of cancer, and studies have determined that only about one-quarter of patients survive five years.
The low survival rate for AML correlates with a high rate of relapse in patients who undergo chemotherapy. In a relapse, a small number of AML tumor cells evade chemotherapy and then multiply, reestablishing a population of chemotherapy-resistant tumor cells. As a result, treating relapsed patients is challenging.
Unlike many other forms of cancer, rates of DNA mutation are low in AML patients. By turning her attention to epigenetic abnormalities in tumor cells, Li can explore how AML cells beat chemotherapy and develop new ways to fight back.
What is epigenetics?
The study of genetics concerns itself with DNA, the genetic blueprint for life, which codes for genes and proteins through specific sequences of four nucleotide bases: A, T, C and G. In contrast, epigenetics is an emerging field of biology that focuses on how our cells read this genetic blueprint, and in particular on external factors that change gene expression. These factors can include environmental toxins, behaviors and aging, among others, which trigger biochemical changes in our bodies that affect the way a strand of DNA is interpreted. Epigenetic processes come in many shapes and sizes, but DNA methylation is the best studied. During DNA methylation, a methyl group (a small chemical group composed of one carbon and three hydrogen atoms) attaches to the DNA molecule. Abnormal DNA methylation, in which a high number of DNA molecules have methyl groups attached, is a hallmark of cancer and contributes to the initial formation and growth of tumors.
DNA methylation is easily quantifiable. Current sequencing technologies can determine DNA methylation levels for an entire genome, that is, all of the genetic material in a single human. "Instead of setting very limited [parameters], like genes or a biological event, you can scan the whole transcriptome," says Li. "That's the part I find the most fascinating—converting the big data to knowledge."
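To make "quantifiable" concrete: one common way to put a number on methylation from sequencing data is the fraction of reads at a given site that carry a methyl group. The short sketch below illustrates that idea only. The site names and read counts are invented, and the helper `methylation_level` is a hypothetical name, not part of Li's software.

```python
# Hypothetical read counts per site: (methylated reads, total reads).
# These numbers are made up for illustration only.
site_counts = {
    "chr1:10468": (18, 20),
    "chr1:10470": (3, 30),
    "chr1:10483": (0, 0),  # no sequencing coverage at this site
}


def methylation_level(methylated, total):
    """Return the fraction of reads that are methylated, or None if uncovered."""
    if total == 0:
        return None
    return methylated / total


# Map every site to its methylation level (a value between 0 and 1).
levels = {
    site: methylation_level(methylated, total)
    for site, (methylated, total) in site_counts.items()
}

for site, level in levels.items():
    print(site, "no coverage" if level is None else f"{level:.2f}")
```

A site with no reads is reported as missing rather than as zero, since an absence of data is not the same as an absence of methylation.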
A pioneer in the field
Li earned a bachelor's degree in biotechnology from Sun Yat-Sen University in Guangzhou, China, before pursuing a doctorate in computational biology at Weill Cornell Medicine in New York. Her Ph.D. research consisted of a three-pronged approach to better understanding leukemia and lymphoma epigenetics.
Initially, Li helped the FDA establish the clinical standard for RNA sequencing (RNA-seq) data analysis. RNA-seq gives researchers a complete picture of an individual's total RNA, thereby allowing them to analyze factors such as changes in gene expression. At the same time, Li developed open-source software for the analysis of DNA methylation sequencing data. Li then applied these two innovative tools to her research on cancer systems and clinical treatments.
"Li was an extraordinary graduate student, publishing 12 papers during her accelerated Ph.D. and pioneering entirely new methods in epigenetics, sequencing, and cancer biology," says Li's Ph.D. mentor, Christopher Mason. "Most importantly, she is now a trusted colleague for ongoing work between our laboratories, and a friend as well."
After completing her Ph.D. program, Li stayed on at Weill Cornell Medicine as an instructor of bioinformatics. During her time at Cornell, Li amassed publications in journals such as Nature Medicine, Nature Biotechnology and Genome Biology.
Currently, Li is focused on several machine learning projects that seek to clarify the physical and chemical interactions between epigenetic markers, such as DNA methylation, and cancerous cells. Li compares training a machine learning model to teaching a baby its first words.
"As the parent, you supply them the best training materials," Li says. "If you point them to an apple and tell them 'this is an apple,' then you give them the image and, at the same time, you give them the label."
In this way, Li and her team are working to build models (babies) that can predict the behaviors of cancer cells via their large multi-omics data sets (apples).
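Li's apple analogy describes what machine learning researchers call supervised learning: the model sees examples that come with labels, then guesses the label for something new. The toy sketch below shows the idea with a nearest-centroid classifier, one of the simplest such methods. It is an illustration only, not the model Li's team is building, and the feature values, labels and function names are all invented.

```python
from statistics import mean

# Hypothetical labeled training data: each example pairs a feature vector
# (the "image") with a label (the "name"). All values are invented.
training = [
    ([0.9, 0.8], "apple"),
    ([0.8, 0.9], "apple"),
    ([0.1, 0.2], "pear"),
    ([0.2, 0.1], "pear"),
]


def train(examples):
    """Average the feature vectors for each label to get one centroid per label."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {
        label: [mean(column) for column in zip(*rows)]
        for label, rows in by_label.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


centroids = train(training)
guess = predict(centroids, [0.85, 0.75])  # a new, unlabeled example
print(guess)
```

The better the labeled examples, the better the guesses, which is the point of Li's remark about supplying "the best training materials."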
The genome in three dimensions
Li and her team are developing 3D genome computational technology, which would allow researchers to explore genetic and epigenetic variation in terms of spatial orientation and multi-dimensional structure. The first step in the project has been to create a training data set, in this case utilizing chromatin interaction data.
Chromatin is key to our understanding of epigenetic processes and their relationship to malignant tumors. An entire set of DNA fits into the tight packaging of a nucleus because of the way long strands of DNA coil around proteins. This complex of DNA and protein is called chromatin. Li is currently using an algorithm of a kind frequently used on social network platforms to study all of the collective chromatin interactions within a genome.
"Think about this: the customers become the chromatin on the human genome," Li explains. "Different chromatin in the genome have the chance to communicate with others."
Just as companies like Facebook use algorithms to better understand the users on their social networks, Li uses a formula to determine gene regulation activities at specific chromatin sites.
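The article does not name the algorithm Li uses, so the sketch below substitutes a well-known network-ranking method, PageRank, purely to show how a social-network-style calculation could be applied to chromatin contacts. Each chromatin region is treated like a user and each observed interaction like a connection. The region names and contact list are placeholders, and the function names are hypothetical.

```python
# Hypothetical chromatin contacts: each pair is two genomic regions observed
# to interact. Region names are placeholders, not real loci.
contacts = [
    ("region_A", "region_B"),
    ("region_A", "region_C"),
    ("region_A", "region_D"),
    ("region_B", "region_C"),
]


def build_graph(pairs):
    """Build an undirected adjacency map from a list of interacting pairs."""
    graph = {}
    for a, b in pairs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Score nodes by power iteration; well-connected nodes score higher.

    Assumes every node has at least one neighbor, which holds for any
    graph produced by build_graph.
    """
    n = len(graph)
    rank = {node: 1.0 / n for node in graph}
    for _ in range(iterations):
        new_rank = {}
        for node in graph:
            # Each neighbor shares its current score equally among its contacts.
            incoming = sum(rank[nb] / len(graph[nb]) for nb in graph[node])
            new_rank[node] = (1 - damping) / n + damping * incoming
        rank = new_rank
    return rank


graph = build_graph(contacts)
scores = pagerank(graph)
hub = max(scores, key=scores.get)  # the most "connected" region
print(hub)
```

In this toy network the region with the most contacts comes out on top, much as a highly connected user would on a social platform.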
Over time, Li plans to build a machine learning model that "uses this 1D epigenome data [of interactions] to predict 3D genome structure."
Epigenetic changes in the chromatin landscape, such as DNA methylation, can inhibit regular communication between chromatin sites and subsequently alter gene expression and cell function. Through 3D genome mapping, it is possible to pinpoint each site of epigenetic change and determine how these changes affect the site-to-site interactions throughout an individual's entire DNA.
Computational tools like machine learning and 3D genome mapping allow Li to better address the epigenetic nuances of diseases like AML and, as a result, make advancements in tailoring medical treatment to individual patients.
Targeted epigenetic therapy
While much of Li's research revolves around the development of computational methods and tools, she pursues it with the end goal of "collaborating with oncologists, cell biologists, and physician scientists together to address the questions in the cancer system."
Just as genetic testing and genetic-specific treatment have proved useful for combating diseases like breast cancer, similar epigenetic approaches show promise for improving patient outcomes. In particular, epigenetic targeted therapy drugs may be a breakthrough treatment for diseases with high occurrences of epigenetic abnormalities, such as AML.
By spearheading advancements in computational research on epigenetics and disease treatment, Li has rewritten the adage "from bench to bedside."
"Here, it's more like from the computer to the bedside," Li says.