Assessing the performance of next-generation sequencing techniques for accurate genome, epigenome, and transcriptome profiling, and understanding the clinical and functional role of epigenome heterogeneity in cancer evolution.
Our research focus
The Li Lab has two primary aims: assessing how accurately next-generation sequencing techniques profile the genome, epigenome, and transcriptome, and understanding the clinical and functional role of epigenome heterogeneity in cancer evolution. The laboratory uses computational and sequencing methodologies to identify and characterize the essential epigenetic lesions that allow cancer cells to evolve and escape therapy. Our research interest is to understand the inner workings of cancer cells: the genetic and epigenetic heterogeneity that drives cancer initiation and progression. Specifically, the research involves (1) determining the drivers of epigenetic heterogeneity; (2) evaluating the functional impact of cross-talk among epigenetic modifications on the transcriptome; and (3) assessing the role of epigenetic heterogeneity and subpopulations in treatment resistance.
Dr. Li has developed a series of computational methods and software for epigenome sequencing data analysis that comprehensively detect significant DNA methylation aberrations and epigenetic heterogeneity during disease progression. Dr. Li's work has helped to establish the first principles and metrics for examining changes in RNA splicing and expression profiling, and to set standards at the FDA for clinical-grade RNA sequencing. Dr. Li further applied these approaches to study epigenetic heterogeneity and dynamics, using acute myeloid leukemia (AML) as a model. This work found that epigenetic allele burden was linked to inferior clinical outcome, and that epigenetic dynamics were related to hypervariable transcriptional regulation and diverged from the genetic burden.
We present primary results from the Sequencing Quality Control (SEQC) project, coordinated by the US Food and Drug Administration. Examining Illumina HiSeq, Life Technologies SOLiD and Roche 454 platforms at multiple laboratory sites using reference RNA samples with built-in controls, we assess RNA sequencing (RNA-seq) performance for junction discovery and differential expression profiling and compare it to microarray and quantitative PCR (qPCR) data using complementary metrics. At all sequencing depths, we discover unannotated exon-exon junctions, with >80% validated by qPCR. We find that measurements of relative expression are accurate and reproducible across sites and platforms if specific filters are used. In contrast, RNA-seq and microarrays do not provide accurate absolute measurements, and gene-specific biases are observed for all examined platforms, including qPCR. Measurement performance depends on the platform and data analysis pipeline, and variation is large for transcript-level profiling. The complete SEQC data sets, comprising >100 billion reads (10Tb), provide unique resources for evaluating RNA-seq analyses for clinical and regulatory settings.