My research interests focus on developing and implementing computational and statistical methods to identify molecular signatures associated with cancer initiation and progression based on high-throughput transcriptomics and genomics data.
I have broad interests in developing and implementing computational methodologies that can improve our understanding of the molecular mechanisms involved in cancer initiation and progression. During my postdoctoral research at Yale School of Medicine, USA, I studied how cancer-associated mutations in splicing factors result in aberrant RNA splicing and gene expression in hematopoietic disorders such as myelodysplastic syndromes (MDS) and acute myeloid leukemia (AML). I also studied the role of Musashi-2 and its RNA-binding targets in myeloid leukemia, employing integrated transcriptome-wide approaches to determine RNA-protein interactome and translatome profiles.
I received my graduate training (PhD) at the University of Tuebingen, Germany, where I was trained in both laboratory techniques and the informatics of next-generation sequencing data. My graduate work focused on investigating the prevalence of alternative splicing and nonsense-mediated decay, and the underlying regulatory mechanisms, in plants. During my early training in bioinformatics, I developed online tools based on the GeSTer (Genome Scanner for Terminators) algorithm, which generates a comprehensive landscape of intrinsic transcription terminators (RNA secondary structures) across whole prokaryotic genomes.
Currently, at the Jackson Laboratory for Genomic Medicine, my research focuses on identifying splicing signatures associated with the dysregulated expression of splicing factors in breast and ovarian cancer. My computational interests here are in algorithm and pipeline development for processing transcriptomics data derived from cancer models (e.g., cell lines and patient-derived xenograft mice) and from patients. Recently, I constructed a pipeline to determine three-dimensional chromatin structures from ChIA-PET (Chromatin Interaction Analysis with Paired-End Tag) sequencing data generated in the laboratory, and I strive to further develop scalable methods for local clusters and Google Cloud machines.
Recurrent mutations in core splicing factors have been reported in several clonal disorders, including cancers. Mutations in SF3B1, a component of the U2 splicing complex, are the most common. SF3B1 mutations are associated with aberrant pre-mRNA splicing using cryptic 3' splice sites (3'SSs), but the mechanism of their selection is not clear. To understand how cryptic 3'SSs are selected, we performed a comprehensive analysis of transcriptome-wide changes to splicing and gene expression associated with SF3B1 mutations in patient samples as well as in an experimental model of inducible expression. Hundreds of cryptic 3'SSs were detectable across the genome in cells expressing mutant SF3B1. These 3'SSs are typically sequestered within RNA secondary structures and are poorly accessible compared with their corresponding canonical 3'SSs. We hypothesized that these cryptic 3'SSs are inaccessible during normal splicing catalysis and that this constraint is overcome in spliceosomes containing mutant SF3B1. This model of secondary structure-dependent selection of cryptic 3'SSs was found across multiple clonal processes associated with SF3B1 mutations (myelodysplastic syndrome and chronic lymphocytic leukemia). We validated our model predictions in mini-gene splicing assays. Additionally, we found deregulated expression of proteins with relevant functions in splicing factor-related diseases, both in association with aberrant splicing and without corresponding splicing changes. Our results show that SF3B1 mutations are associated with a distinct splicing program shared across multiple clonal processes and define a biochemical mechanism for altered 3'SS choice.