The spider web of your genome

Finding a suitable collaborator isn’t always easy, but researchers at The Jackson Laboratory and UConn Health in Farmington, Conn., are taking full advantage of their nascent partnership in the quest for better treatments and diagnostics in human disease. In this series, we delve more deeply into specific JAX-UConn Health collaborations to investigate what it takes to make collaborative research work and how it contributes to progress on the leading edge of science. 

This installment on innovative collaborations between JAX and UConn features a project spearheaded by Duygu Ucar, Ph.D., assistant professor at JAX. She is leading a team of researchers who are developing software to analyze interactions in the genome in 3-D.

What is 3-D genome structure and why is it important? 

The packing of a human genome is an incredible feat. Each nucleus is approximately 5 micrometers in diameter and contains about 2 meters of DNA. According to the ENCODE Project, this is equivalent to fitting a piece of string as long as the Empire State Building is tall under your fingernail! As you can imagine, the unpacking, reading and re-packing of the DNA in the genome must be very tightly controlled. New sequencing tools are uncovering how the unpacked sections of DNA can interact with each other in three-dimensional space, leading to new insights into the relationships between (linearly) distant sections of DNA.

One such tool is ChIA-PET (Chromatin Interaction Analysis by Paired-End Tag) sequencing, which uses protein (transcription factor) binding to DNA to understand the 3-D structure of the genome and how that structure affects gene expression.

Importantly, these data can be used to uncover novel genomic insights into human predisposition to disease. For instance, the laboratory of Yijun Ruan, Ph.D., at JAX determined that a non-coding single nucleotide polymorphism (SNP) associated with asthma and autoimmune disease disrupted the normal 3-D structure of the genome and protein expression, implicating genomic 3-D structure as a potential driving factor in human health (Cell, 2015). This is important because most disease-associated SNPs found via genome-wide association studies are in non-coding regions, and many are of unknown function in the genome.

Many biologists are interested in using ChIA-PET to ask questions about their system of interest, but the computational learning curve to parse these data can be daunting. Thus, there is a need for software tools that can help biologists learn from their ChIA-PET data.

The software that Ucar’s team is developing to meet this need is called QuIN, which is short for Querying and visualizing chromatin Interaction Networks. QuIN has had input from a cohort of faculty at both institutions, including Paola Vera-Licona, Ph.D., assistant professor at UConn Health Center, and Michael Stitzel, Ph.D., assistant professor at JAX. The software engineer on the project is UConn computer science graduate student Asa Thibodeau, who joined the Ucar lab as a co-op associate. Thibodeau’s Ph.D. work is mentored by Dong-Guk Shin, Ph.D., a computer science professor at UConn who recently finished a yearlong sabbatical at JAX.

The strength of QuIN, compared to genome browsers that represent data linearly, is that it converts the data to a network. The nodes in the network correspond to anchors in the dataset (where the proteins are bound to the genome) or user-defined sites, and nodes are connected based on the strength of the interaction and 3-D distance.
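To make the idea of turning interaction data into a network more concrete, here is a minimal sketch in Python. It is not QuIN's actual code: the toy interaction records, the function name build_network, and the min_pet_count cutoff (a stand-in for "strength of the interaction") are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy interactions: (anchor_a, anchor_b, pet_count).
# Each anchor is (chromosome, start, end); all values are invented for illustration.
interactions = [
    (("chr1", 1000, 2000), ("chr1", 50000, 51000), 8),
    (("chr1", 50000, 51000), ("chr1", 900000, 901000), 5),
    (("chr1", 1000, 2000), ("chr2", 3000, 4000), 1),
]


def build_network(interactions, min_pet_count=2):
    """Return an undirected, weighted adjacency map of anchors.

    Anchors become nodes. An edge is kept only when its supporting
    count reaches min_pet_count (an assumed, adjustable threshold).
    """
    network = defaultdict(dict)
    for anchor_a, anchor_b, pet_count in interactions:
        if pet_count < min_pet_count:
            continue  # too weakly supported to count as an edge
        network[anchor_a][anchor_b] = pet_count
        network[anchor_b][anchor_a] = pet_count
    return dict(network)


network = build_network(interactions)
```

In this sketch, raising or lowering min_pet_count changes which anchors end up connected, which is one simple example of a tunable network parameter.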

Importantly, QuIN allows for the visualization of both direct and indirect interactions, so you can get more information from your dataset. According to Ucar, “Networks give us a more systematic view of what is impacting what.” For instance, you cannot visualize interchromosomal interactions using linear methods because you can only look at one chromosome at a time. A network approach allows you to look at very distant interactions much more easily.
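The difference between direct and indirect interactions can be sketched with a breadth-first search over a small network. This is an illustrative example only, not how QuIN is implemented: the node labels, the toy network, and the function name interactions_by_hops are all hypothetical.

```python
from collections import deque

# Hypothetical toy network: node -> set of directly interacting nodes.
# Labels are invented; note that one edge links chr1 and chr5.
network = {
    "chr1:enhancerA": {"chr1:promoterB"},
    "chr1:promoterB": {"chr1:enhancerA", "chr5:enhancerC"},
    "chr5:enhancerC": {"chr1:promoterB"},
    "chr7:promoterD": set(),
}


def interactions_by_hops(network, source):
    """Breadth-first search: map each reachable node to its hop count from source."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in network.get(node, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    del hops[source]  # report only the other nodes
    return hops


hops = interactions_by_hops(network, "chr1:enhancerA")
direct = sorted(n for n, h in hops.items() if h == 1)    # one hop away
indirect = sorted(n for n, h in hops.items() if h > 1)   # reached through intermediates
```

Here the enhancer on chr1 reaches the enhancer on chr5 only through a shared promoter, the kind of indirect, cross-chromosome relationship that a single-chromosome linear view would not show.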

Many parameters of the network can be tweaked to address different questions of interest. At the same time, the researchers on the project wanted to make the software as user-friendly as possible.

“When I developed the software, the goal was to make it accessible for biologists with little to no computational background,” Thibodeau said. “But I didn’t want to make it a black box where you just upload the data, and the program spits out some output. I wanted to give the user flexibility to decide on what factors were important to them in their dataset.” 

The manuscript describing this tool is currently in revision. The software is already available on the web (https://quin.jax.org) to any scientist who wants to use it. It also links to publicly available ChIA-PET datasets, such as those available through the ENCODE Project. Hopefully, QuIN will enable scientists to learn more from newly generated and existing datasets and to uncover the structural genomic mechanisms that govern human health and disease.