Assembling the 3D genome with ChIA-PIPE


Scientists recently celebrated the 20th anniversary of the completion of the Human Genome Project, which produced the first full human genome sequence. It was a landmark achievement, but the two decades since have revealed genomic complexities that go far beyond the linear sequence. In fact, the three-dimensional (3D) configuration of DNA within the nucleus is now known to play an important role in genome function.

It makes sense: nucleotides that are far apart in the linear sequence may become physically adjacent when there’s a loop in the DNA. Researchers now know that particular regulatory regions that help control gene expression act in concert, even though they may be far apart in the genome sequence. It is therefore necessary to identify and study the 3D properties of chromatin (the complex of DNA and proteins in chromosomes) to understand both normal gene regulation and the dysfunction that may lead to disease.

To help investigate these interactions, researchers previously developed a technology known as chromatin interaction analysis with paired-end tags (ChIA-PET), which isolates and sequences DNA segments that interact with each other and with DNA binding proteins. Interpretation and analysis of ChIA-PET data have been understandably challenging, however, particularly with sequencing capabilities that generate hundreds of millions of reads for each ChIA-PET data set. To address this challenge, JAX Assistant Professor Sheng Li, Ph.D., Professor Chia-Lin Wei, Ph.D., and their colleagues have now developed ChIA-PIPE, a robust, fully automated pipeline that processes and analyzes the massive amounts of data that ChIA-PET generates.

Published in Science Advances, ChIA-PIPE expands the capabilities of previous analysis pipelines, which identified chromatin loops, the sequences of the linked DNA segments, and protein binding peaks within those sequences. ChIA-PIPE also provides data statistics, quality assessment metrics and expanded structural interpretation capabilities, including annotation of enhancer-promoter (E-P) loops, a key regulatory structure. In addition, ChIA-PIPE uses recently developed web-based visualization tools to provide high-resolution depictions of chromatin interactions, as well as information on genomic location and genomic distance. It is also adaptable enough to process data from related chromatin-mapping protocols, including HiChIP and PLAC-seq.

Because of its automation, robustness and broad capabilities, ChIA-PIPE is now the production pipeline for ChIA-PET data for two important research consortia: the Encyclopedia of DNA Elements (ENCODE) and 4D Nucleome (4DN). The authors anticipate that it will become a valuable resource for the broader research community as well.

ChIA-PIPE: A fully automated pipeline for comprehensive ChIA-PET data analysis and visualization. Science Advances. 10 Jul 2020: Vol. 6, no. 28, eaay2078. DOI: 10.1126/sciadv.aay2078