Tools & Data

What tools and resources can I use to analyze and visualize my data?

JAX Web Applications

  • Diversity Outbred Database (DODB), citation: https://dodb.jax.org - DODB provides a place for investigators to submit their DO experiments, including associated genotype array data (MUGA platforms), clinical phenotype data, associated publications, and other supporting data such as RNA expression, eQTL, proteomics, and metabolomics. The download interface is the primary entry point for researchers who want to interrogate existing DO studies that have been submitted. An application programming interface (API) also lets data analysts pull study information from languages such as R or Python. DODB supports searching genotype and phenotype data for DO experiments by publication, investigator, project, and the data types submitted with a project, and it can download full studies or subsets of studies in zipped CSV or R/qtl2 formats. A user can download subsets of a study's samples, genotypes, or phenotypes; other supporting data, such as expression, proteomic, and metabolomic data, are provided as whole-project file downloads. As the DODB query interface evolves into the Diversity platform's entry point for interrogating data, we will begin to allow users to launch some analysis pipelines and to explore the analyzed results through visualization endpoints (QTL Viewer being one example).
    • Contacts: Gary Churchill, Dave Walton, Anna Lamoureux
  • Mouse Phenome Database (MPD) - http://phenome.jax.org/ - A highly curated phenotype data repository and analysis platform for experimental mouse data. This will be the primary repository for data from BXD, CC and DO phenotyping experiments. Part of our current initiative is to fully integrate MPD and DODB, so that data submissions for the two web applications are collected through one intake and curation platform. This ensures that data are stored once and are consistent and well curated across the two specialized applications. We are also planning integration between the two interfaces so that a user is taken to the tool that best meets their investigative needs. The data intake and curation platform, currently under development, will allow investigators to submit their data online and do some initial self-curation before the data go to our team of professional curators.
    • Contacts: Elissa Chesler, Molly Bogue, Dave Walton
  • QTL Viewer, source code: https://github.com/churchill-lab/qtlweb, example instance for DO Islet: https://churchilllab.jax.org/qtlviewer/attie/islets (associated with paper: Keller, et al. Genetic Drivers of Pancreatic Islet Function, PMID 29567659) - QTL Viewer is an interactive web-based analysis tool that lets users replicate the analyses reported for a study (in the example link, the aforementioned paper). Users can search subsets of a study's data, such as phenotypes or expression data, and then visualize them with profile, correlation, LOD, effect, mediation and SNP association plots.
    • Contacts: Gary Churchill, Matt Vincent
  • HaploQA - https://haploqa.jax.org/ - A web application for performing haplotype analysis of genotype calls from the “MUGA” platform genotyping arrays. The application was developed at the Jackson Laboratory to facilitate genetic quality assurance of mice using genotype data derived from these platforms. The community can examine publicly released data sets by viewing karyotype plots generated from haplotype reconstructions. Individuals can also contact the team to register for an account, which lets them upload their own data (MegaMUGA or GigaMUGA genotypes), have haplotype reconstructions run, examine their data privately, and, if they choose, share that data publicly. It is also possible to set up a private instance; the source code is available at https://github.com/TheJacksonLaboratory/haploqa
    • Contacts: Laura Reinholdt, Keith Sheppard, Anna Lamoureux, Dave Walton
  • GeneWeaver - http://www.geneweaver.org/ - A system for the integration and analysis of heterogeneous functional genomics data. A powerful tool for mapping mouse data to other species and for discovery of gene → trait interactions.
    • Contacts: Elissa Chesler, Erich Baker (Baylor University) 
  • Synteny Browser - http://syntenybrowser.jax.org/ - Conserved synteny describes a condition in which common ancestry is reflected as similar genome feature content and order along a chromosome in different species. The JAX Synteny Browser allows users to search for and selectively display genome features within syntenic blocks according to the biological and functional annotations associated with the features. The most common use cases for this tool involve uploading genomes of two or more species and exploring the relationships of conserved synteny between them at different levels of depth and detail. Based on a selected reference (aka source) species and comparison (aka target or destination) species, users can search for conserved features by name, function, or phenotype in the reference genome and investigate the corresponding matched features in the comparison genome.
    • Contacts: Carol Bult, Anna Lamoureux

JAX Command-Line Tools

  • g2gtools - http://churchill-lab.github.io/g2gtools/ - Genome editing tools. Creates custom genomes by incorporating (phased) SNPs and indels into a reference genome; extracts regions of interest, e.g., exons or transcripts, from custom genomes; and converts the coordinates of files (bam, gtf, bed) between two genomes.
    • Contacts: Gary Churchill, Kwangbom Choi
  • alntools - https://churchill-lab.github.io/alntools/ - Processes NGS alignments into a sparse compressed incidence matrix, stored in a pre-defined binary format for efficient storage and downstream analysis.
    • Contacts: Gary Churchill, Kwangbom Choi
  • EMASE - http://churchill-lab.github.io/emase-zero/ - An expectation maximization algorithm for allele-specific expression. Primary author: K. Choi of the Churchill Lab. Published in Bioinformatics: Raghupathy/Choi et al., Hierarchical analysis of RNA-seq reads improves the accuracy of allele-specific expression, 2018, PMID: 29444201.
    • Contacts: Gary Churchill, Kwangbom Choi
  • gbrs - http://churchill-lab.github.io/gbrs/ - Genotype-free genome reconstruction and ASE quantification. Example use case: detecting potential sample mix-ups by comparing GigaMUGA haplotype reconstructions with haplotypes deduced from an islet RNA-seq-based genotyping-by-sequencing method (mentioned in Chick/Munger et al., Defining the consequences of genetic variation on a proteome-wide scale, 2016, PMCID: PMC5292866).
    • Contacts: Gary Churchill, Kwangbom Choi 
  • Intermediate - https://github.com/churchill-lab/intermediate - An R package for eQTL/pQTL mediation analysis (mentioned in Chick/Munger et al., Defining the consequences of genetic variation on a proteome-wide scale, 2016, PMCID: PMC5292866).
    • Contacts: Gary Churchill, Petr Simecek
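
The alntools and EMASE entries above mention a sparse read-by-target incidence matrix and an expectation maximization step. The following is only a minimal sketch of that general idea, not the alntools binary format or the hierarchical EMASE model: the function names, the read and allele labels, and the flat (single-level) EM are all assumptions made for illustration.

```python
from collections import defaultdict


def build_incidence(alignments):
    """Store only the non-zero entries: read id -> set of targets it aligns to.

    Hypothetical helper; the real alntools format is a compressed binary file.
    """
    incidence = defaultdict(set)
    for read_id, target in alignments:
        incidence[read_id].add(target)
    return dict(incidence)


def em_abundance(incidence, n_iter=50):
    """Flat EM: split each ambiguous read across its targets in proportion
    to the current abundance estimates, then re-estimate the abundances."""
    targets = sorted({t for hits in incidence.values() for t in hits})
    theta = {t: 1.0 / len(targets) for t in targets}  # uniform start
    n_reads = len(incidence)
    for _ in range(n_iter):
        counts = dict.fromkeys(targets, 0.0)
        # E-step: fractional assignment of every read
        for hits in incidence.values():
            total = sum(theta[t] for t in hits)
            for t in hits:
                counts[t] += theta[t] / total
        # M-step: normalise expected counts into proportions
        theta = {t: counts[t] / n_reads for t in targets}
    return theta


# Toy data (made up): read3 aligns equally well to both alleles of one gene.
alignments = [
    ("read1", "Gene1_B6"),
    ("read2", "Gene1_B6"),
    ("read3", "Gene1_B6"),
    ("read3", "Gene1_CAST"),
    ("read4", "Gene1_CAST"),
]
incidence = build_incidence(alignments)
theta = em_abundance(incidence)
print(theta)
```

In this toy case the ambiguous read is shared between the two alleles according to the evidence from the uniquely aligned reads, so the B6 allele ends up with the larger estimated share.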

Other Resources

  • R/qtl2 - http://kbroman.org/qtl2/ - An R package for QTL analysis of high-dimensional data and complex crosses. It is a reimplementation of the QTL analysis software R/qtl, designed to handle these types of data better. We consider it the de facto package for command-line analysis of DO data.
    • Contact: Karl Broman - University of Wisconsin - Madison
  • GeneNetwork - http://www.genenetwork.org/webqtl/main.py - GeneNetwork is a web-based genetics platform, formerly known as WebQTL. It is considered the primary web platform for searching and interrogating data from BXD lines.
  • Systems Genetics tool - http://www.systems-genetics.org/ - This web tool makes use of the multilayered datasets from the BXD mouse population to expedite in silico gene function prediction through a series of integrative and complementary systems analytical approaches. The data are pulled from GeneNetwork, and the tool is a collaborative project involving Rob Williams. Associated paper: Li H et al., An Integrated Systems Genetics and Omics Toolkit to Probe Gene Function, 2017, PMID: 29199021.
    • Contacts: H Li, Laboratory for Integrative and Systems Physiology, Institute of Bioengineering, École Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland, admin.auwerx@epfl.ch
  • CC8Works/CC8Scheme/CCDB - https://sourceforge.net/p/cc8works/wiki/Home/ - CC8Works is a web-based software system for managing and documenting a cross of eight existing inbred mouse lines to yield hundreds to thousands of recombinant inbred lines that will become a permanent genetic resource for biomedical research. CC8Scheme systematically tests all available funnel pairs to optimize over the given parameters. It builds a design scheme by stepwise addition of the funnel pair that would most improve the balance of the existing set of funnel pairs. The resulting designs avoid strain combinations known to be infertile or unproductive while still achieving the best possible balance. The Collaborative Cross Database (CCDB) supports the task of maintaining a randomized mating scheme in the breeding colony based on a specified balanced mating design.
    • Contact: Ken Manly, Department of Biostatistics, University at Buffalo, manly@buffalo.edu
  • QTLRel, CRAN page https://cran.r-project.org/web/packages/QTLRel/index.html, paper https://www.ncbi.nlm.nih.gov/pubmed/21794153 - A tool for quantitative trait mapping in populations, such as advanced intercross lines, where relatedness among individuals should not be ignored. It has been adapted for DO mapping and includes functions to estimate additive and dominance components of kinship.
    • Contact: Riyan Cheng
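
The CC8Scheme entry above describes building a design by stepwise addition of whichever candidate most improves balance while skipping unproductive strain combinations. The sketch below illustrates that greedy idea on a toy problem only. It is not CC8Scheme's actual algorithm or scoring: the four-strain set, the single-funnel candidates (rather than funnel pairs), the `imbalance` score, and the forbidden-pair rule are all assumptions chosen to keep the example small.

```python
from itertools import permutations

STRAINS = ("A", "B", "C", "D")  # toy stand-in for the eight founder strains


def imbalance(design):
    """Sum of squared deviations of strain-by-position counts from uniform."""
    expected = len(design) / len(STRAINS)
    score = 0.0
    for pos in range(len(STRAINS)):
        for strain in STRAINS:
            observed = sum(1 for funnel in design if funnel[pos] == strain)
            score += (observed - expected) ** 2
    return score


def is_allowed(funnel, forbidden):
    """Reject funnels whose first-generation matings (positions 0-1 and 2-3)
    use a strain combination listed as unproductive."""
    matings = [frozenset(funnel[i:i + 2]) for i in range(0, len(funnel), 2)]
    return not any(m in forbidden for m in matings)


def greedy_design(n_funnels, forbidden=frozenset()):
    """Add, one at a time, the allowed funnel that gives the lowest imbalance."""
    candidates = [f for f in permutations(STRAINS) if is_allowed(f, forbidden)]
    design = []
    for _ in range(n_funnels):
        best = min(candidates, key=lambda f: imbalance(design + [f]))
        design.append(best)
    return design


# Assume (for illustration only) that A x B matings are unproductive.
forbidden = frozenset({frozenset({"A", "B"})})
design = greedy_design(4, forbidden)
for funnel in design:
    print("".join(funnel))
print("imbalance:", imbalance(design))
```

A greedy search like this does not guarantee the best possible design in general; it simply picks the locally best addition at each step, which is the behavior the description suggests.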

For general questions associated with these tools and data, send email to diversity-mice-support@jax.org.