CKB: building a leading digital cancer resource for the world

Dan Durkin, JAX lead programmer at the Clinical Knowledgebase, sitting at a table with his laptop.

Dan Durkin serves as a self-professed man in the middle. Durkin is the technical team lead on the Clinical Knowledgebase (CKB), a dynamic digital resource for cancer clinicians and researchers that helps them interpret complex cancer genomic profiles.

It's his daunting task to meet with curators (scientists who go through mountains of data about new cancer treatments and targets) and use their input to develop features and tools that allow the newest cancer data (on genes, tumor types, treatments, and more) to be seamlessly searchable by the greater medical community.

Despite (or, more accurately, because of) these challenges, though, Durkin loves his position on CKB. "This is a big draw for me," Durkin says. "Having both a strong interest in science and in software, it's a great place to be. I get to answer, 'How do we build out the software that can enable us to accomplish what we need to do? What kind of tools do we need? How do we gradually scale up over time?' These questions and challenges make the work rewarding every day."

Daniel Durkin, B.A.: Strong interests in automation, testing, and rapid application development and evolution. Dedicated to building systems to meet current and future needs.

The roundabout journey to CKB

Durkin began his journey to CKB in Connecticut, where he was born and raised. Always interested in science, Durkin studied chemistry as an undergrad, then worked professionally in medicinal chemistry. During this time, Durkin's fascination with software grew. "Working in the biomedical field combined everything I liked – I was building out applications to solve biochemical problems, it was a dream job for me!"

Durkin went on to work at the Broad Institute, where he was proud to be part of a team that worked to deliver the BioAssay Research Database (BARD), a chemical biology resource. Eventually, Durkin wanted to move back to Connecticut to be closer to family. As luck would have it, this was also when JAX began building out the JAX-Genomic Medicine facility in Farmington. Blown away by the beautiful campus and the powerful work going on within its walls, Durkin officially joined the lab in 2014 and immediately began working on the fledgling CKB.

From internal tool to widely used cancer research asset

According to Durkin, CKB started as an internal tool to help with reporting from the clinical lab. Over time it grew in terms of content and complexity. 2016 saw the release of the free version of CKB, CKB CORE, to an excellent reception. "Over 160 countries have accessed CKB CORE since 2016," says Durkin, "and over 150,000 users since that time. The resource was put out there, and we just hoped it would be useful to people."

CKB CORE proved to be extremely helpful for its target audience. It provides a structured, searchable framework and clear answers drawn from very complex data. As Durkin explains, you can ask about specific genes or variants and review therapies for specific tumor types as well as related clinical trials. 
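To make the idea of a structured, searchable framework concrete, here is a minimal Python sketch of the kind of question Durkin describes: which therapies are on record for a given gene and tumor type? It is not CKB's actual data model or API. The `EvidenceRecord` type, the `search` and `therapies_for` helpers, and every value in `RECORDS` are hypothetical placeholders invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceRecord:
    """One structured entry linking a variant to a therapy (hypothetical schema)."""
    gene: str
    variant: str
    tumor_type: str
    therapy: str


# Placeholder records, purely illustrative; not real CKB content.
RECORDS = [
    EvidenceRecord("GENE_A", "variant_1", "tumor_type_x", "therapy_1"),
    EvidenceRecord("GENE_A", "variant_2", "tumor_type_y", "therapy_2"),
    EvidenceRecord("GENE_B", "variant_3", "tumor_type_x", "therapy_3"),
]


def search(records, **criteria):
    """Return the records whose fields equal every given criterion."""
    return [
        record
        for record in records
        if all(getattr(record, field) == value for field, value in criteria.items())
    ]


def therapies_for(records, gene, tumor_type):
    """Return the sorted, de-duplicated therapies for a gene in a tumor type."""
    matches = search(records, gene=gene, tumor_type=tumor_type)
    return sorted({record.therapy for record in matches})


if __name__ == "__main__":
    print(therapies_for(RECORDS, "GENE_A", "tumor_type_x"))
```

Because every record shares the same fields, one question can narrow into the next (by gene, then by variant, then by tumor type) without any free-text matching.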

It wasn't long before clinicians and researchers were requesting access to the full datasets within CKB. CKB CORE features data for a limited set of genes that is rotated on a regular basis. In 2018, CKB BOOST and CKB FLEX were introduced as subscription-based models that offer users full access to data on 1800+ genes and over 40,000 variants. With CKB BOOST, expanded search options and additional visualizations allow people to continue "asking" CKB questions and embark on a voyage of discovery: answering questions that lead to more questions.

These subscription models have been well received, with a healthy number of users accessing CKB daily. "We delivered CKB CORE as a sample, and people wanted more!" says Durkin. "With CKB BOOST and CKB FLEX our user base has continued to expand with people coming in from all over the world to use CKB every day." 

Producing consistent, quality data

Though there are several similar databases out there, Durkin explains that the difference with CKB is consistent, quality data. A big part of that quality comes from the curators, the scientists who pore over the newest data every day and report on genes, mutations, tumor types and more. It's then up to Durkin and the CKB development team to "translate" this mountain of data into actionable applications.

According to him, the challenge isn't obtaining vast amounts of data. "The big challenge surrounds the complexity of cancer data," says Durkin. "We've tried to use controlled vocabularies as much as we can to limit the amount of free text. This keeps things consistent. It's a delicate balancing act as the data is extremely complex; papers are coming out all the time, and new data is constantly coming in."
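As a rough illustration of how a controlled vocabulary limits free text, the Python sketch below maps whatever a person types onto a single canonical term and rejects anything the vocabulary does not recognize. This is an assumption about how such a check could work, not a description of CKB's implementation. The `VOCABULARY` contents and the `build_lookup` and `normalize` helpers are hypothetical.

```python
# Hypothetical controlled vocabulary: canonical term -> accepted synonyms.
# The terms are placeholders, not real CKB vocabulary entries.
VOCABULARY = {
    "tumor_type_x": {"tumour type x", "type x tumor"},
    "tumor_type_y": {"tumour type y"},
}


def build_lookup(vocabulary):
    """Map every canonical term and synonym (lowercased) to its canonical term."""
    lookup = {}
    for canonical, synonyms in vocabulary.items():
        for term in {canonical, *synonyms}:
            lookup[term.strip().lower()] = canonical
    return lookup


def normalize(term, lookup):
    """Return the canonical form of a term, or raise if it is not in the vocabulary."""
    key = term.strip().lower()
    if key not in lookup:
        raise ValueError(f"{term!r} is not in the controlled vocabulary")
    return lookup[key]


LOOKUP = build_lookup(VOCABULARY)

if __name__ == "__main__":
    # Different spellings collapse to one canonical term.
    print(normalize("  Tumour Type X ", LOOKUP))
```

Rejecting unknown terms outright, rather than storing them as typed, is one way to keep entries consistent. The trade-off is that the vocabulary itself has to be updated as new findings arrive.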

The curators are invaluable in this. They are all very involved with molecular biology, and they work to make sure that the data making its way into CKB is immediately accessible and valuable to customers. This editing process provokes (sometimes strong) customer feedback, but Durkin says that this dialogue is one of the biggest strengths of CKB. "Conflicting and changed data emerges very often," he says, "but we establish a policy of how we're representing the data. This is the quality aspect of CKB. Customers may agree or disagree with our curators, but our policies are very clear, as well as the identity of the app. This transparency sets us apart."

The sometimes-conflicting data means that CKB will never be "complete," but Durkin embraces this state of flux. "We are never going to fully future-proof a tool," he says. "CKB embraces the complexity and has many tools to approach that complexity. Much of our concepts and data have changed in recent years, and that's all in response to the newest data."

The future of cancer treatment today

Ultimately, Durkin and the CKB team want to increase CKB's reach and its impact on patient care. They are constantly asking questions like: Should we expand and go beyond human data? Do we look more at inherited disease and other gene types? While the future of CKB seems wide open, almost limitless, Durkin approaches this with the philosophy of "smart and sustainable."

Though CKB can be many things, Durkin doesn't want to lose sight of the main purpose: informing cancer treatment and patient care. "It's not the best solution to just throw more people and computational power behind the project," he says. "CKB represents lots of pieces coming together. We want people to connect the dots to the right primary references and papers. This supports assertions being made – we're trying to be as transparent as possible and give people room to explore."

Letting people "play" in CKB is a major touchstone for the team. To get at that, they do a lot of software testing, both automated and unit testing. They want to see how far CKB will stretch without breaking because they want the platform to be manipulated, used, and searched by customers. This consistent application of technology and clarity of data is all purpose-driven. "It's not easy to be consistent in tech – things move and change," Durkin says. "Whether it's the application, things the app is talking to, or deployment, things don't happen in a vacuum."

In the meantime, Durkin and the CKB team will continue to provide customers with high-quality, consistent data to be used in treating a wide variety of cancers. "The team behind CKB," says Durkin, "is great with fantastic teamwork. We want to improve, and we all have a great innovative spirit. This is indicative of JAX: a research institute that knows how to spin up resources for the community at large."