Unlocking cancer image data

Computational image technologies based on machine learning have the potential to greatly improve and accelerate tumor whole slide image analysis. A team led by JAX Associate Professor Jeffrey Chuang, Ph.D., has developed image processing and convolutional neural network software that can combine images of cancers from different organs to reveal how they are related to one another. This is a powerful approach for discovering the shared processes that underlie diverse cancers.

The standard method for characterizing cancer tumors has been for a pathologist to examine a thin section of the tumor on a slide under a microscope. These results have contributed to tumor classification (e.g., malignant or benign), informed diagnostic and treatment decisions, and provided research data. Molecular assays, such as tumor sequencing, now also play a large role, but viewing the actual tumor tissue on slides is still an important source of information.

The traditional process has significant limitations, however. It’s labor intensive, so identifying cancer patterns across thousands of slides or more for research is not feasible. Also, in any group of expert pathologists, some will classify the same tumor tissue section differently than others. Meanwhile, cancer data in the form of digital images is growing quickly, but its value is limited by this analysis bottleneck. An accurate, automated system offers advantages not only in speed but also in scope, and scientists have worked to develop and employ computational image analysis systems for these purposes.

Computational science based on biology

Now, advances in computational image analysis and classification have opened the potential for using machine learning-based neural networks for tumor analysis. In “Deep learning-based cross-classifications reveal conserved spatial behaviors within tumor histological images,” a multi-institutional team led by Jackson Laboratory (JAX) Associate Professor Jeffrey Chuang, Ph.D., demonstrates that, in addition to accurately distinguishing tumor from normal tissue, a particular approach for analyzing images known as convolutional neural networks (CNNs) can indeed be applied across cancer types to reveal shared traits. The team was also able to apply its image processing and CNN methods to detect patterns in cancers with common molecular drivers, such as TP53 mutations.

The use of CNNs has seen continual progress since the approach was first implemented in the 1980s. CNNs are modeled on our own biology: in the visual cortex, each neuron responds only to stimuli in a specific region of the visual field. Inputs from different neurons are assembled in the neural networks of the brain, yielding our perception of the entire visual field. Thus, when “trained” on reference images, computational CNNs are able to recognize and assemble complex patterns from smaller, simpler input data. Employing CNNs offers a possible leap forward for cancer image analysis, but applying them across cancer types and tissues has remained a formidable challenge.
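
To make this concrete, below is a minimal sketch, in PyTorch, of a toy convolutional network; the architecture, layer sizes and input shape are illustrative assumptions, not the network used in the study. Each small convolutional filter responds only to a local patch of the image, much like a cortical neuron with a limited receptive field, and later layers pool those local responses into a single whole-image prediction such as tumor versus normal.

```python
# A minimal, illustrative CNN (not the study's architecture): local filters
# respond to small patches, and pooling assembles them into one prediction.
import torch
import torch.nn as nn

class TinyHistologyCNN(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            # Each 3x3 convolution "sees" only a small neighborhood of pixels,
            # analogous to a neuron's limited receptive field.
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            # Global pooling combines the local responses across the image,
            # roughly analogous to assembling inputs from many neurons.
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x).flatten(1))

# Example: a batch of four 224x224 RGB tiles -> tumor/normal logits.
logits = TinyHistologyCNN()(torch.randn(4, 3, 224, 224))
print(logits.shape)  # torch.Size([4, 2])
```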

A boon to cancer research?

In this study, the research team used 27,815 scanned slide images spanning 19 cancer types from The Cancer Genome Atlas (TCGA), a large repository of annotated cancer data, for training and analysis. As a first step, they developed classifiers for each cancer type, and these were able to distinguish tumor from normal images for each of the cancer types with high accuracy.
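
The study's exact pipeline is not reproduced here, but the general shape of this first step can be sketched as follows, assuming slides are cut into small image tiles: one binary tumor-versus-normal classifier is trained per cancer type. The ResNet-18 backbone, hyperparameters and toy tensors below are stand-ins for illustration only.

```python
# Hypothetical sketch: train one tumor-vs-normal tile classifier per cancer type.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from torchvision import models

def train_tumor_normal_classifier(tiles: torch.Tensor,
                                  labels: torch.Tensor,
                                  epochs: int = 3) -> nn.Module:
    """Train a CNN to label tiles from one cancer type as tumor (1) or normal (0)."""
    model = models.resnet18(weights=None)          # pretrained weights could be used instead
    model.fc = nn.Linear(model.fc.in_features, 2)  # two classes: normal, tumor
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.CrossEntropyLoss()
    loader = DataLoader(TensorDataset(tiles, labels), batch_size=8, shuffle=True)
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()
    return model

# Toy stand-in data: 16 random 224x224 RGB "tiles" with tumor/normal labels.
toy_tiles = torch.randn(16, 3, 224, 224)
toy_labels = torch.randint(0, 2, (16,))
classifier = train_tumor_normal_classifier(toy_tiles, toy_labels, epochs=1)
```

In a real pipeline, one such classifier would presumably be trained for each of the 19 cancer types and scored on images held out from training.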

CNNs can be particularly valuable for comparing image sets to detect previously unrecognized patterns. Therefore, as a second step, Chuang tested the notion that different tumor types share detectable features distinct from normal tissues. Remarkably, a CNN trained on any single cancer type also successfully classified tumor versus normal tissue in other cancer types, indicating that there are indeed shared features across cancer types that are not present in normal tissue. Bladder, uterine and breast cancers, in particular, were found to display features universal across cancer types.
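
This cross-classification test can be pictured, under the assumption of a held-out tile set for each cancer type, as a matrix of ROC AUC scores: each row is a classifier trained on one cancer type, each column a test set from another. The helper functions, TCGA abbreviations and random placeholder data below are illustrative, not the study's code or results.

```python
# Hypothetical sketch of cross-classification: score each per-type classifier
# on held-out tiles from every cancer type, producing a train-by-test AUC matrix.
import torch
import torch.nn as nn
from torchvision import models
from sklearn.metrics import roc_auc_score

def make_classifier() -> nn.Module:
    """Stand-in for a classifier trained on a single cancer type."""
    m = models.resnet18(weights=None)
    m.fc = nn.Linear(m.fc.in_features, 2)
    return m

def auc(model: nn.Module, tiles: torch.Tensor, labels: torch.Tensor) -> float:
    """ROC AUC of tumor-vs-normal predictions on a held-out tile set."""
    model.eval()
    with torch.no_grad():
        probs = torch.softmax(model(tiles), dim=1)[:, 1]
    return roc_auc_score(labels.numpy(), probs.numpy())

# Toy held-out tiles and labels for three example types (the study spans 19).
held_out = {name: (torch.randn(8, 3, 224, 224), torch.tensor([0, 1] * 4))
            for name in ["BRCA", "LUAD", "BLCA"]}
per_type_models = {name: make_classifier() for name in held_out}  # untrained stand-ins

# Rows = type the model was trained on; columns = type it is tested on.
for train_type, model in per_type_models.items():
    row = {test_type: round(auc(model, x, y), 3)
           for test_type, (x, y) in held_out.items()}
    print(train_type, row)
```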

The team also investigated whether CNNs could classify driver mutation status across cancer types. Choosing five cancers with high TP53 mutation frequency (breast, lung, stomach, colon and bladder), the researchers found that images from one cancer type could be used to train classifiers for other types, though a more computationally intensive training approach was needed to reach higher predictive performance. Even then, mutation status was easier to predict in some cancers, such as lung adenocarcinoma, than in others, such as stomach adenocarcinoma.
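
As a rough illustration of what "more computationally intensive" can mean in practice, the sketch below contrasts training only a new output layer on a frozen backbone with unfreezing and fine-tuning the entire network for a mutation-status task; the backbone choice, layer names and two-class TP53 setup are assumptions for illustration, not the study's configuration.

```python
# Hypothetical sketch: cheap head-only training vs. costly full fine-tuning
# for predicting mutation status (e.g., TP53 mutant vs. wild type) from tiles.
import torch.nn as nn
from torchvision import models

def build_mutation_classifier(full_finetune: bool = True) -> nn.Module:
    model = models.resnet18(weights=None)          # a pretrained backbone could be used
    model.fc = nn.Linear(model.fc.in_features, 2)  # TP53 mutant vs. wild type
    if not full_finetune:
        # Cheaper option: freeze every layer except the new classification head.
        for name, param in model.named_parameters():
            if not name.startswith("fc"):
                param.requires_grad = False
    return model

def trainable_params(model: nn.Module) -> int:
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

cheap = build_mutation_classifier(full_finetune=False)
costly = build_mutation_classifier(full_finetune=True)
print(trainable_params(cheap), "vs.", trainable_params(costly), "trainable parameters")
```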

“Our work shows that images of cancer, even from different parts of the body, can be rapidly computationally compared to reveal when they are similar to one another,” says Chuang. “This will be an important way to select treatments for patients in the future.”

The results underscore the potential that CNNs have for analyses across cancer types, and their importance for identifying previously unrecognized patterns in the image data. Moving forward, it will be important to mechanistically examine the patterns within images that determine their similarity, and then develop these approaches to better predict how patients will respond to treatment.