
Developing a Deep Learning Model for Assisted Pathology Diagnosis

By Ong Kok Haur, Laurent Gole, Huo Xinmi, Li Longjie, Lu Haoda, Yuen Cheng Xiang, Aisha Peng, and Yu Weimiao, Bioinformatics Institute


As the second most common cancer among men, prostate cancer is typically diagnosed via the inspection of tissue samples. This process, traditionally conducted by expert pathologists using a microscope, is labor-intensive and time-consuming. Furthermore, the number of medical professionals, such as pathologists, capable of performing such inspections is limited in many countries, especially when clinical workloads are high. This can lead to backlogs of samples awaiting analysis and delays in starting treatment.

Emerging partly due to the limitations of manually analyzing samples, research into the use of AI and deep learning to assist with pathology diagnosis for prostate and other forms of cancer has expanded rapidly. Several technical hurdles need to be cleared, however, before deep learning models can be developed, optimized, validated, and deployed for clinical applications. For instance, it is estimated that about 15% of digital pathology images have quality issues related to focus, saturation, and artifacts, among other problems. Moreover, image quality cannot be quantitatively assessed with the naked eye, and the whole-slide image (WSI) scanners used today produce extremely large files: a single high-resolution image can measure 85,000 × 40,000 pixels or more, which complicates processing. Additionally, as with manual pathology diagnosis, annotating images requires a significant amount of time from experienced pathologists, making it difficult to assemble the high-quality database of labeled images needed to train an accurate diagnostic model.

The Computational Digital Pathology Lab (CDPL) at A*STAR’s Bioinformatics Institute (BII) has developed a cloud-based automation platform that addresses many of the challenges associated with deep learning–assisted pathology diagnosis while also reducing the burden on pathologists for image labeling and clinical diagnoses (Figure 1). This platform includes A!MagQC, a fully automated image quality assessment tool developed in MATLAB® with Deep Learning Toolbox™ and Image Processing Toolbox™, as well as a deep learning classification model trained to identify Gleason patterns. In experiments with pathologists locally and overseas, the platform reduced image labeling time by 60% compared with manual annotation and helped pathologists analyze images 43% faster than traditional microscopic examination while maintaining the same accuracy.

A workflow diagram of the digital pathology image platform, including A!MagQC, a fully automated image quality assessment tool.

Figure 1. The digital pathology image analysis platform, including A!MagQC and A!HistoClouds. A) shows the existing digital pathology assessment pipeline. B) illustrates the pipeline proposed in this study, which integrates A!MagQC, A!HistoClouds, and an AI model that detects and grades prostate cancer in images from multiple scanners into the existing pipeline.

Image Quality Assessment

In digital pathology, image quality problems can be broadly divided into two categories: tissue sample preparation problems and scanning problems (Figure 2). Tissue tears, folds, air bubbles, overstaining, and understaining fall into the first category; when these issues are detected and impact the diagnosis, a new sample will need to be prepared. On the other hand, when image contrast, saturation, and focus problems are detected, the existing sample can simply be rescanned, and recutting is not necessary.

A screenshot of A!MagQC showing several types of image quality problems in a sample as seen through a microscope.

Figure 2. Texture uniformity, contrast, artifact, saturation, and focus problems detected with A!MagQC. A) shows the simple, user-friendly interface of A!MagQC. B) shows examples of low-quality patches from whole-slide images. C) shows the A!MagQC output, which depicts the low-quality areas of a whole-slide image as heatmaps.

Whether the analysis is conducted by pathologists or via deep learning models, any of these common problems can have an adverse effect. As such, A*STAR’s BII CDPL team developed image processing algorithms in A!MagQC to automatically detect the principal factors affecting image quality. The team chose MATLAB because of the specialized toolboxes it offered. For example, when images were too large to load into memory, the blockproc function from Image Processing Toolbox could divide each image into blocks of a specified size, process them one block at a time, and then assemble the results into an output image.
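As a minimal sketch of this block-based approach (the file name, block size, and sharpness metric below are illustrative assumptions, not the actual A!MagQC metrics), a per-block focus score can be computed and assembled into a quality map:

% Process a large WSI tile by tile without loading it fully into memory.
wsiFile = "sample_wsi.tif";      % hypothetical whole-slide image (TIFF)
blockSize = [2048 2048];         % process the image in 2048-by-2048 blocks

% One possible focus indicator: variance of the Laplacian response per block.
sharpnessFcn = @(bs) var(imfilter(im2double(rgb2gray(bs.data)), ...
    fspecial("laplacian")), 0, "all");

% blockproc reads, processes, and reassembles the image one block at a time,
% producing one score per block.
focusMap = blockproc(wsiFile, blockSize, sharpnessFcn);

% Visualize the per-block scores to flag out-of-focus regions.
imagesc(focusMap); colorbar; title("Per-block focus score (higher = sharper)");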

The team also used MATLAB tools to build the A!MagQC user interface and to compile the MATLAB code into a standalone A!MagQC executable for distribution.
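As one possible packaging step (the entry-point file name here is an assumption), MATLAB Compiler can produce the standalone executable from the command line:

% Hypothetical MATLAB Compiler command; AMagQCApp.m stands in for the actual app entry point.
mcc -m AMagQCApp.m -o AMagQC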

Using the developed QC solution, the team quantified image quality to identify variances in color, brightness, and contrast for WSIs. This exercise ensured that the deep learning model subsequently trained would produce accurate diagnostic results for the broad range of scanners in use today.
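A minimal sketch of this kind of quantification, assuming downsampled thumbnails of each WSI are available on disk (the folder name and the specific statistics are illustrative, not the published A!MagQC measures):

% Summarize color, brightness, and contrast for each WSI thumbnail.
files = dir(fullfile("wsi_thumbnails", "*.png"));
stats = zeros(numel(files), 3);                 % [mean hue, mean brightness, RMS contrast]
for k = 1:numel(files)
    rgb = im2double(imread(fullfile(files(k).folder, files(k).name)));
    hsv = rgb2hsv(rgb);
    gray = rgb2gray(rgb);
    stats(k, :) = [mean(hsv(:,:,1), "all"), mean(gray, "all"), std(gray, 0, "all")];
end

% Large spreads across rows indicate scanner-to-scanner variation that the
% downstream model must be robust to.
disp(array2table(stats, "VariableNames", ["MeanHue" "MeanBrightness" "RMSContrast"]))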

Model Training and Testing

When analyzing a sample, pathologists apply the Gleason grading system, which is specific to prostate cancer, to assign a score based on the tissue's appearance. Aside from normal or benign tissue, areas of the sample may include stroma (connective tissue) or tissue assigned a Gleason pattern from 1 to 5, with 5 being the most malignant (Figure 3). Before the team could begin training an AI diagnostic model to classify tissue samples, they needed to assemble a data set of image patches labeled with these categories. This task was completed with the help of pathologists using A!HistoClouds, working on images that had already been checked for quality with A!MagQC. Once the team had a base set of labeled image patches, they performed data augmentation to expand the training set by reflecting individual patches vertically or horizontally and rotating them by a random or targeted number of degrees.
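A minimal sketch of this augmentation step, assuming the labeled patches are organized in class subfolders (the folder name and patch size are assumptions):

% Labeled patches, one subfolder per category.
imds = imageDatastore("gleason_patches", ...
    "IncludeSubfolders", true, "LabelSource", "foldernames");

% Reflect patches horizontally or vertically and rotate them by a random angle.
augmenter = imageDataAugmenter( ...
    "RandXReflection", true, ...
    "RandYReflection", true, ...
    "RandRotation", [0 360]);        % rotation range in degrees

% Apply the transforms on the fly during training; 224-by-224 matches the
% input size of the pretrained networks used later.
augimds = augmentedImageDatastore([224 224 3], imds, ...
    "DataAugmentation", augmenter);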

Slides of various types of tissue samples scored on the Gleason scale.

Figure 3. Tissue samples showing stroma, benign tissue, and tissue scored as Gleason 3, Gleason 4, and Gleason 5. The regions annotated by pathologists in A!HistoClouds, each labeled with its category, are extracted as patches that are then used for model training.

Working in MATLAB with Deep Learning Toolbox, the team created deep learning model structures from the ResNet-50, VGG-16, and NASNet-Mobile pretrained networks, replacing their regular classification layers with a weighted classification layer (Figure 4). The team also used the multi-gpu execution environment option to scale training from a single GPU to multiple GPUs.
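The following sketch shows one way to set up this kind of transfer learning, assuming ResNet-50 as the backbone and the five tissue categories from Figure 3; the layer names are those of MATLAB's resnet50, and the class weights are placeholders until they are computed (see the sketch after Figure 4):

% Tissue categories from Figure 3 (the exact label names are an assumption).
classes = ["Stroma" "Benign" "Gleason3" "Gleason4" "Gleason5"];
classWeights = ones(1, numel(classes));   % placeholder; replaced by inverse-frequency weights

net = resnet50;                           % pretrained network from Deep Learning Toolbox
lgraph = layerGraph(net);

% Replace the final layers so the network predicts the five tissue categories,
% swapping the regular classification layer for a weighted one.
lgraph = replaceLayer(lgraph, "fc1000", ...
    fullyConnectedLayer(numel(classes), "Name", "fc_tissue"));
lgraph = replaceLayer(lgraph, "ClassificationLayer_fc1000", ...
    classificationLayer("Name", "weighted_output", ...
        "Classes", classes, "ClassWeights", classWeights));

% Scale training across all available GPUs with the multi-gpu execution environment.
options = trainingOptions("sgdm", ...
    "ExecutionEnvironment", "multi-gpu", ...
    "MiniBatchSize", 128, ...
    "MaxEpochs", 10, ...
    "InitialLearnRate", 1e-3, ...
    "Shuffle", "every-epoch");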

A diagram of the deep learning model training structure, where the regular classification layers of the ResNet-50, VGG-16, and NASNet-Mobile pretrained networks are replaced with a weighted classification layer.

Figure 4. Training structure using a weighted classification layer as a class rebalancing strategy. The weights are inversely proportional to the number of image patches to mitigate imbalance in the data set.
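One common way to compute such inverse-frequency weights, continuing from the labeled patch datastore in the augmentation sketch above:

% Classes with fewer patches receive proportionally larger weights.
counts = countEachLabel(imds);           % table with Label and Count columns
classWeights = sum(counts.Count) ./ (numel(counts.Count) .* counts.Count);
classWeights = classWeights / mean(classWeights);   % optional: normalize around 1
% Note: the order of counts.Label must match the Classes argument of the
% weighted classification layer in the transfer-learning sketch.

The resulting vector replaces the placeholder passed to the ClassWeights argument in the earlier sketch.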

The model is trained and applied via an iterative process. The first stage is initial training on manually labeled images; in the second, semiautomated stage, pathologists review and modify predictions generated by the trained model (Figure 5). This second stage is repeated until the model is ready to be deployed by medical professionals to assist with clinical diagnoses. In step (a), junior and senior pathologists create the initial manual annotations in A!HistoClouds; these annotations are extracted as patches and used to train the deep learning model, which then outputs predicted regions of interest (ROIs) to assist the pathologists, hence the term semiautomated annotation. In step (b), the model undergoes incremental learning: the AI-predicted ROIs are reviewed and corrected by pathologists, the corrected ROIs are extracted as patches, and the model learns from this new data. Step (b) is repeated until the model's performance converges. In step (c), the model is deployed for fully automated annotation and diagnosis to support pathologists' decision-making.
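A conceptual sketch of this loop, continuing from the earlier sketches (the folder names and the convergence check are assumptions, not the team's published implementation):

% Step (a): initial training on the manually annotated patches.
trainedNet = trainNetwork(augimds, lgraph, options);

roundIdx = 1;
converged = false;
while ~converged                                            % step (b): incremental learning
    % Pathologist-corrected ROIs exported as labeled patches after each review round.
    newImds = imageDatastore(sprintf("corrected_round_%d", roundIdx), ...
        "IncludeSubfolders", true, "LabelSource", "foldernames");

    % Combine original and corrected patches; warm-start from the previous weights.
    imdsAll = imageDatastore([imds.Files; newImds.Files], "LabelSource", "foldernames");
    augAll = augmentedImageDatastore([224 224 3], imdsAll, "DataAugmentation", augmenter);
    trainedNet = trainNetwork(augAll, layerGraph(trainedNet), options);

    converged = checkConvergence(trainedNet);               % hypothetical validation check
    roundIdx = roundIdx + 1;
end
% Step (c): deploy trainedNet for fully automated annotation and diagnosis.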

A diagram of the iterative process: initial manual annotation, semiautomated annotation with incremental learning, and fully automated annotation and diagnosis.

Figure 5. Iterative process for training.

Next Steps

CDPL has since deployed its deep learning–assisted pathology diagnosis platform on global cloud platforms, providing easy access for pathologists working in different countries. A*STAR’s BII is currently validating and optimizing its deep learning model for additional clinical scenarios, including different tissue thicknesses, staining mechanisms, and image scanners. Lastly, BII is exploring opportunities to extend the same image quality assessment and deep learning workflow beyond prostate cancer to other types of cancer.

CDPL at BII also organized the Automated Gleason Grading Challenge 2022 (AGGC 2022), which was accepted by the 2022 International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI). AGGC 2022 addresses the challenges of Gleason grading for prostate cancer by leveraging digital pathology and deep learning approaches, with the aim of developing highly accurate automated algorithms for H&E-stained prostate histopathology images with real-world variations. Notably, it is the first challenge in digital pathology to investigate image variations and to build generalizable AI diagnostic models.

Although the challenge has concluded, the complete data set is now available for continued research.

Acknowledgments

A*STAR’s BII would like to thank colleagues at the National University Hospital (NUH), notably Professor Tan Soo Yong, Dr. Susan Hue Swee Shan, Dr. Lau Kah Weng, and Dr. Tan Char Loo, among others, for their partnership and collaboration. NUH is recognized as the source of the data and samples that contributed to the research described in this work. The team is also grateful for the support received from other clinical and industrial partners.

Published 2024
