GANs in Digital Pathology
Traditionally, skilled pathologists examine tissue slides under a microscope. Once slides are converted into a digital format, they can be shared through telepathology and analyzed by machine learning algorithms. Such methods can automate the manual counting of objects or classify the tissue condition, for example in tumor classification. This has the potential to reduce human error and increase diagnostic accuracy. Furthermore, state-of-the-art methods may identify patterns that are difficult to discover even for the human eye.
Deep learning methods are powerful and widely used tools for a large variety of tasks across fields. However, they usually require large amounts of labeled data. Manual annotation of histological images is extremely time-consuming due to their large size, and since it must be done by medical experts, obtaining enough labels to advance deep learning methods in practice is difficult.
Additionally, unintended variations in the pathology image domain arise from differences in the cutting and staining process. Intended variations come from different staining methods, applied to extract other or additional features from the image data. Staining methods that show similar morphologies but different texture and color features also require separately trained image analysis models.
Generative Adversarial Networks (GANs) show potential to reduce the amount of manual labeling needed in digital pathology. Recent investigations have not only improved performance measures but also introduced new applications.
In this article, I will summarize the different tasks and the latest developments using GAN-based methods in the field of digital pathology, based on the survey by Tschuchnig et al. [1].
Several important architectures, such as the GAN, cGAN, cycleGAN, InfoGAN, BigGAN, and GAN-based Siamese networks, show stable training for image analysis in digital pathology.
The GAN architecture enables image generation by mapping an unstructured latent space Z into the image domain X (Z → X) using a generator (G). The generator (G) is trained to produce images with the desired features by fooling a discriminator (D), while the discriminator (D) is trained to differentiate between real and generated samples.
In other words, a GAN is a model in which two neural networks are trained concurrently, one focused on image generation and the other on discrimination.
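To make this concrete, here is a minimal PyTorch sketch of such a two-player training loop. The tiny fully connected generator and discriminator, and the random stand-in for "real" image patches, are illustrative placeholders rather than an architecture from the survey:

```python
# Minimal GAN training sketch (illustrative only).
# The tiny fully connected networks and the random "real" batch are placeholders.
import torch
import torch.nn as nn

latent_dim, image_dim = 64, 28 * 28

G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, image_dim), nn.Tanh())
D = nn.Sequential(nn.Linear(image_dim, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.rand(32, image_dim) * 2 - 1   # stand-in for real image patches
    z = torch.randn(32, latent_dim)            # unstructured latent samples

    # Discriminator step: tell real patches apart from generated ones.
    fake = G(z).detach()
    loss_d = bce(D(real), torch.ones(32, 1)) + bce(D(fake), torch.zeros(32, 1))
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # Generator step: try to fool the discriminator into rating fakes as real.
    fake = G(z)
    loss_g = bce(D(fake), torch.ones(32, 1))
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()
```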
For simplicity, let's define an application in digital pathology as a translation from one input domain to an output domain. There are three such translation settings: Image-to-Image (I2I), Latent-to-Image (Z2I), and Image-to-Label (I2L). All tasks in digital pathology can be assigned to one of these three settings, and specific GAN-based architectures are used to solve them.
1. Image-to-Image (I2I)
I2I maps one image domain X1 to another image domain X2. For this purpose, Pix2Pix and CycleGAN, as well as further GANs such as the encoder-decoder GAN and adversarially trained Siamese networks, can be applied. Training of I2I approaches falls into two major classes: paired (Pix2Pix and InfoGAN) and unpaired (cycleGAN, encoder-decoder GAN, and the adversarial Siamese network). Paired training requires corresponding samples from the two domains, whereas unpaired training only needs two separate data sets, one from each domain. Paired approaches usually perform better, but unpaired methods enable additional areas of application since paired data is not always available.
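To illustrate what "paired" means in practice, here is a hedged sketch of a Pix2Pix-style generator objective: an adversarial term plus a pixel-wise term that ties the output to the known paired target. The `generator`, `discriminator`, and the paired batches `x1`/`x2` are assumed placeholders, not components taken from the survey:

```python
# Sketch of a paired (Pix2Pix-style) I2I generator objective.
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()
l1 = nn.L1Loss()
lambda_l1 = 100.0  # weight for the pixel-wise term (a commonly used value)

def paired_i2i_generator_loss(generator, discriminator, x1, x2):
    """x1: source-domain images, x2: the corresponding target-domain images."""
    fake_x2 = generator(x1)
    pred = discriminator(fake_x2)
    adv = bce(pred, torch.ones_like(pred))   # translated image should look realistic
    pix = l1(fake_x2, x2)                    # and stay close to the known paired target
    return adv + lambda_l1 * pix
```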
Several tasks in digital pathology are assigned to this translation setting:
- Stain normalization. This task reduces the color and intensity variations present in stained images from different laboratories (a mapping from an original image domain Xo to a normalized domain Xn). Apart from the stain, the mapping is supposed to preserve all other image characteristics. CycleGAN has proven to be a highly flexible, powerful, general-purpose architecture that is capable of stain normalization; a minimal sketch of this unpaired, cycle-consistent setup follows the task list below.
- Stain/domain adaptation. In this task, not (only) the variability within one staining protocol (e.g. H&E) but the variability between different protocols needs to be compensated, so the domain shift is larger than in stain normalization. Domain adaptation is usually performed not on the image level but on the feature level, and stain adaptation can be interpreted as a special type of domain adaptation. CycleGAN and Pix2Pix have been adapted to translate between different stains, and a Siamese GAN was introduced for domain adaptation.
- Supervised segmentation. The Pix2Pix network is an established, powerful segmentation approach. The GAN's advantage over other (non-adversarial) approaches here is its ability to maintain fine details through the adversarial loss. GANs have also been applied to "enrich" the image domain in a segmentation setting: instead of increasing the number of samples, the information per image was enlarged. This generated data was used to train and test a segmentation network that outperformed the baseline of a network trained and tested on the source domain only. However, the positive effect was only shown in one very specific application in the field of kidney pathology.
- Synthesis enabling unsupervised learning. CycleGAN, the encoder-decoder GAN, and Pix2Pix have been used to generate virtual images from label-masks in order to obtain labeled training samples; other researchers performed the translation directly from the image to the label-mask domain. Translation between the image and the segmentation label-mask domain (and vice versa) is a highly interesting field with potential for unsupervised segmentation. The unpaired, and thereby unsupervised, approach can be combined with manually labeled data to improve performance even further (depending on the available labeling resources). Based on the publications so far, it is hard to make a general statement on whether a translation from the label-mask to the image domain or vice versa is more effective for unsupervised segmentation, although translating from the image to the label-mask domain is probably the easier task to learn.
- Data generation & augmentation. CycleGAN was used to translate from one tissue category to another (here from normal to abnormal), creating additional samples of the other class from existing ones. The difficulty of this task is that not only low-level image details such as color need to be changed; a translation from one class to another typically requires a major change in image morphology. This was shown to be effective in the considered application scenario, as the achieved classification performance could be increased. However, the cycleGAN architecture in general is not optimized for performing morphological changes.
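As referenced under stain normalization above, here is a minimal sketch of the cycle-consistency idea behind the unpaired approaches (cycleGAN and relatives): translating to the other domain and back should recover the input, which encourages the morphology to stay intact while only domain-specific appearance (e.g. the stain) changes. The generators `G_xy`/`G_yx` and the unpaired batches are assumed placeholders; a full cycleGAN additionally uses an adversarial loss in each domain:

```python
# Sketch of the cycle-consistency term used in unpaired I2I translation.
import torch.nn as nn

l1 = nn.L1Loss()
lambda_cyc = 10.0  # typical weight for the cycle term

def cycle_consistency_loss(G_xy, G_yx, x, y):
    """G_xy maps domain X -> Y, G_yx maps Y -> X; x and y are unpaired batches."""
    # Translating to the other domain and back should reconstruct the input.
    return lambda_cyc * (l1(G_yx(G_xy(x)), x) + l1(G_xy(G_yx(y)), y))
```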
2. Latent-to-Image (Z2I)
This is the original GAN idea of generating images out of noise; in digital pathology it is used to increase the data set size. Such a mapping (Z2I: Z → X) is typically performed by conventional GANs, cGANs, and their modifications such as the progressive-growing GAN and the Wasserstein GAN. Furthermore, latent samples can be translated into a structured space before image generation to support interpretable changes.
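As a small illustration, this is how a trained (here hypothetically class-conditional) generator could be sampled to enlarge a data set; `trained_generator`, the latent dimension, and the class count are placeholders, not values from the survey:

```python
# Sketch of Z2I augmentation with a trained conditional generator (placeholder).
import torch

def synthesize_patches(trained_generator, n_per_class, n_classes=2, latent_dim=128):
    """Sample latent vectors and class labels to enlarge a training set."""
    z = torch.randn(n_per_class * n_classes, latent_dim)
    labels = torch.arange(n_classes).repeat_interleave(n_per_class)
    with torch.no_grad():
        patches = trained_generator(z, labels)  # mapping Z -> X, conditioned on class
    return patches, labels
```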
3. Image-to-Label (I2L)
This setting is mainly used for final classification in diverse applications. Theoretically, the discriminator of a cGAN and cGAN-like networks can be repurposed for I2L, but only a few applications use the cGAN discriminator directly. Furthermore, the discriminator of most GAN variants can be used for representation learning, and the learned representations can in turn serve as input to a subsequent classifier, e.g. an SVM.
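A hedged sketch of that representation-learning route: intermediate discriminator features feed a conventional SVM. The `feature_extractor` (e.g. the discriminator without its final layer) and the labeled patches are assumed placeholders:

```python
# Sketch of reusing GAN discriminator features for I2L classification.
import torch
from sklearn.svm import SVC

def train_downstream_svm(feature_extractor, patches, labels):
    """patches: tensor of image patches, labels: array of class labels."""
    with torch.no_grad():
        feats = feature_extractor(patches)  # e.g. penultimate-layer activations
    clf = SVC(kernel="rbf")
    clf.fit(feats.reshape(len(patches), -1).numpy(), labels)
    return clf
```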
Overall, methods relying on Generative Adversarial Networks (GANs) show the potential to reduce the need for large amounts of manual annotation in digital pathology tasks.
There is optimism that this technology will play an important role in real-world image analysis applications in digital pathology.
References
[1] M. E. Tschuchnig, G. J. Oostingh, and M. Gadermayr, “Generative Adversarial Networks in Digital Pathology: A Survey on Trends and Future Potential,” arXiv.org, 07-May-2020. [Online]. Available: https://arxiv.org/abs/2004.14936.