SynCLay: Interactive Synthesis of Histology Images from Bespoke Cellular Layouts

Automated synthesis of histology images has several potential applications in computational pathology. However, no existing method can generate realistic tissue images with a bespoke cellular layout or user-defined histology parameters. In this work, we propose a novel framework called SynCLay (Synthesis from Cellular Layouts) that can construct realistic and high-quality histology images from user-defined cellular layouts along with annotated cellular boundaries. Tissue image generation based on bespoke cellular layouts through the proposed framework allows users to generate different histological patterns from arbitrary topological arrangements of different types of cells. SynCLay-generated synthetic images can be helpful in studying the role of different types of cells present in the tumor microenvironment. Additionally, they can assist in balancing the distribution of cellular counts in tissue images for designing accurate cellular composition predictors by minimizing the effects of data imbalance. We train SynCLay in an adversarial manner and integrate a nuclear segmentation and classification model in its training to refine nuclear structures and generate nuclear masks in conjunction with synthetic images. During inference, we combine the model with another parametric model for generating colon images and associated cellular counts as annotations, given the grade of differentiation and densities of different cells. We assess the generated images quantitatively and report feedback from trained pathologists who assigned realism scores to a set of images generated by the framework. The average realism score across all pathologists for synthetic images was as high as that for the real images. We also show that augmenting limited real data with the synthetic data generated by our framework can significantly boost prediction performance on the cellular composition prediction task.


Introduction
Generative modeling of histology images has been an active area of research in the field of computational pathology (CPath), with applications that include mitigating privacy issues and overcoming ethical and legal barriers related to sharing image data (Price and Cohen, 2019; Krause et al., 2021).
Most of the aforementioned use cases of synthetic histology image generation require image synthesis from custom cellular layouts or user-defined tissue parameters such as cellular composition or disease grade. Another key requirement for such models is the ability to generate nuclear- or tissue-level annotations along with each generated image, in case the generated images are to be used for training deep learning models for cellular composition prediction or nuclear segmentation, where expert annotations can be difficult and time-consuming to obtain.
Despite a recent increase in synthetic histology image generation methods, no existing method can generate images and their associated annotations conditioned on input cellular layouts or user-defined histology parameters. The term cellular layout refers to a two-dimensional plane on which a user can arrange arbitrary types of cells at different spatial locations. Such a bespoke layout provides flexible control over nuclei in the generated image, which can then be used for various applications. For instance, it can potentially be used to overcome the data imbalance problem for CPath tasks such as cellular composition prediction, in which the number of certain types of nuclei, such as neutrophils, can be quite small in comparison to other types such as epithelial or connective cells.
In this paper, we propose a novel generative framework called SynCLay (Synthesis from Cellular Layouts) for generating synthetic histology images from bespoke cellular layouts. To the best of our knowledge, this is the first method that can generate annotated histology images from bespoke cellular layouts. The proposed framework can also be used to generate tissue images from a set of user-defined parameters such as grade of cancer differentiation and proportions of different types of cells in an image. For this purpose, we use SynCLay in conjunction with a parametric model that first generates a cellular layout from the user-defined parameters (Kovacheva et al., 2016). This integration also allows construction of visually realistic complex multi-cellular structures such as glands. To improve the visual quality of generated nuclei and to generate nuclear masks alongside tissue images, the framework introduces a novel integration of a nuclear segmentation and classification model called HoVer-Net (Graham et al., 2019).
We demonstrate the significance of bespoke synthetic images generated using the proposed approach in balancing the training data for cellular composition prediction and nuclei presence detection algorithms, for which training data can be highly biased due to imbalance across different types of nuclei. We show that by artificially increasing the counts of minority-class cells, synthetic images can be used to minimize data imbalance in training data and thereby improve the performance of cellular composition prediction algorithms for rarely occurring cell types, as well as the detection of their occurrence in images. We also present a detailed quantitative comparison with other state-of-the-art methods for histology image synthesis based on the Frechet Inception Distance (Heusel et al., 2017) metric. The major contributions of this paper are listed below:
1. We propose an interactive framework, SynCLay, that can generate tissue images from bespoke cellular layouts. The framework also allows the user to generate custom tissue images by adding and moving cells, i.e., by changing the cellular layout.
2. The proposed approach also allows generation of realistic synthetic histology images and their associated cellular counts from a set of user-defined parameters such as grade of differentiation of cancer and cellularities (cell densities) of different kinds of cells.
3. We incorporate a nuclear morphology loss function based on a nuclear segmentation model (HoVer-Net (Graham et al., 2019)) into the framework to improve the quality of generated nuclei. This integration also enables the simultaneous generation of nuclei segmentation masks alongside images.
4. We assess the realism of synthetic images with the help of 4 trained pathologists and show that the quality of generated images is comparable with their real counterparts.
5. Finally, we highlight the benefit of using synthetically generated colorectal histology image data from the proposed framework for downstream cellular composition prediction and nuclei presence detection tasks.
The remainder of this paper is organized as follows. In the subsequent section, we go through the related work on relevant methods of image generation in CPath. In Section 3, we provide details of the various components of the proposed framework. In Section 4, we present results demonstrating the efficacy of the proposed framework and evaluate the importance of the architectural design of SynCLay with various ablation experiments, followed by a discussion in Section 5. Finally, we conclude with future directions.

Figure 1: Overview of the proposed framework. Each cell is represented by a cellular vector. Each cellular vector is passed through the mask generator network, producing a binary cellular mask (b). The bounding box for each cell is computed from its spatial location and its size, defined as the pair of lengths of the horizontal and vertical diameters of the cell. These size coordinates are either taken directly from the datasets or constructed using the statistics of cellular sizes in the available data. The cellular vectors are then element-wise multiplied with the constructed binary cellular masks, giving masked embeddings, which are warped into the positions of the bounding box coordinates using bilinear interpolation to give an intermediate tensor (c). The intermediate tensor is passed through the residual encoder-decoder network, generating the colon histology image (d). We integrate HoVer-Net into the framework, which enables the generation of the nuclei segmentation mask (e) by passing the generated tissue image through HoVer-Net. The HoVer-Net-integrated loss component also assists in refining the cellular structures inside the generated tissue image. During inference, we combine the proposed framework with the existing TheCoT framework. The TheCoT model takes user-defined parameters, such as grade of differentiation of cancer and cellularities of different cells, and constructs a cellular layout that mimics the actual spatial arrangement of nuclei. The cellularities, which are real numbers between 0 and 1, are shown in different colors for the respective nuclei types in the leftmost panel of the inference part. The cellular layout is then passed to the proposed framework, which generates the pair of a histology image and its nuclei segmentation mask.

Related Work
In the last decade, wide adoption of Generative Adversarial Networks, or GANs (Goodfellow et al., 2014), has led to the generation of realistic and high-quality synthetic tissue images in CPath. For instance, Quiros et al. (2019) presented PathologyGAN to generate high-fidelity cancer tissue images. The model also learned pathologically meaningful representations within cancer tissue images, which allowed it to perform linear arithmetic operations to change high-level tissue image characteristics. Levine et al. (2020) presented an adversarial learning approach based on ProGANs to generate high-quality histology images of size 1024 × 1024 pixels. Although these approaches were able to generate high-fidelity histology images, they did not produce matching annotations, which are required for the development of machine learning algorithms for various tasks in computational pathology.
Some researchers have investigated generating synthetic pathology images conditioned on tissue component masks (Mirza and Osindero, 2014; Senaras et al., 2018b). Hou et al. (2019) proposed an unsupervised pipeline to construct both histopathology tissue images and their corresponding nuclei masks in order to train a supervised nuclear segmentation algorithm. Their image generation model used the output of its discriminator network to assign instance weights for training the nuclear segmentation algorithm. The success of conditional GANs (cGANs) (Mirza and Osindero, 2014) in generating high-fidelity images conditioned on known ground truth inspired researchers to adapt them for tissue image synthesis. Senaras et al. (2018a) proposed a cGAN-based model to construct breast cancer tissue images conditioned on input nuclear masks. As generative models are sensitive to artifacts in synthetic images, Shrivastava et al. (2017) proposed an unsupervised approach to add realism to generated images and stabilize GAN training. These methods either assume that input component masks are already available, or require explicit construction of component masks by generating random shapes of the respective tissue components such as nuclei, which can be erroneous and unrealistic. Moreover, this process of crafting component masks can be tricky for various nuclear structures. Generating synthetic images along with component masks simultaneously is therefore desirable, as it potentially reduces the cost of annotation and also yields realistic annotated pairs. Furthermore, none of these methods place nuclei in a proper structure, e.g., placing epithelial cells around glands, which may result in unrealistic nuclei locations.

The Proposed Method
Our aim is to develop an interactive framework that enables generation of colon tissue images from user-defined cellular layouts. The cellular layout can be described as a plane on which users can arrange cells of distinct types at 2-D spatial locations, as shown in Figure 1. We feed this layout to the proposed framework, which models spatial dependencies between cells and constructs a histology image of size 256 × 256 pixels through a series of neural networks. Tissue image generation from the cellular layout allows a user to control the locations and types of cells in the colon histology image. We integrate the framework with a pretrained HoVer-Net (Graham et al., 2019), which allows generation of nuclei segmentation masks simultaneously with images. The proposed framework also allows generation of histology images using cellular composition (counts of different types of cells) as input. This is achieved by generating a cellular layout from the input cellular composition vector through the TheCoT framework (Kovacheva et al., 2016). Thus, the entire framework has two major components: first, we construct a cellular layout from a set of user-defined parameters such as cellularities of different cells and grade of differentiation of cancer; second, the proposed framework takes the cellular layout as input and generates tissue images along with their nuclei masks. An overview of the framework is given in Figure 1. The implementation can be found here 1 . Below, we describe the main components of the proposed framework:

Cellular Layout Representation
The input cellular layout can be viewed as a set of n cells of different types together with their corresponding locations on a two-dimensional plane. Each cell, indexed by k = 1 . . . n, is characterized by a triplet c_k = (t_k, l_k, z_k) comprising a one-hot encoding vector t_k specifying the nucleus or cell type, a two-dimensional location vector l_k, and random noise sampled from a Gaussian distribution, z_k ∼ N(0, 1) with |z_k| = 4. The components of the location vector l_k are normalized to the range [0, 1]. The Gaussian noise is added to the cell vectors to ensure variable appearance of the generated cellular objects in the final image.
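As a concrete illustration, the triplet c_k can be assembled into a single flat vector; the sketch below is a minimal example assuming the six CoNiC nuclei types, and the concatenation order (type, location, noise) is an assumption for illustration rather than the paper's exact encoding.

```python
import numpy as np

rng = np.random.default_rng(0)

CELL_TYPES = ["neutrophil", "epithelial", "lymphocyte",
              "plasma", "eosinophil", "connective"]  # CoNiC classes

def make_cell_vector(cell_type, x, y, image_size=256):
    """Build the triplet c_k = (t_k, l_k, z_k) as one flat vector."""
    t = np.zeros(len(CELL_TYPES))                   # one-hot type encoding t_k
    t[CELL_TYPES.index(cell_type)] = 1.0
    l = np.array([x, y], dtype=float) / image_size  # location l_k in [0, 1]
    z = rng.standard_normal(4)                      # Gaussian noise z_k, |z_k| = 4
    return np.concatenate([t, l, z])

c = make_cell_vector("lymphocyte", x=120, y=200)
```

The noise component is what lets two cells of the same type at nearby positions render with slightly different appearances.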

Generation of Binary Cellular Masks
The cellular vectors {c_k | k = 1, 2, ..., n} are input to a mask generator network M to generate the corresponding individual binary cellular masks {m_k | k = 1, 2, ..., n}, each of size 64 × 64 pixels, i.e., m_k = M(c_k; θ_M), where θ_M denotes the trainable parameters of the model. The mask generator network comprises a series of blocks, each consisting of a transposed convolution layer followed by a ReLU activation. The detailed architecture is given in the Appendix.
In order to generate the histology image, we need to move from the input cellular layout to the image domain. For this purpose, we use the generated binary cellular masks, the cellular vectors, and the bounding box coordinates to construct an intermediate tensor that holds the information needed to generate the histology image, using the following procedure. The bounding box coordinates {b_k | k = 1, 2, ..., n} are computed from the input location parameters and the horizontal and vertical sizes of the cells. These size coordinates are either taken directly from the training datasets or derived using the procedure described in Section 4. Each cellular vector c_k is multiplied element-wise with the corresponding cellular mask m_k to give a masked embedding of size 8 × 64 × 64, which is then warped to the position of its bounding box using a fixed bilinear interpolation function F (Jaderberg et al., 2015). This gives an intermediate tensor with 8 channels that holds information about all cells in the given input layout.
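The masked-embedding and placement step can be sketched in plain NumPy; this hypothetical version simply resizes each 64 × 64 masked embedding to its bounding box with bilinear interpolation and pastes it onto an 8-channel canvas, standing in for the differentiable warping function F of Jaderberg et al. (2015).

```python
import numpy as np

def bilinear_resize(img, out_h, out_w):
    """Resize a (C, H, W) array with bilinear interpolation."""
    c, h, w = img.shape
    ys = np.linspace(0, h - 1, out_h)
    xs = np.linspace(0, w - 1, out_w)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, h - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[None, :, None]; wx = (xs - x0)[None, None, :]
    top = img[:, y0][:, :, x0] * (1 - wx) + img[:, y0][:, :, x1] * wx
    bot = img[:, y1][:, :, x0] * (1 - wx) + img[:, y1][:, :, x1] * wx
    return top * (1 - wy) + bot * wy

def compose(cell_vectors, masks, boxes, canvas=256):
    """Scatter each masked embedding into its bounding box on the canvas."""
    d = cell_vectors.shape[1]
    T = np.zeros((d, canvas, canvas))
    for v, m, (x0, y0, x1, y1) in zip(cell_vectors, masks, boxes):
        emb = v[:, None, None] * m[None]            # masked embedding d x 64 x 64
        patch = bilinear_resize(emb, y1 - y0, x1 - x0)
        T[:, y0:y1, x0:x1] += patch                 # place at bounding box
    return T
```

In the actual framework this operation is differentiable end-to-end, so gradients from the image generator reach both the cellular vectors and the mask generator.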

Histology Image Generation
After generating the intermediate tensor T, we feed it to an encoder-decoder residual network (Ashual and Wolf, 2019). This image-to-image translation network is used as the image generator to construct the final tissue image I = E(T; θ_E) of size 256 × 256 pixels. The network consists of a series of residual blocks that transform the input tensor into an output image. The exact architecture of the encoder-decoder residual network is provided in the Appendix.

HoVer-Net Integration
In order to support generation of nuclear masks and to refine the quality of generated nuclei, we incorporate a nuclear segmentation and classification algorithm called HoVer-Net (Graham et al., 2019) into our framework. We pass a generated image through a pretrained HoVer-Net model, denoted by H, and compute the nuclei mask Y, as shown in Figure 1, i.e., Y = H(I). The pretrained HoVer-Net is kept frozen after incorporation into our framework and therefore contributes no trainable parameters. This setting allows the HoVer-Net-integrated loss to be incorporated into the loss function for training the framework.
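The effect of the frozen network on training can be illustrated with a toy example: below, a fixed linear-softmax classifier stands in for the pretrained H, and the cross-entropy gradient is propagated back to the image while the classifier's weights stay untouched. All names and shapes here are illustrative, not those of the actual HoVer-Net.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for the frozen HoVer-Net H: a fixed linear map followed by a
# softmax over nuclei classes. W is never updated during training.
W = rng.standard_normal((3, 5))   # 3 classes, "image" flattened to 5 dims

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

def hovernet_loss(I, y_true):
    """Cross entropy between the ground-truth label and H(I)."""
    p = softmax(W @ I)
    loss = -np.log(p[y_true])
    grad_I = W.T @ (p - np.eye(3)[y_true])   # gradient reaches the image ...
    return loss, grad_I                      # ... while W stays frozen

I = rng.standard_normal(5)
loss, g = hovernet_loss(I, y_true=1)
# one small gradient step on the image reduces the loss; W is untouched
loss2, _ = hovernet_loss(I - 0.05 * g, y_true=1)
```

This mirrors how the frozen HoVer-Net shapes the generator's gradients: the loss is a function of the generated image only, so minimizing it nudges the generator toward images that H segments and classifies correctly.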

Discriminators
We employ two discriminator networks to make the generated tissue image and its cellular components appear realistic: an image discriminator D_I(I; θ_I) for the generated tissue image I, and a cellular discriminator D_C(I_c_i; θ_C) for the cellular components {I_c_i | i = 1, 2, ..., c} inside the tissue image, where c is the number of cells within it; θ_I and θ_C denote the respective trainable parameters of the discriminators. The image discriminator employs the same architecture as the PatchGAN discriminator (Isola et al., 2017), which predicts the realism of different portions of the generated tissue image. The adversarial losses based on these discriminators encourage the generated tissue images to appear realistic.
The cellular discriminator comprises a series of convolution operations and predicts a single realism score for each cellular portion cropped out of the final tissue image using the input bounding boxes and resized to a fixed size using bilinear interpolation (Jaderberg et al., 2015). It ensures that the generated cells, one by one, appear real with their micro-components such as nuclei and cytoplasm.

Loss Function Terms
Here we give details of all loss terms used in our framework. The training loss of the proposed framework is composed of several terms, as it involves multiple networks, as described below:

Cellular mask reconstruction loss: This component penalizes the difference between the ground-truth masks {m̂_k | k = 1, 2, ..., n} and the generated individual binary cellular masks {m_k | k = 1, 2, ..., n} using the mean squared error (MSE):

L_mask = (1/n) Σ_{k=1}^{n} ||m̂_k − m_k||²_2,    (1)

where m̂_k is the ground truth, m_k is the binary cellular mask generated in Section 3.2, and n is the number of cells in the tissue image. As noted in Section 3, m_k depends on the trainable parameters θ_M.
Image reconstruction loss: This term captures the reconstruction error between the ground-truth image Î and the generated tissue image I using the L1 difference:

L_img = ||Î − I||_1,    (2)

where Î is the ground truth and I is the generated tissue image.
HoVer-Net integration loss: This term captures the label prediction error between the ground-truth nuclei segmentation mask Ŷ and the HoVer-Net-predicted nuclei segmentation mask Y = H(I) using the cross-entropy loss:

L_HV = CE(Ŷ, H(I)).    (3)

For this purpose, we use the pretrained HoVer-Net with frozen model parameters, so that the generated tissue image is aligned with the nuclei segmentation mask as a function of the HoVer-Net model.
Adversarial loss terms: We employ an adversarial loss function (Goodfellow et al., 2014) for both discriminators used in the SynCLay framework. A discriminator D_t(X; θ_t) attempts to maximize the loss by classifying the input image X generated by the generator function G(X; θ_G), which tries to minimize it; here t denotes the type of discriminator, either the image discriminator (t = I) or the cellular discriminator (t = C), and θ_t and θ_G denote the sets of trainable parameters of the respective networks. The adversarial min-max loss function is given by

min_{θ_G} max_{θ_t} L^t_GAN = E_{X ∼ p_data}[log D_t(X; θ_t)] + E_{X ∼ p_G}[log(1 − D_t(X; θ_t))].    (4)

The two adversarial loss terms, L^I_GAN for the tissue image and L^C_GAN for the cellular components cropped out of the tissue image, are obtained by setting t = I and t = C in Equation (4), respectively. The overall learning problem can be cast as an adversarial optimization problem based on a linear combination of the adversarial and reconstruction losses. The framework is trained by minimizing the following objective:

L = λ_1 L_mask + λ_2 L_img + λ_3 L_HV + λ_4 L^I_GAN + λ_5 L^C_GAN,    (5)

where λ_1, λ_2, λ_3, λ_4 and λ_5 denote the weights of the corresponding loss components.
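A minimal sketch of how these terms combine, assuming scalar loss values and the standard GAN formulation; the λ weights shown are placeholders, not the values used in the paper.

```python
import numpy as np

def d_loss(d_real, d_fake):
    """Discriminator side of the min-max objective: maximize
    log D(X) + log(1 - D(G)); implemented as minimizing the negative."""
    return float(-np.mean(np.log(d_real) + np.log(1.0 - d_fake)))

def g_loss(d_fake):
    """Generator side: minimize log(1 - D(G)) over the fake scores."""
    return float(np.mean(np.log(1.0 - d_fake)))

def total_loss(l_mask, l_img, l_hv, l_gan_i, l_gan_c,
               lam=(1.0, 1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the mask, image, HoVer-Net and two GAN terms."""
    return float(np.dot(lam, [l_mask, l_img, l_hv, l_gan_i, l_gan_c]))
```

The generator loss falls as the discriminator's scores on fake samples rise, which is what drives the generated tissue and cell crops toward realism.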

Inference using User-defined Parameters
The trained framework is able to generate histology images from input cellular layouts. The cells and their locations can be altered to change the appearance of the generated images; an example can be seen in Figure 6. To provide the flexibility of generating annotated images, we utilize a parametric model to construct the cellular layout from user-defined parameters.
To enable this, we integrate the proposed framework with the existing TheCoT model (Kovacheva et al., 2016) to enable image generation from user-defined parameters such as grade of differentiation of cancer and cellularities (cell densities) of different types of cells. The remaining parameters consumed by the TheCoT model, such as image size = 256 × 256, magnification = 40×, and cell overlap = 0, are kept fixed in this experiment.
The framework first estimates the number of cells and glands based on the image size and magnification. It then computes the number of cells of each distinct type based on their input cellularities, which lie between 0 and 1. Based on the grade-of-differentiation parameter, it draws glands and places epithelial cells along their surfaces. The remaining cells are placed according to a uniform distribution. A sample cellular layout generated using the TheCoT model can be seen in Figure 1.
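The placement logic for the non-epithelial cells can be sketched as follows; this is a simplified stand-in for TheCoT that covers only the uniform placement step (gland drawing and epithelial placement are omitted), and the cell counts and type names are illustrative.

```python
import random

def build_layout(cellularities, n_cells=200, image_size=256, seed=0):
    """Place non-epithelial cells uniformly at random, in proportion
    to the user-defined cellularities (values in [0, 1] per type)."""
    rnd = random.Random(seed)
    total = sum(cellularities.values())
    layout = []
    for cell_type, density in cellularities.items():
        count = round(n_cells * density / total)
        for _ in range(count):
            x, y = rnd.uniform(0, image_size), rnd.uniform(0, image_size)
            layout.append((cell_type, x, y))
    return layout

layout = build_layout({"lymphocyte": 0.3, "plasma": 0.1, "connective": 0.6})
```

A layout built this way can be fed directly to SynCLay, which is what makes the end-to-end "parameters in, annotated image out" pipeline possible.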

Experiments and Results
In this section, we present visual and quantitative results on the quality of the generated images and compare SynCLay with other state-of-the-art models employed for high-quality image generation. We also assess the quality of the synthetic images with the help of expert pathologists. We demonstrate the utility of bespoke synthetic images in reducing class imbalance in training datasets and in improving the performance of a cellular composition prediction algorithm. Finally, we demonstrate the importance of the SynCLay framework design and validate its loss components through an ablation study. As a side experiment, we also investigate the results of employing a graph convolutional neural network to model spatial dependencies between cells in the cellular layout for the generation of synthetic images.

Datasets
For training and performance evaluation of the proposed SynCLay framework, we require image data annotated with cellular layouts. In this work, we consider two datasets for our experiments: CoNiC 2 (Graham et al., 2021a,b) and PanNuke 3 (Gamper et al., 2019, 2020).

CoNiC dataset: This dataset is collected from the CoNiC challenge (Graham et al., 2021a,b). Overall, it contains 4,981 Haematoxylin and Eosin stained colon histology images of size 256 × 256, each coupled with a corresponding nuclei segmentation mask of the following nuclei types: epithelial, lymphocyte, plasma, eosinophil, neutrophil and connective tissue. From these images, we use 3,918 images for training.

Both of these datasets contain histology images and their corresponding nuclei masks. In order to obtain cellular layouts, we adopt the following procedure: first, we extract nuclei objects from the nuclei segmentation masks using the OpenCV (Open Source Computer Vision Library) Python library (Bradski, 2000) and locate their centroids. Then we collect bounding boxes for each of the cells using the boundingRect() function of the same library.
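The centroid and bounding-box extraction step can equally be done with SciPy instead of OpenCV; the sketch below assumes an instance-labelled mask (0 = background) and recovers the same information that cv2.boundingRect() provides.

```python
import numpy as np
from scipy import ndimage

def layout_from_mask(instance_mask):
    """Recover per-nucleus centroids and bounding boxes from a labelled
    nuclei segmentation mask (0 = background, 1..n = nucleus ids)."""
    labels = np.unique(instance_mask)
    labels = labels[labels != 0]
    centroids = ndimage.center_of_mass(instance_mask > 0, instance_mask, labels)
    slices = ndimage.find_objects(instance_mask)      # one slice pair per id
    bboxes = [(s[1].start, s[0].start, s[1].stop, s[0].stop)  # (x0, y0, x1, y1)
              for s in slices if s is not None]
    return centroids, bboxes

mask = np.zeros((8, 8), dtype=int)
mask[1:3, 1:3] = 1                                    # a single 2x2 nucleus
centroids, bboxes = layout_from_mask(mask)
```

Applied to every training mask, this yields the (type, location, size) triplets that define each image's cellular layout.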

Model Training
We train the SynCLay framework in two phases. First, we train the framework without HoVer-Net for 100 epochs. We train HoVer-Net separately on the same training set of the respective dataset. We then integrate this pretrained HoVer-Net into SynCLay and train for a further 20 epochs.

Visual Assessment
Figure 2 shows generated colon images from the CoNiC test set. We observe that shapes, morphological characteristics, glands, glandular lumen, and cellular appearances are preserved in the generated images, which resemble the corresponding real images quite closely. Although cells can be clearly distinguished, some moderate deformities in epithelial cells are visible. We can also notice that the actual nuclei segmentation masks closely resemble the output segmentation masks. Visual results on a representative image from the PanNuke test set are shown in Figure 3. As with synthetic images from the CoNiC dataset, distinct nuclei structures are apparent in generated images from the PanNuke dataset. For instance, lymphocytes, which usually appear in darker shades, are clearly visible in the generated images and the computed segmentation masks (in yellow).
For visual comparison, we consider state-of-the-art conditional generative models, namely Pix2Pix (Isola et al., 2017) and CycleGAN (Zhu et al., 2017), which are used to generate images conditioned on input nuclei masks. We train both of these models on the CoNiC and PanNuke training sets defined in Section 4.1. The Pix2Pix network includes an encoder-decoder generator pipeline with convolution operations for downsampling and up-convolution operations for upsampling, along with skip-connections among the layers to pass low-level details to the generator. CycleGAN is an image-to-image translation framework that is trained on unpaired examples. We also consider the residual encoder-decoder network (Ashual and Wolf, 2019) as a baseline and adapt it for the task of generating images conditioned on input masks. It is interesting to note that these existing models need input nuclei masks for inference, while our model needs only cellular layouts, whose construction is relatively easy and can also be done from user-defined parameters using the TheCoT model (Kovacheva et al., 2016).

Figure 6: Change in the appearance of a synthetic colon tissue image and its nuclei mask after adding a lymphocyte to its cellular layout. The last column shows image alterations after moving the added lymphocyte.

Table 1: Assessment of generated tissue images and associated glands by 4 pathologists (P1, P2, P3 and P4). The images were scored from 1 (least realistic) to 10 (most realistic). The average scores show that the synthetic images are not distinguishable from the real images. The synthetic images used in this experiment were generated using the SynCLay variant (with graph convolution network) discussed in the ablation study.
Figure 4 shows images generated using the baseline models and our model. It can be visibly noticed that the nuclei generated using our framework exhibit finer details compared to the baseline models. From the figure, we can see that the glandular lumen generated by our framework matches the original most closely. The residual encoder-decoder network produces visually better tissue images than Pix2Pix, which also validates our choice of the residual encoder-decoder network as the backbone generator over the Pix2Pix generator. Figure 5 shows images generated with the help of the TheCoT framework from sets of cellularities of different types of cells. It can be observed that the generated samples appear realistic.
We also observe changes in the appearance of histology images after adding nuclei or changing their positions. Figure 6 shows visual results after a lymphocyte is added to a fixed cellular layout and after its position is altered. We can also see the appearance of the added lymphocyte (in yellow) in the generated cellular mask. This demonstrates the flexibility of our interactive framework, which can be used for customized colon histology image generation.

Assessment by Pathologists
To assess the realism of synthetic histology images generated using the SynCLay framework, we asked 4 pathologists to rate each generated image from 1 (least realistic) to 10 (most realistic). The pathologists were presented with a set of 30 images in total, of which 15 were real and 15 synthetic. The set contained images generated from different cellular compositions and grades of differentiation. We also included some unnatural images in the set by manipulating nuclei locations, for example by placing lymphocytes inside glands.
We asked the pathologists to score each image as well as the associated appearance of nuclei. The scores regarding the quality of the generated tissue images are provided in Table 1. The synthetic images obtained a similar average realism score (8.31) to that of the real images (7.92). As can be observed from the table, the average score for synthetic nuclei appearance (8.25) is similar to that for real nuclei (8.68). It can therefore be argued that images generated by the proposed framework appear realistic. However, some pathologists noted that the delineation between cells was not clear, and a few found that the cytoplasm seemed artificial. Moreover, one or two pathologists were able to identify the unnatural images generated for the purpose of the experiment.

Quantitative Analysis
The Frechet Inception Distance (FID) (Heusel et al., 2017) is a widely used metric for evaluating the quality of generated images (Quiros et al., 2019; Levine et al., 2020; Deshpande et al., 2022), as it quantifies a network's ability to reproduce the original data distribution. We assessed the visual similarity between real tissue images and the corresponding synthetic images generated using the SynCLay framework by computing the FID between the two sets of images.
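FID itself is straightforward to compute from feature statistics; a minimal NumPy/SciPy sketch, assuming the Inception feature vectors have already been extracted for both image sets:

```python
import numpy as np
from scipy import linalg

def fid(feats_real, feats_fake):
    """FID = ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S1 S2)^(1/2)) between two
    (n_samples, n_features) sets of Inception feature vectors."""
    mu1, mu2 = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    s1 = np.cov(feats_real, rowvar=False)
    s2 = np.cov(feats_fake, rowvar=False)
    covmean = linalg.sqrtm(s1 @ s2)
    if np.iscomplexobj(covmean):
        covmean = covmean.real        # discard tiny numerical imaginary parts
    return float(((mu1 - mu2) ** 2).sum()
                 + np.trace(s1 + s2 - 2.0 * covmean))

rng = np.random.default_rng(0)
x = rng.standard_normal((500, 16))    # stand-in feature vectors
```

Identical feature distributions give an FID near zero, and the score grows as the generated features drift from the real ones, which is why lower is better in Table 2.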
Table 2 shows FID scores between sets of real and generated tiles computed for both our framework and other state-of-the-art (SOTA) models on the CoNiC and PanNuke datasets. The lowest FID score between real and synthetic images for the CoNiC dataset suggests that the convolutional feature maps computed from the synthetic images are close to those obtained from real images. Where the FID score for the SynCLay framework is slightly worse than that of the residual encoder-decoder model, it should be considered that the SynCLay images are generated from cellular layouts instead of nuclei masks. Crafting masks can be more cumbersome than drawing up cellular layouts, as the latter do not require drawing realistic shapes of the respective nuclei. Overall, our results suggest that the synthetic images are close to realistic images and can potentially be used in computational pathology applications.

Synthetic Images for Performance Improvement of Cellular Composition Prediction
Cellular composition prediction refers to the task of estimating the presence and counts of different types of cells in the tumour microenvironment from Haematoxylin and Eosin (H&E) stained histology images. Cellular composition analysis can be useful for various downstream prediction tasks in CPath, such as survival prediction (Shaban et al., 2019; Ko et al., 2021), gene expression and biological process analysis (Zhan et al., 2019), and recurrence prediction (Ji et al., 2019).
The CoNiC dataset is highly unbalanced with respect to the counts of different types of cells, as shown in Table 3. From the table, it can be seen that neutrophils and eosinophils are present in only a small number of tissue images; additionally, their overall incidence in the dataset is less than 1%. This may affect the performance of cellular composition prediction and nuclei presence detection. In this section, we evaluate the applicability of the proposed SynCLay model for boosting the performance of these tasks using a modified version of the ALBRT (Dawood et al., 2021) model, which uses five branches (All, Left, Bottom, Right and Top) for predicting the counts of different types of cells in an input image and in its left, right, top, and bottom halves. We trained the ALL branch of the ALBRT model to predict the counts of neutrophil, epithelial, lymphocyte, plasma, eosinophil and connective cells present in a patch. Originally, ALBRT was trained using a pairwise ranking loss; for this experiment, however, we used the Huber loss. For baseline results, we trained the ALBRT model on the CoNiC training set and assessed model performance on the CoNiC validation set. We then balanced the distribution of cellular counts by generating synthetic colon images, given the cellularities of different cell types, using the proposed SynCLay model. Table 3 shows the distribution of counts of different types of cells in the training dataset used for training ALBRT before and after data augmentation. We assessed the performance of the ALBRT model using both real and synthetic data with the same experimental protocol used to obtain the baseline results.
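The balancing step amounts to deciding how many synthetic images to request per under-represented cell type; a minimal sketch with hypothetical per-type image counts (not the actual figures from Table 3):

```python
def synthetic_quota(images_per_type, target=None):
    """Number of synthetic images to generate per cell type so that every
    type appears in roughly the same number of training images."""
    target = target if target is not None else max(images_per_type.values())
    return {t: max(0, target - n) for t, n in images_per_type.items()}

# hypothetical counts of training images containing each cell type
counts = {"neutrophil": 300, "eosinophil": 450, "epithelial": 3800}
quota = synthetic_quota(counts)
```

Each quota entry then drives SynCLay with cellularities weighted toward that minority type, so the augmented set contains the missing examples.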
We evaluated the predictive performance of both models in detecting the presence of different types of cells in a patch using AUC-ROC as the performance metric, while for cellular counts prediction we report Pearson's correlation coefficient, Spearman's correlation coefficient and the R2 score between ground-truth and predicted cellular counts. The performance comparison of both models for the cellular composition prediction and cell presence detection tasks can be seen in Figure 7.
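The count-regression metrics above can be computed as follows; this is a minimal NumPy sketch for illustration (the tie-free ranking used for Spearman is a simplification of the standard tie-corrected definition):

```python
import numpy as np

def pearson(y, p):
    """Pearson correlation between ground-truth and predicted counts."""
    y, p = np.asarray(y, float), np.asarray(p, float)
    return float(np.corrcoef(y, p)[0, 1])

def spearman(y, p):
    """Spearman correlation: Pearson computed on ranks.
    (No tie handling in this sketch.)"""
    rank = lambda a: np.argsort(np.argsort(np.asarray(a))).astype(float)
    return pearson(rank(y), rank(p))

def r2_score(y, p):
    """Coefficient of determination (1 - SS_res / SS_tot)."""
    y, p = np.asarray(y, float), np.asarray(p, float)
    ss_res = np.sum((y - p) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

Note that Spearman rewards any monotone relationship (e.g. counts predicted as their squares still score 1.0), whereas Pearson only rewards linear agreement, which is why both are reported.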
From Figure 7, it can be seen that the cellular composition and cell presence detection performance of the ALBRT model improved significantly for neutrophils and eosinophils after the addition of cell-type-specific synthetic images generated using the proposed framework to the training data. For example, the AUC-ROC improved by 4% and 6% for neutrophils and eosinophils, respectively, when using synthetic images in training. Spearman's correlation coefficient registered an increase of 11% for neutrophils and 8% for eosinophils. Similarly, Pearson's correlation coefficient showed a gain of 7% for neutrophils and 5% for eosinophils. The improvement in performance validates the utility of customized synthetic images for the task of cellular composition prediction. The performance metrics for the rest of the cell types, including epithelial cells, lymphocytes, plasma cells and connective tissue cells, remained nearly unchanged. This may be because the cellular counts and the number of images containing those cells were already sufficiently high for training the ALBRT model.

Ablation Studies
In this section, we perform an ablation study to examine the importance of each of the loss terms used to train the SynCLay framework. We also assess and compare the performance of SynCLay after incorporating a graph neural network for processing cellular vectors to generate histology images.

Assessments of Loss Terms
We train the SynCLay network using different combinations of the loss terms discussed in Section 3.6. The visual and quantitative results of this experiment are presented in Figure 8.
It can be observed that the generated images exhibit major and minor patch artifacts after removing the L1 and L2 losses and after detaching the cellular discriminator, respectively, as seen in the first four rows of Figure 8. We can also notice that enabling the HoVer-Net integrated loss moderately increased the fidelity of the generated images in terms of FID, for instance, a reduction from 113.33 to 85.80 between the first two rows and from 84.64 to 82.32 between the next two rows of Figure 8. Interestingly, the model trained with all of the loss terms shows almost equal FID with or without HoVer-Net, but enabling HoVer-Net improves the visual quality of the synthetic images, specifically the nuclei structures and lumen components inside the tissue images.
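For reference, FID compares Gaussians fitted to Inception features of real and generated images. A minimal sketch of the closed-form Frechet distance, assuming the feature means and covariances have already been estimated (extracting the Inception features themselves is out of scope here), is:

```python
import numpy as np
from scipy.linalg import sqrtm

def fid(mu1, cov1, mu2, cov2):
    """Frechet Inception Distance between two Gaussians
    N(mu1, cov1) and N(mu2, cov2):
        ||mu1 - mu2||^2 + Tr(cov1 + cov2 - 2 * (cov1 @ cov2)^(1/2))
    """
    diff = np.asarray(mu1, float) - np.asarray(mu2, float)
    covmean = sqrtm(np.asarray(cov1) @ np.asarray(cov2))
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # discard tiny imaginary parts from sqrtm
    return float(diff @ diff + np.trace(cov1 + cov2 - 2.0 * covmean))
```

Identical feature distributions give an FID of 0, and the score grows as the generated-image statistics drift from the real ones, which is why the drop from 113.33 to 85.80 in the ablation indicates higher fidelity.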

Graph Convolution Operation
The methods of Johnson et al. (2018) and Ashual and Wolf (2019) generate natural images from input scene graphs. They employ a graph convolution operation to process the objects in the scene graphs along with their interdependencies. The same graph convolution operation can be employed to process the cellular vectors of cells inside the cellular layout. We incorporated the graph convolution operation into the SynCLay architecture shown in Figure 1; the modified architecture can be seen in Figure 9. Specifically, we construct the cellular graph by computing a Delaunay triangulation (Ito, 2015) over the 2-dimensional cellular locations. The edge weights between cells are computed as Euclidean distances between their spatial locations. The cellular graph is processed by a graph convolution network (GCN) (Scarselli et al., 2009; Johnson et al., 2018) that generates a per-cell latent embedding. These latent embeddings then act as the cellular vectors, while the rest of the architecture remains the same as described in Section 3.
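The cellular graph construction described above can be sketched as follows; this minimal SciPy example (the function name is illustrative) builds the Delaunay edge set over 2-D cell centroids and attaches Euclidean distances as edge weights:

```python
import numpy as np
from scipy.spatial import Delaunay

def cellular_graph(points):
    """Build an undirected weighted edge list over 2-D cell centroids.

    Edges come from the Delaunay triangulation of the centroids;
    each edge weight is the Euclidean distance between its endpoints.
    Returns a dict mapping (i, j) index pairs (i < j) to distances.
    """
    pts = np.asarray(points, float)
    tri = Delaunay(pts)
    edges = set()
    for simplex in tri.simplices:          # each simplex is a triangle
        for i in range(3):                 # take its three edges
            a, b = sorted((simplex[i], simplex[(i + 1) % 3]))
            edges.add((int(a), int(b)))
    return {(a, b): float(np.linalg.norm(pts[a] - pts[b]))
            for a, b in edges}
```

For instance, three hull points with one interior centroid yield six edges (three hull edges plus three spokes to the interior cell); a GCN would then message-pass over this sparse structure rather than over all cell pairs.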
We train the SynCLay framework with the GCN on the CoNiC train set using the same settings and the same varied loss terms as for SynCLay without the GCN described in the previous section, and present the results in Figure 10, analogous to Figure 8. From both figures, it is apparent that adding the GCN yields little improvement, as both variants show similar visual and quantitative measures. A potential reason is that constructing nuclei shapes is less complicated than constructing objects in natural images (Johnson et al., 2018; Ashual and Wolf, 2019). Therefore, using SynCLay without the graph convolution network can be considered an advantage in the CPath domain, as the architecture is more computationally and memory efficient than the one with the graph convolution network.

Discussion
A limitation of our method is that it requires cellular layouts to generate high-quality colon histology images. However, we have shown how layouts can be created using the TheCoT method (Kovacheva et al., 2016). We have also presented the utility of these masks and the generated synthetic images in augmenting limited data for a cellular composition prediction algorithm. Combining the proposed framework with the TheCoT model enables the generation of histology images from a set of user-defined parameters such as the grade of differentiation and the cellularities of different cells.
The proposed SynCLay framework can be thought of as a crucial step towards the construction of complete whole-slide images from a set of user-defined parameters. Whole-slide images are generally multi-gigapixel in nature and contain a rich mixture of various tissue components with a large number of inter-dependencies among them. Object layouts, in which tissue components are placed on the Cartesian plane, can be generated or acquired using segmentation techniques, depending on the prediction task, or using frameworks similar to TheCoT. The proposed framework can be adapted to consume such object layouts to construct whole-slide images. The framework can potentially be useful in generating exhaustively annotated data for training and evaluating tasks like cellular composition prediction and nuclei segmentation in the domain of computational histopathology. Moreover, the discriminator output can be utilized to compute importance weights as given in (Hou et al., 2019). Such weighted training can potentially increase the performance of nuclei segmentation algorithms.

Conclusions & Future Directions
We presented a novel interactive framework, SynCLay, for the generation of synthetic colon histology image tiles from cellular layouts. We showed that the synthetic tissue images generated by the framework appear realistic and preserve morphological characteristics in the tissue regions. We also assessed the quality of the generated tissue images using the FID metric. Synthetic images generated using our framework showed consistently low FID and outperformed those generated by other SOTA models by a significant margin. We assessed the quality of the generated images with the help of pathologists and found that the synthetic images appear highly realistic from a pathologist's point of view. We demonstrated that the synthetic image tiles constructed by our framework, accompanied by definitive ground truth generated by a parametric model, can be used for the development of deep learning algorithms for computational histopathology tasks like cellular composition prediction, especially when the available data are highly limited and unbalanced.
The proposed framework may be used to extend existing segmentation datasets for histology image analysis, enabling researchers to improve the performance of automated segmentation approaches for computational pathology. The framework can also be generalized to produce a large number of customized images for different types of carcinomas and tissue types. An open direction for future research is to extend the proposed framework to generate complete whole-slide images from known parameters.

Figure 1 :
Figure 1: An overview of the proposed SynCLay framework for generating colon histology images from bespoke cellular layouts. The cellular layout (a) defines the spatial locations of different types of cells (shown in different colors). Each cell is represented by a cellular vector. Each cellular vector is passed through the mask generator neural network, generating binary cellular masks (b). The bounding box for each cell is computed from its spatial location and its size. The size can be defined as a pair of lengths of the horizontal and vertical diameters of a cell. These size coordinates are either taken directly from the datasets or constructed using the statistics of cellular sizes from the available data. The cellular vectors are then element-wise multiplied with the constructed binary cellular masks, generating masked embeddings which are warped into the positions of the bounding box coordinates using bilinear interpolation, giving an intermediate tensor (c). The intermediate tensor is passed through the residual encoder-decoder neural network, generating the colon histology image (d). We integrate HoVer-Net into the framework, which enables the generation of the nuclei segmentation mask (e) by passing the generated tissue image through HoVer-Net. The HoVer-Net integrated loss component also assists in refining the cellular structures inside the generated tissue image. During inference, we combine the proposed framework with the existing TheCoT framework. The TheCoT model takes user-defined parameters, such as the grade of differentiation of cancer and the cellularities of different cells, and constructs a cellular layout that mimics the actual spatial arrangement of nuclei. The cellularities, real numbers between 0 and 1, are shown in different colors for the respective nuclei types in the leftmost part of the inference pipeline. The cellular layout is then passed to the proposed framework, which generates the pair of a histology image and its nuclei segmentation mask.
2 https://conic-challenge.grand-challenge.org/
3 https://jgamper.github.io/PanNukeDataset/
… for training (CoNiC train set) and the rest for testing (CoNiC test set).
PanNuke Dataset: This dataset includes semi-automatically generated nuclei instance segmentation masks with exhaustive nuclei labels across 19 different tissue types. The dataset consists of 481 visual fields, of which 312 are randomly sampled from more than 20K whole-slide images at different magnifications, from multiple data sources. From these regions, we keep 2656 images for training (PanNuke train set) and 2522 for testing (PanNuke test set).

Figure 2 :
Figure 2: Samples of generated images and nuclei segmentation masks from the CoNiC dataset. The locations of distinct cells are shown in different colors in the topmost row of cellular layouts.

Figure 3 :
Figure 3: Samples of generated images and nuclei segmentation masks from the PanNuke dataset. The locations of distinct cells are shown in different colors in the topmost row of cellular layouts.

Figure 7 :
Figure 7: Improvement in various performance measures: (a) Spearman's correlation, (b) R2 score, (c) Pearson's correlation, and (d) AUC-ROC score for the cellular composition prediction task after augmenting limited data with synthetic images.

Figure 8 :
Figure 8: Ablation study showing the importance of various loss components used in training the proposed framework. The bottom-most row shows tissue images generated using all of the loss components. The top rows show synthetic images generated with only a subset of the loss components.

Figure 9 :
Figure 9: SynCLay with a graph convolution network. The architecture is very similar to the one shown in Figure 1; the only change is the construction of a cellular graph (b) and its processing by the graph convolution network (c).

Figure 10 :
Figure 10: Ablation study showing the importance of various loss components used in training the proposed SynCLay framework with the graph convolution network. The bottom-most row shows tissue images generated using all of the loss components. The top rows show synthetic images generated with only a subset of the loss components.

Table 3 :
Distribution of the CoNiC train set: the total cell counts of distinct cell types and the number of images in which those cells occur.