Method Article

A deep learning segmentation strategy that minimizes the amount of manually annotated images

[version 1; peer review: 2 approved with reservations]
PUBLISHED 30 Mar 2021
Authors: Pécot T, Alekseyenko A and Wallace K


Abstract

Deep learning has revolutionized the automatic processing of images. While deep convolutional neural networks have demonstrated astonishing segmentation results for many biological objects acquired with microscopy, this technology's good performance relies on large training datasets. In this paper, we present a strategy to minimize the amount of time spent manually annotating images for segmentation. It involves using an efficient and open source annotation tool, artificially increasing the training dataset with data augmentation, creating an artificial dataset with a conditional generative adversarial network, and combining semantic and instance segmentations. We evaluate the impact of each of these approaches on the segmentation of nuclei in 2D widefield images of human precancerous polyp biopsies in order to define an optimal strategy.

Keywords

Deep learning, image annotation, semantic and instance segmentations, conditional GANs, nuclei segmentation

Introduction

Over the last decade, deep learning approaches have outperformed all existing methods for image segmentation1–4. Semantic segmentation, the estimation of a label at each pixel, and instance segmentation, the identification of individual objects, have been successfully applied to spatially characterize biological entities in microscopic images5–8. However, these powerful approaches rely on large annotated datasets. While more and more datasets become publicly available9,10, annotated data are far from covering every combination of modalities, tissues and biological objects. Therefore, procedures to efficiently build training datasets are required to use the full potential of deep learning-based segmentation at the scale of a single biological lab.

In this paper, we propose a strategy to minimize the amount of time dedicated to manually annotate images and investigate several approaches to maximize accuracy when only using one annotated image. We apply this strategy to segment nuclei stained with DAPI in widefield images of human colorectal adenomas (i.e. precancerous polyps) as follows. First, we take advantage of existing training datasets11,12 and massive data augmentation to obtain a preliminary segmentation. We then use an open source annotation software12 to manually correct this segmentation and consequently define the training dataset. Next, we simulate synthetic images using a conditional generative adversarial network (GAN)13 to increase the size of the training dataset. Finally, we combine U-Net14,15, a semantic segmentation approach, and Mask R-CNN16, an instance segmentation approach, to improve the nuclear segmentation accuracy.

Methods

Sample preparation

In this study, we used the Medical University of South Carolina (MUSC) pathology laboratory information system CoPath (Cerner Corporation, Kansas City, MO) to identify a convenience sample of colorectal adenomas excised from patients who underwent a sigmoidoscopy or colonoscopy with polypectomy between October 2012 and May 2016. For each patient, we obtained a formalin-fixed, paraffin-embedded (FFPE) tissue block and prepared one H&E section and five 5-micron sections for immunofluorescence (IF) on FFPE tissue. Prior to the start of the IF procedures, all antibodies were optimized and reviewed by the study immunologist, the pathologist, the epidemiologist and laboratory personnel to ensure agreement and proper staining. The MUSC Institutional Review Board approved the research study (IRB # PRO-00007139).

Image acquisition

DAPI was used for nuclear counterstaining. Stained slides were mounted with ProLong™ Gold Antifade Reagent (Cat. # P36934, ThermoFisher) and imaged using the Akoya Vectra® Polaris™ Automated Imaging system (Akoya Biosciences, Marlborough, MA). Whole slide scans were acquired at 20X magnification and regions of interest were chosen randomly.

Deep learning code

U-Net, Mask R-CNN and pix2pix were implemented in Python using the numpy17, tensorflow18, keras19, scipy20 and scikit-image21 libraries.

Training dataset

The training dataset consisted of three 1868 × 1400 images manually annotated with Annotater12. For most of the study, only one image was used to train U-Net, Mask R-CNN and pix2pix (conditional GAN). The other two images were added to the training dataset in the last section, for comparison with the combination of results obtained with U-Net and Mask R-CNN (see Figure 3).

U-Net training

The annotated 1868 × 1400 image was divided into six 622 × 700 images for training: five of these images were included in the training dataset while the last one defined the validation dataset. As U-Net is a semantic segmentation approach, three classes were defined to allow separating nuclei, as proposed in 22: inner nuclei, nuclei contours and background. To facilitate nuclei separation, the nuclei contours in the training dataset were dilated22. To limit over-fitting, the imaging field for images in the training dataset was set to 256 × 256 by randomly cropping the 622 × 700 input images. These cropped images were then normalized to obtain intensity values between 0 and 1. The RMSprop optimizer was used to estimate the parameters of the deep convolutional neural network by minimizing a weighted cross entropy loss, to handle class imbalance, for 100 epochs without data augmentation and 25 epochs with data augmentation. The weights associated with each class were defined as the inverse of the class proportions in the training dataset. Data augmentation, applied after normalization with the imgaug python library23, increased the training dataset by a factor of 100 and included flipping, rotation, pixel dropout, blurring, noise addition and contrast modifications. In Figure 2 and Figure 3, augmented simulated images were obtained by applying the same imgaug modifications to images simulated with pix2pix. When combining the annotated image from this study with simulated images and/or existing datasets, the number of augmented images was defined so as to be balanced between the different data sources.
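
As an illustration of how such an augmentation pipeline can be assembled with the imgaug library, a minimal sketch is given below; the chosen operators and magnitudes are assumptions for demonstration and not the exact parameters used in this study.

```python
# Illustrative data augmentation pipeline with imgaug (operators and
# magnitudes are assumptions; the study's exact settings may differ).
import imgaug.augmenters as iaa
from imgaug.augmentables.segmaps import SegmentationMapsOnImage

seq = iaa.Sequential([
    iaa.Fliplr(0.5),                              # horizontal flip
    iaa.Flipud(0.5),                              # vertical flip
    iaa.Affine(rotate=(-90, 90)),                 # rotation
    iaa.Dropout(p=(0.0, 0.05)),                   # pixel dropout
    iaa.GaussianBlur(sigma=(0.0, 1.0)),           # blurring
    iaa.AdditiveGaussianNoise(scale=(0.0, 0.03)), # noise, assuming images in [0, 1]
    iaa.LinearContrast((0.75, 1.5)),              # contrast modification
], random_order=True)

def augment_pair(image, class_mask, n_copies=100):
    """Return n_copies augmented (image, mask) pairs for one training crop.

    class_mask: int32 array with values 0 (background), 1 (inner nuclei),
    2 (nuclei contours), augmented jointly with the image.
    """
    segmap = SegmentationMapsOnImage(class_mask, shape=image.shape)
    pairs = []
    for _ in range(n_copies):
        img_aug, segmap_aug = seq(image=image, segmentation_maps=segmap)
        pairs.append((img_aug, segmap_aug.get_arr()))
    return pairs
```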

U-Net post-processing

An ImageJ macro24,25 was used to convert the three classes obtained with U-Net into individual nuclei. More specifically, individual nuclei were identified by thresholding the subtraction of the nuclei contours component from the inner nuclei component with a threshold equal to 0.35. A 3D Voronoi tessellation26 was then applied to assign each pixel to a nucleus. The object component was defined as all pixels whose background component was below 0.95. This object component was then multiplied by the Voronoi tessellation to obtain individual nuclei. The Voronoi tessellation implies that a 1-pixel-wide area between nuclei is not assigned to any nucleus. To address this problem, the location of these pixels is obtained by subtracting the binary thresholding of the individual nuclei from the object component. The individual nuclei are then dilated27 and multiplied by this subtraction before being added back to the individual nuclei. Finally, nuclei with fewer than 35 pixels were removed.
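
A rough Python counterpart of this macro, built on scikit-image, is sketched below; the Voronoi tessellation is approximated by a marker-based watershed restricted to the object mask, so results may differ slightly from the ImageJ implementation.

```python
# Approximate Python counterpart of the ImageJ post-processing macro.
# The Voronoi tessellation is approximated by a marker-based watershed
# constrained to the object mask; thresholds follow the text above.
import numpy as np
from skimage.measure import label
from skimage.segmentation import watershed
from skimage.morphology import remove_small_objects

def unet_to_nuclei(inner, contours, background):
    """inner, contours, background: per-pixel class probabilities from U-Net."""
    # Seeds: inner nuclei minus contours, thresholded at 0.35.
    seeds = label((inner - contours) > 0.35)
    # Object mask: pixels whose background probability is below 0.95.
    object_mask = background < 0.95
    # Grow each seed over the object mask (Voronoi-like assignment).
    nuclei = watershed(-inner, markers=seeds, mask=object_mask)
    # Discard nuclei smaller than 35 pixels.
    return remove_small_objects(nuclei, min_size=35)
```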

Mask R-CNN

The annotated 1868 × 1400 image was divided into thirty-five 266 × 280 images for training: thirty of these images were included in the training dataset while the last five defined the validation dataset. Version 2.1 of Mask R-CNN16 was used in this study. The backbone network was the Resnet-101 deep convolutional neural network28. We used the code from 5 to define the only class in this study, i.e. the nuclei. Data augmentation, applied before normalization with the imgaug python library23, increased the training dataset by a factor of 100 and included resizing, cropping, flipping, rotation, shearing, pixel dropout, blurring, sharpness and brightness modifications, noise addition and contrast modifications. Transfer learning with fine-tuning from a network trained on the coco dataset29 was also applied. In the first epoch, only the region proposal network, the classifier and the mask heads were trained. The whole network was then trained for the next three epochs. In Figure 2 and Figure 3, augmented simulated images were obtained by applying the same imgaug modifications to images simulated with pix2pix. When combining the annotated image from this study with simulated images and/or existing datasets, the number of augmented images was defined so as to be balanced between the different data sources. The maximum image size used by Mask R-CNN was set to 512, larger than the 256-pixel crops, because resizing and cropping were applied for data augmentation. This parameter was set to 1024 when other existing datasets were included for training, as the magnification in these images is higher.
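
The two-stage training schedule can be sketched with the Matterport Mask R-CNN implementation as follows; the dataset objects, weight file path and IMAGE_MIN_DIM value are assumptions, while the backbone, image size and epoch schedule follow the text above.

```python
# Sketch of the two-stage fine-tuning schedule with the Matterport
# Mask R-CNN implementation (v2.1). Dataset objects and the COCO weight
# file path are assumed to be prepared beforehand.
from mrcnn.config import Config
from mrcnn import model as modellib

class NucleiConfig(Config):
    NAME = "nuclei"
    NUM_CLASSES = 1 + 1      # background + nuclei (the only class in this study)
    BACKBONE = "resnet101"
    IMAGE_MIN_DIM = 256      # assumption; training crops are 256 x 256
    IMAGE_MAX_DIM = 512      # 1024 when external datasets are included

def train_nuclei_model(dataset_train, dataset_val,
                       coco_weights="mask_rcnn_coco.h5", log_dir="./logs"):
    """dataset_train/dataset_val: prepared mrcnn.utils.Dataset subclasses."""
    config = NucleiConfig()
    model = modellib.MaskRCNN(mode="training", config=config, model_dir=log_dir)
    # Transfer learning from COCO: class-specific output heads are re-initialized.
    model.load_weights(coco_weights, by_name=True,
                       exclude=["mrcnn_class_logits", "mrcnn_bbox_fc",
                                "mrcnn_bbox", "mrcnn_mask"])
    # First epoch: train only the region proposal network and the heads.
    model.train(dataset_train, dataset_val,
                learning_rate=config.LEARNING_RATE, epochs=1, layers="heads")
    # Epochs 2-4: train the whole network ("epochs" is cumulative in this API).
    model.train(dataset_train, dataset_val,
                learning_rate=config.LEARNING_RATE, epochs=4, layers="all")
    return model
```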

Evaluation

One 1868 × 1400 image and one 934 × 1400 image, both manually annotated, were used for evaluation. As proposed in 11, we used the F1 score with respect to the Intersection over Union (IoU) to evaluate the different nuclei segmentation approaches. More formally, let $O_{GT} = \{O_{GT}(e)\}_{e=1,\dots,n}$ be the set of $n$ ground truth nuclei and $O_E = \{O_E(e)\}_{e=1,\dots,m}$ be the set of $m$ estimated nuclei. The IoU between the ground truth nucleus $O_{GT}(e_1)$ and the estimated nucleus $O_E(e_2)$ is defined as:

$$\mathrm{IoU}\left(O_{GT}(e_1), O_E(e_2)\right) = \frac{\left|O_{GT}(e_1) \cap O_E(e_2)\right|}{\left|O_{GT}(e_1) \cup O_E(e_2)\right|}.$$

An $\mathrm{IoU}(O_{GT}(e_1), O_E(e_2))$ equal to 0 implies that $O_{GT}(e_1)$ and $O_E(e_2)$ do not share any pixel, while an $\mathrm{IoU}(O_{GT}(e_1), O_E(e_2))$ equal to 1 means that $O_{GT}(e_1)$ and $O_E(e_2)$ are identical. To ensure that one ground truth nucleus is not associated with multiple estimated nuclei, and conversely, we use the following definition:

$$\mathrm{IoU}^*\left(O_{GT}(e_1), O_E(e_2)\right) = \begin{cases} \mathrm{IoU}\left(O_{GT}(e_1), O_E(e_2)\right) & \text{if } \mathrm{IoU}\left(O_{GT}(e_1), O_E(e_2)\right) > \mathrm{IoU}\left(O_{GT}(e_1), O_E(e_i)\right) \ \forall\, i \neq 2 \in \{1,\dots,m\} \\ & \text{and } \mathrm{IoU}\left(O_{GT}(e_1), O_E(e_2)\right) > \mathrm{IoU}\left(O_{GT}(e_j), O_E(e_2)\right) \ \forall\, j \neq 1 \in \{1,\dots,n\}, \\ 0 & \text{otherwise.} \end{cases}$$

The F1 score for a given $\mathrm{IoU}^*$ threshold $t > 0$ is defined as:

$$F_1(t) = \frac{2 \times TP(t)}{2 \times TP(t) + FN(t) + FP(t)},$$

where

$$TP(t) = \sum_{e_1 \in \{1,\dots,n\}} \sum_{e_2 \in \{1,\dots,m\}} \mathbb{1}\left(\mathrm{IoU}^*\left(O_{GT}(e_1), O_E(e_2)\right) > t\right),$$
$$FN(t) = \sum_{e_1 \in \{1,\dots,n\}} \mathbb{1}\left(\mathrm{IoU}^*\left(O_{GT}(e_1), O_E(e_2)\right) < t \ \forall\, e_2 \in \{1,\dots,m\}\right),$$
$$FP(t) = \sum_{e_2 \in \{1,\dots,m\}} \mathbb{1}\left(\mathrm{IoU}^*\left(O_{GT}(e_1), O_E(e_2)\right) < t \ \forall\, e_1 \in \{1,\dots,n\}\right),$$

and

$$\mathbb{1}(\mathcal{C}) = \begin{cases} 1 & \text{if } \mathcal{C} \text{ is true}, \\ 0 & \text{otherwise.} \end{cases}$$

With a threshold $t = 0.05$, this metric measures the ability of a method to identify the correct number of nuclei, while with thresholds in the range 0.05–0.9 it evaluates the localization accuracy of the identified nuclear contours.
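
A compact numpy implementation of this metric, assuming the ground truth and estimated nuclei are provided as labeled images (0 = background), could look like the following sketch; ties in the mutual-best matching are not handled explicitly.

```python
# F1 score at a given IoU* threshold for two labeled images (0 = background).
# The mutual-best matching mirrors the IoU* definition above.
import numpy as np

def f1_at_threshold(gt_labels, est_labels, t):
    gt_ids = [i for i in np.unique(gt_labels) if i != 0]
    est_ids = [j for j in np.unique(est_labels) if j != 0]
    n, m = len(gt_ids), len(est_ids)
    if n == 0 or m == 0:
        return 0.0
    # Pairwise IoU between every ground truth and estimated nucleus.
    iou = np.zeros((n, m))
    for a, i in enumerate(gt_ids):
        gt_mask = gt_labels == i
        for b, j in enumerate(est_ids):
            est_mask = est_labels == j
            inter = np.logical_and(gt_mask, est_mask).sum()
            union = np.logical_or(gt_mask, est_mask).sum()
            iou[a, b] = inter / union if union > 0 else 0.0
    # IoU*: keep a pair only if it is the best match for both nuclei.
    iou_star = np.zeros_like(iou)
    for a in range(n):
        for b in range(m):
            if iou[a, b] > 0 and iou[a, b] == iou[a, :].max() == iou[:, b].max():
                iou_star[a, b] = iou[a, b]
    tp = int((iou_star > t).sum())
    fn = n - tp   # ground truth nuclei without a match above the threshold
    fp = m - tp   # estimated nuclei without a match above the threshold
    return 2 * tp / (2 * tp + fn + fp)
```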

Conditional GAN

The annotated 1868 × 1400 image was divided into thirty-five 256 × 256 images for training. As defined in 13, U-Net14 was used for the generator and a convolutional PatchGAN classifier was used for the discriminator. Once trained, nuclei masks had to be generated to simulate images. Distributions for the number of nuclei per image and the size of nuclei were defined from the training dataset. The number of nuclei per image was modeled as a Gaussian distribution, while the size of nuclei was modeled by a Gumbel distribution to reflect the heavy-tailed distribution observed in the training dataset. Nuclei masks were then defined as ellipses randomly generated with these distributions, with random orientation and a ratio between the two axes defined according to a Gaussian distribution of average s/π and standard deviation 0.2 s/π, where s is the area of the ellipse. One thousand 256 × 256 nuclei images were simulated by considering the generated ellipses as nuclei masks.
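
A sketch of the ellipse mask generation with numpy and scikit-image is given below; the distribution parameters (nuclei count, area, axis ratio) are placeholders standing in for the values estimated from the training dataset.

```python
# Random ellipse masks used as conditional GAN inputs. Distribution
# parameters are placeholders; in the study they are estimated from the
# annotated training image.
import numpy as np
from skimage.draw import ellipse

def simulate_mask(size=256, mean_count=40, std_count=10,
                  area_loc=300.0, area_scale=100.0, rng=None):
    rng = np.random.default_rng(rng)
    mask = np.zeros((size, size), dtype=np.int32)
    n_nuclei = max(1, int(round(rng.normal(mean_count, std_count))))
    for label_id in range(1, n_nuclei + 1):
        s = max(35.0, rng.gumbel(area_loc, area_scale))   # nucleus area (heavy-tailed)
        r = np.sqrt(s / np.pi)                            # equivalent radius
        ratio = max(0.3, rng.normal(1.0, 0.2))            # axis ratio (assumed jitter)
        a, b = r * ratio, r / ratio                       # semi-axes, area preserved
        cy, cx = rng.uniform(0, size, 2)                  # random center
        rr, cc = ellipse(cy, cx, a, b,
                         rotation=rng.uniform(0, np.pi),  # random orientation
                         shape=mask.shape)
        mask[rr, cc] = label_id                           # unique label per nucleus
    return mask
```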

Combination of instance and semantic segmentations

The combination of results obtained with instance and semantic segmentations was initialized as the nuclei segmented with Mask R-CNN. To prevent hallucinations, nuclei identified with Mask R-CNN whose area overlapping with nuclei obtained with U-Net was less than 20% were discarded. Then, nuclei identified with U-Net whose area overlapping with nuclei obtained with Mask R-CNN was less than 33% were added as new nuclei to the final segmentation. Finally, nuclei with an area of less than 35 pixels were discarded.
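
These overlap rules translate directly to labeled images; a sketch is given below, assuming both segmentations are provided as labeled arrays and measuring the second overlap against the Mask R-CNN nuclei kept after the first step.

```python
# Combine Mask R-CNN (instance) and U-Net-derived (semantic) nuclei,
# both given as labeled images (0 = background), with the overlap rules
# described above.
import numpy as np
from skimage.morphology import remove_small_objects

def combine_segmentations(maskrcnn_labels, unet_labels):
    combined = maskrcnn_labels.copy()
    unet_fg = unet_labels > 0
    # Discard Mask R-CNN nuclei overlapping U-Net foreground by less than 20%.
    for i in np.unique(maskrcnn_labels):
        if i == 0:
            continue
        nucleus = maskrcnn_labels == i
        if np.logical_and(nucleus, unet_fg).sum() < 0.2 * nucleus.sum():
            combined[nucleus] = 0
    # Add U-Net nuclei overlapping the kept Mask R-CNN nuclei by less than 33%.
    kept_fg = combined > 0
    next_label = combined.max() + 1
    for j in np.unique(unet_labels):
        if j == 0:
            continue
        nucleus = unet_labels == j
        if np.logical_and(nucleus, kept_fg).sum() < 0.33 * nucleus.sum():
            combined[np.logical_and(nucleus, ~kept_fg)] = next_label
            next_label += 1
    # Remove nuclei smaller than 35 pixels.
    return remove_small_objects(combined, min_size=35)
```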

Results

Deep learning-based instance segmentation with existing datasets and massive data augmentation is used to initialize the training dataset

A training dataset is required to train a deep learning method for object segmentation. Consequently, users most often start by manually annotating objects of interest with existing annotation tools30,31. As shown in Figure 1a, this task is particularly challenging in our case due to the wide range of morphologies and the high density of nuclei in polyps. We use the ImageJ plugin Annotater12 to efficiently annotate nuclei, a task that takes approximately 30 hours. To avoid a fully manual annotation and save time, it is possible to use the same plugin to correct a nuclei segmentation obtained with an existing method. The watershed method32, probably the most widely used method for nuclei segmentation in fluorescence microscopy images, correctly identifies a high number of nuclei (high F1 score for a low IoU threshold in Figure 1b–c). Unfortunately, under- and over-segmentations, a well-known limitation of this approach, lead to poor segmentation localization (rapidly decreasing F1 score with increasing IoU thresholds in Figure 1b–c). Alternatively, deep learning approaches can be trained with existing training datasets. We propose to use a high-throughput chemical screen on U2OS cells dataset (CC) (image set BBBC039v1 available from the Broad Bioimage Benchmark Collection9) and a widefield mouse intestinal epithelium dataset (MIE)12. While U-Net demonstrates poor performance with these datasets (Figure 1b), Mask R-CNN identifies more nuclei and mostly leads to much higher localization precision than the watershed approach (slowly decreasing F1 score with increasing IoU thresholds in Figure 1c). Correcting this segmentation with Annotater takes about 15–20 hours, which is clearly faster than an annotation from scratch. For both U-Net and Mask R-CNN, massive data augmentation (100 times) clearly improves the performance.
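
For reference, the kind of watershed baseline benchmarked here can be assembled with scikit-image roughly as in the sketch below; the smoothing, thresholding and seeding parameters are illustrative assumptions rather than the exact implementation used for comparison.

```python
# A typical watershed baseline for nuclei segmentation in fluorescence
# images (smoothing and thresholding choices are assumptions, not the
# exact implementation benchmarked in the paper).
import numpy as np
from scipy import ndimage as ndi
from skimage.filters import gaussian, threshold_otsu
from skimage.feature import peak_local_max
from skimage.segmentation import watershed
from skimage.measure import label

def watershed_nuclei(dapi):
    smoothed = gaussian(dapi, sigma=2)
    foreground = smoothed > threshold_otsu(smoothed)
    distance = ndi.distance_transform_edt(foreground)
    # Local maxima of the distance map act as one seed per nucleus.
    peaks = peak_local_max(distance, min_distance=5, labels=label(foreground))
    markers = np.zeros_like(distance, dtype=np.int32)
    markers[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)
    return watershed(-distance, markers=markers, mask=foreground)
```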


Figure 1. Manual annotation and evaluation of deep learning-based segmentation with existing training datasets.

a Widefield acquisition of a human polyp biopsy stained with DAPI. Manually annotated nuclei are overlaid as red circles. Zoomed-in regions are displayed on the right side with correspondingly colored squares. Scale bar = 100 µm. b–c F1 score for a range of IoU thresholds obtained with the watershed method and with the U-Net (b) and Mask R-CNN (c) approaches trained with a high-throughput chemical screen on U2OS cells dataset (CC) and/or a widefield mouse intestinal epithelium dataset (MIE), with and without data augmentation (DA). Lines correspond to the average F1 score over the two tested images while the shaded areas represent the standard deviation.


Figure 2. Evaluation of deep learning-based segmentation when using a conditional Generative Adversarial Network to increase the size of the training dataset.

a First row: masks generated as ellipses (see Methods), each represented with a unique color. Second row: images simulated from the masks shown in the first row with a conditional Generative Adversarial Network (GAN). b–c F1 score for a range of IoU thresholds obtained with U-Net (b) and Mask R-CNN (c) trained with: 1 annotated image with data augmentation (DA); 1000 simulated images; 1000 augmented simulated images; 1 annotated image with DA combined with 1000 augmented simulated images; and 1 annotated image with DA combined with 1000 augmented simulated images as well as a high-throughput chemical screen on U2OS cells dataset (CC) and a widefield mouse intestinal epithelium dataset (MIE). Lines correspond to the average F1 score over the two tested images while the shaded areas represent the standard deviation.


Figure 3. Evaluation of nuclear segmentation when combining U-Net and Mask R-CNN.

F1 score for a range of IoU thresholds obtained with: U-Net trained with 1 and 3 annotated images with data augmentation (DA); Mask R-CNN trained with 1 and 3 annotated images with DA; and the combination of the results obtained with U-Net trained with 1 annotated image with DA and augmented simulated images and the results obtained with Mask R-CNN trained with 1 annotated image with DA, augmented simulated images and existing datasets with DA. Lines correspond to the average F1 score over the two tested images while the shaded areas represent the standard deviation.


Figure 4. Nuclear segmentation example when combining U-Net and Mask R-CNN.

Segmented nuclei obtained by combining U-Net and Mask R-CNN are overlaid as red circles on the processed image. Zoomed-in regions are displayed on the right side with correspondingly colored squares. Scale bar = 100 µm.

Increasing the training dataset by using a conditional GAN improves nuclear segmentation accuracy

When only the annotated image in Figure 1a is used in the training dataset, U-Net leads to higher segmentation accuracy than Mask R-CNN (Figure 2b–c). To increase the training dataset, we use the same annotated image to train a conditional Generative Adversarial Network (GAN)13 and simulate images showing nuclei from masks defined as random ellipses, generated with the distributions of nuclei size and nuclei number observed in the training dataset (see Figure 2a and Methods). Only using simulated images leads to a lower accuracy for both deep learning approaches, even though applying mathematical operations to these synthetic images (augmented simulated training dataset, see Methods) improves the segmentation accuracy. However, pooling together the augmented simulated images and the annotated image from Figure 1a slightly improves U-Net performance and distinctly increases the number of accurately identified nuclei with Mask R-CNN, while decreasing the segmentation localization precision. Finally, adding existing datasets clearly leads to the optimal results for Mask R-CNN while degrading the accuracy for U-Net, which is consistent with the inability of this approach to generalize nuclear segmentation across different data, as shown in Figure 1b.

Combining semantic and instance segmentations improves nuclear segmentation accuracy

Nuclei segmented with Mask R-CNN show a higher localization precision than those obtained with U-Net, as shown in Figure 1b–c. However, nuclei that are harder to delineate are missed by Mask R-CNN, while U-Net accurately identifies pixels that belong to nuclei, even though the separation between individual nuclei might not be precise. In order to get the best of both worlds, we propose to combine the results obtained with U-Net trained with one annotated image with data augmentation and augmented simulated images, and the results obtained with Mask R-CNN trained with one annotated image with data augmentation, augmented simulated images and existing datasets with data augmentation (see Methods). As shown in Figure 3, these results demonstrate a higher F1 score for any IoU threshold than those obtained with U-Net or Mask R-CNN trained with three times more annotated images. The corresponding segmented nuclei are shown in Figure 4.

Discussion

This study demonstrates how to take advantage of existing training datasets, efficient annotation tools, massive data augmentation, conditional GANs and the combination of results obtained with both semantic and instance segmentations to minimize the amount of manually annotated data. When facing a new object segmentation problem, it is beneficial to find existing training datasets, even if modalities and/or tissues differ, to train an instance segmentation-based deep learning method. The segmentation obtained with this approach is then used to initialize a training dataset. Training a conditional GAN to increase the size of the training dataset improves the performance of both semantic and instance segmentations. Additionally, adding existing training datasets further increases the segmentation accuracy for instance segmentation. Finally, combining semantic and instance segmentation results leads to the optimal result for the initial training dataset. If the final accuracy is not satisfactory, additional images should be processed by manually correcting the combined semantic and instance segmentations, added to the training dataset, and the whole procedure repeated until an accuracy threshold is met.

Data availability

The five annotated images are available at https://github.com/tpecot/DeepLearningBasedSegmentationForBiologists/tree/main/Data/AnnotatedNuclei. This project contains the following data:

  • Polyp12_[10837,39273]_component_data.tiff: image used for training U-Net and Mask R-CNN in all figures and for training pix2pix in Figure 2

  • Polyp40_[13694,34105]_component_data.tiff and Polyp42_[12011,37598]_component_data.tiff: two images used for training U-Net and Mask R-CNN in Figure 3

  • Polyp12_[12699,39273]_component_data.tiff and Polyp42_[12942,36900]_component_data.tiff: two images used for evaluation in all figures

The images generated with pix2pix and used for training U-Net and Mask R-CNN in Figure 2 and Figure 3 are available at https://github.com/tpecot/NucleiSimulationWithConditionalGAN/tree/main/datasets/Nuclei_polyps_1image.

Software availability

The code with the parameters used to train and process all experiments presented in this manuscript with U-Net and Mask R-CNN is available at https://github.com/tpecot/DeepLearningBasedSegmentationForBiologists/tree/main/Codes.

Archived code as at time of publication: https://doi.org/10.5281/zenodo.4608795 (reference 33)

License: GPL3

The code with the parameters used to train and generate images with pix2pix is available at https://github.com/tpecot/NucleiSimulationWithConditionalGAN.

Archived code as at time of publication: https://doi.org/10.5281/zenodo.4608793 (reference 34)

License: GPL3

The ImageJ macro used to convert the output classes obtained with U-Net to individual nuclei is available at https://github.com/tpecot/DeepLearningBasedSegmentationForBiologists/tree/main/Codes/ImageJMacros.

Archived macro as at time of publication: https://doi.org/10.5281/zenodo.4608795 (reference 33)

License: GPL3
