Quantitative characterization of human breast tissue based on deep learning segmentation of 3D optical coherence tomography images

In this study, we performed dual-modality optical coherence tomography (OCT) characterization (volumetric OCT imaging and quantitative optical coherence elastography) on human breast tissue specimens. We trained and validated a U-Net for automatic image segmentation. Our results demonstrated that U-Net segmentation can be used to assist clinical diagnosis of breast cancer, and is a powerful enabling tool to advance our understanding of the characteristics of breast tissue. Based on the results obtained from U-Net segmentation of 3D OCT images, we demonstrated significant morphological heterogeneity in small breast specimens acquired through diagnostic biopsy. We also found that breast specimens affected by different pathologies had different structural characteristics. By correlating U-Net analysis of structural OCT images with mechanical measurement provided by quantitative optical coherence elastography, we showed that the change of mechanical properties in breast tissue is not directly due to the change in the amount of dense or porous tissue.


Introduction
Other than non-melanoma skin cancers, breast cancer is the most common cancer diagnosed in women, and is one of the leading causes of cancer-related death in women around the world and in the United States [1,2]. Various medical imaging technologies (mammography, ultrasound, and MRI) are utilized for breast cancer screening, diagnosis, image guided intervention, and surgical guidance. Optical coherence tomography (OCT), a 3D imaging modality based on low coherence interferometry, is a promising technology for breast tissue characterization, because OCT allows in situ imaging of breast tissue through a handheld probe [3]. The unique advantage of OCT in breast imaging derives from the fact that its imaging volume and spatial resolution are similar to those of biopsy examination. Previous studies have demonstrated that OCT could reveal morphological characteristics of breast tissues with different pathologies [4][5][6][7][8][9][10][11][12]. Despite technological development in the past decade, it remains challenging for a human reader to perform high level tissue characterization using OCT data in clinical tasks, such as differentiation of tumor from normal breast tissue. First, breast tissue exhibits a wide range of morphological features under OCT examination, while OCT data is affected by various noise sources, including speckle noise. In addition, state-of-the-art high-speed OCT engines generate a massive amount of data that is prohibitively large for visual inspection of individual 2D images in many 3D data cubes. In clinical management of breast cancer, there is an unmet need for automatic analysis of volumetric OCT data and information extraction.
Segmentation, the image processing technique that divides pixels in an image into multiple categories, can be used to analyze breast cancer OCT images, and advance OCT's clinical applications in oncology. By dividing pixels in a volumetric OCT image into different categories, a segmentation algorithm quantifies the amount of tissue belonging to different subtypes, enables quantitative analysis of tissue morphology, and performs automatic cancer detection. Various segmentation algorithms have been developed, based on spatial domain thresholding, spatial frequency domain analysis, and texture analysis [13]. Segmentation based on amplitude thresholding has suboptimal performance in analyzing breast OCT images. This is because the amplitude of the OCT signal is randomly modulated by multiplicative speckle noise and varies over a large range regardless of tissue type. Segmentation based on spatial frequency analysis is limited by the fact that breast tissue deforms easily. Segmentation through texture analysis is also challenging, because characteristic features for different types of breast tissue can be found at different spatial scales.
To address the unmet need for automatic segmentation of breast tissue in OCT imaging, here we describe a deep learning approach that utilizes a U-Net configuration for OCT image analysis. U-Net, a widely used convolutional neural network (CNN) architecture for biomedical image segmentation, has the capability to segment biomedical images with a limited number of training examples [14]. In the OCT community, U-Net has been utilized to segment retina images and skin images [15][16][17][18]. In this study, we manually annotated OCT images obtained from human breast tissue specimens and trained a U-Net for automatic segmentation. We classified pixels of OCT images into the following categories: porous tissue, dense tissue, void area, air-tissue interface and background area. U-Net segmentation of breast OCT images has not been systematically investigated before. In addition, we demonstrated, for the first time to the best of our knowledge, that U-Net segmentation enabled quantitative assessment of tissue composition and heterogeneity in volumetric OCT images. Furthermore, optical coherence elastography (OCE) has demonstrated its clinical value in breast cancer management [19][20][21][22][23][24][25].
Here we combined U-Net analysis of OCT images with quantitative optical coherence elastography (qOCE), and showed that cancer pathology changed the stiffness of tissue rather than the amount of dense breast tissue. We believe that the clinical impacts of this study may include more accurate characterization of core biopsy specimens obtained during image guided biopsy. Current standards of care involve image guided acquisition of tissue via spring loaded or vacuum-assisted core biopsy systems. Although considered highly accurate for most diagnostic criteria, these methods may result in outcomes requiring additional tissue sampling such as insufficient sampling, discordance with imaging, or a diagnosis of a "high-risk lesion" which may imply undersampling of the lesion and a lack of sufficient tissue to render a cancer diagnosis [26]. With U-Net segmentation performing point-of-care tissue characterization, the accuracy of tissue sampling can be significantly improved.
The objective of this study was to demonstrate the potential of U-Net for clinical breast imaging, and show the unique capability of U-Net to perform structural and functional analysis on 3D OCT data. Therefore, we chose to use a typical network architecture. The optimization of network architecture and training strategy will be our future work. This manuscript is organized as follows. We first describe our OCT imaging platform, protocol for patient data acquisition, and the segmentation method. Afterwards, we present results on the performance of U-Net in tissue segmentation, quantitative tissue characterization enabled by U-Net segmentation of 3D OCT images, and quantitative dual modality analysis (OCT and qOCE) of breast tissues. We finally summarize the manuscript with a conclusion and discussion.

OCT imaging platform and data acquisition
We used a spectral domain OCT system for breast imaging. Details on the OCT system can be found in our previous publications [27]. Briefly, the spectral domain OCT system uses a super-luminescent diode (SLD) centered at 1310 nm as its broadband source and uses a line scan CMOS camera (SUI1024LDH2, Goodrich, 92 kHz line scan rate, 1024 array size) for signal detection. The output from the light source is split into the reference arm and the sample arm. In the sample arm, a pair of galvanometers steers the light beam for two dimensional lateral scanning. A workstation computer (Dell Precision T7600) with a general purpose graphics processing unit (GPU, NVidia GeForce 780) is used for real-time signal processing. Software developed in house (C++ with CUDA) is used to control and synchronize individual devices (camera, galvo, and GPU), manage data streaming, and perform image reconstruction. The sample arm of the OCT system uses a scanning lens (LSM04, Thorlabs) that provides a 36 µm lateral resolution. The lateral field of view (FOV) depends on the voltage applied to the galvo and is approximately 2 mm in lateral dimension. For 3D imaging, the OCT engine provides a 7.5 µm axial resolution, 2.5 mm depth imaging range, and >95 dB sensitivity. We also performed quantitative evaluation of the mechanical properties of the tissue using a qOCE instrument [28,29]. Before tissue compression, the qOCE probe is translated axially to have full contact with the tissue. The qOCE instrument uses a thin probe to compress the tissue, measures the reaction force (F) with a calibrated fiber-optic force sensor integrated into the probe shaft, collects OCT signals from the tissue under compression to track depth resolved displacement (D(z)), and quantitatively evaluates the mechanical properties of the tissue using the ratio between the stress (F/A, where A represents the area of the probe tip and $A \approx 2.5\ \mathrm{mm}^2$) and the strain (dD(z)/dz).
To control the amount of compression, the qOCE probe is attached to a linear motor. The motor is programmed to move a certain distance (0.2 mm in this study) for mechanical characterization. We then obtain the apparent Young's modulus (E) of the sample as the ratio between stress and strain: $E = \frac{F/A}{dD(z)/dz}$.
We performed OCT imaging on breast tissue specimens acquired from human subjects at University Hospital, in Newark, NJ, the primary academic medical center of Rutgers, New Jersey Medical School. This prospective study was approved by the institutional review board at Rutgers, New Jersey Medical School. Informed written consent was obtained from each patient prior to image guided tissue sampling and enrollment in the study. The biopsy recommendation had been made based on the standard of care. All tissue samples were obtained via ultrasound guidance utilizing a 14 gauge × 11 cm spring loaded core biopsy device (Achieve breast biopsy, Merit Medical Systems Inc. South Jordan, UT) with 3-5 core samples obtained per lesion targeted for sampling. Biopsies were performed by a board certified radiologist specializing in breast imaging and intervention (Basil Hubbi, MD). Tissue samples were immediately characterized using OCT analysis. We performed volumetric OCT imaging on tissue specimens. Afterwards, we performed quantitative optical coherence elastography (qOCE) characterization. In qOCE measurement, we compressed the tissue specimen with our qOCE probe, and simultaneously measured tissue deformation and tissue/probe reaction force. Afterwards, the tissue was submitted in formalin for standard of care histopathologic analysis (Mark Galan, MD). This study involved 13 patients. Data obtained from each patient included a volumetric OCT image (1024(x) × 500(y) × 512(z)), qOCE measurement (force versus depth resolved displacement), histology and clinical diagnosis based on histology.
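The stress-to-strain ratio used for the apparent Young's modulus can be illustrated with a short numerical sketch. This is not the instrument's processing code: the helper names, the least-squares estimate of the strain, and all numerical inputs other than the 2.5 mm² probe area are assumptions chosen for illustration.

```python
def fit_slope(z, d):
    """Least-squares slope of d versus z, used here as an estimate of dD(z)/dz."""
    n = len(z)
    z_mean = sum(z) / n
    d_mean = sum(d) / n
    num = sum((zi - z_mean) * (di - d_mean) for zi, di in zip(z, d))
    den = sum((zi - z_mean) ** 2 for zi in z)
    return num / den


def apparent_youngs_modulus(force_n, area_m2, depth_m, displacement_m):
    """Apparent Young's modulus E = (F/A) / |dD(z)/dz|, in pascals."""
    stress = force_n / area_m2                          # F / A
    strain = abs(fit_slope(depth_m, displacement_m))    # |dD(z)/dz|, dimensionless
    return stress / strain


# Hypothetical measurement: 0.01 N reaction force, displacement falling
# linearly with depth at a strain of 0.05 (values are illustrative only).
area = 2.5e-6                                  # 2.5 mm^2 probe tip, in m^2
depth = [i * 1.0e-4 for i in range(11)]        # 0 to 1 mm in 0.1 mm steps
displacement = [2.0e-4 - 0.05 * z for z in depth]

E = apparent_youngs_modulus(0.01, area, depth, displacement)
print(f"apparent Young's modulus: {E / 1e3:.1f} kPa")
```

With these illustrative inputs the stress is 4 kPa and the strain is 0.05, so the sketch reports an apparent modulus of about 80 kPa. Fitting the slope over a depth range, rather than differencing two points, is a design choice made here to reduce sensitivity to displacement noise.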

U-Net for breast OCT image segmentation
We use a convolutional neural network based on a U-Net architecture to generate the rules to assign a label to every pixel of an OCT image. Normal breast consists of adipose tissue (fat), stroma tissue and epithelial tissue. The microscopic morphology of breast tissue can also be altered due to pathologies such as cancer or fibroadenoma. Nevertheless, OCT images of breast tissue generally show either a porous texture or a dense speckle pattern [4][5][6][7][8][9][10][11][12]. Usually, normal adipose breast tissue corresponds to a porous appearance with low scattering areas enclosed by individual, well-circumscribed scattering borders. Other than adipose tissue, dense breast tissues include stroma tissue, epithelial tissue and tumor. The area occupied by dense tissues generally has a homogeneous, speckled appearance. Void area (air above the tissue, or empty cavities within the tissue) does not create light scattering and shows no signal. The air-tissue boundary appears as a bright interface because of the abrupt change in refractive index. In addition, background area at a large imaging depth shows minimal signal amplitude because of light attenuation and loss of coherence. Hence we manually label the OCT images using the above described categories (porous tissue, dense tissue, void area, air-tissue interface and background area). The network is trained iteratively using ground truth (OCT image in Fig. 1(a) along with manual segmentation in Fig. 1(b)). The training process optimizes the automatic prediction of pixel type and the trained U-Net allows automatic image segmentation, as illustrated in Fig. 1(c). To segment an arbitrary input image, the probabilities for each pixel to belong to different categories are calculated according to the trained U-Net. The pixel is given the category that corresponds to the highest probability. The architecture of our U-Net is illustrated in Fig. 2. The input and output layers of the U-Net have a dimension of 128 (axial) × 128 (lateral).
In other words, the image input into the U-Net has a dimension of 128 by 128, while the U-Net outputs a label for each pixel in the image. To segment a larger image such as a Bscan with a dimension of 512 (axial) × 1024 (lateral), one must divide the image into multiple 128 × 128 patches and perform U-Net segmentation on each patch. The dimension of the image patch is chosen to mitigate the memory demand to train the network. If the U-Net is designed to directly segment a 512 (axial) × 1024 (lateral) Bscan, the memory required is extremely large and the training takes a much longer time. As shown in Fig. 2, the U-Net consists of a contracting encoder branch and an expanding decoder branch. The encoder branch has four encoder stages and extracts multiscale features from the input image. On the other hand, the decoder branch has four decoder stages and generates a spatially resolved prediction of individual pixels for segmentation. As illustrated in Fig. 2, each encoder stage consists of five layers (3 × 3 convolution layer, ReLU activation layer, 3 × 3 convolution layer, ReLU activation layer, and max pooling layer). Each decoder stage consists of seven layers (up convolution layer, up ReLU layer, concatenation layer, 3 × 3 convolution layer, ReLU layer, 3 × 3 convolution layer and ReLU layer). Notably, the up convolution layer, which performs transposed convolution, generates more coefficients for the next layer based on numbers in the current layer and can be considered as an upsampling layer. The up ReLU layer is an ordinary ReLU layer. The 1st, 2nd, 3rd, and 4th encoder and decoder stages have 64, 128, 256, and 512 features, respectively.
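The patch-wise workflow described above (tiling a Bscan into 128 × 128 patches, then assigning each pixel the most probable category) can be sketched as follows. This is a minimal illustration rather than the network itself: the function names are hypothetical, images are plain nested lists, and the feature-map sizes assume size-preserving convolutions and 2 × 2 max pooling, neither of which is stated in the text.

```python
CLASSES = ("porous tissue", "dense tissue", "void area",
           "air-tissue interface", "background area")


def tile_origins(height, width, patch=128):
    """Top-left (row, col) corners of non-overlapping patch x patch tiles."""
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    return [(r, c) for r in range(0, height, patch)
            for c in range(0, width, patch)]


def extract_patch(image, row, col, patch=128):
    """Cut one patch out of an image stored as a list of rows."""
    return [line[col:col + patch] for line in image[row:row + patch]]


def label_pixel(probabilities):
    """Index of the category with the highest predicted probability."""
    return max(range(len(probabilities)), key=probabilities.__getitem__)


def encoder_stage_shapes(input_size=128, features=(64, 128, 256, 512)):
    """(feature count, spatial size) after each encoder stage.

    Assumes each max pooling layer halves the spatial size (2 x 2 pooling).
    """
    shapes = []
    size = input_size
    for n_features in features:
        size //= 2
        shapes.append((n_features, size))
    return shapes


# A full 512 (axial) x 1024 (lateral) Bscan splits into 4 x 8 = 32 patches.
print(len(tile_origins(512, 1024)))
# Hypothetical per-pixel probabilities for the five categories.
print(CLASSES[label_pixel([0.10, 0.70, 0.05, 0.10, 0.05])])
print(encoder_stage_shapes())
```

Under these assumptions a 128 × 128 patch is reduced to 8 × 8 with 512 features at the end of the encoder branch, which the decoder branch then expands back to a 128 × 128 label map.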

U-Net training and validation
To train the U-Net, we generated ground truth by manually labeling OCT images. For each data volume consisting of 500 Bscans, we selected 25 Bscans uniformly distributed within the volume and labeled the pixels to be porous tissue, dense tissue, void area, air-tissue interface and background area. With 13 data volumes from recruited patients, we have 325 labeled images for training. These Bscans were further divided into smaller image patches with a dimension of 128 × 128, resulting in 4550 patches for training. Notably, we only used 256 pixels in each Ascan for the training, because pixels at a larger imaging depth are very noisy. We also cropped Ascans at the beginning and the end of each Bscan, to eliminate signals affected by galvo scanning artefacts. The accuracy of the U-Net was quantified by comparing the U-Net prediction of pixel label and the ground truth pixel label. We empirically optimized the architecture and training strategy for the U-Net. The U-Net with a satisfactory performance had 4 encoder/decoder stages (Fig. 2), input/output dimension of 128 × 128, and was trained at an initial learning rate of $10^{-4}$. The learning rate drop factor was 0.9 and the learning rate drop period was 1. We used cross entropy as the loss function and used the Adam optimization method. Separate sets of Bscans were used for training, testing and validation. 90% of the data was used for training and 10% of the data was used for validation during the training process. Table 1 shows the segmentation accuracy for networks with different encoder/decoder stages (128 × 128 input/output dimension and trained with an initial learning rate of $10^{-4}$). The network with 4 encoder/decoder stages achieved the highest accuracy. Table 2 shows the segmentation accuracy for networks with different input/output image dimensions (U-Net with 4 encoder/decoder stages and trained at an initial learning rate of $10^{-4}$). The accuracy was optimized when image patches had a dimension of 128 × 128.
Table 3 shows the segmentation accuracy achieved under different initial learning rates (U-Net with 4 encoder/decoder stages and trained with 128 × 128 patches). The accuracy was optimized when the initial learning rate was $10^{-4}$. The U-Net was trained in Matlab (Matlab 2019b), on a graphics processing unit (GPU, GTX1070). The training was accomplished in approximately 14 minutes (a minibatch size of 10, iterated for 10 epochs). After determining the optimal hyperparameters, we trained the U-Net and tested its accuracy using an 80% (training)/10% (validation)/10% (test) split to prevent overestimation of the accuracy. The global test accuracy was approximately 89.10%, slightly worse than the best training accuracy achieved. We also evaluated the accuracy for individual categories in Table 4.
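The training-set bookkeeping and the learning rate schedule described in this section can be checked with a few lines of arithmetic. The sketch below is illustrative only. Two inputs are assumptions rather than statements from the text: the cropped Bscan width of 896 Ascans is inferred from 4550 / 325 = 14 patches per Bscan, and the schedule is taken to multiply the learning rate by the drop factor once every drop period, counted in epochs.

```python
def patches_per_bscan(depth_px, lateral_px, patch=128):
    """Number of non-overlapping patch x patch tiles in a cropped Bscan."""
    return (depth_px // patch) * (lateral_px // patch)


def piecewise_learning_rate(epoch, initial=1e-4, drop_factor=0.9, drop_period=1):
    """Learning rate in a 1-indexed epoch under a piecewise drop schedule.

    Assumes the rate is multiplied by drop_factor every drop_period epochs.
    """
    return initial * drop_factor ** ((epoch - 1) // drop_period)


labeled_bscans = 13 * 25                   # 13 volumes, 25 labeled Bscans each
per_bscan = patches_per_bscan(256, 896)    # 256 depth pixels; 896 is an inferred width
total_patches = labeled_bscans * per_bscan

print(labeled_bscans, per_bscan, total_patches)   # 325 14 4550
for epoch in (1, 5, 10):
    print(epoch, f"{piecewise_learning_rate(epoch):.3e}")
```

Under these assumptions the counts reproduce the 325 labeled images and 4550 patches reported above, and the learning rate decays from 1.0e-4 in the first epoch to roughly 3.9e-5 in the tenth.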

Breast OCT image analysis based on U-Net segmentation
We first demonstrate the need for U-Net to segment OCT images of breast tissue in Fig. 3. We selected a Bscan at the center of one of the 3D OCT data cubes. The pixels of the image were manually labeled to be porous tissue, dense tissue, void area, air-tissue interface and background area. We obtained a histogram for all the pixels in the image (blue curve in Fig. 3). We also obtained a histogram for pixels corresponding to only porous tissue (black curve in Fig. 3), and a histogram for pixels corresponding to only dense tissue (red curve in Fig. 3). Histograms for porous tissue and dense tissue (black and red curves) overlap significantly and the histogram for the entire image does not show separate peaks corresponding to specific tissue types. This is because OCT data is overwhelmingly affected by speckle noise that approximately follows a Rayleigh distribution [30]. Hence it is impractical to segment OCT images based on amplitude thresholding strategies.
Fig. 3. Histogram corresponding to all the pixels within the OCT Bscan (blue), histogram corresponding to pixels labeled as porous tissue (black), and histogram corresponding to pixels labeled as dense tissue (red).
We then demonstrate that U-Net segmentation of OCT images provides morphological characterization of the sample, and the results are consistent with histology. Data shown in Fig. 4(a) (Bscan image in x-z plane), (b) (enface image in x-y plane), (c) (U-Net segmentation result overlaid with gray scale Bscan image) and (d) (histology image obtained with 40X magnification) were obtained from a patient diagnosed as having invasive ductal carcinoma. We re-scaled the images for display, such that the spatial sampling intervals in axial and lateral dimension are identical and the scale bars are valid for both axial and lateral dimensions. We obtained OCT data from our CUDA based software platform after k-linearization, fast Fourier transform, and dynamic range compression through logarithm operation. We then converted the single precision floating-point value of each pixel to an 8-bit integer (see the color bars next to the gray scale OCT images). The histology image (Fig. 4(d)) shows porous microarchitecture corresponding to adipose tissue, as well as dense structure because of cancer invasion through the basement membrane. OCT images (Fig. 4(a) and (b)) clearly show porous structure corresponding to adipose tissue and homogeneous speckle pattern corresponding to dense tissue (cancer). Data shown in Fig. 4(e) (Bscan image in x-z plane), (f) (enface image in x-y plane), (g) (U-Net segmentation result overlaid with gray scale Bscan image) and (h) (histology image) were obtained from a patient diagnosed as having pseudoangiomatous stromal hyperplasia. The histology image (Fig. 4(h)) shows interanastomosing spaces in dense collagenous, keloid-like stroma, characteristic for the specific pathology. OCT images (Fig. 4(e) and (f)) have a homogeneous speckled appearance that is consistent with histology. Notably, we did not attempt to ensure that the two images (OCT and histology) showed exactly the same location or to achieve pixel-wise correlation. According to our approved protocol, OCT images were obtained immediately after tissue acquisition. Afterwards, the tissue specimen was placed in formalin and later submitted for histology. Following such a protocol, it was impractical to mark the OCT imaging plane on a tissue specimen acquired by a 14 gauge core biopsy needle. However, the high-level morphological features characterized by U-Net segmentation are consistent with the results of histology.
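The limitation of amplitude thresholding discussed with Fig. 3 can be illustrated with a toy calculation. The sketch below is not the analysis performed in this study. It assumes that the speckle amplitude within each tissue class follows a Rayleigh distribution, that the two classes differ only in the Rayleigh scale parameter (the hypothetical values sigma_low and sigma_high), and that both classes are equally likely.

```python
import math


def rayleigh_cdf(x, sigma):
    """Cumulative distribution function of a Rayleigh variable with scale sigma."""
    return 1.0 - math.exp(-x * x / (2.0 * sigma * sigma))


def best_threshold(sigma_low, sigma_high):
    """Amplitude at which the two Rayleigh densities are equal.

    For equal class priors this is the single threshold with the lowest error.
    """
    a = 1.0 / sigma_low ** 2 - 1.0 / sigma_high ** 2
    return math.sqrt(2.0 * math.log(sigma_high ** 2 / sigma_low ** 2) / a)


def threshold_error(sigma_low, sigma_high):
    """Fraction of pixels misclassified by the best single amplitude threshold."""
    t = best_threshold(sigma_low, sigma_high)
    low_above = 1.0 - rayleigh_cdf(t, sigma_low)    # weak scatterers called strong
    high_below = rayleigh_cdf(t, sigma_high)        # strong scatterers called weak
    return 0.5 * (low_above + high_below)


# Hypothetical case: one class scatters with twice the amplitude scale of the other.
print(f"misclassified fraction: {threshold_error(1.0, 2.0):.2f}")
```

Even with a twofold difference in amplitude scale, the best threshold in this toy model misclassifies roughly a quarter of the pixels. Because logarithmic compression is monotonic, the same error fraction applies to thresholding the log-scaled image.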
We further demonstrate the feasibility of performing high-level morphological analysis on volumetric OCT data through U-Net segmentation. Results of 3D analysis shown in Fig. 5 and results in Fig. 4(a)-(d) were obtained from the same patient diagnosed as having invasive ductal carcinoma. Results of 3D analysis shown in Fig. 6 and results in Fig. 4(e)-(g) were obtained from the same patient diagnosed as having pseudoangiomatous stromal hyperplasia. 3D OCT data can be examined as a sequence of 2D Bscan images or a sequence of 2D enface images, which does not allow direct characterization of 3D morphological features. Alternatively, an entire OCT data cube can be viewed through volume rendering by projecting the three-dimensional data to a two-dimensional plane, as illustrated in Fig. 5(a). However, due to the limited contrast of OCT images, it is challenging to identify different types of tissue (porous tissue and dense tissue) by examining the 3D rendering of OCT data. For example, the 2D OCT image in Fig. 4(a) clearly shows an area with a porous appearance and an area with a homogeneous speckled appearance, while these morphological characteristics are not discernible in Fig. 5(a). U-Net enables the extraction of high-level morphological information from volumetric OCT data, which is critical for the application of OCT in clinical diagnosis. Using the volume data shown in Fig. 5(a), we segmented all the Bscans with the trained U-Net and the segmentation results are 3D rendered in Fig. 5(b). The volume rendered with tissue category labeling shows dense tissue (cancer in red) infiltrating into porous tissue (fat in green), which is consistent with the results shown as 2D images in Fig. 4(a). This is further illustrated in Fig. 5(c) and (d), which are volume renderings of pixels identified as porous tissue and dense tissue. In addition, as illustrated in Fig. 5(e), quantitative understanding of 3D characteristics of breast tissues can be obtained by further reducing the results of U-Net segmentation. Using the results of U-Net segmentation, we calculated the number of pixels corresponding to dense and porous breast tissues, for individual x-z planes (Bscans), y-z planes, and x-y planes (enface), and show the ratio between pixel numbers for dense and porous breast tissues in Fig. 5(e) (upper: x-z plane; middle: y-z plane; bottom: x-y plane). Results in Fig. 5(e) suggest that the composition of breast tissue is heterogeneous even within a small tissue specimen. The same observation can be made for U-Net enabled analysis of the 3D OCT image in Fig. 6.
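The plane-by-plane reduction of the segmentation result, as plotted in Fig. 5(e), can be sketched as follows. This is an illustrative implementation, not the code used in the study: the label values, the volume[x][y][z] indexing convention, and the function name are assumptions, and a plane without porous pixels is reported as None because the ratio is undefined there.

```python
POROUS, DENSE = 0, 1   # hypothetical integer labels for the two tissue categories


def plane_ratios(volume, axis):
    """Dense-to-porous pixel count ratio for every plane perpendicular to `axis`.

    volume[x][y][z] holds one integer label per voxel.
    axis=0 gives y-z planes, axis=1 gives x-z planes (Bscans),
    axis=2 gives x-y planes (enface).
    """
    nx, ny, nz = len(volume), len(volume[0]), len(volume[0][0])
    n_planes = (nx, ny, nz)[axis]
    dense = [0] * n_planes
    porous = [0] * n_planes
    for x in range(nx):
        for y in range(ny):
            for z in range(nz):
                index = (x, y, z)[axis]
                label = volume[x][y][z]
                if label == DENSE:
                    dense[index] += 1
                elif label == POROUS:
                    porous[index] += 1
    return [d / p if p else None for d, p in zip(dense, porous)]


# Tiny hypothetical 2 x 2 x 2 label volume.
volume = [[[POROUS, DENSE], [POROUS, DENSE]],
          [[DENSE, DENSE], [POROUS, POROUS]]]
print(plane_ratios(volume, axis=1))   # one ratio per Bscan (x-z plane)
print(plane_ratios(volume, axis=2))   # one ratio per enface (x-y) plane
```

A ratio that changes strongly from plane to plane along any of the three axes indicates the kind of compositional heterogeneity described above.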
In addition, we performed U-Net segmentation across specimens confirmed to have different pathologies. For each breast specimen, we calculated the percentage of pixels belonging to porous tissue and pixels belonging to dense tissue. For each patient, we estimated the percentage of pixels corresponding to porous tissue and dense tissue in each Bscan ($p_{\mathrm{porous}}(i)$ and $p_{\mathrm{dense}}(i)$ for the ith Bscan), and the average percentage for porous and dense tissue within the entire volume ($P_{\mathrm{porous}}$ and $P_{\mathrm{dense}}$, the averages of $p_{\mathrm{porous}}(i)$ and $p_{\mathrm{dense}}(i)$ over all Bscans in the volume). $P_{\mathrm{porous}}$ and $P_{\mathrm{dense}}$ obtained from 13 patients are shown as bars in Fig. 7. Notably, each tissue specimen was from an individual patient. Specimens with the same pathology were from different patients. Figure 7 shows results according to the chronological order of data acquisition. The malignant cases are highlighted using the color red as the background. For different tissue specimens, the combined percentage of porous and dense tissue was different, because of different numbers of pixels corresponding to void area or background. We also quantitatively evaluated how the number of pixels corresponding to porous tissue or dense tissue varied across different Bscan frames using the standard deviation, shown as error bars in Fig. 7. Results in Fig. 7 demonstrate that the U-Net segmentation enables the extraction of concise morphological information from massive 3D OCT image data, for different types of tissue.
Fig. 6. Results of 3D analysis of OCT data obtained from patient with pseudoangiomatous stromal hyperplasia: (a) rendered 3D OCT data; (b) rendered 3D OCT data that is labeled based on U-Net; (c) 3D rendering of pixels classified as porous tissue; (d) 3D rendering of pixels classified as dense tissue; (e) the ratio between pixel numbers for dense and porous breast tissues (upper: x-z plane; middle: y-z plane; bottom: x-y plane).
U-Net segmentation allows quantitative assessment of both sample morphology and functional properties of breast tissue. To demonstrate this, we performed dual modality characterization of breast tissue. The same tissue specimen underwent volumetric imaging and quantitative optical coherence elastography (qOCE) characterization. Our qOCE instrument has been described in previous publications [28,29,31]. Here we analyzed data of dual modality characterization (3D OCT and qOCE) obtained from three patients (marked by red asterisk in Fig. 7). These tissue specimens were respectively diagnosed as benign fatty tissue, fibroadenoma, and ductal carcinoma in situ (DCIS) in histology. For volumetric OCT data, we used the U-Net to segment individual Bscans in the 3D data cube, and calculated the total percentage of pixels corresponding to porous tissue and dense tissue in the entire volume. For qOCE data, the instrument is calibrated in its force and displacement tracking capability [28].
We evaluated the apparent Young's moduli (E) of the sample: $E = \frac{F/A}{dD(z)/dz}$. Figure 8(a) shows the percentage of pixels classified as porous and dense tissue for each specimen, according to the results of U-Net segmentation. Relatively, the specimen diagnosed as benign fatty tissue has the least porous tissue because it contained a significant amount of benign dense tissue, while the specimen diagnosed as DCIS has the most porous tissue. Figure 8(b) shows the apparent Young's moduli ($E_{\mathrm{fat}}$, $E_{\mathrm{fibroadenoma}}$ and $E_{\mathrm{DCIS}}$) obtained from qOCE measurements for the same tissue specimens. Our results suggest $E_{\mathrm{fat}} < E_{\mathrm{fibroadenoma}} < E_{\mathrm{DCIS}}$. Notably, these elasticity values only represent an estimation of apparent Young's moduli, because the measurement depends on the boundary condition of the measurement and the accuracy of force sensing calibration. Nevertheless, the results obtained from the same calibrated instrument showed the relative stiffness of the specimens. It is known that values of tissue stiffness reported in the literature vary significantly. They depend on the sample geometry, measurement methodology, and technique utilized for stiffness assessment. Nevertheless, in terms of orders of magnitude, our results are similar to values reported in a previous study [32]. According to Fig. 8(a), the DCIS specimen had the largest percentage of porous tissue (or adipose tissue). It is generally believed that porous breast tissue is less stiff compared to diseased breast tissue. Therefore, the specimen (DCIS) containing the most porous tissue is anticipated to have the smallest Young's modulus. However, qOCE measurement showed DCIS to have the largest stiffness (Fig. 8(b)). On the other hand, the benign fatty specimen had the smallest percentage of porous tissue and the smallest stiffness. According to our results, it can be inferred that the altered stiffness of breast tissue is not directly related to a larger volume occupied by morphologically dense tissue.
Instead, the altered stiffness may be due to the change in molecular composition of the tissue. Notably, the current qOCE study was limited by the small number of specimens tested. Furthermore, the assessment of tissue composition (percentage of porous and dense tissue) was affected by signal attenuation.
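The per-specimen summary plotted in Fig. 7 (the average percentage of a tissue category over all Bscans, with the Bscan-to-Bscan standard deviation as the error bar) can be sketched as follows. The function names and label values are hypothetical, and the use of the population standard deviation is an assumption, since the text does not state which estimator was used.

```python
import statistics

POROUS, DENSE = 0, 1   # hypothetical integer labels


def bscan_percentage(bscan_labels, target):
    """Percentage of pixels in one labeled Bscan that carry the target label."""
    total = sum(len(row) for row in bscan_labels)
    hits = sum(row.count(target) for row in bscan_labels)
    return 100.0 * hits / total


def volume_summary(volume_labels, target):
    """Mean and standard deviation of the per-Bscan percentages in one volume."""
    per_bscan = [bscan_percentage(b, target) for b in volume_labels]
    return statistics.mean(per_bscan), statistics.pstdev(per_bscan)


# Two tiny hypothetical Bscans of 2 x 2 pixels each.
volume = [
    [[POROUS, POROUS], [DENSE, DENSE]],     # 50% porous
    [[POROUS, POROUS], [POROUS, DENSE]],    # 75% porous
]
mean_porous, std_porous = volume_summary(volume, POROUS)
print(mean_porous, std_porous)   # 62.5 12.5
```

Pixels labeled as void area, air-tissue interface, or background count toward the total but toward neither category, which is why the porous and dense percentages of a specimen need not sum to 100%.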

Conclusion and discussion
In this study, we acquired tissue specimens from patients who underwent standard of care breast biopsy procedures. We performed volumetric OCT imaging on these tissue specimens. We manually labeled a subset of images (Bscans) in 3D data cubes. We used the results of manual labeling as the ground truth to train a U-Net that was subsequently used in automatic tissue segmentation. Our results showed that OCT imaging of breast tissue provided morphological characteristics consistent with histology, and U-Net allowed accurate and robust segmentation of breast OCT images (Fig. 4). With U-Net segmentation, high-level understanding of tissue morphology, such as the ratio between porous and dense breast tissue, can be extracted quantitatively from volumetric image data and used to assist clinical diagnosis. In addition to clinical impact, U-Net segmentation is a powerful enabling tool to advance our understanding of the characteristics of breast tissue. U-Net analysis of 3D OCT data showed that the tissue specimen was highly heterogeneous even within a small volume acquired by a core biopsy needle (Fig. 5 and Fig. 6). Therefore, when OCT is used to perform in situ tissue characterization, a sufficiently large volume must be interrogated to overcome the inherent heterogeneity of the sample. In addition, specimens diagnosed as having different pathologies showed different tissue composition (Fig. 7). Results in Fig. 7 represent new knowledge discovered through U-Net segmentation. If the amount of porous tissue and dense tissue in a specimen can be characterized in situ during biopsy, the radiologist can quickly rule out a non-diagnostic sample that contains predominantly porous tissue. Moreover, results in Fig. 7 suggest that the composition of breast tissue varies significantly and that the expectation that normal breast tissue generally has a porous appearance in OCT images is inaccurate.
The diagnostic capability of OCT can be greatly strengthened by combining morphological characterization with functional characterization, such as qOCE measurement. We also correlated the results of U-Net analysis with the results of qOCE measurement (Fig. 8). Our results showed that the change in tissue mechanical properties is not directly due to the difference in the amount of dense or porous tissue.
Similar to other AI approaches, U-Net analysis of 3D OCT data is currently limited in clinical application by the diversity of breast pathology and the small number of participating patients. Nevertheless, U-Net enables automatic, fast, robust, and quantitative morphological tissue characterization through segmentation. In this study, we trained a U-Net that classifies pixels in individual Bscans. An alternative deep learning approach for breast tissue characterization is to train a convolutional neural network (CNN) to analyze an entire image and determine the tissue type for diagnosis. If a sufficiently large image database (images and ground truth diagnosis for each image) is used for training, a CNN can achieve very high diagnostic accuracy, as demonstrated in dermatology [33]. However, we are limited by the small number of tissue specimens belonging to individual pathology categories, while the training of a U-Net is much easier with the data we have. In addition, U-Net can be potentially trained to segment malignant tissue from benign tissue. However, this is a challenging task, because of the limited training data available, the large variety of breast pathologies, diverse morphological features among normal breast tissues, and diverse morphological features among pathological breast tissues.
We trained the U-Net using 2D images and used the U-Net to segment individual Bscan images in a 3D volume. It is possible to train a U-Net that segments 3D images. However, we established the ground truth for the training through manual annotation, and manual labeling of 3D data remains challenging. Moreover, a U-Net for 3D segmentation has many more parameters than a U-Net for 2D image segmentation. It requires extremely large space for storage and takes a much longer time to train. To train the U-Net, we essentially taught the AI algorithm the way humans perceive OCT image data. We chose to classify the pixels in an OCT image into the following categories: porous tissue, dense tissue, void area, air-tissue interface and background area, because these categories could be reliably and accurately identified through human observation. Our experimental data suggested that normal fatty tissue (confirmed in histology by pathologists) appeared to have a combination of a porous and dense appearance, while other tissue could have a predominantly porous appearance in OCT images. Hence, each of these categories defined in U-Net may correspond to histologically different tissues, which may complicate clinical interpretation of U-Net data. Nevertheless, U-Net extracts valuable high level information that is consistent with human observation. When only a small data set is available for training, it is a common practice to perform transfer learning that builds CNNs using architectures pretrained on large data sets [34]. In this study, we did not use transfer learning, because OCT images are significantly different from natural images such as the ones in ImageNet, in spatial and frequency domain characteristics. We trained our network from scratch and our training strategy achieved satisfactory segmentation performance. We will investigate the potential usage of transfer learning in OCT image analysis.
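The remark that a 3D U-Net has more parameters than a 2D U-Net can be made concrete with a rough count of the convolution weights in the encoder branch. The sketch below is an estimate under stated assumptions, not a description of a network we built: it assumes a single-channel input, two convolutions per stage with 64, 128, 256, and 512 features, a bias term per output feature, and 3 × 3 × 3 kernels in the hypothetical 3D variant.

```python
def conv_params(kernel, dims, c_in, c_out):
    """Weights plus biases of one convolution layer with a cubic/square kernel."""
    return kernel ** dims * c_in * c_out + c_out


def encoder_params(dims, features=(64, 128, 256, 512), in_channels=1, kernel=3):
    """Total convolution parameters in an encoder with two convolutions per stage."""
    total = 0
    c_in = in_channels
    for c_out in features:
        total += conv_params(kernel, dims, c_in, c_out)    # first convolution
        total += conv_params(kernel, dims, c_out, c_out)   # second convolution
        c_in = c_out
    return total


params_2d = encoder_params(dims=2)
params_3d = encoder_params(dims=3)
print(params_2d, params_3d, round(params_3d / params_2d, 2))
```

Under these assumptions the encoder grows from about 4.7 million to about 14 million parameters, roughly a threefold increase. The larger practical cost is likely the activation memory, since a 128 × 128 × 128 input has 128 times as many voxels as a 128 × 128 patch has pixels.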
Disclosures. The authors declare no conflicts of interest.