Polarimetric second-harmonic generation microscopy of the hierarchical structure of collagen in stage I-III non-small cell lung carcinoma

Polarimetric second-harmonic generation (P-SHG) microscopy is used to quantify the structural alteration of collagen in stage-I, -II and -III non-small cell lung carcinoma (NSCLC) ex vivo tissue. The achiral and chiral molecular second-order susceptibility tensor component ratios (R and C, respectively), the degree of linear polarization (DLP) and the in-plane collagen fiber orientation (δ) were extracted. Further, texture analysis was performed on the SHG intensity, R, C, DLP and δ. The distributions of R, C, DLP and δ as well as the textural features of entropy, correlation and contrast show significant differences between normal and tumor tissues.


Introduction
Lung cancer is the second most commonly diagnosed cancer worldwide, with 85% of cases classified as non-small cell lung carcinoma (NSCLC) [1]. The most important prognostic factor for NSCLC is staging, which is defined by the anatomic extent of the disease. Staging is usually done by the TNM method based on tumor (T), lymph nodes (N) and metastases (M) factors, the combination of which determines stages I-IV; the last mostly involves metastatic spread. Staging determines the treatment strategy. While the focus of therapy for patients with stage IIIB-IV NSCLC is palliation, the treatment intent for stages I-IIIA is curative. The five-year survival rate for patients with stage-I NSCLC is 70%, dropping to around 1% for stage IV. Hence, early diagnosis and accurate staging are critical for survival rate and quality of life [2].
Tumor progression comprises not only tumor cell proliferation but also changes in the tumor microenvironment [3][4][5] that affect tumor growth and metastatic potential [6]. Previous studies have highlighted the impact of tumor growth on the structure and composition of the extracellular matrix (ECM) [7][8][9][10]. Since collagen is the major structural protein in the ECM, its structural alterations during tumor development have been the focus of many studies. These

Tissue sample preparation
Tissues were collected according to an institutionally approved protocol (University Health Network, Toronto, Canada) from 9 patients with NSCLC who underwent complete surgical resection of their tumors: 3 patients in each of stages-I, -II and -III. Normal-appearing lung parenchyma was taken at least 5 cm away from the tumor in the same patients. All samples were handled as per standard clinical histology protocols, generating formalin-fixed specimens cut into 5 µm thick sections and mounted on glass slides. The sections were stained with hematoxylin and eosin (H&E) and imaged with a bright-field microscope scanner (Aperio Whole Slide Scanner, Leica Biosystems) for reference. The stained samples were used to determine the regions of interest (ROIs) and were also used for SHG imaging without further preparation. For each patient, slides of one tumor and one normal lung section were investigated, and at least 7 ROIs were scanned in each slide, as identified by a lung pathologist (T. W., S. S. or M.-S. T.), yielding a total of 23, 35 and 27 areas in stage-I, -II and -III specimens, respectively. In addition, 65 areas in normal-appearing lung tissue from the 9 patients were scanned as control samples. For the controls, collagen-rich patches in non-neoplastic lung were selected. For tumor slides, areas of abnormal stroma contiguous with the cancer mass were selected as cancer-associated stroma. Each scan area was 110 µm × 110 µm.

PIPO SHG microscope setup
The PIPO SHG imaging was performed with a custom nonlinear laser-scanning microscope, as described elsewhere [23]. The microscope was coupled with an in-house built diode-pumped Yb-ion-doped potassium gadolinium tungstate (Yb:KGW) crystal-based oscillator providing ∼430 fs pulses at 1028 nm peak wavelength and 14.3 MHz pulse repetition rate [37]. The microscope contains a polarization state generator (PSG) to control the incident polarization state of the laser beam and a polarization state analyzer (PSA) to measure the polarization state of the outgoing SHG signal from the sample. The PSG has a linear polarizer (Laser Components Inc.) to ensure that the incident laser beam is linearly polarized and a half-wave plate (Comar Optics) to rotate the linear polarization of the incoming laser beam. The PSG is placed before the excitation objective (20×, 0.75 NA air objective, Carl Zeiss). The PSA contains only a rotating linear polarizer and is placed after the custom 0.85 NA collection objective. The SHG signal was collected in the forward direction and detected with a single-photon counting photomultiplier tube (Hamamatsu H7422P-40). A BG 39 Schott glass filter and a 510-520 nm band-pass interference filter (Edmund Optics) were used in front of the detector to separate the SHG from the laser light. To perform the PIPO imaging, the PSG half-wave plate was rotated to nine different angles with equal increments from 0° to 180°, and for each PSG state the PSA was rotated to nine different evenly spaced angles from 0° to 180°. To monitor any possible sample degradation during imaging, the samples were periodically scanned at the initial PSG and PSA settings after measurements with each set of the PSA states. The 110 µm × 110 µm ROIs were scanned with a 2 µs dwell time, and 50 to 100 frames of 128 × 128 pixels were summed to obtain an image.

PIPO SHG image analysis
Ultrastructural properties of collagen fibers in tissue can be described by the second-order susceptibility tensor components ($\chi^{(2)}_{ijk}$), where ijk denotes the Cartesian molecular coordinate system, with the z-axis being the collagen fiber direction. The 3D organization of collagen fibers in the lab coordinate system (denoted by IJK/XYZ) can be defined by the out-of-plane angle of the collagen fibril (α), measured from the image plane located in the XZ-plane, and the average in-plane orientation of the fiber (δ) with respect to the laboratory Z-axis. The laser propagation is set along the laboratory Y-axis. The detected SHG intensity from collagen depends on the molecular susceptibility components as well as on the δ and α angles. It also depends on the orientation of the incoming laser polarization, θ, and the orientation of the outgoing SHG polarization, φ, both measured with respect to the Z-axis of the laboratory frame of reference. The relation between the SHG intensity ($I_{SHG}$), polarizer angle (θ), analyzer angle (φ), the second-order susceptibility ratios (R and C), and the in-plane orientation of the collagen fibrils (δ) can be formulated as in Eq. (1) [25], where R and C are the achiral and chiral ratios of in-image-plane molecular susceptibility components, respectively. In this equation, it is assumed that the sample birefringence is negligible, which is valid for the 5-µm thick tissue sections. $\chi_{xxz}$ and $\chi_{zxx}$ were shown to be similar in Achilles tendon collagen at the fundamental wavelength of 1028 nm [25]; therefore, $\chi_{xxz}/\chi_{zxx} = 1$ is assumed [38]. A trust region reflective (TRR) fit [39] with custom-written software in MATLAB (MathWorks) was performed on every pixel of the PIPO set of images using Eq. (1) as a function of θ and φ, and the parameters R, C and δ were extracted. From PIPO SHG microscopy data, the degree of linear polarization (DLP) of the outgoing SHG signal can also be obtained.
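As a rough illustration of the per-pixel fit, the following Python sketch uses a simplified achiral-only model (C = 0) that follows from cylindrical symmetry with χ_xxz/χ_zxx = 1. It is not the full Eq. (1), and a coarse grid search stands in for the trust region reflective fit. The function names, the 22.5° angle grid, and the R and δ search ranges are illustrative assumptions.

```python
import math

# Nine evenly spaced polarization angles from 0 to 180 degrees (assumed grid).
ANGLES = [22.5 * k for k in range(9)]


def pipo_intensity(theta_deg, phi_deg, r_ratio, delta_deg):
    """Achiral-only (C = 0) PIPO SHG intensity, up to a constant factor."""
    t = math.radians(theta_deg - delta_deg)  # polarizer angle relative to fiber
    p = math.radians(phi_deg - delta_deg)    # analyzer angle relative to fiber
    field = (r_ratio * math.cos(p) * math.cos(t) ** 2
             + math.sin(p) * math.sin(2 * t)
             + math.cos(p) * math.sin(t) ** 2)
    return field ** 2


def fit_pixel(measured):
    """Grid-search R and delta for one pixel.

    measured maps (theta_deg, phi_deg) -> detected intensity.
    Returns (r_ratio, delta_deg) with the smallest squared residual.
    """
    keys = list(measured)
    best = None
    for r_step in range(41):                # R from 1.0 to 3.0 (assumed range)
        r_ratio = 1.0 + 0.05 * r_step
        for delta_deg in range(0, 180, 2):  # delta on a 2-degree grid
            model = [pipo_intensity(t, p, r_ratio, delta_deg) for t, p in keys]
            # The intensity scale enters linearly, so solve it in closed form.
            denom = sum(m * m for m in model)
            scale = 0.0
            if denom > 0:
                scale = sum(measured[k] * m for k, m in zip(keys, model)) / denom
            resid = sum((measured[k] - scale * m) ** 2
                        for k, m in zip(keys, model))
            if best is None or resid < best[0]:
                best = (resid, r_ratio, delta_deg)
    return best[1], best[2]


# Synthetic noise-free pixel generated with R = 2.0 and delta = 30 degrees.
pixel = {(t, p): 5.0 * pipo_intensity(t, p, 2.0, 30.0)
         for t in ANGLES for p in ANGLES}
r_fit, delta_fit = fit_pixel(pixel)
```

In practice a bounded nonlinear least-squares routine would replace the grid search, and the chiral term would be added to the model.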
For each pixel of the image, the average fiber orientation angle, δ, extracted from the fit was used to select the closest PSG angle to the fiber axis, $\theta_c$, such that $|\theta_c - \delta| \leq 11.25^\circ$, where 11.25° is half of an incremental step of the PSG. Since the SHG intensity depends on the angle between the incoming polarization and the collagen fiber axis orientation, the closest PSG angle was selected to decrease the intensity noise. In order to increase the signal-to-noise ratio (SNR), two sets of PSA orientation angles were used to calculate the DLP: $\mathrm{DLP}_1$ was calculated by Eq. (3a) and $\mathrm{DLP}_2$ was calculated by Eq. (3b), where the subscript of the intensity, I, indicates the PSA orientation angle.
$\mathrm{DLP}_1$ and $\mathrm{DLP}_2$ were averaged to obtain the DLP for each pixel of the image. The DLP, R- and C-ratios are presented as color-coded images, and values of δ are presented as orientation maps, where each line (bar) shows the average collagen fiber orientation in the image plane.
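The DLP calculation can be sketched in Python as follows, assuming a Stokes-type definition in which each DLP value is formed from four analyzer angles spaced by 45°. The exact form of Eqs. (3a) and (3b) may differ, and the split into the sets 0°/45°/90°/135° and 22.5°/67.5°/112.5°/157.5° as well as all function names are assumptions made for illustration.

```python
import math


def dlp_from_four(i_a, i_b, i_c, i_d):
    """DLP from intensities at analyzer angles a, a+45, a+90 and a+135 deg."""
    s0 = i_a + i_c          # total intensity
    s1 = i_a - i_c          # linear component along a versus a+90
    s2 = i_b - i_d          # linear component along a+45 versus a+135
    if s0 <= 0:
        return float("nan")
    return math.sqrt(s1 ** 2 + s2 ** 2) / s0


def closest_psg_angle(delta_deg, psg_angles):
    """PSG angle closest to the fiber axis, with orientations taken modulo 180."""
    def distance(angle):
        d = abs(angle - delta_deg) % 180.0
        return min(d, 180.0 - d)
    return min(psg_angles, key=distance)


def pixel_dlp(intensity_by_psa):
    """Average of DLP_1 and DLP_2 for one pixel.

    intensity_by_psa maps the PSA angle in degrees to the SHG intensity
    recorded at the PSG angle closest to the fiber axis.
    """
    i = intensity_by_psa
    dlp1 = dlp_from_four(i[0.0], i[45.0], i[90.0], i[135.0])
    dlp2 = dlp_from_four(i[22.5], i[67.5], i[112.5], i[157.5])
    return 0.5 * (dlp1 + dlp2)


psa_angles = [22.5 * k for k in range(8)]
# Fully linearly polarized SHG at 20 degrees: Malus's law behind the analyzer.
polarized = {a: math.cos(math.radians(a - 20.0)) ** 2 for a in psa_angles}
# Fully depolarized signal: the same intensity at every analyzer angle.
depolarized = {a: 1.0 for a in psa_angles}
```

For the fully polarized example the sketch returns a DLP of 1, and for the depolarized one it returns 0, which are the two limits discussed in the text.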

Texture analysis of various SHG parameters
$I_{SHG}$, R- and C-ratios, δ, and the DLP images were subjected to image texture analysis, in which the relation between each pixel of the image and its nearest neighbors is considered. The texture analysis was performed using a gray-level co-occurrence matrix (GLCM). Fourteen textural parameters can be derived from the GLCM [30], some of which have previously been suggested as relevant to the characterization of collagen SHG intensity images. The GLCM provides a second-order statistical representation of the distribution of gray levels within a specific ROI, which in turn provides the basis for textural analysis. The GLCM is built by counting the occurrences of gray level i next to gray level j at the distance d along the direction γ. After the GLCM is obtained, the probability density function, $P_{d,\gamma}(i, j)$, of finding a given pair of intensities i and j is calculated. GLCM textural analysis therefore considers the variation of pixel gray levels within a certain distance, so that the forms, distributions and variations of the imaged objects, such as collagen fibers, can be tracked. For the textural analysis of the 8-bit images, the GLCM and each textural feature are calculated for each of the directions γ = 0°, 45°, 90° and 135°. The four obtained values are then averaged to account for possible variations regarding, for example, sample positioning on the imaging stage. These angles represent directions of neighboring pixels and should not be mistaken for the PSG or PSA angles. A window size of 3 × 3 pixels was used, which considers all 8 first neighboring pixels and therefore accurately captures structures and variations associated with collagen fiber remodeling. Image processing and texture evaluations were carried out using ImageJ [40] (post-image processing) and a custom-built toolkit incorporating MATLAB image processing toolbox scripts.
When studying collagen remodeling, it has been shown that some textural parameters are more informative than others with regard to the fibrous tissue morphology [35]. A feature selection test based on the Fisher score [41] found only three GLCM features, namely entropy, correlation and contrast, to be significantly relevant for defining structures in collagen fibers in PIPO SHG images obtained from lung biopsies; these three features showed significant differences between tissue types. The three parameters are defined as follows.
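The construction of the GLCM for the four nearest-neighbor directions can be sketched in Python as below. This is a minimal illustration that assumes a symmetric GLCM (each pair counted in both orders), a distance d = 1, and accumulation over the whole image rather than window by window; the function names are hypothetical and do not come from the MATLAB toolkit used in the study.

```python
# Pixel offsets (dx, dy) for a distance of one pixel; rows increase downward.
OFFSETS = {0: (1, 0), 45: (1, -1), 90: (0, -1), 135: (-1, -1)}


def glcm(image, dx, dy, levels):
    """Normalized symmetric GLCM of a 2D list of integer gray levels."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1   # pair (i, j)
                counts[j][i] += 1   # and its mirror, making the matrix symmetric
    total = sum(sum(row) for row in counts)
    if total == 0:
        return counts
    return [[value / total for value in row] for row in counts]


def glcm_all_directions(image, levels):
    """GLCM for each of the 0, 45, 90 and 135 degree neighbor directions."""
    return {angle: glcm(image, dx, dy, levels)
            for angle, (dx, dy) in OFFSETS.items()}


# Tiny two-level example image.
example = [[0, 0, 1],
           [0, 1, 1]]
matrices = glcm_all_directions(example, levels=2)
```

Each textural feature would then be computed from every one of the four matrices and the four values averaged.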
The entropy, S, is defined as:
$$S = -\sum_{i=0}^{N-1}\sum_{j=0}^{N-1} P_{d,\gamma}(i,j)\,\log P_{d,\gamma}(i,j)$$
where N is the number of gray levels. Entropy measures the lack of spatial organization within the computational window and has been applied previously to collagen SHG intensity images [21,42,43]. High entropy means that the probabilities of finding the different gray-level pairs are nearly equal, i.e. the structures within the image are not organized, which corresponds to a rough texture. The correlation parameter, ρ, quantifies the linear dependence of gray levels between two pixels separated by a certain distance d and is defined as:
$$\rho = \sum_{i=0}^{N-1}\sum_{j=0}^{N-1} \frac{(i-\mu)(j-\mu)\,P_{d,\gamma}(i,j)}{\sigma^{2}}$$
where µ is the mean and σ is the standard deviation of the gray levels. When an image shows low correlation, the gray levels are generally independent of one another. In biological images, this can be associated with the absence of regular structure within the image. However, if the correlation is high, then there is a high probability that one or several patterns repeat within the computational window. This parameter can be useful to quantify collagen fiber arrangement in normal and pathological tissues [35]. The third texture parameter, contrast or variability, ν, is defined as:
$$\nu = \sum_{i=0}^{N-1}\sum_{j=0}^{N-1} (i-j)^{2}\,P_{d,\gamma}(i,j)$$
It measures the gray-level variation between a pixel and its neighbors: it is zero when pixels are entirely similar to their neighbors, and it is very sensitive to large differences occurring inside the co-occurrence matrix. The PIPO SHG microscopy images were used for the texture analysis. Each PIPO data set contains 81 polarization slices obtained from the different combinations of PSG and PSA states; all slices were added to create a single polarization-independent SHG intensity image for texture analysis. The texture analysis of the R- and C-ratios, δ and DLP was performed similarly.
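The three features can be computed from a normalized GLCM as in the Python sketch below. It assumes the standard GLCM definitions of entropy, correlation and contrast, a symmetric matrix (so that a single mean and standard deviation describe both pixels of a pair), and the natural logarithm; the function name is hypothetical.

```python
import math


def glcm_features(p):
    """Return (entropy, correlation, contrast) of a normalized symmetric GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    # Mean and variance of the gray levels weighted by the GLCM probabilities.
    mu = sum(i * pij for i, _, pij in cells)
    var = sum((i - mu) ** 2 * pij for i, _, pij in cells)
    # Empty cells contribute nothing to the entropy (p log p -> 0).
    entropy = -sum(pij * math.log(pij) for _, _, pij in cells if pij > 0)
    contrast = sum((i - j) ** 2 * pij for i, j, pij in cells)
    if var > 0:
        correlation = sum((i - mu) * (j - mu) * pij for i, j, pij in cells) / var
    else:
        correlation = float("nan")  # undefined for a single gray level
    return entropy, correlation, contrast


# Perfectly ordered texture: neighbors always share the same gray level.
ordered = [[0.5, 0.0],
           [0.0, 0.5]]
# Maximally mixed texture: every gray-level pair is equally likely.
mixed = [[0.25, 0.25],
         [0.25, 0.25]]
s_ord, rho_ord, nu_ord = glcm_features(ordered)
s_mix, rho_mix, nu_mix = glcm_features(mixed)
```

The ordered example gives correlation 1 and contrast 0, while the mixed example has the higher entropy and a correlation of 0, in line with the interpretation of the features given above.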

Statistical analysis
The statistical parameters extracted from the R- and C-ratios as well as the δ and DLP distributions, to investigate possible differences between collagen in normal tissue and in the different stages of NSCLC, were the median and the median absolute deviation (MAD). The latter is a measure of statistical dispersion, defined as the median of the absolute deviations from the median [44]. It has the advantage of being less sensitive to outliers than the standard deviation. To compare the textural features between collagen in normal tissue and in the different tumor stages, the average of each feature was considered. After all statistical parameters from the PIPO SHG measurements and texture analysis were extracted, the D'Agostino-Pearson and Shapiro-Wilk normality tests were used to assess the normality of the data. Linear mixed-effects models were fitted using GraphPad Prism version 8.1. Differences with p-value < 0.05 were considered statistically significant. The different significance ranges are indicated in each figure legend. To measure possible linear correlation between data sets, a Pearson product-moment correlation coefficient was calculated.
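The dispersion and correlation measures named above can be illustrated with a short Python sketch. The function names and sample values are made up for illustration, and the study itself used GraphPad Prism rather than this code.

```python
import math
import statistics


def median_absolute_deviation(values):
    """Median of the absolute deviations from the median."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Illustrative values: one outlier barely moves the MAD but inflates the
# standard deviation.
sample = [1, 2, 3, 4, 100]
mad = median_absolute_deviation(sample)
std = statistics.stdev(sample)
r = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Here the MAD stays at 1 while the standard deviation exceeds 40, which shows why the MAD is the more robust measure of spread for distributions with outliers.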

Tissue ultrastructure properties
The ultrastructure in each image pixel was examined with the linear PIPO SHG measurements. Examples of scans from normal lung parenchyma and the different tumor stages are shown in Fig. 1. The R-images of tumor samples are dominated by yellow and orange pixels, whereas the control samples have mostly blue and green pixels (Fig. 1(c)). This can also be seen from the R-ratio distribution, which shifts to higher values for tumor samples. The color bar for the C-ratio (Fig. 1(d)) ranges from blue through grey to red, depending on the out-of-image-plane collagen tilt angle, α: blue indicates α < 0, grey represents α = 0 and red shows fibers with α > 0. The C-ratio images show that, although the tissue samples were cut at random orientations, the majority of pixels contain collagen fibers that are oriented around α = 0. The C-ratio images for stage-II and -III show an alternating pattern of red, grey and blue, which indicates 3D waviness of the collagen fibers, with the fiber orientation changing from pointing into the image plane, to being parallel to the image plane, to pointing out of the image plane, respectively. None of the C-ratio distributions is bimodal, in contrast to a previous study of normal tendon, where collagen fibers have an antiparallel arrangement [45].
Column (e) of Fig. 1 shows the fiber orientation (δ) map, where each line (bar) represents the effective cylindrical axis of the collagen fibers within each pixel. The histogram of the collagen fiber orientation is also shown in the bottom left corner of Fig. 1(e) as a polar plot. A reduced waviness can be observed when comparing collagen fibers in tumor samples to those in normal tissue. The angle distributions in tumor samples are also narrower than in the normal samples, which is another indication of the reduction in waviness.
The DLP, which is a measure of the average degree of linear polarization of the outgoing SHG signal from the sample, was also calculated for each pixel using Eqs. (3) and is presented as a color-coded image for each tissue in Fig. 1(f). The DLP distribution is closer to 1 for normal tissue than for tumor samples. This can also be observed from the dominant red pixels in the normal tissue compared to the presence of some blue and green pixels in the tumor samples.
The statistical values extracted from the occurrence frequency histograms of R, C, δ and DLP for all scanned areas in normal tissue and the different tumor stages are presented in Fig. 2. No statistically significant differences were observed between patients in the control samples. Consequently, all the control samples were pooled into one group. The R-ratio in tumor tissue of all stages is higher than in normal tissue, the differences being statistically significant in stage-II and -III (Fig. 2(a)). A trend can be seen between the different tumor stages, where the R-ratio increases as the tumor progresses, but there were no statistically significant differences between stages. An increase in the R-ratio has been previously reported for lung cancer, where only stage-I was studied [17], as well as for human breast [27], pancreas [28] and thyroid [26]. This increase in the R-ratio of tumor indicates a less organized collagen fiber structure, and may be due to molecular changes of the collagen triple helices, a different arrangement of triple helices forming fibers, and/or alteration in the organization of the fibers in the focal volume [17,24]. The SHG intensity and R-values for individual pixels did not correlate linearly (Pearson correlation coefficient of -0.12 for normal lung and -0.07, -0.09 and -0.08 for tumor stages-I, -II, and -III, respectively). The absence of linear correlation between the R-ratio and the SHG intensity indicates that there are additional factors, such as collagen concentration and fragmentation of fibrils, that influence the SHG intensity but are not reflected in the R-ratio to the same extent. The use of the C-ratio distribution has been reported for differentiating between different collagen structures in heart tissue [46]. Here, since the C-ratio strongly depends on the out-of-plane orientation of the collagen fibers and the tissue samples were cut at a random orientation, the median absolute deviation (MAD) of the C-ratio was used instead of the median.
This shows that the C-ratio distribution is narrowest in normal lung tissue and that the difference from the tumor tissue is statistically significant at all 3 stages, indicating a larger spread of the out-of-plane orientation angle (α) that, in turn, indicates a more disorganized structure. It can also indicate alterations in the structure of collagen fibrils, which may influence the chiral $\chi_{xyz}$ or achiral $\chi_{zxx}$ component of the second-order susceptibility tensor of collagen. No significant differences were observed between the different tumor stages. As with the R-ratio, no correlation was found between the C-ratio and SHG intensity for each pixel (Pearson correlation coefficient of 0.01 for normal and -0.01, -0.02, and -0.05 for stage-I, -II, and -III, respectively).
Since the mean and the median of δ over the whole image depend on the sample orientation, they are not reported. Instead, the MAD of the δ values was extracted as a measure of the waviness of the collagen fibrils as well as the overall morphology of the sample. As Fig. 2(c) shows, the MAD values for tumor are smaller than for normal lung tissue, indicating that the collagen fibrils are straighter in tumor. This is also visible in Fig. 1(e), where the orientation lines become more aligned in the tumor samples. While the difference between normal and stage-I is not significant, there are statistically significant differences between the normal tissue and both stage-II and stage-III. Both stage-I and -II follow a trend indicating that the collagen gets straighter as the tumor stage increases, but this did not hold for stage-III NSCLC. A possible reason is that some of the selected ROIs lie around the tumor lump, where an additional macro-scale curvature is imposed on the collagen, which can increase the MAD of δ. Figure 2(d) shows that the median DLP for all three stages is smaller than that for the normal tissue, the differences being statistically significant. However, neither a trend nor a statistical difference was seen between the different tumor stages. Lower DLP means an increased depolarization of the outgoing SHG due to higher ultrastructural disorder of collagen fibrils in the focal volume. It can also be due to fragmentation of collagen into smaller fiber segments within the focal volume. Hence, the SHG is emitted from uncorrelated domains and unpolarized scattering becomes significant compared to polarized SHG radiation. This feature was also observed in thyroid and breast tumors [26,27].
The Pearson correlation test showed very low correlation between the SHG intensity and the DLP (0.13 for normal and 0.26, 0.24, 0.27 for stage-I, -II and -III, respectively), although the correlation was still stronger than that between the R-ratio and the SHG intensity, suggesting that ultrastructural order leading to higher DLP has some influence on the SHG intensity.

Texture analysis of different stages of NSCLC
The texture analysis was conducted on the SHG intensity, $I_{SHG}$, as well as on the R- and C-ratios, δ and DLP parameters extracted from the PIPO images. The $I_{SHG}$ image is obtained by summing all 81 images of the same area recorded at different combinations of incident laser and SHG signal polarizations, and so it is polarization independent. Three textural features (entropy, correlation and contrast) in the SHG intensity images showed significant differences between tissue types. Figure 3(a) shows the average entropy for normal tissue and all tumor stages. This is lowest for normal tissue and increases with stage, indicating decreasing uniformity and possibly increasing fragmentation. Such a trend has also been observed in other diseases, such as asthma and atherosclerosis [34,47], showing that collagen remodeling is present when there is incidence of fibrosis and/or scar formation. Figure 3(b) shows the correlation textural feature, with higher values for normal tissue than for all tumor stages, the differences for stages-II and -III being statistically significant. The higher correlation value for normal tissue indicates a more regular structural pattern, i.e. tumor tissues (especially stages-II and -III) have more isolated and fragmented collagen fibers, so that there is less probability of patterns of grey levels repeating within the computational window. Stages-II and -III were not statistically different from one another.
The contrast texture feature shown in Fig. 3(c) represents regions with varying SHG intensity in the neighboring pixels, higher values being associated with larger variability of gray levels within the computational window. Normal collagen shows a statistically significant higher contrast compared to all three NSCLC stages. The contrast decreases as a function of tumor stage; however, there are no significant differences between stages. Higher contrast indicates that normal fibers have better order, i.e. are less fragmented, leading to brighter SHG than the surroundings.
The entropy, correlation and contrast textural features for images of the R- and C-ratios, δ and DLP did not show statistically significant differences, but some trends are apparent. Firstly, the R-ratio entropy decreases in stage-I and stage-II tumor compared to normal tissue, while the correlation and contrast exhibit no differences between normal and tumor samples. From the ultrastructure analysis we know that the mean of the R-ratio shifts to higher values, and according to the texture analysis result this increase occurs consistently throughout the image. Secondly, in the δ-value texture analysis the correlation parameter shows a decreasing trend for tumor compared to normal tissue. Since correlation quantifies the dependence of two pixels separated by distance d, lower correlation means that the pixels are generally independent of one another. This can be explained by high spatial-frequency waviness of the fibers in the tumor. A high spatial-frequency waviness of collagen occurs on a micron scale, while the whole fiber gets straighter on a scale of a few tens of microns, as explained by the MAD of δ. On the other hand, the fibers in normal tissue show low waviness on a micron scale while having a wavy structure on a larger scale of tens of microns. Lastly, the contrast parameter in the DLP texture analysis shows higher values for normal than for all stages of tumor tissue, indicating a better order in the normal collagen fibers. This is similar to the contrast parameter of the SHG intensity texture analysis. The fact that there is higher correlation between DLP and SHG intensity than between the R-ratio and SHG intensity indicates that DLP is more sensitive to collagen disorder.

Discussion
We have used polarimetric SHG microscopy to study the impact of NSCLC tumor stage on the collagen structure. The ultrastructure analysis provides R, C, δ and DLP parameters for each image pixel, yielding information about the ultrastructural organization of collagen fibers within pixels (through R, C, and DLP) and about the effective collagen orientation (through δ). In addition, texture analysis of these parameters and of the SHG intensity was used to investigate the microstructural variation of collagen fibers between the nearest neighboring pixels in the computational window that is probed across the scanned area. Three textural parameters were extracted: entropy, correlation and contrast.
The R-ratio distribution showed lower values in normal tissue and increased with tumor stage. One possible mechanism is the re-organization of collagen fiber matrices. This is due to the overexpression of ECM proteinases [48] and an increase in collagen fiber stiffness, demonstrating that the ECM organization and stiffness are key regulators of tumor progression. A correlation between integrin α11 expression and collagen stiffness has been reported [15], and expression of collagen-binding integrins is known to remodel collagen type-I by mediating contraction of the collagen lattices [15]. Images obtained from atomic force microscopy have revealed that collagen fibers are less organized and that the Young's modulus varies significantly as a function of integrin α11 expression [15]. These observations could be linked to ultrastructural properties of collagen, which impact the R-ratio values.
The δ-distributions showed that collagen in tumor tissue has increased fiber alignment compared to the normal tissue. The lower values of the contrast parameter of the SHG intensity can also be interpreted as a better aligned and oriented structure. This behavior has been shown in another study on NSCLC [15] as well as in other types of carcinoma, including breast [27,49,50,51] and pancreas [16]. An increase in collagen fiber alignment and orientation is one of the proposed mechanisms of metastatic spread from solid tumors [52].
The MAD of the C-ratio distributions trended toward an increase with tumor stage. Since the C-ratio is a function of sin α, this may indicate that the spread of collagen fibers out of the image plane is increasing, in contrast to what is observed for δ. The random-orientation sectioning of the tissue should eliminate the differences between the in-plane and out-of-plane angle distributions. This points to an increase in the width of the molecular $\chi_{xyz}/\chi_{zxx}$ distribution in tumor, possibly due to collagen fragmentation.
The DLP values were smaller in malignant than in normal lung tissue, indicating increased disorder in the collagen fibers of tumor tissue. Collagen fragmentation is consistent with the increase in the entropy in the SHG intensity texture analysis, which is lower for normal tissue and increases with tumor stage, indicating loss of structural organization. The correlation parameter from texture analysis of the SHG intensity also suggests that the collagen fibers in the computational window are more fragmented in malignant tissue. It has been previously shown that integrin α11 can influence the linearization of collagen fibers adjacent to the tumor cells [15] and that disorganized collagen fibers at the tumor cell/stroma interface can be created as the result of direct contact between tumor cells and cancer-associated fibroblasts (CAF) [52].

Conclusions
The combination of SHG polarimetric analysis and texture analysis reveals significant differences in the collagen structure between NSCLC and normal lung tissue in the R, C, δ, and DLP parameters from polarimetric measurements, and in the entropy, correlation and contrast from the SHG intensity texture analysis. With tumor progression, the collagen structure becomes more disordered and fragmented on the submicron level, while at the same time the remodeled fibers become straighter and preferentially microstructurally aligned. Hence, polarimetric SHG microscopy in combination with texture analysis can provide a powerful tool to investigate the extracellular tumor microenvironment. The extension of this study to large-area tissue investigation and its combination with other histopathology techniques may be of value for more accurate cancer diagnostics and staging.

Disclosures
The authors declare that there are no conflicts of interest related to this article.