Machine learning classification of human joint tissue from diffuse reflectance spectroscopy data

Objective: To assess whether incorporating DRS sensing into real-time robotic surgery systems has merit. DRS as a technology is relatively simple, cost-effective and provides a non-contact approach to tissue differentiation. Methods: Supervised machine learning analysis of diffuse reflectance spectra was performed to classify human joint tissue that was collected from surgical procedures. Results: We used supervised machine learning to classify a DRS human joint tissue data set and achieved classification accuracy in excess of 99%. Sensitivities for the individual classes were: cartilage 99.7%, subchondral bone 99.2%, meniscus 100% and cancellous bone 100%. The full wavelength range is required for maximum classification accuracy. The wavelength resolution must be finer than approximately 8nm. An SNR better than 10:1 was required to achieve a classification accuracy greater than 50%. The 800-900nm wavelength range gave the greatest accuracy amongst those investigated. Conclusion: DRS is a viable method for differentiating human joint tissue and has the potential to be incorporated into robotic orthopaedic surgery. © 2019 Optical Society of America under the terms of the OSA Open Access Publishing Agreement


Introduction
Laser surgery combined with robotic control in orthopaedics is still a developing field which provides the opportunity for more precise surgery, new surgical techniques, and the ability to work remotely with a high level of sterility [1][2][3]. However, the process of laser surgery does not provide tactile feedback, which is useful for surgeons to determine the type of tissue being ablated and for controlling the ablation depth [4]. Hence there is a risk of iatrogenic damage [5][6][7].
When light is applied to tissue, different wavelengths are absorbed and scattered to different degrees depending on the tissue's optical properties [8]. This phenomenon can be measured through a process known as Diffuse Reflectance Spectroscopy (DRS). This study investigates the DRS of human joint tissue to determine whether tissue types can be distinguished automatically from the generated spectral data. Spectra were acquired using an optical-fibre-coupled spectrometer with a Si CCD detector and a minimum wavelength resolution of 2.0 nm, with the samples illuminated by a standard halogen lamp. The resulting data set comprised spectra from cartilage, subchondral bone, meniscus, and cancellous bone specimens. We also investigated the effects of spectrometer resolution and signal-to-noise ratio (SNR) on classification accuracy.
The objective of this study was to prove with strong statistical significance that the incorporation of DRS sensing into real-time robotic surgery systems has merit. Current orthopaedic surgical technology available in clinical practice relies on specialists (surgeons) to visually differentiate tissue. DRS as a technology is relatively simple, cost-effective and provides a non-contact approach to tissue differentiation. Its use as a diagnostic tool has already been demonstrated on various tissue types, including bladder [9], where elastic-scatter spectra were obtained using a fibreoptic probe incorporated in a urological cystoscope; brain [10,11], where near-infrared (NIR) optical properties were characterised by measurement of spatially resolved diffuse reflectance; breast [12,13], using elastic scattering spectroscopy mediated by fibreoptic probes; cervix [14], using reflectance spectroscopy; colon [15,16], using diffuse reflectance spectroscopy; oesophagus [17,18], using fluorescence, reflectance, and light-scattering spectroscopy; ovary [19], using reflectance spectroscopy; pancreas [20,21], using optical spectroscopy; and skin [22,23], using near-infrared spectroscopy.

Hardware
Human joint tissue, in the form of bone and soft tissue specimens, was collected from routine total knee replacement surgeries over a 3-month period after clinical ethics approval and patient consent had been obtained.
Optical spectra were generated through DRS from a set of 3043 human joint tissue samples which included cartilage, subchondral bone, cancellous bone and meniscus. These spectra had a wavelength range of 200-1000nm and were collected under consistent conditions. The spectrometer (Fig. 1) used for collection was the portable Ocean Optics USB-650 Red Tide spectrometer [24], which detected the wavelength-dispersed light with a linear silicon CCD array detector and was operated with a wavelength resolution of 2nm. The light source was a 150W fibre-coupled halogen lamp, connected to a light ring to standardise the spread of illumination across the tissue sample. In total, spectra from 1579 cartilage, 1269 subchondral bone, 156 cancellous bone and 39 meniscus samples were collected.

Sample collection and preparation
Human joint tissue was collected by two orthopaedic surgeons during the course of total knee replacement operations. The tissue was transported to the sensing laboratory, sensed and stored in a freezer within 72 hours of collection. Figure 2 illustrates the supervised machine learning workflow used for tissue identification. All computations were performed using the WEKA machine learning toolkit [25] based on the normalised spectra. The data set consisted of 3043 spectra whose measurements spanned 2048 wavelength channels. Each of these wavelength channels was regarded as an attribute for identifying the associated tissue class. There were four tissue classes: cartilage, subchondral bone, cancellous bone, and meniscus.

Software
As part of a supervised learning procedure, the first step was to identify the samples in order to create a ground truth for each one. This identification was performed by clinical orthopaedic surgeons based on the shape, colour, presentation and, most prominently, the location from which each sample was removed from the patient.
The second and third steps involved the normalisation and dimensionality reduction of the spectra respectively. Normalisation began with division by the light source spectrum, followed by the application of a standard normal variate (SNV) [26] transformation to centre and scale the spectra. The average spectrum and standard deviation for each tissue class were then calculated based on this normalised form. This enabled both the inter-class variation and the intra-class variation to be measured. Dimensionality reduction involved reducing the number of attributes or wavelengths associated with each spectral sample. This was achieved through Multiclass Fisher's Linear Discriminant Analysis (Multiclass FLDA) [27] and resulted in each sample having only 3 identifying attributes.
The final step was the comparison of available classifiers to determine which best agreed with the ground truth. This was achieved through Linear Discriminant Analysis (LDA) [28], with 10-fold cross-validation being used to determine the resulting classifier accuracy. This involved splitting the data into 10 sets, with 9 sets used for training and 1 set used for testing. This was repeated 10 times, so that each set served once as the test set, and the mean accuracy across the iterations was recorded.
The quality and quantity of data required for efficient classification were determined by investigating how the classification accuracy changed depending on the spectral range, resolution and SNR. Changes resulting from the range were determined by running the classifier on 100nm segments of the spectra. This provided a comparative accuracy measure for each wavelength range that can be useful in identifying regions better suited to tissue identification. Changes resulting from the resolution and SNR were determined by artificially degrading the data using a combination of linear interpolation and added noise. The level of noise that could be tolerated was determined by adding noise to 10 samples. The classifier was then run on this reduced set, with the degree to which the results differed from the originals being used as an indicator of how much noise could be tolerated.
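The normalisation described above can be illustrated with a minimal sketch in plain Python. It is an illustration only, not the WEKA workflow used in this study: the function names, the `eps` guard against division by zero, the use of the sample standard deviation, and the toy five-channel spectra are all assumptions made for the example.

```python
import math

def divide_by_source(spectrum, source, eps=1e-12):
    """Divide a raw spectrum by the light-source spectrum, channel by channel.

    `eps` is an assumed guard that avoids dividing by zero in dark channels.
    """
    if len(spectrum) != len(source):
        raise ValueError("spectrum and source must have the same number of channels")
    return [s / max(ref, eps) for s, ref in zip(spectrum, source)]

def snv(spectrum):
    """Standard normal variate: centre a spectrum on its own mean and
    scale it by its own (sample) standard deviation."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (n - 1))
    if std == 0:
        raise ValueError("a flat spectrum cannot be SNV-scaled")
    return [(x - mean) / std for x in spectrum]

# Toy values (hypothetical, not measured data)
raw = [120.0, 180.0, 260.0, 300.0, 240.0]
lamp = [100.0, 150.0, 200.0, 250.0, 200.0]
normalised = snv(divide_by_source(raw, lamp))
```

After this transformation each spectrum has zero mean and unit standard deviation across its channels, so differences in overall intensity between measurements no longer dominate the comparison.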
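The 10-fold cross-validation procedure can be sketched as follows. To keep the example dependency-free, a nearest-centroid classifier stands in for LDA; this substitution, the function names, the random seed and the synthetic two-class data are assumptions made for illustration and do not reproduce the WEKA implementation.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def fit_centroids(X, y):
    """Stand-in classifier (NOT LDA): store the mean feature vector per class."""
    sums, counts = {}, {}
    for features, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, value in enumerate(features):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def predict(centroids, features):
    """Return the label of the nearest centroid (squared Euclidean distance)."""
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(features, centroids[label]))
    return min(centroids, key=distance)

def cross_validate(X, y, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, and average the accuracy."""
    accuracies = []
    for held_out in k_fold_indices(len(X), k, seed):
        test_set = set(held_out)
        train = [i for i in range(len(X)) if i not in test_set]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k

# Synthetic data: two well-separated classes with 3 attributes each
rng = random.Random(1)
X = ([[rng.gauss(0, 0.1) for _ in range(3)] for _ in range(50)]
     + [[rng.gauss(5, 0.1) for _ in range(3)] for _ in range(50)])
y = ["cartilage"] * 50 + ["meniscus"] * 50
mean_accuracy = cross_validate(X, y)
```

Each sample is used for testing exactly once and for training nine times, so the mean accuracy is estimated from predictions on data the model did not see during fitting.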
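One way such artificial degradation might be implemented is sketched below. This is an assumption-laden illustration rather than the procedure used in the study: resolution is reduced here by averaging adjacent channels instead of linear interpolation, the SNR is taken to mean the ratio of the mean absolute signal to the noise standard deviation, and the function names and synthetic spectrum are hypothetical.

```python
import random

def reduce_resolution(spectrum, factor):
    """Average each run of `factor` adjacent channels into one coarser channel.

    A final run shorter than `factor` is averaged over the channels it has.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    coarse = []
    for start in range(0, len(spectrum), factor):
        chunk = spectrum[start:start + factor]
        coarse.append(sum(chunk) / len(chunk))
    return coarse

def add_noise(spectrum, snr, seed=0):
    """Add zero-mean Gaussian noise whose standard deviation is
    mean(|signal|) / snr (an assumed definition of SNR)."""
    rng = random.Random(seed)
    level = sum(abs(x) for x in spectrum) / len(spectrum)
    sigma = level / snr
    return [x + rng.gauss(0.0, sigma) for x in spectrum]

# Synthetic 2048-channel spectrum (hypothetical values)
spectrum = [float(i % 16) for i in range(2048)]
coarse = reduce_resolution(spectrum, 4)   # 4x coarser sampling, 512 channels
noisy = add_noise(spectrum, snr=10.0)     # 10:1 SNR under the assumed definition
```

Running the classifier on `coarse` or `noisy` versions of the data at a series of factors and SNR values would then give accuracy as a function of resolution and noise level.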

Results
A total of 30[...] average DRS [...] or otherwise [...] their class identification (24,25). [...] the greatest [...] classifier.

Fig. [...]. [...] wavelength resolution: the same machine learning [...] multiclass FLDA dimension reduction followed by [...]. The resolution was altered in software by averaging over [...].

[...] classification of human joint tissue, we employed normalised [...] spectroscopy data and WEKA machine learning tools that consisted of dimensionality reduction followed by LDA classification. This [...] promising results with high classification accuracy. The receiver operating characteristic (ROC) curve passes through the upper left corner, reflecting the high [...]. [...] curve is a plot of the false positive rate (1 - specificity) along the x-axis and the true positive rate (sensitivity) along the y-axis [29]. Previous optical spectroscopy [...] had success with principal component analysis (PCA) for [...] [30]; however, the greatest accuracy we achieved was 88.04% [...] Neighbours) classifier after PCA.

[...] for the variation in spectra between different tissue classes is the [...] tissue, such as hemoglobin, lipids, collagen and water. Hemoglobin [...] in the 400-450nm wavelength range but may also be strongly [...] range. Lipids have an absorption peak in the 900-1000nm range [...] absorbed in ranges above 1000nm. Collagen peaks around the 1500nm [...] absorbed above the 1000nm range as well. Water is weakly [...] wavelengths but has increased absorption in both the UV (100-400nm [...] range) regions (26)(27)(28). In our investigation, the range 600-[...] three wavelength segments for classification accuracy. This may be [...] absorption below 600 nm and dominant water and fat absorption [...] tissue discrimination.

[...] performance can often arise due to inconsistent collection [...] large quantities of spectral data over an extended period. [...], total spectral energy, noise and background of spectra can be [...] the light source, measurement distance, or the spectrometer [31]. [...] difficult to compare and can cause performance deterioration with [...] similar materials do not appear similar. It may also cause a classifier [...] importance to features that are merely artefacts of the inconsistent [...] this problem, many spectral classification solutions include a normalisation pre-processing step to adjust for these variations. Common approaches for normalising spectra include constant shifts, smoothing, scaling, SNV, baseline correction and continuum removal [31][32][33][34][35][36].

It was recognised that a class imbalance was present in this work, given the large differences between class sample sizes. This was investigated to determine whether it contributed to the positive results. Of all the tissue classes used, meniscus was the smallest, with a data set of 39. The same number of samples was taken from each of the remaining classes and the classifier was run on the new data set. Regardless of this reduction, it still achieved a classification accuracy above 99%. This demonstrates that suitable discrimination exists between the classes, independent of any influence from their sample sizes.
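The class-balance check can be sketched as random undersampling of every class down to the size of the smallest one. How the 39 samples per class were actually selected is not stated, so the random selection, the seed and the function name below are assumptions; only the class sizes are taken from this work.

```python
import random
from collections import defaultdict

def undersample(labels, seed=0):
    """Return sorted sample indices in which every class keeps as many
    samples as the smallest class (random selection without replacement)."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    smallest = min(len(indices) for indices in by_class.values())
    rng = random.Random(seed)
    kept = []
    for indices in by_class.values():
        kept.extend(rng.sample(indices, smallest))
    return sorted(kept)

# Class sizes reported for the data set
labels = (["cartilage"] * 1579 + ["subchondral bone"] * 1269
          + ["cancellous bone"] * 156 + ["meniscus"] * 39)
balanced = undersample(labels)   # 4 classes x 39 samples = 156 indices
```

The classifier would then be trained and cross-validated on the spectra selected by `balanced` only, so that no class can dominate through sample count alone.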
Cross-validation is a technique used to evaluate predictive models by partitioning the original sample into a training set to train the model, and a test set to evaluate it [37]. 10-fold cross-validation is a common form of this, where the data set is broken into ten different sets, nine of which compose the training set while the remaining one is used as the test set. The classification accuracy was above 99% using this technique. This accuracy is comparable to that achieved by Stelzle et al. [38], where Principal Component Analysis (PCA) and Quadratic Discriminant Analysis (QDA) were used to achieve 94.8% accuracy on 12,150 pig tissue measurements. However, our method reached a comparable level of accuracy from approximately 4 times fewer measurements. Figure 5 demonstrates that accuracy does not degrade below 90% until the resolution becomes coarser than approximately 8nm. A colour camera has a spectral resolution of around 70-100nm [39]. This suggests we could achieve around 80% accuracy with a normal colour camera. At a 10:1 signal-to-noise ratio, the classifier was only able to classify correctly approximately 50% of the time; hence the classifier requires a signal-to-noise ratio better than 10:1.

Conclusion
We have used supervised machine learning in the classification of a DRS human joint tissue data set and achieved classification accuracy in excess of 99%. We employed a halogen light source and a spectrometer for DRS of human joint tissue collected from surgical procedures. We collected a data set comprising diffuse reflectance spectra of 4 types of human joint tissue in the wavelength range 200nm-1030nm. They included 1579 cartilage, 1269 subchondral bone, 156 cancellous bone and 39 meniscus samples. Tests on the data set revealed that the full wavelength range is required for maximum classification accuracy, that the wavelength resolution must be finer than approximately 8nm, that an SNR better than 10:1 was required to achieve a classification accuracy greater than 50%, and that the 800-900nm wavelength range gave the greatest accuracy amongst those investigated.

Disclosures
The authors declare that there are no conflicts of interest related to this article.