Towards real time assessment of intramuscular fat content in meat using optical fibre-based optical coherence tomography

We consider the use of optical coherence tomography (OCT) imaging to predict the quality of meat. We find that intramuscular fat (IMF) absorbs infrared light about nine times stronger than muscle, which enables us to estimate fat content in intact meat samples. The method is made very efficient by extracting relevant information from the three-dimensional high-resolution images generated by OCT using principal component analysis (PCA). The principal components are then used as regressors into a support vector regression (SVR) prediction model. The SVR model is found to predict IMF content stably and accurately, with an R^2 value of 0.94. Our study paves the way for automated, contact-less, non-destructive, real time classification of the quality of meat samples.


Introduction
Intramuscular fat (IMF) is an important quality attribute of meat that determines the eating experience of customers.In general, it positively influences juiciness, tenderness, flavour and the overall acceptability of meat in different breeds (Hocquette et al. (2010)).
The meat industry faces challenges in marketing meat products according to their qualities due to the high variability and heterogeneity of samples (Hocquette et al. (2010)).Standard quality assessment methods, chemical or mechanical, are often destructive and slow, have huge sampling errors, and require large expenditure on personnel and equipment.Moreover, they are often not adaptable to an "in-line" usage.In particular, internationally recognized methods for IMF analysis, such as Soxhlet solvent extraction, the Folch method, or gravimetric measurements of the extracted fat weight (Silva et al. (2015)), all have to be performed in a dedicated laboratory, outside the meat processing plant, and require minced meat for analysis.Overall, the evaluation, prediction, and control of quality attributes of individual pieces of meat in processing plants remains elusive.
To address these challenges, a number of techniques have been investigated over the past few years for application to meat quality attribute measurements, including visible or near-infrared (Vis-NIR) spectroscopy (Prieto et al. (2009)), hyper-spectral imaging (Feng et al. (2018); Lohumi et al. (2016)), and ultrasound imaging (Simal et al. (2003)).These techniques are non destructive, but still have severe drawbacks.For a start, they mostly only determine an average composition across an entire sample, as there is in general a trade off between spectral resolution and imaging speed.The low spatial resolution of Vis-NIR spectroscopy, e.g. is a disadvantage when analysing highly non-homogeneous meat samples such as whole steaks.Hyper-spectral cameras are also limited to surface examination.Ultrasound imaging can probe deep inside samples but requires direct contact.Finally, hyper-spectral cameras and Vis-NIR spectroscopy both require an external light source, making calibration and comparison across processing plants difficult in the long term.
Clearly, there is a need for a fast, reproducible, and non-destructive method to predict meat quality in real time.In this work, we consider optical coherence tomography (OCT) and show that this contact-less, three-dimensional, high-resolution imaging modality holds great promises for this task.OCT can be understood as a light-based version of ultrasound imaging.It generates images by measuring the intensity of light back-scattered from the depth of a sample.Unlike ultrasound where the delay between the original sound and its echoes can be directly measured electronically, OCT uses white-light (low-coherence) interferometry to measure the time of flight from reflections (Boudoux (2016)).OCT produces a tomographic map of a sample by moving the light beam in a raster pattern.The depth profile at each location is called an A-scan (Axial).A line of A-scans form a B-scan, which is a 2D slice through the sample.A series of B-scans, separated by a given distance, produces a volume or C-scan, which shows the structure of the measured sample volume.Typical OCT systems that provide a lateral resolution of 5-20 µm, and kHz to MHz A-scan acquisition rates, are nowadays commercially available for fast volume imaging, and commonly applied to semi-transparent tissues, such as the skin, eyes, and some internal organs (Bouma (2001)).
Continuous high resolution imaging of meat samples with OCT can in principle be implemented in a processing plant setting and will provide a lot of detailed information.However, the amount of data generated, combined with the large spatial variability of the samples (Chen et al. (2014)), significantly complicates classification.To address this problem, we combine OCT with two machine learning techniques.As a first step, we use principal component analysis (PCA) to reduce the dimensions of the complex imaging data set.PCA projects a set of observations to a new space spanned by a few number of dimensions, or components, orthogonal and uncorrelated to each other in such a way that each successive component accounts for a decreasing amount of variability in the data.This procedure filters out noise and reveals hidden structure in the data (Shlens (2014)).For each sample, the first few principal components are then used as input variables (regressors) in a regression model for classification.That model is determined by another machine learning technique, namely support-vector regression (SVR), which has emerged as an elegant tool for solving regression problems based on a training data set (Vapnik (1999); Dibike et al. (2001); Drucker et al. (1997); Basak et al. (2007)).It is a form of supervised learning, also known as support-vector machine.Reducing the number of variables through PCA prior to SVR helps to make the overall classification very efficient.
In this paper, we test this approach by analysing small sections of beef shortloin from Wagyu and Friesian bull animals to make and test the regression models, and we show that PCA-based feature extraction of OCT images is a viable alternative for fast prediction of meat quality, with the potential for real time implementation in the near future.

Meat samples
M. longissimus thoracis et lumborum (striploin) comprising of Wagyu X Friesian (n = 52) and Friesian bull (n = 130) were collected from a local abattoir during three processing days.The Wagyu samples were obtained by excising a single striploin from animals across a range of marble scores to ensure a wide range of IMF values (Scored 2-6) at approximately 16 hours post-mortem.
The striploins were packaged individually in unsealed vacuum bags and placed in cartons and held at −1.5 • C until approximately 24 hours post-mortem.The Friesian bull samples were obtained from a hot-boning process at a local abattoir, with both the left and right striploins being ex- At approximately 24 hours post-mortem, samples were cut into five steaks each.70 of these samples -Wagyu (n=24) and Friesian bull (n=46) -were allocated to our OCT study.The remaining samples were used in other studies and are not considered here (Dixit et al. (2020)).All sides (n=6) of each of the 70 samples were imaged on the same day at room temperature (20 • C).
After OCT imaging, the steaks were freeze dried and the fatty acid composition was measured using gas chromatography as outlined in Craigie et al. (2017) to provide comparison data.Figure 1 shows the distribution of IMF % of the Friesian bull (n=46), Wagyu (n=24), and total (n=70) dataset.

OCT system
Measurements are taken using a swept-source OCT system as shown in Fig. 2.This system is custom-built using a combination of single mode optical fibers and free space optics but the overall configuration otherwise follows standard OCT practice (Fujimoto et al. (2000)).The light source is a commercially available wavelength-swept laser (Axsun Technologies Inc, Billerica, Massachusetts) with a central wavelength of 1310 nm, bandwidth of 100 nm, 50 kHz sweep rate, and an average output power of 33 mW.Light back-scattered by the sample is mixed with a reference beam and the resulting interference intensity patterns are detected by balanced photo detectors (BPD).This is done separately for the two orthogonal polarization directions (respectively V and H), which are split with a polarization beam-splitter (PBS).The system is thus polarization-sensitive and in principle able to detect sample birefringence, but this feature was not used in the present study.
Rather, we only considered the total combined intensity of the two balanced photo detectors as our OCT signal.
As the wavelength is swept, standard processing of the OCT signal leads to A-scans, which are effectively a measure of the total backscattered light reflectivity R(z) at the beam position as a function of depth z inside the sample.B-scans and C-scans are then constructed by raster scanning the laser beam across the sample using two rotating mirrors, each controlled by a galvanometer (only one is shown in Fig. 2).There are 714 A-scans in a B-scan (along what we define as the x-axis), and 150 B-scans in a C-scan (respectively, y-axis).Each C-scan taken by our system represents a volumetric image of 1 cm × 1 cm (x-y cross-section) × 2.5 mm (depth, z).Several of such C-scans are measured in a grid-like pattern for each face of our 70 samples.

Attenuation calculations
The total backscattered light reflectivity R(z) is related to the reflectance ρ of a scatterer at depth z in the sample and to the attenuation coefficient µ t (z) according to the Beer-Lambert law (in the single scattering approximation) (Schmitt et al. (1993)), (1) The Beer's law relates intensity of light passing through a section of sample to the thickness of the sample as a simple exponential decay of the light incident on the sample.The attenuation coefficient µ t (mm −1 ) describes this decay.The factor of 2 in the exponential term accounts for the double pass of the light, in and out of the sample.
By taking the natural logarithm of R(z) in Equation ( 1) and differentiating the result, we obtain Equation ( 2) shows that the gradient of ln R(z) is proportional to the local value of the absorption coefficient µ t .This quantity can thus be obtained very simply from A-scans as a function of depth.
As µ t (z) is dependent on the sample optical properties, it can be used to differentiate intramuscular fat from muscle.
We illustrate this aspect in Fig. 3 As shown, the gradient of the line of best fit of the features shown in these plots enable us to estimate the absorption coefficient µ t of muscle and IMF.

Attenuation heat-map on OCT data
To automate the analysis of the attenuation, an attenuation "heat-map" is built for each C-scan by downsampling the data.Specifically, we define a grid of small voxels across the C-scans, with voxels large enough to span across multiple A-scans.We first find the gradient of the line of best fit for each section of A-scan data in each voxels following the procedure above to extract the attenuation coefficient.Gradient values are then averaged out across voxels to give an effective average attenuation coefficient, µ t , for each voxel.We use the word 'effective' here because in Fig. 3(c)].While the negative average gradients could be considered artefacts, they still carry some information about the sample which is beneficial to our machine learning approach.Therefore, we do not need to filter those values out, which simplifies signal processing.
As an example, we show in Fig. 4(a) the attenuation heat-map corresponding to the B-scan plotted in Fig. 3(a).Here, we only show positive values for better contrast.Red pixels are associated with high attenuation and therefore represent fat.We can observe that large fat deposits (such as on the left) absorb all the light, shadowing any structures underneath.To take that into account in our analysis, we automatically remove data with a signal-to-noise ratio (SNR) below a certain threshold, limiting the depth of the images.The sections above the surface of the samples, including the initial reflection artifact, are similarly removed.Note that the size of voxels was determined empirically: large enough to improve SNR and get enough information to predict IMF %, but not exceeding the size of marbling, which is typically in the mm range.

Histogram and distribution of IMF in the samples
To increase processing speed and further reduce the complexity and dimensions of the data, we Representing our data in terms of histograms offer a one-dimensional view of the entire fat content across the volume of a sample.We argue that such histograms are more effective in determining IMF % than counting, in the heat-maps, the individual voxels associated with IMF or meat muscle.This latter method lead to ambiguity for voxels containing a mixture of muscle and IMF and which present intermediate attenuation values.This occurs in particular at interfaces between muscle and IMF.Our histogram representation handles such cases graciously without discarding any voxel data.Overall, histograms offer a more faithful picture of the fat content of a sample.One-dimensional data-sets such as histograms are also more adapted to dimension reduction tools like PCA and spatial-spectral regression techniques like SVR, leading to better prediction models.Note that because IMF is distributed non-uniformly in each meat sample, the histograms we use in our analysis include data from all C-scans collected from a sample.

Machine learning techniques
Principal Component Analysis (PCA) is used to reduce the number of independent variables in the histograms (bins = 200) to build an effective and robust regression model (Balabin & Lomakina (2011)).The histograms of the samples are normalised to the area under the curve prior to PCA.
The principal components (PCs) obtained from the PCA output are used as regressors for the SVR analysis while the IMF % obtained using industry standards (Fig. 1) are used as the dependent variables.
The SVR model was developed using 55 samples for the training data set, and was used to All the analysis performed in our study was done in standard Python 3.6, with the Scikit-learn library for the PCA and SVR algorithms.

Heat-maps and attenuation histograms
To build the attenuation heat maps, voxel size was chosen as large as possible to optimize computational speed without degrading the PCA grouping.This led to voxels 600 µm (along x) × 285 µm (along y) × 40 µm (depth, z).From the heat maps we have also quantified the difference in attenuation between fat and muscle.Fat scatters light more than muscle resulting in stronger light attenuation, as shown by the steeper slope highlighted in Fig. 3(c) in comparison to that seen in Fig. 3(b).The average attenuation coefficients of muscle and fat were found as: Clearly, the difference is significant and underlies our ability to characterise IMF % with OCT.

PCA results
Normalized histograms show clear differences between low and high IMF % samples as was shown in Fig. 4(b).The PCA analysis of the histograms was able to successfully group the samples according to high IMF % and low IMF %.This is illustrated in Fig. 5(a) in which the points associated with high fat samples (yellow tint) are grouped together and clearly separated from the low fat ones (blue tint).We found that we could limit the PCA analysis to only five principal components (PCs), as these were sufficient to account for 98.4 % of the variance in the histograms of different samples as shown in Fig. 5(b).

Modelling IMF % using SVR
For the SVR modelling, the first five PCs of the PCA model are used as the regressors, or input variables, while the IMF % obtained from gas chromatography is the dependent variable.The model was tested with a number of PCs from 1 to 6, but five PCs was found to lead to the best outcome.
As shown in Table 1, the SVR prediction model shows good stability and prediction accuracy over several trials (using different training sets) with an average mean squared error (MSE) of 1.62 IMF % and average mean absolute error (MAE) of 1.09 IMF %.The regression model fit of trial 2 is shown in Fig. 6.The high R 2 value of 0.94 also indicates good prediction and generalisation performance of the developed SVR model.

Discussion
IMF absorbs 1300 nm wavelength light stronger than muscle or water.When imaging meat samples with OCT, this difference can be detected in the OCT signal, enabling tissue discrimination and measurement of IMF %.We have shown that an SVR prediction model of IMF % based on the OCT data has an R 2 value of 0.94 and an average mean absolute error of 1.09 IMF %, making it a promising tool to predict meat quality.The prediction accuracy of our OCT-based SVR model is better or as good as the models developed using various other techniques such as near infrared (NIR) OCT however has the potential to enable in-line non-destructive testing of individual samples at a comparatively low cost, which none of these other techniques can provide.
NIR spectroscopy is the most extensively used and studied technique for meat quality assessment due to its low cost and high fat sensitivity.Like other spectroscopic techniques, including Raman and bioelectrical impedance spectroscopy, it works best with minced meat.However, sampling uncertainties typically lead to large prediction errors when considering intact heterogeneous meat samples (Cheng et al. (2015)).Spectroscopic techniques also all have the drawback of providing little or no spatially-resolved information about the samples.Hyper-spectral imaging, which combines advantages of spectroscopic and imaging techniques, partially addresses that issue.However, long data processing times hinder real time applications.Additionally, the use of an external light source makes calibration across processing plants difficult in the long term.All these techniques are also essentially limited to surface examination, and cannot probe deep into meat samples.In comparison, OCT, which can probe beneath the surface of the meat samples with high resolution, leads to a high prediction accuracy even with intact samples.In addition, data acquisition and data analysis can be done in real time.
Freezing and thawing of the samples used in our OCT study results in water loss.This loss alters the myofibrillar protein structure and optical properties of meat muscle (Leygonie et al. (2012);Xia et al. (2009)), but does not change the fat content.Therefore this has no influence on the IMF % prediction accuracy of our model.
Our study had access only to a limited number of samples, with a reduced spread of IMF %.
We therefore anticipate that a test of our method with more samples along the entire IMF % range, especially towards high IMF content, is required before using it in an industry environment.To provide a wider variability of testing conditions, it would also be desirable to use samples from a broader range of breeds, muscle types, and diet.
We must finally note that imaging a single side of large samples would be more suitable for an industry environment than imaging all sides of small samples.Large meat samples could be imaged in one single scan using a long-range and wide field of view OCT system (Song et al. (2016); Shirazi et al. (2016)).Our current system takes ≈ 55 seconds to image and analyse one sample with a 225 cm 2 area, but this could be substantially reduced to 10 seconds or less, in particular by using recent multi-megahertz sweep rate sources (Xu et al. (2015); Lee et al. (2015)).Our OCT technique will also need to be tested, and the prediction model rebuilt, in an abattoir atmosphere which presents additional industrial noise and challenges.This is one of the future development direction of this study.

Conclusion
The work presented here insights on the ability of OCT and machine learning techniques to predict the quality of meat.To the best of our knowledge, this is the first time that OCT has been applied to this problem.We have shown that OCT has the potential for completely automated, contact-less, non-destructive, real time detection and classification of the quality of intact meat samples.

Figure 1 :
Figure1: IMF % distribution versus type of samples used in this study.The IMF % is measured after OCT imaging using gas chromatography as per industry standards.
. Panel (a) shows one B-scan of a sample containing some IMF along with meat muscle.In this image, we have selected two A-scans, respectively in areas with only muscle (highlighted by the red line), and with a mixture of fat and muscle (green line).The natural logarithm of the corresponding OCT signals are plotted in Figs 3(b) and (c) respectively.

Figure 3 :
Figure 3: (a) B-scan of a meat sample containing muscle fibre and IMF.(b), (c) Individual A-scans corresponding to the red and green vertical lines in (a), plotted as natural logarithm of the OCT signal.The slope of the line of best fit of the selected features represent the attenuation coefficient µt of muscle and fat, respectively.

Figure 4 :
Figure 4: (a) Heat-map of effective averaged attenuation coefficient µt , corresponding to the B-scan shown in Fig. 3(a).Voxels with high attenuation are considered to be 100 % IMF.The apparent high attenuation of the top surface is an artefact due to the large amount of light reflected at the air-meat interface.(b) Histograms of effective attenuation of two representative samples, respectively with high (Wagyu, blue) and low (Friesian bull, orange) IMF %.
convert the heat-maps into histograms.Two representative histograms are shown in Fig. 4(b) for, respectively, high fat (Wagyu, blue) and low fat (Friesian bull, orange) samples.As should be clear from the discussion above, the large positive values of µ t relate to voxels which mainly contain IMF, while the negative values represent the interfaces between IMF and meat muscle.A meat sample with a large IMF content is therefore expected to have more voxels towards both the high negative and positive sides of the histogram of effective attenuation coefficients, i.e. to be more spread-out.Fig. 4(b) confirms this interpretation.
predict the IMF % of the remaining 15 samples (test set).The samples were divided randomly into training and test sets.Test sets had at least one random sample from each IMF % range.The 15 samples in a test set consisted of 3 samples with 0-1 IMF %, 4 samples with 1-2 IMF %, 2 samples with 2-5 IMF %, 3 samples with 5-10 IMF %, 2 samples with 10-15 IMF %, and 1 sample with 16-20 IMF %.The PCA model was built on the training set and the associated loadings were used to make the SVR model.PCs were then calculated for each sample in the test set, based on the PCA loadings obtained from training data.The test data was completely unknown to the algorithm making the PCA and regression model, and therefore can be reliably used to assess the accuracy of the model.Cross validation was done five times by randomly picking up different test and training sets from the overall data set.

Figure 5 :
Figure 5: (a) PCA analysis of the attenuation histograms of different samples.The samples with high IMF % (yellow tinted dots) clearly appear grouped together and separated from IMF % samples (blue tinted dots).The IMF % associated with the colour scale was determined from subsequent gas chromatography analysis of the samples.(b) The %-explained variance curve shows how much information can be attributed to each PCs.

Figure 6 :
Figure 6: IMF % predicted by our SVR model versus actual values for the training (green points) and test (red points) samples of Trial 2. The green line is a fit across the training points, along with the 95 % confidence interval (translucent green).The table shows the slope and intercept (which ideally should be 1 and 0, respectively) of the regression model fit for each of the five trials.

Table 1 :
Accuracy of the SVR prediction model for different trials.MSE and MAE stands for, respectively, mean squared and mean absolute errors, expressed in IMF %.