Authenticating Raw from Reconstituted Milk Using Fourier Transform Infrared Spectroscopy and Chemometrics

Beijing Advanced Innovation Center for Food Nutrition and Human Health, Beijing Technology & Business University (BTBU), Beijing 100048, China
Institute of Food and Nutraceutical Science, Department of Food Science and Technology, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai 200240, China
Department of Nutrition and Food Science, University of Maryland, College Park, MD 20742, USA


Introduction
Milk is one of the most widely consumed foods and has significant nutritional and economic importance. Reconstitution of milk is a form of adulteration in which skimmed or whole milk powder is partly added to raw milk or completely substituted for it [1,2]. Such fraud can yield marginal economic gain because milk powder has a longer shelf life than raw milk. Adding powdered milk to raw milk may alter the original nutritional and functional value of raw milk and may therefore undermine consumer confidence in the milk industry. Therefore, a rapid, simple, and automated method for detecting milk adulteration is required.
Adulteration with commercial milk powder is even more challenging to detect than adulteration with many other common milk adulterants, such as melamine or plant protein, because the chemical compositions of milk powder and raw milk are extremely similar. Therefore, measurements with both high sensitivity and high resolution are preferred. For instance, two-dimensional gel electrophoresis combined with matrix-assisted laser desorption/ionization-mass spectrometry was reported to detect powdered milk in raw cow's milk based on modified peptides, including oxidation, lactosylation, and deamination protein products [1]. The detection of furosine [3] and lysinoalanine [4] by liquid and gas chromatography has also been introduced. Rather than seeking specific marker components, other methods apply empirical models or fingerprints to detect adulteration. Differentiation of raw from reconstituted milk by the stable isotope ratios of oxygen and hydrogen was also reported [5]. However, the above methods often require either time-consuming and costly mass spectrometric detection or labor-intensive sample pretreatment and analysis procedures, which makes them unsuitable for large-scale assays.
Rapid analytical techniques such as spectroscopy or electronic noses, combined with empirical modelling, provide a convenient approach to characterizing complex food matrices. For example, the adulteration of whole milk with milk powder was detected by spectrophotometry. Ultraviolet and visible spectroscopy has been applied to the detection and quantification of reconstituted full-fat milk powder in raw milk [2]. The transmittance of raw milk adulterated with reconstituted full-fat dry milk powder was observed, and the phenomenon was possibly explained by turbidity variation induced by a low degree of homogenization [6]. In addition, the fluorescence of advanced Maillard products and soluble tryptophan (FAST) index was devised for distinguishing milk heat treatments [7]. However, these studies were based on empirical observations without clear metrics or limits, and thus only limited information was extracted from the spectra. Fingerprints combined with chemometric methods are suitable for processing complex analytical data in an automated and objective decision-making manner. For instance, the adulteration of milk with reconstituted milk or water was monitored using electronic noses constructed with ten different metal oxide sensors together with chemometric modelling [8].
Fourier transform infrared (FTIR) spectroscopy has been widely used for food quality monitoring, including authenticity and traceability, because it is fast and nondestructive [9,10]. FTIR spectroscopy has been successfully demonstrated in milk authentication, for example in detecting soymilk added to cow or buffalo milk [11]. It is therefore of interest to test whether FTIR spectroscopy can also identify reconstitution in raw milk.
Studies of adulteration in food ingredients such as milk or olive oil suggest that chemometric modelling is becoming an essential part of fingerprinting analyses [12-14]. Specifically, infrared and Raman spectroscopy studies on the detection of food adulteration have resulted in a wide range of successful applications. Raman spectroscopy could detect melamine in milk powder at a detection limit of 0.13% (w/w) using two vibration modes at 673 and 982 cm⁻¹ [12]. Additionally, machine-learning methods open up a wide range of applications of infrared spectroscopy in food authentication and quality control. For instance, near-infrared reflectance spectra were used to examine the authentication of skim and nonfat dry milk powder using analysis of variance (ANOVA)-principal component analysis (PCA), pooled-ANOVA, and partial least-squares regression (PLSR) [13]. The potential of near-infrared (NIR) spectroscopy combined with chemometrics for nontargeted detection of adulterants in skim and nonfat dry milk powder was also studied [14]. Therefore, it is of interest to test whether infrared spectroscopy combined with chemometric modelling techniques can be applied to detecting milk powder in raw milk.
In this study, FTIR spectroscopy combined with chemometrics was developed for the detection of milk adulteration. Specifically, infrared spectral fingerprints combined with chemometrics were tested for detecting reconstituted milk powder in raw milk. The workflow is shown in Figure 1. This work may serve as a reference for quality assurance of raw milk and its related dairy products.

Sample Collection.
Twenty raw milk samples were provided by local milk farms located in Qingdao, Shandong province, China. These farms were certified suppliers of the Nestle Corporation (Vevey, Switzerland). Each raw milk sample was stored in a separate 100 mL polythene bottle. All samples were frozen immediately after collection. The bottles were placed in a portable Styrofoam box with ice packs and dry ice to maintain a low temperature during transport and were stored at −20 °C once transferred to the laboratory. Four anonymously branded commercial milk powders with undisclosed processing techniques were purchased from local groceries in Shanghai, China.

Sample Pretreatment.
Raw milk samples were directly lyophilized using a Labconco freeze dryer (Kansas City, MO, USA). The freeze-drying process removes moisture that may interfere with the FTIR measurement. It served as a pretreatment step that preserved the original chemical composition of raw milk as much as possible and made the storage and testing of a large batch of samples possible.
To prepare milk adulterated with reconstituted milk powder, a raw milk sample was first randomly selected as the standard sample. Then, each commercial milk powder was added to the authentic liquid milk at 0.5, 1, 3, 5, and 10% (w/v), giving five partially reconstituted samples per powder. After that, the mixtures were sonicated for 20 min. Finally, the mixtures were lyophilized, and the lyophilizates were subjected to FTIR analysis.
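The spiking scheme above can be sketched as a short calculation. This is only an illustration: the aliquot volume `ALIQUOT_ML` is a hypothetical value, since the text does not state the volume of raw milk used, and the sample count assumes every powder was spiked at every level.

```python
# Spiking levels from the text, in % (w/v): grams of powder per 100 mL of milk.
LEVELS_PERCENT_WV = [0.5, 1, 3, 5, 10]
N_POWDERS = 4  # four commercial milk powders


def powder_mass_g(level_percent_wv, milk_volume_ml):
    """Grams of powder needed to reach a given % (w/v) in a given milk volume."""
    return level_percent_wv / 100.0 * milk_volume_ml


# Hypothetical aliquot volume; the source does not report the volume used.
ALIQUOT_ML = 50.0

# Powder mass required at each level for one aliquot.
masses = {level: powder_mass_g(level, ALIQUOT_ML) for level in LEVELS_PERCENT_WV}

# Number of adulterated samples if each powder is spiked at each level.
n_adulterated = N_POWDERS * len(LEVELS_PERCENT_WV)

for level, mass in masses.items():
    print(f"{level}% (w/v): {mass:.2f} g powder per {ALIQUOT_ML:.0f} mL milk")
```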

FTIR Analysis.
All fingerprints were collected using a Nicolet 6700 FTIR spectrometer (Thermo Fisher Scientific, Waltham, MA, USA) equipped with a Smart iTR single-bounce germanium crystal attenuated total reflectance (ATR) sampling accessory (Thermo Fisher Scientific). The spectra were collected in transmittance mode as an average of 60 scans over the range of 650 to 4000 cm⁻¹ with a 0.48 cm⁻¹ interval. Before each measurement, an independent background scan was performed and subtracted immediately to minimize atmospheric interference and instrument fluctuation.
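As a rough illustration of the acquisition settings, the sketch below builds the wavenumber axis implied by the stated range and interval and converts percent transmittance to absorbance using A = −log10(T). Whether the spectra were converted to absorbance before modelling is not stated in the text, so that step is an assumption, and both function names are hypothetical.

```python
import math


def transmittance_to_absorbance(percent_t):
    """Convert percent transmittance (0 < %T <= 100) to absorbance, A = -log10(T)."""
    if not 0 < percent_t <= 100:
        raise ValueError("percent transmittance must be in (0, 100]")
    return -math.log10(percent_t / 100.0)


def wavenumber_axis(start=650.0, stop=4000.0, step=0.48):
    """Wavenumber grid (cm^-1) for the range and interval quoted in the text."""
    n_points = int(math.floor((stop - start) / step)) + 1
    return [start + i * step for i in range(n_points)]


axis = wavenumber_axis()
print(len(axis), "points from", axis[0], "to", round(axis[-1], 2), "cm^-1")
```

With these settings each spectrum holds several thousand variables, far more than the number of samples, which is one reason latent-variable methods are used later in the analysis.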

Chemometrics Modelling.
All raw data were imported into MATLAB (version R2018a, The MathWorks, Natick, MA, USA). Different preprocessing strategies, such as wavenumber region selection, autoscaling, standard normal variate (SNV), and derivatives, were applied. All chemometric analyses, including preprocessing, PCA, and partial least-squares-discriminant analysis (PLS-DA), were performed using in-house MATLAB routines running on a personal computer under the Windows 7 operating system (Microsoft Corporation, Redmond, WA, USA). For internal validation, statistically relevant comparisons were achieved with the model population analysis (MPA) framework [15].
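The preprocessing steps named above can be sketched in Python with NumPy. This is not the authors' in-house MATLAB code: the function names are hypothetical, the data are random numbers standing in for spectra, and the derivative is a plain finite difference because the text does not say which derivative algorithm was used.

```python
import numpy as np


def select_region(X, wavenumbers, low, high):
    """Keep only the variables whose wavenumber lies in [low, high]."""
    mask = (wavenumbers >= low) & (wavenumbers <= high)
    return X[:, mask], wavenumbers[mask]


def snv(X):
    """Standard normal variate: centre and scale each spectrum (row)."""
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, ddof=1, keepdims=True)
    return (X - mean) / std


def first_derivative(X, wavenumbers):
    """Finite-difference first derivative along the wavenumber axis (assumed method)."""
    return np.gradient(X, wavenumbers, axis=1)


def autoscale(X):
    """Centre each variable (column) and scale it to unit variance."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # avoid division by zero for constant variables
    return (X - mean) / std


# Synthetic stand-in for six spectra on the acquisition grid.
rng = np.random.default_rng(0)
wavenumbers = np.arange(650.0, 4000.0, 0.48)
X = rng.normal(size=(6, wavenumbers.size))

X_region, wn_region = select_region(X, wavenumbers, 800.0, 1800.0)
X_snv = snv(X_region)
X_deriv = first_derivative(X_snv, wn_region)
X_auto = autoscale(X_deriv)
```

In a real workflow the autoscaling mean and standard deviation would be computed on the training set only and then reused for the test set.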
MPA is essentially based on cross validation of a series of submodels obtained from the original data set through random sampling. In this work, MPA extracted statistical information from these models to achieve a statistically unbiased estimate of performance. The internal validation process was repeated for 100 bootstraps. For external validation, the Latin partition approach was employed to split the whole data set into training and test sets prior to classification. The result was evaluated by the prediction accuracy on the data set. Prediction accuracy is the estimated percentage of correct identifications when the model is applied to unknown samples and is widely used to assess the overall performance of a classification model.
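A minimal sketch of a Latin partition and of the prediction accuracy metric is given below. It assumes the usual meaning of a Latin partition, in which every sample appears in exactly one test set and class proportions are kept in each partition. The function names and the class sizes in the example are illustrative assumptions rather than details taken from the study.

```python
import random
from collections import defaultdict


def latin_partitions(labels, n_partitions, seed=0):
    """Assign each sample to exactly one test partition, keeping class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    partitions = [[] for _ in range(n_partitions)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin over the partitions.
        for position, index in enumerate(indices):
            partitions[position % n_partitions].append(index)
    return partitions


def accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Illustrative labels only; ten partitions correspond to a 9 : 1 split.
labels = ["raw"] * 20 + ["reconstituted"] * 20
parts = latin_partitions(labels, n_partitions=10)
test_indices = parts[0]
train_indices = [i for part in parts[1:] for i in part]
```

Using two partitions instead of ten would correspond to a 1 : 1 training-to-test ratio.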

FTIR Spectral Characteristics.
The FTIR spectral fingerprints contained representative information on the different components of milk. The mean spectra of raw and reconstituted milks are shown in Figure 2. The absorption bands observed at 1630 to 1680 cm⁻¹ and 1510 to 1570 cm⁻¹ may arise from the C=O stretching vibrations of amide I and the N-H and C-H bending vibrations of amide II from milk protein, respectively [16,17]. The bands around 2920, 2850, and 1743 cm⁻¹ may be assigned to antisymmetric and symmetric CH₂ stretching and carbonyl C=O double bond stretching from milk fat, respectively [18].
These peaks also correspond to the variable ranges that differed most in the fingerprints. However, the mean spectra overlapped to a high degree, suggesting a strong compositional similarity. Additionally, no evident peaks could be identified as marker peaks, since any single component is unlikely to be a critical differentiation factor. Consequently, it is hard to detect milk adulteration by visual inspection alone. Therefore, multivariate methods that address the overall distribution of the data are necessary.

PCA Explanatory Study.
PCA was performed for a preliminary visualization of the multivariate distribution of all fingerprints. Figure 3 shows the PCA scores plot, with autoscaling preprocessing applied. The PCA scores plot suggested that there was no obvious discrimination between raw and reconstituted samples using raw fingerprints. Specifically, no separation tendency between raw and reconstituted samples was observed along the axes of PC 1 and PC 2, the two largest principal components. PC 1 and PC 2 together explained 88% of the total variance, indicating that the most dominant variance in the fingerprints is not closely related to milk reconstitution.
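For readers who wish to reproduce this kind of exploratory analysis, the sketch below computes PCA scores and the fraction of variance explained by each component using a singular value decomposition of the autoscaled data. It uses NumPy and random numbers in place of real spectra; the function name `pca_scores` is hypothetical and this is not the authors' MATLAB routine.

```python
import numpy as np


def pca_scores(X, n_components=2):
    """PCA by SVD of the autoscaled data matrix X (samples x variables)."""
    centred = X - X.mean(axis=0)
    std = centred.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # leave constant variables unscaled
    Z = centred / std
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]  # sample coordinates
    loadings = Vt[:n_components]                     # variable contributions
    explained = s**2 / np.sum(s**2)                  # fraction of total variance
    return scores, loadings, explained[:n_components]


# Synthetic stand-in for 40 spectra with 200 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 200))
scores, loadings, explained = pca_scores(X, n_components=2)
print("variance explained by PC 1 and PC 2:", explained.sum())
```

Because PCA ignores class labels, a high explained variance on PC 1 and PC 2 does not guarantee that those components separate the classes, which is the situation described above.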
The PCA result was also consistent with the result of visual inspection. Different preprocessing methods, including SNV alone and SNV combined with first- and second-order derivatives, were also studied by observing the PCA scores plot (data not shown). Regardless of the preprocessing methods or combinations used, there was no obvious discrimination between raw and reconstituted samples. Selecting the wavenumber region of 800-1800 cm⁻¹ did not improve the degree of separation between raw and reconstituted milk either (data not shown). Consequently, PCA confirmed neither characteristic bands in the fingerprint region nor characteristic differences between the two kinds of spectra. However, supervised multivariate classification may be capable of extracting information from complex data because the class memberships of the samples are also included as model input. Therefore, PLS-DA was applied to analyze the fingerprints further.

PLS-DA Model Evaluation.
PLS-DA is perhaps one of the best-known supervised classification methods in chemometrics. This method is based on partial least-squares regression of continuous predictor variables, which seeks optimal latent variables with maximum covariance. Similar to PCA, PLS-DA was first applied as an exploratory approach to study the overall distribution. Different preprocessing methods were applied to the data, including wavenumber selection, autoscaling, first derivatives, and different combinations. The PLS-DA scores indicated generally good separation of the classes. The best separation is shown in Figure 4, which is the X-scores (scores of the spectral data block) plot of the PLS-DA model obtained by first selecting the wavenumber region of 800-1800 cm⁻¹, where the spectral differences were larger than in other regions, followed by autoscaling and first-derivative preprocessing. The two largest latent variables are displayed in Figure 4. Despite a small portion of overlap, the distribution of the tested samples clearly showed two clusters, indicating intrinsically different fingerprint patterns between the two classes of samples. A trend related to the adulteration level was also observed. Specifically, samples adulterated at 0.5%, the lowest adulteration level in this study, are located in the region of partial overlap with the raw milk cluster. In contrast, samples adulterated at 10% lie further from raw milk than those at lower adulteration levels.
In exploratory studies, both PCA and PLS-DA scores plots are limited to two dimensions, namely, the first and second principal components or latent variables. Such an approach relies heavily on the researcher's judgement of visual patterns rather than on objective performance metrics. In comparison, the PLS-DA model is able to overcome this shortcoming through an automatic model-building process with a reasonable number of variables. By selecting 90% of the original data as the training set, with 11 latent variables chosen through the internal validation procedure described in the next section, a final PLS-DA model was built and validated. Figure 5 shows the regression coefficients of the PLS-DA model. Positive and negative coefficients represent the relationships of the peaks to pure and reconstituted samples, respectively. The absolute magnitude of the coefficients indicates the relative importance of the peaks. Some interesting peaks arise in the PLS-DA coefficients. Peaks at 904 to 1288 cm⁻¹ were generally associated with C-H bending, C-O-H in-plane bending, and C-O stretching vibrations of lipids, organic acids, and carbohydrate derivatives. Compared with the raw spectra shown in Figure 2, some peaks (904-1288 cm⁻¹) may be associated with carbohydrates. This might be attributed to a series of Maillard reactions occurring in milk powder, which result in the reduction of lysine-rich proteins and lactose [21]. Peaks at 1583 cm⁻¹ corresponded to unspecified compounds. This result relates to the PCA study in that characteristic peaks may arise in the fingerprint region when authenticating raw milk. However, it did not agree with our previous finding that PCA and PLS-DA performed consistently in classifying pure milk and its counterparts adulterated with other powdered proteins [22], probably because of the complexity of the spectra. This indicates that, for complex FTIR spectral fingerprints, supervised classification methods are important because exploratory methods such as PCA did not yield a completely clear characterization.
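The idea behind PLS-DA, regression coefficients, and a 0.5 decision threshold can be illustrated with a compact PLS1 implementation based on the NIPALS algorithm. This is a sketch under several assumptions: it is written in Python with NumPy rather than the authors' MATLAB routines, raw milk is coded as 1 and reconstituted milk as 0 (chosen so that positive coefficients point to pure milk), and the data are synthetic.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS; returns regression coefficients and centring terms."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = E.T @ f                  # weight: direction of maximum covariance with y
        w /= np.linalg.norm(w)
        t = E @ w                    # X-scores for this latent variable
        tt = t @ t
        p = E.T @ t / tt             # X-loadings
        q_a = f @ t / tt             # y-loading
        E = E - np.outer(t, p)       # deflate X
        f = f - q_a * t              # deflate y
        W.append(w)
        P.append(p)
        q.append(q_a)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)  # B = W (P'W)^-1 q
    return coef, x_mean, y_mean


def plsda_predict(X, coef, x_mean, y_mean, threshold=0.5):
    """Predict class 1 when the regression output exceeds the threshold."""
    y_hat = (np.asarray(X, dtype=float) - x_mean) @ coef + y_mean
    return (y_hat > threshold).astype(int), y_hat


# Synthetic two-class data: class 1 is shifted in the first five variables.
rng = np.random.default_rng(0)
y = np.repeat([1, 0], 20)
X = rng.normal(size=(40, 50))
X[y == 1, :5] += 3.0

coef, x_mean, y_mean = pls1_fit(X, y, n_components=2)
predicted, y_hat = plsda_predict(X, coef, x_mean, y_mean)
train_accuracy = float(np.mean(predicted == y))
```

The sign and magnitude of each entry of `coef` play the role of the regression coefficients discussed above. Accuracy computed on the training data, as here, is optimistic, which is why separate validation is needed.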

PLS-DA Model Validation.
Although the PLS-DA model identified possible characteristic differences between raw and reconstituted milk samples, evaluating its validity is necessary, since PLS-DA may be prone to overfitting. Specifically, quantitative metrics of the PLS-DA prediction power were tested by both internal and external approaches to indicate the suitability and generalizability of the model. First, the complete data set was split into training and external test sets. Second, the internal validation was performed solely on the training set by splitting it into internal training and calibration sets. In internal validation, statistically relevant validation of the PLS-DA modelling was achieved by MPA. To achieve a statistically unbiased estimate of performance, a series of PLS-DA models was built and evaluated repeatedly for 100 bootstraps. The average classification accuracy was 98% when 11 latent variables were applied, suggesting a reliable performance.
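The repeated internal validation can be sketched as a loop over random splits of the training set. In the sketch below, a nearest-class-mean classifier is used purely as a placeholder so that the example stays self-contained; it is not PLS-DA. The hold-out fraction of 0.2, the function names, and the synthetic data are likewise assumptions made for illustration.

```python
import random
import statistics


def nearest_mean_fit(X, y):
    """Placeholder classifier (NOT PLS-DA): store the mean spectrum of each class."""
    means = {}
    for label in sorted(set(y)):
        rows = [x for x, lab in zip(X, y) if lab == label]
        means[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return means


def nearest_mean_predict(means, X):
    """Assign each sample to the class whose mean spectrum is closest."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return [min(means, key=lambda lab: sq_dist(x, means[lab])) for x in X]


def repeated_subsampling(X, y, fit, predict, n_repeats=100, holdout=0.2, seed=0):
    """Mean and standard deviation of accuracy over repeated random splits."""
    rng = random.Random(seed)
    n = len(y)
    n_holdout = max(1, round(holdout * n))
    accuracies = []
    for _ in range(n_repeats):
        order = rng.sample(range(n), n)  # random permutation of sample indices
        held, kept = order[:n_holdout], order[n_holdout:]
        model = fit([X[i] for i in kept], [y[i] for i in kept])
        predicted = predict(model, [X[i] for i in held])
        hits = sum(p == y[i] for p, i in zip(predicted, held))
        accuracies.append(hits / n_holdout)
    return statistics.mean(accuracies), statistics.stdev(accuracies)


# Synthetic two-class data standing in for preprocessed training spectra.
data_rng = random.Random(1)
X = [[data_rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(20)]
X += [[data_rng.gauss(2.0, 1.0) for _ in range(5)] for _ in range(20)]
y = ["raw"] * 20 + ["reconstituted"] * 20

mean_acc, sd_acc = repeated_subsampling(X, y, nearest_mean_fit, nearest_mean_predict)
```

To choose the number of latent variables, the same loop would be run once per candidate number with a PLS-DA model in place of the placeholder, and the setting with the best average accuracy would be kept.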
In external validation, the Latin partition approach was employed to split the whole data set into training (90%) and test (10%) sets prior to PLS-DA classification. Unlike the previous PCA scores plot, which used only two principal components to look for possible separation between pure and reconstituted samples, 11 latent variables were used to build the final PLS-DA model after bootstrapped Latin partition evaluation, indicating that many independent components present in the samples were needed to establish an effective model. Figure 6 shows the final prediction output of the PLS-DA model for external validation. All samples in the test set were correctly classified by PLS-DA.
It is also of interest to study the differences between adulteration levels, since Figure 4 showed level-related differences as previously discussed. Therefore, PLSR was
In addition to the 9 : 1 (training set/test set) split ratio, further evaluations with split ratios of 2 : 1 and 1 : 1 were performed to check for model overfitting. All other calculations remained unchanged. The results were consistent with those obtained under the previous condition. Specifically, only one test sample was misclassified when the split ratio was 1 : 1, and all other predictions were correct. It can be concluded that the MPA modelling approach is robust and remains reliable even when half of the data are removed from the training set.

Conclusion
FTIR spectroscopy combined with chemometrics has been successfully demonstrated to detect the possible presence of reconstituted milk in raw milk. This work indicates that FTIR spectroscopy has great potential in the quality control of milk and related products, because the PLS-DA model yielded satisfactory separation of the two types of spectral fingerprints. Note that, owing to the limited sample size and variability, a larger data set with carefully selected liquid and powdered milks may be necessary before practical use to ensure the universality of the final model. Additionally, simpler procedures such as sampling without lyophilization, as well as quantification of the adulteration level, need to be investigated in the future.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Figure 1: Schematic workflow of this study.

Figure 2: Average FTIR spectral fingerprints of raw and reconstituted milks.

Figure 6: External prediction output of the partial least-squares model. The line at 0.5 was the criterion used by the PLS-DA model to determine the sample type.