Scrap Metal Classification Using Magnetic Induction Spectroscopy and Machine Vision

Abstract-The need to recover and recycle material toward building a circular economy is increasingly a global imperative. Nonferrous metals in particular are highly recyclable and can be extracted using processes such as eddy current separation. However, their further separation into recyclable groups based on metal or alloy continues to pose a challenge. Recently, we proposed a new technique to discriminate between nonferrous metals: magnetic induction spectroscopy (MIS), which measures how a metal fragment scatters an excitation magnetic field over different frequencies. The MIS response is related to conductivity, which can be used to classify the fragment. In this article, we demonstrate for the first time the use of MIS with machine learning to classify nonferrous scrap metals drawn from commercial waste streams. Two approaches are explored: 1) MIS over a bandwidth from 3 to 90 kHz and 2) the combination of MIS with the physical color of the metal samples. We show that MIS alone can obtain purity and recovery rates >80% for most metal groups and waste streams, rising to >93% for stainless steel. The exception was the Zorba waste stream, where the mix of aluminum alloys within the sample set led to poor conductivity contrasts. The introduction of color substantially improved results in this case, increasing purity and recovery rates by 20-35 percentage points. Of the machine-learning models tested, we found that random forest (RF), extra trees, and support vector machine (SVM) algorithms consistently achieved the highest performance.
Index Terms-Classification algorithms, electromagnetic induction, machine vision, recycling, waste recovery.

I. INTRODUCTION
AN ACCURATE and economic separation technique is essential to allow nonferrous metals to be recovered, recycled, and reused. The advantages of returning nonferrous metal to the supply chain are substantial. For instance, materials such as aluminum and copper are highly recyclable; producing 1000 kg of primary aluminum from mined bauxite ore requires 186 262 MJ of energy, whereas secondary (recycled) aluminum requires only 11 690 MJ [1]. This substantial energy saving means reduced CO2 emissions and impact on the climate. There is international pressure to improve the rate of metal recycling, recognizing the need to move to a more sustainable "Circular economy" [2]. In Europe, for example, EU directives (2000/53/EC) and (2012/19/EU) address the need for materials recovery in end-of-life vehicles and waste electrical and electronic equipment (WEEE), both prominent sources of nonferrous metal.
Nonferrous metals are primarily separated from source waste streams by eddy current separation. This process uses a high-speed rotating drum embedded with permanent magnets to induce eddy currents in the fragment, which in turn develop Lorentz forces. In highly conductive fragments, i.e., nonferrous metals, the Lorentz force is sufficient to eject the sample from the conveyor, whereas poorly conducting fragments are allowed to free-fall from the conveyor. This means the eddy-current separator (ECS) is limited by the geometry of the waste fragments. It can be difficult to generate large repulsive forces with smaller-sized metal pieces, or with long, thin elements such as wires, where it is more difficult for eddy currents to circulate [3]. High-density and low-conductivity materials are also difficult to eject. The resultant product of the ECS is a mix of nonferrous metals, which must be further separated to be recyclable and yield full value.
Many challenges remain in sorting this nonferrous metal mix reliably, efficiently, and at scale. These metals need to be sorted into their base elements (aluminum, copper, and zinc), and in some cases, further sorted by alloy family. Tramp elements within an alloy make them difficult to recycle. Small amounts within the recyclate are allowed at a specific rate, but high rates can make the produced metal brittle [4]. Tramp elements become more of an issue in aluminum, where they are difficult to remove [3]; this makes it essential to recycle some aluminum sources into clean alloy families, increasing the complexity of recycling processes and the risk of cross-contamination that undermines alloy sustainability.
There are several methods for sorting nonferrous metals, each with advantages and disadvantages as summarized in Table I. A common approach is sorting by hand, using the worker's judgment to sort by color and physical characteristics [3]. Manual sorting is only economical in regions where labor costs are low. Regions such as Europe and the USA have tended to export their waste, the volume of which has seen substantial growth over the last 20 years [3]. It is claimed that manual sorting can achieve classification accuracies up to 99% [5]. The sustainability of manual sorting in other countries has been challenged, not least on the environmental costs of waste transport. Sink-float systems offer a less labor-intensive, conceptually simple solution that uses the different densities of metals for separation. Slurries of water, sand, and air are used to create different gravitational drums that separate the metal [3]. The sink-float method struggles to separate hollow and boat-shaped materials. There is a high cost of maintaining a constant gravitational density [3], and the process creates environmental waste, such as contaminated water, which requires treatment.
The gold standard methods to classify nonferrous scrap metal are laser-induced breakdown spectroscopy (LIBS) and X-ray methods, such as X-ray transmission (XRT) and X-ray fluorescence (XRF). LIBS and XRF are commonly used to establish ground-truth metal composition using handheld analyzers or laboratory instruments [6], [7], hence their discrimination capability is high. In operation, they work downstream of ECS, separating the mixed nonferrous metal product that results. As dry sensor-based methods, these techniques can only passively interrogate the sample as they pass across the conveyor and must be paired to ejector mechanisms, typically air jets, to provide the physical separation of the metal pieces.
LIBS is a technique that classifies metals by laser ablating the surface of the metal to generate a plasma, which is analyzed to determine composition [8]. Fast conveyor speeds, however, can make it difficult to target the laser where precise multiple firings are required to obliterate surface contaminants before measuring the plasma [9], [10]. It has been proposed to use an additional camera system to identify flat and uncontaminated points on the sample [8]; however, this adds complexity, and implementing a near-3-D camera under high-speed operation is still a significant technical challenge. XRT uses a high-intensity X-ray beam to measure absorption across the metal piece [10]. This allows XRT to separate light metals (aluminum) from heavy metals (copper and brass). This technique is not affected by surface contaminants; however, it is unable to sort metals of similar density [11], [12]. XRF emits low-energy radiation onto the surface of a metal, causing excited low-energy electrons to eject [13]. The space left is filled by high-energy electrons, which release an elemental-specific fluorescence [13]. Like LIBS, XRF is susceptible to surface contamination and can be difficult to use for elements with very low characteristic radiation, such as aluminum, silicon, and magnesium [3]. Spectral ratios for aluminum alloys, for example, tend to be determined by their major alloying elements [3].
LIBS and X-ray techniques, while offering good performance and capability, are generally very expensive and face limitations when translated to high-throughput metals classification, where typical commercial conveyor speeds can operate between 2 and 3 m/s. There is still much interest within the state-of-the-art for low-cost, and industrially practical solutions, either as alternatives to LIBS and X-ray or to complement them by providing a presorting stage. For instance, the "Electrodynamic sorting technology" developed recently at the University of Utah [14], [15], uses a tuneable or variable frequency ECS system to be able to sort different metals and smaller fragments. The authors highlight some success extracting aluminum from Zorba [16], brass, copper, and other aluminum alloys [14], although the latter results were drawn from spherical test samples rather than genuine scrap.
Optical methods, like manual sorting, use color characteristics to sort metals, although across more wavelengths compared to the human eye. Li et al. [17] explored deep learning and superpixel optimization with a red, green, and blue (RGB) image, where their proposed algorithm achieved an average precision of 98%, which used 15 samples of aluminum and copper pieces. Hyperspectral imaging (HSI) measures a wider spectrum beyond the RGB wavelengths provided by a standard camera, returning a rich feature set for classification. HSI methods have achieved classification accuracies of 96.87% [18] and 98.36% [19] with WEEE scrap metal and 80 to 97% for brass, iron, copper, aluminum, and nickel classification [20]. Although HSI provides good classification, the high-dimensional vector associated with each pixel combined across the whole image creates a heavy computational load [18]; this limits the speed of the conveyor to allow time for processing. HSI classification has been reported on conveyor speeds running at up to 2.28 m/s; an improvement on previous methods of <1 m/s [18].
This study explores the use of magnetic induction or electromagnetic sensors for nonferrous metal classification. These sensors are generally lower cost than other methods and are well-suited to fast-moving conveyors and the constraints of high-throughput operation. They operate on similar principles to ECSs, in that an oscillating magnetic field is used to induce eddy currents in the sample. However, these eddy currents are too small to induce appreciable Lorentz forces; rather, it is the resultant secondary magnetic field generated by these eddy currents that is used to interrogate the characteristics of the metal (i.e., the mutual inductance). This decoupling of the ejection method from the magnetic response means that we can potentially classify finer conductivity contrasts between metals and larger fragment shape variability than ECS, reliant as it is on developing sufficient ejection forces and predictable piece trajectories. In common with the LIBS, X-ray, and optical methods, the magnetic induction sensors must be paired with a physical ejection mechanism to separate the pieces once classified. In contrast to those methods, magnetic induction is not influenced by surface contaminants, as eddy currents can penetrate the conductive surface of the metal piece.
Magnetic induction sensors have shown some efficacy in separating the different metals of the nonferrous metal mix produced by eddy-current separation. Messina et al. [21] explored the use of narrowband low-frequency excitations (700 Hz to 5 kHz) and pulsed magnetic fields [12]. This system showed good results for separating metals with high conductivity contrasts, such as low conductivity stainless steel, from other nonferrous metals with high conductivity, including bronze, brass, zinc, magnesium, aluminum, and copper. Recovery and purity rates between 90% and 100% were reported for stainless steel, whereas recovery and purity were generally below 80% for the other metals. Kutila et al. [22] extended this approach by combining a magnetic sensor with an optical system, which could be operated combined or separately. The results showed a similar range of 80%-95% purity and recovery rates for stainless steels and the separation of reddish metals (copper and brass). Performance was found to be impacted by industrial conditions, including machine vibration, ambient light, and reflections.
We propose to use magnetic induction spectroscopy (MIS) for metal classification. This approach, in contrast to the magnetic induction methods described, uses broadband excitations to obtain mutual inductance measurements across several frequencies from 3 to 90 kHz. This broadband spectrum captures the interesting region where skin depth takes effect, returning measurements that are to differing degrees a function of conductivity and sample geometry (lower frequencies) or sample geometry only (higher frequencies). MIS has been used extensively in nondestructive evaluation research, including surface crack detection [23], lift-off distance and characterization of metals [24], and estimating thickness of metallic films [25]. MIS has also been used for classification research into metal detection and landmine identification [26] and landmine detection combined with ground penetrating radar [27]. For nonferrous scrap metal classification, O'Toole et al. [28] developed a dual-frequency MIS system using mixed pairs of excitation frequencies from 3 to 64 kHz. This system was able to classify manufactured scrap pieces in brass, copper, and aluminum with average purity and recovery rates of 92%. However, performance fell to recovery rates of 80% and purity rates of 55%-80% when tested on scrap metal drawn from commercial waste, where sample morphology and lift-offs from sensing coils were more variable [9]. MIS has also been reported for the classification of batteries within waste streams, where notable differences in the spectra were observed across D, AA, AAA, and 9 V/E-block cells [29].
In this article, we examine for the first time the use of broadband multifrequency MIS to classify nonferrous scrap metals. As described, magnetic sensors and MIS systems pose a potentially low-cost, industrially robust, general solution to sorting nonferrous metal mixes output from ECSs, compared to techniques such as LIBS and X-rays. This method can in principle sort any metal, grade, or alloy provided that conductivity contrasts are sufficient, and unlike most other sensor-based approaches, is unaffected by surface contaminants. We advance on previous work by using multiple frequency components across a broadband spectrum from 3 to 90 kHz and combining these frequency components as features in a machine-learning algorithm. In contrast, our previous work only considered two components [28]. By sweeping across the full frequency range, we obtain the full shape characteristics of the spectra as eddy-current penetration diminishes according to the skin-depth effect.
The specific contributions of this article are threefold: we first demonstrate the capability of using multifrequency MIS measurements for metal classification by using a static swept frequency test rig, combined with machine learning algorithms to interpret the spectral response and deliver a metal classification. Second, we establish the efficacy of different machine learning models for MIS classification, using an expansive sample set of 445 genuine nonferrous scrap metal pieces drawn from six distinct presorted commercial waste streams. We report the optimal results that can be achieved. Third, we show the impact of combining MIS with some simple metal color parameters as additional features to support classification. This is based on our findings that some metals with poor conductivity contrasts, such as brass and cast aluminum, conveniently have high color contrasts. This invites the exciting possibility of using multisensor systems, enhancing and mitigating the strengths and weaknesses of individual sensors through their combination.

II. THEORY

A. Magnetic Induction Spectroscopy
To understand how MIS can be used to determine the conductivity of different metals, the analytical formulation of a conductive sphere in free space can be used [30]. Define a as the radius of the sphere, σ as its conductivity, and µ = µ0 = 4π × 10^−7 H/m as its permeability. The conductive sphere is centered at the origin within a uniform magnetic field acting along the Z-axis and oscillating at frequency f. This field induces eddy currents within the object that flow in the azimuthal direction [31]; these eddy currents induce a secondary magnetic field. If we take a point z along the Z-axis outside the sphere (z > a), we can calculate H_rx and H_ex, which are the complex components of the secondary magnetic field and excitation, respectively, using the following:

[...] the same asymptote, and the only visible difference is the frequency of convergence. As the frequency increases, the eddy currents flow closer to the surface of the object; this is the skin depth effect [32]. The negligible skin depth at high frequency means that the asymptote of the real component does not change with conductivity [28]. Fig. 2 shows the real and imaginary components of H_rx/H_ex at z, which is again 3 mm away from the surface of the sphere, with a fixed conductivity of 16.24 MS/m (28% IACS). The radius of the sphere ranges between 3 and 90 mm. It can be seen from Fig. 2 that the size of the sphere affects the frequency and height of the imaginary peak and the height of the asymptote of the real component.
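The behavior described above can be explored numerically with a short sketch. The closed form used here is the standard textbook dipole solution for a nonmagnetic conducting sphere in a uniform field; it is an assumption standing in for the article's own expression and may differ from it in notation or sign convention. The function name `sphere_response`, the exp(-iωt) time convention, and the numerically stable cotangent form are illustrative choices.

```python
import cmath
import math

MU0 = 4e-7 * math.pi  # permeability of free space, H/m


def sphere_response(f_hz, radius_m, sigma_s_per_m, z_m):
    """Return H_rx/H_ex on the field axis at distance z_m from the sphere center.

    Assumes a nonmagnetic sphere, a uniform excitation field, and an
    exp(-i*omega*t) time convention (textbook form, not taken from the article).
    """
    omega = 2.0 * math.pi * f_hz
    skin_depth = math.sqrt(2.0 / (omega * MU0 * sigma_s_per_m))
    ka = (1 + 1j) * radius_m / skin_depth  # complex wavenumber times radius
    # cot(x) = i*(exp(2ix) + 1)/(exp(2ix) - 1); exp(2i*ka) decays as frequency
    # rises, so this form does not overflow for large radius/skin-depth ratios.
    e = cmath.exp(2j * ka)
    cot_ka = 1j * (e + 1) / (e - 1)
    bracket = 1 - 3 / ka**2 + 3 * cot_ka / ka
    # Induced dipole field on the axis falls off as (a/z)^3.
    return -(radius_m / z_m) ** 3 * bracket


if __name__ == "__main__":
    a = 5e-3          # sphere radius in meters (illustrative value)
    z = a + 3e-3      # observation point 3 mm from the sphere surface
    for f_khz in (3, 30, 90):
        h = sphere_response(f_khz * 1e3, a, 16.24e6, z)
        print(f_khz, "kHz:", round(h.real, 4), round(h.imag, 4))
```

Under these assumptions the real part tends to -(a/z)^3 at high frequency regardless of conductivity, while at low frequency the response is dominated by an imaginary part proportional to σ, which is consistent with the qualitative description in the text.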

B. Classifiers
In this section, we describe the different classification methods used. These methods are chosen as they work well with a small number of inputs, and the structure of the algorithms is easy to explain and visualize.
Support vector machines (SVMs) are a commonly used machine learning algorithm that uses hyperplanes to partition the feature space. SVMs can perform linear and nonlinear classification. Nonlinear classification is achieved with a polynomial or radial basis function (RBF) kernel. Each kernel applies its own transformation to the data, which is then separated with a linear hyperplane in the transformed space. The input data for an SVM must be scaled, as the algorithm is sensitive to the magnitude range of the features [33].
K-nearest neighbors (KNN) is an instance-based algorithm that classifies data based on the closest neighbors in the training set. A predefined number of neighbors, K, is selected. Each new sample is compared with the training samples in feature space, and the labels of the K closest training samples are recorded. The class that holds the largest number of these nearest neighbors is assigned to the new sample.

Fig. 3. Schematic of analog electronics for the MetalID system.
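The neighbor vote can be illustrated with a minimal sketch. It assumes Euclidean distance and resolves ties in favor of the class encountered first among the nearest neighbors; neither choice is stated in the text, and the function name `knn_predict` and the toy labels are hypothetical.

```python
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Label `query` by majority vote among its k nearest training samples."""
    # Squared Euclidean distance is enough for ranking neighbors.
    ranked = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), i)
        for i, x in enumerate(train_x)
    )
    votes = Counter(train_y[i] for _, i in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy two-feature data, invented for illustration only.
    xs = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
    ys = ["Class", "Class", "Class", "Not Class", "Not Class", "Not Class"]
    print(knn_predict(xs, ys, [0.2, 0.1], k=3))
```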
Ensemble classifiers combine multiple decision trees. Random forest (RF) and extremely randomized trees (ETs) algorithms contain decision trees whose individual results are combined to determine the label. The label is decided by a majority vote over, or an average of, the predictions of all trees. Adaboost is similar to RF and ET but instead uses multiple small decision trees trained in sequence. Once a base tree is created, it is tested on the training data, and the weight of misclassified training samples is increased for the next predictor [34]. The result is decided by a weighted majority vote.

A. MetalID System
The MetalID sensor was first described in O'Toole and Peyton [9] with the solenoid design reported in [28]. A schematic of the system is shown in Fig. 3. The system comprises the following.
1) A sensing element or solenoid array.
2) Front-end analog drive and receive electronics.
3) A Red Pitaya STEM 125-14 for data acquisition.
The sensing element consists of an inner excite coil with 32 turns wrapped around a 10 mm diameter acetyl former containing two 6 mm diameter ferrite rods (Fair-Rite 4077276011). The excite coil generates the excitation magnetic field H_ex to induce eddy currents in the test object. This structure is enclosed by a pair of outer receive coils, with 600 turns each, wrapped around a 16 mm acetyl former and wound in opposition to form a gradiometer that cancels the effect of the excitation. This pair of receive coils is used to measure the secondary magnetic field H_rx induced by the test object. We denote the complex frequency component of the voltage (emf) induced in the receive coils by the field H_rx as V_rx(f).
The complete sensing element is screened by an aluminum cylinder with one end left open to form the sensing interface with the test pieces. The front-end analog electronics consist of an LT1210 power amplifier (Linear Technology), which drives the excite coil with an oscillating current, and an AD8429 low-noise instrumentation amplifier (Analog Devices), which provides 40 dB gain on the measured emf.
The Red Pitaya data acquisition system synchronously outputs a transmit waveform to the power amplifier to drive the excitation coil, and measures the receive coil voltage output from the low noise amplifier. The transmit waveform is generated by a 12-bit DAC sampling at 12.5 MSPS. The received waveform is measured using a 14-bit ADC sampling at 125 MSPS. The result is processed on an FPGA. The signal is first downsampled by a factor of 50 using an FIR filter and decimation process, then input to an FFT with a 4096-element buffer to obtain individual frequency components.
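The frequency resolution implied by these figures can be checked with a few lines of arithmetic. The sketch below uses only the ADC rate, decimation factor, and FFT length quoted above; the helper `nearest_bin` is a hypothetical convenience, and how the authors map excitation tones onto FFT bins is not stated in the text.

```python
ADC_RATE_HZ = 125e6   # ADC sample rate quoted in the text
DECIMATION = 50       # downsampling factor quoted in the text
FFT_LENGTH = 4096     # FFT buffer length quoted in the text

fs_decimated = ADC_RATE_HZ / DECIMATION        # sample rate entering the FFT
bin_spacing_hz = fs_decimated / FFT_LENGTH     # width of one FFT bin
frame_duration_s = FFT_LENGTH / fs_decimated   # time to fill one FFT buffer


def nearest_bin(f_hz):
    """Index of the FFT bin closest to f_hz (illustrative helper)."""
    return round(f_hz / bin_spacing_hz)


if __name__ == "__main__":
    print("decimated rate (Hz):", fs_decimated)
    print("bin spacing (Hz):", bin_spacing_hz)
    print("frame duration (ms):", frame_duration_s * 1e3)
    print("bins nearest 3 kHz and 90 kHz:", nearest_bin(3e3), nearest_bin(90e3))
```

With these values, each FFT frame spans roughly 1.6 ms of signal, which gives a sense of the time needed per spectral measurement.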
The MetalID sensor was used to obtain MIS measurements at frequencies from 3 to 90 kHz in intervals of 3 kHz. The results are referenced to a calibration target, a 10 × 20 mm ferrite cylinder (material 4B1, Ferroxcube), in line with previous research [9], [28]. The process for a frequency sweep of a single test sample was as follows.
1) Fifteen background frequency sweeps (scans) were taken with no test sample present on the sensor and then averaged. We denote the background scan V_rx,bkgnd.
2) The ferrite calibration target was placed on the sensor and scanned. This result is denoted V_rx,calib.
3) The test sample was placed on the sensor and scanned. This result is denoted V_rx,sample.
The ferrite piece is used as a reference as it has a constant permeability and negligible conductivity across the frequency range of interest. Therefore, it can be shown from (1) that the induction spectrum H_rx/H_ex for the ferrite becomes purely real (zero imaginary part) and uniform across the frequency range.
Denote the relative magnetic or mutual inductance, i.e., referenced to the ferrite, as M(f) ∝ H_rx/H_ex. This result is obtained from the measurements described for a frequency f using the following:

$$M(f) = \frac{V_{rx,sample}(f) - V_{rx,bkgnd}(f)}{V_{rx,calib}(f) - V_{rx,bkgnd}(f)} = M'(f) + jM''(f)$$

where M′ and M″ are the real and imaginary components, respectively. The MIS sensor will be used independently and together with the results from an imaging system.
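One plausible reading of this normalization is sketched below: the background scan is subtracted from both the sample and calibration scans, and the two are divided frequency by frequency. This form is inferred from the three scans described in the procedure and is an assumption rather than a confirmed transcription of the authors' expression; the function names are hypothetical.

```python
def average_scans(scans):
    """Average several complex frequency sweeps element by element."""
    n = len(scans)
    return [sum(column) / n for column in zip(*scans)]


def relative_inductance(v_sample, v_calib, v_bkgnd):
    """Return M(f) = (V_sample - V_bkgnd) / (V_calib - V_bkgnd) per frequency.

    Each argument is a list of complex voltages, one per excitation frequency.
    """
    if not (len(v_sample) == len(v_calib) == len(v_bkgnd)):
        raise ValueError("all scans must cover the same frequencies")
    return [(s - b) / (c - b) for s, c, b in zip(v_sample, v_calib, v_bkgnd)]


if __name__ == "__main__":
    # Invented two-frequency voltages, for illustration only.
    background = average_scans([[0.10 + 0.10j, 0.20 + 0.00j]] * 15)
    calib = [1.10 + 0.10j, 1.20 + 0.00j]
    sample = [0.10 + 0.60j, 0.45 + 0.25j]
    for m in relative_inductance(sample, calib, background):
        print("M' =", round(m.real, 3), " M'' =", round(m.imag, 3))
```

By construction, scanning the ferrite target itself returns M = 1 + 0j at every frequency, matching the statement that the reference response is purely real and uniform.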

B. Imaging System
We propose that the visual characteristics of the scrap metal fragments (test samples) can complement induction measurements as features to classify the material. This work focuses on extracting color, specifically the RGB, and hue, saturation, and value (HSV) color components for each test sample. Sample color has the potential to distinguish between metals with high color contrasts, for example, red metals (brass and copper) from white metals (aluminum).
Static images of each sample are taken with the induction measurements using a bespoke imaging rig, as shown in Fig. 4. The rig consists of a camera, image processing system, and lighting dome. The MetalID sensor is located underneath the rig, housed in an acrylic box.
Images were taken using a Raspberry Pi 4 Model B 4 GB with a Raspberry Pi High-Quality Camera Module and a 3 MP C-Mount 8-50 mm Zoom Lens. Images are captured at a resolution of 1920 × 1088 pixels. The Raspberry Pi was programmed in Python 3.7, with the OpenCV V4.1.2 [35] and PiCamera V1.3 libraries to control the camera and process images. The quality of light is a critical component for any vision system; it allows easier feature extraction and higher quality images. A diffused light source is used to provide consistent illumination for each test sample. This was achieved using a 3-D printed gray lighting dome and LED strips. The design of the lighting dome allows the camera to be mounted above the induction sensor at a sufficient distance to prevent interference.
The camera images are processed to remove the background and extract a single color, representative of the whole test sample, in both RGB and HSV color spaces. To prepare for background removal, a series of images is taken over 15 s while the camera rig is empty. These images are used to parameterize a "Mixture of Gaussians" background subtractor in OpenCV. The base of the lighting dome is kept black to facilitate this. Once this process is complete, a scan is run by placing the test piece onto the center of the box directly below the camera. The lighting dome is then secured, and an image and induction measurement are taken sequentially.
To extract a color feature set for the test sample, the image with the background removed is first reduced to 50 × 50 pixels to increase the computational speed. A two-means clustering algorithm is then applied to separate the pixels into two groups. One cluster group is the residual black background pixels and is ignored. The second group constitutes the foreground pixels of the sample. A mean average is taken of the foreground pixels across each color component (RGB) to obtain a single RGB set representative of the sample. For HSV, the RGB image is converted to HSV first and the discussed process is followed.
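The clustering step above can be sketched in plain Python. The example assumes the image has already been background-subtracted and downsized, takes pixels as a flat list of RGB tuples, and initializes the two cluster centers at the darkest and brightest pixels so the result is deterministic; the initialization, the iteration cap, and the function name are assumptions not stated in the text.

```python
def two_means_foreground_mean(pixels, iterations=20):
    """Split RGB pixels into two clusters and return the mean of the brighter one.

    pixels: list of (r, g, b) tuples. The darker cluster is treated as residual
    black background and ignored.
    """
    def dist2(p, c):
        return sum((a - b) ** 2 for a, b in zip(p, c))

    def mean(points):
        n = len(points)
        return tuple(sum(channel) / n for channel in zip(*points))

    # Deterministic initialization (an assumption): darkest and brightest pixel.
    centers = [min(pixels, key=sum), max(pixels, key=sum)]
    for _ in range(iterations):
        groups = ([], [])
        for p in pixels:
            nearest = 0 if dist2(p, centers[0]) <= dist2(p, centers[1]) else 1
            groups[nearest].append(p)
        # Keep the old center if a cluster happens to be empty.
        updated = [mean(g) if g else c for g, c in zip(groups, centers)]
        if updated == centers:
            break
        centers = updated
    foreground = max(range(2), key=lambda i: sum(centers[i]))
    return centers[foreground]


if __name__ == "__main__":
    # Invented pixels: mostly black background plus a few warm-colored pixels.
    image = [(0, 0, 0)] * 6 + [(2, 1, 0)] * 2 + [(200, 120, 40)] * 3 + [(180, 100, 60)]
    print(two_means_foreground_mean(image))
```

For the HSV features, the same routine would be applied after converting each pixel to HSV, following the order of operations described in the text.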

C. Test Samples
Six datasets were used, which consisted of mixed nonferrous metal and stainless steel from different waste streams sourced from commercial material recovery facilities. The waste streams include "Zorba" (3-8 mm) and "Zurik" (8-25 mm) [36], biomass incinerator metals (BIMs), fridge metals, and window frames (WFs) in two different size ranges (8-25 mm and 25-75 mm). Zorba consists of shredded nonferrous metals and is predominantly aluminum, whereas Zurik is predominantly stainless steel [36]. BIM consists of metals that have been through an incinerator, leading to surface contamination on all pieces. The fridge metal and WF streams consist of shredded refrigerators and WFs, respectively. The differentiation of the input waste streams and the size filtering are consistent with industry standards and give a realistic presentation for a material separator.
The metal samples in the datasets were measured with an XRF handheld device (Hitachi X-MET8000 Optimum). The analyzer provided a metal composition and an industrial grade. The grade was used to label the pieces according to 11 output classes defined by material, such as copper, brass, and stainless steel. These output classes are consistent with expected returns for a commercial materials separation process. The XRF analyzer was not able to assign all metal pieces an industrial grade. In those cases, the pieces were labeled with an output class determined by the dominant element in the metal composition, e.g., samples with over 90% zinc were labeled as "Zinc." Samples for which the class was unclear from the composition were removed from the dataset.
The input waste streams, output classes, and the number of samples for each are shown in Table II. For the results that follow, we will not derive a classifier for any class label within a dataset where the number of samples is less than 20% of the total number of samples within that dataset; this is to ensure that enough pieces are present for training and testing. For example, we do not determine a classifier for aluminum in the BIM dataset as there is only one piece available. On the other hand, we can determine a classifier for brass as this makes up a significant proportion of the BIM dataset (Brass ∼65%/Not Brass ∼35%). If more than one metal class is >20%, such as aluminum and brass in 3-8 mm Zorba, we design two separate binary classifiers (one for each metal) using all samples in the dataset. In practice, industrial separators can only sort by binary classification (Class/Not Class). To remove multiple materials, one metal would be removed from the waste stream first, then the next metal by retesting the filtered material.
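The selection rule above can be written as a short sketch: a class receives its own binary classifier only if it makes up at least 20% of the dataset, and each classifier is trained on Class/Not Class labels built from all samples. The function names and the sample counts are hypothetical and are not taken from Table II.

```python
def eligible_classes(counts, min_fraction=0.20):
    """Return class labels whose share of the dataset is at least min_fraction."""
    total = sum(counts.values())
    return [label for label, n in counts.items() if n / total >= min_fraction]


def binary_labels(labels, target):
    """Map multi-class labels to 1 (target class) or 0 (everything else)."""
    return [1 if label == target else 0 for label in labels]


if __name__ == "__main__":
    # Invented counts for illustration only.
    counts = {"Brass": 65, "Aluminum": 1, "Zinc": 15, "Copper": 19}
    for target in eligible_classes(counts):
        print("train a binary classifier for:", target)
    print(binary_labels(["Brass", "Zinc", "Brass", "Copper"], "Brass"))
```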

D. Machine Learning
The Python library Scikit-learn V0.22.2 [34] was used for the training and implementation of the machine learning models. The models had different inputs depending on whether the color was used as a feature. When the model used induction only, there was a total of 60 inputs which consisted of the real and imaginary components of each frequency measured. When the color components were used, there was a total of 66 inputs, which consists of the 60 induction measurements, R, G, B, H, S, and V.
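The input counts follow directly from the 30 measurement frequencies (3 to 90 kHz in 3 kHz steps). The sketch below assembles one feature vector per sample; the ordering of the features (all real parts, then all imaginary parts, then color) and the function name are assumptions, since the text does not specify them.

```python
FREQS_KHZ = list(range(3, 91, 3))  # 3, 6, ..., 90 kHz gives 30 frequencies


def build_features(m_spectrum, color=None):
    """Flatten a complex spectrum (and optional R, G, B, H, S, V) into one list."""
    if len(m_spectrum) != len(FREQS_KHZ):
        raise ValueError("expected one complex value per measurement frequency")
    features = [m.real for m in m_spectrum] + [m.imag for m in m_spectrum]
    if color is not None:
        if len(color) != 6:
            raise ValueError("color must hold R, G, B, H, S, V")
        features += list(color)
    return features


if __name__ == "__main__":
    spectrum = [complex(0.1 * i, 0.05 * i) for i in range(len(FREQS_KHZ))]
    print(len(build_features(spectrum)))                                   # induction only
    print(len(build_features(spectrum, (195, 115, 45, 0.08, 0.77, 0.76))))  # with color
```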
All features are scaled between 0 and 1 prior to use; this is essential for SVM and KNN algorithms [33]. It is important to constrain a machine learning model's hyperparameters to reduce overfitting. The models were constrained by selecting a predefined range for the hyperparameters. The GridSearchCV function was used to find the combination of hyperparameters that achieved the highest accuracy.
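A minimal sketch of the 0-to-1 scaling is given below. It assumes the per-feature minimum and maximum are learned from the training rows and then reused on the test rows, which is a common way to avoid leaking test information; the text does not state which data the scaling was fitted on, and the function names are hypothetical.

```python
def fit_min_max(rows):
    """Return per-feature minima and maxima from a list of feature rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_min_max(rows, lows, highs):
    """Scale each feature to [0, 1] using previously fitted minima and maxima."""
    scaled = []
    for row in rows:
        scaled.append([
            # A constant feature carries no information; map it to 0.
            (x - lo) / (hi - lo) if hi > lo else 0.0
            for x, lo, hi in zip(row, lows, highs)
        ])
    return scaled


if __name__ == "__main__":
    train = [[1.0, 10.0], [3.0, 10.0], [2.0, 10.0]]
    lows, highs = fit_min_max(train)
    print(apply_min_max(train, lows, highs))
```

Test rows that fall outside the training range will map slightly outside [0, 1] under this scheme.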
SVM [37] has two hyperparameters: the first was the regularization, set to a range of 50 values between 0.01 and 100 increasing logarithmically. The second was the kernel, which refers to either linear, polynomial, or RBF. All three kernels were evaluated. KNN has one hyperparameter: the number of neighbors. The number of neighbors ranged from 1 to half the number of samples within the dataset.
RF and ET have two hyperparameters: the number of estimators, which refers to the number of decision trees used, and the maximum number of features to consider when looking for the best split. The number of estimators and the maximum number of features ranged from 1 to 20. Adaboost has only one hyperparameter: the number of estimators which ranges from 10 to 200 in step intervals of 10.
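The search ranges described above can be written out explicitly. The sketch builds plain dictionaries keyed by scikit-learn's parameter names (`C`, `kernel`, `n_neighbors`, `n_estimators`, `max_features`), which is the shape a grid search expects; the helper names and the dictionary layout are illustrative assumptions, and the authors' actual search code is not shown in the text.

```python
import math


def log_spaced(start, stop, count):
    """Return `count` values from start to stop, evenly spaced on a log10 scale."""
    lo, hi = math.log10(start), math.log10(stop)
    step = (hi - lo) / (count - 1)
    return [10 ** (lo + i * step) for i in range(count)]


def build_grids(n_samples):
    """Hyperparameter ranges as described in the text, one grid per model."""
    tree_grid = {
        "n_estimators": list(range(1, 21)),
        "max_features": list(range(1, 21)),
    }
    return {
        "svm": {"C": log_spaced(0.01, 100, 50), "kernel": ["linear", "poly", "rbf"]},
        "knn": {"n_neighbors": list(range(1, n_samples // 2 + 1))},
        "rf": dict(tree_grid),
        "et": dict(tree_grid),
        "adaboost": {"n_estimators": list(range(10, 201, 10))},
    }


if __name__ == "__main__":
    grids = build_grids(n_samples=80)  # dataset size chosen for illustration
    for name, grid in grids.items():
        combos = math.prod(len(values) for values in grid.values())
        print(name, "candidate settings:", combos)
```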
The algorithms used in this study are considered more traditional machine learning algorithms. A disadvantage of the traditional algorithms is that they are known to have high variance [38]. In addition, small datasets lead machine learning algorithms to overfit the training data [39]. Future training of models would benefit from a larger dataset consisting of more metal samples to reduce the risk of overfitting by the model. Different techniques, such as dropout, can be applied to algorithms such as deep artificial neural networks (ANNs) to reduce overfitting, which would help with small datasets [39]. Future design and research may benefit from the use of more complex algorithms such as ANNs.

E. Analysis and Comparison
The F1 score is used to compare performance between algorithms and feature sets. The F1 score is the harmonic mean of the precision and recall [33]; this means that the F1 score will only be high if both precision and recall are high [33]. The precision and the recall are also calculated. In what follows, we refer to precision as the purity rate and recall as the recovery rate, in line with the terminology more familiar to the material recovery industry. The purity rate describes the proportion of correct material within the sorted product after separation. The recovery rate describes the proportion of material correctly recovered from the total available within the input waste stream. These terms are formally defined as follows, noting their interchangeability with precision and recall:

$$\text{Purity} = \frac{TP}{TP + FP}, \qquad \text{Recovery} = \frac{TP}{TP + FN}$$

where TP, TN, FP, and FN are true positives, true negatives, false positives, and false negatives, respectively.

TABLE III. RECOVERY AND PURITY RATES FOR HIGHEST F1 SCORED ALGORITHMS THAT USE MAGNETIC INDUCTION ONLY

Stratified K-fold cross-validation is used to evaluate the machine learning models. K-fold cross-validation splits the input dataset into K predefined groups or folds, then trains with all the data except for one fold, which is reserved for testing. This process repeats until all folds have been evaluated as a test set. Stratified K-fold cross-validation also preserves the ratio of each class within the folds. We choose K = 10 for the work that follows.
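The three metrics can be computed from binary predictions as sketched below, using the standard precision and recall definitions that the text equates with purity and recovery. Returning zero when a denominator is empty is a convention chosen here for illustration, and the function name is hypothetical.

```python
def purity_recovery_f1(y_true, y_pred):
    """Return (purity, recovery, F1) for binary labels, where 1 marks the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    purity = tp / (tp + fp) if tp + fp else 0.0      # precision
    recovery = tp / (tp + fn) if tp + fn else 0.0    # recall
    total = purity + recovery
    f1 = 2 * purity * recovery / total if total else 0.0  # harmonic mean
    return purity, recovery, f1


if __name__ == "__main__":
    truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
    print(purity_recovery_f1(truth, predicted))
```

Because F1 is a harmonic mean, a classifier that ejects everything (perfect recovery, poor purity) still scores low, which is why it is a reasonable single figure for ranking models.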
The relatively small size of each dataset means that performance can be sensitive to the order of the samples. Therefore, the process explained in this section was repeated ten times, with the datasets shuffled at each iteration to randomize the order. Each algorithm was trained and tested with the same combination of shuffled data to allow a fair comparison. The mean F1 score, purity rate, and recovery rate across the ten shuffles are used to present the results within this article.
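The repeated, shuffled, stratified K-fold procedure described above can be sketched in plain Python as follows. This is an illustrative outline rather than the implementation used in the study: the helper names, the `evaluate` callback (which would train a model on the training indices and return a score on the test indices), and the fixed seed are all assumptions.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, rng):
    """Split sample indices into k folds with roughly equal class ratios."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)  # randomize sample order within the class
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)  # deal round-robin into folds
    return folds


def repeated_stratified_cv(labels, evaluate, k=10, repeats=10, seed=0):
    """Mean score over `repeats` reshuffled runs of stratified k-fold CV."""
    rng = random.Random(seed)  # same seed gives every algorithm the same splits
    scores = []
    for _ in range(repeats):
        for test_idx in stratified_folds(labels, k, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(labels)) if i not in held_out]
            scores.append(evaluate(train_idx, test_idx))
    return sum(scores) / len(scores)


# Hypothetical dataset: 30 aluminum and 20 brass samples.
labels = ["aluminum"] * 30 + ["brass"] * 20
folds = stratified_folds(labels, k=10, rng=random.Random(0))

# Dummy callback standing in for "train, then score on the held-out fold".
mean_score = repeated_stratified_cv(
    labels, evaluate=lambda train_idx, test_idx: len(test_idx) / len(labels)
)
```

Reusing one seed for every algorithm is one way to ensure that each model sees the same sequence of shuffled splits, as required for a fair comparison.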

IV. RESULTS AND DISCUSSION
In the first part of this section, we explore the efficacy of using the magnetic induction spectra alone as a feature set, using the apparatus and method described in Section III-A. In the second part, we explore the improvement from combining the magnetic induction spectra with sample color to create a wider feature set, with color measured using the imaging system described in Section III-B. Fig. 5 shows the F1 scores when classifying stainless steel, brass, zinc, and aluminum across the six datasets described in Section III-C. The results are obtained using the five machine learning models reported in Section II-B: the SVM and KNN algorithms, and three ensemble classifiers, namely 1) RFs; 2) ETs; and 3) AdaBoost. Table III summarizes the highest performing models according to the F1 score across each dataset, with their associated purity and recovery rates, for the four different output classes (metal types) using the magnetic induction spectra as the sole feature set.
From Table III, stainless steel was classified with a >99% recovery and >93% purity rate; this was achieved using RF and ETs.
Brass within the BIM dataset obtained a 97.53% recovery rate and an 83.34% purity rate using SVM. It is clear from Fig. 5 that this performance was consistent across all classifiers (F1 score from 0.866 to 0.898). The total number of brass samples … This result for brass was not repeated with the 3-8 mm Zorba dataset, where classification fell to a 54.67% recovery rate and a 53.82% purity rate. Zinc achieved an 81.5% recovery rate and a 77.28% purity rate using RF. The result was relatively consistent across the machine learning models except for KNN, which had an F1 score ∼0.1 lower than the median.
Aluminum achieved F1 scores between 0.87 and 0.90 across three datasets, with recovery rates between 89.45% and 96.15% and purity rates between 81.26% and 90.64%. However, the 3-8 mm Zorba dataset obtained lower recovery and purity rates of 70.33% and 52.73%, respectively. The results for aluminum with the 3-8 mm Zorba were consistent with the brass classification results, which also showed a marked performance drop across the same dataset. This reduction is due to the presence of aluminum alloys in the Zorba with similar conductivities to brass. From (1), the induction spectra are a function of the sample conductivity and morphology. When the induction spectra are used as a feature set, the machine learning models effectively classify the material according to conductivity, while minimizing sensitivity to sample size and shape. Aluminum as an element has a conductivity of 65% IACS [40]. However, within the 3-8 mm Zorba dataset, most aluminum pieces present are cast AL-383 and AL-384, with conductivities around 23% IACS [40]. Cast aluminum conductivity is therefore similar to that of brass, at around 26% IACS [41]. By contrast, the other datasets are mostly wrought aluminum alloys (AA-1100, AA-4343, AA-6070, and AA-6151) with 42%-59% IACS [41]. For example, zinc, with 28% IACS [42], is well separated in the WF dataset because it is mostly compared against wrought aluminum. The variation in conductivity poses a limitation on this approach when attempting to classify distinct elements with similar conductivities. This limitation should be acknowledged when training a model, as it would be better to group metals of similar conductivity together, such as cast aluminum and brass, when only magnetic induction is used. However, it also presents an opportunity to separate alloys with high conductivity contrasts, for example, wrought from cast aluminum.
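To make the conductivity argument concrete, the short sketch below tabulates the gap in % IACS between the material pairs compared in the preceding paragraph, using only the approximate figures quoted there. The dictionary layout and the `conductivity_gap` helper are illustrative assumptions. The gap is a rough indicator of contrast, not a predictor of classifier performance.

```python
# Approximate conductivity ranges in % IACS, as quoted in the text (low, high).
CONDUCTIVITY = {
    "cast aluminum": (23, 23),     # AL-383 / AL-384, "around 23%"
    "brass": (26, 26),             # "around 26%"
    "zinc": (28, 28),
    "wrought aluminum": (42, 59),  # AA-1100, AA-4343, AA-6070, AA-6151
}


def conductivity_gap(a, b):
    """Smallest separation between two materials' ranges (0 if they overlap)."""
    lo_a, hi_a = CONDUCTIVITY[a]
    lo_b, hi_b = CONDUCTIVITY[b]
    return max(0, lo_a - hi_b, lo_b - hi_a)


# Only the pairs discussed in the text are compared here.
pairs = [
    ("cast aluminum", "brass"),
    ("zinc", "wrought aluminum"),
    ("cast aluminum", "wrought aluminum"),
]
gaps = {pair: conductivity_gap(*pair) for pair in pairs}
for (a, b), gap in gaps.items():
    print(f"{a} vs {b}: {gap} % IACS")
```

On these figures, cast aluminum and brass are separated by only about 3% IACS, whereas wrought aluminum sits at least 14% IACS away from zinc and 19% IACS away from cast aluminum, in line with the contrasts described above.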
In Table III, we found poor performance within the Zorba dataset when separating brass from aluminum of similar conductivity. We hypothesize that adding sample color to the magnetic induction spectra as a combined feature set will improve this result. Fig. 6 shows each machine learning model's F1 score when RGB and HSV parameters are included as features. RGB and HSV are extracted using the method outlined in Section III-B. Fig. 7 shows the difference between Fig. 6 and the previous magnetic induction spectra results presented in Fig. 5. Table V summarizes the recovery and purity rates of the highest-performing results for each metal class.
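One simple way to form such a combined feature set is to append the color parameters to the spectral features, as sketched below. The concatenation, the 0-1 scaling, the use of Python's standard `colorsys` module, and the placeholder spectrum values are all assumptions made for illustration; the actual color extraction is the method of Section III-B.

```python
import colorsys


def combined_features(mis_spectrum, rgb):
    """Concatenate MIS spectral features with RGB and HSV color features.

    mis_spectrum: sequence of numbers derived from the induction spectrum
                  (hypothetical layout).
    rgb:          mean sample color as 8-bit (R, G, B) values.
    """
    r, g, b = (channel / 255.0 for channel in rgb)  # scale to the 0-1 range
    h, s, v = colorsys.rgb_to_hsv(r, g, b)          # HSV, each also in 0-1
    return list(mis_spectrum) + [r, g, b, h, s, v]


# Placeholder values, not measured data.
mis = [0.12, -0.30, 0.45, -0.20]
features = combined_features(mis, rgb=(200, 170, 60))
print(len(features))  # 4 spectral features + 6 color features
```

In practice the spectral and color features may sit on different scales, so a scale-sensitive model such as SVM or KNN would typically need the combined vector to be standardized before training.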
Stainless steel, which already had a high F1 score across the Zurik and WF datasets, does not improve with color; this shows that MIS alone is sufficient to classify stainless steel. Similarly, color did not significantly improve the classification accuracy for brass within the BIM dataset; this is unsurprising given the extent of surface contamination on the BIM metal pieces caused by incineration, which leaves all the pieces with a similar color. On the other hand, zinc showed a notable improvement in classification when using color: recovery increased by 13 percentage points and purity by 8.1 percentage points. The most significant improvement was for brass and aluminum within the 3-8 mm Zorba dataset. For brass, the improvement was most evident for the SVM model, where the F1 score increased from 0.4142 to 0.8501, yielding recovery and purity rates of 89.09% and 87.2%, respectively. Aluminum F1 scores increased with color across all datasets apart from the 25-75 mm WF dataset. In the 3-8 mm Zorba dataset, recovery and purity rates improved to 91% and 75%, respectively. These results support our hypothesis that color contrast can improve the classification of metals when conductivities are similar and no significant surface contamination is present.
One problem not addressed in this study is that pieces may overlap on an industrial conveyor, although this is a potential challenge for all dry classification methods. We expect that the use of a vibratory feeder would reduce the chance of metal pieces overlapping.
Rigorous performance measurements of calculation speed for the machine learning models are beyond the scope of the present work. However, our preliminary estimates indicate classification times of less than 1 ms for all algorithms apart from AdaBoost, which was slightly slower at <3 ms. A classification speed of <1 ms is practical for an industrial separator. For example, a typical conveyor speed of ∼2 m/s and a distance of up to 0.5 m between the sensing element and ejector manifold would yield 250 ms of available classification time.
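The timing budget quoted above follows directly from dividing the sensor-to-ejector distance by the conveyor speed. A minimal check of that arithmetic, using a hypothetical helper name and the figures from the text, is:

```python
def available_time_ms(distance_m, speed_m_per_s):
    """Travel time from sensor to ejector, in milliseconds."""
    return 1000.0 * distance_m / speed_m_per_s


budget_ms = available_time_ms(distance_m=0.5, speed_m_per_s=2.0)
slowest_classifier_ms = 3.0  # upper estimate quoted for AdaBoost

print(f"budget: {budget_ms:.0f} ms")
print(f"headroom factor: {budget_ms / slowest_classifier_ms:.0f}x")
```

Even the slowest estimate leaves a large margin, although a real separator would also have to fit signal acquisition, image processing, and ejector actuation within the same window.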

V. CONCLUSION
MIS offers a new approach to the classification of nonferrous scrap metals in the recycling and waste recovery sector. The authors first posited an MIS approach using two-frequency component classification [28]. However, the effectiveness was found to be limited to classes of waste pulled from commercial production lines [9].
This article presents the first results of using multifrequency MIS. This progression from O'Toole and Peyton [9] uses more frequencies across a wider spectrum to derive features for classification, trading the simplicity of a two-frequency-component approach for the information provided by a fuller induction spectrum. We demonstrate the use of multifrequency MIS to successfully classify valuable nonferrous scrap metals, including stainless steel, brass, aluminum, and zinc, from within different industry-standard waste streams (datasets), such as Zorba and Zurik. MIS could achieve a >99% recovery rate and >93.25% purity rate when classifying stainless steel. Good classification performance was found generally across the different metals and datasets, with recovery and purity rates greater than 80% in most cases. MIS achieved 98.63% recovery and 82.05% purity of brass from the BIM dataset, indicating immunity to surface contamination in contrast to optical or some X-ray techniques. The exception in performance was the 3-8 mm Zorba dataset, where aluminum and brass classification fell to between 50% and 70% purity and recovery rates. This is attributed to the aluminum AL-384 alloy present in the dataset, which has a conductivity of 23% IACS, close to that of brass, provoking misclassification.
Introducing color components (RGB and HSV) as additional features combined with MIS improved classification performance for brass, zinc, and aluminum metals across the majority of datasets. This improvement was the most marked across the previously poorly performing 3-8 mm Zorba, where the recovery and purity rates improved to a more acceptable 91% and 75%, respectively, for aluminum, and 89.09% and 87.2% for brass. Across the different machine learning models used, RF, ETs, and SVM yielded better results than other algorithms.
MIS, independently or combined with color parameters, is shown to be an effective and robust method for the classification and recovery of nonferrous scrap metals. MIS alone is insensitive to surface contamination on the sample, although it is limited where conductivity contrasts between metals are poor. The support of color components substantially improves performance in this case, at the cost of being subject to surface contamination. There is some balance to be achieved in weighting MIS measurements against color, between classification capability and sensitivity to surface contamination, dependent on the characteristics of the waste stream being sorted. Nevertheless, our findings suggest that for general Zorba as tested here, the combination of the two approaches will supersede either method individually.
The effectiveness of color is subject to good lighting conditions, which were achieved in this study with the lighting dome. However, industrial conditions would certainly present a more variable and challenging environment in which to deliver this consistent illumination, as noted by Kutila et al. [22]. There is scope for the use of enclosures or hoods over the conveyor to control light sources, such as the scheme in Tachwali et al. [43], diffused and polarized light sources such as in Pramerdorfer and Kampel [44], or elliptical reflectors as in Barnabé et al. [45]. This could further be complemented with air to blow away any dust particles, a standard approach in the industry. An MIS and color system would need to be partnered with a mechanical mechanism to eject the classified scrap metal into the required bins. Ejection could be achieved with air jets, which is industry standard [4], [10]. A limitation of the magnetic induction and color sensors is the inability to separate nonferrous metals which have similar conductivity and surface contamination. A further limitation is that induction measurements ideally require the metal piece to be close to the sensor, which can be difficult with a bouncing conveyor and rolling pieces. The feasibility of this technology has been demonstrated herein, and we continue to develop a mixed-metal separation solution for high-throughput and midcost recovery of some of the most common and valuable nonferrous materials.