Cyber-Physical Security with RF Fingerprint Classification through Distance Measure Extensions of Generalized Relevance Learning Vector Quantization



Introduction
Cyber-physical systems (CPSs) are increasingly found in critical infrastructure (CI) applications to enable the industrial Internet of Things (IoT), with ever-increasing security implications (e.g., [1]). CPSs in industrial uses, e.g., energy systems, integrate computing, communications, and control and must be dependable, safe, and secure while enabling real-time operations [2]. Due to the gravity of CI systems, accurate identification and authentication of communication devices is important. Radio frequency (RF) fingerprinting extracts features from RF signals and develops classifier models to discriminate between communication devices [3,4]. RF Distinct Native Attributes (RF-DNA) fingerprinting extends this general process by developing statistical features from regions of RF signals and has been shown to robustly enable biometric-like identification at the serial number level [4]. This task requires classifier algorithms whose models can discriminate between classes based on minute differences and can provide reliable authentication during masquerade attacks [4,5].
Recent advances in classifier applications for RF fingerprinting include (1) discriminant analysis, (2) Generalized Relevance Learning Vector Quantization Improved (GRLVQI), (3) learning from signals (LFS), and (4) random forests [6,7]. Of these methods, GRLVQI is one of the most robust, but algorithmically it has limitations that need to be addressed when considering the minute variations inherent in discriminating devices at the serial number level.
GRLVQI and the LVQ family of algorithms, in general, are gradient-descent-based algorithms that compute a distance from each exemplar to the nodes, termed prototype vectors (PVs), and then find the nearest PV to the exemplar [8]. The standard distance measure in GRLVQI, and all LVQ algorithms, is the linear squared Euclidean distance measure [9]. However, Euclidean distances present limitations because they are adversely affected by high dimensionality [10]. In addition, Euclidean distances are scale variant while being translation invariant [11]; alternatively, for example, a cosine distance is translation variant but scale invariant [11]. A measure that is scale invariant but translation variant has the potential to enable better classification of groups that differ based on minute, fingerprint-like variations.
This work extends GRLVQI, and LVQ in general, to use a cosine distance measure. This is a nontrivial modification since the distance measure is an implicit part of the cost function in all LVQ algorithms. Updating LVQ algorithms for alternative distance measures thus requires computing the first derivative (gradient) of the new distance measure to appropriately incorporate it into a revised cost function. This is an important matter to consider and is often neglected, e.g., [12-15]. This paper addresses this limitation by presenting GRLVQI-D (distance), which is a straightforward framework for incorporating distance measure extensions of GRLVQI and LVQ in general, with an example using a cosine distance measure. Modifying GRLVQI is notably complex since it includes multiple embellishments, e.g., both a sigmoid-based cost function and a relevance learning approach. The new Cosine GRLVQI classifier is then applied to an example CPS application in the form of RF fingerprinting an experimentally collected Z-wave wireless personal area network (WPAN) dataset. Classification results show that Cosine GRLVQI offers a distinct performance advantage over the original GRLVQI, as well as over (1) the original GRLVQI with optimized settings and (2) MDA.

This paper is organized as follows: Section 2 discusses the CPS environment and the need for reliable CPS identification methods. Section 3 discusses LVQ algorithms in general and GRLVQI specifically. Section 4 develops a straightforward approach to changing the distance measure in GRLVQI to any differentiable measure, with a specific example presented using cosine distance. Results are then presented in Section 5, showing a distinct classification performance advantage when using Cosine GRLVQI over the baseline squared Euclidean distance method. Finally, Section 6 concludes the paper.

Cyber-Physical System (CPS) Device Identification
CPSs serve as a backbone for IoT connectivity ranging from critical infrastructure to home automation. Of interest is adopting a biometric-inspired approach, which involves three important steps: library creation, classifier model development, and classifier model verification [16]. Library creation involves selecting and measuring appropriate signatures, classifier model development involves selecting appropriate algorithms that can discriminate between signatures, and classifier model verification involves assessing the robustness of the trained classifier to a claimed identity. Assembled effectively, a library and quality classifier model facilitate characterizing the system and understanding normal operations, while the verification approach enables monitoring for intrusion detection and thus fits into general autonomic visions of self-protecting IoT systems [17]. Security in CPS largely focuses on bit-centric Network (NWK) layer and Media Access Control (MAC) sublayer improvements [18]. One can view the various security measures as [18]:
(1) "Something you know" (NWK: encryption keys)
(2) "Something you have" (MAC: MAC address)
(3) "Something you are" (PHY: RF fingerprints)
Such commonsense understandings relate how bit-level device identification credentials are easily spoofed and exploited by hackers. In a biometric understanding, one can consider a MAC address as the claimed identity to be verified using PHY-layer knowledge.
PHY-based security measures provide one remedy to these deficiencies and operate by either (1) incorporating physically traceable components to devices [19] or (2) RF fingerprinting which exploits inherent characteristics of the signal [20,21]. Herein, RF fingerprinting is primarily of interest since it does not require retrofitting CPS devices or changing well-established CPS manufacturing approaches to include physically traceable components.

Radio Frequency (RF) Fingerprinting.
Fingerprinting in communication systems considers physical layer (PHY) attributes, which are intrinsic to a specific device. RF fingerprinting extracts statistical features from RF signals and enables biometric-like identification of communication devices [3,4]. Thus, principles from biometrics (see [16,22]) and digital forensics (see [23]) are leveraged to develop, select, and identify RF features which have the general biometric qualities of universality, distinctiveness, permanence, and collectability [24]. A variety of RF fingerprinting approaches exist to accomplish this, including both transient and steady-state methods [24,25].
Steady-state methods are of interest herein and consider specific segments of RF emissions. Of interest in steady-state RF fingerprinting is comparing emissions from predefined signal characteristics, e.g., preambles, to discriminate devices via unique features, i.e. from production variations [4,26].
Thus, the analysis can be divorced from attack modes, e.g., the types/volumes of data being transmitted, and can focus on identifying individual devices based on minute variations in the selected region [24].
In the RF-DNA fingerprinting process conceptualized in Figure 1, RF fingerprints are computed as statistical features from time-domain responses of instantaneous amplitude (a), phase (ϕ), and frequency (f) [4]. Each response is then divided into N_R contiguous, equal-length intervals [4]. Within each interval, statistics of variance (σ²), skewness (γ), and kurtosis (κ) are computed, along with additional features for the entire response [4].
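The statistics step described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names `region_stats` and `rf_dna_features` are hypothetical, the response is assumed to already be available as a list of real-valued samples, the "additional features for the entire response" are assumed to be the same three statistics, and population (biased) moments with non-excess kurtosis are assumed because the text does not state which estimators were used.

```python
def region_stats(samples):
    """Return (variance, skewness, kurtosis) for one region of a response.

    Assumption: population (biased) central moments and non-excess kurtosis.
    """
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((s - mean) ** 2 for s in samples) / n
    m3 = sum((s - mean) ** 3 for s in samples) / n
    m4 = sum((s - mean) ** 4 for s in samples) / n
    if m2 == 0.0:
        # Constant region: skewness and kurtosis are undefined, so report zeros.
        return 0.0, 0.0, 0.0
    return m2, m3 / m2 ** 1.5, m4 / m2 ** 2


def rf_dna_features(response, n_regions):
    """Statistics for n_regions contiguous, equal-length intervals,
    followed by the same statistics over the entire response."""
    length = len(response) // n_regions  # leftover tail samples are not assigned to an interval
    features = []
    for r in range(n_regions):
        features.extend(region_stats(response[r * length:(r + 1) * length]))
    features.extend(region_stats(response))
    return features
```

Under these assumptions, one response yields 3(N_R + 1) features, so concatenating the amplitude, phase, and frequency responses would give 9(N_R + 1) features per fingerprint.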

Classification Algorithms for RF Fingerprinting.
To discriminate between RF fingerprints and provide accurate identification of individual CPS devices, one needs to develop and train an effective classifier model. Herein, supervised classification is considered to develop a classifier model that takes labelled data, i.e. RF signatures and known identities, from the library of authenticated devices. From here, pattern recognition algorithms are employed to develop a mapping that separates the known identities (or groups) [27].
A variety of classifier algorithms have been considered for RF fingerprinting; see [6] for one example. Both GRLVQI and Multiple Discriminant Analysis (MDA) have seen consistent and successful use in RF fingerprinting discrimination [6]. MDA is considered, consistent with [6], to evaluate baseline performance. MDA operates via an eigenspace projection to find optimal linear separation between groups, where the underlying process extends Fisher's discriminant analysis to multiple classes [6]. MDA is computationally inexpensive, easy to interpret, and competitive with more complex algorithms [6]. Conversely, GRLVQI is more computationally intensive, but various applications can benefit from the nonlinear mappings inherent in GRLVQI, and thus GRLVQI can outperform MDA depending on the application [6].
Notably, GRLVQI and machine learning algorithms in general are highly sensitive to hyperparameter settings, such as learning rates and architecture size [28]. Although work has considered finding optimal settings for such algorithms, e.g., GRLVQI-SD, which is GRLVQI with hyperparameters optimized through the Stochastic Optimization via Sequential Design approach of [28], such approaches are computationally costly, with dozens of iterations needed to obtain improved algorithm settings. Additionally, such highly tuned hyperparameter values are often specific to the scope of the data and thus not usable on other datasets. Of interest herein is taking the well-known GRLVQI classifier in this domain and improving it to better fit data that vary only in small, minute dimensions, e.g., RF fingerprinting data.
This extended classifier algorithm will then be compared against MDA, the baseline GRLVQI algorithm, and the optimized GRLVQI-SD.

Classification and Verification Performance.
Assessing classifier performance involves using the appropriate performance measures. In RF fingerprinting and biometrics in general, classification considers how well a model discriminates among authorized fingerprints in a "1-vs-many" situation [24]. In contrast, verification takes the trained classifier model and a claimed identity, e.g., from a MAC address, and evaluates that claim as a device attempts to gain network access, i.e., a "1-vs-1" assessment [24]. Classification accuracy is considered as average percent correctly classified versus SNR (dB) operating points. Verification is considered as the percentage correctly authorized in a "1-vs-1" claimed MAC address identity scenario [29]. Within both performance evaluation paradigms, a few measures are considered.

Classification Accuracy Measures.
To evaluate classification performance, a plot of average percent correct classification (%C) versus SNR is considered [29]. At each discrete SNR point on the plot, a classifier model was developed and trained for data at that SNR level. To provide for assessment and comparison of methods, both a gain measure and a Relative Accuracy Percentage (RAP) measure can be used [29]. Gain is defined, per [24,29], as the reduction in required SNR, in dB, for a method to achieve the same %C as a reference method. Generally, gain is evaluated at an arbitrary benchmark of %C = 90% accuracy [24,29]. As stated in the study by Bihl et al. [29], gain values, G_SNR, are interpreted as follows:
(1) G_SNR < 0.0 (negative gain), wherein a given method underperforms a baseline method by achieving the same %C as the baseline at a higher SNR
(2) G_SNR = 0.0, wherein a given method is indistinguishable in performance from the baseline method by achieving the same %C at the same SNR
(3) G_SNR > 0.0 (positive gain), wherein a given method outperforms a baseline method by achieving the same %C at a lower SNR
However, G_SNR can be insufficient for relative performance comparisons because it only considers one part of the %C vs. SNR curve [29]. To alleviate this deficiency, the authors introduced the RAP measure in the study by Bihl et al. [29] to provide classifier assessment over the entire curve.
RAP measures are computed by first taking the %C vs. SNR curve and computing the area under this curve via trapezoidal approximation [29]. This is known as the Area Under Classification Curve (AUCC). Since the x-axis is not bounded on a simple 0 to 1 interval, it can be nonintuitive to interpret raw AUCC values. Thus, the RAP measure enables relative comparisons by considering
$$\mathrm{RAP} = \frac{\mathrm{AUCC}_{\mathrm{method}}}{\mathrm{AUCC}_{\mathrm{baseline}}},$$
where AUCC_method is the AUCC of a given method and AUCC_baseline is the AUCC of the baseline algorithm [29]. As developed in the study by Bihl et al. [29], RAP is interpreted as follows:
(1) RAP < 1.0 indicates that a given method achieves overall lower %C than the baseline across all SNR
(2) RAP = 1.0 indicates that a given method achieves an overall %C comparable to the baseline
(3) RAP > 1.0 indicates that a given method achieves overall better %C than the baseline across all SNR

Verification performance is assessed with two measures:
(1) The percentage authorized (%Aut) at an arbitrary TVR ≥ 90% at FVR ≤ 10% threshold
(2) The mean area of the ROC curves (AUC_M)
The AUC_M approach was developed in the study by Bihl et al. [29] to avoid dichotomous performance results since %Aut reflects coarse sampling.
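The gain and RAP measures can be sketched as below. This is an illustrative sketch only: the function names are hypothetical, RAP is taken as the ratio of the method's AUCC to the baseline's AUCC (consistent with the interpretation above), and the SNR at the 90% benchmark is found by linear interpolation between sampled operating points, which is an assumption about how the crossing point is located.

```python
def aucc(snr_db, pct_correct):
    """Area under the %C vs. SNR curve via the trapezoidal rule."""
    area = 0.0
    for i in range(1, len(snr_db)):
        area += 0.5 * (pct_correct[i] + pct_correct[i - 1]) * (snr_db[i] - snr_db[i - 1])
    return area


def rap(snr_db, pct_method, pct_baseline):
    """Relative accuracy: AUCC of the method divided by AUCC of the baseline."""
    return aucc(snr_db, pct_method) / aucc(snr_db, pct_baseline)


def snr_at(snr_db, pct_correct, target=90.0):
    """Lowest SNR at which the curve first reaches `target` %C.

    Uses linear interpolation between samples (an assumption);
    returns None if the target is never reached.
    """
    if pct_correct[0] >= target:
        return snr_db[0]
    for i in range(1, len(snr_db)):
        lo, hi = pct_correct[i - 1], pct_correct[i]
        if lo < target <= hi:
            frac = (target - lo) / (hi - lo)
            return snr_db[i - 1] + frac * (snr_db[i] - snr_db[i - 1])
    return None


def gain_db(snr_db, pct_method, pct_baseline, target=90.0):
    """Reduction in required SNR (dB) for the method to match the baseline at `target` %C."""
    method_snr = snr_at(snr_db, pct_method, target)
    baseline_snr = snr_at(snr_db, pct_baseline, target)
    if method_snr is None or baseline_snr is None:
        return None
    return baseline_snr - method_snr
```

For example, with made-up curves (not results from this paper) `snr = [0, 10, 20]`, `baseline = [50, 70, 90]`, and `method = [60, 90, 100]`, the method reaches 90% at 10 dB and the baseline at 20 dB, so the gain is positive and RAP exceeds 1.0.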

Learning Vector Quantization (LVQ) Classifiers
LVQ is an artificial neural network (ANN) approach that classifies data via lower-dimensionality maps. LVQ provides a supervised learning extension of unsupervised self-organizing maps (SOMs) or vector quantization (VQ). Epistemologically, SOM algorithms are self-organizing ANNs [31], and a general example is conceptualized in Figure 2, which compares a typical three-layer (input, hidden, and output) ANN in Figure 2(a) with a typical LVQ network in Figure 2(b). Of note is that the LVQ network does not have an output layer, which maps the response of a typical ANN to the output class; LVQ networks operate differently and work to move the nodes, prototype vectors (PVs), to represent the underlying data through an iterative training process [8,33]. LVQ considers each PV as associated with a specific class, resulting in a "winner take all" approach where one and only one PV will win for each exemplar [8,34-38].

The operation of LVQ is as follows: a distance measure is used to compute the distance of a given exemplar to all PVs. The PV that is closest to the exemplar is then selected for modification. If this PV has the same class label as the exemplar, it is moved closer to the exemplar; however, if the exemplar is misclassified, the PV is moved away [8]. The update process for PVs employs a general gradient descent:
$$w(t+1) = w(t) - \varepsilon(t)\,\nabla C(w(t))$$
to train PVs, with t being the training iteration number, ε(t) being the learning rate, w(t) being a given PV, C(w(t)) being the cost function, and ∇ implying the gradient [6,9,39]. The cost function in LVQ is the squared Euclidean distance used to find the distance between an exemplar and the PVs. Generally, LVQ algorithms train PVs by moving correctly classified PVs closer to a given class and moving incorrectly classified PVs away from a given class. Thus, LVQ is considered a nearest neighbor approach to learning, and the nearest PV is iteratively moved to characterize the data [40].
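The basic LVQ operation described above can be sketched as a single training step. This is a minimal sketch of plain LVQ with a squared Euclidean distance, not GRLVQI; the names `sq_euclidean` and `lvq1_step` are illustrative, and the factor of 2 from differentiating the squared distance is assumed to be absorbed into the learning rate `lr`.

```python
def sq_euclidean(x, w):
    """Squared Euclidean distance between exemplar x and prototype vector w."""
    return sum((xi - wi) ** 2 for xi, wi in zip(x, w))


def lvq1_step(x, label, prototypes, proto_labels, lr):
    """One winner-take-all update.

    The nearest PV moves toward x when its class matches `label`
    and away from x otherwise. `prototypes` is modified in place;
    the index of the winning PV is returned.
    """
    winner = min(range(len(prototypes)), key=lambda k: sq_euclidean(x, prototypes[k]))
    sign = 1.0 if proto_labels[winner] == label else -1.0
    prototypes[winner] = [
        wi + sign * lr * (xi - wi) for xi, wi in zip(x, prototypes[winner])
    ]
    return winner
```

Calling `lvq1_step` repeatedly over shuffled exemplars, typically with a decaying `lr`, gives the iterative training loop; swapping `sq_euclidean` for another measure would also require changing the update direction to that measure's gradient, which is the issue addressed in Section 4.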
Various extensions and embellishments of LVQ have been developed, differing in cost function, update logic, and the inclusion of additional computational steps (e.g., relevance computations) [41]. Kohonen [42] first extended LVQ by creating variants (e.g., LVQ2 and LVQ2.1) that improved the PV update strategy to updates involving both in-class and nearest out-of-class PVs. Further major LVQ variations are reflected through the addition of letters to the LVQ acronym, cf. [41,43]. One such algorithm is GRLVQI, which is decoded as follows: G (generalized): a sigmoidal cost function [44,45], R (relevance): relevance learning [39,46], and I (improved): PV update logic and operation [9,47].
Relevance LVQ (RLVQ) extended LVQ by incorporating a relevance weight for each data feature, which is learned during the training process [46]. GLVQ extends LVQ by improving class boundary approximations through the incorporation of a sigmoid cost function [44]. GRLVQ of Hammer and Villmann [39] combined the innovations of both GLVQ and RLVQ to create a GLVQ algorithm that learned the input dimension weights to provide relevance information regarding each feature. GRLVQ was then further extended by Mendenhall [9] through improvements resulting in the GRLVQI algorithm. In GRLVQI, GRLVQ is extended with the conscience learning of DeSieno [48], improved PV update logic, and a frequency-based maximum input update strategy [9,47]. Despite any embellishment differences, all LVQ algorithms similarly employ the gradient-descent process seen in equation (1) to train PVs via nearest neighbor approaches.

Generalized Relevance Learning Vector Quantization Improved (GRLVQI).
GRLVQI has been applied to RF fingerprinting due to its inherent nonlinearities and potential to better learn nonlinear data manifolds over MDA and linear methods. For GRLVQ and GRLVQI, the underlying cost function for equation (2) is where f(μ(x_m)) is a sigmoid and ψ are the relevance scores: and μ(x_m) is a relative distance difference: with τ being a GRLVQI rate (implicitly, in GRLVQ τ = 1), and d_J and d_K being distances between the exemplar x_m and the in-class PV, w_J, and out-of-class PV, w_K, respectively [9,39]. Nominally, d_J and d_K are computed via a squared Euclidean distance as To determine the PV update expressions for equation (3), the gradient descent for GRLVQ-type algorithms is then the gradient by chain and quotient rules multiplied by the learning rate, ε(t), and a differential shifting. The process yields where the superscript indicates if a positive (+) update, for in-class PVs, or a negative (−) update, for out-of-class PVs, is performed. In equation (7), the numerator includes the distance d_{K,J}, indicating that d_K is used for w_J and d_J is used for w_K. Relevance learning in GRLVQI then involves a further gradient descent: where for a specific q-th feature, the derivative is computed with respect to the relevance ψ [9] as

Distance Extensions for GRLVQ and GRLVQI Classifiers
Despite the various extensions, GRLVQI, as well as many LVQ variations, relies on the linear squared Euclidean distance measure. As noted above, in the introduction,

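Figure 2: (a) a typical three-layer ANN with output nodes and (b) a typical LVQ network with prototype vectors (PVs), each shown for inputs x_1 through x_p.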

GRLVQI-D: Distance Extension Framework for GRLVQI.
Because LVQ algorithms are trained using gradient descent, changing the distance measure necessarily requires computing the first derivative of the distance measure for appropriate inclusion into the cost function. Herein, distance-based extensions of LVQ, specifically GRLVQ and GRLVQI, are considered as a gradient-descent process. We can develop GRLVQI-D, a straightforward approach to changing GRLVQI distance measures, by considering the various derivational pieces that are represented in equation (7). Using the developed derivative framework, GRLVQ and GRLVQI could be further extended with any differentiable distance measure. To accomplish this, we can represent equation (7) as and equation (9) as where c is a constant, c = 4 for nominal GRLVQI, and Thus, for example, if one changes the distance measure, only D_m must be changed in equation (10), whereas the remainder of the expression is unchanged.
One final extension must be considered. Squared Euclidean distances are always nonnegative, which ensures that in equation (4), μ(x_m) ∈ [−1, +1]; this is desirable to prevent μ(x_m) from creating unstable results. Nonsquared distances do not necessarily ensure a positive distance. Thus, the authors extend equation (5) to This creates a squared measure to ensure that μ(x_m) ∈ [−1, +1].
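The need for a squared measure can be illustrated numerically. The sketch below assumes the standard GLVQ form of the relative distance difference, (d_J − d_K)/(d_J + d_K), with the GRLVQI rate τ omitted; the function name `mu` and the example values are illustrative only.

```python
def mu(d_in, d_out):
    """Relative distance difference in the standard GLVQ form (assumed here):
    (d_J - d_K) / (d_J + d_K), with d_J the in-class and d_K the out-of-class distance."""
    return (d_in - d_out) / (d_in + d_out)


# Nonnegative distances keep mu inside [-1, +1].
bounded = mu(1.0, 3.0)

# A signed measure (e.g., a raw cosine value) can push mu far outside that range
# because the denominator can approach zero.
raw_in, raw_out = 0.5, -0.4
unbounded = mu(raw_in, raw_out)

# Squaring each measure first restores the bound.
squared = mu(raw_in ** 2, raw_out ** 2)
```

Here `bounded` and `squared` both lie in [−1, +1], while `unbounded` is roughly 9, which would drive the sigmoid and its derivative into a poorly conditioned region.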

Cosine GRLVQI.
As an example of using the straightforward GRLVQI-D process presented in Section 4.1, the authors use this process to derive a Cosine GRLVQI algorithm. As mentioned previously, a cosine distance could be useful for discriminating exemplars that are similar in operational characteristics but differ based on minute characteristics, e.g., biometrics. A cosine distance measure is a similarity measure that computes the cosine of the angle between two vectors [51], i.e., a measure of orientation and not magnitude (translation variance but scale invariance). The cosine distance measure can be formulated as Following the discussion in Sections 3.1 and 4.1, a relevance learning can be formulated as with its derivative via the quotient rule being Considering equation (14) with a derivative for relevance yields the following: Since cosine distance measures do not ensure a positive distance, the formulation of equation (12) will be used. Applying the quotient rule to equation (16) with where ∂(d_{J,K})²/∂w_{J,K} is the product of 2d_{J,K} and equation (11) for Cosine. Taking D_m = ∂/∂w_{J,K}[(d_cos)²] and inserting equation (17) into equations (10) and (11) produces the Cosine GRLVQI update expressions.
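The derivative pieces named above can be sketched in code. This is an illustration under stated assumptions rather than the paper's exact expressions: the relevance-weighted cosine measure is assumed to be Σψ_i x_i w_i / (√(Σψ_i x_i²) √(Σψ_i w_i²)), the function names are hypothetical, and both vectors are assumed to be nonzero. The gradient with respect to the PV follows from the quotient rule, and the gradient of the squared measure is 2d times that gradient, matching the chain-rule product described in the text.

```python
import math


def weighted_cosine(x, w, psi):
    """Relevance-weighted cosine measure between exemplar x and PV w (assumed form)."""
    num = sum(p * xi * wi for p, xi, wi in zip(psi, x, w))
    a = math.sqrt(sum(p * xi * xi for p, xi in zip(psi, x)))
    b = math.sqrt(sum(p * wi * wi for p, wi in zip(psi, w)))
    return num / (a * b)


def grad_weighted_cosine(x, w, psi):
    """Gradient of weighted_cosine with respect to w, via the quotient rule."""
    num = sum(p * xi * wi for p, xi, wi in zip(psi, x, w))
    a = math.sqrt(sum(p * xi * xi for p, xi in zip(psi, x)))
    b = math.sqrt(sum(p * wi * wi for p, wi in zip(psi, w)))
    return [
        p * xi / (a * b) - num * p * wi / (a * b ** 3)
        for p, xi, wi in zip(psi, x, w)
    ]


def grad_squared_cosine(x, w, psi):
    """Gradient of the squared measure: 2 * d * (gradient of d), by the chain rule."""
    d = weighted_cosine(x, w, psi)
    return [2.0 * d * g for g in grad_weighted_cosine(x, w, psi)]
```

A finite-difference comparison against `weighted_cosine(...) ** 2` is a quick way to check such a gradient before placing it into an update rule, and scaling `w` by any positive constant leaves `weighted_cosine` unchanged, which is the scale invariance noted in the text.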

Application and Example Results
To enable the industrial IoT, CPS devices are finding increasing use in critical infrastructure, smart metering, and home automation [2,52]. CPS devices employ either open or proprietary protocols, with open protocols offering more aftermarket security options, but possibly more threats, while proprietary protocols offer "security through obscurity," but fewer additional aftermarket security options [53].
Of the proprietary wireless protocols, the most commonly used is Z-wave, which is based on the International Telecommunications Union-Telecommunications (ITU-T) G.9959 recommendation [52]. Since Z-wave is known to have security vulnerabilities [54], vetting the claimed identity of Z-wave devices through RF fingerprinting is important.

Z-Wave Devices.
Z-wave wireless communication devices are low-cost WPAN technologies used primarily for residential automation and are similar in operation yet simpler to work with when compared with previously described devices [55-58]. However, Z-wave is generally considered less secure than other WPAN technologies due to (1) an original lack of built-in encryption [56] and (2) a proprietary standard that makes it difficult for third parties to provide enhancements [58]. Integration of Z-wave devices with IoT largely involves vendors. To produce a Z-wave device, and thus gain access to the proprietary Z-wave standard, a vendor must sign a Non-Disclosure Agreement (NDA) with Sigma Designs [59]. Vendors then gain access to hardware and software to develop Z-wave devices [59]. However, without a signed NDA, only general characteristics of the Z-wave protocol are known [59].
General Z-wave signal characteristics are known and presented in Figure 3 and Table 1. To facilitate incorporation of Z-wave with other communication devices, Z-wave follows the ITU-T G.9959 protocol at the physical layer (PHY) and medium access (MAC) layer [61]. However, the routing and application layer specifications are proprietary. Thus, third-party security at the network and routing levels is very difficult.
To identify Z-wave devices, Z-wave communications follow a predefined preamble and Start of Frame (SoF) [60], which is conceptualized in the PHY packet structure seen in [56,58,60]. Since the preamble and SoF should not vary from device to device, of interest is collecting such regions of interest for comparison and fingerprinting of devices at a serial number level.

Signal Collection and Dataset Generation.
Consistent with [54,62], N_D = 3 Aeotec Z-Stick S2 WPAN devices were considered in this research. Experimentally, each device was placed 10 cm from a vertically oriented LP0410 log periodic antenna [54]. The antenna was connected via a Gigabit Ethernet cable to an NI USRP-2921 software defined radio with in-phase and quadrature (I/Q) samples collected as 16-bit integers, sampled at 2 Msps [62]. Amplitude-based leading edge detection with a −6 dB threshold was used for transmission (burst) detection [54]. Using this setup, a total of N_P = 230 preamble signals (the first segment of the signal per Figure 3(b), and the first 8.3 ms of the signal) were collected per device [54]. The collected data had a Signal-to-Noise Ratio (SNR) of SNR_C = 24.0 dB and was like filtered [54]. To provide multiple operating points to consider noisy environments, Additive White Gaussian Noise (AWGN) was added to collected signals to achieve SNR ∈ [0, 24] dB operating points in 2 dB steps [54]. Since the data collected were for 3 devices, this research does not consider identity impersonation attacks by "rogue" devices, and all devices were considered as "authorized." Following the RF-DNA fingerprinting process in Section 2.1, Z-wave fingerprint generation parameters included the N_TD = 3 time-domain responses (a, ϕ, f).

Figure 3: Z-wave characteristics: (a) protocol, and (b) signal. Extended from discussions in [56,58,60].
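The noise-addition step can be sketched as follows. This is a simplified illustration, not the authors' procedure: the function name `add_awgn` is hypothetical, the burst is assumed to be a list of complex I/Q samples, and the collected burst is treated as the signal reference when scaling the noise, so the residual noise already present at the collected SNR is ignored.

```python
import math
import random


def add_awgn(iq, snr_db, rng=None):
    """Add complex white Gaussian noise so that (signal power / added noise power)
    equals the requested SNR in dB.

    Assumption: the input burst itself is used as the signal power reference.
    """
    rng = rng or random.Random()
    signal_power = sum(abs(s) ** 2 for s in iq) / len(iq)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power / 2.0)  # noise power is split evenly between I and Q
    return [s + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for s in iq]


# Operating points from 0 dB to 24 dB in 2 dB steps, as described in the text.
snr_grid_db = list(range(0, 25, 2))
```

Passing a seeded `random.Random` instance as `rng` makes the noise realizations repeatable, which is useful when comparing classifiers on identical noisy data.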
Following the process of Algorithm 1 for each classifier presented in Table 2, we first evaluate classification performance at each SNR point. Figure 4 presents the classification results for each classifier on the sequestered TST set. Classification performance was evaluated in Figure 4 as Average Percent Correct (%C) on the y-axis and SNR (dB) on the x-axis, consistent with [6,29]. Visible in Figure 4 is that both MDA and GRLVQI underperform Cosine GRLVQI and require higher SNR to achieve the same %C. GRLVQI-SD is seen to outperform Cosine GRLVQI only at SNR > 20 dB, where the performance of all algorithms is largely over %C = 90%. Overall, at testing, Cosine GRLVQI is seen to provide over +6.00 dB gain at 90 %C relative to MDA testing performance, whereas baseline GRLVQI offers only +3.32 dB gain at 90 %C relative to MDA testing performance.
Verification performance is considered in Figure 5 as Receiver Operating Characteristic (ROC) curves. In Figure 5, models were evaluated at SNR = 20 dB [54]. Verification performance shows MDA (dashed grey lines) and Cosine GRLVQI (solid black lines) both outperforming the baseline GRLVQI (solid grey lines) and GRLVQI-SD (dashed black lines). Notably, baseline GRLVQI only correctly authorizes 2 Z-wave devices, missing the third device by a considerable margin, e.g., the solid grey line that intersects TVR = 0.80 and FVR = 0.20. When comparing MDA, Cosine GRLVQI, and GRLVQI-SD, the results are less clear. The inset in Figure 5 enlarges the 0-20% FVR and 80-100% TVR range and shows that the performance of these three classifiers for verification is very close, and that while only MDA provides 100% verification accuracy, both Cosine GRLVQI and GRLVQI-SD only barely miss the dichotomous %Aut threshold for this experiment. Since dichotomous results, e.g., %Aut ∈ [0, 33, 66, 100] for N_D = 3 devices, are not always reliable, as seen by inspecting Figure 5, the authors also investigate AUC_M to enable relative performance differences to be evaluated between competing classifiers.

Table 3 presents performance for training (TNG) and testing (TST) data and shows an advantage of Cosine GRLVQI over GRLVQI-SD, GRLVQI, and MDA for classification. Verification performance in Table 3 was evaluated with binary grant/deny network access decisions based on a verification criterion, e.g., TVR ≥ 90% at FVR ≤ 10% with %Aut ∈ [0, 33, 66, 100] for N_D = 3. When evaluating AUC_M, the results show that these three methods (MDA, Cosine GRLVQI, and GRLVQI-SD) perform very similarly in verification performance, with Cosine GRLVQI slightly outperforming all algorithms. As noted in discussing Figure 5, Cosine GRLVQI and GRLVQI-SD barely miss %Aut = 100%.
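The two verification measures can be sketched as below. This is a hedged illustration, not the procedure of [29]: TVR and FVR are taken here as true and false verification rates, each device is assumed to supply one list of match scores for genuine claims and one for false claims, higher scores are assumed to indicate a better match, and all function names are hypothetical.

```python
def roc_points(genuine, impostor):
    """(FVR, TVR) pairs obtained by sweeping an accept threshold from high to low.

    Assumption: a claim is accepted when its score is >= the threshold.
    """
    thresholds = sorted(set(genuine) | set(impostor), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        tvr = sum(g >= t for g in genuine) / len(genuine)
        fvr = sum(i >= t for i in impostor) / len(impostor)
        points.append((fvr, tvr))
    return points  # the lowest threshold accepts everything, so the last point is (1, 1)


def authorized(points, min_tvr=0.90, max_fvr=0.10):
    """True when some operating point meets TVR >= 90% at FVR <= 10%."""
    return any(tvr >= min_tvr and fvr <= max_fvr for fvr, tvr in points)


def auc(points):
    """Area under one ROC curve via the trapezoidal rule."""
    area = 0.0
    for (f0, t0), (f1, t1) in zip(points, points[1:]):
        area += 0.5 * (t0 + t1) * (f1 - f0)
    return area


def summarize(per_device_scores):
    """Return (%Aut, mean AUC) over a list of (genuine, impostor) score pairs."""
    rocs = [roc_points(g, i) for g, i in per_device_scores]
    pct_aut = 100.0 * sum(authorized(r) for r in rocs) / len(rocs)
    auc_m = sum(auc(r) for r in rocs) / len(rocs)
    return pct_aut, auc_m
```

With only three devices, `pct_aut` can take just four values, which is why a curve that narrowly misses the threshold box changes %Aut by a full step while barely changing the mean AUC.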
Considering Table 3 overall, both MDA and Cosine GRLVQI outperform the baseline GRLVQI and GRLVQI-SD, while Cosine GRLVQI provides the best classification performance for this problem, thus illustrating the benefit of changing the cost function in GRLVQI.

Conclusions
Figure 5: Verification performance at SNR = 20 dB for Cosine GRLVQI, GRLVQI, MDA, and GRLVQI with settings optimized per [28]. Inset shows performance in the [0%, 20%] FVR and [80%, 100%] TVR range.

Herein, the authors addressed problems in identifying cyber-physical systems (CPS) using radio frequency (RF) emissions. The authors considered the Generalized Relevance Learning Vector Quantization-Improved (GRLVQI) classifier algorithm, which is known to accurately classify RF fingerprint features but underperforms other methods in the literature when protecting against masquerade attacks. Prior work optimized GRLVQI classifier settings to find improved operating points, but this is computationally costly and not generalizable. Since GRLVQI, and the LVQ family of algorithms, revolves around a distance measure to train its architecture, changing this distance measure can logically change results. However, LVQ algorithms are gradient descent based, with the distance measure being related to the cost function of the gradient descent itself; thus, changing the distance measure requires changing the derivatives. Since this can be complex, it has not been pursued previously. The authors thus developed GRLVQI-D, a straightforward modularization of the GRLVQI algorithm, and developed an update methodology that facilitates changing the distance measure of GRLVQI and GRLVQ, as well as other LVQ algorithms. To illustrate the application of the update methodology, the authors further developed a Cosine GRLVQI algorithm. Example results were then presented for experimentally collected Z-wave RF data. RF fingerprints were developed for these devices to create a biometric library. Then, following a general biometric process, both classifier model development and identity verification were considered. Results show that the proposed Cosine GRLVQI algorithm outperforms both MDA and baseline squared Euclidean GRLVQI in classification and verification accuracy.
Additionally, the results presented for Cosine GRLVQI were better than the iteratively obtained optimal results of GRLVQI-SD, which required 28 iterations to obtain improved algorithmic settings. A natural future extension of this work would be to find optimal Cosine GRLVQI settings, per the approach of [28].

Data Availability
Publication is cleared for public release under case: 88ABW-2019-4252. Data results are publishable, but data and code are not cleared for public release.

Disclosure
This material is declared a work of the US Government and is not subject to copyright protection in the United States.

Conflicts of Interest
The authors declare that they have no conflicts of interest.