The Goertzel Algorithm for the Extraction of Texture Features

The detection of the properties of objects is essential to deal with the manipulation of objects with artificial hands and grippers. In particular, texture detection is a common challenge in robotics. In the quest for smooth and natural manipulation, response times in the order of milliseconds are needed. In a context where the number of sensors and actuators integrated in the system increases, it becomes necessary to pre-process in local electronics. This pre-processing reduces the number of interconnecting wires that hinder movement, and also the computational load and data traffic. This paper proposes the use of the Goertzel algorithm as an alternative to the common use of FFT to obtain the features used to identify a texture. The lower computational cost of the Goertzel algorithm translates into lower resource and power consumption. This lower cost is observed for a limited number of features to be extracted from the raw signal, which can be assumed for an application that seeks to obtain main features of the manipulated object, and not an exhaustive characterisation of the object. This paper shows that a set of 12 textures can be classified with 84.8% accuracy by extracting features that give the signal power for 16 selected frequencies in the spectrum.


I. INTRODUCTION
Tactile sensors serve as a valuable tool for acquiring information about object properties [1]. Their significance is particularly evident in applications involving artificial hands for robotics and prosthetics [2], where we encounter the following requirements:
• Distributed array sensors (tactile sensors) in fingers and palm that acquire a large amount of data.
• A high number of actuators to achieve a high number of degrees of freedom.
• Limited space for communication buses and power supply cables without impeding manipulation.
• Real-time operation for precise manipulation.
• Power consumption limitations, especially in the case of prosthetics, but also for autonomous robots with batteries.
In this context, the proposed approach could involve implementing local pre-processing of data from array sensors to reduce data traffic in communication buses and alleviate the load on the processor responsible for system control. However, the key question is how intelligent the local processor needs to be. Generally speaking, the more complex the processor, the more capable it becomes, but it also consumes more power. To mitigate power consumption and address the problem using a simpler processor, it is necessary to seek more efficient algorithms in terms of power and hardware resources while achieving reasonably good performance in the intended task. Furthermore, even if a powerful local processor is available, it can handle a broader range of complex operations (such as slippage warning, compliance detection, texture recognition, etc.) if the associated algorithms are optimized. For artificial hands, a distributed architecture with a more potent processor at the palm level and simpler processors at the finger level could be an effective approach. The finger-level processors can extract features from the tactile sensor and transmit them via serial buses to the palm processor, which implements the classifier.

Manuscript received 27 February 2024; accepted 4 June 2024. Date of publication 19 June 2024; date of current version 25 June 2024. This letter was recommended for publication by Associate Editor P. Maiolino and Editor A. Banerjee upon evaluation of the reviewers' comments. This work was supported in part by Spanish Government with a FPU Grant given by the Ministerio de Ciencia, Innovacion y Universidades, and in part by the European ERDF program funds under Grant PID2021-125091OB-I00. (Corresponding author: Fernando Vidal-Verdú.) Raúl Lora-Rivera is with the Department of Electronics, University of Málaga, 29071 Málaga, Spain (e-mail: rlrqrur@uma.es).
Digital Object Identifier 10.1109/LRA.2024.3416790
Among the various object properties, texture stands out as a key characteristic [1]. An initial strategy for distinguishing textures involves processing tactile images captured by high-resolution sensors [3]. This approach leverages image processing feature extraction capabilities [4] and their efficient implementation on embedded systems, notably field-programmable gate arrays (FPGAs). However, challenges persist in terms of computational resources, power consumption, and data traffic, and the absence of dynamic interaction limits the capture of friction-related information. The majority of texture discrimination studies are based on active touch, where the sensor and the explored surface move relative to each other. Active touch enables the detection of micro-vibrations generated by normal and shear forces during surface exploration [5], mirroring how the human touch senses surface characteristics [6]. In artificial systems, data obtained through active touch undergo preprocessing to yield descriptors or feature vectors. While some approaches use raw tactile datasets as classifier input [7], this can complicate learning, increase resource requirements, and risk overfitting. Thus, many opt for a reduced set of features, such as classical statistical features [8] or the friction coefficient combined with signal power, complexity, and frequency content [9]. Some propose descriptors resilient to changes in exploratory parameters [10], incorporating information about signal power, mean frequency, bandwidth, and linear and nonlinear correlations across sensors. Nevertheless, most works based on active touch resort to the fast Fourier transform (FFT) for feature extraction [1], [2], [11]. For this, it is necessary to know and keep constant the speed of displacement between the sensor and the surface. Principal component analysis (PCA) is often employed to reduce the dimensionality of the feature vector, especially when information from multiple taxels (force sensing units) is utilized [12].
While effective, these feature extraction methods are complex to implement in embedded systems like smart sensors in artificial hands, demanding high power consumption, computational resources, and data traffic. A different approach, as seen in [13], trades some performance for benefits in low-cost embedded implementations. This method divides the frequency spectrum into bands, using the vector of power in each band as a feature, simplifying computations and reducing complexity compared to the FFT. This work explores a related approach based on the Goertzel algorithm [14]. This algorithm provides a computationally efficient way to calculate the strength or magnitude of a particular frequency component within a signal. After inspection of the signal spectrum from the exploration of different textures, a set of frequencies is selected, and their magnitudes are used as the input of a classifier that gives the associated texture as output. Preliminary results of this approach were presented in [15].
The remaining sections of this article are structured as follows: Section II provides a brief explanation of the Goertzel algorithm. Section III outlines the materials and methods used in the study. Section IV focuses on the implementation of the approach on a System on Chip (SoC). Section V presents the results and associated discussions, while Section VI summarizes the key conclusions of the work.

II. GOERTZEL ALGORITHM
The Goertzel algorithm computes K selected components of the Discrete Fourier Transform (DFT) of a time-sampled signal {x[n]} of length N whose sampling frequency is F_s. Because only the frequencies of interest are computed, it can be more efficient than the FFT in certain situations. From the expression for the DFT of the signal (1), one can derive the discrete transfer function given by (2), see [14]. The expression in (2) is equivalent to a second-order system, whose difference equation is (3). The state variables in (4) can be used to describe this structure, and the output then results as (5). This state-space description corresponds to that of an IIR (Infinite Impulse Response) filtering process. This is advantageous because the focus is solely on calculating the sample y_k[N], which represents the spectral coefficient X[k] for the frequency index k.
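For reference, the derivation summarized above can be written out in its standard textbook form. This is a sketch that assumes the usual presentation of the Goertzel algorithm; the auxiliary state s[n] and the shorthand ω_k are introduced here for illustration, and the grouping of lines need not match the numbering (1) to (5) exactly.

```latex
\begin{align}
X[k] &= \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N}, \qquad \omega_k = \frac{2\pi k}{N} \\
H_k(z) &= \frac{1}{1 - e^{j\omega_k} z^{-1}}
        = \frac{1 - e^{-j\omega_k} z^{-1}}{1 - 2\cos(\omega_k)\, z^{-1} + z^{-2}} \\
s[n] &= x[n] + 2\cos(\omega_k)\, s[n-1] - s[n-2], \qquad s[-1] = s[-2] = 0 \\
y_k[n] &= s[n] - e^{-j\omega_k}\, s[n-1] \\
X[k] &= y_k[N] = e^{j\omega_k}\, s[N-1] - s[N-2] \qquad (\text{taking } x[N] = 0)
\end{align}
```

Only the real-valued recursion for s[n] runs once per input sample; the complex arithmetic is needed a single time, after the last sample. If only the signal power at index k is required, it follows from the last line that |X[k]|^2 = s[N-1]^2 + s[N-2]^2 - 2 cos(ω_k) s[N-1] s[N-2].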
The traditional FFT algorithm, applied to a signal {x[n]} of length N, has a computational demand proportional to N log2(N). The actual number of operations is about 6N log2(N) (taking a complex multiplication as a combination of four multiplications and two additions). For the Goertzel algorithm, N real multiplications and 2N real additions are necessary to compute a single frequency. Therefore, approximately 3N operations are needed to calculate a frequency. If extrapolated to the computation of K frequencies, approximately 3NK operations would be needed [14].
Considering the above, (6) gives the number of frequencies K up to which the Goertzel algorithm is more advantageous than the FFT in terms of the number of operations.
On the other hand, if real signals are considered, the FFT requires a memory buffer of 4N memory locations in total. For the Goertzel algorithm, 7K memory locations are needed [14]. Therefore, the Goertzel algorithm is advantageous in terms of memory resources if (7) is satisfied.
For a given implementation, the Goertzel algorithm will be more advantageous than the FFT when it satisfies the conditions given by (6) and (7).
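The two conditions can be checked numerically from the counts quoted above. Comparing 3NK with 6N log2(N) gives K < 2 log2(N), and comparing 7K with 4N gives K < 4N/7. The following sketch evaluates both bounds; the function names are illustrative, and the inequalities are derived here from the stated counts rather than copied from (6) and (7).

```python
import math

def goertzel_ops(n_samples: int, n_freqs: int) -> int:
    """Approximate real operations for K Goertzel bins: 3*N*K."""
    return 3 * n_samples * n_freqs

def fft_ops(n_samples: int) -> float:
    """Approximate real operations for a length-N FFT: 6*N*log2(N)."""
    return 6 * n_samples * math.log2(n_samples)

def max_k_operations(n_samples: int) -> float:
    """Exclusive upper bound on K from 3NK < 6N log2(N)."""
    return 2 * math.log2(n_samples)

def max_k_memory(n_samples: int) -> float:
    """Exclusive upper bound on K from 7K < 4N."""
    return 4 * n_samples / 7

N, K = 1024, 16
print(goertzel_ops(N, K), fft_ops(N))  # 49152 vs 61440.0
print(max_k_operations(N))             # 20.0
print(max_k_memory(N))                 # about 585.1
```

For N = 1024, the operation count is the binding constraint: the Goertzel algorithm needs fewer operations only for fewer than 20 frequencies, which the K = 16 configuration reported later satisfies.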

III. MATERIALS AND METHODS
A. Experimental Setup and Data Gathering
Fig. 1 depicts the experimental arrangement devised for gathering the data and findings outlined in this paper. An artificial finger equipped with a tactile sensor was affixed onto a Cartesian robot to facilitate the collection of data during texture explorations. The smart finger electronics are based on a Xilinx Spartan 6 FPGA (6SLX16-CSG225). This system establishes a direct connection between the FPGA and the resistive raw sensor to acquire data from a 55-taxel array. The raw tactile sensor consists of a layer of piezoresistive material on an array of electrodes. The circuit is implemented on a semi-rigid PCB with dimensions of 90 mm by 16 mm. This includes a rigid part measuring 35.5 mm by 16 mm and a flexible part measuring 54.5 mm by 16 mm. Each taxel within the array has a size of 3.7 mm by 3 mm, while the total array measures 40.7 mm by 15 mm. Details about the smart finger can be found in [16]. The inset on the top right corner of Fig. 1 shows details of the dimensions and shape of the domes on the outer layer of the sensor, which is made of an elastic material [17]. This smart finger interfaces via a serial peripheral interface (SPI) bus with an AVNET ZedBoard development board. The board is built around the Zynq-7000 SoC XC7Z020-CLG484-1 device, featuring both an FPGA and a dual-core Cortex A9 processor.
Table I shows information about the set of textures that have been used to test the performance of the Goertzel algorithm in the task of detecting them. Textures #TEX-7, #TEX-8 and #TEX-9 were generated using pulse-width modulation (PWM), with a sinusoidal modulator signal of wavelength λ_m and amplitude A_m and a triangular carrier signal of wavelength λ_c and amplitude A_c [18]. The parameters m_a and m_f in Table I are defined as m_a = A_m/A_c and m_f = λ_c/λ_m. The setup in Fig. 1 is used to perform active explorations where the finger moves at a speed of v = 30 mm/s along the textures while keeping contact with them, as illustrated in Fig. 2. Tactile data were acquired at a rate of 485 samples per second to obtain a vector with 2048 samples per exploration. The procedure is repeated 200 times per texture. The right column in Table I shows the spectra resulting from this data gathering for each texture and after antialiasing filtering.
Note that, as mentioned in Section I for other works based on the FFT, the exploration speed is assumed to be constant in the experiments of this paper. To obtain the representation of a given texture in the time domain, the simple relationship time = displacement/speed could be proposed. For a constant speed, this relationship is linear, and the Fourier transform of displacement can be easily derived from the Fourier transform in the time domain. Both the FFT and the Goertzel algorithm are based on the Fourier transform, so the validity of this relationship applies to both algorithms.
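As a worked illustration of this relationship, a surface pattern with spatial period λ explored at constant speed v produces a temporal frequency f = v/λ, which maps to the DFT or Goertzel index closest to f·N/F_s. The sketch below uses the speed and sampling rate stated above; the 2 mm grating is a hypothetical example, the helper names are illustrative, and N = 1024 is assumed as the analysis window.

```python
def temporal_frequency_hz(wavelength_mm: float, speed_mm_s: float) -> float:
    """A spatial period scanned at constant speed gives f = v / lambda."""
    return speed_mm_s / wavelength_mm

def nearest_bin(freq_hz: float, n_samples: int, fs_hz: float) -> int:
    """Index k of the DFT/Goertzel bin closest to freq_hz (bin spacing is fs/N)."""
    return round(freq_hz * n_samples / fs_hz)

V = 30.0    # exploration speed from the text, mm/s
FS = 485.0  # sampling rate from the text, samples/s
N = 1024    # assumed analysis window length

f = temporal_frequency_hz(2.0, V)  # hypothetical 2 mm grating -> 15.0 Hz
k = nearest_bin(f, N, FS)          # 15 * 1024 / 485 = 31.67 -> bin 32
print(f, k, FS / N)                # bin spacing is about 0.474 Hz
```

If the speed were to change, the same texture would shift to a different index, which is why a known, constant speed is assumed.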

B. Feature Extraction and Classifier
Taking into account the frequency nature of the textures considered in Table I, a possible approach is the extraction of those frequencies of the spectrum where there is a higher signal power. In this way, we would be considering the areas of the spectrum that contain most of the information. It should be noted that the spectra in Table I are the result of the interaction between the textures and the sensor; the material and the shape of the external layer of the sensor are particularly influential. It can be seen from Table I that most of the information for the 12 textures is at low frequencies. Therefore, a possible strategy is to concentrate the search for the algorithm indices in that area of the spectrum.
TABLE I MAIN PROPERTIES OF THE TWELVE TEXTURES OF THIS WORK

The frequencies considered need not be uniformly distributed in the spectrum. In this paper, to decide which Goertzel indices are calculated, the nonlinear expression given in (8) is proposed. For every index y within the range (0, N) in a uniform distribution across the spectrum, (8) provides the associated index in the non-uniform distribution. The parameter p in (8) is a positive integer that determines the extent to which the new distribution deviates from the uniform one. The output of the Goertzel algorithm for each sample of raw data from an explored texture is used to train an unsupervised k-means classifier. The initial selection of centroids (vectors identifying specific classes) can be either entirely random or based on the k-means++ method [19]. The training procedure involved random initialization of centroids, followed by their iterative update during the learning phase with the presentation of the training set. This iterative process continued with shuffled data from the training set until centroids reached stability. Subsequently, the accuracy of the trained classifier was evaluated using the test set. This entire process was repeated 100 times, and the centroids yielding the highest accuracy were chosen for the final trained classifier [20].
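The training loop described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses batch (Lloyd) updates instead of sample-by-sample presentation of shuffled data, and it keeps the restart with the lowest within-cluster squared distance, whereas the paper selects the restart with the highest test accuracy. The function names and the toy data are assumptions.

```python
import numpy as np

def assign(features, centroids):
    """Label each feature vector with the index of its nearest centroid."""
    dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(features, n_clusters, n_restarts=100, max_iter=100, seed=0):
    """Batch k-means with random restarts; returns (centroids, labels, inertia)."""
    rng = np.random.default_rng(seed)
    features = np.asarray(features, dtype=float)
    best = None
    for _ in range(n_restarts):
        # Random initialization: distinct training vectors serve as centroids.
        idx = rng.choice(len(features), size=n_clusters, replace=False)
        centroids = features[idx].copy()
        for _ in range(max_iter):
            labels = assign(features, centroids)
            updated = centroids.copy()
            for c in range(n_clusters):
                members = features[labels == c]
                if len(members):  # an empty cluster keeps its old centroid
                    updated[c] = members.mean(axis=0)
            stable = np.allclose(updated, centroids)
            centroids = updated
            if stable:
                break
        labels = assign(features, centroids)
        inertia = float(((features - centroids[labels]) ** 2).sum())
        if best is None or inertia < best[2]:
            best = (centroids, labels, inertia)
    return best

# Toy data: two well-separated groups of 2-D feature vectors.
rng = np.random.default_rng(1)
demo = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(10.0, 0.1, (20, 2))])
centroids, labels, inertia = kmeans(demo, n_clusters=2, n_restarts=5)
```

In the setting of this paper, each row of `features` would be the K-component Goertzel output for one exploration, and `n_clusters` would be 12.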

IV. IMPLEMENTATION OF ALGORITHMS ON SOC
The implementation of the Goertzel algorithm considered is for online processing on embedded electronics. The pseudocode is provided in Algorithm 1 and is based on the state-space description given by (4) and (5). This pseudocode is implemented with the Vivado HLS tool. The optimization directive 'HLS ALLOCATION' shown in Algorithm 1 is used to keep the number of DSP (Digital Signal Processing) blocks in the logic synthesis as low as possible. The vectors realCoef and imagCoef, of size K, contain the real and imaginary coefficients needed to compute the real and imaginary parts of the spectral coefficients [14]. For each index k, the real coefficients are given by (9), and the imaginary coefficients are given by (10). In the implementation of the algorithm, these coefficients can be computed offline, so it is not necessary to compute them at each iteration.
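A software sketch of the computation may help make the data flow concrete. It is written in Python with NumPy rather than HLS C, uses the standard form of the Goertzel recursion, and assumes that the precomputed coefficients are cos(2πk/N) and sin(2πk/N); the names `real_coef` and `imag_coef` only mirror realCoef and imagCoef and are not taken from Algorithm 1.

```python
import numpy as np

def goertzel_coefficients(indices, n_samples):
    """Precompute the per-bin coefficients offline (assumed cos/sin form)."""
    w = 2.0 * np.pi * np.asarray(indices, dtype=float) / n_samples
    return np.cos(w), np.sin(w)

def goertzel(x, indices):
    """Return the DFT coefficients X[k] of x for the selected bin indices."""
    x = np.asarray(x, dtype=float)
    real_coef, imag_coef = goertzel_coefficients(indices, len(x))
    two_cos = 2.0 * real_coef
    s1 = np.zeros(len(real_coef))  # s[n-1] for every selected bin
    s2 = np.zeros(len(real_coef))  # s[n-2] for every selected bin
    for sample in x:               # one real multiply and two adds per bin
        s0 = sample + two_cos * s1 - s2
        s2, s1 = s1, s0
    # Final step, done once: X[k] = e^{jw} s[N-1] - s[N-2]
    return (real_coef * s1 - s2) + 1j * (imag_coef * s1)

# Compare the signal power at a few bins against NumPy's FFT.
rng = np.random.default_rng(0)
signal = rng.standard_normal(1024)
bins = [1, 2, 4, 8, 16, 32]  # hypothetical selection of indices
power = np.abs(goertzel(signal, bins)) ** 2
reference = np.abs(np.fft.fft(signal)[bins]) ** 2
```

The loop stores only two state values per selected frequency and never buffers the input, which is what makes the method attractive for sample-by-sample processing in local electronics.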
Once the HLS code is synthesized, an IP Core is generated that can be used in the design of the Vivado architecture. The architecture and data path are shown in Fig. 3. The classification algorithm is implemented on the ARM core, though the training is performed offline. The resulting implementation is checked with the Vivado SDK development environment.

V. RESULTS
For the calculation of the Goertzel algorithm to be more advantageous than the FFT, (6) and (7) must be satisfied. To facilitate the calculations in an FPGA implementation, powers of 2 have been considered for the index k. Moreover, different values of p in (8) have been tried to obtain various distributions of the selected frequencies along the spectrum. The features obtained in this way are used to train the classifier as explained in Section III-B. Fig. 4 shows the results for the best cases in the ROC (Receiver Operating Characteristic) space for the classification of the twelve textures in Table I, where TPR (True Positive Rate) is sensitivity = true positives/total positives, and FPR (False Positive Rate) is 1 − specificity, where specificity = true negatives/total negatives. Figs. 5 and 6 show the results in the ROC space for these cases and the classification of the textures in Table I.
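The TPR and FPR definitions above extend to a multi-class problem by treating each texture in turn as the positive class. The sketch below computes one ROC point per class from a confusion matrix; the 3-class matrix is made-up illustrative data, not a result from this work, and the function name is an assumption.

```python
import numpy as np

def roc_points(confusion):
    """Per-class (TPR, FPR); rows are true classes, columns are predictions."""
    confusion = np.asarray(confusion, dtype=float)
    tp = np.diag(confusion)
    fn = confusion.sum(axis=1) - tp  # positives that were missed
    fp = confusion.sum(axis=0) - tp  # negatives labelled as this class
    tn = confusion.sum() - tp - fn - fp
    tpr = tp / (tp + fn)             # sensitivity
    fpr = fp / (fp + tn)             # 1 - specificity
    return tpr, fpr

# Illustrative 3-class confusion matrix (10 samples per true class).
demo = [[8, 2, 0],
        [1, 9, 0],
        [0, 0, 10]]
tpr, fpr = roc_points(demo)
print(tpr)  # [0.8 0.9 1. ]
print(fpr)  # [0.05 0.1  0.  ]
```

A class that is classified perfectly sits at the top-left corner of the ROC space (TPR = 1, FPR = 0).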
Regarding resource and power consumption, as well as latency, Table II presents the results for the scenarios depicted in Fig. 4. When comparing cases with similar input vector lengths and classification performance in Fig. 4, specifically the FFT with 1024 coefficients against the Goertzel algorithm configured with N = 1024, K = 16 and p = 10, one can observe that the Goertzel algorithm requires fewer logic, memory, and power resources compared to the FFT implementation. However, there is an increase in feature extraction time or latency for the Goertzel algorithm relative to the FFT. Nonetheless, the latency remains under 115 µs for feature extraction in the most demanding scenario, which is within acceptable limits for the task of active touch texture detection. It is also worth noting that the output vector length resulting from the feature extraction process (the input vector for the classifier) is considerably shorter with the Goertzel algorithm, at only 16 components, in contrast to the FFT's 1024 components. This significant reduction in output vector length dramatically reduces traffic on the communication buses and eases the computational load on the classifier.

TABLE II PERFORMANCE DATA OF THE GOERTZEL AND FFT IMPLEMENTATIONS

Algorithm 1: Pseudocode of the Goertzel Algorithm.

VI. CONCLUSION
In this work, the use of the Goertzel algorithm for texture detection is explored using the signal provided by a tactile sensor through active touch.The algorithm yields K coefficients corresponding to K selected frequencies from the signal spectrum.
Different distributions in the spectrum are tested, and it is observed that a non-uniform distribution, which takes into account the signal spectrum ranges where there is more information, is more successful in classifying textures.Specifically, an accuracy of 84.8% is achieved for K = 16 coefficients and N = 1024 samples, in the best case considered for the Goertzel algorithm.For the same set of textures, FFT achieves an accuracy of 91.4% with 1024 coefficients.The simplicity of the Goertzel algorithm translates into lower power consumption and resource usage.However, in addition to obtaining slightly lower accuracy than FFT, the latency of the Goertzel algorithm in providing features to the classifier is higher, although it is an acceptable value in the context of the application.

Fig. 2. Illustration of the active exploration of textures.

Fig. 5. ROC data for the textures in Table I and the 1024-component vector from the FFT as feature for the classification.

Fig. 6. ROC data for the textures in Table I and the 16-component vector from the Goertzel algorithm (N = 1024, K = 16 and p = 10) as feature for the classification.