On Beamforming with the Single-Sideband Transform



Introduction
Filtering is an essential tool in the modern world, being necessary in fields such as communications [1], biomedical applications [2][3][4], and system control [5,6], among other areas [7][8][9]. At its core, filtering mitigates the effects of undesired sources and can use temporal and spatial information regarding the signals, sources, sensors, and environment to enhance the signals. Given time samples, these filtering processes can be implemented in the time, frequency, and time-frequency domains [10], each offering distinct advantages. In particular, time-frequency methods exploit frequency-related information while dynamically adapting to signal and environmental changes in time, providing a tradeoff between strictly time or frequency information. Transforms are the primary tool for obtaining time-frequency domain data, with the Short-Time Fourier Transform (STFT) [11,12] being the most widely used in the literature. However, alternative transforms can also be employed [13][14][15], each offering a different perspective on the signal and leading to different performance depending on the application requirements.
Among these alternatives, an approach based on single-sideband modulation has occasionally been proposed [16] for various applications in signal enhancement [17,18], from acoustic echo cancellation [19] to speech de-reverberation [20] and machine learning signal enhancement [21], showing varying levels of success. The Single-Sideband Transform (SSBT) holds promise in the field due to its real-valued representation. This characteristic simplifies mathematical developments by avoiding complex-valued coefficients or matrices, and it also leads to a more straightforward hardware implementation, resulting in more cost-efficient devices. Previous research shows that this transform works best with short time-frequency frames [20,21]. Despite this, research on applying this transform to beamforming is limited. In particular, its potential for spatio-temporal multisensor beamforming in reverberant scenarios [22] under a convolutive transfer function (CTF) model [23] remains unexplored. Understanding the transform's properties is crucial to ensure accurate and meaningful outputs, yet the SSBT lacks a comprehensive examination in the literature, resulting in a limited understanding of its features and limitations compared to other methodologies. This knowledge gap complicates the appropriate application of the transform. This paper explores the SSBT properties, how they interact with traditional beamforming assumptions, and how to properly implement a beamformer, specifically the Minimum-Power Distortionless Response (MPDR) beamformer, under this transform.
We begin by examining a continuous-frequency version of the SSBT, comparing its properties to those of the Fourier Transform (FT) and the STFT. We study how these properties impact basic beamforming concepts such as the convolution theorem and relative frequency response estimation for the SSBT. While the SSBT is more error-prone and restrictive than the STFT regarding beamforming design, we also demonstrate a bijective interchangeability between them. This allows for their joint usage, potentially enhancing beamformer design within the SSBT domain by converting into the STFT and back without depending on inverse transforms, which are computationally intensive. We also test its direct implementation for beamforming in a real-life-like reverberant scenario, comparing it to the STFT in this regard. Our theoretical findings matched the experimental results: the beamformers based on the SSBT were slightly less effective than the ones based on the STFT in ideal conditions, and significantly less effective in non-ideal scenarios. This emphasizes the limitations of the SSBT compared to the STFT in practical beamforming.
The remainder of this paper is organized as follows. In Section 2, we introduce the proposed frequency and time-frequency transforms, explain their relationship, and elaborate on their relevant properties. These properties are fully developed in Appendix A where proofs are necessary. Section 3 presents the considered signal model and how to incorporate the desired constraints, taking into account the features and peculiarities of the time-frequency transforms at hand. In Section 4, we present and discuss the results, comparing the studied filtering approaches for various metrics and situations, each exposing different information regarding their performance. Finally, Section 5 concludes this paper.

Frequency and Time-Frequency Transforms
Hereafter, we assume that all time-domain signals are real-valued, which allows for shortcuts to some transforms and enables the use of others.
For continuous time and frequency domains, the Fourier Transform (FT) of a time signal () is defined as where (•) * is the complex-conjugate operation. We define the Real Fourier Transform (RFT) similarly to the FT, constructed such that its frequency spectrum is real-valued without loss of information. The RFT is given by and the Inverse Real Fourier Transform (IRFT) as below (see Property A2 in Appendix A): The RFT can also be defined via the FT through a simple substitution of Equation (1) in Equation (2). Using the fact that () is a real signal, it is possible to achieve the following (Property A1, Appendix A): This means that the RFT can be defined in terms of the FT and vice versa, forming a bijective relationship between the two transforms. This is only true if the original time-domain signal is real-valued, which is a requirement for the RFT to be invertible in the first place. A similar result was obtained previously (Equations (4) and (8) in [21]), where it was shown that the SSB representation depends on the complex conjugate of the single-sideband modulated signal. Here, this complex conjugate is represented by the negative frequency component, these being the same under the assumption of () being real-valued.

Convolution
Given an LTI system with an impulse response ℎ(), the convolution theorem for the Fourier transform states that where F ⇌ indicates a Fourier transform pair. This theorem is not strictly valid for the RFT (see Property A3 in Appendix A). However, it is possible to prove that there is an equivalent of the convolution theorem for the RFT (see Property A4 in Appendix A), with where R ⇌ indicates an RFT pair. For a given frequency  , the convolution's output in the RFT domain depends on both that frequency and its conjugate frequency  . This makes intuitive sense: a real-valued spectrum cannot represent the phase correctly using only the given frequency, so its conjugate frequency is also necessary.
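The dependence on the conjugate frequency can be checked numerically in a discrete, circular setting. The sketch below is illustrative only: it assumes the real-valued spectrum is obtained as sqrt(2)·Re{X_F·exp(j3π/4)}, a phase convention common in the SSB literature but not reproduced from this paper's equations, and all function names are hypothetical. Under that assumption, the two-term weights reduce to the real and imaginary parts of the system's FT, both recoverable from the real spectrum itself.

```python
import cmath
import math


def dft(x):
    """Naive DFT of a real sequence (O(N^2), enough for a small demo)."""
    n_pts = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]


def real_transform(x):
    """ASSUMED real-valued spectrum: sqrt(2) * Re{X_F[k] * exp(j*3*pi/4)}."""
    rot = cmath.exp(1j * 3 * math.pi / 4)
    return [math.sqrt(2) * (v * rot).real for v in dft(x)]


def circular_convolve(h, x):
    """Circular convolution of two equal-length real sequences."""
    n_pts = len(x)
    return [sum(h[m] * x[(n - m) % n_pts] for m in range(n_pts)) for n in range(n_pts)]


x = [1.0, -2.0, 0.5, 3.0, -1.0, 0.0, 2.0, 1.5]
h = [0.9, 0.4, -0.2, 0.1, 0.0, 0.0, 0.0, 0.0]
n_pts = len(x)

x_r = real_transform(x)
h_r = real_transform(h)
y_r = real_transform(circular_convolve(h, x))

# Naive "convolution theorem": product of the two real spectra.
naive = [h_r[k] * x_r[k] for k in range(n_pts)]

# Two-term equivalent: same-frequency and conjugate-frequency weights,
# both computed from the real spectrum of h alone (under the assumed convention).
two_term = []
for k in range(n_pts):
    k_conj = (-k) % n_pts
    h_same = -(h_r[k] + h_r[k_conj]) / 2   # equals Re{H_F[k]}
    h_cross = (h_r[k_conj] - h_r[k]) / 2   # equals Im{H_F[k]}
    two_term.append(h_same * x_r[k] + h_cross * x_r[k_conj])

err_naive = max(abs(a - b) for a, b in zip(y_r, naive))
err_two_term = max(abs(a - b) for a, b in zip(y_r, two_term))
```

In this sketch `err_two_term` is at floating-point level while `err_naive` is not, which mirrors Properties A3 and A4: the plain product fails, and the output at each bin needs the input at both that bin and its conjugate.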

Relative Frequency Responses
Given two systems with an input (), each with an impulse response ℎ 1 () and ℎ 2 (), we can calculate their relative frequency responses (RFRs) with respect to the output of one of the systems (assumed to be the first, without loss of generality), these being denoted  1 (  ) and  2 (  ). Let  1 (  ) be the first system's output, given by and similarly for  2 (  ). Clearly, which satisfies  2 (  )  1 (  ) =  2 (  )  (  ). These RFRs can also be calculated as where {•} is the expectation operator. Equations (9) and (10) are equivalent, at least in an ideal scenario.
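The equivalence between the direct ratio and the expectation-based estimate can be illustrated for the complex (FT) case. This is a minimal sketch assuming the usual cross-correlation estimator, E{Y2·Y1*}/E{|Y1|²}, with sample averages standing in for expectations; the frequency responses, noise level, and function name are made up for the example.

```python
import random

random.seed(0)

# Hypothetical single-frequency responses of the two systems.
h1 = complex(0.8, -0.3)
h2 = complex(-0.2, 0.5)


def estimate_rfr(y_ref, y_other):
    """Sample-average version of E{Y_other * conj(Y_ref)} / E{|Y_ref|^2}."""
    num = sum(yo * yr.conjugate() for yo, yr in zip(y_other, y_ref))
    den = sum(abs(yr) ** 2 for yr in y_ref)
    return num / den


# Random input spectrum samples (e.g., one frequency observed over many frames).
x = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(2000)]
y1 = [h1 * v for v in x]
y2 = [h2 * v for v in x]

rfr_direct = h2 / h1                 # ratio of the two responses
rfr_ideal = estimate_rfr(y1, y2)     # noise-free estimate

# Same estimate with uncorrelated noise added to the second output only.
y2_noisy = [v + complex(random.gauss(0, 0.1), random.gauss(0, 0.1)) for v in y2]
rfr_noisy = estimate_rfr(y1, y2_noisy)
```

Without noise the estimate coincides with the direct ratio; with noise that is uncorrelated with the reference, the estimate only approaches it as more samples are averaged.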
Generalizing to a situation with  sensors and following the same steps, for each -th sensor we have Notably, Equation (19a) reduces to  ′ 1 (  ) = 1 and  ′′ 1 (  ) = 0, using the fact that  1 (  ) and  1 (−  ) are uncorrelated (see Property A6).
Observing strictly the mathematical structures of Equation (8) and Equation (11), the FT can be treated as a particular case of the RFT formulation, where  ′  (  ) =  ;F (  ), and  ′′  (  ) = 0. Knowing this, we use the SSBT formulation of Equation (26) for both the SSBT and the STFT, as it is a more general model. The necessary considerations are taken when particularizing the equations for the STFT.

Discrete Time-Frequency Transforms
The Short-Time Fourier Transform (STFT) [11,12] where  [] is an analysis window of length , and  is the hop size between successive windows of the transform, usually  = ⌊  /2⌋. is the frame index, and  is the frequency bin index.The STFT is generally seen as a discretization of the FT, being applied over sequential time "snippets".
The Single-Sideband Transform (SSBT) [16,24] is similarly defined, being the RFT's windowed discrete-time equivalent. The SSBT of  [] is One advantage of using the STFT is the need for only ⌊ (+1) /2⌋ + 1 frequency bins, given its complex-conjugate symmetry for real-valued time-domain signals. Meanwhile, the SSBT requires all  bins to correctly capture all the information of [], but its coefficients are real.
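The bin-count tradeoff can be made concrete with a small sketch. It assumes a particular SSBT convention, sqrt(2)·Re{·} of the windowed DFT with a 3π/4 phase offset, which is common in the SSB literature but is an assumption here rather than this paper's exact definition; the window, hop, and test signal are likewise illustrative.

```python
import cmath
import math


def hamming(length):
    """Symmetric Hamming window."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1)) for n in range(length)]


def ssbt_frame(frame, window):
    """Real-valued spectrum of one windowed frame (ASSUMED phase convention)."""
    size = len(frame)
    rot = cmath.exp(1j * 3 * math.pi / 4)  # assumed phase offset
    bins = []
    for k in range(size):
        acc = sum(
            window[n] * frame[n] * cmath.exp(-2j * math.pi * k * n / size)
            for n in range(size)
        )
        bins.append(math.sqrt(2) * (acc * rot).real)
    return bins


def ssbt(signal, size, hop):
    """Split the signal into overlapping frames and transform each one."""
    window = hamming(size)
    starts = range(0, len(signal) - size + 1, hop)
    return [ssbt_frame(signal[s:s + size], window) for s in starts]


K = 32        # samples per window
O = K // 2    # hop size for 50% overlap
signal = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(4 * K)]
frames = ssbt(signal, K, O)

# Energy check on the first frame: the K real bins carry the full frame energy.
window = hamming(K)
frame_energy = sum((window[n] * signal[n]) ** 2 for n in range(K))
bins_energy = sum(v * v for v in frames[0]) / K
```

Each frame yields K real numbers, whereas the STFT of the same frame would need only about half as many bins, each complex. The energy check reflects that, under the assumed convention, no information is lost when all K real bins are kept.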
Assuming that all  bins of the STFT are available (even though they are not all necessary), similarly to Equations (4), (5a) and (5b) we now have With an abuse of notation, we let  S [, ] ≡  S [, 0], and equally for  F [, ].
As is the case for the RFT, the convolution theorem does not hold for the SSBT in the same way as for the STFT. However, similarly to what is shown in Equation (7) (and in Property A4), the convolution in the SSBT domain through the multiplicative transfer function (MTF) model [25] can be given by or, with the CTF model, in which this convolution is performed over , with Note that this is an approximation, as cross-band interference [26,27] is necessary to model the convolution perfectly; however, this effect is not considered here for either transform. The presence of the conjugate frequency is distinct from cross-band interference: the latter arises from aliasing and the windowing process, while the former comes from the continuous-time convolution theorem for the SSBT (see Section 2.1 and Property A3).
Similarly to the relationship presented in Equation (5), we have Equations (25a) and (25b) give us a bijective relationship between time-frequency signals in the SSBT and STFT domains. That is, for the same transform parameters (window type and size, overlap, etc.), it is possible to convert from one to the other without applying their inverses and returning to the time domain, which is resource-consuming. This interchangeability means they can be used together, converting from one to the other in situations where each is more advantageous.
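A round trip between the two domains can be sketched for a single frame. The conversion below assumes the same 3π/4 phase convention for the real-valued bins as in common SSB formulations, so the exact signs are an assumption and may differ from Equations (25a) and (25b); the function names are hypothetical. What the sketch shows is the structure: each STFT bin is rebuilt from the SSBT bin at the same index and the one at the conjugate index, with no inverse transform involved.

```python
import cmath
import math


def dft(frame):
    """Naive DFT of one (already windowed) real frame."""
    size = len(frame)
    return [
        sum(frame[n] * cmath.exp(-2j * math.pi * k * n / size) for n in range(size))
        for k in range(size)
    ]


def stft_to_ssbt(stft_bins):
    """ASSUMED convention: sqrt(2)*Re{X_F[k]*exp(j*3*pi/4)} = -Re{X_F[k]} - Im{X_F[k]}."""
    return [-(v.real + v.imag) for v in stft_bins]


def ssbt_to_stft(ssbt_bins):
    """Rebuild every complex bin from the real bins at k and at the conjugate index."""
    size = len(ssbt_bins)
    out = []
    for k in range(size):
        same = ssbt_bins[k]
        conj = ssbt_bins[(-k) % size]
        out.append(complex(-(same + conj) / 2, (conj - same) / 2))
    return out


frame = [0.2, -1.0, 0.7, 0.0, 1.3, -0.4, 0.9, -0.6]
stft_bins = dft(frame)
ssbt_bins = stft_to_ssbt(stft_bins)
stft_back = ssbt_to_stft(ssbt_bins)

round_trip_error = max(abs(a - b) for a, b in zip(stft_bins, stft_back))
```

The round trip relies on the conjugate symmetry of the STFT for real-valued frames, which is why the bijection holds only for real time-domain signals.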

RFR Estimation for Time-Frequency Transforms
Under the same assumptions as in Section 2.2, we can write the output of our systems with the CTF model in the SSBT domain as where  ′ 1;S [, ],  ′′ 1;S [, ],  ′ ;S [, ] and  ′′ ;S [, ] are defined similarly to their counterparts from Section 2.2. In particular, given  ;F [, ] and  1;F [, ] under the STFT, we have This is equal to ] being uncorrelated. This contradicts the fact that  1;F [, ] is the output of a convolution, thus having correlated samples. The overlap between windows in the transform also contributes to the correlation of different windows. With this, the summation over  ≠  is an error term in the RFR estimation.
Applying this same process to the SSBT, we have that In this scenario, there is an error term for the same-frequency component but also one for the cross-frequency interference. Expanding  2  ′  ′′ ,−  [] leads to Three variables are of interest: the frame delay between different sensors , the RFR index , and the delay between same- and cross-frequency . In all cases where  ≠ 0 and  ≠ , the summation term of Equation (29) is not trivially null, and, therefore, the summation over  in Equation (28) is also not identically zero. Comparing Equations (27) and (28) reveals that the SSBT estimation introduces error terms due to the cross-frequency cross-frame correlation, an issue absent in the STFT. This difference diminishes the robustness and performance of beamformers designed in the SSBT domain, potentially increasing output distortion due to inaccuracies in estimating relative frequency responses.

Signal Model and Beamforming
Let a device in a reverberant environment consist of  sensors and a loudspeaker. We assume the presence of a desired source and undesired uncorrelated noise at each sensor of the device. For simplicity, we assume that the environment and sources are spatially stationary, although this condition can be relaxed. Let   [] be the observed signal at the -th sensor, given by where   [] is the desired component,   [] is the undesired interfering signal (from the loudspeaker) captured by the sensors, and   [] is uncorrelated noise present in the sensors. The index  (1 ≤  ≤ ) refers to the different sensors. Regarding the environment as an LTI system, then with ℎ  [] being the impulse response between the desired source  [] and the -th sensor, and   [] being the relative impulse response between the reference sensor (assumed to be  = 1) and the -th sensor. In the time-frequency domain, Equation (30) becomes Hereafter, the notation differs from that used previously to ease reading and emphasize the frame index .
From Equation (31), using Equation (26) and assuming the CTF model, we can write and   , [𝑙] being the RFRs between each sensor and the reference for same- and conjugate-frequencies. Note that   , [] is not strictly causal, depending on the direction of arrival and features of the reverberant environment, as well as relative delays between the sources at each sensor. For Equation (33) to be respected, we can rewrite the convolution of Equation (34) as a vector multiplication, given by with (•) T being the transpose operator. Then, our observed signal  , [] becomes The signal vectors x ′ 1, [] and x ′′ 1, [] are still frame-dependent, while the RFR vectors d ′ , and d ′′ , are not, due to the assumption of spatial stationarity. Taking the   most recent samples of  , [] implies in which D , is an   ×  Toeplitz matrix with  =   +   − 1, and x 1, [] is an  × 1 vector of our desired signal Concatenating the observed signals sensor-wise yields y  [], defined by and where y  [] is an (   ) × 1 vector, and D   is an (   ) ×  matrix. s  [] and r  [] are defined similarly to Equation (40a).

Filtering and the MPDR Beamformer
Our objective is to recover the desired signal at the reference sensor  1, [] without any distortion while minimizing the output signal's power. For this, a linear filter f  [] is employed, yielding an estimate   [] of our desired signal: with (•) H being the conjugate-transpose operator. Using Equation (40b) in Equation (42) yields where   , [] is the filtered desired signal,   , [] is the filtered interference signal, and is the filtered noise signal. In particular, the filtered desired signal is where each component of the desired signal is exposed. The distortionless constraint on the desired signal is translated as which is equivalent to requiring that each component of  1, [] is perfectly recovered. This means maintaining the same-frequency component while nulling the cross-frequency component that appears in Equation (44). From Equation (39b), we have that the desired signal for the current index  is the (Δ + 1)-th element of x 1, []; thus, the constraints are where i Δ is an  × 1 vector of zeros except for the (Δ + 1)-th entry, which is 1, and 0 is an  × 1 vector of zeros. For the STFT, only the first constraint of Equation (46) is considered since D ′′  is identically zero by definition, and, therefore, the second condition is trivially satisfied. With this, we write our constraint matrix as where for the STFT, C  = D ′  and i = i Δ ; and for the SSBT, C  = [D ′  , D ′′  ] and i = [i Δ T , 0 T ] T .
To minimize the output signal's power while satisfying the distortionless constraint, a Minimum-Power Distortionless Response (MPDR) beamformer is used, defined by and I is the identity matrix, both of size    ×    .  is a regularization parameter for white noise gain control [28]. The solution to the minimization problem in Equation (48) is given by All conjugate-transpose operations are replaced with simple transposes for the SSBT, as all signals and matrices are real-valued in this transform.
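For the real-valued (SSBT) case, a linearly constrained minimum-power filter of this kind can be sketched with plain transposes. The code assumes the standard closed form f = Φ⁻¹C(CᵀΦ⁻¹C)⁻¹i and a regularized matrix (1 − ε)Φ + εI; the data, dimensions, and helper names are hypothetical, and the two-column constraint matrix stands in for one same-frequency and one conjugate-frequency constraint.

```python
import random


def solve(a, b):
    """Solve a @ x = b (a: n x n, b: n x m) by Gaussian elimination with pivoting."""
    n, m = len(a), len(b[0])
    aug = [list(a[r]) + list(b[r]) for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + m):
                aug[r][c] -= factor * aug[col][c]
    x = [[0.0] * m for _ in range(n)]
    for r in range(n - 1, -1, -1):
        for c in range(m):
            acc = aug[r][n + c] - sum(aug[r][j] * x[j][c] for j in range(r + 1, n))
            x[r][c] = acc / aug[r][r]
    return x


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def regularize(phi, eps):
    """(1 - eps) * phi + eps * I, used for white-noise-gain control."""
    n = len(phi)
    return [[(1 - eps) * phi[r][c] + (eps if r == c else 0.0) for c in range(n)]
            for r in range(n)]


def mpdr(phi, c_mat, i_vec, eps):
    """Real-valued filter: f = Phi^-1 C (C^T Phi^-1 C)^-1 i."""
    phi_inv_c = solve(regularize(phi, eps), c_mat)
    gram = matmul(transpose(c_mat), phi_inv_c)
    weights = solve(gram, [[v] for v in i_vec])
    return [row[0] for row in matmul(phi_inv_c, weights)]


def output_power(f, phi):
    return sum(f[r] * phi[r][c] * f[c] for r in range(len(f)) for c in range(len(f)))


random.seed(1)
N = 4  # stacked filter length (hypothetical)
snapshots = [[random.gauss(0, 1) for _ in range(N)] for _ in range(200)]
phi = [[sum(s[r] * s[c] for s in snapshots) / len(snapshots) for c in range(N)]
       for r in range(N)]
C = [[1.0, 0.0], [0.5, 0.4], [-0.3, 0.1], [0.2, -0.6]]  # made-up constraint columns
i_vec = [1.0, 0.0]  # keep the same-frequency part, null the conjugate part

f = mpdr(phi, C, i_vec, eps=1e-4)
response = [sum(C[r][q] * f[r] for r in range(N)) for q in range(2)]
```

Here `response` reproduces `i_vec`, so both constraints are met, and any other filter satisfying them has an output power at least as large under the regularized matrix. Each extra constraint column removes one degree of freedom from the minimization.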

Conjugate-Frequency Filtering with the SSBT
It is useful to recall Property A7. From it, we have that RFRs with the SSBT are even functions of the frequency for the same-frequency portion and odd for the conjugate frequency. That is,  ′ , [] =  ′ ,( − ) [] and  ′′ , [] = − ′′ ,( − ) [𝑙]. Using this, from the constraint in Equation (46b), we have It is also easy to see that  given that f  [] fulfills the minimization problem from Equation (48) for the conjugate bin  −  as well. With the STFT, only half the spectrum is needed (given its complex-conjugate properties); with the SSBT, the filter only needs to be calculated for half the spectrum (even though it must be applied to the whole spectrum), which puts the two transforms on a similar footing in this regard.

Theoretical Disadvantages
There are two apparent weaknesses with the SSBT. The first is a byproduct of working with a real-valued transform, where each frequency is influenced by its conjugate when dealing with convolution. This is inherent to any time-frequency transform that operates on a real-valued frequency domain, assuming a correct model that preserves the phase. The need to work with two frequencies simultaneously implies that, for each constraint in the problem, two constraints are needed in the mathematical model, even in an ideal scenario. This adds further load to the minimization problem, leaving fewer degrees of freedom for noise reduction.
The second disadvantage is a direct consequence of the first. As explained in Section 2.4, the SSBT is less robust than the STFT to RFR estimation errors: the correlation between conjugate frequencies on different frames introduces more error terms than in the STFT. While it is impossible to bypass the first weakness since it is a modeling obstacle, the second can be worked around as it is an outcome of non-ideal considerations. For example, these effects are lessened by minimizing the impact that different windows of the CTF model have on each other (or by using the MTF model).

Comparisons and Simulations
In the simulations (code available at https://github.com/VCurtarelli/py-ssb-ctf-bf (accessed on 21 August 2024)), we employed room impulse responses that were generated using the RIR generator [29] and signals that were selected from the SMARD database [30]. In all cases, we used   =   and Δ = 0; with this, we disregarded any non-causal frames and considered as many frame samples as the RIR samples. The room's dimensions were 4 × 6 × 3 m (width × length × height), with a reverberation time of 0.3 s. The device composed of the loudspeaker plus sensors was centered at (3, 4, 1) m, comprising  = 8 sensors arranged in a circular array with a radius of 8 cm. All sensors were omnidirectional with a flat frequency response, with the reference being the sensor at (3, 1.92, 1) m. The positions and signals used for the sources are shown in Table 1. The room's layout is described in Figure 1, where in green we have the desired source (assumed to be omnidirectional), and in red the device, with the 8 sensors and the loudspeaker in the center.
We set the input Signal-to-Noise Ratio (iSNR) for the white Gaussian source to iSNR = 30 dB and, initially, the input Signal-to-Echo Ratio (iSER) for the interfering loudspeaker source to iSER = −15 dB. The loudspeaker interference was labeled as echo, given that it was a feedback path between the source and the sensors, which were all mounted on the same device. Hamming windows were used for the transform with an overlap of 50%, and all signals were resampled to the desired sampling frequency of 16 kHz. Unless stated otherwise,  = 32 samples per window were used in the windowing processes. The regularization parameter was empirically set to  = 1 × 10^−4, just enough to control the gain in SNR. Although the developments allowed for a time-variant beamformer, we designed a single filter for the whole signal, favoring a faster processing time and ease of comparing the results.
We compare filters obtained via the two transforms for varying conditions of the signals and variables considered. In all plots, the STFT results are presented in red lines with squares, and the SSBT results in green lines with triangles. The results for an accurate a priori RFR are in lighter continuous lines, and those for the estimated RFR via Equations (10) and (18) are in darker dotted lines. For simplicity, the STFT for an accurate RFR is labeled STFT-A, and for an estimated RFR, it is STFT-E. The same notation applies to SSBT-A and SSBT-E.

Metrics of Interest
The compared filters aim to reduce the loudspeaker's signal while preserving the desired signal. The minimal enhancement for white noise is also of interest, given the regularization parameter  added to the problem. These are measured by the desired signal reduction factor (DSRF, or   ), gain in SER (gSER), and gain in SNR (gSNR), respectively. We also observe the directivity index (DI, or D), which measures the beamformer's behavior when employed in a spherically isotropic noise field. Their time-dependent broadband formulations are given by in which Γ [] is the spherically isotropic noise field correlation matrix [31], and d  [] is the steering vector between the desired source and the sensor array, both assuming a far-field, free-field environment. We are also interested in a time-averaged broadband formulation for these metrics, these being given by We chose the gains in SER and SNR rather than the more common ERLE and WNG [32] because distortion of the desired signal is expected a priori; the gains are therefore more representative of the results.

Comparison for Different Observed Frames
In this simulation, we compare our filters for the accurate and estimated RFR for a range of considered observed frames   , in which   = 1 reduces to the MTF model. The results are in Figure 2. For all metrics except DI, the accurate results are consistently better than those achieved for an estimated RFR. Also, the SSBT-A results are, overall, worse than the STFT-A ones, but not by too large a margin (around 3-4 dB for all metrics). This is also the case between the SSBT-E and STFT-E, but to a higher degree, since the SSBT-E beamformer also led to a higher distortion of the desired signal, which adds considerably to its performance loss compared to the STFT-E for all metrics.

The best STFT-E results are obtained for   = 1 in this scenario. With   = 2, there is a notable increase in DSRF and a sharp decrease in gSER, with the latter increasing again for larger values of   but never achieving the same performance. This is due to errors in the RFR estimation for frames other than  = 0, given that these windows carry less information, and these errors interfere more than adding new frames helps. Also, the estimate with   = 1 already takes into account some information about the different windows' correlations with regard to the desired signal (see Equation (27)), explaining the STFT-E results being better than the STFT-A results in this case.

Comparison for Different Numbers of Samples per Frame
We now compare the beamformers in a scenario with the MTF model and change the number of samples per window (Figure 3). This circumvents the problems addressed in Section 2.4 for both transforms since we do not consider a convolutive filter. By increasing the number of samples per window, we minimize the frequency aliasing effects caused by the windowing process of the time-frequency transform. This allows us to capture more of the desired signal on each window.

The first effect is that the desired signal's distortion for the SSBT-E decreases as more samples are considered. Also, for all other metrics, the SSBT-E performance approaches that of the SSBT-A as we increase . The same happens for the STFT filters; however, the difference between the STFT-A and STFT-E is negligible with far fewer samples per frame. It is also observed that for a high , the SSBT results are only slightly worse than the STFT ones for both the accurate and estimated RFR cases. This supports the previous theoretical claims, where we state that increasing the number of samples per frame reduces the RFR estimation errors.

Comparison for Different iSERs
We now compare the beamformers for different values of the input SER (iSER); that is, we change the signal's power in the source within the device, right next to the sensors. This simulation's results are presented in Figure 4. Overall, the accurate results for both transforms are comparable, with the SSBT-A again underperforming marginally except for a high iSER and   = 1. The same results as obtained previously are observed for the estimated RFR outputs. For   = 1, the SSBT-E grossly underperforms compared to the STFT-E, and although their difference for   = 16 is not as substantial, it is still relevant, except for a higher iSER. We repeated these simulations, but instead of different values of   , we changed the number of samples per frame to  = 512 and set   = 1, with results in Figure 5. This was chosen using the knowledge from Section 2.4 and Figure 3, where we have both theoretically and practically that a higher  minimizes the effects of estimation error. Notably, the results for the STFT-A and STFT-E are identical in this scenario, as in Figure 3 for a higher . Meanwhile, the SSBT-A and SSBT-E results are similar only for very low iSERs (below −20 dB), and for higher input SERs, the estimated SSBT results deteriorate drastically. Although a high  value was used, which theoretically would lead to a better SSBT-E result (following Figure 3), we see that increasing the iSER led to a worse performance, similar to the results seen in Figure 4.

General Simulation Results
For all simulations, the ideal SSBT-A results are similar (although mostly slightly worse) to the STFT-A results. Given that the SSBT beamformer is a strictly real-valued filter, this could be a worthwhile tradeoff of a slight performance loss for a faster beamforming algorithm, leading to a cheaper implementation. For a useful implementation of the SSBT in beamforming, however, this behavior would need to carry over to the estimated RFR case. In that scenario, the SSBT results were drastically worse than those obtained through the STFT, except for the specific case of a multiplicative filter with a large number of samples per frame and a low iSER. This suggests that employing the SSBT with an inherently more robust beamformer, with the MTF model and a high , could lead to viable applications.

Conclusions
We conducted a comprehensive investigation into the Single-Sideband Transform within the context of beamforming, examining its mathematical properties and interaction with key processes such as convolution, relative frequency response estimation, and the distortionless constraint. Our theoretical study reveals that despite its interesting real-valued representation, the SSBT exhibits higher susceptibility to errors in RFR estimation, requiring stricter constraints for its proper application. We found that in scenarios with longer time windows and under the multiplicative transfer function model, the estimation errors with the SSBT are reduced. Furthermore, we established that the SSBT and the Short-Time Fourier Transform are interchangeable in the time-frequency domain without the need for their respective inverse transforms, enabling a seamless conversion between the two.
To validate our theoretical findings, we employed both transforms in the design of a convolutive Minimum-Power Distortionless Response beamformer within a reverberant environment across various scenarios. These practical results support our theoretical claims, showing that the SSBT-based filter slightly underperforms the STFT-based one in optimal conditions and significantly underperforms in non-ideal situations. While these findings highlight challenges in directly applying the SSBT for beamforming, the interchangeability of these transforms allows for filtering in the STFT domain, even when signals are initially in the SSBT domain. This allows for the combined use of the transforms, taking advantage of their respective strengths as applicable. Future research could explore integrating the SSBT into more robust beamformers, further comparing it to the STFT, and studying the cooperative integration of the two transforms in greater depth.
Substituting this in Equation (A4), we have The first term expands to the inverse Fourier transform of  F (  ), which is trivially (); the second term is the inverse Fourier transform of  F (−  ), which, from the time reversal property, is (−). Therefore, Note that the invertibility of the RFT is a direct consequence of () being a real signal. Assuming the contrary, then R −1 {R{()}(  )}() ≠ (), given that the inverse RFT involves taking the real part of the inverse, and, therefore, it maps the original signal to a different one. □
Property A3. The FT convolution theorem does not apply for the RFT.
Proof. Let ℎ() be the impulse response of an LTI system, with input (). The system's output, (), is given by with * being the convolution operator. For the Fourier transform, through the convolution theorem, we have Expanding these in terms of real and imaginary parts (omitting the frequency index for clarity), Now in the RFT domain, with Equation (A6), we have that Assuming the convolution theorem is true for the RFT, Now, by applying Equation (A6) on Equation (A17), we have where it is explicit that  R (  ) ≠ ỸR (  ). Therefore, the RFT of the convolution (Equation (A20)) is not the product of the RFTs of the signals (Equation (A19)), and thus, the convolution theorem does not hold for the RFT. □
Property A4. There is an equivalent of the convolution theorem for the RFT.
Proof. From Equation (A20), we have our objective for the "convolution theorem"-equivalent for the RFT. From both Equations (A6) and (A7), we have We omit the frequency dependency in the FT values. Taking the possible combinations, we have Taking the difference between Equation (A22a) and Equation (A22d), and the sum of Equation (A22b) and Equation (A22c), we have
Proof. We now assume that  F (  ) is the transform of a random process, such that its real and imaginary parts are independent and identically distributed with zero means. Taking the complex correlation of a given frequency in the FT domain,
Proof. We take the same assumptions as those in Property A5. Taking the complex correlation between the two conjugate frequencies, Using the fact that the  ℝ F (  ) and   F (  ) are independent and zero-mean, the cross terms are zero, and with Equation (A28), then This result is known, but it is useful to show it since this same procedure is used for the RFT.
We now consider Equations (A6) and (A7). Taking the correlation between the two conjugate frequencies yields { R (  )  R (−  )} =   ℝ F (  ) 2 −   F (  ) 2 . (A34) Under the same assumptions that the real and imaginary parts of  F (  ) are identically distributed, we obtain the same result as before, where { R (  )  R (−  )} = 0 . (A35) Note that, with the RFT, we did not use the complex correlation since it is real-valued. Lastly, we take the correlation between conjugate frequencies of the output of a system according to Equations (A25) and (A26):
(48) where  ; [] = (1 − ) y  [] + I, with  y  [] being the pseudo-correlation matrix of y  [],
 y  [] =  y  − [] (for the SSBT), given all the properties from Appendix A. Therefore, we have that f  [] achieves the distortionless constraint for the bin  − , given that  ′ , [] =  ′ ,( − ) []. It achieves the null of the conjugate-frequency portion, given the results from Equation (50), and also minimizes the power of the output signal, given that  y  [] =  y  − []. Consequently, it is unnecessary to calculate f  − [],

Figure 2. Output metrics for the beamformers over time for varying values of   .

Figure 3. Output metrics for the beamformers over time, for varying .

Figure 4. Output metrics for the beamformers for varying iSERs, with  = 32 samples, and two cases of   .
signal vectors x ′ 1, [] and x ′′ 1, [] are still frame-dependent, while the RFR vectors d ′ , and d ′′ , are not due to the assumption of spatial stationarity.Taking the   most recent samples of  , [] implies

Table 1. Source information for the simulations.