On a Method for Improving Impulsive Sound Localization in Hearing Defenders

This paper proposes a new algorithm for a directional aid in hearing defenders. Users of existing hearing defenders receive distorted directional information or, in the worst case, no directional information at all, and may therefore be exposed to serious safety risks. The proposed algorithm improves the directional information available to users of hearing defenders by enhancing impulsive sounds using the interaural level difference (ILD). This ILD enhancement is achieved by incorporating a new gain function. Illustrative examples and performance measures are presented to highlight the promising results. By improving the directional information in active hearing defenders, the new method is found to serve as an advanced directional aid.


INTRODUCTION
In many cases, individuals must use hearing defenders for protection against harmful sound levels. Hearing defenders provide passive attenuation of the external sounds that enter the ears. However, the use of existing hearing defenders affects natural sound perception, which in turn reduces direction-of-arrival (DOA) estimation capabilities [1,2]. This impairment of DOA estimation accuracy has been reported as a potential safety risk associated with existing hearing defenders [3].
This paper presents a new method for enhancing the perceived directionality of impulsive sounds, since such sounds may carry information that is useful to the user. The proposed scheme introduces a directional aid that delivers enhanced versions of impulsive external sounds to the user, thereby improving the user's DOA estimation capability for those sounds. Exaggerating the directional information of impulsive sounds will not, in general, produce a psychoacoustically valid cue. Instead, the method is expected to enhance the user's ability to approximate the direction of an impulsive sound source, and thereby to speed up the localization of that source. Apart from enhancing the directionality of impulsive sounds, the proposed method should leave other classes of sounds (e.g., human speech) unaltered. Safety is thus likely to be increased by applying our new approach to impulsive sounds.
The spatial information is enhanced without increasing the sound levels (i.e., signals are only attenuated, never amplified). The risk of damaging the user's hearing through increased sound levels is thereby avoided. However, the proposed directional aid passes the enhanced external sounds directly to the user without any restriction. In a real implementation, it is therefore recommended that a postprocessing stage be placed after the directional aid to limit the sound levels delivered to the user. Active hearing defenders with such limiting features are commercially available today.
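Such a limiting stage is not part of the proposed aid itself; purely as an illustration, a trivial sample-wise limiter with an assumed threshold could look as follows (the threshold value is hypothetical, and a commercial limiter would use a calibrated level with smoother gain control):

```python
import numpy as np

def limit_output(y, threshold=0.5):
    """Hard-limit the directional-aid output to a safe peak level.

    The threshold here is an arbitrary illustrative value; a real
    limiting stage would use a calibrated level and smoothed gain.
    """
    return np.clip(y, -threshold, threshold)

y = np.array([0.1, -0.9, 0.7, 0.2])
print(limit_output(y))  # peaks beyond +/-0.5 are clipped to the threshold
```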
A suitable application of the directional aid is in active hearing defenders used in hunting, police, or military contexts, where impulsive sounds such as gun or rifle shots are common. In these applications, impulsive sounds are likely to signal danger, and fast localization of impulsive sound sources is therefore vital. A similar idea for enhancing directional information can be found in [4], wherein the hearing defender is physically redesigned using passive means to compensate for the loss of directional information.
A brief introduction to the theory of human directional hearing is provided hereafter, followed by our proposed scheme for a directional aid. An initial performance evaluation of the proposed method is then given, together with a summary and conclusions.

THEORY OF HUMAN DIRECTIONAL HEARING
The human estimation of direction of arrival can be modeled by two important binaural auditory cues [5]: the interaural time difference (ITD) and the interaural level difference (ILD). Other cues are also involved in discriminating the direction of arrival in elevation. For example, reflections of the impinging signals by the torso and pinna are important features for estimating the elevation angle; these reflections are commonly modeled by head-related transfer functions (HRTFs) [6,7]. The focus of this paper is on the binaural ILD cue and on direction-of-arrival estimation in the horizontal plane.
The spatial characteristics of human hearing are the focus when describing the underlying concept of these two cues, ITD and ILD. It is assumed that the sound is emitted from a monochromatic point source (i.e., a propagating sinusoid specified by its frequency, amplitude, and phase). In direction-of-arrival estimation, the intersensor distance is important for avoiding spatial aliasing, which introduces estimation errors. The distance between the two ears of a human individual corresponds roughly to one period (the wavelength) of a sinusoid with fundamental frequency F_0. (For an adult person, this fundamental frequency is F_0 ≈ 1.5 kHz.) A signal whose frequency exceeds F_0 spans more than one period over this particular distance, whereas a signal with a frequency below this threshold spans only a fraction of a period. Consequently, for a signal whose frequency falls below F_0, the phase information is utilized for direction-of-arrival estimation; this corresponds to the ITD model. For a signal with frequencies above F_0, the phase information is ambiguous, and the level information of the signal is more reliable for direction-of-arrival estimation; this corresponds to the ILD model. The use of level information stems from the fact that a signal that travels a longer distance has, in general, lower intensity, a feature that is more accentuated at higher frequencies.
Consequently, the ear closer to the source receives a higher-intensity sound than the opposite ear. In addition, the human head itself obstructs signals passing from one ear to the other [8,9]. The discussion above gives only a general overview and simplifies many of the processes involved in human direction-of-arrival estimation. However, it provides the basis for the simplified model of human direction-of-arrival estimation considered in this paper.
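As a rough check of the F_0 figure above, the threshold can be sketched as the frequency whose wavelength equals the interaural distance. The head-width value used below is an illustrative assumption, not a value from the paper:

```python
# Sketch: estimate the ITD/ILD crossover frequency F0 as the frequency
# whose wavelength equals the interaural distance. The 0.23 m head
# width is an illustrative assumption, not a value from the paper.
SPEED_OF_SOUND = 343.0  # m/s, air at roughly 20 degrees C

def crossover_frequency(interaural_distance_m: float) -> float:
    """Frequency whose wavelength equals the given distance."""
    return SPEED_OF_SOUND / interaural_distance_m

f0 = crossover_frequency(0.23)
print(round(f0))  # roughly 1.5 kHz, consistent with the F0 cited above
```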

PROPOSED SCHEME FOR A DIRECTIONAL AID
In our scheme, two external omnidirectional microphones are mounted in the forward direction, one on each of the two cups of the hearing defender; see Figure 1. In addition, two loudspeakers are placed in the interior of the cups. These loudspeakers are employed for the realization of the directional aid.

Figure 1 (top view): A hearing defender with directional aid, where the external microphone signals, M_L and M_R, are used to impose internal sounds through the loudspeakers, L_L and L_R, in order to realize the directional aid.

An overview of the scheme proposed for a directional aid is shown in Figure 2: the input signals x_L(n) and x_R(n) are split into the subband signals x_L,LF(n), x_L,HF(n), x_R,LF(n), and x_R,HF(n), and the HF pair feeds the ILD enhancement block. Note that in this scheme, the low-frequency signal components are simply passed on without any processing.

Signal Model
The microphones spatially sample the acoustic field, providing the temporal signals x_L(n) and x_R(n), where L and R represent the left and right sides of the hearing defender, respectively. An orthogonal two-band filter bank is used for each microphone. The low-frequency (LF) band of this filter bank, denoted by H_LF(ω), consists of a low-pass filter with a cut-off frequency around the fundamental frequency, F_0, corresponding to the ITD spectral band. Similarly, the high-frequency (HF) band of the filter bank is denoted by H_HF(ω) and corresponds to the ILD spectral band. Since only the ILD localization cue is employed in our approach, the LF signals (corresponding to the ITD cues) are simply passed through the proposed system unaltered.
The left microphone signal, x_L(n), is decomposed by the two-band filter bank into an LF signal, x_L,LF(n), and an HF signal, x_L,HF(n). Similarly, the right microphone signal, x_R(n), is decomposed into the LF and HF components x_R,LF(n) and x_R,HF(n). The HF components are the inputs to the ILD enhancement block (see Figure 3), which provides the enhanced outputs y_L,HF(n) and y_R,HF(n). The left- and right-side output signals, y_L(n) and y_R(n), are the sums of the LF input signal components and the enhanced HF output signal components, according to y_L(n) = x_L,LF(n) + y_L,HF(n) and y_R(n) = x_R,LF(n) + y_R,HF(n), respectively.

Figure 3: A block scheme for the enhancement of the ILD cue for human direction-of-arrival estimation, with inputs x_L,HF(n) and x_R,HF(n), outputs y_L,HF(n) and y_R,HF(n), and a directional gain calculation stage.
For simplicity, the filters H_LF(ω) and H_HF(ω) are 128-tap finite impulse response (FIR) filters, designed by the window method using a Hamming window. It should be noted that, in a real implementation, it is of utmost importance to match the passive path to the active (digital) path with respect to signal delay, in order to avoid a possibly destructive signal skew. The impulse response of the passive path, from the external microphone of a hearing defender to a reference microphone placed close to the ear canal of a user, is presented in Figure 4. This estimated impulse response has a low-pass characteristic and a dominant peak at a delay of 7 samples at a sampling frequency of 8 kHz. Thus, the active path should match this 7-sample delay of the passive path. This can be achieved in a real implementation by selecting low-delay (1-sample delay) analog-to-digital and digital-to-analog converters. In addition, the digital filter bank should be selected (or designed) with a pronounced focus on group delay in order to match the passive and active paths (e.g., by using infinite impulse response (IIR) filter banks). The Haas effect (also known as the precedence effect) [10] underscores the importance of minimizing the temporal skew between the active and passive paths: an overly long delay combined with low passive-path attenuation would render the directional aid imperceptible. These practical details are considered out of the scope of this paper, but they should be investigated in a later real-time implementation and evaluation of the proposed method.
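As a sketch of such a window-method design: the paper specifies only the tap count and the Hamming window, so the complementary high-pass construction below (and the odd filter length it requires) is an assumption made here for illustration:

```python
import numpy as np

FS = 8000.0    # sampling frequency (Hz), as used in the paper
F0 = 1500.0    # cut-off around the ITD/ILD threshold (Hz)
NUM_TAPS = 129  # odd length for a symmetric linear-phase design; the
                # paper uses 128 taps, 129 is assumed here so that the
                # complementary high-pass below is exact

def lowpass_window_method(num_taps: int, fc: float, fs: float) -> np.ndarray:
    """Window-method FIR low-pass: truncated ideal sinc times a Hamming window."""
    n = np.arange(num_taps) - (num_taps - 1) / 2
    h = 2 * fc / fs * np.sinc(2 * fc / fs * n) * np.hamming(num_taps)
    return h / h.sum()  # normalize to unit DC gain

h_lf = lowpass_window_method(NUM_TAPS, F0, FS)

# Complementary high-pass: a delayed unit impulse minus the low-pass,
# so that the two branches sum back to a pure delay.
delta = np.zeros(NUM_TAPS)
delta[(NUM_TAPS - 1) // 2] = 1.0
h_hf = delta - h_lf
```

Filtering x_L(n) with h_lf and h_hf then yields x_L,LF(n) and x_L,HF(n), and summing the two branches reproduces the input up to a 64-sample delay.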

The proposed ILD enhancement scheme
One fundamental step in our proposed method is to determine whether a signal onset has occurred. (A tutorial on onset detection in music processing can be found in [11], and a method for onset detection for source localization can be found in [12].) Once a signal onset has occurred, any further onsets are disregarded within a certain time interval, unless a very distinct onset appears. This time interval is used to avoid undesired false onsets, which may occur due to a highly reverberant environment or acoustical noise. When an onset is detected, the method determines which side (left or right) receives the current attention. For instance, for a signal that arrives at the left microphone before the right microphone, attention is focused on the left side, and vice versa. Based on the onset information and on which side holds the attention, the "unattended" side is attenuated accordingly. Hence, the directionality of the sound is improved automatically.
A detailed description of the important stages of the proposed method follows, covering onset detection, formation of side attention, and computation of the gain function for the desired directionality enhancement.

Onset detection
The envelopes of the HF input signals, denoted by e_L(n) and e_R(n), are employed in the onset detection. To avoid mismatch due to uneven amplification of the two microphone signals, a floor function is computed for each side. These floor functions, denoted by f_L(n) and f_R(n), are computed recursively, where α ∈ [0, 1] is a factor associated with the integration time of the floor functions. This integration time should be on the order of seconds, such that the floor functions track slow changes in the envelopes. The function min(a, b) takes the minimum value of the two real parameters a and b. The normalized envelopes, ē_L(n) and ē_R(n), are then computed from the envelopes and their floor functions, and the envelope difference function, d(n), is defined as the difference between the normalized envelopes. A ceiling function, c(n), of the envelope difference function is computed according to (4), where β ∈ [0, 1] is a real-valued parameter that controls the release time of the ceiling function. This release time influences the resetting of the attention functions in (7), and it should correspond to the reverberation time of the environment. The function max(a, b) returns the maximum value of the real parameters a and b. An onset is detected if the ceiling function exactly equals the envelope difference function, that is, c(n) = d(n). This occurs only when the max(·) function in (4) selects its second parameter, d(n), which corresponds to an onset.
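The recursions themselves are not reproduced in this excerpt, so the sketch below assumes common leaky min/max forms for the floor and ceiling functions (f_m(n) tracking a slow minimum of e_m(n), and c(n) a slow-release maximum of d(n)); only the detection rule c(n) = d(n) and the hold interval are taken directly from the text:

```python
import numpy as np

def detect_onsets(e_left, e_right, alpha=0.999, beta=0.99,
                  hold=100, eps=1e-8):
    """Onset detection from two envelope signals.

    Assumed recursions (the paper's exact equations are not shown here):
      floor:   f_m(n) = min(alpha * f_m(n-1) + (1 - alpha) * e_m(n), e_m(n))
      ceiling: c(n)   = max(beta * c(n-1), d(n))
    An onset is flagged when c(n) == d(n), i.e., when max() picks d(n);
    further onsets are then suppressed for `hold` samples.
    """
    f_l, f_r = e_left[0], e_right[0]
    c = 0.0
    last_onset = -hold
    onsets = []
    for n in range(len(e_left)):
        f_l = min(alpha * f_l + (1 - alpha) * e_left[n], e_left[n])
        f_r = min(alpha * f_r + (1 - alpha) * e_right[n], e_right[n])
        d = abs(e_left[n] / (f_l + eps) - e_right[n] / (f_r + eps))
        c = max(beta * c, d)
        if c == d and d > eps and n - last_onset >= hold:
            onsets.append(n)
            last_onset = n
    return onsets

# A left-side impulse arriving at n = 50 against a quiet background:
e_l = np.full(200, 0.1)
e_r = np.full(200, 0.1)
e_l[50:] = 1.0
print(detect_onsets(e_l, e_r)[0])  # first reported onset is at n = 50
```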

Side attention decision
In the case of a detected onset, the values of the normalized envelopes determine the current attention. If ē_L(n) > ē_R(n), the attention is on the left side, and the corresponding attention function a_L(n) is updated. If, on the other hand, ē_L(n) < ē_R(n), the attention is on the right side, and the attention function for the right side is updated. This attention-function mechanism is formulated as two cases, CASE 1 and CASE 2, where CASE 1 corresponds to ē_L(n) > ē_R(n) and CASE 2 to ē_L(n) < ē_R(n). Here, γ ∈ [0, 1] represents a forgetting factor for the attention functions; its integration time should be close to the expected interarrival time between two impulses.
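The exact attention-function recursion is not reproduced in this excerpt; the leaky update below is an assumed form, shown only to illustrate the two-case mechanism and the role of the forgetting factor γ:

```python
def update_attention(a_left, a_right, e_norm_left, e_norm_right,
                     onset, gamma=0.99):
    """One step of the side-attention bookkeeping.

    Assumed form (the paper's exact recursion is not shown in this
    excerpt): both attention functions decay with forgetting factor
    gamma, and on an onset the attended side accumulates the
    normalized-envelope excess.
    """
    a_left *= gamma
    a_right *= gamma
    if onset:
        if e_norm_left > e_norm_right:       # CASE 1: attend left
            a_left += (1 - gamma) * (e_norm_left - e_norm_right)
        elif e_norm_left < e_norm_right:     # CASE 2: attend right
            a_right += (1 - gamma) * (e_norm_right - e_norm_left)
    return a_left, a_right

# An onset with a larger normalized envelope on the left shifts the
# attention balance toward the left side:
a_l, a_r = update_attention(0.0, 0.0, 9.0, 1.0, onset=True)
print(a_l > a_r)  # True
```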

Directional gain function
To avoid false decisions due to a highly reverberant environment or acoustical noise, a long-term floor function, f_C(n), is applied to the ceiling function, where the parameter δ ∈ [0, 1] controls the integration time of this long-term average. This integration time should be on the order of seconds, so that the floor function tracks slow changes in the ceiling function. In order to avoid drift in the attention functions, they are reset to a_L(n) = a_R(n) = 0 if the min(·) function in (7) selects its second parameter, c(n). This condition triggers some time after a recent onset has occurred (a time determined mainly by β and partly by δ); thereafter, the recent impulse is considered absent. Depending on the values of the attention functions a_L(n) and a_R(n) and of the ceiling and floor functions c(n) and f_C(n), the two directional gain functions, g_L(n) and g_R(n), are calculated. If a_L(n) > a_R(n), the attention shifts toward the left side, and consequently the right side is suppressed. If, on the other hand, the attention shifts toward the right side, that is, a_L(n) < a_R(n), the left side is suppressed. The directional gain functions are computed according to two cases, CASE 3 and CASE 4, corresponding to a_L(n) > a_R(n) and a_L(n) < a_R(n), respectively. Here, ϕ(c(n), f_C(n)) is a mapping function that controls the directional gain and should be able to discriminate certain types of sounds. The mapping function used in this paper is inspired by the unipolar sigmoid function that is common in the neural network literature [13]. The parameter ϕ_A controls the maximum directional gain imposed by the proposed algorithm. The parameter ϕ_D corresponds to a center point between the pass-through region (ϕ(c(n), f_C(n)) = 1) and the attenuation region (ϕ(c(n), f_C(n)) = 1/ϕ_A) of the mapping function. The parameter ϕ_S corresponds to the transition rate of the mapping function from the pass-through region to the attenuation region.
The quotient of the two parameters c(n) and f_C(n) is used in (10) to make the mapping function invariant to the scale of the input signal. The parameters of the mapping function have been selected empirically such that impulsive sounds (identified as target sounds) are differentiated from speech (nontarget sounds). A set of parameters that appears to be suitable in the tested scenarios is ϕ_A = 10, ϕ_S = 2, and ϕ_D = 32. The mapping function in (10) is presented in Figure 5. It is stressed that these parameters were found empirically through manual calibration of the algorithm; optimal parameter values could be found using some form of neural training. The output signals of the ILD enhancement block can now be expressed as y_L,HF(n) = g_L(n)x_L,HF(n) and y_R,HF(n) = g_R(n)x_R,HF(n). Consequently, the total output of the directional aid is obtained as y_L(n) = x_L,LF(n) + g_L(n)x_L,HF(n) and y_R(n) = x_R,LF(n) + g_R(n)x_R,HF(n).
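The exact sigmoid expression is not reproduced in this excerpt; the sketch below assumes a standard unipolar-sigmoid form chosen to satisfy the stated limits (gain 1 in the pass-through region, 1/ϕ_A in the attenuation region, center ϕ_D, transition rate ϕ_S), using the paper's parameter values:

```python
import math

PHI_A = 10.0  # maximum directional attenuation (paper's value)
PHI_S = 2.0   # transition rate (paper's value)
PHI_D = 32.0  # center point of the transition (paper's value)

def directional_gain(c, f_c):
    """Assumed sigmoid mapping phi(c, f_c) on the scale-invariant
    quotient c / f_c: ~1 for small quotients (pass-through) and
    ~1/PHI_A for large quotients (attenuation)."""
    q = c / f_c
    return 1.0 / PHI_A + (1.0 - 1.0 / PHI_A) / (1.0 + math.exp(PHI_S * (q - PHI_D)))

print(round(directional_gain(1.0, 1.0), 3))    # ~1.0: nonimpulsive, pass through
print(round(directional_gain(100.0, 1.0), 3))  # ~0.1: impulsive, attenuate by phi_A
```

The gain returned here would be applied to the HF component of the unattended side, while the attended side keeps unit gain.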

Illustration of performance
This section illustrates important output signals of the proposed algorithm. An impulsive sound signal (gun shots) and a speech signal are used as input to the algorithm. To aid the illustration, all signals have a peak magnitude of 1. The sampling frequency and the algorithm's parameter values follow those outlined in the performance evaluation below. Four impulses are present; the first two impulses originate from the left side of the hearing defender, the second two from the right side. After 3.5 seconds, only speech is active. Figure 6 illustrates the input with its corresponding directional-aid outputs and other relevant intermediary signals. This illustration highlights the operation of the algorithm and demonstrates that the directional information for the two test signals is indeed enhanced (according to the magnitude of the outputs for the two test impulses).

PERFORMANCE EVALUATION
In the following, the performance and characteristics of the proposed algorithm are demonstrated. Two cases are investigated. First, the directional aid's ability to enhance the directionality of impulsive sounds (gun shots) relative to speech sounds is evaluated. Speech is a type of signal that should be transparent to the algorithm, that is, it should pass through the algorithm unaltered, since the focus of our algorithm is the enhancement of impulsive sounds. Second, the directional aid's sensitivity to interfering white noise is evaluated at various levels of impulsive-sound peak energy to interfering-noise ratio (ENR). The signals used in this evaluation are played through a loudspeaker in an office room (reverberation time RT60 = 130 milliseconds) and recorded using the microphones of an active hearing defender; see Figure 1. The sampling frequency is F_S = 8 kHz, and the parameter values used in the evaluation are selected as T_α = T_δ = 4 seconds and T_β = T_γ = 0.15 second, where the actual value of every parameter p ∈ {α, β, γ, δ} is computed as p = 1 − 1/(F_S T_p), with T_p the time constant (in seconds) associated with parameter p. This approximation is valid for T_p ≫ 1/F_S.
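The time-constant conversion above can be sketched directly:

```python
FS = 8000.0  # sampling frequency (Hz)

def smoothing_factor(t_p: float, fs: float = FS) -> float:
    """Map a time constant T_p (seconds) to its recursion factor
    p = 1 - 1/(F_S * T_p); valid for T_p >> 1/F_S."""
    return 1.0 - 1.0 / (fs * t_p)

alpha = delta = smoothing_factor(4.0)   # slow floors: T = 4 s
beta = gamma = smoothing_factor(0.15)   # fast release: T = 0.15 s
print(round(alpha, 8), round(beta, 6))  # 0.99996875 0.999167
```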

Performance measures
The maximal spectral deviation (MSD) is used as an evaluation measure. The MSD assesses the maximal deviation (in log scale) of the processed output signal relative to the unprocessed input signal, and is defined as

MSD = max_{m,k} |ΔP_m(k)|,

where the spectral deviation is ΔP_m(k) = 10 log P_ym(k) − 10 log P_xm(k). Here, P_ym(k) and P_xm(k) represent power spectral density estimates of the processed output signal y_m(n) and the corresponding input signal x_m(n), where m is the channel index and k the frequency-bin index. In other words, the MSD assesses the maximal spectral deviation of the output signal with respect to the input signal over all channels and all frequencies. In general, the MSD is high if the processing alters the output signal with respect to the input signal, and low if the output signal is spectrally close to the input signal.
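A minimal sketch of the MSD computation follows; a plain periodogram is used as the PSD estimate, since the paper's exact estimator is not specified in this excerpt:

```python
import numpy as np

def msd_db(inputs, outputs, eps=1e-12):
    """Maximal spectral deviation (dB) over all channels and frequency bins.

    `inputs` and `outputs` are lists of equal-length 1-D arrays, one per
    channel. A plain periodogram serves as the PSD estimate here (an
    assumption; the paper does not specify its estimator).
    """
    worst = 0.0
    for x, y in zip(inputs, outputs):
        p_x = np.abs(np.fft.rfft(x)) ** 2 / len(x)
        p_y = np.abs(np.fft.rfft(y)) ** 2 / len(y)
        dev = 10 * np.log10(p_y + eps) - 10 * np.log10(p_x + eps)
        worst = max(worst, float(np.max(np.abs(dev))))
    return worst

rng = np.random.default_rng(0)
x = rng.standard_normal(512)
print(round(msd_db([x], [x]), 3))        # 0.0: identical signals deviate nowhere
print(round(msd_db([x], [2.0 * x]), 2))  # ~6.02 dB: a 2x gain is 10*log10(4)
```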
For the evaluation of the directional aid's sensitivity to interfering noise, a directional gain deviation (DGD) measure is used. This measure compares the directional gains of each channel in the ideal, noise-free case (ENR = ∞), denoted by g_L|∞(n) and g_R|∞(n), with the gains obtained when interfering noise is present at a specific ENR level, denoted by g_L|ENR(n) and g_R|ENR(n). A DGD measure is defined for each channel accordingly. The desired behavior is obtained if the directional gains at a specific ENR level exactly follow the directional gains in the ideal case, yielding DGD measures of zero; any deviation from this behavior is considered nonideal.

An impulsive test signal
In this first test, an impulsive test signal (gun shots) is used to assess the objective performance. The MSD for this impulsive test signal is 4.3 dB, which implies that the algorithm spectrally alters the signal. This is the expected behavior for impulsive sounds.

A nonimpulsive test signal
In this second test, a nonimpulsive test signal (a speech signal) is used to demonstrate the performance. Such a signal is expected to be transparent to the algorithm. The MSD for this speech test signal is ≈0 dB, which indicates that the algorithm leaves nonimpulsive signals spectrally undistorted.

Sensitivity to interfering noise
A mixture of white Gaussian noise and impulsive sounds acts as input to the directional aid. The impulsive sounds are set to a maximal amplitude of 1, and the level of the interfering noise is then set according to the desired ENR level. The DGD measures for each channel are presented in Figure 7. The figure indicates that the directional aid fails to operate reliably for ENR levels below 20 dB.

SUMMARY AND CONCLUSIONS
This paper presents a novel algorithm that serves as a directional aid for hearing defenders and is intended to improve the protection of users of active hearing defenders. Users of existing hearing defenders experience distorted directional information, or none at all, which is identified as a serious safety flaw. This paper therefore introduces a new algorithm, and an initial analysis has been carried out. The algorithm passes nonimpulsive signals unaltered, while the directional information of impulsive signals is enhanced by means of a directional gain. According to the objective measures used, the algorithm performs well. A more detailed analysis, including a psychoacoustic study with real listeners, will be conducted in future research; this study should be carried out on a real-time system, where the impact of the various design parameter values on psychoacoustic performance is evaluated in the intended live application.
The work presented herein is initial work introducing a strategy for a directional aid in hearing defenders, with a focus on impulsive sounds. Future research may include enhancing directional information for other sound classes, such as the tonal alarm signal of a reversing truck.
Future research may also involve modifications of the proposed algorithm, such as reducing its sensitivity to interfering noise. The directional aid may be further enhanced with a control structure that restrains the enhancement of repetitive impulsive sounds, such as those from a pneumatic drill. This would extend the possible application areas of our directional aid.
