Perception-based noise assessment of a future blended wing body aircraft concept using synthesized flyovers in an acoustic VR environment — The ARTEM study

New aircraft concepts are currently being developed with the goal of lower CO2 and noise emissions. Substantial noise reductions in long-range aircraft can only be expected from disruptive vehicle designs, new propulsion systems and specific low-noise technologies. In this paper, one such future vehicle design, a blended wing body (BWB) long-range aircraft, is described and studied with respect to sound levels on the ground, sound characteristics and noise annoyance. Virtual flyovers of different vehicle variants were synthesized and auralized in an acoustic VR environment, and investigated through psychoacoustic laboratory experiments. The applied methodology was successfully validated in a hierarchical manner by comparison with measurements of existing jet aircraft, assessing acoustical indices, time-frequency features, perceived plausibility


Background
Aircraft noise causes annoyance, sleep disturbance, cardiovascular disease and other adverse health effects for millions of people worldwide [69,70]. In 2017, for example, it was estimated that 4 million people in Europe were exposed to aircraft noise with a day-evening-night level (L_den) of 55 dB or higher [17]. With a growing population and the number of (post-COVID) aircraft movements likely to increase, this problem can be expected to become more pronounced in the future.
To counter this trend, the International Civil Aviation Organization (ICAO) published the "Balanced Approach" [26], according to which aircraft noise management should be addressed through four main elements, namely (i) noise reduction at the source, (ii) land-use planning and management, (iii) noise abatement operational procedures, and (iv) operating restrictions. Noise reduction at the source in combination with tailored low-noise flight procedures is the most effective measure, as has been shown in various studies, e.g. [7,8]. Indeed, large noise reductions were obtained in the past decades [25], mainly due to fleet modernization with aircraft powered by turbofan engines with higher bypass ratios. Strategies to optimize aircraft flight trajectories regarding noise were recently published for departures [74] and for arrivals [73]. Future low-noise aircraft designs could therefore help to further reduce the growing problem of aircraft noise.
The noise situation around airports is usually assessed with calculated spatial maps of the noise exposure from annual air operations using noise metrics such as the A-weighted equivalent continuous sound pressure level (L_Aeq) or the L_den, e.g. [19]. Single aircraft or flight procedures are assessed by event-based quantities, such as the A-weighted sound exposure level (L_AE) or the Effective Perceived Noise Level (EPNL). However, to get a more holistic picture, it is desirable to also assess the situation based on human sound perception or noise effects [53], especially during the design phase of future aircraft technologies. Here, auralization (see, e.g., [68]) comes into play. By analogy with visualization, auralization allows creating virtual realities in which sound situations can be listened to that do not necessarily exist in reality. Through listening experiments, this allows a perception-based evaluation of design variants in addition to conventional noise metrics. A recent study has demonstrated the feasibility of this approach ("perception-based evaluation") for future low-noise aircraft technologies [50]. This approach was further developed and applied in the current work, with a focus on noise annoyance and familiarity.
Noise annoyance is one of the most important negative health-related effects of environmental noise (e.g., [69]) and thus of particular interest. In this paper, we hypothesize that a reduction in short-term annoyance could lead to a reduction in long-term annoyance and, consequently, reduce the negative health impact of the noise. Familiarity is investigated because participants might perceive certain (future) aircraft sounds as quite unfamiliar or even alien compared to their experience with environmental sounds in their everyday lives. This might affect annoyance (e.g., [9]).
The objective of this paper is to perform a perception-based assessment (noise annoyance and familiarity) of a novel Blended Wing Body (BWB) aircraft concept. The work was conducted within the European research project ARTEM (Aircraft noise Reduction Technologies and related Environmental iMpact), which ran from 2017 to 2022 with 26 partners. In ARTEM, various novel aircraft concepts, such as distributed electrical propulsion, integrated engines and a BWB configuration, and innovative aircraft noise reduction technologies for the horizons 2035 and 2050 were developed and studied [35]. For this study, the most mature and promising concepts and technologies for the horizon 2050 were selected and combined into a single future vehicle, whose potential was studied through a perception-based noise assessment. In contrast to today's common tube-and-wing concept, the extended fuselage of the BWB provides acoustic shielding of the engines mounted on top of it [11,13].

Study overview
This paper presents the identified noise annoyance reduction potentials for a novel BWB aircraft concept with novel noise reduction technologies including advanced engine fan acoustic lining and optimized high-lift systems. To that end, the BWB aircraft concept described in Section 2 is evaluated based on simulations described in Sections 3 and 4. This methodological approach is validated as described in Section 5. Using auralization, virtual flyovers of different aircraft variants are assessed with psychoacoustic listening experiments in the laboratory (Section 6). The BWB variants are evaluated in comparison to current (today's) reference aircraft with respect to short-term noise annoyance, to assess such future technologies not only based on computed "classical" noise indicators such as the L_AE, but also with respect to human sound perception. The evaluation relies exclusively on synthesized sounds without any sound recordings. In the experiment, different future variants of an advanced turbofan-driven BWB aircraft concept for long-range operations are compared to current reference aircraft (denoted as "Ref" in the following).
The BWB design is equipped with possible low-noise technologies (LNTs) (see Section 2.5) and two engine variants (see Section 2.4). The Ref is a present-day long-range tube-and-wing jet aircraft with similar top-level aircraft requirements as the BWB (see Section 2.1). For the psychoacoustic evaluation, the effect dimensions "noise annoyance" and "familiarity" are investigated, namely the annoyance caused by and the familiarity with the sound of a BWB variant compared to the Ref. Besides the main goal of comparing BWB and Ref flyovers, additional influencing parameters, namely flight procedure, observer location, engine type, additional LNTs and simulation type (Sim. 1 and Sim. 2), are investigated as well.
The remainder of the paper is structured as follows: Section 2 introduces the design of the future BWB aircraft, i.e. the main study object. Section 3 explains the prediction of noise sources of this new aircraft concept as well as of existing reference vehicles. Based on these predictions, virtual flyover sounds are synthesized as described in Section 4. The simulation chains used are validated in Section 5 by comparison with current tube-and-wing aircraft. The main results are presented in Section 6, where the new BWB aircraft is assessed during flyover acoustically and psychoacoustically. Section 7 discusses the limitations of this study and the paper concludes with Section 8. This paper extends the work presented by the same authors in two conference papers [40,61].

Aircraft design
Within ARTEM, a novel low-noise aircraft was designed, i.e., a BWB concept with two engine options and optimized low-noise technologies, in other publications sometimes referred to as BOLT (Blended wing body with Optimized Low-noise Technologies).This vehicle concept is described in the following account.

Top-level aircraft requirements
The BWB configuration meets the Top-Level Aircraft Requirements (TLAR) of a modern long-range aircraft, with an expected payload of 400 passengers in a two-class cabin layout. The design mission consists of a 5500 nmi range (with 30 min of loiter) at a flight speed of 510 knots at 43,000 ft, and the aircraft is equipped with two turbofan engines with bypass ratios (BPR) ≥ 8. The reference aircraft (denoted as "Ref") is a currently flying long-range tube-and-wing jet aircraft of similar range and mission as the BWB, with a transportation capacity of 300 to 400 passengers.

Framework for multidisciplinary conceptual aircraft design
During the aircraft design phase, analyses and optimizations have been carried out using the multidisciplinary conceptual robust design optimisation framework FRIDA (FRamework for Innovative Design in Aeronautics) of the Aerospace Structures and Design group of Roma Tre University [12,28-30]. Since the FRIDA framework was developed for the conceptual design of both conventional and unconventional aircraft concepts, most of the implemented models are based on fundamental physics, with specific assumptions to reduce the order of complexity. The accuracy of the models is sufficient to capture the main aircraft dynamics with a computational effort compatible with optimal design processes.
The FRIDA simulation modules are briefly summarized as follows: (a) Geometric processor, including a multi-patch parametric kernel with structured-grid generator; (b) Sketch definition for the initial gross-weight assessment and inner layout definition; (c) Structural module for the modal analysis using booms and skins idealization of the wingbox to build an equivalent beam model; (d) Weight breakdown based on iterative schemes, enhanced to account for the BWB layout; (e) Aerodynamic analysis based on a quasi-potential flow with a boundary-layer integral model for the viscosity effects, coupled with a transonic correction for the estimation of wave drag effects; (f) Performance analysis of take-off, cruise and landing phases using flight-mechanics equations; (g) Flight simulation by means of inverse flight mechanics to derive the time history of all the relevant variables; (h) Aeroelastic analysis with a finite state reduction, using a matrix-fraction-based Reduced Order Model (ROM). Further modules cover noise assessment and finances.

Blended wing body layout
An initial aircraft layout, in terms of geometrical properties, was derived from previous experience and then refined on the basis of the TLAR and the flight mechanics constraints by solving a multi-objective optimization problem. The objectives considered in the search for the best solution included the maximum take-off weight (MTOW) and the aerodynamic efficiency (lift-to-drag ratio L/D). The original constrained problem was replaced by an unconstrained problem by defining a pseudo-objective function that includes the constraints by means of the quadratic penalty function method. The minimization is performed by a gradient-free method, Particle Swarm Optimisation (PSO), introduced by Kennedy and Eberhart [33], in the deterministic version implemented by Pellegrini et al. [48]. The deterministic approach makes a statistical analysis of the solution over repeated runs unnecessary.
As the number of iterations necessary for Pareto front convergence depends on the number of design variables, the domain dimensions were reduced. This was done by a multi-level optimization strategy. The first optimization level consists of a performance-based optimization with fixed airfoils for the characteristic sections: the design space includes the geometric variables. The second optimization level, for the optimal design of the center-body airfoil, makes use of the resulting cruise load distribution and minimizes the drag coefficient while maximizing the lift curve slope, with fixed lift coefficient at fixed angle of attack. This approach further improves the overall aerodynamic efficiency and decreases the angle of attack at take-off and landing. At this level, the maximum thickness of the center-body is kept fixed, since the aircraft interior has been fixed in the first optimization level. Fig. 1 shows the geometry of the optimized BWB layout.
Table 1 shows the improvements due to the optimization.
The "clean" aircraft configuration in terms of wetted surface was complemented by FRIDA with a preliminary landing gear and high-lift device configuration (cf. Fig. 2). The landing gear arrangements include estimates of the bay volumes, the number of wheels, the tyre diameters and the structural lengths. A low-speed aerodynamic assessment, based on the simulation of simplified path segments for both take-off and landing procedures, yielded the high-lift device configuration, together with the control surfaces arrangement.
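The combination of a quadratic penalty function with a deterministic particle swarm can be illustrated with a minimal sketch. It is not the FRIDA implementation: the function names, the coefficient values (`mu`, `w`, `c1`, `c2`) and the start-position scheme are assumptions chosen for illustration, and constraints are assumed to be written as g_i(x) <= 0.

```python
import math

def pseudo_objective(f, constraints, x, mu=1.0e3):
    """Quadratic penalty: f(x) + mu * sum(max(0, g_i(x))^2) for constraints g_i(x) <= 0."""
    penalty = sum(max(0.0, g(x)) ** 2 for g in constraints)
    return f(x) + mu * penalty

def deterministic_pso(func, lower, upper, n_particles=16, n_iter=200,
                      w=0.7, c1=1.0, c2=1.0):
    """Gradient-free particle swarm with fixed (non-random) coefficients."""
    dim = len(lower)
    assert dim <= 6, "this sketch only provides six start-position seeds"
    # Deterministic, spread-out start positions (assumed scheme, not from the source).
    seeds = [math.sqrt(p) for p in (2, 3, 5, 7, 11, 13)][:dim]
    pos = [[lower[d] + ((i * seeds[d]) % 1.0) * (upper[d] - lower[d])
            for d in range(dim)] for i in range(1, n_particles + 1)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [func(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                # Inertia plus attraction towards personal and global best.
                vel[i][d] = (w * vel[i][d] + c1 * (pbest[i][d] - pos[i][d])
                             + c2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lower[d]), upper[d])
            val = func(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

A constrained problem would be passed as `deterministic_pso(lambda x: pseudo_objective(f, constraints, x), lower, upper)`. Because no random numbers are involved, repeated runs return identical results, which is why no statistics over runs are needed.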

Aircraft engines
The thrust required from the aircraft propulsion system consists of three main contributions: the total aerodynamic drag force D, the projection of the aircraft weight force W parallel to the flight path, and the inertial term linked to the total acceleration. The drag force varies with the high-lift device settings and the landing gear deployment. The engines were sized for the top-of-climb phase (highest aerodynamic load), so that they also deliver enough thrust during the acceleration of the initial climb.
The most balanced compromise between performance and weight was provided by a pair of new-generation geared turbofan engines, with a thrust of 70 kN each at top of climb (43,000 ft, flight Mach number 0.84). They guarantee acceptable take-off performance without affecting the longitudinal stability of the aircraft. Two engine variants (Eng. 1 and Eng. 2) were predesigned with the software GasTurb [21], following the methodology outlined in Kurzke and Halliwell [37]. The two variants differ with respect to their design bypass ratio (BPR) of 8 and 12, and their maximum static thrust of kN and 560 kN, for Eng. 1 and Eng. 2, respectively. They share the same key cycle parameters at their design point (top of climb), namely an overall pressure ratio of 60 and a burner exit temperature of 1700 K. Design, sizing and performance analysis are described in more detail in LeGriffon et al. [40]. Fig. 3 shows the two engine cross sections. Eng. 2 is substantially larger and heavier than Eng. 1, but also about 5 % more fuel efficient at its design point, with a specific fuel consumption of 13.95 vs. 14.71 g/(kN s). Fig. 4 depicts a sketch of the chosen engine locations above the BWB for acoustical shielding of engine noise towards the ground. The overwing integration further solves the problem of limited space underneath the wing when using large high-bypass turbofan engines [66].
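The three contributions can be written as a single balance along the flight path. The expression below is a sketch under stated assumptions (point-mass model, thrust aligned with the flight path). The symbols T (total required thrust), γ (flight path angle), g (gravitational acceleration) and V (flight speed) are introduced here for illustration and do not appear in the source.

```latex
T = D + W \sin\gamma + \frac{W}{g}\,\frac{\mathrm{d}V}{\mathrm{d}t}
```

In steady level flight the last two terms vanish and the required thrust equals the drag. During climb and acceleration both terms are positive and add to it.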

Low-noise technologies
Different novel low-noise technologies (LNTs) were developed within ARTEM and applied to the BWB in addition to conventional noise reduction measures. The LNTs include novel broadband fan inlet and outlet lining consisting of a slanted septum core with variable perforation [47], a new trailing edge technology consisting of optimised Krueger flaps [18,63], jet porous material and a modification of the main landing gear. The latter consists of aerodynamic modifications of the leg door fairing to slow down the impinging airflow. Different design variants were investigated in wind tunnel experiments at the University of Southampton and at DLR Braunschweig.
The LNTs were included in the simulations without penalties to the flight and engine performance or added weight of the aircraft.

Flight trajectories
Flight trajectories and corresponding dynamic engine data for the BWB aircraft have been calculated using the flight simulation environment within FRIDA. The technique is based on solving the inverse flight mechanics problem, which yields the relevant flight mechanics variables (characteristic angles of attack, aircraft orientation, high-lift device settings, landing gear deployment and engine operating point). Take-off and landing maneuvers are tailored to the current BWB configuration. The starting point is a present ICAO flight procedure (as used for the reference aircraft) for an aircraft of comparable mass, in terms of the number of trajectory segments and the maximum altitude (last node for the departure and first node for the approach). This baseline operation is modified in accordance with the performance of the BWB under consideration, in order to derive a BWB-specific flight trajectory. The flight data are complemented with information on the high-lift device (HLD) settings and landing gear (LG) deployment during take-off and landing operations: flap deflection has been imposed as a function of the aircraft speed, whereas LG settings have been calculated as a function of the aircraft altitude. The strong interplay between the trajectory simulation and the aeroacoustic assessment is revealed by the engine operating point calculation, for which a simplified model of the engines from Section 2.4 is used. The model provides the throttle percentage, given the relevant engine characteristics (e.g. engine pitch, bypass ratio, maximum thrust per engine at sea level) and the representative flight mechanics variables (altitude, drag force, actual aircraft weight and acceleration of the aircraft, HLD and LG settings). The rotational speeds N1 and N2 of the low- and high-pressure spools, respectively, are evaluated from the overspeed and idle conditions in terms of revolutions per minute. The jet velocities are calculated through the momentum equation and their temperatures are estimated with the energy balance.
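The kind of bookkeeping described above can be sketched in a few lines. This is a simplified illustration, not the FRIDA engine model: the thrust balance assumes a point mass with thrust along the flight path, the available thrust is assumed not to vary with altitude or Mach number, and the altitude threshold and flap schedule are hypothetical values.

```python
import math

G0 = 9.80665  # standard gravity, m/s^2

def required_thrust(drag_n, weight_n, gamma_rad, accel_ms2):
    """Thrust balance along the flight path (point mass), in newtons."""
    return drag_n + weight_n * math.sin(gamma_rad) + (weight_n / G0) * accel_ms2

def throttle_percent(thrust_req_n, max_thrust_per_engine_n, n_engines=2):
    """Throttle as a share of available thrust, clipped to 0-100 %.
    Assumption: no altitude or Mach lapse of the available thrust."""
    share = thrust_req_n / (n_engines * max_thrust_per_engine_n)
    return 100.0 * min(max(share, 0.0), 1.0)

def gear_deployed(altitude_m, threshold_m=600.0):
    """Landing gear state as a function of altitude (threshold is hypothetical)."""
    return altitude_m < threshold_m

def flap_deflection_deg(speed_ms, schedule=((80.0, 30.0), (100.0, 15.0))):
    """Flap deflection as a function of speed (schedule values are hypothetical)."""
    for max_speed, deflection in schedule:
        if speed_ms < max_speed:
            return deflection
    return 0.0
```

Evaluating these functions at every trajectory node yields time histories of throttle, flap and gear settings, which are the kind of inputs the noise source models need.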

Modelling concept
The noise sources are predicted using two similar scientific system noise prediction tools. The simulations of the new aircraft design are done twice, i.e. using both tools separately, to reduce the uncertainty of the source modelling. The two simulation tools used are CARMEN (described in Section 3.2) and PANAM (described in Section 3.3). The simulations using PANAM are henceforth referred to as "Sim. 1" and the simulations using CARMEN as "Sim. 2".
At each time step, the main noise sources on the aircraft are calculated by applying parametric semi-empirical noise source models while accounting for effects of varying operational and geometrical input parameters.In the present study, semi-empirical noise models derived from the literature for jet [64], fan [23], landing gear and flap sources [16,20,58] were included.The implemented semi-empirical models provide directivity and spectra in the far field.A thorough presentation of the most common noise source models used in CARMEN and PANAM is given in Bertsch et al. [6].The source prediction is followed by a module adding shielding effects to the spectral source directivities.The assumed source locations for the fan inlets and outlets, the jets, and the landing gears are shown in LeGriffon et al. [40].Insertion losses are predicted for point sources, e.g., to assess the shielding of fan inlet and exhaust noise, but also to assess reflection of landing gear noise by the aircraft body.An example of predicted insertion losses is shown in Fig. 5.
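How a predicted insertion loss enters the source description can be sketched as follows. This is a minimal illustration that assumes band levels in dB for one emission direction, with positive insertion loss meaning attenuation by shielding and negative values meaning amplification by reflection. The function names are hypothetical and are not taken from CARMEN or PANAM.

```python
import math

def apply_insertion_loss(source_levels_db, insertion_loss_db):
    """Subtract the band-wise insertion loss (dB) from the free-field source levels (dB)."""
    return [level - il for level, il in zip(source_levels_db, insertion_loss_db)]

def sum_incoherent(levels_per_source_db):
    """Energetic (incoherent) sum over several sources, band by band."""
    n_bands = len(levels_per_source_db[0])
    return [10.0 * math.log10(sum(10.0 ** (src[b] / 10.0)
                                  for src in levels_per_source_db))
            for b in range(n_bands)]
```

For example, a fan inlet level of 100 dB in a band with 10 dB insertion loss contributes 90 dB towards the ground, whereas a landing gear level of 90 dB with an insertion loss of -3 dB (reflection) contributes 93 dB.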
The setup of the two system noise simulation tools is in general very similar, with remaining differences in the individual code implementation [4]. The main methodological difference between the two prediction tools lies in the methods used to model shielding effects. In the following two sections, a more detailed description of the two tools is given, with a focus on the shielding tools.

CARMEN simulation tool
CARMEN [44] is a software tool that has served for parametric evaluation of new aircraft concepts [39] for the last few years. Experimental data were used to validate the noise source models (e.g. [41]) and the computational chain (e.g. [38]).
The method used for calculation of reflection and diffraction by the aircraft geometry depends on the frequency range under consideration (hybrid approach). For low and medium frequencies, a boundary element method (BEM) is used, through the solver BEMUSE [59,60]. This method gives an exact solution of the Helmholtz equation in the frequency domain by a surface integral method. Insertion loss is provided for each source location and 1/3 octave band on a lower hemisphere surrounding the aircraft in the acoustic far field. The higher the computed frequency, the stronger the local changes in insertion loss due to interference. To obtain smooth distributions, the insertion loss is spatially mapped onto a coarsely discretized lower hemisphere. Given that the BEM computational cost increases with the square of frequency, the highest affordable frequency is 1 kHz for the BWB. A simplified integral method is applied for higher frequencies. This method relies on the Kirchhoff approximation [10] to solve scattering problems by means of the Kirchhoff-Helmholtz equation defined on the surface of the scattering object. The Kirchhoff approximation was originally used to study the scattering of an optical wave by an aperture. Here, this approximation considers that the surface of the scattering object is locally planar and that the pressure is equal to zero on the shadowed part of the object [51].
To account for diffracted waves in the shadow of the object, such as those from the engines located on top of the aircraft (see Fig. 10), the Maggi-Rubinowicz formulation of the Kirchhoff diffraction theory is used, similarly to [43]. This formulation was developed for wave scattering from an aperture and later generalized [46]. Only first reflections are considered. Compared to the BEM solver, the high-frequency method is 3-4 orders of magnitude faster in computation.
The resulting spectral insertion loss hemispheres are fed into CARMEN, where they are added to the source directivities.

PANAM simulation tool
PANAM [3] is a software tool that has previously been used to design and study novel low-noise aircraft concepts, e.g. [36,50,67]. It was validated by comparison to flyover field measurements [5].
For the reflection and diffraction of the different noise sources by the aircraft geometry, the tool SHADOW is applied [43]. SHADOW was developed to investigate different engine installation locations during the design phase with respect to noise shielding at low computational cost. In contrast to CARMEN's approach, this method is applied to the entire frequency range of interest. SHADOW is based on a high-frequency approximation of the linearized Euler equations. The pressure field is calculated by solving ordinary differential equations along rays in space, which originate from one point representing the source. The aircraft geometry is approximated by a triangulated surface, and a local 2nd-order polynomial geometry approximation is applied to take into account the surface curvature. The pressure amplitude along each ray is calculated based on energy conservation. For the diffraction part, a simple approach based on the Maggi-Rubinowicz formulation of the Kirchhoff diffraction theory is used [46]. The diffracted field is calculated by solving a line integral along the shadow boundary on the surface of the diffracting body. In contrast to the geometrical acoustic field, the diffraction correction is frequency-dependent.
As in CARMEN, insertion losses are predicted in 1/3 octave bands for each noise source on hemispheres and superimposed on the predicted source emission within PANAM.

Consideration of low-noise technologies
LNTs were included in the simulation as source-specific attenuations that were applied to the sound powers from PANAM and CARMEN, independent of the flight configuration and constant in directivity. Except for the main landing gear noise, where a constant attenuation of 0.5 dB was used, the LNTs are modelled as frequency-dependent. The insertion loss values of the different LNTs were determined by computational fluid dynamics simulations and wind tunnel experiments (see Section 2.5). The applied values are listed in Table 2.

Overview
The term auralization was introduced by Kleiner et al. [34] and can be understood in analogy to visualization.This technique allows the inclusion of human perception in the assessment of different noise scenarios, thus supporting the decision-making in processes regarding acoustics with a non-trivial cause-effect chain.Auralization is used here to evaluate the benefits of a novel aircraft concept and associated low-noise technologies (LNT) in comparison to current aircraft.This is of particular interest since disruptive aircraft designs may contain new noise characteristics that humans are not used to.According to the concept of [68], propagation and reproduction are separately represented in the auralization process.The auralization chain used here is realized within Empa's tool AURAFONE.The aircraft flyover auralization is realized as described in [50], but in extended form.As an example, the consideration of atmospheric turbulence effects [49] is such an extension.Fig. 6 gives a schematic overview of AURAFONE.

Emission sound synthesis
The sound signals of flying aircraft are synthesized using parametric synthesis methods as in previous studies [1,56].The time domain approach according to the methodology review in [55] is used, where emission synthesis is done prior to propagation modelling.In a first step, the emission signal of each componential sound source is generated considering its frequency content and its directivity.This results in time histories of the emitted sound pressure for the instantaneous emission angle at a chosen reference distance from the source.In a second step, these emission signals are modified to account for the propagation effects between the moving source and the static observer near the ground (see Section 4.3).In a third step, these observer sound signals are spatially reproduced in the laboratory as explained in the following Section 4.4.
In the auralization model, the partial sound sources from Section 3 are assumed to be incoherent, concentrated point sources. All engines of an aircraft are assumed to be identical and to operate under the same conditions, e.g. at the same engine shaft speed. The sources are acoustically described by dynamic sound emission levels L_e as a function of frequency, radiation angle and time. The structure of the auralization model is illustrated in Fig. 6. The sounds of broadband, non-harmonic sources (i.e., airframe, jet, broadband fan noise) are generated using subtractive synthesis. Spectral input is provided in 1/3 octave band resolution from 31.5 Hz to 12.5 kHz (see Section 3). Tonal components from engine fan and buzz saw noise are generated using additive synthesis. Each harmonic is synthesized separately using its own time-varying frequency, amplitude and initial phase. The discrete-time emission signal p_e,ftn for fan tonal noise (ftn) was computed by Eq. (1), with the sample index k, the reference sound pressure p_0 = 20 μPa, the emission sound pressure level L_e in dB, the phase angle ϕ, the audio sampling rate f_s in Hz and the fan blade passing frequency f_BPF in Hz. A sampling rate of f_s = 48 kHz was used. Eq. (1) describes a series of numerically controlled oscillators. Fan tonal emission levels L_e were obtained from the predictions in Section 3. A total of five fan tones (see Eq. (1)) were synthesized.
In contrast to an earlier study [50], where only landings were considered, combination tone noise (or buzz saw noise) must be included in this study, which covers both take-offs and landings. The frequencies of the combination tones are harmonics of the shaft frequency. To estimate their signal powers, inter- and extrapolations were needed, as explained in [56], because the prediction models used provide 1/3 octave band spectra as output. For each time step of typically 1 s at which a buzz saw 1/3 octave spectrum is predicted, the amplitudes of all harmonics are chosen in such a way that this narrowband representation matches the predicted 1/3 octave band spectrum. A total of 200 buzz saw harmonics (adaptation of Eq. (1) to j = 1-200 and replacing f_BPF by the rotational fan speed f_rot) were synthesized.
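Eq. (1) is not reproduced in this text, so the following sketch assumes one plausible form of a bank of numerically controlled oscillators: p(k) = Σ_j √2 · p_0 · 10^(L_e,j(k)/20) · sin(j · Φ(k) + ϕ_j), with the accumulated phase Φ(k) = (2π/f_s) · Σ_{i≤k} f_BPF(i). The interpretation of L_e as an rms level (hence the factor √2) and the function name are assumptions. The sketch is not the AURAFONE implementation.

```python
import math

P0 = 20e-6  # reference sound pressure, Pa

def synthesize_fan_tones(levels_db, f_bpf_hz, phases_rad, fs=48000):
    """Sum of numerically controlled oscillators, one per BPF harmonic.

    levels_db[j][k]: emission level of harmonic j+1 at sample k (dB re 20 uPa, rms)
    f_bpf_hz[k]:     blade passing frequency at sample k (Hz)
    phases_rad[j]:   initial phase of harmonic j+1 (rad)
    """
    n = len(f_bpf_hz)
    signal = [0.0] * n
    phase_acc = 0.0  # running phase of the fundamental, rad
    for k in range(n):
        # Accumulating the phase keeps the signal continuous when f_BPF varies.
        phase_acc += 2.0 * math.pi * f_bpf_hz[k] / fs
        for j, phi in enumerate(phases_rad, start=1):
            amp = math.sqrt(2.0) * P0 * 10.0 ** (levels_db[j - 1][k] / 20.0)
            signal[k] += amp * math.sin(j * phase_acc + phi)
    return signal
```

With a constant level of 94 dB and a constant frequency, the synthesized tone has an rms value of about 1 Pa. Time-varying levels and frequencies are handled sample by sample without phase jumps.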
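The matching of harmonic amplitudes to a predicted 1/3 octave band spectrum can be sketched with a deliberately simple rule: the power of each band is split equally among the shaft harmonics that fall into it. This equal split, the base-2 band edges and the function names are assumptions. The inter- and extrapolation described in [56], e.g. for bands that contain no harmonic, is not reproduced here.

```python
import math

def third_octave_edges(fc_hz):
    """Lower and upper edge of the base-2 one-third octave band centred at fc_hz."""
    return fc_hz * 2.0 ** (-1.0 / 6.0), fc_hz * 2.0 ** (1.0 / 6.0)

def harmonic_levels_from_bands(band_centres_hz, band_levels_db, f_rot_hz,
                               n_harmonics=200):
    """Split each band's power equally among the shaft harmonics inside it.
    Returns {harmonic index: level in dB}; bands without harmonics are skipped."""
    levels = {}
    for fc, band_level in zip(band_centres_hz, band_levels_db):
        lo, hi = third_octave_edges(fc)
        inside = [j for j in range(1, n_harmonics + 1) if lo <= j * f_rot_hz < hi]
        for j in inside:
            levels[j] = band_level - 10.0 * math.log10(len(inside))
    return levels
```

For a shaft frequency of 50 Hz, the 1 kHz band contains the harmonics 18 to 22. Their energetic sum reproduces the band level by construction.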
Assuming the observer to be located sufficiently far from the aircraft, all source signals per flyover were summed to result in a single representative source signal for the flyover.

Propagation filtering
In this step, the effects of source movement and sound propagation are simulated and applied to the source signal.Considered effects include Doppler frequency shift, geometrical spreading, air absorption, ground reflection, atmospheric turbulence-induced amplitude fluctuations and coherence loss.Similarly to current engineering propagation models, the most relevant propagation effects are independently described and modelled.Time-variant propagation filters transform the source sound pressure signal into an observer sound pressure signal.The filter network is depicted in Fig. 6.A description of the individual propagation filters is given in Pieren et al. [50].For this study, the air absorption simulation, ground impedance description and the consideration of coherence loss were improved [49] and are applied here for the first time.
For the air absorption simulation, a homogeneous atmosphere was assumed with temperature and relative humidity set to local conditions at 2 m height, i.e. 15 °C and 70 %, respectively. For the ground reflection simulation, grassland with an airflow resistivity of 250 kPa s/m^2 was assumed. Ground impedance was modelled using the Miki model [45]. Turbulence parameters were estimated from a weather forecast database (validation, Section 5), or set to defined reference conditions (evaluation, Section 6).
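Two of the propagation effects listed in this section, Doppler frequency shift and geometrical spreading, follow from textbook relations and can be sketched as below. The sketch assumes a point source in uniform motion through a medium at rest and spherical spreading. The function names are hypothetical, and the time-variant filter implementation of [50] is not reproduced.

```python
import math

def doppler_factor(mach, theta_rad):
    """Ratio of received to emitted frequency for a source moving at Mach number
    `mach`; theta is the angle between the flight direction and the
    source-to-observer line at emission time."""
    return 1.0 / (1.0 - mach * math.cos(theta_rad))

def spreading_loss_db(distance_m, ref_distance_m=1.0):
    """Level decrease by spherical spreading relative to the reference distance."""
    return 20.0 * math.log10(distance_m / ref_distance_m)
```

An approaching source (theta near 0) is heard at a higher frequency and a receding source (theta near π) at a lower one. Each doubling of distance lowers the level by about 6 dB.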

Spatial sound reproduction in the laboratory
To create a spatial sound impression of a flyover, the synthesized sound pressure signal from Section 4.3 is rendered audible via a loudspeaker array under laboratory conditions. Empa's listening test facility AuraLab in Switzerland was used (see Fig. 7). The listening room has a reverberation time of T_mid = 0.11 s in the mid frequency range and a low background noise level below 7 dB(A). The room fulfils the highest standards and recommendations for high-quality listening rooms.
In this study, the loudspeaker array consists of 15 two-way satellite speakers (type Neumann KH 120A) arranged on an upper hemisphere around the listener. Having speakers with elevation angles up to 60° allows the perception of the virtual source overhead. A frequency-dependent amplitude panning technique is used to dynamically compute the speaker feeds based on the time-variant directional information (i.e. angle of sound incidence at the observer location) [65]. The low frequencies are generated by four distributed subwoofers (type Neumann KH 805) with a crossover frequency of 100 Hz. This setup covers the frequency range from 20 Hz to 10 kHz with a reasonably flat frequency response. The speakers are connected to two 16-channel digital audio controllers (type Xilica Neutrino A0816) to which multi-channel audio signals are sent from a computer via Ethernet (Dante protocol). The installed sound reproduction system is calibrated by adjusting the playback volume with a Class 1 sound level meter located at the listening spot.

Validation concept
Prior to auralizing the future aircraft concepts, a systematic and rigorous validation of the developed simulation-auralization chains (according to Sections 3 and 4) was conducted. The validation is based on inter-comparisons of models and tools from different independent institutes and on comparisons with measurements of existing aircraft. Previously, the noise predictions used (Section 3) were compared in LeGriffon et al. [40], and the developed auralization model (Section 4) was compared to other model implementations in Rizzi et al. [57]. Here, field recordings of aircraft flyovers were used as a reference for a numerical and perception-based validation.
The developed hierarchical validation comprises four levels (see Fig. 8).Level I compares classical acoustical indices such as sound levels and psychoacoustic parameters.Level II qualitatively compares the reproduction of time-frequency features.The next two levels describe a psychoacoustic validation, with Level III testing the subjectively perceived plausibility and Level IV the reproduction of short-term noise annoyance.Levels III and IV are done through dedicated listening experiments (see Sections 5.5 and 5.6) for which the stimuli were spatialized.
The experiments were approved by the Ethics Commission of Empa (Approval CMI 2021-299 of 17 September 2021).The flyover stimuli were cut to 20 s and faded in and out during 1.5 s.Some of the stimuli were very loud with a maximum sound level above 90 dB (see Fig. 9).To reduce the risk of a possible hearing damage of the participants, a constant level reduction by 5 dB was applied during playback.The stimuli in the validation study were thus played back at a somewhat lower sound pressure levels than what would be physically correct.A total of three experiments were conducted (see Sections 5.5 and 5.6) in single, individual sessions as focused experiments.The order of the three experiments was counterbalanced between participants.Further, the order of the comparisons and rankings as well as the stimuli within comparisons/rankings were randomized, and the two sets of stimuli for the annoyance ratings were counterbalanced (details see below).A total of 31 persons (12 females, 19 males), aged 20-61 years (median of years), participated in the validation experiments.Initial analyses of these experiments have been published in Schäffer et al. [61].
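The playback preparation described above (a constant 5 dB level reduction and 1.5 s fades) can be sketched as follows. This is a minimal illustration, not the processing actually used in the study: the function name, the linear ramp shape and the mono list-of-samples representation are assumptions made here.

```python
def apply_gain_and_fades(samples, sample_rate, gain_db=-5.0, fade_s=1.5):
    """Scale a mono signal by gain_db and apply fade-in and fade-out ramps.

    Assumption: linear ramps; the fade shape used in the study is not stated.
    """
    gain = 10.0 ** (gain_db / 20.0)  # level change in dB -> amplitude factor
    n = len(samples)
    n_fade = min(int(round(fade_s * sample_rate)), n // 2)
    out = []
    for i, x in enumerate(samples):
        if i < n_fade:
            ramp = i / n_fade              # fade-in: 0 -> 1
        elif i >= n - n_fade:
            ramp = (n - 1 - i) / n_fade    # fade-out: 1 -> 0
        else:
            ramp = 1.0
        out.append(x * gain * ramp)
    return out
```

A level reduction of 5 dB corresponds to an amplitude factor of 10^(-5/20), i.e. roughly 0.56.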

Validation cases
The four validation cases are listed in Table 3. The case data was taken from a prior field measurement campaign. The field measurements were conducted around Zurich airport in the years 2013 and [71,72]. A departure and an approach of the two current types of jet aircraft Airbus A320 (narrow-body vehicle) and A340 (wide-body vehicle), both commonly used aircraft in Europe, were selected. For these flights, flight deck recording (FDR) data were provided by the airline (SWISS), yielding the necessary input data such as flight trajectory, low compressor speed N1 and flap setting. Meteorological data were taken from a nearby monitoring station. Flights with calm atmospheric conditions, i.e. no precipitation, low wind and low turbulence, were chosen. The measurement microphones were located close to the noise certification points for take-off and landing, respectively [27]. The microphone heights were 4 m and 10 m above ground, respectively, and thus larger than the standard 1.2 m used for noise certification (see Table 3).

Level I: acoustical indices
Using both tools for source prediction described in Section 3 for each validation case, a total of eight syntheses were generated. The simulations based on the two different source predictions are henceforth referred to as 'Sim. 1' and 'Sim. 2'. All syntheses were enriched with ambient sounds (mainly containing birdsong) recorded at the measurement location to match the sound environments of the measurement locations. The A-weighted sound exposure level L_AE of each flyover is given in Table 4; the four validation cases lie within 10 dB. The flyover event of the approaching A340 (Case D) has the highest sound pressure level, with the resulting L_AE being some 7.1-9.7 dB higher than that of the departing A320 with the lowest L_AE (Case A). Comparisons of various acoustical and psychoacoustic single-number parameters are given in Table 5, which lists deviations of the syntheses from the measurements in terms of the L_AE, the maximum A-weighted FAST time-weighted sound pressure level (L_AF,max), the C-weighted sound exposure level (L_CE), the maximum C-weighted FAST time-weighted sound pressure level (L_CF,max), the 5 %-percentile of Zwicker loudness [31], which is the value of the metric that is exceeded 5 % of the stimulus' duration (N_5 in sones), and the Effective Perceived Noise Level (EPNL). The EPNL is used for noise certification purposes of aircraft [27].

Fig. 9. Comparison of measured (black) and simulated (red and blue for simulations 1 and 2) sound pressure level time histories for the four flyover stimuli from Table 3. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Table 3
Flyover cases of current jet aircraft for the validation of the noise prediction and auralization. The aircraft conditions are given for the time of the shortest distance to the receiver, i.e. the flyover instant.

Table 4
Sound exposure levels L_AE in dB of four validation cases A-D (described in Table 3) for the measurement (Meas.) and two simulations (Sim.).
On average, the difference in L_AE between the auralizations and the measurements amounts to −0.9 dB with a standard deviation of 1.5 dB. Further, the L_AF,max of the simulations deviates by −0.2 dB from the measurements and the 5 %-percentile loudness by −10 %. The differences of these computed single-value sound level indicators between all auralizations and the actual field recordings are considered small. This, together with the small relative difference in the psychoacoustic parameter loudness, is a first indication that the auralizations correctly reproduce the loudness sensation. The excellent agreement found is surprising since the auralizations were produced fully blind to the acoustical field measurements and computed with independent calculation models without any tuning to the measurements. However, as indicated by the following level-time histories and spectrograms, audible spectral and temporal differences may still exist.
The results of the single-value indicator comparison are promising and demonstrate the quality of both source prediction tools for the different vehicles, different procedures, and different configurational settings. Yet, it has to be kept in mind that predicting the acoustical shielding of the BWB will increase the uncertainties and result in larger deviations between the simulations.
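Two of the single-number indicators compared in Level I can be illustrated with a short sketch. It assumes a uniformly sampled level-time history and uses the standard definition of the sound exposure level (energy integral referenced to 1 s); the function names and the simple rank-based percentile are choices made here and are not taken from the tools used in the study.

```python
import math

def sound_exposure_level(levels_db, dt, t0=1.0):
    """Sound exposure level from sampled A-weighted levels.

    levels_db: level-time history in dB, sampled every dt seconds.
    t0: reference duration (1 s).
    """
    energy = sum(10.0 ** (level / 10.0) for level in levels_db) * dt
    return 10.0 * math.log10(energy / t0)

def exceedance_level(values, fraction=0.05):
    """Value exceeded during `fraction` of the samples (e.g. N_5 for loudness).

    Assumption: a plain rank-based estimate without interpolation.
    """
    ordered = sorted(values, reverse=True)
    index = min(int(len(ordered) * fraction), len(ordered) - 1)
    return ordered[index]
```

For a constant level of 70 dB over 10 s, the first function yields 80 dB, since the exposure grows by 10·log10 of the duration in seconds.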

Level II: time-frequency features
A-weighted level-time histories of all four cases are displayed in Fig. 9. The general time courses are well reproduced by both simulations, resulting in very good agreement between measurements and syntheses. Statistical variations occur in the measurements as well as in the simulations. They are due to random influences of atmospheric turbulence, in particular broadband amplitude modulations. The total spectral content in 1/3 octave band resolution is also mostly well reproduced (not shown).
Table 5
Single-number acoustic and psychoacoustic parameter differences between measurements (Meas.) and syntheses (Sim. 1 and Sim. 2) of today's aircraft for four validation cases A-D (described in Table 3).

R. Pieren et al.

The reproduction of combined spectro-temporal patterns is demonstrated in Fig. 10. In the stimuli, the flyover occurs at time 10 s. The time-varying ground effect due to the interference of direct sound and the reflection from the ground is clearly visible as V-shaped patterns below 500 Hz in the measurement and the simulation. The spectrograms contain pronounced spectral components, indicating the turbofan engines' fan tones (fan harmonics) between 2 and 5 kHz. Over the whole flyover, these tones are pitched down by about one octave due to the Doppler effect. During take-off when the aircraft is flying towards the observer, buzz saw noise components are observable in the spectrograms as a series of horizontal lines, mainly between 200 Hz and 2 kHz (not shown). Some sound events in the highest frequency range are the result of adding ambient recordings containing birdsong and environmental background such as wind-induced vegetation sounds to the otherwise clean syntheses.
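The size of the Doppler pitch drop mentioned above can be estimated with the textbook relation for a moving source and a stationary observer, f_obs = f / (1 − M cos θ). The sketch below is only an order-of-magnitude illustration: the function names are hypothetical, wind and source directivity are ignored, and the Mach number in the usage note is an assumed value, not one from the validation cases.

```python
import math

def doppler_factor(mach, angle_deg):
    """Observed/emitted frequency ratio for a source flying at Mach `mach`.

    angle_deg: angle between the flight direction and the line from source
    to observer (0 = flying straight towards, 180 = straight away).
    """
    return 1.0 / (1.0 - mach * math.cos(math.radians(angle_deg)))

def pitch_drop_octaves(mach, angle_in_deg=0.0, angle_out_deg=180.0):
    """Pitch drop in octaves between two emission angles of a flyover."""
    ratio = doppler_factor(mach, angle_in_deg) / doppler_factor(mach, angle_out_deg)
    return math.log2(ratio)
```

In the limiting case of emission angles of 0° and 180°, an assumed Mach number of 1/3 gives a frequency ratio of (1 + M)/(1 − M) = 2, i.e. exactly one octave; over a finite stimulus window the drop is somewhat smaller.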

Level III: plausibility
In Level III, the syntheses and recordings were directly compared to study the identifiability of auralizations. First, the study participants did pairwise comparisons (Meas. vs. Sim.; two-alternative forced choice, 2-AFC) to identify the recording for each of the four cases listed in Table 3, giving eight comparisons in total. Second, the syntheses and recordings were ranked by the participants for groups of three stimuli (Meas. vs. Sim. 1 vs. Sim. 2; three-alternative forced choice, 3-AFC) regarding plausibility. Plausibility is interpreted as a subjective comparison to an inner reference and describes the perceived agreement of the listener's expectations with acoustical reality [42]. It was assessed for each of the four presented cases, resulting in a total of four rankings. The data was analyzed to assess whether syntheses and recordings are classified into mutually exclusive classes, and whether they are perceived as equally or differently plausible.
In the direct comparison task (2-AFC), if measurements and simulations were not discriminable from each other, both classes would have the same relative frequency of 50 %, indicating random selection by chance. Here, the recordings were correctly identified in ~70 % of the comparisons, i.e., the simulation was taken for the recording in ~30 % of the comparisons, meaning that slightly less than half of the cases were not discriminable. Nevertheless, the differences in the relative frequencies of the two classes are significant (χ²-tests, p < 0.001).
In the ranking task (3-AFC), the ideal distribution of the measurements and two simulations would have the same relative frequency of ~33 %. Here, the recordings were most often (~72 %) ranked the most plausible, meaning that in ~28 % of the cases, one of the two syntheses was rated the most plausible among the three stimuli. The differences in the relative frequencies of the three classes are overall significant (χ²-test, p < 0.001). These results are congruent with the results from the 2-AFC task.
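To make the 2-AFC significance statement concrete, the following sketch runs a χ² goodness-of-fit test against the 50/50 chance distribution. The counts are illustrative assumptions derived from the text (31 participants × 8 comparisons = 248, of which about 70 % correct, here 174); the exact counts and the exact test set-up of the study are not given, and this simple test treats all comparisons as independent.

```python
import math

def chi_square_two_classes(count_a, count_b):
    """Chi-square goodness-of-fit test of two counts against a 50/50 split.

    Returns (chi2, p_value). With one degree of freedom the survival
    function of the chi-square distribution is erfc(sqrt(chi2 / 2)).
    """
    total = count_a + count_b
    expected = total / 2.0
    chi2 = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts: 174 correct and 74 incorrect identifications.
chi2, p = chi_square_two_classes(174, 74)
```

With these assumed counts, the p-value falls well below 0.001, in line with the significance level reported above.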

Level IV: short-term noise annoyance
In Level IV, subjective annoyance ratings were collected to test whether auralizations and recordings yield similar (and non-significantly different) or different annoyance ratings (and thus potentially also other noise effects). To that aim, short-term noise annoyance was rated for all stimuli, using the ICBEN 11-point scale of ISO/TS 15666 [32] (direct scaling). To prevent an anticipated ceiling effect in the annoyance ratings, all stimuli were duplicated with an attenuation of 15 dB applied to the duplicates (in addition to the constant level reduction by 5 dB, cf. Section 5.1). In total, more than 700 annoyance ratings (24 stimuli × 31 participants) were collected.
The annoyance ratings cover a large range of the 11-point scale, which is usual for such tests. Part of the scatter is explained by the large variation in sound exposure of about 25 dB, and to a lesser extent by the noise sensitivity of the participants (see Section 6.6). Besides, it reflects individual differences in the ratings, which were also observed in other studies and may be accounted for in mixed-effects modelling analysis (cf. Section 6). Overall, the raw ratings are very similar for the recordings and the corresponding syntheses, as shown in the bubble chart in Fig. 11. For perfect agreement of the noise annoyance ratings between simulations and recordings, all bubbles in Fig. 11 would lie on the indicated 1:1 correspondence line. The Pearson correlation coefficient is 0.83 for all stimuli, and 0.75 for the original stimuli only. In agreement with these observations, mixed-effects modelling analysis revealed no statistically significant differences between the simulations and the recordings, neither for the original nor for the attenuated stimuli (for details see Schäffer et al. [61]).
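The Pearson correlation coefficient quoted above measures how closely the paired ratings follow a straight line. A minimal sketch of its computation is given below; the function name is hypothetical and the ratings in the usage comment are made-up numbers, not data from the experiment.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Made-up example: annoyance ratings of recordings vs. matching syntheses.
r = pearson_r([2, 4, 5, 7, 8], [3, 4, 6, 6, 9])
```

Ratings lying exactly on the 1:1 line would give a coefficient of 1.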

Study concept
The major design variables studied in this experiment were: two aircraft generations (future: BWB, today: Ref); two flight procedures (departure, approach); and two observer locations per procedure underneath the flight path, namely 9 and 12 km from the brake release point for departures, and 5 and 10 km from the touch down point for approaches. Further, for the BWB, two engine types with different bypass ratios (see Section 2.4), two LNT implementations (with [w/], without [w/o] novel LNT from Section 2.5) and two simulations (Sim. 1 and Sim. 2 from Section 3) were studied (Fig. 12, Table 6). To provide an audible impression of the stimuli, a video containing monophonic audio examples of the Ref and the BWB aircraft during take-off and landing is published at [24].
The experimental design resulted in a total of 36 stimuli, namely 4 stimuli for Ref (2 observer locations × 2 procedures) and 32 stimuli for BWB (2 observer locations × 2 procedures × 2 engines × 2 LNT × 2 simulations) (Fig. 12). For BWB, this corresponds to a full factorial design with respect to observer location, procedure, engine, LNT and simulation. Likewise, for BWB and Ref, a full factorial design with respect to observer location and procedure results. All other model parameters were kept constant, such as the observer height of 1.2 m above ground and the meteorological conditions. Further, the design allows for the comparison of four future aircraft variants (BWB: LNT × Engine) to the Ref. To keep the total number of stimuli sufficiently small, the Ref was simulated only once. This choice seems justified by the positive
Table 6 lists the four assessed observer situations including the aircraft height above ground and ground speed during the flyover. For the departure, the height above ground for the BWB is larger compared to the Ref because the BWB has a larger climb angle. In all four situations, the ground speed of the BWB is lower compared to the Ref, by 7-26 %.
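The stimulus count follows directly from enumerating the factor combinations. The sketch below reproduces it with a Cartesian product; the level labels (e.g. "near", "far", "Engine 1") are placeholders chosen here for illustration.

```python
from itertools import product

# Factor levels; labels are illustrative placeholders.
PROCEDURES = ("departure", "approach")
LOCATIONS = ("near", "far")          # two observer locations per procedure
ENGINES = ("Engine 1", "Engine 2")
LNT = ("w/o", "w/")
SIMULATIONS = ("Sim. 1", "Sim. 2")

def build_stimuli():
    """Enumerate the Ref stimuli and the full factorial BWB stimuli."""
    ref = [("Ref", proc, loc) for proc, loc in product(PROCEDURES, LOCATIONS)]
    bwb = [
        ("BWB", proc, loc, eng, lnt, sim)
        for proc, loc, eng, lnt, sim in product(
            PROCEDURES, LOCATIONS, ENGINES, LNT, SIMULATIONS
        )
    ]
    return ref, bwb

ref_stimuli, bwb_stimuli = build_stimuli()
total = len(ref_stimuli) + len(bwb_stimuli)  # 4 + 32
```

Each of the four procedure × location groups thus contains nine stimuli: one Ref and eight BWB variants.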

Experimental setup and procedure
The experiment was conducted in the auralization laboratory AuraLab at Empa using the same setup and experimental procedure as for the validation experiments (see above), and similarly to previous experiments (Schäffer et al. [62] or Pieren et al. [50]). The experiment was approved by the Ethics Commission of Empa (Approval CMI 2021-299 of 11 February 2022). The synthesized aircraft flyover sounds were presented in a within-subject design, i.e., all subjects were exposed to all stimuli. The participants did the experiment individually, one at a time, doing focused tests where they deliberately listened to and rated the stimuli.
The stimuli were spatially played back over an array of loudspeakers. No ambient sounds were added. Because of the asymmetry in sound level with respect to the flyover instant (see example in Fig. 13) and to limit the stimuli duration for practical reasons, all stimuli were centered around the time instant of the maximum sound pressure level (L_AF,max) at 25 s, resulting in a total duration of 50 s. As the resulting flyover stimuli are quite loud (L_AF,max of up to ~90 dB), a general level reduction by 5 dB was applied to all original stimuli to prevent possible hearing damage.
The experimental procedure consisted of the following steps. (1) A short introduction to the research topic ("perception of flight events of various types of commercial aircraft during start and landing in different regions around an airport"). (2) Filling out a consent form for study participation. (3) A questionnaire about self-reported hearing capability and well-being as inclusion/exclusion criteria for study participation. (4) The actual listening experiment with an orientation (example stimuli to set the frame of reference, i.e., "anchoring"), exercise ratings to get used to the task and the interface, and the main experiment. (5) A post-experimental questionnaire with questions on participants' characteristics (sex, age, noise sensitivity assessed with the NoiSeQ-R questionnaire [22], living situation (rural to urban, quiet to loud, close to or far from main road), number of consciously perceived aircraft per day and corresponding long-term noise annoyance, and some concluding questions about the experiment). A listening experiment software with a graphical user interface guided the subjects throughout the experiment, with automatic playback of the stimuli and recording of the entered ratings. The experiment took about 50 min per participant.
During the experiment, the synthesized flyover sounds were assessed regarding subjective short-term noise annoyance using the numerical ICBEN 11-point scale of ISO/TS 15666 [32], which ranges from 0 (not at all annoyed) to 10 (extremely annoyed). To that aim, the participants rated each stimulus independently from the other stimuli (direct rating), by answering the following question (modified from [32]): "When you imagine that this is the sound situation in your outdoor living environment, what number from 0 to 10 best shows how much you would be bothered, disturbed or annoyed by it?" As a follow-up question, the

Table 6
Observer situational parameters for the evaluation of the blended wing body (BWB) aircraft concept.

participants were asked about their familiarity with the aircraft sounds. They gave their answers on a 3-point Likert scale ranging from 0 (I do not agree) through 1 (I am uncertain) to 2 (I agree), in response to the following statement: "In my experience, this aircraft flyover sounds familiar to me." Both questions were asked alongside each other, on the same page of the listening experiment program, after each stimulus. Asking the questions separately, in two blocks, would have required playing back all stimuli twice, which would have lengthened the experiment too much and potentially have led to fatigue of the participants. In total, each participant made 36 ratings of annoyance and familiarity, one for each stimulus (Fig. 12). The order of the four groups (procedure × location) was counterbalanced between participants, and the order of the nine stimuli per group was randomized to account for serial position effects (e.g., [15]).

Study participants
Thirty-two persons (15 female, 17 male), aged 18-61 years (median of 43 years) with a self-reported noise sensitivity of 1.0-2.7 (median of 2.0) on a scale from 0 to 3 participated in the experiment. Of these, 25 were laypeople without acoustics and/or aircraft noise backgrounds and seven were experts in acoustics and/or aircraft noise. The participants fulfilled the requirements for participation (self-reported normal hearing, feeling well and legal age [≥ 18 y]). Written consent for participation was collected from all participants.

Statistical analysis
For the analysis, a data set with a total of 1152 annoyance and familiarity ratings (i.e., 32 participants × 36 stimuli) was obtained from the experiments. The annoyance and familiarity ratings were visualized and analyzed by mixed-effects models. Such multilevel models allow separating fixed effects (design variables) and random effects (the participants, randomly chosen from a population) (see, e.g., [52]).
In a first step, the design variables of the experiment according to Fig. 12 were studied (procedure [departure, approach]; aircraft type [Ref, BWB]; engine variants [Engine 1, Engine 2]; additional LNT [w/o, w/]). The playback number of the stimuli was also included to study possible simple order effects [15], as done in previous studies, e.g., [62]. Stimuli of different observer locations (two per procedure) and simulations (Sim. 1 and 2) were pooled. As the findings were similar for both observer locations, they are pooled for the sake of conciseness in the following account. The two simulations were pooled as they were included to reduce the simulation uncertainty of the source modelling (Section 3.1). In this analysis we tested (1) whether the sounds of the BWB variants and the Ref are equally or disparately annoying, (2) whether they sound equally familiar (or alien) to the participants, given their experience with (environmental) sounds from their every-day lives, and (3) whether familiarity affects the annoyance ratings.
In a second step, a separate mixed-effects modelling analysis was done for noise annoyance using a classical acoustical indicator besides the aircraft concepts (BWB and Ref), either the sound exposure level (L_AE) or the effective perceived noise level (EPNL). In this analysis we tested whether differences in annoyance between the aircraft concepts were exclusively due to noise exposure or also linked to other sound characteristics.

Acoustical indicators
The acoustical stimuli obtained for the different aircraft designs strongly differ with respect to the resulting sound exposure at the observer locations, depending on procedure, LNT, as well as engine variant. A wide L_AE range was thus covered within the experiment, not only between the aircraft variants, but also between the observer locations. Table 7 in Appendix A lists the two conventional noise indicators L_AE and L_AF,max for all 36 stimuli.
For the BWB variant without LNTs compared to Ref, the L_AE is 6-15 dB (L_AF,max: 7-12 dB) and 11-18 dB (L_AF,max: 13-23 dB) lower during approach and departure, respectively. With LNTs, the L_AE is 10-20 dB (L_AF,max: 9-21 dB) and 17-24 dB (L_AF,max: 19-28 dB) lower during approach and departure, respectively. Thus, the LNTs reduce the L_AE by 3-6 dB and 5-7 dB during approach and departure, respectively. Using Engine 2 instead of Engine 1 further reduces the L_AE by about 1-3 dB and 2-4 dB during approach and departure, respectively. As expected, the observer location also has a pronounced effect on the L_AE. For the approach, the farther located observer has a 5-6 dB lower L_AE and for departure a 1-3 dB lower L_AE.

Fig. 13. Comparison of reference aircraft (Ref.) and future aircraft during takeoff: Time histories of A-weighted and FAST time-weighted sound pressure level (L_AF), loudness level (L_N), tone-corrected perceived noise level (PNLT) and sound incidence elevation angle (ϕ) of the departing reference aircraft and the future aircraft (BWB, equipped with engine variant 2 and additional low-noise technology), obtained from both simulations (Sim. 1, Sim. 2) at 9 km distance from the brake release point. Note that all stimuli were centered around the time instant of the L_AF,max.

Not only sound energy, but also the level-time histories and spectra are strongly affected by aircraft concept. Fig. 13 shows exemplary time histories of the stimuli's A-weighted and FAST time-weighted sound pressure levels (L_AF), time-varying Zwicker loudness level (L_N) and tone-corrected perceived noise level (PNLT) at the close location for departure of the Ref and BWB. It is interesting to note that for the BWB flyover the maximum sound pressure level (L_AF,max) and the maximum loudness level occur well after the flyover instant (indicated by dashed vertical lines in Fig. 13) and much later than for the Ref. For the BWB, the highest sound levels occur at around 20-25 s, while the actual flyover occurs at around 10-15 s, i.e. 10 s earlier. This is due to the shielding effect provided to the engines by the vehicle body of the BWB, which is highest towards the front of the aircraft. This leads to a pronounced directivity to the back of the aircraft as compared to the tube-and-wing aircraft design. Fig. 14 shows exemplary spectrograms of the Ref and the BWB during departure and approach at the close location from the runway. Clearly, the fan tone is a salient component of today's aircraft. Fan tonal noise is however hardly present in the spectrograms of the BWB with Engine 2 and LNTs. Similar observations hold true for the approach, although they are less pronounced.

Short-term noise annoyance
Fig. 15 shows the mean observed short-term noise annoyance to different aircraft variants (Ref; different variants of BWB). Short-term noise annoyance was strongly linked to the aircraft design. Overall (i.e., arithmetic mean over all observer positions, procedures, LNTs, engines and simulations), the observed annoyance was 3.3 units on the 11-point scale lower for the BWB than the Ref. The difference was more pronounced for the BWB with LNT (3.6 units lower) than without (3.0 units), i.e., with LNT, annoyance was 0.6-0.7 units lower on the 11-point scale than without. The LNTs are thus only responsible for some 15 % of the annoyance reduction of the BWB. Further, the engine variant (or bypass ratio, respectively) was also linked with annoyance, but to a lesser degree than the aircraft concept and LNTs (0.4-0.5 units on the 11-point scale). Finally, the procedure (departure or approach) also affected the annoyance ratings, which is expected given the different power settings as well as observer locations. Overall (pooled over procedure, observer locations and simulations), annoyance decreased in the This indicates that all three measures, the optimizations of the aircraft concept (BWB), the added LNTs as well as the engines are effectively reducing noise annoyance, and that the BWB aircraft concept is a very effective measure.
Statistical analysis confirmed these effects. The observed effects can be described with the following mixed-effects model (Eq. (2)), where NA is the dependent variable noise annoyance, μ is the overall mean, Proc × AC is the categorical variable Procedure × Aircraft design (9 levels: i = 1 … 9, covering all combinations of procedure [departure, approach], aircraft design [current, BOLT], LNT [w/, w/o] and engine [Eng. 1, Eng. 2]), and Fam, PN and NoiSe are the continuous variables familiarity (see below), playback number (order in which the stimuli were played back), and noise sensitivity. Further, β_1-β_3 are regression coefficients, u_k is the participants' random intercept, and the error term ε is the random deviation between observed and predicted values of NA.
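The displayed form of Eq. (2) is not reproduced in the text above. Based solely on the variable definitions given, a form consistent with the description would be the following; the placement of the subscripts i and k is an assumption made for this sketch.

```latex
% Sketch of Eq. (2), reconstructed from the variable definitions in the text.
% Subscript placement is an assumption.
\begin{equation*}
NA_{ik} = \mu + (\mathrm{Proc} \times \mathrm{AC})_i
        + \beta_1 \, \mathrm{Fam}_{ik}
        + \beta_2 \, \mathrm{PN}_{ik}
        + \beta_3 \, \mathrm{NoiSe}_{k}
        + u_k + \varepsilon_{ik}
\end{equation*}
```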
The index k represents the kth replicate observation of the ith aircraft flyover. All the above effects were found to be significant (p < 0.001; NoiSe: p < 0.04). In Figs. 15 and The model revealed that annoyance (i) was significantly linked with aircraft design and procedure (cf. Fig. 15), (ii) was negatively linked with familiarity, i.e., the more familiar the aircraft sounded the less annoying it was perceived, (iii) increased with playback number, which was also observed as a simple order effect [15] in other studies [62], indicating that participants get increasingly annoyed during the experiment, and (iv) increased with noise sensitivity of the participants. Further, the model reveals that Proc × AC has the most decisive influence on noise annoyance, with annoyance differences between Ref and different variants of the BWB of 2.4-3.5 units on the 11-point scale for approaches and 3.1-4.3 units for departures. Further, annoyance increases by ~1 unit per unit increase of noise sensitivity (corresponding to an increase of 4.1 annoyance units over the whole range of noise sensitivity of 0-3) and increases by 0.02 units per unit increase in playback number (~0.8 units over the whole range). Finally, non-familiar sounds (familiarity rating = 0) were rated as somewhat more annoying (~0.4 units on the 11-point scale) than familiar sounds (familiarity rating = 2). This difference is small compared to the differences of 3.0-3.6 units on the 11-point scale between the different aircraft concepts (Fig. 15) and thus clearly overcompensated by the favorable acoustical characteristics.
As the L_AE (as well as L_AF,max) strongly varied between aircraft designs and observer locations (cf. Section 6.5), a strong effect on noise annoyance was expected, because sound pressure level is a decisive factor for noise annoyance. Fig. 16 shows the dependence of noise annoyance on L_AE and EPNL, separately for the BWB and the Ref. A close linear relation between annoyance and L_AE or EPNL is obvious. Besides, however, there is a distinct shift of this relation on the ordinate (i.e., annoyance) towards lower annoyance values for the BWB compared to the Ref. Mixed-effects model analysis again confirmed these effects, with the following model (Eq. (3)), where L is the continuous noise indicator variable (either set to the L_AE or EPNL), AC is the categorical variable Aircraft design (2 levels: i = 1, 2, for Ref and BWB (pooled across the technologies)), β_1-β_4 are regression coefficients, and the other variables are defined in Eq. (2). Eq. (3) reveals a strong link of annoyance with L_AE (increase of 0.16 units per dB, i.e., 1.6 units per 10 dB), while Fam, PN and NoiSe have almost the same effect size as in the first model (Eq. (2)).
The level-dependent model in Eq. (3) reveals a difference in annoyance between Ref and BWB of 0.8 units at the same L_AE (i.e. the vertical offset between the red and the green line in Fig. 16). This distinct psychoacoustic shift of the relation between annoyance and L_AE on the ordinate (i.e., annoyance) towards lower annoyance values for the BWB compared to the Ref indicates that the novel aircraft designs are beneficial with respect to reduced noise annoyance not only due to substantial sound level reductions, but also due to beneficial sound characteristics. The psychoacoustic shift on the abscissa (i.e. the horizontal offset between the red and the green line in Fig. 16) amounts to 5 dB in L_AE, indicating a "noise bonus" of the BWB, meaning that the BWB, to be as annoying as the Ref, would have to exhibit a 5 dB higher L_AE than the Ref. Note that this "noise bonus" should be interpreted with care, because it requires extrapolation of data since the two groups do not overlap, neither regarding noise annoyance nor L_AE.
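The displayed form of Eq. (3) is likewise not reproduced in the text. From the variable definitions given for it (noise indicator L, aircraft design AC with two levels, four regression coefficients, and familiarity, playback number and noise sensitivity as further predictors), a consistent form would be the following; the assignment of the coefficient indices and the subscript placement are assumptions made for this sketch.

```latex
% Sketch of Eq. (3), reconstructed from the variable definitions in the text.
% Coefficient numbering and subscript placement are assumptions.
\begin{equation*}
NA_{ik} = \mu + \mathrm{AC}_i
        + \beta_1 \, L_{ik}
        + \beta_2 \, \mathrm{Fam}_{ik}
        + \beta_3 \, \mathrm{PN}_{ik}
        + \beta_4 \, \mathrm{NoiSe}_{k}
        + u_k + \varepsilon_{ik}
\end{equation*}
```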
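The relation between the vertical and the horizontal offset can be checked with simple arithmetic: for two parallel regression lines, the horizontal shift equals the vertical shift divided by the common slope. The sketch below uses the values quoted in the text (0.8 annoyance units, 0.16 units per dB); the function name is hypothetical, and the assumption of a common slope for Ref and BWB follows from the model description rather than from an explicit statement.

```python
def horizontal_shift_db(vertical_offset_units, slope_units_per_db):
    """Level shift (dB) equivalent to a vertical offset between two
    parallel annoyance-vs-level regression lines."""
    return vertical_offset_units / slope_units_per_db

# Values quoted in the text: 0.8 units offset, 0.16 units per dB.
noise_bonus_db = horizontal_shift_db(0.8, 0.16)  # approximately 5 dB
```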
Not only the sound pressure level, but also other acoustical characteristics of the aircraft, such as tonal contents, might have played a role (see spectrograms in Fig. 14). The fact that the vertical offset between the BWB and the Ref is smaller for EPNL than for the L_AE in Fig. 16 supports the hypothesis that audible tones in the Ref might have increased its annoyance, or in other words, that the lower audibility of tones in the BWB partially explains its lower annoyance. It is interesting to note that a previous study found the opposite effect of EPNL [54].
Thus, acoustic parameters other than L_AE or EPNL, such as psychoacoustic parameters (e.g., loudness, tonality) or further characteristics such as the slopes of rise and/or decrease in the sound level time histories (cf. Fig. 13) might yield important additional insights. However, this was beyond the scope of the present analysis.

Familiarity
The different aircraft concepts were also disparately perceived with respect to familiarity (Fig. 17). For departures, the BWB were perceived as somewhat less familiar than the Ref. For approach, in contrast, no clear differences between the aircraft designs were observed. The disparately perceived familiarity might reflect differences in sound characteristics and the unusual spatial perception of the BWB (see elevation angle in Fig. 13). The different sound characteristics are observable in the differences between the spectrograms in Fig. 14 which are very pronounced for departures, but less so for approaches.
Mixed-effects model analysis again confirmed these effects. A similar model as for annoyance was initially developed (Eq. (2)), but then simplified. The following model was found suitable to describe the observed effects (Eq. (4)), where the variables are the same as in Eq. (2), except that in Eq. (4), familiarity is the dependent instead of a predictor variable. The model revealed that (absolute) familiarity differences between aircraft designs are all small and non-significant for approaches (0.09-0.27; p > 0.05), but partly larger (0.13-0.56) and partly significant (p < 0.05; cf. Fig. 17) for departures. Though only slightly, familiarity with the sounds affected annoyance (see above).

Summary
In this study, a rigorous, hierarchical validation concept for the developed simulation chains was successfully applied in Section 5. Validation revealed that both the noise indicators and the time-frequency features are well reproduced by the syntheses. Although the auralizations are at least partly discriminable from recordings in direct comparisons, they yield similar annoyance ratings that do not differ statistically significantly. Thus, the proposed simulation methodology seems suited for a perception-based evaluation of future jet aircraft concepts such as the studied BWB.
The subsequent evaluation of the BWB variants in Section 6 revealed that, while BWB concepts with possible low-noise technologies may initially be perceived as more unfamiliar for departures, they are substantially less annoying than currently flying tube-and-wing aircraft of similar range and mission, for both departure and approach. The main reason for the lower noise annoyance of the BWB seems to be the acoustic shielding by the body of the extended fuselage. This shielding was found to be an important factor, reducing sound levels in the order of 10-20 dB and correspondingly also loudness, which is in line with early scale model measurements of fan inlet noise on another BWB design [14]. Adding the LNTs and the geared turbofan engines with a high bypass ratio further reduced the noise annoyance of the studied BWB. Furthermore, both the lower maximum sound levels and the lower sound level slopes of the BWB suggest that the probability of sleep disturbance would also be reduced compared to today's aircraft [2]. However, studying health outcomes other than short-term noise annoyance was beyond the scope of the current study.
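To put a level reduction in the order of 10-20 dB into perspective, it can be converted into a ratio of sound energies using the general decibel relation. The sketch below is illustrative only; the helper name `energy_ratio` is hypothetical.

```python
def energy_ratio(level_reduction_db):
    """Factor by which sound energy drops for a given level reduction in dB.

    Uses the general decibel relation: ratio = 10 ** (delta_L / 10).
    """
    return 10.0 ** (level_reduction_db / 10.0)


# Range of level reductions quoted in the text for the shielding effect.
for reduction_db in (10.0, 20.0):
    ratio = energy_ratio(reduction_db)
    print(f"{reduction_db:.0f} dB lower level = sound energy divided by about {ratio:.0f}")
```

A reduction of 10 dB thus corresponds to one tenth of the sound energy, and 20 dB to one hundredth.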

Limitations
Some limitations of our study should be considered when interpreting the results. For the BWB, lower shielding of the engines and thus less sound level reduction is expected for lateral observers (compared to the observers used here, located directly below the flight path). In the current modelling, the jet noise sources are assumed to be point sources relatively close to the engine outlets, which potentially overestimates the shielding effect of the BWB. Positioning them further away and/or modelling them as distributed sources would reduce shielding and thus result in higher sound levels at the observer locations. Therefore, this study reveals rather optimistic changes in the noise levels and noise effects, which might represent an upper limit of what can be achieved in reality. Also, the differences between the two models used for calculating the shielding effect (Section 3 and, in more detail, LeGriffon et al. [40]) indicate that the attenuation values should be considered with care. This is also reflected in the differences between the two simulations for the conventional noise indicators listed in Table 7 in Appendix A, which are in the order of 5 dB. Caution is also advised regarding the estimated effect of the LNTs (Section 2.5), since they were considered in the modelling in a simplified way. Firstly, the LNTs were not developed, acoustically optimized and evaluated specifically for the BWB vehicle. All LNTs within the ARTEM project were developed independently of a specific vehicle and would thus probably require geometrical adaptations and optimizations for a specific application. Such adaptations would probably result in a slightly lower noise reduction. Secondly, the effects of the LNTs on flight and engine performance and on aircraft weight were neglected. We may safely assume that the mass increase due to the LNTs is small compared to the total aircraft mass and that their impact on thrust needs is consequently negligible. The chosen flap modifications, however, might have an impact on the aerodynamic performance. Thus, the flap modifications might affect the flight procedure, particularly during approach. Thirdly, the effect of the LNTs was modelled independently of radiation direction and flight configuration. While this seems valid for the liners and the landing gear modification, it might be critical for the flap modifications. A more detailed consideration of the flap modifications is thus proposed for future studies.
To keep the total duration of the listening experiment per participant reasonable, our study was limited to a total of 36 stimuli. Due to the chosen focus on different variants of the novel BWB vehicle, no variations could be studied for the tube-and-wing reference aircraft Ref (see stimuli concept in Fig. 12). From our experimental data we therefore cannot determine conclusively what the effect of an engine replacement and/or the LNTs on noise annoyance would be for the Ref, e.g., when replacing the currently installed engines of the Ref by the newly predesigned geared turbofan engine Eng. 2. To estimate the potential influence of these measures on the Ref, and thus the importance of the different conceptual changes between the BWB and the Ref aircraft, additional simulations of engine noise were performed. All three engines, i.e. the two geared turbofan engines Eng. 1 and Eng. 2 and the engines of the Ref (Eng. 0), were operated at similar operating conditions, i.e., resulting in similar total thrust at a given speed and altitude. For the operating condition, a representative situation along the departure was selected (horizontal flyover at an altitude of 610 m, TAS of 86 m/s, glide slope of 6.7°, pitch of 18.9°, in clean configuration). The resulting level-time histories are depicted in Fig. 18 in Appendix B. Four different simulations were set up: a) engines only, b) engines only with LNTs, c) engines installed on the BWB, and d) engines installed on the BWB with LNTs. The comparison between these simulations reveals the clearly dominant noise reduction effect of the acoustic shielding by the BWB (i.e., the installed engine cases). For all considered engine variants and LNT options, the BWB design leads to engine noise level reductions of more than 24 EPNdB. A still advantageous but much smaller effect of 1-5 EPNdB can be observed when replacing the non-geared turbofan engines of the Ref by the geared turbofan engines. Eng. 2 in particular, with a higher BPR of 12 and a lower fan rotational speed, can reach a reduction of 3-5 EPNdB. A smaller noise reduction of on average 0.4 EPNdB is obtained when adding the LNTs to each of the three engine options. (Note that while the selected package of LNTs concerns not only engine noise but also airframe noise (flaps and landing gear), it is estimated that the noise reduction of the LNTs on a conventional tube-and-wing aircraft is in the same order of magnitude as for the BWB.) Consequently, a conventional tube-and-wing aircraft with the LNTs would not be able to reach the low annoyance levels found for the BWB. These additional simulations support our conclusions on the relevance of the BWB design for achieving the observed overall reduction in noise and annoyance.
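Level-time histories such as those referred to above can be condensed into single-number indicators. As a minimal sketch, the following computes the maximum level and the sound exposure level (energy integral referenced to 1 s) from an equally sampled A-weighted level-time history. It is not the tool chain used in the study, the sample data are hypothetical, and the EPNL, which additionally involves perceived noisiness and tone corrections, is not covered.

```python
import math


def sound_exposure_level(levels_db, dt_s, t0_s=1.0):
    """Sound exposure level in dB from equally sampled levels.

    levels_db: A-weighted levels in dB, one value per time step.
    dt_s:      sampling interval in seconds.
    t0_s:      reference duration, 1 s by definition of the exposure level.
    """
    if not levels_db or dt_s <= 0:
        raise ValueError("need at least one level and a positive time step")
    # Sum the relative sound energies over time (rectangle rule).
    energy = sum(10.0 ** (level / 10.0) for level in levels_db) * dt_s
    return 10.0 * math.log10(energy / t0_s)


# Hypothetical event: constant 60 dB(A) for 10 s, sampled once per second.
history = [60.0] * 10
l_max = max(history)
l_ae = sound_exposure_level(history, dt_s=1.0)
print(f"maximum level: {l_max:.1f} dB, exposure level: {l_ae:.1f} dB")
```

For this constant-level example the exposure level exceeds the level itself by 10 log10(10) = 10 dB, which is a convenient sanity check for the implementation.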

Conclusions
The proposed simulation and auralization methodology was successfully applied in a perception-based evaluation of a novel blended wing body (BWB) aircraft concept with different vehicle variants. The BWB was compared to currently flying tube-and-wing aircraft of similar capacity, range and mission as a reference. The conducted psychoacoustic experiments revealed that the BWB may substantially reduce short-term noise annoyance. For the best BWB variant, average noise annoyance was reduced by 4.3 units for departures and by 3.5 units for approaches on the 11-point scale. The main reason for this reduction is the acoustic shielding by the body of the extended fuselage, which was found to be an important factor in reducing sound levels in the order of 10-20 dB and, accordingly, in strongly reducing loudness. Adding a package of novel low-noise technologies and selecting geared turbofan engines with a higher bypass ratio (12 vs. 8) further contributed to the reduction of noise annoyance of the BWB.
Acoustically speaking, the observed reduction in noise annoyance for the BWB was over-energetic, i.e., larger than predicted with conventional noise metrics such as the A-weighted sound exposure level or the EPNL. This psychoacoustic benefit of the BWB amounts to around 5 dB equivalent and is probably due to more favorable sound characteristics compared to today's aircraft, such as less variation over time and less audible tones. This shows a general potential for improved noise metrics and psychoacoustic optimizations of future aircraft. It also indicates a potential for improving the noise situation around airports by replacing today's aircraft with such BWB vehicles in the future. For negative health effects on humans, such as long-term noise annoyance or sleep disturbance, however, other factors also play a role, such as the number of aircraft movements or personal characteristics of residents, which were beyond the scope of this study.
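One plausible way to express such a psychoacoustic benefit as a level equivalent is to divide the annoyance difference that remains at equal sound exposure level by the slope of the exposure-response relationship. The sketch below assumes a linear relationship; the function name and the numbers are hypothetical and are not values from this study.

```python
def level_equivalent_db(annoyance_offset, slope_per_db):
    """Convert an annoyance offset at equal level into an equivalent level change.

    annoyance_offset: annoyance difference (scale units) between two aircraft
                      concepts at the same sound exposure level.
    slope_per_db:     slope of a linear exposure-response relationship in
                      scale units per dB (must be positive).
    """
    if slope_per_db <= 0:
        raise ValueError("slope must be positive")
    return annoyance_offset / slope_per_db


# Hypothetical illustration only: an offset of 1.0 scale units and a slope
# of 0.25 units per dB correspond to a level equivalent of 4 dB.
benefit_db = level_equivalent_db(annoyance_offset=1.0, slope_per_db=0.25)
print(f"level equivalent: {benefit_db:.1f} dB")
```

If the exposure-response slopes differ between aircraft concepts, the level equivalent depends on the level at which it is evaluated, so a single value is only an approximation.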

Fig. 1. Geometry (top, back and side view) and center-body airfoil of the blended wing body layout.

Fig. 4. Engine locations on the blended wing body (side view and top view).

R. Pieren et al.

Fig. 5. Predicted insertion loss hemispheres for a fan inlet source shielded by the blended wing body geometry at 250 Hz, calculated with the BEM solver BEMUSE (left) and the ray tracer SHADOW (right) (taken from LeGriffon et al. [40]).

Fig. 6. Block diagram of the auralization tool AURAFONE consisting of emission synthesis modules and propagation filters.

Fig. 7. Spatial sound reproduction in the listening test laboratory AuraLab with a hemispherical loudspeaker array to create virtual aircraft flyovers.

Fig. 8. Hierarchical validation concept of the flyover auralization consisting of four Levels I-IV.

Fig. 11. Short-term noise annoyance ratings of the stimuli of aircraft flyover recordings and the corresponding syntheses from the psychoacoustic validation. The bubble size indicates the number of collected ratings. The original sounds (blue) and the 15 dB attenuated sounds (red) are shown with the 1:1 correspondence line. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 12. Flow chart describing all 36 synthesized flyover stimuli of the evaluation of the blended wing body aircraft variants.

* Distance from the observer location to the brake release point (for Departure) or touch-down point (for Approach).
17, statistically significantly different values are indicated by differing capital letters, i.e. two bars showing the same letter are not statistically different from each other. As a reading example: In Fig. 15 (Departure), BWB with Engine 2 without LNT ("BWB, Eng. 2 w/o LNT") is statistically different from the Ref and from BWB with Engine 2 with LNT ("BWB, Eng. 2 w/ LNT"), but not from the two BWB variants with Engine 1.

Fig. 14. Spectrograms of synthesized flyovers of today's reference (Ref) tube-and-wing aircraft (top) and the future blended wing body (BWB) aircraft design (bottom) during departure (left), as observed at a distance of 9 km from the brake release point, and during approach (right), as observed at a distance of 5 km from the touchdown point. The BWB is equipped with engine variant 2 and additional low-noise technology (LNT), and computed by simulation Sim. 1. The receiver is located at 1.2 m above grassy ground.

Fig. 15. Observed noise annoyance (mean values plus standard error bars) for departure (left) and approach (right) of the aircraft concepts (reference aircraft: Ref; future aircraft concept: BWB, the latter with two engine variants [Eng. 1, Eng. 2] and without [w/o] and with [w/] low-noise technologies [LNT]). Mean values pooled over the observer locations and simulations. Statistically significant differences (p < 0.05; pairwise comparisons with Bonferroni correction) are indicated by differing capital letters between the respective bars.

Fig. 16. Short-term noise annoyance (mean values per stimulus) as a function of the sound exposure level L AE (left) and EPNL (right) for the stimuli attenuated by dB, separated per aircraft concept (reference aircraft: Ref; future blended wing body concept variants: BWB). Symbols represent observed values and lines the mixed-effects models with 95 % confidence intervals of annoyance on L AE or EPNL.

Fig. 17. Reported familiarity (0 = unfamiliar; 2 = familiar; mean values plus standard error bars) for departure (left) and approach (right) of the aircraft concepts (reference aircraft: Ref; future aircraft concept: BWB, the latter with two engine types [Eng. 1, Eng. 2] and without [w/o] and with [w/] low-noise technologies [LNT]). Mean values pooled over the observer locations and simulations. Statistically significant differences (p < 0.05; pairwise comparisons with Bonferroni correction) are indicated by differing capital letters between the respective bars.

Fig. 18. Simulated A-weighted level-time histories of engine noise in a representative situation along a departure for the three engine options 0, 1 and 2 for four different cases: a) engines only, b) engines with LNTs, c) engines with BWB, d) engines with BWB and LNT.

Table 1
Improvement of maximum take-off weight (MTOW) and average and maximum aerodynamic efficiency (L/D) achieved through multi-level optimization.

Table 2
Spectral insertion losses of three novel low-noise technologies applied to the blended wing body aircraft.