The Effects of Forensically Relevant Face Coverings on the Acoustic Properties of Fricatives
Julie Saigusa

This forensically motivated study investigates the effects of a motorcycle helmet, balaclava, and plastic mask on the acoustics of three English non-sibilant fricatives, /f/, /θ/, and /v/ in two individuals. It examines variation within the individual as an effect of the physical environment. Two speakers recorded a list of minimal pairs in each of the three guises and with no face covering. The results showed that facewear significantly affected fricative intensity and the four spectral moments: centre of gravity, standard deviation, skewness, and kurtosis. The acoustic changes caused by facewear have implications for judging the reliability of earwitnesses’ content recall and voice identification as well as forensic speech scientists’ examination of content and speaker identity in disputed recordings.


Motivation and Research Question
Criminal acts lend themselves to disguise, especially in a world where video and photo surveillance is increasingly pervasive. A range of disguises are exemplified in popular culture in films like V for Vendetta, which features the now-iconic Guy Fawkes mask, and Point Break, featuring robbers dressed in rubber masks of US presidents. Real-life CCTV footage circulated by news media often shows a suspect in a ski mask or other face-concealing disguise (e.g., Martinez and Lerten 2016). The evidence against such individuals must necessarily come from earwitness testimony of what was heard (i.e., content) and who was heard (i.e., speaker identification). Forensic phoneticians are also often called upon to judge the identity of a speaker or a crucial word in a bad quality recording (Fraser 2014). This study is a forensically based attempt to determine how several relevant face coverings affect the acoustic properties of certain fricatives and to explore the implications for forensic speech science. It expands the kind of phenomena examined in studies of intraspeaker variation to the passive and active effects of immediate physical obstruction on speech.
Intuitively, it can be predicted that a material obstructing sound propagation will affect how speakers produce speech, how speakers sound to listeners, either in speech content or voice quality, and how far the speech wave can physically propagate. Even the angle of the speaker in relation to the listener may affect perception: high frequency sounds do not travel as well in non-forward directions as lower frequency sounds (Thomas 2002). This study addresses the first aspect: are the acoustic properties of certain fricatives produced through facewear different from those produced without facewear? The study examines two fricative pairs: one with a voicing contrast (a contrast previously unstudied in this context), /f/ and /v/, and one with a place contrast, /f/ and /θ/.
The area of forensic facewear research is relatively new. Llamas et al. (2008) and Fecher (2014) pioneered this subfield by examining how different types of face coverings affect the acoustic signal, and in turn, how listeners perceive that signal. Since there has been little work on this topic to date, it is important not only to expand on previous findings by bringing in new factors and manipulations, but also to replicate what has been found so far. It is predicted that facewear will affect fricative acoustics, manifesting as differing spectral properties and intensities across the conditions.

Forensic Speech Science and Facewear Research
Earwitness testimony is common in court cases. One legal meta-analysis (Laub et al. 2013) analysed 226 US court cases that focused on earwitness testimony. Earwitnesses may be asked to report what they heard a suspect say or to identify a suspect by their voice, such as in a voice lineup (Broeders 2013). Though the unreliability of eyewitnesses has been exhaustively researched (Wells and Olson 2003), less has been done on aspects of auditory evidence, though this is by no means an unstudied area. A number of studies have focused on the reliability of content recall (e.g., Ling and Coombe 2005), while others have investigated earwitness ability to identify a voice (e.g., Yarmey 2007). Briefly, Fecher (2014), which is discussed in more detail below, examined the ability of listeners to discriminate between voices when stimuli were recorded through facewear. The study found that compared to near-ceiling performance in the control condition, facewear degraded listeners' abilities to tell whether a pair of consonant-vowel syllables was spoken by the same or different people. In addition, covert recordings made by law enforcement or tapes of calls to emergency services are often of degraded quality and require careful acoustic analysis to determine their contents (Fraser et al. 2011). In these cases, knowledge of the type of face covering an individual is wearing, if any, could be of crucial importance to forensic practitioners.
Masks and other types of facial concealment are prevalent in crime. In addition to the iconic balaclava, criminals may use handkerchiefs, rubber masks, tights, party masks, or even motorcycle helmets to disguise themselves, to name only a few types of facial concealment (Fecher 2014). The current study uses a balaclava and helmet, as in Fecher (2014), and delves into the wide range of other face coverings with a plastic mask. Llamas et al. (2008) is the first forensic study on facewear. Their first experiment focused on listener perception of speech produced through a balaclava, surgical mask, niqab (full-face veil), and no cover (control). Confusions between fricatives, especially place distinctions, were among the most common errors, inspiring the present investigation into the acoustic consequences of facewear on fricatives in particular.
The second experiment investigated the acoustic transmission loss properties of different materials on a non-speech signal produced by a loudspeaker. The aim was to observe how the materials themselves attenuate sound independent of how a speaker may change production while wearing facewear. The range of fabrics in this study was greater than in the perception experiment, and included the niqab (polyester), balaclava (acrylic yarn), surgical mask (paper), handkerchief (cotton), scarf (wool/acrylic blend), stockings (nylon), and loudspeaker cover fabric (a supposedly "acoustically transparent" woven fibre). The materials were placed between a loudspeaker that played a sequence of pulses and a microphone that picked up the pulses. Unexpectedly, only the surgical mask showed any significant transmission loss. The other fabrics, even thicker ones, showed little or no difference to the control. As Llamas et al. (2008) point out, this is not an accurate representation of how natural speech is produced or transmitted, but it gives a preliminary insight into characteristics of a range of fabrics and how they may affect speech. Their study did not examine any type of plastic face covering, which the current one does.
Llamas et al. (2008) laid the foundations for Fecher (2014), which inspired the current study. Fecher's work is a comprehensive PhD thesis which examined the acoustic properties of voiceless stops and fricatives produced through facewear, listener perception of those consonants in both auditory-only and audiovisual conditions, and listener discrimination of facially covered speakers. Fecher recorded five males and five females to produce the Audio-Visual Face Cover corpus (AVFC). The speakers were recorded wearing a motorcycle helmet, balaclavas with and without mouth holes, a strip of tape across the mouth, a niqab, a surgical mask, a scarf across the nose and mouth with a hoodie over the head, and a full-head rubber mask. The material recorded consisted of a nonsense consonant-vowel-consonant syllable (to prevent top-down word processing) in the carrier sentence "He said X". A range of consonants were used, including voiceless stops and fricatives in both onset and coda position, except where prohibited by English phonotactic rules, preceded and followed by the vowel /ɑ:/.
The voiceless fricatives produced for the AVFC were analysed for a number of spectral measures including spectral peak, centre of gravity, standard deviation, skewness, kurtosis, and intensity (see Section 3 for a discussion of these terms). Non-sibilant fricatives (the focus of the current study) showed a significant effect of facewear on intensity and spectral moments. Most fricatives were lower in intensity than the control. With few exceptions, centre of gravity was lower than the control for /f/ and /θ/. /f/ showed more positive skewness in the balaclava, rubber mask, tape, and helmet conditions. Kurtosis increased in the balaclava and helmet conditions for /θ/, and was affected overall for /f/. Additionally, syllable position was significant. Based on these results, facewear can be expected to affect intensity and spectral measures in the current study. In an auditory-only forced-choice identification task, non-sibilant fricative confusions were again among the most common error type. /θ/ was commonly misidentified as /f/ across all conditions, but again the opposite (/f/ for /θ/) occurred less than 10% of the time. /v/, the other phoneme used in the present study, was usually identified correctly.
The present study aims to extend Fecher's (2014) findings by replicating some previously used facewear conditions and adding a previously untested one. Llamas et al. (2008) tested an array of voiced sounds, including /v/, but Fecher (2014) only examined voiceless sounds acoustically. This study narrows the phonetic focus of Llamas et al. (2008) but expands on the findings of Fecher (2014) by testing two minimally contrastive phoneme pairs, /f-θ/ and /f-v/.
Finally, a number of non-forensic studies have examined the effects of various head and face coverings on communication in medical, military, and other industrial contexts. These results confirm the negative effect of facewear on intelligibility, though so far this area has lacked detailed investigation into acoustics or patterns of phonetic confusion. Surgical masks, respirators, and Air Force headgear have all been found to reduce intelligibility (Radonovich Jr. et al. 2009, Sommer 1976, Wittum et al. 2013).

Speech Production and Acoustics
The first research question is concerned with the acoustic properties of fricatives produced through face coverings. A purely acoustic analysis cannot determine what changes come from natural inhibition of the sound wave by the material and what changes the speaker is actually making to their articulation in response to facewear. However, previous research on how speakers compensate for inhibited articulation provides a framework for what characteristics can be expected in this study.
Speakers perturbed by jaw weights or bite blocks have been found to adapt quickly to inhibited movement in one articulator by increasing use of another one (Lindblom et al. 1979), producing much the same acoustic output. Stevens' (1989) influential quantal theory proposed that languages exploit articulatory regions within which changes to articulation do not result in major changes to acoustic output. In the context of this study, these findings suggest that even with slightly restricted articulatory movement, changes in production will not be the major source of acoustic difference. Some studies, however, have reported that consonants require a much more precise articulatory configuration than vowels, finding incomplete adaptation (Flege et al. 1988, McFarland and Baum 1995). McFarland and Baum (1995), in particular, found that bite-block perturbed sibilants and stops differed acoustically from controls immediately after presenting perturbation and after a 15-minute accommodation period. They also observed a degree of individual variability in adaptation ability. The current study is not focused on articulatory perturbation, but findings in that area can untangle what acoustic changes may result from the restriction of jaw movements by crash helmets or other coverings.
Two of the face coverings chosen for this study cover the speakers' ears as well as their mouths and noses (balaclava and helmet). Earwear affects the natural feedback loop that speakers use to monitor themselves. When wearing a garment that covers the ears, speakers perceive their own voices differently, and subsequent adjustments have been the focus of many studies (e.g., Garnier et al. 2010, Tufts and Frank 2003, Martin et al. 1976). Garnier et al. (2010) investigated the methodological implications of using headphones to transmit noise on manifestations of Lombard speech (i.e., speech in noise). The study found that the acoustics of speech produced by headphone-wearers differed significantly from those of speech produced when the noise was transmitted by loudspeaker. Notably, Tufts and Frank (2003) examined speech produced when participants wore different types of earplugs in quiet and in noise. They found that, overall, speakers wearing earplugs produced lower intensity speech than those with uncovered ears as well as less higher frequency sound energy. Indeed, Fecher (2014) also observed a migration of sound energy from higher to lower frequencies in facewear conditions, which was attributed to the attenuating characteristics of the material itself. Llamas et al. (2008) reported fabrics attenuating energy above 10 kHz. In the context of the current study, the earwear effect supports the prediction that facewear will affect speech acoustics.

Social Connotations of Phonetic Variation
One of the features investigated here, the realisation of /θ/ as [f], has been examined as "TH-fronting" in sociolinguistics and is a linguistic stereotype in parts of the English-speaking world. It is primarily found in working-class and urban areas of the south of England, though it has also recently been observed in the northeast and in some parts of urban Scotland (Clark and Trousdale 2009, Levon and Fox 2014). TH-fronting is important in the present context in that its realisation could potentially be acoustically altered enough by a facial covering for listeners to misperceive it. It has been found that listeners' perceptions are affected by the dialect they believe they are hearing, even given identical stimuli (Niedzielski 1999, Thomas 2002). Perceiving /θ/ as [f] may well influence listeners' perceptions of the speaker's characteristics and their ability to recall voice qualities (Thomas 2002). In addition to non-facewear confusion studies which found voiceless non-sibilants (/f/ and /θ/) the most easily confusable, Fecher (2014) and Llamas et al. (2008) found facewear to significantly affect error rates for these consonants. In the present study, the extension of this effect to the perception of real words (cf. nonsense syllables) would support a TH-fronting illusion hypothesis.

Intensity and Spectral Moments
The acoustic parameters examined in this study follow Fecher (2014). They are intensity, centre of gravity, standard deviation, skewness, and kurtosis. Intensity is the measure of the loudness of a sound, here measured in decibels. Centre of gravity is the first of four measures referred to as spectral moments (Hardcastle et al. 2010). It measures the frequency in Hertz at which the sound energy is primarily concentrated (Watt 2013). The second spectral moment, standard deviation, is a measure in Hertz of how spread out the energy is around the centre of gravity (Watt 2013). A higher standard deviation indicates that energy is more spread across the spectrum, while a lower value indicates that energy is more concentrated around the centre of gravity. The third spectral moment, skewness, is a measure of spectral tilt, or energy asymmetry around the centre of gravity. Positive values indicate more energy in lower frequencies, and vice versa (Watt 2013). Kurtosis, the fourth spectral moment, is a measure of how concentrated spectral energy is in a peak relative to the energy distribution of the spectrum (Hardcastle et al. 2010), i.e., how the shape of a spectrum differs from the normal distribution. A positive value indicates a leptokurtic, or more peaked, distribution. Kurtosis and standard deviation are often correlated, since they are concerned with the energy distribution (Fecher 2014).
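As a rough illustration of the four spectral moments defined above, the following Python sketch computes them from a discretised power spectrum. The function name `spectral_moments`, the choice to weight each frequency bin by its power, and the toy spectra are assumptions made for illustration; they are not taken from the analysis scripts used in this study.

```python
import math

def spectral_moments(freqs_hz, power):
    """Return (centre of gravity, standard deviation, skewness, excess kurtosis).

    Assumption: each frequency bin is weighted by its power, so the
    spectrum is treated like a probability distribution over frequency.
    """
    total = sum(power)
    if total <= 0:
        raise ValueError("spectrum has no energy")
    weights = [p / total for p in power]

    # First moment: weighted mean frequency (centre of gravity).
    cog = sum(w * f for w, f in zip(weights, freqs_hz))

    def central(k):
        # k-th central moment of the weighted frequency distribution.
        return sum(w * (f - cog) ** k for w, f in zip(weights, freqs_hz))

    variance = central(2)
    if variance == 0:
        raise ValueError("all energy lies in a single frequency bin")
    sd = math.sqrt(variance)
    skewness = central(3) / variance ** 1.5
    kurtosis = central(4) / variance ** 2 - 3.0  # 0 for a normal distribution
    return cog, sd, skewness, kurtosis

# Toy spectra (hypothetical values, not data from the study).
freqs = [1000.0, 2000.0, 3000.0]
symmetric = spectral_moments(freqs, [1.0, 2.0, 1.0])
low_heavy = spectral_moments(freqs, [4.0, 1.0, 1.0])
```

The second toy spectrum concentrates its energy in the lowest bin, which yields a lower centre of gravity and a positive skewness, in line with the definition that positive skewness indicates more energy in lower frequencies.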

Summary
The current study aims to determine how a face covering affects the acoustic characteristics of three non-sibilants: /f/, /θ/, and /v/. Findings from pioneering forensic facewear studies (Fecher 2014, Llamas et al. 2008) indicate that facewear will have a significant effect on the intensity and spectral properties of fricatives, especially by moving sound energy to lower frequencies. Theories of speech production and studies on articulatory perturbation found that variable motor input can and often does lead to a minimally varied acoustic output (Stevens 1989). However, evidence from "earwear studies" such as Tufts and Frank (2003) indicates that intensity and energy concentration frequency can both decrease when a speaker's ears are covered. This suggests that any spectral effects of facewear in the current study are likely to be due to attenuating characteristics of the face covering itself and to the speakers' impaired ability to monitor the sound of their own voice rather than atypical articulatory configurations. The current study builds on previous work by examining a voiced fricative and expanding the range of facewear used.

Materials
The read materials consisted of 42 monosyllabic English words (see appendix for list) with one of the target fricatives (/f/, /θ/, or /v/) in onset or coda position. The words were minimal or near-minimal pairs of either a /f/-/θ/ (place) or /f/-/v/ (voicing) contrast. Word frequency was balanced within place and voicing contrast pairs (/f-θ/: t(14.62) = -0.081, p = 0.93; /f-v/: t(15.08) = -0.1583, p = 0.88) (Davies 2004). Vowel height and backness were not perfectly balanced due to the limited number of possible minimal pairs (see list in appendix). Each contrast had a majority of high front vowels. The words were presented on individual index cards and randomised between each repetition.
The face coverings chosen for this experiment were a full-head motorcycle helmet, a balaclava (no mouth hole), and a plastic party mask (no mouth hole) (see Figure 1). The selection of these face coverings was motivated by Fecher's (2014) use of a helmet and balaclava and mention of plastic Halloween-type masks as prevalent in criminal and social situations, though that study did not include them.

Participants
The participants consisted of two native Standard Southern British English speakers: one 20-year-old female and one 24-year-old male, neither of whom reported any speech or hearing disorders. In addition, neither speaker normally exhibited TH-fronting. The male speaker reported previous experience wearing motorcycle helmets.

Audio Recording
Participants were recorded individually in a sound-treated booth and recordings were sampled at 44.1 kHz. The microphone was placed approximately 30 cm from the participant's mouth. They were told to try to stay at that position but adherence to this was not monitored. They were given a stack of 42 index cards and instructed to read each word, making sure not to flip to the next card until they had finished speaking. No special instructions on how to speak were given. Each participant recorded the list twice for every face covering as well as for no face covering, but not consecutively. After each repetition, the list was recorded in a different face covering; this order was varied between the subjects. This resulted in eight repetitions per participant. The word list was randomised by shuffling 8 times per participant after each repetition. In total, 672 tokens were recorded.
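The recording design described above can be checked with a short enumeration. This Python sketch is only illustrative: the condition labels and the placeholder word identifiers are assumptions, and the real word list is the one given in the appendix.

```python
from itertools import product

speakers = ["female", "male"]
conditions = ["control", "balaclava", "helmet", "mask"]
repetitions_per_condition = 2
# Placeholder identifiers standing in for the 42 words on the index cards.
words = [f"word_{i:02d}" for i in range(1, 43)]

# One token per speaker x condition x repetition x word.
tokens = list(product(speakers, conditions,
                      range(repetitions_per_condition), words))

repetitions_per_speaker = len(conditions) * repetitions_per_condition
total_tokens = len(tokens)
```

Under these assumptions the enumeration gives eight repetitions per speaker and 672 tokens in total, matching the figures reported in the text.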

Processing
Each recording was hand-segmented for word and phoneme in Praat (Boersma and Weenink 2009). The fricatives were identified visually on the spectrogram and the waveform by energy onset and offset as well as by auditory examination through Sennheiser HD202 headphones (see Figure 2). Each word and fricative was extracted from the larger .wav file.

Results
The individuals' data were analysed separately and are referred to as the "female speaker" and the "male speaker" to distinguish between them, but this does not imply that gender was a causal influence on any of the parameters investigated. Subsets of the data were examined visually to confirm normality. In total, 672 tokens were collected, of which 18 tokens were excluded because of noise in the recording, leaving 654 for analysis. Data were analysed in R (R Core Team 2015). All two-way ANOVAs tested cover condition levels (balaclava, control, helmet, and mask) and syllable position (onset and coda). Syllable position was included as a predictor to ensure that the main effects of facewear were not falsely reported, since position is known to have phonetic effects on segments (Fecher 2014).
There was a significant main effect of syllable position for /θ/ (F(1,63) = 9.24, p < 0.01), and a significant interaction between cover and position for /θ/ again (F(3,63) = 6.72, p < 0.01) for the male speaker. For the female speaker, t-tests conducted using a Bonferroni-corrected significance level of p = 0.017 reported a significant difference in intensity between the control and the balaclava (t(87.61) = 7.73, p < 0.001) and between the control and the helmet (t(159.38) = 7.77, p < 0.001), but not between the control and the mask (t(166.56) = 1.40, p = 0.16). For the male speaker, the helmet differed significantly from the control (t(154.12) = 11.60, p < 0.001). The main effect of cover condition is consistent with findings in Fecher (2014) and supports the hypothesis that facewear affects fricative intensity.
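The fractional degrees of freedom reported above are consistent with Welch's unequal-variance t-test, and the corrected threshold of 0.017 corresponds to dividing a significance level of 0.05 by three comparisons. The sketch below shows both calculations in plain Python. That the tests were Welch tests and that the uncorrected level was 0.05 are assumptions inferred from the reported values, and the intensity values are invented for illustration only.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each sample.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df

def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level under a Bonferroni correction."""
    return alpha / n_comparisons

# Hypothetical intensity values in dB (not data from the study).
control = [62.0, 64.0, 63.0, 65.0, 61.0]
helmet = [55.0, 59.0, 52.0, 60.0, 54.0]

t_stat, df = welch_t(control, helmet)
corrected_alpha = bonferroni_alpha(0.05, 3)  # three cover-versus-control tests
```

Because the degrees of freedom depend on the two sample variances, they are generally not whole numbers, which is why values such as 87.61 can arise.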
The significant effect of cover condition on the three phonemes is consistent with Fecher's (2014) observation that centre of gravity for non-sibilants was lower in facewear conditions than in the control and supports the current study's hypothesis that facewear affects spectral properties of fricatives.
The significant effects of facewear found for /f/ differ from Fecher's (2014) finding that facewear did not significantly affect the standard deviation of /f/. Fecher also found a significant effect of facewear on /θ/ that was not observed in this male speaker. As seen in Figure 5, the helmet appeared to increase the spread of energy around the mean for /f/ and /θ/, and decrease it for /v/.

Acoustic Properties of Fricatives Produced through Face Coverings
The aim of this experiment was to determine if and how face coverings affected the acoustic properties of /f/, /θ/, and /v/. It was hypothesised, based on findings in Fecher (2014), that the acoustic properties of the fricatives recorded would differ significantly from those in the control. The evidence reported in Section 3 supports the hypothesis that covering the face changes the acoustic output. Intensity and all four spectral moments measured were significantly affected by facewear.

Intensity
Intensity is a measure of sound loudness. The interpretation of these results requires synthesis of several possible sources of difference. As outlined in the introduction, because the analysis did not involve articulatory measurements, it is not always possible to distinguish between any changes in speaker behaviour and physical absorption qualities of the face covering. Furthermore, as the two will always co-occur in facewear speech, separating them does not further the practical purpose of forensic research.
For the female speaker, the intensity of helmet and balaclava speech was lower than in the control, while the mask did not significantly differ. This finding is supported by Tufts and Frank's (2003) observation that intensity was lower for speech produced in quiet when the speaker's ears were covered. Both the helmet and the balaclava covered the ears, while the mask did not. For the male speaker, the intensity of helmet fricatives was higher than the control. A possible explanation stems from his reported previous experience of wearing motorcycle helmets. It has previously been found that speakers seated in a car even in the absence of noise produce speech similar to that produced in car-noise (Podlubny et al. 2016). In short, their previous knowledge of the effects of their physical environment caused them to compensate even when the communication conditions were not adverse. Similarly, the male speaker may have known how his speech would be affected and how he would need to compensate to be heard, shifting his speech style (i.e., speaking louder) in response to the physical environment (i.e., the helmet).

Spectral Moments
The results indicate that face coverings affect the spectral properties of some fricatives. All three phonemes showed differences in the four spectral moments across the different types of cover.
These findings are well supported by previous studies that observed differences in sound energy concentration in the presence of intervening materials. Fecher (2014) reported that at least in the helmet condition, sound energy migrated into the lower frequency bands compared to the control, resulting in a lower centre of gravity and a more positively skewed spectrum (i.e., more energy in lower frequencies compared to higher ones). Observing the results in Figures 4 and 6, it can be seen that this, indeed, was the case with the two speakers. For every face covering, centre of gravity measures were lower and skewness was higher than in the control. /v/, which was not examined in Fecher (2014), showed especially large differences in skewness and kurtosis between control and experimental conditions compared to the voiceless fricatives. However, though /v/ also showed a lower centre of gravity in facewear conditions than in the control, the difference was not as large as for the voiceless sounds.
The lowering of spectral energy concentration was also found by McFarland and Baum (1995) and Flege et al. (1988) in studies of articulatory perturbation. However, those studies used a bite-block perturbation that increased jaw widening and therefore the dimensions of the front cavity as well. Here, jaw movement was most likely to be restricted rather than increased, especially in the helmet condition. This lends support to the evidence that properties of the covering material account for migration of sound energy to lower frequencies rather than a large articulatory change on the part of the speaker. However, Tufts and Frank's (2003) earwear study found a lowering of spectral energy frequency, as well as a decrease in intensity. This suggests that the hearing obstruction caused by the ear-covering balaclava and helmet additionally affected centre of gravity.
The overall effect of syllable position for some phonemes was also found in Fecher (2014). This was expected given that coda consonants exhibit more articulatory reduction than onsets generally (Ohala and Kawasaki 1984). The acoustic analysis indicates that a material covering the face can significantly affect spectral properties that experts who are consulted in court cases use to determine individual speaker likelihood or disputed content. This is discussed further in Section 4.3.
The movement of energy to lower frequencies found in this study may be a reason why covered /f/ and /θ/ are easily confused: higher frequency noise information (especially above 10 kHz) has been found to play a role in identifying these sounds (Tabain 1998, Tabain and Watson 1996), especially for /f/. This means that in a situation where a crime is occurring, /θ/ may be heard as /f/. As mentioned before, the phenomenon of TH-fronting, where the phonemic /θ/ in words like thin is realised as [f] (making thin and fin homophones), is well known. It has been the focus of numerous studies investigating regional and class dialects (Levon and Fox 2014, Schleef and Ramsammy 2013) and sound change (Blevins 2004), and is a growing feature in many southern English dialects as well as in parts of the northeast and Scotland (Bennett 2012, Clark and Trousdale 2009). TH-fronting is often characterised as a feature of working-class speech, or more specifically, "chavs". A "chav" is a social stereotype of the poor working class, often vilified by the media and other online users. For example, a highly rated definition of "chav" on Urban Dictionary is "amoral", and the term is described as applicable to "every culture with a nasty, thieving element" (chavspotting 2004).
Combined with the prevalent cultural stereotype of petty criminality in "chavs", the increased ambiguity and potential misperception of /θ/ may lead a witness to false impressions of the perpetrator's dialect and social class. Though the effects of facewear on other salient dialectal characteristics such as t-glottaling and h-dropping (Bennett 2012) were not investigated, a TH-fronting illusion could be an important factor in determining the reliability of earwitness testimony. In addition to the sound-attenuating characteristics of any material itself, the lack of visual cues for place of articulation that necessarily occurs when the face is obscured makes these fricatives difficult to distinguish (Fecher 2014). Niedzielski (1999) found that participants' phonemic perception shifted depending on what dialect they believed they were listening to. This effect could easily be exacerbated by /θ/ misperception.

Forensic Speech Science: Masks in Crime Revisited
One of the main purposes of the study was to examine what the results of facewear experiments mean forensically, for example, in cases involving earwitness situations or disputed recordings. Firstly, fricatives are usually examined by forensic speech scientists in casework (Gold and French 2011). In a case where determining if a recorded voice belongs to a suspect or profiling a voice's characteristics is of the utmost importance, accurate measurement is crucial. It is well known that intraspeaker variation makes reliance on any one acoustic measurement unviable; indeed, conversation context has been found to affect individuals' fricative measurements (Saigusa 2016). This study has shown that face coverings can also alter those same spectral properties. Fecher's (2014) speaker discrimination experiment additionally found that facewear impairs a listener's ability to tell if syllables are spoken by the same or different people. Consequently, knowledge of how facewear affects speech is necessary to make an accurate analysis and present the best possible judgment. If an analyst does not know that the speaker's face was covered, their speaker comparison is likely to suffer.
Incriminating audio recordings present the possibility for misinterpretation with serious consequences. Recordings can be made by a hidden microphone (e.g., in a house or car) to capture incriminating evidence in a police investigation. Similarly, recordings of private telephone calls or calls made to emergency services are often provided as evidence in criminal cases (Fraser et al. 2011). These recordings are often of degraded quality due to the equipment (e.g., a small microphone), the background noise (such as crowd or engine noise), and in the case of emergency calls, an extremely distressed speaker. A famous example of how these factors can affect perception is taken from a murder case in 1994 (Fraser et al. 2011). In this instance, a man made a call to emergency services, where he allegedly confessed to murder by saying "I shot the prick". However, other listeners heard him accuse his father, "He shot them all". Fraser's study showed that listeners' interpretations of the call were not independent of context but varied significantly depending on the information they were given.
Uncovered speech in these recordings is clearly already acoustically ambiguous, and forensic phoneticians are often called in to make an expert judgment on the content of the recording. If suspects were wearing any type of face covering at the time, the reliability of these recordings will be even more impaired. The results of this study showed that face coverings dampen noise in high frequency bands; if it is known that a speaker in a recording was wearing a face covering, its effects on the acoustic signal can be interpolated and taken into account when offering a judgment on content or identity. In fricatives, the migration of a peak to a different frequency due to a face covering may give a misleading acoustic impression. Knowing that a face covering was involved can help forensic phoneticians to make the most accurate analyses.

Conclusion
This study explored how facewear affected both the acoustic output and the perception of /f/, /θ/, and /v/ within two particular individuals. As hypothesised, facewear significantly affected fricative intensity and four spectral measures for both speakers. In particular, sound energy migrated from higher to lower frequencies in facewear conditions compared to the control, and in general, intensity was lower than in the controls. In addition to the passive attenuating effects of the facewear, there was some evidence that speakers shifted production to compensate for the obstruction. These results suggest that facewear, a common feature in criminal cases, can change the acoustic properties of speech and disrupt a witness's ability to hear certain words accurately. The acoustic confusability of /f/ and /θ/ and the potential to mistake facewear effects for TH-fronting could lead listeners to make false assumptions about speaker dialect and characteristics. Future research on a TH-fronting illusion, or even other salient dialectal features, would give better insight into how facewear interacts with dialect perception. The potential real-world consequences of the acoustic effects found in this study demonstrate the need for further research that will not only broaden scientific understanding of how the physical environment can affect speech style and acoustics but also enable experts in criminal cases to provide accurate testimony.

Figure 1: The three face coverings used in the experiment (left: a full-head motorcycle helmet; middle: a hard plastic party mask with no mouth hole; right: a knit balaclava with no mouth hole).

Figure 2: Spectrogram of a fear token showing segmentation.

Figure 3: Mean intensity (dB) of each phoneme for the two speakers. Error bars show the standard error of the mean.

Figure 4: Mean centre of gravity (kHz) of each phoneme for the two speakers. Error bars show the standard error of the mean.

Figure 5: Mean standard deviation (kHz) of each phoneme for the two speakers. Error bars show the standard error of the mean.

Figure 6: Mean skewness of each phoneme for the two speakers. Error bars show the standard deviation of the mean.

Figure 7: Mean kurtosis of each phoneme for the two speakers. Error bars show the standard deviation of the mean.