Face masks have emotion-dependent dissociable effects on accuracy and confidence in identifying facial expressions of emotion

The coronavirus pandemic has resulted in increased use of face masks worldwide. Here, we examined the effect of wearing a face mask on the ability to recognise facial expressions of emotion. In a within-subjects design, 100 UK-based undergraduate students were shown facial expressions of anger, disgust, fear, happiness, and sadness, as well as a neutral expression; these were either posed with or without a face mask, or with a face mask artificially imposed onto them. Participants identified the emotion portrayed in the photographs from a fixed choice array of answers and rated their confidence in their selection. While overall accuracy was higher without than with masks, the effect varied across emotions, with a clear advantage without masks for disgust, happiness, and sadness; no effect for neutral; and lower accuracy without masks for anger and fear. In contrast, confidence was generally higher without masks, with the effect clear for all emotions other than anger. These results confirm that emotion recognition is affected by face mask wearing, but reveal that the effect depends on the emotion being displayed, with this emotion-dependence not reflected in subjects’ confidence. The disparity between the effects of mask wearing on different emotions and the failure of this to be reflected in confidence ratings suggests that mask wearing not only affects emotion recognition, but may also create biases in the perception of facial expressions of emotion of which perceivers are unaware. In addition, the similarity of results between the Imposed Mask and Posed Mask conditions suggests that prior research using artificially imposed masks has not been deleteriously affected by the use of this manipulation.


Introduction
Human faces provide valuable information about not only the identity of a person but also their emotional state (Bruce & Young, 1986). The functional utility of transmitting this information is well recognised (e.g. Ekman, 2003), as are the facts that emotion recognition in faces is both rapid (e.g. Tracy & Robins, 2008) and generally highly accurate (e.g. Ekman, 2003; Ekman & Friesen, 1971), although there is debate over whether the expression and recognition of emotions is culturally universal (compare Ekman, 1992, with Jack et al., 2012).
Page 2 of 8 Grenville and Dwyer Cognitive Research: Principles and Implications (2022) 7:15
Given the importance of the transmission and perception of emotions through facial expressions, the fact that one key protective action in the current COVID-19 pandemic is the wearing of face masks (World Health Organisation, 2020), which obscure part of the face, raises the issue of the impact of mask wearing on emotion recognition in faces. Previous studies have shown that emotion recognition is impaired when only part of the face is visible (e.g. Bassili, 1979; Roberson et al., 2012) and so the natural presumption is that face mask wearing will also impair emotion recognition. Indeed, several recent studies have demonstrated exactly this effect (Carbon, 2020; Gori et al., 2021; Gulbetekin, 2021; Noyes et al., 2021; Pazhoohi et al., 2021). However, one potentially problematic feature of all these studies is that they deployed stimuli that were created by using image processing software to artificially impose a face mask, rather than using images of people expressing emotions while actually wearing a mask. The use of graphical manipulation does allow for the standardisation of the emotional expression between mask and no-mask conditions (Carbon, 2020); however, it also brings with it the possibility of stimulus artefacts due to the graphical manipulation itself that may impact on emotion recognition. Perhaps more importantly, it may be the case that some aspects of the contours of the lower part of the face that help demonstrate emotion may be discernible due to their effect on the shape of the mask itself: for example, the movement of the cheeks in a broad smile may raise the upper part of a mask or the opening of the mouth in surprise may stretch it vertically. Moreover, the facial expression of emotion may change when a person is wearing a mask (e.g. if they were to amplify, consciously or unconsciously, the emotion expression).
Using image manipulation to impose masks negates the possibility of investigating such things in a naturalistic manner and may give an inaccurate assessment of the impacts of mask wearing on facial expression recognition.
While all studies of mask wearing effects on facial emotion recognition report generally deleterious effects of the mask on accuracy of recognition, the effects were not consistent across emotions. Gulbetekin (2021) and Pazhoohi et al. (2021) reported a reduction in accuracy for all emotions tested (although with different effect sizes across emotions); Carbon (2020) reported recognition deficits for angry, disgusted, happy, and sad, but not for fearful or neutral expressions; and Noyes et al. (2021) reported deficits for angry, disgusted, fearful, happy, and surprised, but not for sad or neutral expressions. Firstly, the fact that the effect of covering the mouth and lower face with masks differs across emotions is consistent with studies using the "bubbles" method (Gosselin & Schyns, 2001) or related methods, which suggest that the most diagnostic areas of the face differ across emotions (e.g. Blais et al., 2012; Smith et al., 2005; Wegrzyn et al., 2017), and with studies of occlusion that have revealed differences between emotions in the effects of covering the eye vs mouth regions (e.g. Beaudry et al., 2014; Kotsia et al., 2008; Schurgin et al., 2014). Secondly, the fact that the heterogeneity of effect of masks across emotions was not consistent across studies is perhaps unsurprising given that prior studies of the importance of different face features/regions in different emotions have themselves produced somewhat inconsistent results.
From bubbles-based studies it appears that generally the mouth region is most informative for happy, surprised, and disgusted expressions, the eyes for fearful and angry expressions, and both mouth and eye regions for sad and neutral expressions (Blais et al., 2012; Smith et al., 2005; Wegrzyn et al., 2017). However, the only consistent result from comparing occlusion of the eye and mouth regions is that identification of happy expressions is more disrupted by mouth than eye occlusion; other expressions show inconsistent effects. For example, Kotsia et al. (2008) found anger more disrupted by mouth than eye occlusion and disgust more disrupted by eye than mouth occlusion, while Schurgin et al. (2014) reported the opposite pattern of results.
In addition, only two of the previous studies of the effects of mask wearing on emotion recognition (Carbon, 2020; Pazhoohi et al., 2021) collected data on the confidence of the observers. In both studies, confidence was lower for all expressions when masks were present. Indeed, it is notable that the effect sizes for the effects of mask wearing on confidence were higher than for emotion recognition accuracy, and accuracy was degraded by masks only for some expressions in Carbon (2020). In Carbon's original report, this difference between accuracy and confidence was not considered in any depth and was largely dismissed as being the product of ceiling effects obscuring accuracy differences in some emotions; nor was the difference in effect size for accuracy versus confidence discussed by Pazhoohi et al. (2021). However, if there is truly a discrepancy between the effects of masks on accuracy and confidence of emotion recognition, then it would suggest that observers do not have an accurate understanding of the degree to which their ability to determine emotional state from facial expression is impaired (or not). But whether there is a reliable dissociation between accuracy and confidence has yet to be confirmed because no other studies of mask wearing on emotion recognition collected confidence ratings.
Thus, in the current study we re-addressed the issue of the impact of face masks on the accuracy of emotion recognition, and directly compared graphically manipulated stimuli with stimuli where the emotions were posed by people wearing masks. In addition, we measured both accuracy of emotion recognition and confidence in those recognition judgements across different emotional expressions.

Participants
One hundred psychology undergraduate students from Cardiff University were recruited for the study using the university's online experimental recruitment system. All participants received course credit for their participation. Participants were aged 18 to 38, with a mean age of 19.5 (SD 2.34): 9 were male and 91 were female, and participant-reported ethnicity was 80 white, 15 Asian, 1 black, and 4 mixed. All participants reported normal or corrected-to-normal vision. Participants in the study reported here provided informed consent and the research was approved by the Research Ethics Committee at Cardiff University (Title: Face processing: real and imagined. Ethics Code: EC.16.10.11.4606GA_EG).

Materials
A total of 108 images of six different people (not professional actors or models) expressing an emotion were used. All stimuli were posed by white females who wore black tops and were photographed, facing fully frontally and with the face and shoulders visible, by one of the researchers (EG) against a plain white background. Each person posed the expressions of anger, disgust, fear, happiness, sadness, and a neutral expression: once while wearing a face mask, and once without. The face masks used were disposable medical 3-ply blue face masks that were fixed around the ears to cover the nose, mouth, cheeks, and chin. These images comprised the stimuli for the No Mask and Posed Mask conditions. The stimuli for the Imposed Mask condition were created by graphically imposing an image of the same type of disposable mask over the images posed without a face mask. There were thus six images of each emotion tested in each of the three mask conditions (see Fig. 1 for examples of the stimuli used).
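The arithmetic behind the stimulus set (six posers, six expressions, three mask conditions) can be made explicit with a short sketch. This is an illustration only, not the authors' materials pipeline: the poser identifiers and the `photo_source` field are hypothetical names introduced here, while the emotion and condition labels follow the text.

```python
from itertools import product

# Labels follow the Materials section; poser identifiers are hypothetical placeholders.
EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "neutral"]
MASK_CONDITIONS = ["No Mask", "Posed Mask", "Imposed Mask"]
POSERS = [f"poser_{i}" for i in range(1, 7)]  # six people; names are assumed


def build_stimulus_set():
    """Return one record per image: every poser x emotion x mask condition."""
    stimuli = []
    for poser, emotion, mask in product(POSERS, EMOTIONS, MASK_CONDITIONS):
        # Imposed Mask images are derived from the No Mask photograph,
        # so only No Mask and Posed Mask needed a separately posed photo.
        source = "No Mask" if mask == "Imposed Mask" else mask
        stimuli.append(
            {"poser": poser, "emotion": emotion, "mask": mask, "photo_source": source}
        )
    return stimuli


stimuli = build_stimulus_set()
per_cell = sum(
    1 for s in stimuli if s["emotion"] == "fear" and s["mask"] == "Posed Mask"
)
print(len(stimuli), per_cell)  # 108 6: 108 images in total, 6 per emotion-by-mask cell
```

Each emotion-by-mask cell therefore contains one image per poser, which is why accuracy in a cell is later scored out of 6.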

Design and procedure
The study was entirely within-subjects with the independent variables of mask condition (No Mask, Posed Mask, and Imposed Mask) and emotion (Anger, Disgust, Fear, Happiness, Neutral, and Sadness). The dependent variables were the participants' accuracy and confidence in identifying the emotions portrayed in the photographs.
The study was conducted online using Qualtrics software (Version April 2021, Qualtrics, Provo, UT). Following general information about the study and the provision of consent, participants provided responses to questions about their age, gender, and ethnicity. They were then informed that their task in the experiment was to classify the emotion being expressed in the images presented to them and indicate their confidence in this judgement. For each trial, a single image was presented and a fixed choice of 6 emotions (happiness, sadness, neutral, fearful, disgust and anger) was available below the image. Below these response options a cursor with a scale from 0 to 7 was presented for participants to indicate how confident they were in their answer. One side of the scale (0) was labelled 'very unconfident', and the other side (7) was labelled 'very confident'. There was no time limit per trial and the stimuli remained on screen until answers were provided for both the emotion displayed and confidence. The 108 trials were presented in random order. The experiment took approximately 20 min to complete.
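The trial logic described above (random presentation order, a six-option forced choice, and a 0 to 7 confidence rating, both required) could be sketched as follows. The study itself was run in Qualtrics; the function names, the `seed` parameter, and the use of integer image indices here are assumptions made purely for illustration.

```python
import random

# Response labels as listed in the procedure; the 0-7 bounds come from the text.
RESPONSE_OPTIONS = ("happiness", "sadness", "neutral", "fearful", "disgust", "anger")
CONFIDENCE_MIN, CONFIDENCE_MAX = 0, 7


def randomised_order(n_trials=108, seed=None):
    """Return the image indices 0..n_trials-1 in a random presentation order."""
    rng = random.Random(seed)  # seed is a hypothetical convenience for reproducibility
    order = list(range(n_trials))
    rng.shuffle(order)
    return order


def is_valid_response(chosen_emotion, confidence):
    """A trial is complete only when both answers are present and in range."""
    return (
        chosen_emotion in RESPONSE_OPTIONS
        and confidence is not None
        and CONFIDENCE_MIN <= confidence <= CONFIDENCE_MAX
    )


order = randomised_order(seed=2021)
print(len(order), len(set(order)))       # 108 108: every image shown exactly once
print(is_valid_response("fearful", 5))   # True
print(is_valid_response("surprise", 5))  # False: not among the six options
print(is_valid_response("anger", 8))     # False: outside the 0-7 scale
```

Shuffling a full list of indices, rather than sampling with replacement, guarantees that each of the 108 images appears exactly once per participant.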

Data handling and analysis
Responses for the emotion displayed were categorised as correct or incorrect, and the number of correct responses (out of 6) per condition was converted to a percentage for each participant. Confidence ratings were averaged across the 6 trials per condition for each participant. Both accuracy and confidence scores were analysed using repeated measures ANOVA with factors of mask condition and emotion (using Greenhouse-Geisser corrections where appropriate). Follow-up analyses of main effects and interactions were performed using t-tests. All analyses were performed using IBM SPSS Version 26. Because the subject pool was predominantly female and of self-reported white ethnicity, with low numbers of participants in other categories, it was not possible to perform a powerful analysis of participant gender or ethnicity. However, a re-analysis including only participants self-reported to be female and white revealed the same general pattern of results as reported below (this re-analysis is reported fully in the "Additional File 1" available with the online version of the paper).

Results
Fig. 2a (showing mean percentage correct emotion identification across the six emotions and three mask conditions) suggests that overall accuracy varied across emotions and was generally better for the No Mask than the Posed Mask or Imposed Mask conditions, but that the effect of mask condition was not consistent across emotions (in particular, the advantage for the No Mask condition appears negligible or reversed for Anger, Fear, and Neutral emotions) [...]
Given the interaction between emotion and mask condition, follow-up tests were performed to compare the different mask conditions for each emotion separately. These revealed that for Anger accuracy was lower for [...]
Given the interaction between emotion and mask condition, follow-up tests were again performed to compare the different mask conditions for each emotion separately. These revealed that for Anger there were no significant differences in confidence between mask conditions [largest t(99) [...]
In summary, these results confirm that the accuracy of emotion recognition from faces is sometimes impaired when the lower part of the face is obscured by masks, but that this effect is not consistent across all emotions tested here. In particular, it was absent or reversed for anger, fear, and neutral expressions, and this is unlikely to be due to ceiling effects because accuracy was highest for happy or sad expressions where there was a clear negative effect of masks. Moreover, while there were some small differences across emotions, accuracy was generally similar for the Posed Mask and Imposed Mask conditions. In terms of the participants' confidence in their emotion judgements, this was generally higher for stimuli not obscured by masks (with the exception of angry faces), and as with accuracy, confidence was generally similar between the Posed Mask and Imposed Mask conditions despite some minor differences across emotions.
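The scoring rules given under Data handling and analysis (percentage correct out of the trials in each cell, mean confidence per cell, and paired follow-up comparisons) can be illustrated with a minimal sketch. It is not the SPSS analysis the authors ran, and it omits the repeated measures ANOVA and Greenhouse-Geisser correction. All data values below are invented for demonstration, the dictionary keys are hypothetical, and the sketch assumes response labels have already been mapped onto the emotion labels (e.g. "fearful" to "fear").

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def score_participant(trials):
    """Score one participant.

    trials: iterable of dicts with keys 'mask', 'emotion', 'response', 'confidence'.
    Returns {(mask, emotion): (percent_correct, mean_confidence)}.
    """
    cells = defaultdict(list)
    for t in trials:
        correct = t["response"] == t["emotion"]
        cells[(t["mask"], t["emotion"])].append((correct, t["confidence"]))
    return {
        cell: (
            100.0 * sum(c for c, _ in rows) / len(rows),  # percent correct in the cell
            mean(conf for _, conf in rows),               # mean confidence in the cell
        )
        for cell, rows in cells.items()
    }


def paired_t(x, y):
    """Paired-samples t statistic and degrees of freedom for two matched lists."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


# Invented example: six 'No Mask' happiness trials for one hypothetical participant.
answers = [("happiness", 7), ("happiness", 6), ("happiness", 7),
           ("neutral", 4), ("happiness", 6), ("happiness", 6)]
trials = [{"mask": "No Mask", "emotion": "happiness", "response": r, "confidence": c}
          for r, c in answers]
scores = score_participant(trials)
pct, conf = scores[("No Mask", "happiness")]
print(round(pct, 2), conf)  # 83.33 6: five of six correct, mean confidence of 6

# Invented per-participant accuracies for two conditions (five hypothetical people).
t_value, df = paired_t([100, 83, 67, 100, 83], [83, 67, 67, 83, 50])
print(round(t_value, 2), df)
```

With 100 participants, a paired comparison of this kind has 99 degrees of freedom, matching the t(99) reported for the follow-up tests.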

Discussion
The most general observation from the results reported here is that the accuracy of judgements of emotion from facial expressions (and confidence in those judgements) is impaired when the faces being judged are partially obscured by a face mask. However, this high-level summary obscures important aspects of the detail of the results: in particular the fact that the impairment in accuracy was not consistent across emotions (and was reversed in some cases) and that the pattern of effects for accuracy and confidence was different. But before turning to these issues, one of the key motivations for the current study was the possibility that previous investigations of the effects of mask wearing may have been misleading due to their reliance on graphically manipulated stimuli where masks were artificially imposed rather than posed directly. Fortunately, this concern does not seem to have been a material one: both accuracy in emotion detection, and confidence in those judgements, were generally similar for stimuli where the emotions were posed by people wearing masks, and where emotions were posed without masks and masks subsequently added by graphical manipulation. Although there were some minor differences between the Posed Mask and Imposed Mask conditions across emotions, even where present, they were generally small compared to the difference from the No Mask condition. Moreover, it should be remembered that there will be differences in the stimuli created by requiring people to pose emotions multiple times (with/without the mask) that could impact on the ease with which those emotions are judged. Thus, while the use of graphical manipulation remains in principle a potential limitation in studies of this kind, in practice it does not appear to have any materially deleterious effects.
Turning to the observation that impairments in accuracy were not consistent across all emotions tested, this general result is not entirely novel: both Carbon (2020) and Noyes et al. (2021) reported interactions between the effects of mask wearing and emotional expression, with the disruption produced by masks not seen in all emotions. That said, both the specific emotions (fearful and neutral for Carbon; sad and neutral for Noyes et al.) and the explanation offered (ceiling effects for Carbon; specific importance of features in the mouth region for these emotions for Noyes et al.) differed. Our results are consistent with the lack of effect for neutral expressions reported previously (note also that Gulbetekin (2021) reported the smallest effect was for neutral expressions), but also suggested that accuracy was, if anything, higher for anger and fear when judging faces wearing masks. The consistency of effects (or more precisely, lack of effects) for neutral stimuli may reflect prior observations that the eye region is reasonably diagnostic for such (lack of) expression (e.g. Blais et al., 2012; Smith et al., 2005; Wegrzyn et al., 2017) and so may not be expected to be particularly disrupted by occlusion of the mouth region by a mask. It would certainly seem unlikely that ceiling effects (see Carbon, 2020) could be a general explanation because the neutral expression was recognised with less accuracy in the absence of masks both here and in Carbon (2020) or Noyes et al. (2021) compared to some expressions where there was a mask impairment in accuracy. With respect to the observation that, if anything, accuracy was higher in the presence of masks for anger and fear, it is true that obscuring part of the face may actually improve the accuracy of emotion judgements in at least some circumstances (e.g. Roberson et al., 2012), but such observations are relatively rare.
Thus, the current observation of improved detection of anger or fear in the presence of masks may need to be interpreted with some caution in the absence of replication (however, even if this "reversal" of the overall negative effects of masks on these emotions is not reliable, the fact that mask effects are not consistent across emotions would remain, along with the dissociation between this and the consistent effects of mask wearing on confidence). Moreover, while Carbon (2020) reported a lack of impairment produced by masks with fear, Noyes et al. (2021) did not, while both reported impairments for anger (which was not replicated here), and Pazhoohi et al. (2021) report deficits for all expressions examined. In addition, there does not seem to be a clear fit with prior studies of the diagnostic features of the face (e.g. Blais et al., 2012; Smith et al., 2005; Wegrzyn et al., 2017), because they have suggested that the mouth region is typically diagnostic for at least some of the emotions not impaired by masks (but see Schurgin et al., 2014). Thus, while the fact that mask wearing does not impair all emotions equally is clear across all studies of this issue, there remains uncertainty over which expressions are most (or least) affected. It is also the case that there is little direct match between the effects of mask wearing and prior studies of the diagnostic face features for different emotions. While speculative, one possible account of this heterogeneity of effects across studies is that specific details of the stimuli may have influenced the results. In this light, it is noteworthy that the stimuli from all studies are quite dissimilar: here, photographs posed by young females (not professional actors/models) specifically for the current study; Carbon (2020), white male and female professional actors/models taken from a wide range of ages in the FACES database (Ebner et al., 2010); Noyes et al. (2021), male and female professional actors/models with a narrow age range across a variety of ethnic backgrounds from the NIMSTIM database (Tottenham et al., 2009); and Pazhoohi et al. (2021), white male and female professional actors/models taken between 19 and 31 years of age also from the FACES database (Ebner et al., 2010).
The other noteworthy aspect of the current results is the discrepancy between the effects of mask wearing on the accuracy of emotion detection and the confidence in those judgements: the participants were generally more confident (anger aside) when viewing faces that were not wearing masks, despite the fact that for some emotions accuracy was either improved by mask wearing or unaffected by it. A similar dissociation was also present (but largely dismissed as an artefact of ceiling effects for accuracy in some emotions) in the results of Carbon (2020), with confidence higher for unmasked faces across all emotions, but accuracy not impaired by masks for all emotions (and similarly, Pazhoohi et al. (2021) report larger effect sizes for mask wearing on confidence than on accuracy). It seems very unlikely that ceiling effects could offer a general account of this dissociation: both here and in Carbon (2020) mask-related accuracy impairments were seen for emotions that were recognised with higher levels of precision [...]

Significance statement
The transmission and perception of emotional states via facial expressions is a key part of human social interactions. At the same time, wearing face coverings is currently a key public health measure required or strongly recommended as a means of reducing the risk of COVID-19 transmission in many countries. Because face masks obscure much of the lower part of the face, they potentially reduce the information an observer can obtain about emotion from the face. But the degree and generality of any actual impairment in emotion recognition has yet to be determined. The current work builds upon the small number of prior studies of this issue, confirming that emotion recognition is impaired by face masks (and that this is not an artefact created by artificially imposing masks onto previously posed images), that this impairment varies across different emotions (and is absent or potentially reversed in some emotions), and that observers over-generalise when their emotion recognition will be impaired by face masks.

Authors' contributions
EG performed the experiment as their final year undergraduate research project under the supervision of DMD. EG conceived the original experimental idea, which was refined by DMD. EG created all stimuli and experimental materials, collected all data, and performed the initial statistical analysis, with DMD overseeing these processes. EG's project report formed the initial draft of the manuscript, which was re-drafted for publication by DMD (including creating figures and performing additional data analysis). Both authors read and approved the final manuscript.

Funding
There was no external funding for this research.

Availability of data and materials
The datasets used during the current study are available from the corresponding author on reasonable request.