The impact of theatre on social competencies: a meta-analytic evaluation

ABSTRACT
Background: There has been a growing interest in using artistic interventions as a method of developing interpersonal competence. This paper presents a meta-analysis evaluating the impact of theatre interventions on social competencies.
Methods: Twenty-one primary studies totaling 4064 participants were included, presenting evidence available since 1983. Included studies were assessed in terms of quality, heterogeneity, and publication bias.
Results: Our findings indicated that active theatre participation significantly improved participants' empathic abilities, social communication, tolerance, and social interactions, with the largest pooled effect size for social communication (0.698) and the smallest for tolerance (0.156). Our findings did not corroborate the impact of theatre on self-concept.
Conclusions: This paper shows that theatre interventions have a positive impact on social competencies. The paper makes a methodological contribution by showing that randomized and non-randomized studies yielded comparably valid results.


CONTACT Kamila Lewandowska, kamila.lewandowska@e-at.edu.pl, Aleksander Zelwerowicz National Academy of Dramatic Art in Warsaw, ul. Miodowa 22/24, Poland. Supplemental data for this article can be accessed online at https://doi.org/10.1080/17533015.2022.2130947

Introduction
There has been a growing interest in using artistic interventions as a method of developing interpersonal competence. The idea that artistic activity can encourage interpersonal contact, increase group cohesion, or develop tolerance, and in so doing address social issues as well as the mental and physical problems of individuals, has been promoted by many professionals, non-profit institutions, and governmental agencies (e.g. Carnwath & Brown, 2014; Matarasso, 1997; McCarthy et al., 2004). Evaluation of the benefits of art has become central to the design of cultural policies (McCarthy et al., 2004) and to debates concerning the legitimacy of public spending in the cultural sector (Belfiore & Bennett, 2010). With the rapid development of community-based arts programs (Bungay & Clift, 2010; Hacking et al., 2008) and the upsurge of interest in measuring the social impacts of the arts (Belfiore & Bennett, 2007), there has been a flood of reports providing evidence for the curative power of artistic programs and initiatives (Crossick & Kaszynska, 2016; Spandler et al., 2007). However, the available reports are based predominantly on qualitative evidence in the form of case studies that focus on the process of psychological change rather than demonstrating a causal relationship between variables and measuring its effect (see literature reviews in Armstrong et al., 2019; Koch et al., 2014; Yotis, 2006). To establish whether artistic interventions are an effective means of achieving social benefits, it is essential to gather, review, and summarize the existing experimental and quasi-experimental quantitative research.
In this paper, we intend to fill this gap and provide a meta-analytic evaluation of the effect of active theatre participation on social competencies. We focus on theatre interventions because, among the different forms of art used in psychological treatments, theatre is particularly focused on social interactions. At the same time, it is one of the most under-researched art genres as far as psychosocial interventions are concerned. Only a small number of meta-analyses focus on theatre and drama (Conrad & Asher, 2000; Kipper & Ritchie, 2003; Lee et al., 2015; Lewandowska & Weziak-Bialowolska, 2020; Ruddy & Dent-Brown, 2008), and there are no meta-analytic evaluations of the impact of theatre on social interactions. Given that the field of research in artistic interventions has expanded considerably in recent decades, especially since 2000 (Feniger-Schaal & Orkibi, 2020; Regev & Cohen-Yatziv, 2018), there is a need for a meta-analysis that summarizes current research findings and examines the impact of theatre on social competencies such as empathy, communication, or the ability to interact with others. These competencies have been commonly associated with theatre art, but the effects of theatre interventions on their development have not been explored so far.
In what follows, we offer a narrative literature review, present the methodology used in this paper (including the criteria of study eligibility, search methods, data collection procedures, and analytical approaches), and then describe the results of the systematic review and meta-analysis. In the Discussion section, we consider the results in light of current research and reflect on the limitations of our study.

Theatre and social interactions
The link between theatre and real-life interactions became evident when Goffman (1954) introduced theatre as a metaphor for the presentation of the self in everyday situations. Like actors playing on stage, individuals conscious of being observed "stage a character" to guide the impression that other people form of them. Because of this connection between performances and real-life interactions, theatre interventions are believed to be effective in developing social skills and the ability to maintain satisfying relationships with others. Firstly, theatre acting is associated with empathy (Goldstein, 2009; Nettle, 2006). Following Konstantin Stanislavski's Method (1936), performers use their emotional memory to understand the motivations of the protagonist and revive the feelings experienced in situations parallel to the situation of the created persona (Verducci, 2000). Findings suggest that this process involves two types of empathic abilities: cognitive empathy (perspective taking), which enables individuals to understand the mental states of others, and emotional empathy (the vicarious sharing of emotion), which refers to the affective response to other people's emotions and states (Smith, 2006).
Secondly, because theatre interventions are group-oriented and take place in an interactive, social context (Emunah, 1994), they provide opportunities for socialization (2014, p. 2014). Group theatre-making facilitates the development of social ties and encourages interpersonal contact by enhancing spontaneous interactions. Research shows that because managing distance between the self and the role is particularly relevant to theatre art, theatre-making can be an effective "distancing device", enabling participants to find a balance between social closeness and separation (1982; Landy, 1983).
Thirdly, theatre interventions are believed to increase confidence and self-esteem. It has been argued that acting on stage facilitates the process of "the deconstruction of an old self and the creation of a new identity" (Snow et al., 2003, p. 74), which allows actors to re-develop a more powerful self. Participation in a theatre play evokes feelings of joy, pride, and fulfillment (Emunah & Johnson, 1983; Snow et al., 2003) and allows participants to gain self-awareness, leading to improved self-image and more acceptance of oneself (Holmqvist et al., 2017).
Theatre instructors and therapists use different techniques of active theatre participation. For example, theatre seminars based on Boal's Theatre of the Oppressed as well as Playback Theatre training were found to improve empathic skills, perspective-taking, and caring behavior in different target groups (Bhukhanwala & Allexsaht-Snider, 2012; Bodenhorn & Starkey, 2005; Moran & Alon, 2011; Ng & Graydon, 2016). Drama-based interventions have been used to increase self-esteem among individuals with mental illness (Orkibi et al., 2014), adults with learning disabilities (Hackett & Bourne, 2014), women afflicted with breast cancer (Mattson-Lidsle et al., 2007), and children who struggle with the feeling of abuse or rejection (Moore et al., 2017). Imitation and modeling techniques, such as role-playing and masks, have been considered helpful in understanding criminal and offending behaviors and have been used to enhance prosocial behavior among prisoners (Harkins et al., 2011) and adolescent bullies (Burton, 2010).
The majority of the above-mentioned studies use observational methods such as interviews, participant recollections, or video-based ethnographies. Our study, focused on experimental studies and quasi-randomized controlled trials, aims to collect and analyze the existing evidence and look at causal relationships (as required in the health sciences).

Inclusion and exclusion criteria
In order to identify suitable experimental studies, we defined the following eligibility criteria using the PICOS framework (participants, interventions, comparisons, outcomes, study design; Higgins & Green, 2008):
(1) Participants. Our study starts with a broad scope regarding types of participants. There were no restrictions on age, gender, ethnicity, or condition.
(2) Interventions. Our study evaluates interventions based on active theatre participation. This intervention type includes workshops and sessions utilizing drama therapy and performance techniques, such as Playback Theatre, Theatre in Education, Applied Theatre (McNamara, 2006), therapeutic theatre (Snow et al., 2003), or participatory theatre (Erel et al., 2017). Eligible interventions were based on creative practices, such as theatrical games and exercises, mask work, or improvisation of fictional scenes. At the same time, we excluded trials that utilized methods that draw from theatre arts but are closer to psychotherapy than to theatre-making. For example, interventions based on psychodrama were excluded, because psychodrama treatments typically focus on one person in a group at a time and are oriented more toward individual therapy and less toward social interactions (Emunah, 1994). Moreover, the goal of psychodrama is to address actual life dilemmas rather than to practice art-making using metaphors and imaginary material (Blatner & Blatner, 1988; 2014; Emunah, 1994; Kedem-Tahar & Felix-Kellerman, 1996; Wilkins, 1999).
(3) Comparisons. Interventions against which active theatre participation was compared (comparisons) included both active (alternative activity) and inactive (no activity) control groups.
(4) Outcomes. We focused on interventions aiming to improve social competencies. In the selected studies, we repeatedly reviewed the outcome measures in terms of their comparability and organized them into the following outcome clusters: Empathic abilities, Social interactions, Social communication, Tolerance, and Self-concept. The clustering approach was borrowed from another meta-analysis (Koch et al., 2014). We analyzed heterogeneity within each cluster to investigate whether the results of the studies were comparable.
(5) Study design. Eligible trials used comparable intervention and control groups and measured outcomes pre- and post-intervention. Studies that did not use control groups were excluded. We did not restrict our research to randomized controlled trials because randomization is not always feasible in theatre interventions (Goldstein, 2015) and rigorously conducted trials are generally scarce in the area of arts-based treatments (Jones, 2015; Ruddy & Dent-Brown, 2008; Yotis, 2006). Instead, we allowed experimental studies and quasi-experimental studies. The risk of bias was assessed to ensure high methodological quality of the included studies.
In addition, studies were analyzed in terms of data availability. When any of the necessary data were missing and we could not calculate them from the reported study results, the corresponding authors were contacted. If the authors did not respond or could not provide the missing data, the study was excluded. We also excluded studies where the presentation of results, because of inconsistency or lack of clarity, raised doubts about the quality of the study (see Figure 1 for details).

Search methods and selection of trials
Seven databases were searched: PubMed, Cochrane CENTRAL, PsycINFO, EMBASE, Web of Science, ERIC, and Science Direct. We restricted our search to publications written in English. Given that each bibliographic database had a different search mode, our search parameters had to be modified accordingly but, in general, we used the keywords "theatre OR drama" with the additional search terms "trial OR random OR control OR experiment*". The Boolean operator "NOT" with phrases like "operat* theat* OR surgical theat*" was used to filter out irrelevant search results. We did not apply date or study field restrictions. Additionally, we conducted manual searches of three journals dedicated specifically to the topic of arts and health: The Arts in Psychotherapy, Arts & Health, and Dramatherapy. To this, we added studies identified through Google Scholar searches. Reference lists of the already included studies were screened to identify possibly relevant trials that were still missing from our selection. The search process is summarized in the PRISMA flow diagram (Figure 1).
The search process led to 467 possibly eligible studies. The first author removed duplicates and screened titles and abstracts of the remaining 345 articles. After selecting relevant studies, full texts were retrieved and analyzed one by one by the first author based on the eligibility criteria agreed upon by all authors. When selection required a judgment call, the authors discussed and jointly decided on the study exclusion. At this stage, the reasons for excluding studies from further analysis were documented in the study dataset. The most typical reasons for exclusion were that the publication type was irrelevant (e.g. it was a review or a study protocol), an inadequate intervention was used (e.g. one that combined theatre with other art, or non-art, forms and did not present separate data on the effects of theatre activities), the study measured inadequate outcomes (e.g. health-related knowledge), used an inadequate design (e.g. no control group), or did not provide the necessary data.

Data extraction
For each of the included studies, we extracted its title, author(s), and year of publication, as well as the following PICOS information:
• participants' characteristics (mean age of sample, sex, participant condition),
• type of intervention (for example, playback theatre, theatre in education, creative drama), duration and frequency of the intervention,
• activity of control groups,
• outcomes of interest (some of the included studies measured more than one outcome of interest; for example, Dow et al. (2007) reported measures concerning empathetic abilities and social communication; the "Outcome" column in Table 1 shows this overlap),
• assignment to groups: randomized controlled trial, cluster-randomized controlled trial (see Unit of Analysis Issues), quasi-randomized controlled trial (studies in which the method of allocation is not considered strictly random).

Assessing risk of bias
The risk of bias tool recommended by the Cochrane Collaboration (Higgins & Green, 2008) was used to assess the methodological quality of the included studies. Two authors read the papers and independently assessed the risk of bias for each of the studies. Disagreements were resolved through discussion. The authors evaluated risks in five areas: sequence generation, allocation sequence concealment, blinding, incomplete outcome data, and selective outcome reporting. The risk of bias in the first two areas, sequence generation (whether the rule for allocating interventions to participants was based on a random process) and allocation sequence concealment (whether the investigators could, or could not, foresee intervention allocations), was considered "low" when the trials followed strict randomization procedures and "high" in the case of quasi-randomized trials.
In terms of blinding, it is generally impossible to blind participants of artistic interventions to the kind of treatment they receive (Koch et al., 2014), and therefore we were only able to assess whether the personnel or research assistants responsible, for example, for coding or data analysis were blinded. In the five studies that provided such information, evaluators were blinded to group assignment and/or the research hypothesis.
The loss of participants in trials due to withdrawals, dropouts, and protocol failures was evaluated in the following manner: the risk was reported as "high" when the reasons for missing data were likely to be related to the study results, and as "low" when they were unlikely to be related to the study results, when missing data were balanced in numbers across groups, and when plausible values of the missing outcomes would not change the observed effect sizes enough to alter the practical importance of the treatment effect. Finally, in the area of selective outcome reporting, the risk was considered "low" when there was no clear evidence that the outcomes had not been reported in a prespecified fashion and "high" when there was such evidence. Because protocols for the included studies were generally not available, the risk of bias in this area was typically considered "unclear". All studies with "low" and "unclear" risks of bias were accepted for the meta-analysis. Studies that were not strictly randomized (high sequence generation risk) were not excluded, but we conducted a heterogeneity analysis to examine whether randomized and quasi-randomized studies were comparable in terms of effect sizes. Studies with "high" risks in other areas were accepted when we were able to address and resolve the issues that could undermine the validity of the evidence, for example, when the authors provided us with missing information. Because much information was missing from the primary studies, we only present the results related to random group allocation (Table 1).

Unit of analysis issues
Studies that were identified as cluster-randomized controlled trials (CRCT; see Table 1) used the following units of allocation: a classroom (Rousseau et al., 2007), a school (Mora et al., 2015), or a university year/student group (Pfeiffer et al., 2017). Randomization in educational research is particularly difficult (Goldstein, 2015), and cluster-randomized trials are common in school and university settings (Shackelton et al., 2016). The key assumption in a CRCT is that participants within the same cluster tend to behave in a similar manner and therefore should not be treated as subjects independent of one another. Data analysis that does not take clustering into account can lead to false-positive conclusions about the effect of the intervention (Higgins & Green, 2008). Since for all CRCTs the results were reported without taking clustering into account (that is, as if individuals had been randomized), we corrected sample sizes to account for clustering. The effective sample size was calculated by dividing the sample size by the design effect, given by the formula 1 + (M - 1) × ICC (Rao & Scott, 1992), where M is the average cluster size and ICC is the intraclass correlation coefficient (Higgins & Green, 2008, p. 496). The ICC was derived from external sources, as suggested by Higgins and Green (2008, p. 496). For example, for the studies by Rousseau et al. (2007) and Mora et al. (2015), the ICC for the self-concept outcome was sourced from Shackelton et al. (2016). In Pfeiffer et al. (2017), we assumed that the ICC equaled 0 because the clusters consisted of university students who were assigned to short-term project groups.
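The clustering correction described above can be illustrated with a minimal Python sketch. The function names and the numbers in the example are hypothetical and are not taken from any of the included studies; only the formula 1 + (M - 1) × ICC comes from the text.

```python
def design_effect(mean_cluster_size: float, icc: float) -> float:
    """Design effect for a cluster-randomized trial: 1 + (M - 1) * ICC."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


def effective_sample_size(n: float, mean_cluster_size: float, icc: float) -> float:
    """Reported sample size divided by the design effect."""
    return n / design_effect(mean_cluster_size, icc)


# Hypothetical trial (illustrative values only): 200 pupils,
# average class size of 25, externally sourced ICC of 0.05.
deff = design_effect(25, 0.05)
n_eff = effective_sample_size(200, 25, 0.05)
print(f"design effect = {deff:.2f}, effective n = {n_eff:.1f}")

# With ICC = 0 the design effect is 1, so the sample size is unchanged.
print(effective_sample_size(120, 10, 0.0))
```

A larger ICC or larger clusters shrink the effective sample size, which widens the confidence interval of the study's effect size and lowers its weight in the pooled estimate.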

Effect size coding
Cohen's d and its variance were either (1) extracted (if directly reported in the publication), (2) converted from other reported measures (e.g. t-test and p-value), or (3) computed, if the reported measures were insufficient to use options (1) or (2). Regarding the latter, since all included trials reported continuous outcomes, to compute Cohen's d we extracted sample sizes, pretest and posttest means (or only the baseline-adjusted posttest mean if the pretest mean was not available), and standard deviations (or standard errors if standard deviations were not available) for each group (experimental and control), following the strategy for the pretest-posttest-control group design proposed by Morris (2008) and also applied by Friese et al. (2017). Specifically, Cohen's d was defined as the mean pre-post change in the treatment group minus the mean pre-post change in the control group, divided by the pooled pretest standard deviation, which was shown to provide an unbiased estimate of the population effect size and to have a known sampling variance that is smaller than the sampling variance of alternative estimates (Morris, 2008). As shown by Morris (2008), standardizing by the pooled pretest standard deviation yields a more precise estimate of the true effect because interventions typically cause greater variation at the posttest. If only baseline-adjusted estimates were reported, Cohen's d was defined as the difference in means divided by the pooled posttest standard deviation, as suggested by Higgins and Green (2008) and Borenstein et al. (2009).
In the subsequent step, the Hedges' correction factor for small sample bias was applied to Cohen's d to compute Hedges' g and its variance.
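As an illustration of the two steps above, the following Python sketch computes the pretest-posttest-control effect size and then applies the usual approximation of the Hedges small-sample correction, J = 1 - 3 / (4·df - 1) with df = n_T + n_C - 2. The function names and input values are hypothetical. The sampling variance is left out, because under Morris's approach it depends on further quantities such as the pretest-posttest correlation.

```python
import math


def pooled_sd(sd_t: float, n_t: int, sd_c: float, n_c: int) -> float:
    """Pooled standard deviation of two groups."""
    return math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2))


def cohens_d_ppc(pre_t, post_t, pre_c, post_c, sd_pre_t, sd_pre_c, n_t, n_c):
    """(Pre-post change in treatment - pre-post change in control),
    divided by the pooled pretest SD."""
    change_diff = (post_t - pre_t) - (post_c - pre_c)
    return change_diff / pooled_sd(sd_pre_t, n_t, sd_pre_c, n_c)


def hedges_correction(n_t: int, n_c: int) -> float:
    """Approximate small-sample correction factor J."""
    df = n_t + n_c - 2
    return 1 - 3 / (4 * df - 1)


def hedges_g(d: float, n_t: int, n_c: int) -> float:
    """Hedges' g: Cohen's d multiplied by the correction factor."""
    return hedges_correction(n_t, n_c) * d


# Hypothetical study (illustrative values only).
d = cohens_d_ppc(pre_t=20, post_t=26, pre_c=20, post_c=22,
                 sd_pre_t=8, sd_pre_c=8, n_t=30, n_c=30)
g = hedges_g(d, n_t=30, n_c=30)
print(f"d = {d:.3f}, g = {g:.3f}")
```

The correction factor is always slightly below 1, so g is a little smaller than d, and the difference fades as the groups get larger.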
For studies that presented continuous data from different scales rating the same outcome, we computed one effect size per outcome for each comparison. Since these effect sizes are not statistically independent, we averaged them, taking into account the correlation between the scales when computing the standard error of the effect size (Borenstein et al., 2009, p. 230). Since the correlation between the scales was not known, we assumed it was 0.5 for each pair of scales. We recognize that there is a disjuncture between our approach and the theoretically supported methods (for a review, see Marín-Martínez & Sánchez-Meca, 1999; Friese et al., 2017). However, we applied the best approach that was achievable and that is also recommended by Borenstein et al. (2009).
As we are aware that this assumption may influence the results and final conclusions, we examined the robustness of this approach by testing two scenarios. In the first, the outcomes were assumed to be poorly correlated (Pearson's correlation equal to 0.1). In the second, the outcomes were assumed to be strongly correlated (Pearson's correlation equal to 0.9). The results (presented in Table A1 in Appendix 1) were robust to the level of correlation between the examined outcomes. Regardless of the strength of the correlation tested (r = 0.1, r = 0.5, and r = 0.9), the size of the effect of theatre interventions on the examined outcome remained comparable.
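The averaging of correlated effect sizes and the sensitivity check over r can be sketched as follows. This is a minimal Python illustration of the composite formula for the mean of m correlated outcomes with a common assumed correlation r, where the variance of the mean is (1/m)² × (Σ V_i + Σ_{i≠j} r √V_i √V_j). The function name and the input values are hypothetical.

```python
import math
from itertools import combinations


def composite_effect(effects, variances, r):
    """Average several effect sizes from the same sample, assuming a
    common correlation r between every pair of scales.

    Returns (mean effect, variance of the mean).
    """
    m = len(effects)
    mean = sum(effects) / m
    # Each unordered pair contributes twice to the sum over i != j.
    cov_sum = sum(2 * r * math.sqrt(vi * vj)
                  for vi, vj in combinations(variances, 2))
    var = (sum(variances) + cov_sum) / m**2
    return mean, var


# Hypothetical study with two scales for one outcome (illustrative values).
effects = [0.30, 0.50]
variances = [0.04, 0.04]

for r in (0.1, 0.5, 0.9):
    mean, var = composite_effect(effects, variances, r)
    print(f"r = {r}: g = {mean:.3f}, SE = {math.sqrt(var):.3f}")
```

In this sketch the averaged effect does not depend on r. Only its standard error changes, and with it the weight the study receives in the pooled estimate, which is one reason why the pooled results can stay similar across the three scenarios.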
In studies where authors reported multiple observations for outcomes at several time-points [e.g. Mora et al. (2015) reported results measured three times after the intervention: after 1 month, 5 months, and 13 months], we adopted the initial time-point meta-analysis (ITM) approach (Peters & Mengersen, 2008) and collected evidence at the first time-point for each study (typically, measures obtained right after the end of an intervention). We chose this approach for the sake of consistency, as in the majority of the studies there was no long-term follow-up and post-treatment measurements took place only once, shortly after an intervention. When we dealt with a multi-intervention study (i.e. one consisting of more than one intervention group), we collected data only for the intervention relevant to our study (theatre) and for the group considered by the authors as a "control group".
All outcomes were revised with respect to scale orientation. Consequently, scores on "negative" outcomes were reversed so that a positive effect size indicates better performance of the treatment group than the control group. The results, together with the details on measurement scales and subscales, ranges, and calculation details (sum or mean, pole change, etc.), are presented in Table 2.

Meta-analytic procedure
Meta-analysis was conducted using a random-effects model with the DerSimonian-Laird estimation method (DerSimonian & Laird, 1986), as it was unreasonable to assume that there is one true, "fixed" population effect (Hedges & Vevea, 1998). The meta-analysis was performed in Stata 15 using the metan command, which allows the application of inverse-variance weights.
Since studies collected for meta-analyses should be sufficiently similar to allow for the estimation of the summary effect, a heterogeneity analysis was conducted. The heterogeneity of effect sizes was examined using the Q-statistic and the I² statistic (Borenstein et al., 2009; Higgins et al., 2003). We considered an effect to be homogeneous if the Q-statistic was not significant and I² indicated a small level of heterogeneity (I² < 50%; e.g. Cuijpers et al., 2009).
To account for publication bias (studies that report relatively high effect sizes are more likely to be published than studies that report lower or non-significant effect sizes; Borenstein et al., 2009; Friese et al., 2017), we applied the funnel plot (a scatter plot of each study's effect estimate, usually on the x-axis, against some measure of the precision of the effect, usually on the y-axis; Langan et al., 2012) and Egger's regression test (which provides data on the significance of the relationship; Borenstein et al., 2009). Since larger studies usually provide more precise estimates, less variability is expected at the top of the funnel plot, where large studies are generally clustered around the mean effect size. Higher variability is expected at the bottom of the funnel plot, where smaller studies, which usually have less precise estimates of effect, are located (they are expected to be spread across a broad range of values). In the presence of publication bias, the plot appears asymmetrical, with smaller studies concentrated on one side (either left or right). It is worth noting that an asymmetrical funnel plot does not automatically indicate the presence of publication bias. Asymmetry could also result from heterogeneity caused by factors associated with precision or from within-study biases, which may be more likely to emerge in small studies than in larger ones (Langan et al., 2012).
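For readers who want to see the arithmetic behind the random-effects pooling and the heterogeneity statistics, here is a minimal Python sketch of the DerSimonian-Laird estimator with Cochran's Q and I². It is not the Stata routine used in the analysis; the function name and the input data are hypothetical.

```python
import math


def dersimonian_laird(effects, variances):
    """Random-effects pooling with the DerSimonian-Laird estimate of the
    between-study variance (tau^2), plus Cochran's Q and I^2 (in percent)."""
    k = len(effects)
    w = [1.0 / v for v in variances]                    # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sw - sum(wi**2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)                       # truncated at zero
    w_star = [1.0 / (v + tau2) for v in variances]      # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {"pooled": pooled, "se": se, "tau2": tau2, "q": q, "df": df, "i2": i2}


# Hypothetical effect sizes and variances (illustrative values only).
res = dersimonian_laird([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
ci_low = res["pooled"] - 1.96 * res["se"]
ci_high = res["pooled"] + 1.96 * res["se"]
print(f"g = {res['pooled']:.3f}, 95% CI ({ci_low:.3f}, {ci_high:.3f})")
print(f"Q = {res['q']:.2f} (df = {res['df']}), I2 = {res['i2']:.1f}%")
```

When Q does not exceed its degrees of freedom, tau² is set to zero and the random-effects estimate coincides with the fixed-effect, inverse-variance estimate.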
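Egger's regression test mentioned above can be sketched in a few lines of Python. In its usual formulation, the standardized effect (effect divided by its standard error) is regressed on precision (one over the standard error), and an intercept that differs from zero indicates funnel-plot asymmetry. The function name and data below are hypothetical, and the sketch stops at the t-statistic, which would be compared with a t distribution with k - 2 degrees of freedom.

```python
import math


def egger_regression(effects, std_errors):
    """Ordinary least squares of (effect / SE) on (1 / SE).

    Returns (intercept, slope, se_intercept, t_intercept).
    """
    x = [1.0 / se for se in std_errors]                  # precision
    y = [g / se for g, se in zip(effects, std_errors)]   # standardized effect
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    resid_var = sse / (n - 2)
    se_intercept = math.sqrt(resid_var * (1.0 / n + x_bar**2 / sxx))
    t = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, slope, se_intercept, t


# Hypothetical data in which less precise studies report larger effects.
effects = [0.25, 0.30, 0.42, 0.55, 0.80]
std_errors = [0.08, 0.12, 0.18, 0.25, 0.35]
intercept, slope, se_int, t = egger_regression(effects, std_errors)
print(f"intercept = {intercept:.3f} (SE = {se_int:.3f}), t = {t:.2f}, df = {len(effects) - 2}")
```

If the effect size is unrelated to study precision, the fitted line passes close to the origin and the intercept is near zero.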

Characteristics of the included studies
The characteristics of the 21 studies included in this meta-analysis are presented in Table 1. Participants were almost evenly distributed by gender (49% women) and age: 10 studies worked with children (five studies) and young people (five studies), and 10 studies involved adults (six studies) and elderly participants (four studies). Most of the young participants were non-clinical (regular school and university students), although one study involved participants with autism spectrum disorder. Studies that involved adult and elderly participants worked mostly with diagnosed patients and at-risk populations (older adults, burnout symptoms). Diagnosed participants included patients (54) with a diagnosis of idiopathic Parkinson's disease of moderate severity (Hoehn-Yahr stage 2-4; Corbett et al., 2016; Mirabella et al., 2017), dementia (100 participants; average duration of illness in years M = 2.8; Van Dijk et al., 2012), schizophrenia (16 participants; more than 2 years of illness; Spencer et al., 1983), breast cancer (36 patients after surgical and/or radiation therapy; Ostby, 2016), and hemodialysis patients (31 participants; average duration of hemodialysis in years M = 3.5; Sertoz et al., 2009). A heterogeneity analysis was conducted to assess the generalizability of the findings.
Theatre interventions were characterized by active engagement of participants, with forms of engagement varying from being an actor (developing a character, role-playing) to taking part in theatre and drama-based exercises (games, body language techniques, voice training) and being an audience member (for example, in Playback Theatre interventions, participants tell their stories and then watch them enacted, that is, played back, by professional actors on the spot). Typically, different forms of engagement (e.g. being an actor and a spectator) were combined within a single intervention. In most cases, intervention groups were guided by trained instructors, including theatre professionals (actors, playwrights, theatre professors, etc.) and individuals (teachers, caregivers, peers, etc.) who received theatre training before the intervention. At the same time, the information on their training level was vague and difficult to systematize (the studies reported that the training was provided by, e.g., "a theatre professor", "theatre company", two individuals who "have training in psychology and/or creative arts therapies", etc.) or not provided at all (in 13 out of 21 studies), so we were unable to analyze the effects of instructors' qualifications.
An example of a theatre intervention: "The theatrical workshop, led by the theater company, consisted of 6-h daily sessions, for two consecutive days, once or twice per month, for a total of ~18 h/months, for 3 years. The initial part of every workshop focused on exercising basic skills. All subjects were trained in controlling breathing, posture, gait, coordination, and manual tasks. (...) The patients were then taught to approach theater texts and to analyze them. In the second part of the workshop, patients rehearsed singly or in groups, together with actors, based on improvisation or sketches. Sketches were always directed by actors of the company with the aim of recreating on the stage behaviors and emotions that could occur in real life. (...) After periods of training, some of the patients wrote a script and eventually presented it with the help of the director."
In most studies, a theatre session lasted 90 min. Six of the included studies reported a one-time intervention, whereas thirteen studies used interventions that took place once or twice a week for 6 to 24 weeks. Two studies used interventions that lasted more than a year. Control group conditions included no particular activity (six studies), a waiting list (four studies), or a non-theatre activity (10 studies). Types of non-theatre activities varied depending on the type of participants. For example, studies involving school or university students compared theatre activities with extra-curricular reading and lectures, whereas those involving patients typically applied non-theatre therapies (e.g. physiotherapy, rehabilitation, or reminiscence therapy).
Out of the 21 studies, 16 were fully randomized, including three studies that used cluster randomization. The remaining five studies were quasi-randomized controlled trials, that is, trials where the method of allocation is not considered strictly random (e.g. assignment to groups on a consecutive, rolling-entry basis).

Effects of theatre interventions on empathic abilities
Nine studies examined the effects of active theatre interventions on empathic abilities with 1888 participants altogether (N intervention = 972, N control = 916). The pooled estimate of those studies, estimated as the random-effects mean effect size of theatre interventions on empathic abilities, was g = 0.247, p < .001, 95% CI (0.116, 0.379; Figure 2). This indicated that theatre interventions focused on active participation did influence empathic abilities and this impact was moderate. The amount of variance in the observed effect sizes was low (I² = 31.6%). Cochran's Q was 11.69 (p = .165), which implied that the heterogeneity observed in the effect sizes was not significant.
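The statistics reported above are internally consistent, which can be checked with a short Python sketch. It assumes that the confidence interval is symmetric and based on the normal approximation (±1.96 standard errors). That is an assumption of this illustration, not something stated in the text, and the helper names are hypothetical.

```python
import math


def i_squared(q: float, k: int) -> float:
    """I^2 in percent from Cochran's Q and the number of studies k."""
    df = k - 1
    return max(0.0, (q - df) / q) * 100


def se_from_ci(lower: float, upper: float, z_crit: float = 1.959964) -> float:
    """Standard error implied by a symmetric normal-approximation 95% CI."""
    return (upper - lower) / (2 * z_crit)


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


# Values reported for empathic abilities: Q = 11.69 with nine studies,
# g = 0.247, 95% CI (0.116, 0.379).
print(f"I2 = {i_squared(11.69, 9):.1f}%")
se = se_from_ci(0.116, 0.379)
z = 0.247 / se
print(f"SE = {se:.4f}, z = {z:.2f}, p = {two_sided_p(z):.5f}")
```

Under these assumptions, the recomputed I² rounds to the reported 31.6%, and the implied p-value falls below .001.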
The funnel plot (Figure 3) seemed to be slightly asymmetric, with smaller studies grouped mostly on the right, implying that small studies with negative effects might be missing because of publication bias. However, asymmetry of the funnel plot might also result from heterogeneity caused by other factors associated with precision, which seemed to be in line with the fact that two studies (22% of studies) fell outside the region defined by the pseudo 95% confidence interval (dashed sloping lines; Langan et al., 2012). These two studies might be related to a distinct underlying concept. The visual impression was somewhat supported by Egger's test, which indicated no significant relationship between effect sizes and their precision at the .01 significance level but led to the rejection of the null hypothesis (no publication bias) at the .05 level.
These findings suggest that the interpretation of the effect of theatre interventions on empathic abilities should be made cautiously.
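For readers unfamiliar with Egger's test, a minimal sketch of the underlying regression may help. It regresses the standardized effect (effect / SE) on precision (1 / SE) and asks whether the intercept differs from zero; the resulting t statistic would then be compared with a t distribution with k - 2 degrees of freedom. The function name and the input data are hypothetical and are not taken from the included studies.

```python
import math


def egger_intercept(effects, std_errors):
    """Ordinary least squares of (effect / SE) on (1 / SE).

    Returns (intercept, standard error of the intercept, t statistic).
    """
    x = [1.0 / se for se in std_errors]                   # precision
    y = [g / se for g, se in zip(effects, std_errors)]    # standardized effect
    n = len(x)
    if n < 3:
        raise ValueError("at least three studies are needed")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    resid_var = sse / (n - 2)
    se_intercept = math.sqrt(resid_var * (1.0 / n + x_bar ** 2 / sxx))
    # With a perfect fit the t statistic is undefined.
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t_stat


# Hypothetical effect sizes and standard errors for four studies.
intercept, se_int, t_stat = egger_intercept(
    [1.0, 1.5, 2.0 / 3.0, 1.0], [1.0, 0.5, 1.0 / 3.0, 0.25]
)
print(f"intercept = {intercept:.3f}, t = {t_stat:.3f}")
```

An intercept close to zero is consistent with a symmetric funnel plot, whereas a large positive or negative intercept points to small-study effects.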

Effects of theatre interventions on social interactions
Five studies examined the effects of active theatre interventions on social interactions with 264 participants altogether (N intervention = 119, N control = 145). The pooled estimate of those studies, estimated as the random-effects mean effect size of theatre interventions on social interactions, was g = 0.345, p = .004, 95% CI (0.112, 0.579; Figure 4). This indicated that theatre interventions focused on active participation did influence social interactions and that the effect was moderate. The amount of variance in observed effect sizes was minuscule (I² = 0%). Cochran's Q was 2.6 (p = .627), indicating no significant heterogeneity in the effect sizes.
The funnel plot (Figure 5) was found to be symmetric, implying no correlation between effect size and precision. All studies fell within the pseudo 95% confidence interval, implying that the studies estimated the same underlying effect. This visual impression was confirmed by Egger's test, which showed no support for rejecting the null hypothesis of no small-study effects (p = .892).
Figure 2. Forest plot showing the results of random-effects meta-analysis for the nine studies on the effect of theatre interventions on empathic abilities. Notes: ES - effect size, CI - confidence interval, ID - study label. The horizontal axis represents the scale for the size of the effects. The vertical line is the "no effect line" and is located at the value where there is no association between theatre interventions and empathic abilities. Labels for the analyzed studies are shown on the left-hand side. The individual effect size for each study (presented on the right-hand side) is marked with a black diamond at the center of the grey square. Each horizontal line represents the 95% confidence interval of the individual effect size (presented in square brackets on the right-hand side). The area of the grey square is proportional to the corresponding study weight. The diamond at the bottom corresponds to the overall effect size and its width to the 95% confidence interval. The height of the diamond is irrelevant. The random-effects model with the DerSimonian-Laird estimation method was applied to estimate the overall effect size.
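The reported p-value can be cross-checked against the confidence interval. Assuming a normal approximation (an assumption of this sketch; the software used for the original analysis may differ), the standard error is roughly the CI width divided by 2 × 1.96, and the two-sided p-value follows from z = g / SE. The helper name is hypothetical.

```python
import math


def p_from_ci(estimate, lower, upper, z_crit=1.96):
    """Recover SE, z and a two-sided p-value from an estimate and its 95% CI."""
    se = (upper - lower) / (2.0 * z_crit)
    z = estimate / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


# Values reported for social interactions: g = 0.345, 95% CI (0.112, 0.579).
se, z, p = p_from_ci(0.345, 0.112, 0.579)
print(f"SE = {se:.3f}, z = {z:.2f}, p = {p:.3f}")
```

Under these assumptions the recovered p-value rounds to .004, in agreement with the reported value.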
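The DerSimonian-Laird method named in the note above can be sketched in a few lines. The code below is a minimal illustration of the standard estimator, not the authors' implementation; the function name and the three input studies are hypothetical.

```python
import math


def dersimonian_laird(effects, variances):
    """Random-effects pooling with the DerSimonian-Laird tau^2 estimator."""
    k = len(effects)
    w = [1.0 / v for v in variances]                 # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]   # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return {
        "pooled": pooled,
        "se": se,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "q": q,
        "tau2": tau2,
    }


# Hypothetical effect sizes and sampling variances for three studies.
result = dersimonian_laird([0.0, 0.5, 1.0], [0.04, 0.04, 0.04])
print(result)
```

When Q does not exceed its degrees of freedom, tau² is truncated to zero and the random-effects estimate coincides with the fixed-effect one.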

Effects of theatre interventions on social communication
Five studies examined the effects of active theatre interventions on communication with 114 participants altogether (N intervention = 63, N control = 51). The pooled estimate of those studies amounted to g = 0.698, p = .003, 95% CI (0.233, 1.164; Figure 6). This meant that theatre interventions focused on active participation had a strong impact on social communication. The amount of variance in observed effect sizes was moderate (I² = 42%), and Cochran's Q was not significant (Q = 6.9; p = .142), indicating no significant heterogeneity in the effect sizes.
The funnel plot (Figure 7) was found to be symmetric with all studies evenly distributed across the horizontal axis. This implied that studies reporting both positive and negative effects were published. It also meant that no association between effect size and precision was found. This was confirmed by Egger's test, which showed no support for rejecting the null hypothesis of no small-study effects (p = .892). These results ruled out the risk of publication bias.

Effects of theatre interventions on tolerance
Five studies examined the effects of active theatre interventions on tolerance with 1594 participants altogether (N intervention = 812, N control = 782). The pooled estimate of those studies was g = 0.156, p = .002, 95% CI (0.056, 0.254; Figure 8). This implied that the effect of theatre interventions on tolerance was significant but small. The amount of variance in observed effect sizes was low (I² = 9.4%) and not significant, which was confirmed by Cochran's Q statistic (Q = 4.41; p = .353).
The funnel plot (Figure 9) looked slightly asymmetric, with the two smallest studies grouped on the left and indicating null or negative effects. All five studies were within the pseudo 95% confidence interval, implying homogeneity of the examined underlying effects. The visual asymmetry was not confirmed by Egger's test, which showed no support for rejecting the null hypothesis (p = .223); thus, publication bias was not considered further.
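The check of whether a study lies within the pseudo 95% confidence region of a funnel plot can be written down directly: at a given standard error, the region spans the pooled effect ± 1.96 × SE. The sketch below is illustrative only; the function name and the two example studies are hypothetical, and only the pooled value of 0.156 is taken from the text.

```python
def inside_pseudo_ci(effect, std_error, pooled, z_crit=1.96):
    """Return True if a study falls within pooled +/- z_crit * SE at its own SE."""
    half_width = z_crit * std_error
    return (pooled - half_width) <= effect <= (pooled + half_width)


pooled_g = 0.156  # pooled estimate reported for tolerance

# Hypothetical studies: (effect size, standard error).
for effect, se in [(0.30, 0.10), (0.60, 0.10)]:
    print(effect, se, inside_pseudo_ci(effect, se, pooled_g))
```

Studies that fall outside this region are candidates for heterogeneity, that is, they may be estimating a different underlying effect.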

Effects of theatre interventions on self-concept
Six studies examined the effects of active theatre interventions on self-concept with 335 participants altogether (N intervention = 163, N control = 172). The pooled estimate of those studies was not significant [g = 0.134; p = .223; 95% CI (−0.081, 0.349)] (Figure 10). This implied no effect of theatre interventions on self-concept. The amount of variance in observed effect sizes was very low (I² = 0.0%) and Cochran's Q, which examines the presence of heterogeneity in the effect sizes, was not significant (Q = 1.88; p = .866). Analysis of the funnel plot (Figure 11) indicated that all examined studies were within the pseudo 95% confidence interval, providing additional support for the homogeneity of their effects. However, the pattern of points was slightly asymmetrical. While the smallest studies reported positive effects, the largest studies provided evidence of rather null effects. This visual asymmetry seemed to be confirmed by Egger's test (p = .024), with a small reservation: the null hypothesis of no publication bias was rejected at the .05 significance level, but at the .01 level the statistical evidence to reject it was insufficient. We believe that these findings suggest that the lack of an effect of theatre interventions on self-concept should be interpreted cautiously.

Discussion
This meta-analysis evaluated the impact of interventions based on active theatre participation on social competencies. Twenty-one primary studies, totaling 4064 participants, were included, presenting evidence gathered since 1983. Our findings indicated that active theatre participation benefits participants' social competencies. In particular, theatre interventions were found to improve empathic abilities, social interactions, social communication, and tolerance. Pooled effect sizes varied from small to large (0.156-0.698), with the largest effect observed for social communication and the smallest, but still significant, for tolerance. At the same time, our findings did not corroborate the impact of active theatre participation on self-concept. Below, we discuss the results within each outcome.
The analysis of the effect of active theatre participation on empathic abilities revealed a moderate positive effect size, suggesting that these types of interventions are beneficial for increasing empathy. Our result is in agreement with previous findings from observational studies that showed higher empathy levels among participants with acting experience (e.g. Goldstein & Winner, 2012; Moran & Alon, 2011) and demonstrated that both female and male actors are more empathetic than the general population (Nettle, 2006). Our findings, focusing on amateur participants rather than professional actors, contribute to this research by showing that not only a long-term commitment to acting education but also a series of theatre interventions can improve the ability to feel and understand the mental states of other people. Since we found some heterogeneity in the examined effects and a limited number of small studies reporting null or negative effects, this finding should be treated with caution and reevaluated when more studies in the field are published.
A meta-analysis carried out to examine the effects of active theatre participation on social interactions also suggested a moderate positive pooled effect. This result is in line with previous research showing that theatre interventions enhance group processes, develop responsiveness and attention to others, and facilitate interpersonal contact (e.g. Emunah, 1994; Tomasulo & Szucs, 2015).
With regard to social communication, a large positive effect was found (g = 0.698). Together with a number of non-experimental studies (e.g. Karnieli-Miller et al., 2018; Noone et al., 2015), this finding supports the effectiveness of interactive theatre in improving interpersonal communication skills. The large effect might be due to the fact that creative arts, including theatre, are one of the most powerful means of expression and communication and have been recognized as an effective communication-enhancing tool (Fraser & Sayah, 2011). Because the use of metaphors and meaning-making are essential factors in artistic interventions, distinguishing this form of treatment from other types of therapies (Koch, 2017), arts-based activities are particularly useful in enhancing verbal and nonverbal communication between participants in different target groups (Samaritter, 2018).
Our results demonstrate a positive impact of active theatre on tolerance with a small but consistent effect size and suggest that theatre-making may play a useful role in developing more open-minded and accepting attitudes toward others. A similar finding was reported by Greene et al. (2014), who showed that exposure to a broader world through artistic participation increases the ability to understand other people and develop acceptance of different viewpoints.
Our analysis shows no significant influence of theatre interventions on self-concept. This result is consistent with the finding of the meta-analysis conducted by Conrad and Asher (2000), who also reported that, when the results of different studies are pooled together, a significant influence of theatre-making on self-concept disappears. However, our findings should be interpreted with caution due to a publication bias that seemed to be present in the data. Our meta-analysis suggests that more high-quality evidence with larger sample sizes is needed to effectively evaluate the impact of theatre on the outcome in question. Also, future studies should take into account the effect of the populations being studied (e.g. clinical vs. non-clinical and young vs. old), which could be examined using meta-regression. While we collected several characteristics of populations (see Table 1), we could not conduct such an analysis due to the limited number of included trials. However, the results of our heterogeneity analysis provide reassurance that such effects, if present, did not substantially impact the pooled effect size. The amount of variance (measured by I²) in the observed effect sizes was low in four out of five cases and moderate, though not statistically significant, in the remaining one. These results imply that, despite the differences in the characteristics of the included studies, the direction and magnitude of the observed effects were rather comparable, suggesting homogeneity of effects and robustness of our findings across different groups of participants.
To increase the validity of the results and ensure the quality of the evidence, two complementary approaches were applied to the assessment of risk of bias. First, following the Cochrane Collaboration's guidelines, a qualitative analysis of the risk of bias was conducted. All studies with a high risk of bias (mostly due to missing data and selective outcome reporting) were excluded from the analysis. Second, publication bias was examined. In the case of three outcomes, we found no evidence of publication bias, while in the case of two outcomes, the statistical evidence to reject the hypothesis of no publication bias was insufficient (it was rejected at the .05 but not at the .01 significance level). These results allowed us to confirm the effects of theatre interventions on social communication, tolerance, and social interactions. At the same time, the results showing the effects of theatre on empathic abilities and the lack of effect on self-concept should be interpreted cautiously.

Limitations and future research
Even though this research was conducted following the Cochrane Collaboration's guidelines and significant efforts were undertaken to minimize the risk of bias, several limitations should be considered when interpreting the results. Due to the scarcity of rigorous experimental research in the field of artistic interventions, our final sample included a limited number of studies. Although reviews and meta-analyses based on fewer than 10 studies are common and considered a valid source of evidence (Bastian et al., 2010; Mallett & Clarke, 2002), we recognize that larger samples could lead to more precise calculations of effect sizes and strengthen our findings.
There has been much debate over whether the results of randomized trials are consistent with the results of non-randomized trials, and whether randomization is required to avoid bias and provide evidence of causal effects. While some researchers have shown that non-randomized, observational, and correlation-based studies tend to overestimate treatment effects (Ioannidis et al., 2001), others provided evidence that both randomized and non-randomized studies yield very similar results (Benson & Hartz, 2000; Concato et al., 2000). This issue is particularly relevant to the field of art-based treatments, where randomization is often not feasible and a large amount of evidence is accumulated through either quasi-experimental (Goldstein & Winner, 2012) or longitudinal observational studies (Fancourt & Steptoe, 2019; Lewandowska & Weziak-Bialowolska, 2020; Weziak-Bialowolska, 2016). Researchers in this field have argued that non-randomized studies are not necessarily less reliable; for example, Goldstein (2015) found that baseline characteristics of students do not predict the effects of acting classes. In our study, quasi-randomization was addressed through heterogeneity analysis, since the number of included trials was too small to conduct subgroup analyses or meta-regression. The analysis of heterogeneity suggests that randomized and partly randomized trials were comparable in terms of their influence on the pooled effect size.
Several possible shortcomings stem from the design of the included studies. One of them is the fact that in some of the analyzed studies active theatre interventions were tested against activities that were relatively less attractive (e.g. standard school classes or lectures) or were compared with no intervention at all. According to Goldacre (2012, p. 182), testing a treatment against something that is known a priori to be less successful is a frequent source of bias in evidence-based research. We were unable to verify the degree to which this limitation impacted our results. Another critical issue lies in the fact that studies using cluster randomization did not take clustering into account. We adjusted the data to address this problem, but we had to rely on estimations (e.g. intraclass correlation coefficients extracted from external sources). Moreover, proper assessment of selective outcome reporting (whether the outcomes were reported in a pre-specified way) was practically impossible because study protocols in the arts field are generally not available.
Gathering information for meta-analysis was also challenging because many studies failed to report the necessary participant information and descriptive statistics or presented the results inconsistently. The representativeness of samples in terms of race/ethnicity and socioeconomic status was difficult to assess because the relevant information was missing or reported inconsistently across studies. For example, only six studies reported educational status, and while some reported years of education, others documented the percentage of participants with higher education. Furthermore, because a considerable number of studies lacked information on blinding techniques and dropouts, we were unable to make adequate assessments. In addition, descriptions of interventions were often vague, and our categorization of trials relied mostly on the authors' understanding and operationalization of different forms of interventions, that is to say, whether they recognized an intervention as "drama", "psychodrama", "theatre", or something else.
It is crucial to overcome these weaknesses in future research. The effects of theatre interventions should be evaluated through better designed experimental studies. More randomized controlled trials with larger sample sizes are certainly desirable. Control groups should be allocated to activities that are to some extent comparable with theatre (for example, an intervention based on music or sport). Another important point would be to use more representative samples. Only eight studies reported participants' racial/ethnic background, which makes it difficult to draw generalizable conclusions. However, the fact that in those studies 81% of participants were Caucasian suggests that there is an underrepresentation of minorities in evidence-based research on the arts.
Future research should also pay increased attention to meticulous reporting of study design and results. To allow proper assessment of bias, studies need to provide complete information on the methods used (e.g. methods for assigning interventions, blinding techniques) and study protocols need to be published. The inclusion of all descriptive statistics (means, standard deviations, and exact sample sizes adjusted for clustering) needs to become a standard in experimental studies in the field.
Finally, we encourage authors to provide more elaborate and transparent descriptions of interventions and of the qualifications and credentials of instructors. Guidelines on how to report art-based intervention studies (e.g. Robb et al., 2011) can help authors describe their interventions in a way that allows replication and facilitates interpretation of outcomes. Detailed intervention reporting is an important factor for increasing the replicability of arts-based experimental studies (Koch et al., 2014) and the translation of interventions to evidence-based practice.
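A common way to account for clustering, described in the Cochrane Handbook, is to deflate sample sizes by the design effect 1 + (m - 1) × ICC, where m is the average cluster size and ICC is the intraclass correlation coefficient. Whether exactly this adjustment was applied here is not stated in the text, so the sketch below is an assumption; the function name and all numbers are hypothetical.

```python
def effective_sample_size(n, mean_cluster_size, icc):
    """Deflate a sample size by the design effect 1 + (m - 1) * ICC."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC must lie between 0 and 1")
    design_effect = 1.0 + (mean_cluster_size - 1.0) * icc
    return n / design_effect


# Hypothetical trial arm: 200 pupils in classes of 20, externally sourced ICC = 0.05.
print(round(effective_sample_size(200, 20, 0.05), 1))
```

Because the effective sample size shrinks as the ICC grows, an ICC borrowed from external sources directly affects the weight a cluster-randomized study receives in the pooled estimate.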

Figure 3. Funnel plot for the nine studies on the effect of theatre interventions on empathic abilities. Note: This is a scatterplot of study-specific effect sizes on the horizontal axis against the measures of study precision reflected by standard errors of the effect size on the vertical axis. In the absence of small-study effects, publication bias and/or between-study heterogeneity, the plot looks symmetrical (studies should be symmetrically distributed around the overall effect size) and resembles an inverted funnel. As the effect-size estimates from the smaller studies are more variable than those from the larger studies, the scatter is usually wider at the base of the plot. An asymmetrical shape of the inverted funnel may suggest that nonsignificant results of smaller studies are not published and, consequently, not included in the meta-analysis. The solid vertical line represents the overall effect size. Two grey lines correspond to the pseudo (not genuine) confidence interval lines and are presented to provide some insight into the spread of the observed effect sizes about the estimate of the overall effect size. They should form an inverted funnel shape in the absence of small-study effects, publication bias and/or between-study heterogeneity. The random-effects model with the DerSimonian-Laird estimation method was applied to estimate the overall effect size (θ DL).

Figure 4. Forest plot showing the results of random-effects meta-analysis for the five studies on the effect of theatre interventions on social interactions. Note: ES - effect size, CI - confidence interval, ID - study label. Details on other elements of the plot can be found in the note to Figure 2.

Figure 5. Funnel plot for the five studies on the effect of theatre interventions on social interactions. Note: SE - standard error. Details on the elements and interpretation of the funnel plot are presented in the note to Figure 3.

Figure 6. Forest plot showing the results of random-effects meta-analysis for the five studies on the effect of theatre interventions on social communication. Note: ES - effect size, CI - confidence interval, ID - study label. Details on other elements of the plot can be found in the note to Figure 2.

Figure 7. Funnel plot for the five studies on the effect of theatre interventions on social communication. Note: SE - standard error. Details on the elements and interpretation of the funnel plot are presented in the note to Figure 3.

Figure 8. Forest plot showing the results of random-effects meta-analysis for the five studies on the effect of theatre interventions on tolerance. Note: ES - effect size, CI - confidence interval, ID - study label. Details on other elements of the plot can be found in the note to Figure 2.

Figure 9. Funnel plot for the five studies on the effect of theatre interventions on tolerance. Note: SE - standard error. Details on the elements and interpretation of the funnel plot are presented in the note to Figure 3.

Figure 10. Forest plot showing the results of random-effects meta-analysis for the six studies on the effect of theatre interventions on self-concept. Note: ES - effect size, CI - confidence interval, ID - study label. Details on other elements of the plot can be found in the note to Figure 2.

Figure 11. Funnel plot for the six studies on the effect of theatre interventions on self-concept. Note: SE - standard error. Details on the elements and interpretation of the funnel plot are presented in the note to Figure 3.

Figure 1. PRISMA flow diagram for a search of studies on the effects of theatre interventions on social competencies.

Table 1. Characteristics of the included studies.

Table 2. Data for meta-analyses.
a SE: standard error. b J: Hedges' correction factor for small-sample bias.
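To make the role of the correction factor J concrete, the sketch below computes Hedges' g from group means, standard deviations, and sample sizes, using the widely used approximation J = 1 - 3 / (4·df - 1) with df = n_t + n_c - 2. That the values in Table 2 were obtained with exactly these formulas is an assumption; the function name and the example numbers are hypothetical.

```python
import math


def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Return (g, SE of g, J) for two independent groups."""
    df = n_t + n_c - 2
    pooled_sd = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_t - mean_c) / pooled_sd            # Cohen's d
    j = 1.0 - 3.0 / (4.0 * df - 1.0)             # small-sample correction factor
    var_d = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2.0 * (n_t + n_c))
    g = j * d
    se_g = math.sqrt(j ** 2 * var_d)
    return g, se_g, j


# Hypothetical study: theatre group vs. control group, 10 participants each.
g, se_g, j = hedges_g(12.0, 10.0, 4.0, 4.0, 10, 10)
print(f"J = {j:.4f}, g = {g:.4f}, SE = {se_g:.4f}")
```

The correction matters most for small trials: with 20 participants J is about 0.96, whereas with several hundred participants it is practically 1.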