Effects of false-evidence ploys and expert testimony on jurors, juries, and judges

Abstract Triers of fact evaluated trial materials involving disputed confessions, false-evidence ploys (FEPs) during interrogation, and expert testimony. In two experiments, we assessed pre-deliberation and post-deliberation trial decisions as well as individual jurors’ perceptions, deliberating juries’ verdicts, and sitting judges’ perceptions and trial decisions. Judges convicted more often than did juries. Although triers of fact recognized the deception inherent in FEPs, the use of FEPs in police interrogations did not affect these decision-makers’ trial outcomes. Expert testimony, however, affected perceptions and reduced jurors’, deliberating juries’, and sitting judges’ likelihood of conviction. We provide recommendations for courts, scholars, and police interrogators.


PUBLIC INTEREST STATEMENT
In police interrogations, the use of false-evidence ploys (FEPs), false claims to have evidence that implicates the suspect in the crime, remains legal despite experimental and archival evidence that these tactics increase the likelihood of false confessions. We evaluated triers of fact (i.e., individual jurors, deliberating juries, and sitting judges), particularly their perceptions and trial decisions related to deception during interrogation, and we also evaluated the effects of expert testimony. We found that judges were substantially more likely to convict than were juries. Additionally, although triers of fact recognized the deception and coercion present in FEPs, the presence of police deception did not affect trial decisions. Expert testimony induced skepticism in all triers of fact, leading them to be less likely to convict, but did not help them become more sensitive to police deception. Juries and judges provide only limited protections for defendants who recant their confessions after police deception.
individuals falsely confessed, falsely pleaded guilty, or otherwise incriminated themselves, whereas the National Registry of Exonerations (2018) indicates that 12% of exonerated individuals falsely confessed. Because of these stark numbers, scholars have become increasingly interested in the personal, environmental, and situational factors influencing the likelihood that an individual would falsely confess to a crime that he or she did not commit (for reviews see Gudjonsson, 2003; Kassin et al., 2010; Kassin & Gudjonsson, 2004; Leo, 2008; Woody, Forrest, & Stewart, 2011).
The present research targets how triers of fact evaluate and apply disputed confession evidence when reaching a verdict and recommending a sentence. Confession evidence goes beyond the defendant's admission of guilt. Instead, triers of fact are required to review detailed evidence/testimony from the interrogator, the defendant, and if allowed, expert witnesses. Therefore, in addition to the confession itself, triers of fact consider whether techniques known by scientists to contribute to false confessions (e.g., an FEP) could have influenced the defendant's decision to confess. Because a confession is such powerful evidence, the legal system relies on decision-makers such as juries and judges to protect defendants from the consequences of disputed confession evidence (cf. Kassin et al., 2010). As explored subsequently, this study addresses a prominent disconnect in the study of interrogation and confession: scholars recognize the power of FEPs to increase false confession rates (Kassin, Redlich, Alceste, & Luke, 2018) even as courts continue to accept confessions generated by FEPs. After reviewing court decisions guiding triers of fact in their duties, relevant scientific studies, and the potential for expert testimony to shape legal processes, we outline two studies involving juries and judges, their perceptions of confession evidence, and related trial decisions. The confession evidence used in the studies varied in interrogation technique (presence or absence of an FEP) and expert testimony (absent or present).

Jurors and juries
Courts have expressed great confidence in the abilities of jurors and juries to evaluate disputed confessions. For example, the US Supreme Court ruled that it was not unconstitutional to use preponderance of evidence, the lowest legal standard used to decide a verdict, to decide whether to admit a disputed confession to trial. The Court stated that this decision was "not based in the slightest on the fear that juries might misjudge the accuracy of confessions and arrive at erroneous determinations of guilt or innocence" (Lego v. Twomey, 404 U.S. 477, 1972, p. 625). This confidence in a jury's ability to evaluate a confession's veracity became even more relevant after Arizona v. Fulminante, 111 S. Ct. 1246 (1991) extended harmless error analysis to cases involving disputed confessions. By allowing an improperly admitted coerced confession to be a potentially harmless trial error, the US Supreme Court placed "great faith in the ability of a jury to properly evaluate a confession and the evidence about how it is obtained" (Wakefield & Underwager, 1998, p. 437) and assumed that jurors can both recognize and reject coerced confessions (Kassin & Sukel, 1997).
Unfortunately, experimental research reveals that even when jurors recognize coercive techniques during an interrogation and report that they rejected the resulting confession, they remain more likely to convict than do jurors who review identical trial stimuli without confession evidence (Kassin & Sukel, 1997; Kassin & Wrightsman, 1981). In fact, several studies suggest that jurors accept confession evidence in many settings where rejecting the confession may be more appropriate. Examples include confessions from suspects with mental illnesses (Henkel, 2008), confessions following maximization or minimization interrogation tactics (Kassin & McNall, 1991), confessions that conflict with DNA evidence when the prosecution provides a pro-guilt explanation for the discrepancy (Appleby & Kassin, 2016), or secondary confessions from accomplices, even when these accomplices have received rewards for their testimony (Neuschatz, Lawson, Swanner, Meissner, & Neuschatz, 2008; Neuschatz et al., 2012; but see Maeder & Pica, 2014).
The experimental literature mirrors findings from actual trials. For example, despite DNA evidence indicating another perpetrator had committed the crime, Juan Rivera's confession led to his conviction by a jury (People v. Rivera, 2011). As noted by Leo, Neufeld, Drizin, and Taslitz (2013), "confessions exert a strong biasing effect on the perceptions and decision-making of criminal justice officials and lay jurors alike, tending to define the case against a defendant and usually overriding any contradictory information or evidence of innocence" (p. 772; see also Drizin & Leo, 2004; Kassin, 2012, 2017; Leo & Ofshe, 1998). Despite courts' assertions that jurors provide protection against coercion (e.g., Lego v. Twomey, 404 U.S. 477, 1972), jurors appear unable to meet these legal expectations.

Judges
Classic studies by Kalven and Zeisel (1966) found that although judges agree with juries a substantial majority of the time (see also Eisenberg et al., 2005; Heuer & Penrod, 1994), judges are more likely to disagree with juries when the case is "close" (i.e., both sides had strong arguments) rather than "clear" (i.e., when one side appeared stronger; Kalven & Zeisel, 1966, p. 157). When judges and juries disagree, judges are more likely to favor conviction (Eisenberg et al., 2005; Heuer & Penrod, 1994). Similarly, judges are less likely than juries to convict when the evidence is weak, but more likely to convict when the evidence is strong (Eisenberg et al., 2005; Kalven & Zeisel, 1966; see also Gastwirth & Sinclair, 2004). Given that observers perceive confession evidence as clear and strong (Henkel, Coffman, & Dailey, 2008; Kassin & Neumann, 1997), these findings raise important questions about how the decisions of judges and juries may differ in cases involving disputed confessions. Although it is possible that judges would more accurately interpret confession evidence, judges appear to share cognitive biases with other decision-makers (Guthrie, Rachlinski, & Wistrich, 2001) and struggle to ignore inadmissible information (Landsman & Rakos, 1994; Wistrich, Guthrie, & Rachlinski, 2005). In studies that have directly compared judges to laypeople, small differences emerged, but both participant groups used information to make these judgments in "nearly identical" ways (Howe & Loftus, 1992, p. 111; see also Howe, 1991).
Similar to the Kassin and Sukel (1997) study of mock jurors, Wallace and Kassin (2012) presented stimuli that included weak or strong evidence and a confession condition (none, low pressure, or high pressure) to a sample of judges. Of the judges surveyed, 65.7% indicated the high-pressure interrogation was coercive, and almost all judges in this condition viewed the confession as inadmissible. Despite these encouraging findings, judges struggled to reject the confession; they convicted far more often in the weak-evidence/high-pressure condition than in the weak-evidence/no-confession condition, much as did mock jurors in Kassin and Sukel's (1997) experiment.

FEPs
One commonly used police interrogation technique is the FEP, a false claim to have evidence that connects a suspect to the crime (Forrest et al., 2012; Leo, 2008; Woody & Forrest, 2009). FEPs have served as foundational tactics in police interrogation since the transition from coercion to deception in the mid-twentieth century and remain in wide use (Inbau, 1976; Inbau & Reid, 1967; Kassin et al., 2018; Leo, 1992; Woody, in press). Police commonly use FEPs with adult and juvenile suspects (Cleary & Warner, 2016). In a national study of police detectives, 92% reported using FEPs at least some of the time (Kassin et al., 2007).
Despite these concerns, courts have typically admitted to trial confessions generated in part by police deception about evidence. Examples of such instances include false claims of fingerprints, bloodstains, or an accomplice's confession implicating the defendant (see e.g., Frazier v. Cupp, 394 U.S. 731, 1969; People v. Lira, 119 Cal. App. 3d 837, 1981; State v. Cobb, 115 Ariz. 484; 566 P.2d 285, 1977; State v. Jackson, 308 N.C. 549, 1983). This consistent pattern of legal precedents to admit these confessions to trial provides the foundation for their continued use and for recent claims by Inbau, Reid, Buckley, and Jayne (2011) that false confessions are generated not by FEPs but rather by one of many illegal tactics that have already led courts to reject confessions (e.g., physical coercion or explicit threats).
As noted previously, a substantial body of scholarship demonstrates that FEPs increase false confession rates, yet courts continue to accept confessions generated by FEPs. In the present study, we provided jurors, juries, and judges with an opportunity to consider these two perspectives on FEPs. Would triers of fact view FEPs as coercive, in line with experimental and archival findings and the beliefs of experts, or would these triers of fact view FEPs as noncoercive, in line with relevant court precedents and claims by Inbau et al. (2011)?
In general, expert testimony brings small but consistent effects on verdicts in criminal trials (Nietzel, McCarthy, & Kern, 1999). Expert testimony influences jurors by generating either sensitivity, allowing jurors to differentiate between strong and poor evidence (Cutler, Dexter, & Penrod, 1989), or skepticism, leading jurors to doubt the evidence overall (McCloskey & Egeth, 1983). Investigations of jurors who evaluate confession evidence have resulted in mixed findings. For example, Henderson and Levett (2016) found that expert testimony sensitized jurors to the consistency between the case facts and the confession. Jurors who received expert testimony were more likely to convict when the confession was consistent rather than inconsistent with case facts; however, jurors who did not receive expert testimony were unaffected by confession consistency. Similarly, Jones and Penrod (2018) used evidence-based instructions rather than testimony from an expert and found that instructions induced sensitivity to the length of the interrogation and to potentially coercive tactics. Other researchers, however, have found that expert testimony has generated general skepticism. For example, jurors who read expert testimony perceived interrogations as more coercive and deceptive (Woody & Forrest, 2009) than did jurors who did not read expert testimony. Expert testimony has also decreased jurors' guilty verdicts and beliefs in the defendant's guilt, regardless of how the confession was elicited (Gomes, Stenstrom, & Calvillo, 2014; Woestehoff & Meissner, 2016, Exp. 3; Woody & Forrest, 2009). Finally, some scholars report that expert testimony did not lead to sensitivity or skepticism (Jones & Penrod, 2016; Neuschatz et al., 2012; Maeder & Pica, 2014; Woestehoff & Meissner, 2016, Exp. 1), leaving experts' testimony ineffective or unneeded. The conflict between the growing body of scholarship and existing court precedents may be particularly relevant for experts.
Experts who discuss FEPs and other deceptive tactics may find their testimony in conflict with legal precedents and with judges' existing knowledge. We sought to systematically replicate previous investigations of expert testimony on individual jurors' perceptions and trial decisions and to extend these questions to deliberating juries and sitting judges.

Overview and hypotheses
In two studies, we examined individual jurors', deliberating juries', and judges' perceptions and decisions in a simulated trial involving a disputed confession. We evaluated the impacts of expert testimony and an FEP involving an eyewitness during the interrogation. Specifically, we examined the perceptions and pre-deliberation trial decisions of individual jurors, who then formed juries and deliberated to consensus on a verdict. We also examined jurors' post-deliberation perceptions and trial decisions. In a separate study using identical stimuli, we examined perceptions and trial decisions of a sample of sitting judges, and then we compared verdicts of sitting judges and deliberating juries.

Jurors and juries
We hypothesized that the presence of an FEP in the interrogation would impact jurors and juries in similar ways, and we sought to evaluate whether these effects would persist from individual jurors through jury deliberation. Based on Woody and Forrest's (2009) findings, we hypothesized that individual pre-deliberation jurors would perceive the interrogation as more deceptive and more coercive when an FEP was present. We expected, however, that the presence of an FEP would have only limited effects on individual jurors' verdicts and ratings of the defendant's guilt, both pre- and post-deliberation. We then evaluated whether the influence of FEPs on trial decisions extended into the more ecologically valid simulation of deliberating juries (Nunez, McCrea, & Culhane, 2011).
Based on previous findings (Gomes et al., 2014; Woody & Forrest, 2009), we hypothesized that expert testimony would produce a skepticism main effect, leading jurors to perceive the interrogation as more deceptive and more coercive regardless of the presence of an FEP. We also anticipated that across FEP conditions expert testimony would reduce guilty verdicts, guilt ratings, and length of sentencing recommendations. Although we expected a skepticism effect (see Woody & Forrest, 2009, from which we derived materials), we designed these studies so that we could also detect a potential sensitivity effect. A sensitivity effect would be present if expert testimony increases deception and coercion ratings, and decreases guilty verdicts, perceptions of guilt, and recommended sentences, in FEP-present conditions but not FEP-absent conditions.
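The distinction between skepticism and sensitivity can be illustrated as the difference between a main effect and an interaction in the 2 (FEP) × 2 (expert) design. The sketch below is only an illustration and is not part of the reported analyses: the conviction proportions are hypothetical, and the function names are our own. It computes the expert effect on the log-odds of conviction within each FEP condition, and the interaction as the difference between those two effects.

```python
import math


def log_odds(p):
    """Convert a conviction proportion into log-odds."""
    return math.log(p / (1 - p))


def expert_effects(cells):
    """cells maps (fep_present, expert_present) -> conviction proportion.

    Returns the expert effect (in log-odds) without an FEP, the expert
    effect with an FEP, and their difference (the interaction term).
    """
    effect_no_fep = log_odds(cells[(0, 1)]) - log_odds(cells[(0, 0)])
    effect_fep = log_odds(cells[(1, 1)]) - log_odds(cells[(1, 0)])
    return effect_no_fep, effect_fep, effect_fep - effect_no_fep


# Hypothetical proportions: the expert lowers convictions in both FEP conditions.
skepticism = {(0, 0): 0.55, (0, 1): 0.40, (1, 0): 0.55, (1, 1): 0.40}
# Hypothetical proportions: the expert lowers convictions only when an FEP is present.
sensitivity = {(0, 0): 0.55, (0, 1): 0.55, (1, 0): 0.55, (1, 1): 0.40}

for label, cells in (("skepticism", skepticism), ("sensitivity", sensitivity)):
    no_fep, fep, interaction = expert_effects(cells)
    print(f"{label}: no-FEP effect {no_fep:.2f}, FEP effect {fep:.2f}, "
          f"interaction {interaction:.2f}")
```

Under the skepticism pattern the interaction is zero and only the expert main effect remains; under the sensitivity pattern the expert effect appears only in the FEP-present cells, so the interaction is nonzero.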

Judges
Based on findings by Wallace and Kassin (2012) as well as others who have compared judicial to jury decision-making, we expected judges to evaluate FEPs similarly to jurors. As discussed previously, the US Supreme Court and other courts have consistently accepted confessions induced in part by FEPs, even though this acceptance conflicts with experimental and archival data and experts' beliefs. We therefore expected judges, as legal experts, to recognize the greater deception in FEPs but not to view these tactics as more coercive, and we sought this first known opportunity to assess the influence of FEPs on sitting judges' trial decisions. We expected that expert testimony would affect judges in similar ways to jurors, leading to general skepticism in the form of higher ratings of deception and coercion, reduced likelihood of guilty verdicts, and shorter sentences. Finally, as with jurors and juries, we designed our study with the opportunity to evaluate a potential sensitivity effect such that judges in FEP-present/expert-present conditions would differ from judges in the FEP-absent/expert-present conditions, while judges without expert testimony would not differ as a function of FEP.

Comparisons
Based on previous findings that judges were more likely to convict when the evidence was strong or clear (Eisenberg et al., 2005; Gastwirth & Sinclair, 2004; Kalven & Zeisel, 1966), we expected judges to be more likely than juries to convict the defendant when faced with confession evidence, which participants consistently rate as powerful (e.g., Henkel et al., 2008; Kassin & Neumann, 1997).
Additionally, because judges use legal precedents, we expected FEPs to have smaller impacts on judges than on juries. Similarly, we expected the presence of expert testimony to have smaller impacts on expert legal decision-makers than on juries.

Participants
Six hundred twenty-two university students (238 male, 381 female, three unreported) participated to fulfill a course requirement (see Bornstein, 1999; Bornstein et al., 2016, for a review and meta-analysis, respectively). A majority (n = 441, 71%) were freshmen, 100 (16.1%) were sophomores, 59 (9.5%) were juniors, and 21 (3.4%) were seniors; we did not sample students from law programs. Students participated as individuals, then formed 86 juries of six to eight jurors, deliberated to consensus on verdicts, and finally responded to individual posttest questions. All participants provided informed consent, and we treated all participants according to American Psychological Association [APA] ethical requirements (APA, 2002). The data that support the findings of this study are available from the corresponding author, William Douglas Woody, upon reasonable request.

Materials. Trial summary
A condensed trial summary provided the following information: the female homicide victim was an associate of the male defendant; police did not have actual evidence implicating the defendant; police initiated the interrogation due to the prior association between the victim and defendant; and the defendant confessed to killing the victim.

Interrogation transcript
Stastny, Forrest, Leo, and Bienhoff (2006) condensed the 15-page interrogation transcript from an actual 385-page interrogation transcript (see Woody & Forrest, 2009). Across all conditions, the transcript opened with introductions between the officers and the suspect, Miranda warnings, and the signing of a Miranda waiver. The transcript then included interrogators' questions about the suspect-victim relationship and inconsistencies in the suspect's alibi. Both interrogation transcript conditions ended in the suspect's confession. In the FEP-present transcript, the police responded to the suspect's denials with a story concerning false testimony from a witness who saw the victim at the suspect's residence on the day of the murder. In response to the FEP, the suspect then confessed. In FEP-absent conditions, police persistently questioned the suspect about his denial and appealed to their need for the truth; the suspect then confessed.

Expert testimony
A credentialed expert stated that false confessions occur and have occurred during police interrogation. In FEP-present conditions, the expert noted concerns about false confessions in response to FEPs. The expert also discussed the differences between demeanor, testimonial, and scientific FEPs (cf. Forrest et al., 2012; Woody & Forrest, 2009). In FEP-absent conditions, the expert noted concerns about false confessions in response to general tactics often employed during police interrogation, such as minimization and maximization. Although the interrogation in the FEP-absent condition does not include deception, we evaluated expert testimony because experts may testify about a wide range of topics related to the suspect and to the specific interrogation tactics, among other topics (Costanzo & Leo, 2007; Kassin, 2008a). We employed slightly different forms of expert testimony in our FEP-absent and FEP-present conditions to reflect actual expert testimony in these cases; an expert would be unlikely to discuss FEPs in a trial in which police did not use FEPs. We further discuss these differences below. Across both conditions, the expert did not address details of the specific case, defendant, or confession. The expert testimony conditions were identical to those used in previous research (Woody & Forrest, 2009).

Closing arguments and jury instructions
All participants read closing arguments from the prosecution and from the defense. The prosecution argued that the defendant's confession demonstrates his guilt. The defense argued that the defendant only confessed after the stress of the interrogation and that he is not guilty.
We used Colorado jury instructions that included definitions of murder in the second degree, presumption of innocence, and beyond a reasonable doubt as the standard of proof (Criminal Code, 18 CO. Rev. Stat. § § 3-103, 2004; Criminal Code, 18 CO. Rev. Stat. § § 1-402, 2004). Participants rendered verdicts and, if they found the defendant guilty, recommended a sentence. Although jurors do not typically sentence defendants, we included this measure to assess jurors' perceptions of the defendant. Second-degree murder is a class two felony in Colorado, and the presumptive sentencing range for judges extends from 8 to 24 years of incarceration (Criminal Code, 18 CO. Rev. Stat. § § 1. 2004). To reflect judicial discretion in sentencing, participants selected a sentence of probation only or 4, 8, 12, 16, 20, or 24 years in prison.

Post-transcript questionnaire
Participants used a 10-point Likert scale (1 = not at all, 10 = completely) to rate the degree to which they perceived the defendant to be guilty. They used the same scale to report their perceptions of the interrogation, including the degrees to which they perceived the interrogation as deceptive and coercive. We embedded these ratings among other questions, such as the degrees to which the interrogation was strategic and justified.

Procedure
We collected data in three phases: pre-deliberation individual jurors, deliberating juries, and post-deliberation individual jurors. Each data collection session included a group of 6 to 8 participants who were all randomly assigned to one of four conditions formed by crossing presence or absence of FEP and presence or absence of expert testimony. Each group of participants completed pre-deliberation measures as individuals under the supervision of experimenters. Next, participants relinquished their pre-deliberation materials, formed a jury with 6 to 8 members, heard jury instructions, selected a foreperson, and deliberated to consensus on a verdict. If participants failed to reach consensus within one hour, experimenters offered another 10 min of deliberation to juries who stated that they could agree. If juries could not do so in this period of time, we designated them as hung. Lastly, participants completed individual post-deliberation measures that were identical to pre-deliberation measures. We immediately debriefed all participants.

Verdicts and guilt ratings
To evaluate participants' verdicts, we used a binary logistic regression equation with presence of FEP and presence of expert as between-participants predictor variables. We entered the main effects into the first step of the model and the interaction into the second step. The initial model was significant, −2 Log L = 848.29 (df = 619), p = .001, Cox and Snell R² = .02. Presence of FEP did not predict verdicts, Wald χ² = .03, p = .854; however, as shown in Table 1, participants in expert-present conditions were less likely to convict the defendant than were participants in expert-absent conditions, Wald χ² = 13.25, Β = 1.81, 95% CI [1.31, 2.48], p < .001. This finding demonstrated a skepticism main effect. The addition of the interaction did not improve the model, χ²(1) = 1.01, p = .30; there was no expert-induced sensitivity to the deception.
As with verdicts, expert testimony induced general skepticism rather than sensitivity to the FEP (see Table 1).

Deliberating juries. Verdicts
Juries deliberated to consensus on verdicts. We instructed juries that could not reach consensus to make an additional attempt to do so. Deliberation times ranged from 3 to 60 min (M = 14.86, SD = 12.37, Median = 11.00); no jury sought additional time beyond 1 h.
Across all conditions, 22 juries convicted the defendant, 59 acquitted the defendant, and 5 juries hung. We used a binary logistic regression equation with presence of FEP and presence of expert testimony as categorical predictor variables and jury verdicts as the dependent variable. We entered the main effects in the first step and the interaction in the second step; for this analysis we included hung juries with acquitting juries. In the first step, the model was not significant, −2 Log L = 93.62 (df = 83), p = .12, Cox and Snell R² = .05. Presence of FEP did not predict verdicts (Wald χ² = .21, p = .65), but, despite the nonsignificant model, presence of expert testimony predicted verdicts (Wald χ² = 3.73, Β = .36, 95% CI [.13, 1.02], p = .05). Juries who received expert testimony were less likely to convict the defendant (16.3%) than were juries who did not receive expert testimony (34.9%), demonstrating a skepticism main effect. The addition of the interaction term did not improve the model, χ²(1) = 1.73, p = .19; there was no evidence of sensitivity.
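The reported coefficient of .36 appears to correspond to an odds ratio, and it can be approximated from the conviction rates alone. The following sketch rests on an assumption that is not stated in the text: that each expert condition contained 43 of the 86 juries, with 7 and 15 convictions respectively. These counts are inferred because they reproduce the reported 16.3% and 34.9% and sum to the 22 convicting juries. The unadjusted ratio ignores the FEP predictor, so it need not match the regression estimate exactly.

```python
def odds(convictions, total):
    """Odds of conviction: convicting juries divided by non-convicting juries."""
    return convictions / (total - convictions)


# Assumed cell counts (inferred from the reported percentages, not given in the text).
expert_convictions, expert_juries = 7, 43
no_expert_convictions, no_expert_juries = 15, 43

rate_expert = 100 * expert_convictions / expert_juries            # about 16.3%
rate_no_expert = 100 * no_expert_convictions / no_expert_juries   # about 34.9%

# Odds of conviction with expert testimony relative to without it.
odds_ratio = odds(expert_convictions, expert_juries) / odds(
    no_expert_convictions, no_expert_juries
)

print(f"conviction rates: {rate_expert:.1f}% vs {rate_no_expert:.1f}%")
print(f"unadjusted odds ratio: {odds_ratio:.2f}")
```

An odds ratio below 1 indicates lower odds of conviction with expert testimony, which is the direction of the skepticism main effect described above.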

Jury composition and deliberation
Juries ranged from six to eight members; therefore, for each jury we evaluated the percent of pre-deliberation pro-conviction jurors rather than actual numbers of jurors. A binary logistic regression analysis revealed that the percent of pre-deliberation jurors who favored conviction predicted the likelihood of conviction by the jury, −2 Log L = 47.26 (df = 79), p < .001, Cox and Snell R² = .44, Wald χ² = 18.32. Juries that convicted the defendant started with a greater average percentage of pro-conviction individual jurors (M = 74.1%, SD = 15.38) than did juries that acquitted (M = 37.8%, SD = 18.29). Additionally, only one of the 22 convicting juries (4.5%) started with a majority of pro-acquittal jurors, and only two convicting juries (9.1%) started with even numbers of pro-conviction and pro-acquittal jurors. In contrast, 11 (18.64%) acquitting juries started with a pro-conviction majority, 9 (15.25%) started with evenly divided jurors, and 39 (66.10%) started with a pro-acquittal majority.

Hung juries
Five juries could not reach consensus in the time allotted. Although the small n precludes formal analysis, each hung jury started with a substantial majority of pro-conviction jurors (63-86%), and four of the five juries hung with individual post-deliberation measures revealing a majority in favor of conviction and one or two jurors in favor of acquittal. In the fifth hung jury, a substantial majority started pro-conviction, but individual post-deliberation measures revealed that a majority changed their view to pro-acquittal and two jurors remained pro-conviction.

Post-deliberation individual jurors
After deliberation, participants individually completed all pre-deliberation measures. For our analysis of post-deliberation outcomes, we used a series of two-level linear regression models with juror fixed effects in order to account for the nesting of participants within juries. These models revealed a pattern of results that did not differ substantially from pre-deliberation analyses. As shown in Table 1, our regression revealed that presence of FEP led to significantly higher individual post-deliberation ratings of deception (t = 4.09, p < .001), but presence of expert did not affect deception ratings (t = −1.75, p = .08). In regard to coercion, neither presence of FEP (t = 0.54, p = .59) nor presence of expert (t = 0.23, p = .82) was significant. Next, we conducted a two-level binary logistic regression analysis to analyze verdicts using juror fixed effects. With both main effects, the binary logistic regression model was significant. As with pre-deliberation data, the analyses revealed no effect for FEPs on verdicts (t = 0.12, p = .91) but a significant effect of expert testimony on verdicts (t = −2.03, p = .04), such that jurors who read expert testimony remained less likely to convict. Our regression model for participants' Likert ratings of defendant guilt revealed a similar pattern. There was no effect for the presence of an FEP (t = 1.42, p = .16), but the presence of expert testimony decreased guilt ratings (t = −3.46, p = .001). The regression model for participants' post-deliberation sentences revealed no effects for presence of FEP (t = 1.13, p = .27) or presence of expert (t = −.63, p = .53).

Changes during deliberation
Across 615 individual jurors with complete data, 208 (33.8%) changed their verdicts after deliberation. The overall individual conviction rate fell from 48.6% to 29.6%, McNemar Test, p < .001. We evaluated the impact of each independent variable on changes in individual verdicts from pre- to post-deliberation; we did not examine the nonsignificant interaction.
When the interrogation did not include an FEP, 34.1% of jurors changed their verdicts, 81 from guilty to not guilty, and 22 from not guilty to guilty. This resulted in a change in conviction rates from 48.0% to 28.6%, McNemar Test, p < .001. When the interrogation included an FEP, 33.5% of jurors changed their verdicts, 81 from guilty to not guilty and 24 from not guilty to guilty, a change in conviction rates from 49.1% to 30.5%, McNemar Test, p < .001. The difference in the patterns of verdict changes between FEP-present and FEP-absent conditions, however, was not significant, χ²(1) = 0.16, p = .69, Cramer's V = .03, 95% CI [.00, .16].
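Because the McNemar test depends only on the jurors who changed their verdicts, the reported counts are sufficient to illustrate it. The sketch below assumes the exact (binomial) form of the test; the text does not state whether the exact or the chi-square version was used, and the function name is our own.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant counts.

    b: jurors who changed from guilty to not guilty
    c: jurors who changed from not guilty to guilty
    Under the null hypothesis, each changer is equally likely to move in
    either direction, so the smaller count follows Binomial(b + c, 0.5).
    """
    n = b + c
    tail = sum(comb(n, k) for k in range(min(b, c) + 1))
    return min(1.0, 2 * tail / 2 ** n)


p_fep_absent = mcnemar_exact(81, 22)   # FEP-absent conditions
p_fep_present = mcnemar_exact(81, 24)  # FEP-present conditions

print(f"FEP absent:  p = {p_fep_absent:.2e}")
print(f"FEP present: p = {p_fep_present:.2e}")
```

Both p-values fall well below .001, consistent with the reported results: in each FEP condition, far more jurors moved from guilty to not guilty than in the opposite direction.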
Among jurors who did not read expert testimony, 103 of 293 (35.2%) changed their verdicts, 74 from guilty to not guilty and 29 from not guilty to guilty; 56.3% of pre-deliberation jurors convicted the defendant, and 41.3% of post-deliberation jurors convicted the defendant, McNemar Test, p < .001. For jurors who read expert testimony, 105 of 322 (32.6%) changed their verdicts, 88 from guilty to not guilty and 17 from not guilty to guilty, resulting in a change in individual conviction rate from 41.6% pre-deliberation to 19.6% post-deliberation, McNemar Test, p < .001. The difference in verdict changes between these conditions was significant, χ²(1) = 4.32, p = .04, Cramer's V = .14, 95% CI [.00, .28], such that expert testimony led to greater reductions in conviction rates during deliberation.
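The chi-square statistic for the expert comparison can be reconstructed from the counts of jurors who changed their verdicts. The sketch below assumes that the test was a Pearson chi-square without continuity correction on the 2 × 2 table of expert condition by direction of change, restricted to the 208 jurors who changed. The text does not describe how the table was constructed, so this layout is an assumption, although it reproduces the reported values.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Also returns Cramer's V, which for a 2 x 2 table equals sqrt(chi2 / n).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return chi2, math.sqrt(chi2 / n)


# Rows: expert absent, expert present.
# Columns: guilty -> not guilty, not guilty -> guilty.
chi2, cramers_v = chi_square_2x2(74, 29, 88, 17)

print(f"chi2(1) = {chi2:.2f}, Cramer's V = {cramers_v:.2f}")
# prints: chi2(1) = 4.32, Cramer's V = 0.14
```

Among the jurors who changed, a larger share of those who read expert testimony moved toward acquittal (88 of 105) than of those who did not (74 of 103), which is the pattern the significant test reflects.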

Discussion: juror and jury decision-making
Across conditions, jurors perceived the interrogation as deceptive and coercive, even in conditions without an FEP or any other deception (see Forrest et al., 2012; Woody, Forrest, & Yendra, 2013). Across pre- and post-deliberation data, individual jurors in FEP-present conditions viewed the interrogation as more deceptive and as more coercive than did jurors in FEP-absent conditions. Participants recognized the deception and potential coercion, even in the absence of expert testimony. Although the FEP influenced perceptions of the interrogation, it did not affect the verdicts of individual jurors or deliberating juries.
The presence of an expert, however, affected jurors' perceptions and trial decisions. Expert effects did not differ by FEP condition, despite the differences in the expert's testimony across conditions. Participants rated the interrogation as less deceptive after the expert testimony, a finding which conflicts with prior scholarship (Woody & Forrest, 2009). One possible explanation for this effect is methodological. The expert's discussion of deception could have led participants to view the nondeceptive transcript as less serious, and the expert's presentation of three types of FEP may have led participants who read a transcript containing a single FEP to view it as a mild deception about eyewitness evidence. Despite the unexpected direction of the influence of expert testimony across both FEP conditions, the presence of expert testimony provided sufficient information to affect individual jurors' verdicts and guilt ratings and to lead to a general skepticism effect. Although we evaluated the possibility of a sensitivity effect (i.e., that jurors would be less likely to convict in the expert-present/FEP-present conditions than in expert-present/FEP-absent conditions), no evidence for expert-induced sensitivity emerged in jurors' perceptions of deception or coercion, verdicts, guilt ratings, or recommended sentences.
Deliberation affected participants' perceptions and trial decisions. First, we found evidence of leniency bias (see Devine et al., 2004) in the composition of juries. Almost all juries (95.5%) that convicted the defendant started with a majority of pro-conviction individual jurors. In contrast, only 20 (66.1%) of the pro-acquittal juries started with 50% or more pro-acquittal jurors. Second, the overall individual conviction rate dropped after deliberation, further demonstrating a leniency bias (see MacCoun & Kerr, 1988; Ruva & Guenther, 2015), and this effect was more pronounced for jurors who read expert testimony. As noted by MacCoun and Kerr (1988), a lone juror can prevent conviction, and with one exception the composition of our hung juries demonstrated the power of one or a few jurors to prevent a conviction.
The overall pattern of jurors' perceptions and decisions remained generally consistent from pre-deliberation to post-deliberation, but some important changes occurred. FEPs did not affect jurors' verdicts, and FEPs did not differentially affect the reduction in post-deliberation conviction rates. As reported by others (Woody & Forrest, 2009; Woody et al., 2013), jurors recognized the deception and perceived it as coercive, but these perceptions did not predict trial outcomes. Jurors appeared limited by the fundamental attribution error; despite their recognition of situational factors, they continued to emphasize individual decision-making and to devalue deception or other factors outside of the individual (see Appleby et al., 2013; Costanzo & Leo, 2007; Kassin, 2008b). A different pattern emerged for the effects of expert testimony. Expert testimony led to a skepticism main effect; jurors who received expert testimony were less likely to convict than jurors who did not, and jury deliberation enhanced the impact of expert testimony. These conclusions are limited by the nature of the sample of university students who served as mock jurors and examined written trial materials (Bornstein, 1999; Bornstein et al., 2016), and we encourage replication and extension of these findings with more realistic samples. We then extended this study to include a sample of individuals who do face these legal decisions: sitting judges. As Robbennolt (2005) argued, "Ideally, an experimental comparison between judges and juries would compare the responses of a sample of judges to the responses of a sample of mock juries" (p. 487). We seized this opportunity to present identical trial materials to sitting judges.

Experiment 2: judges
We provided identical materials to a sample of sitting judges. These choices allowed us to evaluate the decisions of legal experts and to compare trial decisions of deliberating mock juries and actual sitting judges under nearly identical conditions.

Participants
Participants included 129 sitting judges (97 males, 21 females, 11 unreported) from across the United States. Judges earned law degrees between 1958 and 2001 and had served between 1 and 36 years on the bench (M = 13.75, SD = 8.74). We mailed materials to approximately 2000 sitting District, Appellate, and State Supreme Court justices. Judges did not receive compensation for participation. Our approximate response rate (see Note 6) was limited, however, with only 6.4% of our sample responding. All participants provided informed consent, and we treated all participants consistent with APA ethical standards (APA, 2002).

Materials and procedure
Judges read a condensed trial summary and an interrogation transcript with or without an FEP. Judges in expert-present conditions read expert testimony, and all judges completed the same posttest questions described previously. After reading the trial summary, expert testimony (if applicable), and the interrogation transcript, judges answered several questions. Judges indicated whether they would a) admit the confession to trial, b) allow the jury to read the transcript, and c) allow the expert to testify in court (for judges in the expert-present conditions only). Beyond these differences, judges followed the same procedures described for individual jurors.

Results
We analyzed data from the 129 responding judges. We review judges' pretrial decisions, perceptions of the interrogation, and trial decisions.

Pretrial decisions
Overwhelmingly, judges (n = 123, 95.3%) reported that they would admit the confession into the trial, and 87 judges (67.4%) reported that they would allow the jury to read the interrogation transcript. Neither presence of FEP, presence of expert testimony, nor the interaction predicted judges' pretrial decisions (Wald χ²s < .34, ps > .55).
As an additional measure of their perceptions of guilt, we asked judges to rate the degree of the defendant's guilt using a Likert scale (1 = not at all, 10 = completely). The mean rating was 6.77 (SD = 2.83), and descriptive statistics are in Table 2. We used an ANOVA with presence of FEP and presence of expert as independent variables. Similar to verdict, the presence of FEP was not significant, F(1, 117) < 0.01, p = .97, d < .01, 95% CI [−.36, .36], but the effect of expert testimony approached significance, consistent with a skepticism main effect: judges in expert-present conditions gave lower guilt ratings than did judges in expert-absent conditions, F(1, 117) = 3.28, p = .07, d = .34, 95% CI [−.02, .70]. The interaction was not significant, F(1, 117) = 2.11, p = .15, ηp² = .02; expert testimony did not increase judges' sensitivity to the FEP.
Although prior scholars (Woody & Forrest, 2009) found that simulated jurors recommended shorter sentences when police deceived suspects about evidence, jurors typically do not sentence defendants. We extended their methods to evaluate sitting judges who do sentence defendants. For the 80 judges who both convicted the defendant and recommended a sentence, as shown in Table 2

Discussion: judicial decision-making
As a group, judges followed the law throughout their decision-making process. An overwhelming majority of judges reported that they would admit the confession to trial. The legal threshold for admitting a confession is preponderance of evidence, and all depicted interrogation tactics, including the deception associated with the eyewitness FEP, have led to confessions that have been accepted by previous courts (e.g., State v. Jackson, 308 N.C. 549, 1983). Despite some judges who disagreed, a substantial majority of judges reported that they would allow jurors to read the transcript and that they would allow the expert to testify. Judges' decisions about the admissibility of expert testimony remained unaffected by the presence of the FEP; notably, judges were not more likely to admit an expert to discuss an interrogation that included deception about evidence. Instead they viewed expert testimony concerning interrogations, regardless of the techniques used in the current interrogation, as beneficial for jurors and potentially the judges themselves.
Judges' perceptions of the interrogation reflected legal expectations rather than scientific and archival findings or experts' beliefs. Judges who read an interrogation transcript including an FEP rated the interrogation as much more deceptive than did judges who read the same transcript without an FEP. These judges, however, unlike the lay participants in the previous conditions, only perceived the deceptive interrogation as slightly more coercive, in line with legal precedent about confessions generated by FEPs (e.g., Frazier v. Cupp, 394 U.S. 731, 1969) but in contrast with experimental studies, archival evidence, and widespread beliefs among scholarly experts about FEPs and false confessions, as discussed previously. Additionally, the higher coercion ratings in FEP-present conditions did not predict verdicts or sentences in these conditions. The greater perceived coercion did not lead to rejection of the confession, as found by Wallace and Kassin (2012) under similar circumstances. Similar to jurors, judges appear to be influenced by the fundamental attribution error in ways that limit their recognition of the influence of external factors.
Expert testimony induced a skepticism main effect on judges' verdicts and perceptions of the defendant's guilt. Although the p-values only approached significance, we evaluated these differences as worthy of review; the limited power of this analysis raises the risk of Type II error, and the effect size of the difference is similar to effect size measures for the significant difference between individual pre-deliberation jurors. The effect was small (i.e., a difference in conviction rates of 15.1% and a small difference in guilt ratings), but expert testimony had these impacts even on seasoned judicial decision-makers; these findings require replication and extension with larger samples and greater statistical power. Judges, like jurors, did not show evidence of a sensitivity effect. Woody and Forrest (2009) found that jurors recommend shorter sentences after FEPs; however, this preliminary investigation of sitting judges did not reveal effects of the FEPs or experts on judicial sentencing recommendations. Judges appeared to separate the details of the interrogation from verdict decisions and perceptions of guilt.
Although judges appear unaffected by recent findings regarding the potentially coercive effects of FEPs on participants in experimental studies as well as actual interrogations (see Kassin et al., 2010; Stewart et al., 2018), judges appeared to follow legal precedents appropriately regarding police deception about evidence. A substantial majority of judges were willing to admit expert testimony to educate the court and potentially themselves, but judges did not appear aware of the growing archival and scientific literature regarding the coercive effects of FEPs. Across conditions, judges did not recognize the potentially coercive effects of this deception and did not provide protections to defendants.
This first experimental investigation of judges includes several limitations, including low N and the low return rate, potentially due to the lack of compensation for participation, as well as a potentially biased sample of judges (e.g., those who find interrogation particularly interesting or have powerful views about confession evidence). Additionally, we did not query judges about whether they had presided over trials that included disputed confessions induced in part by FEPs. We return to these questions in the general discussion.

Exploratory comparison of legal decision-makers: judges and deliberating juries
As noted by Robbennolt (2005), direct comparisons of judicial and jury decision-makers who read identical materials are rare. Despite the limitations of our sample of judges, the opportunity to compare decisions of sitting judges with the decisions of deliberating juries offered several insights.
We used two binary logistic regression equations to compare judicial verdicts (n = 126) with jury verdicts (n = 86), and we considered potential interactions between decision-makers' identities (judge or jury) and our independent variables. To avoid dilution of power in these analyses, we evaluated the independent variables in separate equations.
Despite the substantial differences between the conviction rates in these samples as well as differences in typical experiences, legal knowledge, and other factors between individual mock jurors and sitting judges, we sought this opportunity to conduct an exploratory analysis of recommended sentences. We compared sentencing recommendations from individual pre-deliberation jurors who convicted the defendant (n = 300) to sentences recommended by judges who convicted the defendant (n = 80). An independent-samples t-test revealed that the sentences recommended by jurors (M = 13.61, SD = 7.09) did not differ significantly from sentences recommended by judges (M = 15.16, SD = 6.89), t(378) = −1.80, p = .07, d = −.22.

Exploratory comparison discussion: judges and deliberating juries
Across conditions, judges convicted more often than juries, and both juries and judges remained largely unaffected by the presence of deception in the form of an FEP. Expert testimony induced a skepticism main effect and reduced conviction rates of both juries and judges. Pre-deliberation individual jurors did not differ from sitting judges in sentencing recommendations.
Despite the extensive judge-jury agreement reported by Kalven and Zeisel (1966) and others, we found substantial disagreement. None of the conditions in this study resulted in overwhelming conviction rates; therefore, to these triers of fact the evidence may have appeared more "close" than "clear," the conditions under which Kalven and Zeisel (1966) reported most jury-judge disagreement (p. 157). Additionally, the inclusion of a confession as part of the evidence generated the conditions (i.e., strong evidence) in which scholars have reported that judges are more likely than jurors to convict (Eisenberg et al., 2005).
Another important explanation emerges from judges' status as legal experts. Jurors, as laypeople, appear to recognize deception but often do not know that it is legally acceptable for police to lie during interrogation (see Rogers et al., 2010). Additionally, jurors may perceive deception as coercive if they use commonsense notions rather than legal precedents to evaluate interrogation evidence. Judges, as legal experts, should know relevant precedents, and they appear more likely to follow legal expectations rather than commonsense perceptions of the evidence or findings about FEPs from scientific and archival studies. Similar to jurors, judges recognize deception; unlike jurors, however, judges know that courts have accepted confessions induced by deception and that generally FEPs are viewed as not legally coercive (cf. judges' coercion ratings). In this context, the definition of "not legally coercive" is "sanctioned by courts (i.e., that the confessions they produce are admitted into evidence)" (Kassin, 2010, p. 233). In this study, despite some individuals who made different recommendations, judges overwhelmingly admitted the confession to the trial and largely convicted the defendant, both of which demonstrate that they did not view the interrogation, even the deceptive interrogation with an FEP, as coercive. Courts, however, are raising new questions about coercion and the legal uses of deception, and these precedents may shift (see, e.g., Bandler, 2014a, 2014b; People v. Thomas, 22 N.Y.3d 629, 8 N.E.3d 308, 985 N.Y.S.2d 193, 2014; Woody, 2017, in press).
Judges' status as legal experts may affect their perceptions and uses of expert testimony. Unlike jurors who rarely if ever face these decisions, judges bring their own knowledge and years of experience to these situations. Similar to individual jurors, however, across multiple dependent measures, we found that expert testimony induced general skepticism about confession evidence but not greater sensitivity to FEPs. Additionally, neither presence of FEP nor presence of expert interacted with the identity of the trier of fact.

General discussion
We evaluated the presence of FEPs and expert testimony on legal decision-makers. In particular, we evaluated whether individual jurors, deliberating juries, and judges can recognize and reject coerced confessions to provide protections to defendants. We also evaluated the impact of expert testimony on triers of fact, particularly whether expert testimony may increase general skepticism about confession evidence or sensitivity to deception during the interrogation. Fundamentally, we sought to explore the perceptions and decisions of triers of fact who face a fundamental disconnect: although scholars widely recognize the power of FEPs to induce false confessions (e.g., Kassin et al., 2018), courts continue to accept confessions elicited by these tactics.
Across the impacts of FEPs and expert testimony, our findings paint a consistent picture. In line with prior scholarship (Forrest et al., 2012; Kassin & Sukel, 1997; Wallace & Kassin, 2012; Woody & Forrest, 2009; Woody et al., 2013), the legal decision-makers in this study met only some of the expectations set forth for jurors in Arizona v. Fulminante, 111 S. Ct. 1246 (1991). Individual jurors, both pre-deliberation and post-deliberation, recognized the deception inherent in FEPs and viewed FEPs as coercive. Yet, the presence of an FEP did not affect jurors' trial decisions as individuals or as deliberating juries; they emphasized the defendant's individual decision-making in line with the fundamental attribution error (see Appleby et al., 2013; Costanzo & Leo, 2007; Kassin, 2008b). Although we did not assess jurors' legal knowledge, jurors' responses aligned with legal precedents rather than scientific findings about FEPs.
Similarly, judges clearly recognized the deception inherent in FEPs but viewed these interrogations as only slightly more coercive. Almost all judges admitted the confessions to trial, and a substantial majority convicted the defendant. Simply stated, they overwhelmingly accepted the confessions depicted in this study, viewing FEPs through the lens of court precedents rather than scientific or archival findings, and expert testimony about findings widely accepted by the expert scholars did not make them more sensitive to FEPs.
Despite the legal questions about expert testimony in cases involving disputed confessions (see Citron & Johnson, 2006; Fulero, 2010; Quintieri & Weiss, 2005; Watson et al., 2010), expert testimony about false confessions similarly affected all triers of fact and did so despite the ecologically relevant differences in the depicted testimony across FEP conditions. Individual jurors (pre-deliberation and post-deliberation) who read expert testimony perceived the interrogation as less deceptive, which is contrary to previous research (Woody & Forrest, 2009). Paradoxically, instead of making jurors more sensitive to the impacts of the FEPs, comprehensive expert testimony about deception or about multiple FEPs may have made jurors more sensitive to the lack of deception in the control conditions and the use of only a single FEP in the deceptive conditions. The findings about perceived deception did not align with conviction rates or guilt ratings. Beyond the impacts on individual jurors' perceptions, expert testimony affected trial decisions of individual jurors, deliberating juries, and individual judges in similar ways across both FEP-absent and FEP-present conditions. In combination with findings from other scholars (Gomes et al., 2014; Woody & Forrest, 2009), these outcomes suggest that expert testimony provides knowledge that jurors and perhaps judges do not possess (see Woestehoff & Meissner, 2016). These findings challenge the conclusions of the trial court in United States v. Belyea (2005), which rejected expert testimony because the court asserted that the phenomena of interrogation and confession are well-known to typical laypeople, including jurors (see Fulero, 2010).
Despite the consistent skepticism effect, expert testimony failed to increase the sensitivity of triers of fact to police deception. These results align with some previous studies and reviews of expert testimony about confessions (e.g., Leippe, 1995; Leippe & Eisenstadt, 2009; Woestehoff & Meissner, 2016, Exp. 3). Skepticism effects may emerge when the evidence appears potentially flawed (Leippe, 1995; Leippe & Eisenstadt, 2009), and participants viewed the interrogation as very deceptive and coercive, even in control conditions. Alternatively, skepticism can emerge when there is little other incriminating evidence (Leippe, 1995; Leippe & Eisenstadt, 2009), as was the case in this trial summary. In these circumstances, with little evidence beyond the confession to support the defendant's guilt, it would make sense that expert testimony would increase jurors' hesitancy to convict (cf. Leippe, 1995).

Judge-jury disagreement
As expert legal decision-makers, judges recognized the deception but appeared to use legal precedents about FEPs rather than scientific findings. Additionally, judges' and jurors' perceptions and trial decisions reflected the expert testimony in ways that suggest the testimony could serve as an important educational tool for the court, potentially including sitting judges. These trends raise important questions for defendants and their attorneys who may consider bench trials over jury trials. These data suggest that bench trials in cases with disputed confessions increase the likelihood of conviction, even if these exploratory analyses revealed no differences between sentencing recommendations of mock jurors and sitting judges. Additionally, although not evaluated in this study, appellate review may also provide only limited protections to defendants who challenge their confessions, perhaps particularly during harmless error analysis in which judges must first examine the confession and then attempt to evaluate the impacts of the confession on the trial (Wallace & Kassin, 2012). Due to the limitations of this sample of judges, these results require additional investigation.

Recommendations
First, jurors appear likely to possess limited knowledge about interrogation and confession (Blandon-Gitlin et al., 2011; Leo & Liu, 2009; Woody & Forrest, 2009), and expert testimony or instructions have the potential to improve jurors' knowledge (Jones & Penrod, 2018; Woestehoff & Meissner, 2016). Second, the growing body of research evidence makes it increasingly likely that scholars can meet the requirements for Daubert v. Merrell Dow Pharmaceuticals (1993; see Fulero, 2010; Kassin, 2008a; Stewart et al., 2018; Watson et al., 2010). Third, these findings suggest that there exists the potential for experts to improve perceptions of individual jurors and verdicts by judges or deliberating juries, particularly in cases in which the evidence rests primarily or only on the defendant's confession. We note, however, that our findings of skepticism also suggest that expert testimony may lead jurors to simply doubt the confession, which may be appropriate for this particular interrogation transcript and trial summary, rather than to evaluate FEPs or other interrogation deception information in line with the growing body of research on deception and false confessions.

Limitations
We examined university students as mock jurors who read a written trial summary (see Bornstein, 1999; Bornstein et al., 2016) and then deliberated to consensus in groups of 6 to 8 members. Additionally, to maximize statistical power in our analyses of individual jurors, we asked them to report pre-deliberation verdicts despite the possibility that this prior commitment may impact jurors' pre-deliberation certainty and the deliberation process itself (Hannaford, Hans, & Munsterman, 2000). Despite these limitations, we used an actual interrogation transcript (Stastny et al., 2006; Woody & Forrest, 2009) and realistic jury deliberation (Nunez et al., 2011). As noted previously, our small sample of judges raises questions as does our selection of trial-related research questions; we recommend that future scholars evaluate judges' experiences in trials that include disputed confessions induced in part by FEPs. The realism of the control transcript may have also limited these outcomes. The FEP-absent condition appeared deceptive and coercive to observers, although less so than the FEP-present condition, and this may have limited the perceived differences between conditions. Additionally, to emphasize ecological validity, we used slightly different versions of expert testimony across FEP conditions, and despite the absence of significant FEP by expert interactions (see also Woody & Forrest, 2009), this emphasis on realism limited the experimental design.
Additionally, the judicial and other protections we examined here are not available to suspects who choose to falsely plead guilty (Redlich, 2010). Defendants who falsely confess are far more likely to falsely plead guilty than defendants who did not confess, and the risks to these defendants remain unaffected by trial protections as examined in this study (Redlich, 2010).

Conclusions
Despite consistent outcomes of archival and experimental studies regarding the power of FEPs to induce false confessions and despite widespread recognition of these findings within the community of scholarly experts, the present findings suggest that legal decision-makers recognize deception in the form of FEPs but do not believe that FEPs are coercive for suspects. These beliefs reflect legal precedents rather than scientific findings about the power of FEPs. Jurors, juries, and judges do not appear to provide adequate protections for suspects who confess after FEPs, and although expert testimony induced skepticism about confession evidence in this case, experts did not help triers of fact become sensitive to the presence of deception about evidence. For these reasons, we join other scholars (e.g., Kassin et al., 2010; Leo, 2008; Woody et al., 2013) in calling for the elimination of police deception in interrogation, particularly because legal decision-makers do not appear to provide adequate protection for defendants facing police deception during interrogation.

Notes
1. In the ALT-key paradigm, the experimenter instructs the participant to avoid the ALT-key. The innocent suspect is then accused of pressing the forbidden ALT-key during data collection.
2. In the cheating paradigm, a participant is asked by a confederate to violate the rules of an experiment. The experimenter then accuses the participant, who may be guilty or innocent, of cheating.
3. In the individual cheating method, individual innocent participants complete an examination or game according to rules and then are accused of violating those rules.
4. The trial summary and all other materials are available from the authors.
5. We depicted the expert as highly experienced, and the expert presented testimony that was concise and not complex (see Koehler, Schweitzer, Saks, & McQuiston, 2016; Parrott, Neal, Wilson, & Brodsky, 2015). The expert's credentials included a doctoral degree, more than 20 years of experience as a scholar of police interrogation, and numerous publications and presentations at psychology and police conventions. We held the expert's identity and credentials constant across conditions.
6. The response rate is approximate because of an unknown number of misreported addresses and mail errors as well as retirements and other reasons judges may have left the bench.