Enabling witnesses to actively explore faces and reinstate study-test pose during a lineup increases discriminability

Significance

Incorrect lineup identification decisions have devastating consequences for the wrongfully accused and for society when guilty perpetrators remain at large. This paper heeds the call by the National Academy of Sciences to utilize technology to increase correct identifications and minimize incorrect identifications. We found that the simultaneous interactive lineup procedure, which allows witnesses to actively explore the lineup members' faces, increases discriminability compared to traditionally used video and simultaneous photo lineups. Interactive lineups were beneficial regardless of perpetrator encoding viewpoint (i.e., front or profile). In the real world, eyewitnesses may view a perpetrator from multiple perspectives. The interactive procedure allows witnesses to adjust face viewpoints to match their memory of the crime, which may be beneficial for memory retrieval.

Eyewitness identification is a cornerstone of police inquiries and national security investigations. However, mistaken identification has been implicated in about 70% of U.S. wrongful convictions (1). Failures to identify guilty suspects mean missed opportunities to arrest guilty people, allowing them to commit additional crimes. Therefore, implementing procedures that enhance discriminability (i.e., that maximize correct and minimize incorrect suspect identifications) is an important public policy goal (2).
Nevertheless, the technology used to conduct identifications has not fundamentally changed over the past century. The static 2D photo lineup is the most widely used procedure worldwide (3). A lineup contains the police suspect, who may be guilty or innocent, and several fillers, who physically resemble the suspect and are known to be innocent of the crime. The members are presented head-on, in frontal view. If the lineup is target present (i.e., contains the guilty suspect), the witness could correctly identify the suspect or err by choosing a filler or rejecting the lineup (i.e., choosing no one). If the lineup is target absent (i.e., the guilty suspect is absent), the witness could correctly reject the lineup or err by choosing an innocent suspect or a filler. Witnesses in experiments make mistakes around half the time (4) and in real-world cases frequently identify known-innocent suspects (5).
A large body of research dedicated to improving discriminability has accumulated. In laboratory studies, participant witnesses are shown a mock crime and tested under different lineup conditions. Many studies have compared sequential to simultaneous lineups, wherein, respectively, witnesses view the members one at a time versus all together. In research comparing simultaneous to sequential lineups, some studies have found discriminability is higher in simultaneous lineups (e.g., ref. 6), while others have found witnesses' response bias, or willingness to make a positive identification, is more lenient in simultaneous lineups, resulting in more guilty and innocent suspect identifications (e.g., ref. 7).
Laboratory research has also examined video lineups, which are used in the U.K. and presented sequentially (8), with each member turning their head left and right to show the faces from every angle. Some studies have found discriminability is higher in static photo simultaneous compared to video lineups (9-11), while others have found response bias is stricter in video than photo lineups (e.g., ref. 12) or no differences (13). However, none of the studies systematically controlled the perpetrator's face angle during encoding.

If participants in past research largely encoded the perpetrator's face in frontal view, any benefit of face angle as a retrieval cue would be masked, since all procedures allow witnesses to see the members in frontal view.
In this study, we heed the National Academy of Sciences call for technology to improve discriminability (14) and investigate whether enabling witnesses to actively explore the members' faces along the vertical axis from −90° to 90° improves discriminability compared to sequential video and simultaneous photo lineups, the two most widely used procedures worldwide. We experimentally controlled encoding angle.
Discriminability should theoretically be higher in interactive compared to existing procedures since witnesses can engage in pose reinstatement, or rotate the lineup faces to match the angle in which the perpetrator was encoded. This is because encoding specificity, or the match between cues at encoding and retrieval, is important for memory retrieval (15, 16). Further, according to diagnostic feature detection theory, discriminability is higher if the witness can detect diagnostic features (i.e., features that match the perpetrator only) and discount nondiagnostic features (i.e., features shared by all lineup members) (2). Therefore, discriminability should be higher if witnesses can evaluate the members' facial features from the same angle as they encoded the perpetrator. Affording witnesses the opportunity to consider a greater number of (diagnostic) facial features across members in the same angle means that nondiagnostic features play a proportionally lesser role. The ability of witnesses to actively explore faces may also be a contributing factor. Active exploration can enhance memory performance (17), perhaps by facilitating the intentional sampling of relevant facial features (18), which again could enhance the ability to perceive predominantly more diagnostic features.
Drawing on the theory and research outlined above, Colloff et al. (19, 20) developed and tested interactive lineups. Discriminability was higher for witnesses who could reinstate pose (ref. 19, Exp. 1). Without prompting, participant witnesses tended to rotate the interactive lineup members' faces into the same angle as they saw the perpetrator commit the crime, and discriminability was higher for those who interacted more (ref. 19, Exp. 2). This suggests the structure of eyewitness memory includes information about face angle, and allowing witnesses to utilize these cues facilitates memory retrieval. In another project that tested almost 10,000 participants, Colloff et al. (20) compared discriminability in sequential photo versus sequential interactive lineups (Exp. 1), and in interactive lineups presented simultaneously versus sequentially (Exp. 2). Discriminability was higher in interactive than photo sequential lineups and was boosted further with the simultaneous presentation of interactive faces. This suggests the ability to actively explore and compare faces from multiple angles allows diagnostic features to carry more weight during memory retrieval, facilitating accuracy.

Predictions
This study substantially extends previous theory and research by comparing discriminability in simultaneous interactive lineups against sequential video and simultaneous frontal pose static photo lineups.
Diagnostic feature detection theory predicts that discriminability is higher when witnesses can better detect and discount features that are nondiagnostic (2), such as when lineup members are viewed simultaneously versus sequentially. Additionally, encoding specificity predicts higher discriminability when encoding and test cues match. Therefore, for front-encoding, we predicted that discriminability would be higher in simultaneous interactive and simultaneous photo lineups versus sequential video lineups due to simultaneous presentation. No differences in accuracy were predicted for interactive simultaneous versus photo simultaneous lineups, as encoding and test cues match in both procedures. For profile-encoding, we predicted that discriminability would be higher in simultaneous interactive versus sequential video lineups because of the benefit of showing members together, and higher in simultaneous interactive versus simultaneous photo because witnesses can reinstate pose so that encoding and test cues match. For simultaneous photo versus sequential video lineups, we did not make a directional prediction: Accuracy could be higher in simultaneous photo versus sequential video lineups because the members can be compared simultaneously; however, accuracy could be higher in video versus photo lineups owing to pose reinstatement.

Results
Thirty-one participants in the interactive condition did not interact with any of the faces in the twelve lineups, despite instructions to do so. We did not anticipate this and therefore did not preregister that we would remove participants who failed to follow instructions. However, by not interacting, the interactive lineup becomes equivalent to the photo lineup, making it difficult to meaningfully compare procedures. Further, discrimination accuracy did not differ for interactors compared to noninteractors, suggesting the participant groups were similarly motivated to make an accurate identification [front encoding: interactors (n = 65, d′ = 1.47) versus noninteractors (n = 18, d′ = 0.80), G = 2.06, P = 0.98; profile encoding: interactors (n = 78, d′ = 1.28) versus noninteractors (n = 13, d′ = 1.18), G = 0.98, P = 0.83]. An analysis of all participant data (including noninteractors) is in SI Appendix. SI Appendix provides ID decision frequencies by confidence level without and with interactors (SI Appendix, Tables S2 and S3), and analyses of decision time by experimental condition (Table S1). None of the differences in discriminability can be explained by decision time differences across the lineup conditions.
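For readers unfamiliar with the d′ statistic reported above, it summarizes discriminability as the standardized distance between the guilty- and innocent-suspect familiarity distributions: d′ = z(hit rate) − z(false-alarm rate). A minimal Python sketch (the rates below are hypothetical illustrations, not values from this study):

```python
from scipy.stats import norm

def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Discriminability index: z(hit rate) minus z(false-alarm rate)."""
    return norm.ppf(hit_rate) - norm.ppf(false_alarm_rate)

# Hypothetical rates for illustration only.
d = d_prime(0.70, 0.30)  # ≈ 1.05
```

Higher d′ values indicate a witness population that better separates guilty from innocent suspects, independent of response bias.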
Lineup Identification Decisions. Table 1 shows the total number and proportions of target, filler, and "Not present" (reject) identification (ID) decisions across conditions. Interactive lineups appeared to yield better performance in both the front- and profile-encoding conditions versus simultaneous photo and video lineups.
Receiver Operating Characteristics (ROC) Analysis. ROC analysis followed Mickes et al. (6) and the partial Area Under the Curve (pAUC) was calculated using the statistical package pROC (21). Alpha was set at 0.05, with one-tailed tests for directional hypotheses and two-tailed tests for nondirectional hypotheses. Fig. 1 displays partial ROC curves with attendant pAUC values by lineup condition for front- (specificity = 0.50) and profile-encoding (specificity = 0.49). The relative height of the ROC curves indicates that participants more accurately discriminated between innocent and guilty suspects in interactive versus photo and video lineups regardless of encoding angle.
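The pAUC itself was computed in R with pROC; conceptually, it is the area under the ROC curve integrated only up to a maximum false-ID rate. A simplified Python sketch using the trapezoid rule (hypothetical cumulative ID rates; assumes sorted points and a cutoff that coincides with a data point, whereas pROC also interpolates):

```python
def pauc(false_id_rates, correct_id_rates, max_false_id_rate):
    """Partial area under the ROC curve via the trapezoid rule,
    accumulated over false-ID rates from 0 up to max_false_id_rate."""
    area = 0.0
    for i in range(1, len(false_id_rates)):
        if false_id_rates[i] > max_false_id_rate:
            break  # stop once past the specificity cutoff
        dx = false_id_rates[i] - false_id_rates[i - 1]
        area += dx * (correct_id_rates[i] + correct_id_rates[i - 1]) / 2.0
    return area

# Hypothetical cumulative rates (most-confident responses first).
fpr = [0.0, 0.10, 0.20, 0.50]
tpr = [0.0, 0.40, 0.60, 0.80]
partial_area = pauc(fpr, tpr, 0.20)  # ≈ 0.07
```

Comparing pAUCs over a common false-ID range is what allows the curves in Fig. 1 to be ranked statistically.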
Maximum Likelihood Signal-Detection Model Fits. Since ROC analysis is an atheoretical measure, we validated our findings with theoretical measures, fitting signal-detection models to the data (2, 22). The 11-point confidence scale was combined into a 3-point scale to decrease the number of model parameters. We combined confidence ratings of 0 to 60% as c1, 70 to 80% as c2, and 90 to 100% as c3 (see ref. 20). Higher c parameter estimates indicate increasingly conservative responding, whereby more memory evidence is required before making an identification. We used the independent observation model with a correlation parameter (23), which assumes that an identification is made when the most familiar face in the lineup exceeds c1. If no face in the lineup is familiar enough to exceed c1, the lineup is rejected. Identification confidence is determined by the highest criterion exceeded. The correlation parameter (σ_b) allows the correlation between the suspect and fillers to vary from 0, because faces in a fair lineup match the witness's description and therefore should be theoretically correlated (23). We conducted the model-fitting separately for the front- and profile-encoding conditions.

Table 1 note. Total rows contain the frequency of every identification decision collapsed over participants and confidence. Proportion rows are calculated by dividing the number of identification decisions by the number of lineups in a particular condition. For instance, the proportion of front-encoding target identifications in interactive lineups is computed by dividing the number of target identifications in target-present front-encoding interactive lineups by the total number of target-present front-encoding interactive lineups, 173/(173 + 102 + 109).
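The decision rule of the independent observation model (identify the most familiar face if it exceeds c1, otherwise reject) can be illustrated with a small Monte Carlo simulation. This is a sketch with hypothetical parameter values, not the fitted model from this study, and it omits the correlation parameter σ_b for simplicity:

```python
import numpy as np

def simulate_lineup(d, sigma_target, c1, target_present,
                    n_trials=100_000, k=6, seed=0):
    """Monte Carlo sketch of the independent observation decision rule:
    identify the single most familiar face if its signal exceeds c1,
    otherwise reject the lineup. Correlated signals are omitted."""
    rng = np.random.default_rng(seed)
    fillers = rng.normal(0.0, 1.0, size=(n_trials, k - 1))
    if target_present:
        suspect = rng.normal(d, sigma_target, size=n_trials)  # guilty suspect
    else:
        suspect = rng.normal(0.0, 1.0, size=n_trials)         # innocent suspect
    max_filler = fillers.max(axis=1)
    made_id = np.maximum(suspect, max_filler) > c1
    suspect_id = made_id & (suspect >= max_filler)
    return {
        "suspect_id_rate": suspect_id.mean(),
        "filler_id_rate": (made_id & ~suspect_id).mean(),
        "reject_rate": (~made_id).mean(),
    }

# Hypothetical parameters, for illustration only.
tp = simulate_lineup(d=1.3, sigma_target=1.3, c1=1.0, target_present=True)
ta = simulate_lineup(d=1.3, sigma_target=1.3, c1=1.0, target_present=False)
```

Raising d pushes guilty-suspect IDs up while leaving target-absent lineups unchanged, which is why d indexes discriminability; raising c1 makes all positive IDs rarer, which is why c indexes response bias.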
To estimate summary discriminability (ability to discriminate innocent from guilty suspects), correlation, and confidence criteria parameters, the model was fit to the target, filler, and reject decisions in the three lineup conditions. The mean and standard deviation of the innocent distribution were set to 0 and 1, by convention. An unequal-variance model was used because σ_target was always estimated to be significantly larger than σ_innocent, as it should have been, since there were multiple target faces at encoding, which presumably adds random noise to the process. Model-estimated σ_target was constrained to be the same across the three lineup conditions (1 model parameter), because allowing it to differ across conditions never significantly improved the fit. The estimated value of the correlation parameter (σ_b) never significantly differed from 0 (though it theoretically should have), so it plays no role in the following model fits but is considered later. Discriminability (d) values were allowed to vary across the lineup conditions by allowing μ_target to vary, so the full model had 14 degrees of freedom: 27 data categories [3 lineup conditions × 9 (target, TP, and TA filler IDs at 3 levels of confidence)] − 13 parameters [3 lineup conditions × 4 (μ_target, c1, c2, c3) and σ_target]. To test whether any differences in d were statistically significant, we fit a series of reduced models (essentially three pairwise comparisons), constraining d to be the same across two conditions, and compared the fit of the reduced to the full model (Table 2). For front-encoding, the full model explained the data well (χ2(14) = 12.39, P = 0.575). Discriminability was larger in interactive (d = 1.32) than photo (d = 1.12) lineups, but this difference was not statistically significant (χ2(1) = 3.58, P = 0.058). Discriminability was significantly larger in interactive compared to video (d = 1.00) lineups (χ2(1) = 7.48, P = 0.006). There was no significant difference in discriminability between photo and video lineups (χ2(1) = 1.02, P = 0.313). For profile-encoding, the full model explained the data well (χ2(14) = 12.32, P = 0.581). Discriminability was significantly larger in interactive (d = 1.04) than in photo (d = 0.60, χ2(1) = 19.02, P < 0.001) and video (d = 0.60, χ2(1) = 16.86, P < 0.001) lineups, but did not differ between photo and video lineups (χ2(1) = 0.00, P = 1.000).
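Each pairwise comparison above is a likelihood-ratio test: the reduced model frees one fewer parameter than the full model, so the difference in goodness-of-fit statistics is referred to a chi-square distribution with 1 degree of freedom. A quick Python check of that logic (treating the full model's contribution to the difference as already absorbed in the reported χ2(1) statistics):

```python
from scipy.stats import chi2

def likelihood_ratio_p(chi2_diff: float, df_diff: int = 1) -> float:
    """P-value for a nested-model comparison: the difference in
    goodness-of-fit statistics is chi-square distributed under H0."""
    return chi2.sf(chi2_diff, df_diff)

# Recovers the reported interactive-vs-video front-encoding result.
p_video = likelihood_ratio_p(7.48)   # ≈ 0.006, significant
p_photo = likelihood_ratio_p(3.58)   # ≈ 0.058, not significant
```

This is why χ2(1) = 3.58 lands just above the 0.05 threshold while χ2(1) = 7.48 falls well below it.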
The independent observation model that allows for a positive correlation among the memory signals of a lineup effectively operates like diagnostic feature detection theory under certain conditions, by reducing the impact of the shared nondiagnostic features and increasing the pAUC. One way to reduce the impact of shared features is by increasing the correlated memory signals between faces (hereafter, the correlational account). This could explain why lineups that have correlated memory signals (owing to shared features across faces) yield higher discriminability than showups (a single face, where the concept of correlated memory signals does not apply; see ref. 24). Another way to reduce the impact of shared features is to increase the number of diagnostic features available (increasing the total number of facial features considered). In that case, the shared features play a proportionally lesser role, thereby decreasing the correlation but increasing discriminability as the memory strength distributions for innocent and guilty lineup members become further apart (hereafter, the distributional account) (25).
To investigate the foregoing, we fit a model allowing μ_target (d, the distance between the guilty and innocent distributions) and σ_b (the correlation) to differ across the lineup conditions. Again, the model-estimated σ_target was constrained to be the same across the lineup conditions, and the confidence criteria could vary. We found that for both front- and profile-encoding, the correlation is smallest and d is largest in the interactive condition, consistent with the distributional account (Table 3; see SI Appendix for model fits).
Overall, the signal-detection modeling validates the ROC analysis results, though for front-encoding, increased discriminability for the interactive compared to the photo lineup did not reach statistical significance in the modeling. The discriminability improvement observed for interactive lineups is most parsimoniously explained by the distributional account.

Table 2 note. In the full model, d varies across conditions. In the reduced models, the two procedures that are compared are restricted to an equal d. In both full and reduced models, model-estimated σ_target was constrained to be the same over conditions, and c1, c2, and c3 were free to vary. Model-fit rows represent the goodness-of-fit statistic.

Discussion
We compared the simultaneous interactive to simultaneous photo and sequential video lineups, two widely used police procedures worldwide. Encoding angle (front or profile) was systematically controlled. For front-encoding, discriminability was significantly higher in simultaneous interactive compared to sequential video lineups and simultaneous photo lineups, but this latter difference was statistically significant only in the ROC analysis and not the modeling. For profile-encoding, discriminability was higher in simultaneous interactive compared to simultaneous photo and sequential video lineups, which did not differ from each other (see SI Appendix for results with noninteractors included). This suggests active exploration and encoding specificity boost discriminability. Further, the modeling results suggest interactive lineups move the memory strength distributions for innocent and guilty lineup members further apart, ostensibly because they increase the number of diagnostic features available.
Interactive versus Photo and Video Lineups. For front-encoding, for any false ID rate, simultaneous interactive lineups enhanced the correct ID rate of target faces by 35% and 27% compared to sequential video and simultaneous photo lineups, respectively. The ROC analysis found better discriminability for simultaneous interactive than simultaneous photo lineups, though this was not statistically significant in the modeling results (P = 0.058). The front-encoding findings suggest that active over passive exploration enhances feature sampling (17), aiding diagnostic feature detection (2). This notion aligns well with studies demonstrating a relationship between visual exploration and memory performance. For example, complex stimuli explored with more eye movements are better remembered than stimuli explored with fewer eye movements (e.g., ref. 26). For profile-encoding, for any possible false identification rate, interactive lineups enhanced the target correct identification rate by 75% and 60% compared to simultaneous photo and video lineups, respectively. This suggests discriminability is enhanced for simultaneous interactive witnesses because they can actively explore and compare faces in the same pose in which they were encoded, thereby increasing the availability and use of proportionally more diagnostic features. Further research is needed to examine the relative contributions of active exploration and pose reinstatement in enhancing discriminability. This work should also incorporate independent measures of participant motivation and engagement to further investigate any potential biases that might be introduced by the exclusion of noninteracting participants from the analysis.
Photo versus Video. For both front- and profile-encoding, discriminability did not differ between simultaneous photo and sequential video lineups. For front-encoding, this does not support our prediction from diagnostic feature detection theory (2) that simultaneous presentations (in this case, the photo lineup) enhance discriminability compared to sequential presentations (i.e., the video lineup). Findings are mixed, however. Fitzgerald et al. (3) concluded that video and photo lineups yield comparable discriminability, and that neither procedure is favorable over the other. However, video lineups are typically sequential, and photo lineups, although they can be sequential, are typically simultaneous, and this may affect performance. Seale-Carlisle and Mickes (10) found higher discriminability for simultaneous photo compared to sequential video lineups, and Seale-Carlisle et al. (11) found simultaneous photo and simultaneous video lineups did not differ. Further research considering encoding conditions is needed to clarify this debate.
Summary. This study demonstrates how psychology research grounded in strong theory from basic science can be tested and applied to improve forensic science, and particularly eyewitness memory performance. In the real world, it is likely that eyewitnesses will view a perpetrator from multiple perspectives (particularly nonfrontal) (27), and the interactive procedure allows witnesses to adjust lineup face viewpoints to match their memory of the crime, which appears to be particularly beneficial for memory performance. If research continues to evidence an interactive benefit, simultaneous interactive lineups could be adopted by police forces globally to allow for encoding-retrieval matching and increased ability for diagnostic feature comparison.
Design. A 2 (encoding: front or profile view) × 3 (procedure: interactive, photo, or video) × 2 (target: present or absent) mixed design was used. Encoding and procedure were varied between subjects, while target was varied within subjects. Each participant learned twelve faces at encoding (front view or profile view) and was tested on twelve lineups, of which six were target-present and six were target-absent. The outcome variables were identification accuracy and decision confidence.
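The factorial structure can be enumerated directly. A minimal sketch (condition labels are illustrative, not the authors' materials):

```python
from itertools import product

# Between-subjects cells: 2 encoding views x 3 lineup procedures.
cells = list(product(["front", "profile"],
                     ["interactive", "photo", "video"]))

# Within-subjects trial list per participant: twelve lineups,
# half target-present and half target-absent.
trials = ["target-present"] * 6 + ["target-absent"] * 6
```

Each participant is assigned to one of the six between-subjects cells and contributes all twelve within-subjects lineup trials.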
We used ROC analyses because this method has been successfully applied in lineup research (6) and can quantify participants' ability to discriminate innocent from guilty suspects (28-30). For stable functions, ROC analysis requires large samples. ROC lineup studies typically recruit approximately 500 data points per condition, so for our study we adopted a data collection stopping rule of 6,000 data points (500 participants × 12 conditions = 6,000).
Ethical approval was granted from the Science, Technology, Engineering and Mathematics Ethical Review Committee at the University of Birmingham.

Fig. 1. pROC curves and pAUC statistics for simultaneous interactive, simultaneous static photo, and sequential video lineups, separated by front (A and B) and profile (C and D) encoding conditions. ROC lines of best fit were plotted from numbers estimated by unequal-variance signal-detection models. Chance-level performance is indicated by dashed lines. For pAUC values (B and D), error lines are 95% CIs.

Fig. 2. Visualization of the three lineup procedures. Illustrations of the (A) simultaneous photo lineup, (B) simultaneous interactive lineup, and (C) sequential video lineup. In A, six faces were presented simultaneously in frontal pose. In B, six faces were presented simultaneously in frontal pose, and participants used the computer mouse to click on one face and rotate it; all faces then moved jointly together. In C, only one face was presented at a time via a 20-s video in which the face moved fluidly from front to profile, to the opposite profile, and back to the front. Adapted from ref. 20.

Table 3. Fitted models for the interactive, photo, and video lineups estimating correlation (σ_b) and distributional (μ_target) differences

Model fit: χ2(15) = 12.32, P = 0.655. Note. σ_target was set to be the same over conditions, and c1, c2, and c3 were free to vary. Correlation r = σ_b.