Data on likelihood ratios of two-person DNA mixtures interpreted using semi- and fully continuous systems

In the paper, “Probabilistic approaches to interpreting two-person DNA mixtures from post-coital specimens” [1], we analysed 102 two-person DNA samples from simulated mixtures and male-female and male-male post-coital specimens. We report here data on profile characteristics of these samples and likelihood ratios (LRs) generated using semi- and fully continuous systems. log10 LRs from both true-contributor and non-contributor tests are presented. These data may supplement studies comparing the performance of different probabilistic systems for DNA evidence interpretation.


Data
Data on the quality of two-person DNA samples from simulated mixtures and post-coital samples are presented in Table 1. These mixtures were analysed using LRmix Studio [2] and STRmix™ [3]. We present LRs from true contributor tests conditioned on the presence of a known contributor (H1 true LRs) (Table 2). We further report a summary of non-contributor tests (Table 3) and attach a list of all non-zero log10 H2 true LRs calculated using STRmix™ (Supplementary material).

Experimental design, materials and methods
Ethical clearance was issued by the University of the Philippines Manila Research Ethics Board (UPMREB Code: 2012-321-01). All sample donors provided written informed consent to participate.
Various post-coital specimens (vaginal and anal swabs, undergarment cuttings, internal and external condom swabs) were obtained from a male-female and a male-male pair. Simulated two-person mixtures from a different male-female pair were also prepared at known proportions. Details on sample collection and processing can be found in Ref. [1]. Briefly, DNA samples were amplified using the PowerPlex® 21 system, which targets 20 short tandem repeat (STR) loci, then separated and detected using the AB® 3500 Genetic Analyzer (Thermo Fisher Scientific). GeneMapper® ID-X v.1.2 (Thermo Fisher Scientific) was used to generate and analyse electropherograms. A total of 102 two-person mixtures of variable quality were available for interpretation.
Likelihood ratios were calculated using LRmix Studio v.2.1.3 [2] and STRmix™ v.2.5.11 [3]. LRmix employs a semi-continuous approach incorporating probabilities of drop-out and drop-in [4]. STRmix™ is a fully continuous system that models peak height variation [3,5], exponential degradation [6,7], drop-in following a gamma distribution [8], and allele-specific stuttering [5,9,10]. It also reports mixture proportions according to Clayton and Buckleton [11]. Both use the Balding and Nichols equations (recommendation 4.2 of NRC II) as the population genetic model [12,13]. LRmix does not model stuttering, thus stutter filters were applied. However, unlabelled peaks on stutter positions where an allele was expected were manually called to avoid inconsistent decisions on assigning short peaks either as allele or stutter. LRs were calculated using the Pr(D) determined by the software that results in the lowest LR. For STRmix™, stutter peaks were included in the input files.
Other specific parameters used in operating the software can be found in Ref. [1]. All computations used Philippine population allele frequencies [14] and a subpopulation correction factor (θ) of 0.03 [15]. Calculations were conditioned on the presence of the female or the receptive partner's profile. The following propositions were evaluated: under H1, the mixture originated from the known contributor and the person-of-interest (POI); under H2, the mixture originated from the known contributor and an unknown individual.
We further conducted non-contributor tests [16] for each interpretation system. This was done by replacing the POI 10,000 times with a profile randomly generated from Philippine population allele frequencies [14], so that H2 is true. For each test, LRmix Studio shows the minimum, the maximum, and the 1st, 50th, and 99th percentiles of the 10,000 LRs calculated, while STRmix™ reports all LR values.

Value of the data
The mixture profile characteristics of samples vis-a-vis corresponding likelihood ratios (LRs) can be used to investigate how the LR is affected by factors such as the number of drop-outs, average peak height, and mixture proportion of the person-of-interest (POI).
The LRs presented can add to data comparing performance of different software for mixture interpretation.
The dataset can be used in inter-laboratory comparisons employing different mixture interpretation strategies.
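To illustrate the population genetic model named in the methods, the following is a minimal sketch of the Balding and Nichols conditional genotype probabilities (NRC II recommendation 4.2) for a single-source homozygote and heterozygote. The function names and the allele frequencies in the example are hypothetical; only the subpopulation correction of 0.03 is taken from the text. How each program applies the correction inside its mixture model is not shown here.

```python
def bn_homozygote(p: float, theta: float) -> float:
    """Conditional probability of genotype A_i A_i given one observed A_i A_i.

    p     -- population frequency of allele A_i (hypothetical input)
    theta -- subpopulation correction factor
    """
    num = (2 * theta + (1 - theta) * p) * (3 * theta + (1 - theta) * p)
    den = (1 + theta) * (1 + 2 * theta)
    return num / den


def bn_heterozygote(p_i: float, p_j: float, theta: float) -> float:
    """Conditional probability of genotype A_i A_j given one observed A_i A_j."""
    num = 2 * (theta + (1 - theta) * p_i) * (theta + (1 - theta) * p_j)
    den = (1 + theta) * (1 + 2 * theta)
    return num / den


if __name__ == "__main__":
    theta = 0.03  # value used in the text
    # Hypothetical allele frequencies, for illustration only.
    print(bn_homozygote(0.10, theta))
    print(bn_heterozygote(0.10, 0.20, theta))
```

With theta set to zero, both functions reduce to the Hardy-Weinberg products p² and 2·p_i·p_j; a positive theta raises the genotype probability and therefore makes the resulting LR more conservative.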
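The non-contributor test described in the methods can be sketched as follows. This is an illustration under stated assumptions, not the implementation used by LRmix Studio or STRmix™: the LR calculation is passed in as a hypothetical callable `lr_fn`, the two alleles at each locus are drawn independently from the allele frequencies, and percentiles use a simple nearest-rank rule.

```python
import math
import random
from typing import Callable, Dict, List, Tuple

Genotype = Tuple[str, str]
Profile = Dict[str, Genotype]
AlleleFreqs = Dict[str, Dict[str, float]]  # locus -> allele -> frequency


def random_profile(freqs: AlleleFreqs, rng: random.Random) -> Profile:
    """Draw two alleles per locus independently from the allele frequencies."""
    profile: Profile = {}
    for locus, allele_freqs in freqs.items():
        alleles = list(allele_freqs)
        weights = list(allele_freqs.values())
        a, b = rng.choices(alleles, weights=weights, k=2)
        profile[locus] = tuple(sorted((a, b)))
    return profile


def percentile(sorted_values: List[float], q: float) -> float:
    """Nearest-rank percentile of an ascending list (assumed rule)."""
    rank = max(1, math.ceil(q / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def non_contributor_test(
    freqs: AlleleFreqs,
    lr_fn: Callable[[Profile], float],  # hypothetical LR calculator
    n: int = 10_000,
    seed: int = 0,
) -> dict:
    """Replace the POI with n random profiles and summarise the LRs."""
    rng = random.Random(seed)
    lrs = sorted(lr_fn(random_profile(freqs, rng)) for _ in range(n))
    # log10 is undefined for LR = 0, so only non-zero LRs are transformed.
    nonzero_log10 = [math.log10(lr) for lr in lrs if lr > 0]
    return {
        "min": lrs[0],
        "p1": percentile(lrs, 1),
        "p50": percentile(lrs, 50),
        "p99": percentile(lrs, 99),
        "max": lrs[-1],
        "n_nonzero": len(nonzero_log10),
        "nonzero_log10": nonzero_log10,
    }
```

The summary keys mirror the statistics the text attributes to LRmix Studio (minimum, maximum, and the 1st, 50th, and 99th percentiles), while the `nonzero_log10` list corresponds to the non-zero log10 H2 true LRs reported for STRmix™.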