The risk of false inclusion of a relative in parentage testing – an in silico population study

Aim To investigate the potential of false inclusion of a close genetic relative in paternity testing by using computer generated families. Methods 10 000 computer-simulated families over three generations were generated based on genotypes using 15 short tandem repeat loci. These data were used in assessing the probability of inclusion or exclusion of paternity when the father is actually a sibling, grandparent, uncle, half sibling, cousin, or a random male. Further, we considered a duo case where the mother’s DNA type was not available and a trio case including the mother’s profile. Results The data showed that the duo scenario had the highest and lowest false inclusion rates when considering a sibling (19.03 ± 0.77%) and a cousin (0.51 ± 0.14%) as the father, respectively; and the rate when considering a random male was much lower (0.04 ± 0.04%). The situation altered slightly with a trio case where the highest rate (0.56 ± 0.15%) occurred when a paternal uncle was considered as the father, and the lowest rate (0.03 ± 0.03%) occurred when a cousin was considered as the father. We also report on the distribution of the numbers for non-conformity (non-matching loci) where the father is a close genetic relative. Conclusions The results highlight the risk of false inclusion in parentage testing. These data provide a valuable reference when incorporating either a mutation in the father’s DNA type or if a close relative is included as being the father; particularly when there are varying numbers of non-matching loci.

The use of an increasing number of loci in a multiplex amplification leads inevitably to higher confidence in assignment of an individual as being a defined genetic relative of a known person. With an increase in the loci used in a paternity test comes also the increase in the chance of observing a mutational event; leading to the possibility of a false exclusion. However, there also comes the benefit of a potential higher power of discrimination. When testing close genetic relatives as part of a paternity assignment, it is expected that more alleles will be shared, such as in full siblings (1), when compared to a random member of the population. In support of this assumption, a previous study indicated that there was at least a 50% chance of two random men sharing at least one allele at 10 of the 14 loci tested (2). The chance of a false inclusion and exclusion is greater when testing one putative parent and an offspring (a duo scenario) than when there is an additional confirmed parent (a trio scenario). In instances of immigration cases, it may be that one relative poses as a parent of a child; such an incident was reported when a sibling claimed to be the father of a boy (3). The instance when a close genetic relative posed as a parent of an offspring where 9 or 10 loci were used in a paternity test led to unsatisfactory results (4). A similar study highlighted an instance when using 11 polymorphic short tandem repeat (STR) loci there was a matching allele at each locus between a child, the assumed mother, and skeletal remains that were not from the father of that child (5); this same study found 3 further instances of exactly the same scenario when using 10 STR loci. Recently, there has been a report of two tested men presenting matching alleles with a potential offspring at 19 STR loci in a duo case (6).
The probability of excluding a relative from being a true father of an offspring was examined using data for 12 STR loci from a known population (7). An extension of this study, using 12 STR loci, derived the probability of excluding a relative for close genetic relatives (8). A conclusion was that full siblings impersonating parent/child proved the most difficult scenario to discredit with DNA profiling alone. Similarly, it was reported that there was a probability of 12% that there would be no inconsistencies (a shared allele at all loci tested) when comparing data using 18 STR loci when a sibling of a true parent posed as the parent of the tested child (9). In motherless paternity analysis using 15 STR loci, the differences between probabilities for father and uncle were observed to be small (10).
The use of computer-simulated populations has the great benefit of an increase in the size of the available data. Evaluation of the efficacy of trio sibship testing and sibling assignment for forensic purposes by using such model populations was performed in our laboratory (11). In this study, we report on the false paternity probabilities with 15 STR loci when comparing two close genetic relatives (two siblings, paternal grandparent/grandchild, paternal uncle/nephew or niece, two half siblings, and two cousins) and two random persons. These different combinations were generated using 10 000 simulated 3-generation families based on data from the Taiwan population (12). The risks of false inclusion for duos and trios in parentage testing were evaluated respectively.

Populations
A total of 10 000 family groups extending over 3 generations were simulated using 15 STR loci. These data were created using allele frequencies from the study of Lee et al (12). In this previous study, allele frequencies were calculated from 3794 random individuals of the Taiwanese Han population using the software PowerMarker (http://statgen.ncsu.edu/powermarker/index.html). The 15 STR loci were analyzed by using the AmpFlSTR® Identifiler PCR Amplification Kit (Applied Biosystems, Foster City, CA, USA). Genotypes of members G, H, I, J, N, S, X, Y, and Z were randomly generated and those of their offspring were generated following Mendel's laws of inheritance in a Microsoft® Office Excel 2007 spreadsheet using the functions "countif," "indirect," "address," "if," and "randbetween." The potential for a mutational event was not taken into account while creating the family groups. The duo/trio populations, each with 10 000 combinations, were established with combinations of EB/EFB (duo/trio or true parents), CB/CFB (sibling as the father), IB/IFB (paternal grandfather as the father), KB/KFB (paternal uncle as the father), WM/WJM (half-sibling, child's half brother, as the father), RB/RFB (cousin, being the son of father's sister, as the father), and XB/XFB (random male as the father) (Figure 1).
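The generation step described above can be illustrated with a minimal Python sketch. It is not the authors' Excel implementation: the allele frequencies are hypothetical placeholders rather than the Taiwanese data (12), only one locus is shown, and the function names are introduced here for illustration. Founders are drawn at random from the allele frequencies, and each offspring receives one randomly chosen allele from each parent, with no mutation.

```python
import random

def random_genotype(freqs, rng):
    """Draw two alleles independently from the population allele frequencies."""
    alleles = list(freqs)
    weights = [freqs[a] for a in alleles]
    return tuple(sorted(rng.choices(alleles, weights=weights, k=2)))

def child_genotype(father, mother, rng):
    """Mendelian inheritance without mutation: one random allele from each parent."""
    return tuple(sorted((rng.choice(father), rng.choice(mother))))

# Hypothetical allele frequencies for a single locus (assumption, not from ref. 12).
freqs = {12: 0.2, 13: 0.3, 14: 0.5}
rng = random.Random(1)

# Three generations at one locus: grandparents -> father and uncle -> child.
grandfather = random_genotype(freqs, rng)
grandmother = random_genotype(freqs, rng)
father = child_genotype(grandfather, grandmother, rng)
uncle = child_genotype(grandfather, grandmother, rng)
mother = random_genotype(freqs, rng)
child = child_genotype(father, mother, rng)
```

Repeating this for each of the 15 loci and for 10 000 families would give populations of the kind described, from which the duo and trio combinations can be assembled.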

Calculations
The STR genotypes were entered into a spreadsheet and all calculations were performed using Microsoft Office Excel 2007. The likelihood ratio (LR) (or paternity index) of duo and trio parentage testing was calculated using the algorithm recommended by the ISFG (13), where the numerator assumes the tested man is the father and the denominator assumes a random man is the father. A value of zero was used for the non-matching loci. The confidence intervals for the non-exclusion rates (proportions) were calculated with the modified Wald method (Agresti-Coull interval) (14).
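As an illustration of the interval calculation, the following Python sketch implements the standard Agresti-Coull (modified Wald) interval for a proportion. The function name and the default z = 1.96 (95% confidence) are assumptions made here; the count of 1903 corresponds to the 19.03% sibling non-exclusion rate in 10 000 duo combinations.

```python
import math

def agresti_coull(successes, n, z=1.96):
    """Return (adjusted proportion, half-width) of the Agresti-Coull interval."""
    n_adj = n + z ** 2                        # adjusted sample size
    p_adj = (successes + z ** 2 / 2) / n_adj  # adjusted proportion
    half_width = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj, half_width

# 1903 non-exclusions out of 10 000 sibling-child duos (19.03%).
center, half_width = agresti_coull(1903, 10_000)
```

With these inputs the half-width is about 0.0077, that is, roughly 0.77 percentage points, which is consistent with the ± value reported for the sibling duo scenario.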

Possible false inclusions in duo cases
This study was designed to illustrate the potential for misinterpreting a paternal relative (such as a grandparent, uncle, sibling, half sibling, or cousin) as the biological father, compared to a random man, in paternity testing. In duo cases, the highest non-exclusion rate was 19.03 ± 0.77% in the scenario where a sibling posed as the father (Table 1, eg, CB in Figure 1). This indicated that in 19.03 ± 0.77% of cases, the child's sibling could not be distinguished from the true father. The non-exclusion rates when other relatives posed as the father were 2.81 ± 0.32% (grandparent-child, eg, IB), 2.78 ± 0.32% (uncle-child, eg, KB), 2.58 ± 0.31% (half sibling-child, eg, WM), and 0.51 ± 0.14% (cousin-child, eg, RB). In the sibling scenario, the accumulative non-exclusion rate was as high as 51.7 ± 0.98% when one non-matching locus was assumed to be due to a mutation.
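The notion of non-exclusion in a duo case can be sketched in Python as follows. This is an illustrative reading of the text, not the authors' spreadsheet logic: a locus is counted as non-matching when the alleged father and the child share no allele, and a pair is not excluded when the number of non-matching loci does not exceed a tolerated count (0, or 1 when a single mutation is assumed). The three-locus profiles and function names are hypothetical.

```python
def count_nonmatching_duo(alleged_father, child):
    """Count loci at which the two profiles have no allele in common."""
    return sum(
        1 for locus in child
        if not set(alleged_father[locus]) & set(child[locus])
    )

def is_not_excluded(alleged_father, child, tolerated=0):
    """True if the number of non-matching loci is within the tolerated count."""
    return count_nonmatching_duo(alleged_father, child) <= tolerated

# Hypothetical three-locus profiles (the study used 15 loci).
child = {"D8S1179": (12, 14), "TH01": (7, 9), "FGA": (21, 24)}
sibling = {"D8S1179": (12, 13), "TH01": (9, 9), "FGA": (22, 23)}

mismatches = count_nonmatching_duo(sibling, child)      # only FGA fails to match
strict = is_not_excluded(sibling, child)                # no mismatch tolerated
lenient = is_not_excluded(sibling, child, tolerated=1)  # one mutation assumed
```

In this example the sibling is excluded under the strict rule but not when one non-matching locus is attributed to a mutation, which is why the accumulative rate is higher than the strict rate.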
The Log LR (logarithmic value of the likelihood ratio) for the true parent-child pairs ranged from 1.4845 to 11.4087 (Table 2). For paternity testing, the LR reflects how many times more likely the alleged father is to be the child's father than any male taken at random from the population. The mean value of Log LR (α = 0.05) was similar when comparing the true parent-child pairs (5.0207 ± 0.0247) to the sibling-child pairs (5.6010 ± 0.0577). It should be noted that the mean value of Log LR for the sibling-child pairs was even higher than that for the true parent-child pairs; however, the standard deviation for the sibling-child pairs (0.0577) was higher than that for the true parent-child pairs (0.0247).
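For orientation, a per-locus paternity index for a motherless (duo) case can be derived from first principles under Hardy-Weinberg equilibrium, as in the Python sketch below. This is a textbook derivation and is not claimed to reproduce the exact ISFG-recommended algorithm (13) used by the authors; the allele frequencies, the function names, and the use of base-10 logarithms are assumptions. The numerator is the probability of the child's genotype if the tested man transmits one allele and a random mother the other; the denominator is the population frequency of the child's genotype. A locus with no shared allele yields zero.

```python
import math

def duo_paternity_index(child, alleged_father, freqs):
    """Per-locus LR for a duo case: P(child | tested man is father) / P(child | random man)."""
    a, b = child
    numerator = 0.0
    for transmitted in alleged_father:  # each paternal allele is passed on with probability 1/2
        if a == b:
            if transmitted == a:
                numerator += 0.5 * freqs[a]  # random mother supplies the second copy of a
        elif transmitted == a:
            numerator += 0.5 * freqs[b]      # random mother supplies b
        elif transmitted == b:
            numerator += 0.5 * freqs[a]      # random mother supplies a
    denominator = freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b]
    return numerator / denominator

def log10_combined_lr(per_locus_pi):
    """Combine per-locus indices by multiplication; a zero at any locus gives -inf."""
    product = math.prod(per_locus_pi)
    return math.log10(product) if product > 0 else float("-inf")

# Hypothetical allele frequencies for one locus.
freqs = {"A": 0.1, "B": 0.2, "C": 0.7}
pi_both_shared = duo_paternity_index(("A", "B"), ("A", "B"), freqs)  # (pA + pB) / (4 pA pB)
pi_one_shared = duo_paternity_index(("A", "B"), ("A", "C"), freqs)   # 1 / (4 pA)
pi_none_shared = duo_paternity_index(("A", "B"), ("C", "C"), freqs)  # non-matching locus
```

Because a relative often shares rare alleles with the child, the loci that do match can give a large index, which may help explain why non-excluded relatives can reach Log LR values comparable to those of true fathers.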

Possible false inclusions in trio cases
In trio cases, the highest non-exclusion rate was 0.56 ± 0.15% in the scenario where a paternal uncle posed as the father (Table 3, eg, KFB in Figure 1). The non-exclusion rates in scenarios where other relatives posed as the child's father were 0.51 ± 0.14% (sibling, eg, CFB), 0.46 ± 0.13% (half sibling, eg, WJM), 0.38 ± 0.12% (grandparent, eg, IFB), and 0.03 ± 0.03% (cousin, eg, RFB) for each of the 10 000 combinations. Although the uncle scenario gave the highest non-exclusion rate when no non-matching locus was allowed, its accumulative non-exclusion rate when assuming a mutation was 3.57 ± 0.36%; under this assumption, the accumulative rate was highest for the sibling relationship (4.28 ± 0.40%).
The Log LR for the actual parent-mother-child pairs ranged from 4.0061 to 16.0957 (Table 4). The mean value (7.1267) of Log LR (α = 0.05) for the half-sibling-mother-child pairs was the closest to that of the actual parent-mother-child pairs (7.4741); however, its standard deviation was the highest when compared to other combinations of relatives.
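The lower non-exclusion rates in trio cases follow from the stricter consistency rule, which can be sketched in Python under the assumption of no mutation. With the mother's profile known, one of the child's alleles must come from the mother and the other from the alleged father; sharing any allele with the child is no longer sufficient. The function name and the single-locus genotypes are hypothetical.

```python
def trio_locus_consistent(child, mother, alleged_father):
    """True if one child allele can come from the mother and the other from the alleged father."""
    a, b = child
    return (a in mother and b in alleged_father) or (b in mother and a in alleged_father)

# Hypothetical single-locus genotypes.
child = (12, 14)
mother = (12, 13)
uncle = (12, 15)

shares_allele = bool(set(uncle) & set(child))                      # duo-style check
consistent_in_trio = trio_locus_consistent(child, mother, uncle)   # trio check
```

Here the uncle shares allele 12 with the child and would pass a duo-style check at this locus, but the mother must have contributed 12, so the obligate paternal allele is 14, which the uncle lacks; the locus is therefore non-matching in the trio.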

Discussion
For the duo cases, the results illustrated the highest non-exclusion rate when a sibling posed as the child's father. This scenario was in line with a previous report where the most difficult combination to distinguish was when a brother claimed to be the actual father of his sibling and when the mother's genotype was unavailable (8). It was also noted in that report that the probability of not excluding a brother as being the father of his sibling using 12 STR loci was about 27%; and if one mismatch was assumed, it increased to 65%, further illustrating the difficulty of excluding a brother as being the father of a sibling.
It has also been reported that if the alleged parent and child are actually uncle and nephew, the probability of excluding a relative was 0.903 based on 9 STR loci in motherless cases (4), rising to 0.937 when 12 common STR loci were used; when the mother's genotypes were included, the corresponding values were 0.966 and 0.984 using 9 and 12 STRs, respectively. This same study also showed that when 20 STR loci were used, the corresponding probability of excluding a relative was 0.9986 (for a trio) and 0.9875 (for a duo), supporting the assumption that the number of STR markers typed and the inclusion of data from the mother's profile affected the rates of false inclusion. In this study, 15 STR loci were used.
It was reported by Poetsch et al that no STR mismatches for 15 STR loci between a child and an unrelated man were detected in 26 comparisons (duo cases) out of 116 004 from a region of northern Germany (15). Such a study highlights the opportunity for a false inclusion of paternity when a close genetic relative claims to be the father of a child, especially in a small geographical region.
Even with these data, the access to the genotypes of close relatives remains the preferred option to minimize the chance of a false inclusion; although it should be noted that these data are not always available. In the current study, we report on the risk of false inclusion in parentage testing to provide a valuable reference for forensic laboratories when incorporating either a mutation in the DNA profile from a putative father or when a close relative is the potential father.
We report on the evaluation of possible false inclusions in duo and trio cases when replacing the real/true father with the other close relatives and also with a random man. The highest non-exclusion rates for the duo cases were observed in the scenario where a sibling claimed to be the true father. For the trio cases the highest non-exclusion occurred when a paternal uncle posed as the biological father. When a single mutational event was incorporated into the 15 STR loci test, the highest accumulative non-exclusion rate was observed when a sibling posed as a true father in the duo and trio combinations. The results highlight the risk of potential false inclusion in parentage testing.
Funding We thank the National Science Council of Taiwan who supported the simulation study by a grant NSC97-2320-B-002-037-MY3 and the Taiwan Ministry of Justice that supported the DNA database project (100-1301-05-0503).
Ethical approval Not required.
Declaration of authorship JCIL, LCT, CYL, TYH, AL, and HMH participated in designing the methods, analyzing and interpreting the results, and preparing the manuscript. PCC, YYL, and YJY participated in performing the computer simulation.