Evaluation of Cofactor Markers for Controlling Genetic Background Noise in QTL Mapping

In order to control the genetic background noise in QTL mapping, cofactor markers were incorporated in single marker analysis (SMACO) and interval mapping (CIM). A simulation was performed to examine how the effectiveness of cofactors depended on the number of QTL, the number and type of markers, and the marker spacing. The results of QTL mapping for the simulated data showed that the use of cofactors was only slightly effective when detecting a single QTL, whereas a considerable improvement was observed when dealing with more than one QTL. Genetic background noise was absorbed efficiently with linked markers rather than unlinked markers. Furthermore, the efficiency of QTL mapping differed depending on the type of linked markers. Well-chosen markers in both SMACO and CIM narrowed the range of linkage positions for a significant QTL and made the estimates of QTL effects accurate. Generally, 3 to 5 cofactors offered accurate results. Over-fitting was a problem with many regressor variables when the heritability was small. Marker spacings from 4 to 20 cM did not greatly change the detection of multiple QTLs, but detection was less efficient when the marker spacing exceeded 30 cM. The likelihood ratio increased with a large heritability, and the threshold heritability for QTL detection was between 0.30 and 0.05. (Asian-Aust. J. Anim. Sci. 2003. Vol 16, No. 473-480)


INTRODUCTION
Quantitative trait loci (QTLs) have been identified and mapped by associating trait phenotypes with multiple markers. Analytical models in conventional methods often included a single marker effect or flanking marker effects. The former method was called single marker analysis (SMA) and the latter interval mapping (IM; Lander and Botstein, 1989). However, these methods were likely to give a low resolution for a QTL because its effect was confounded with the effects of other QTLs elsewhere in the linkage group (Wright and Kong, 1997). In order to reduce such bias, Zeng (1993) and Jansen (1993) suggested the simultaneous use of a marker interval and a few other well-chosen single markers, and the method was called composite IM (CIM). Employing additional marker loci allowed the genetic background noise to be absorbed, narrowing the marker region for a significant QTL and considerably increasing QTL resolution (Zeng, 1994). Although CIM has been increasingly used in practical QTL mapping in animals and plants (Boyle and Gill, 2001; Drake et al., 2001; Robison et al., 2001; Wayne et al., 2001; Wu et al., 2001, 2002), investigations for optimizing the effects of cofactors were still limited (Piepho and Gauch, 2001; Lee, 2002). The effect of CIM largely depended on the choice of cofactors. In this paper, we evaluated the incorporation of cofactors in both IM and SMA for QTL mapping through simulation. The SMA extended in this way was referred to as single-marker analysis with cofactors (SMACO).

A general idea on cofactor markers
Suppose a biallelic QTL was located between markers i and i+1, and that some other QTLs were located elsewhere in the linkage group. Assume the putative QTL of interest was $r_1$ recombination units apart from marker i and $r_2$ apart from marker i+1. The trait value was expressed as below; [...] where $y_k$ was the observed value for individual k in the population, $\mu$ was the mean of observed values, [...]

The following likelihood ratio was defined as a test statistic for the null hypothesis that a QTL is not linked to the marker (SMACO) or the marker interval (CIM):

$$LR = -2 \ln \frac{L_0}{L_1} \qquad (5)$$

where $L_0$ and $L_1$ were the maximized likelihoods under the null and alternative hypotheses, respectively. The corresponding lod score for the null hypothesis test was

$$LOD = -\log_{10} \frac{L_0}{L_1} \qquad (6)$$

The genetic effect of the putative QTL linked to the marker(s) was estimated by maximum likelihood.
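The relation between the two test statistics can be illustrated numerically. The following is a minimal Python sketch, assuming the conventional definitions LR = -2 ln(L0/L1) and lod = -log10(L0/L1), where L0 and L1 are the maximized likelihoods under the null and alternative hypotheses. The function names are hypothetical and are not taken from QTL Cartographer.

```python
import math


def likelihood_ratio(loglik_null: float, loglik_alt: float) -> float:
    """LR = -2 ln(L0 / L1), computed from the two maximized log-likelihoods."""
    return -2.0 * (loglik_null - loglik_alt)


def lod_score(loglik_null: float, loglik_alt: float) -> float:
    """LOD = -log10(L0 / L1), which equals LR / (2 ln 10)."""
    return likelihood_ratio(loglik_null, loglik_alt) / (2.0 * math.log(10.0))


if __name__ == "__main__":
    # Illustrative log-likelihoods only; these are not values from the study.
    print(likelihood_ratio(-110.0, -100.0))
    print(round(lod_score(-110.0, -100.0), 4))
```

Under these definitions the two statistics differ only by the constant factor 2 ln 10 (about 4.61), so an LR of 20 corresponds to a lod score of about 4.34.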

An example of single marker analysis with cofactors
Since Zeng (1993, 1994) described CIM as a combination of simple IM and multiple linear regression, here we show an example for SMACO using backcross progeny. A progeny population was derived from backcrossing the F1 (Aa) to the parent with the AA marker genotype. The analytical regression model for SMACO with backcross progeny was

$$y_k = b_0 + b_i X_{ik} + \sum_{j} b_j X_{jk} + e_{ik}$$

where $b_0$ was the intercept of the model, $b_i$ was the slope of regression for the putative QTL linked to marker i, $X_{ik}$ was a dummy variable taking 1 for marker genotype AA and -1 for Aa, $b_j$ was the partial regression coefficient of the observation on marker j, $X_{jk}$ was the dummy variable for cofactor marker j of individual k, taking 1 for the cofactor marker genotype AA and -1 for Aa, and $e_{ik}$ was the residual.
Assuming that $e_{ik}$ was normally distributed with mean zero and variance $\sigma^2$, the likelihood function for the SMACO was

$$L = (2\pi\sigma^2)^{-(n_1+n_2)/2} \exp\left\{-\frac{1}{2\sigma^2}\left[\sum_{k=1}^{n_1}\Big(y_k - b_0 - b_i - \sum_{j} b_j X_{jk}\Big)^2 + \sum_{k=1}^{n_2}\Big(y_k - b_0 + b_i - \sum_{j} b_j X_{jk}\Big)^2\right]\right\}$$

where $n_1$ and $n_2$ were the numbers of individuals having marker genotypes AA and Aa, respectively, at the marker locus i, with the first sum taken over the AA individuals and the second over the Aa individuals. Then the likelihood ratio and the lod score for the null hypothesis were obtained as shown in Equations (5) and (6).
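For illustration, the SMACO test at a single marker can be sketched as a comparison of two least-squares fits. This is a minimal sketch and not the QTL Cartographer implementation: it assumes a pure regression on marker genotypes coded 1 (AA) and -1 (Aa), normal residuals, and the maximum likelihood variance estimate RSS/n, under which the likelihood ratio reduces to n ln(RSS0/RSS1). The function name `smaco_lr` and its arguments are hypothetical.

```python
import numpy as np


def smaco_lr(y, x_test, x_cof):
    """Likelihood ratio for H0: b_i = 0 in y = b0 + b_i*X_i + sum_j b_j*X_j + e.

    y      : (n,) phenotypes
    x_test : (n,) tested marker, coded +1 (AA) / -1 (Aa)
    x_cof  : (n, m) cofactor markers with the same coding, m >= 1
    Returns (LR, estimate of b_i).
    """
    y = np.asarray(y, dtype=float)
    n = y.shape[0]
    x_cof = np.asarray(x_cof, dtype=float)
    if x_cof.ndim == 1:  # a single cofactor passed as a vector
        x_cof = x_cof[:, None]
    x_test = np.asarray(x_test, dtype=float).reshape(n, 1)

    # Null model: intercept + cofactors. Full model: null + tested marker.
    design_null = np.hstack([np.ones((n, 1)), x_cof])
    design_full = np.hstack([design_null, x_test])

    def fit(design):
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ beta
        return float(resid @ resid), beta

    rss0, _ = fit(design_null)
    rss1, beta_full = fit(design_full)
    # With sigma^2 estimated by RSS/n, -2 ln(L0/L1) = n ln(RSS0/RSS1).
    lr = n * np.log(rss0 / rss1)
    return float(lr), float(beta_full[-1])


if __name__ == "__main__":
    # Toy data (not the simulated populations of this study).
    rng = np.random.default_rng(0)
    n = 300
    x_i = rng.choice([-1.0, 1.0], size=n)
    cof = rng.choice([-1.0, 1.0], size=(n, 3))
    y = 2.0 + 1.0 * x_i + 0.8 * cof[:, 0] + rng.normal(size=n)
    print(smaco_lr(y, x_i, cof))
```

Fitting the cofactors in both the null and the full model is what lets them absorb variation caused by other QTLs, so that the test for the marker of interest is made against a smaller residual variance.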

Simulation
A simulation study for diploid organisms was performed to investigate the properties and utility of cofactors. Typical backcross populations were simulated with QTL Cartographer for Windows Version 1.01 (Basten et al., 1994). From two distinct populations, six males from one population and six females from the other were randomly selected and mated, producing six full-sib families in the F1 generation. Five daughters from each full-sib family were backcrossed to their corresponding fathers. Ten individuals were born from each pair, and the resulting 300 individuals in the backcross generation were genotyped.
A total of six populations were simulated, and two chromosomes, each 120 cM in linkage length, were generated for each population. The input values used to simulate these populations are presented in Table 1. In all the populations, 13 markers were evenly spaced on each chromosome. For population 1, QTL1 was assigned at 67 cM on chromosome 1, and QTL2 at 35 cM and QTL4 at 99 cM on chromosome 2. The additive effects for QTL1, QTL2, and QTL4 were 1.040, 0.956, and 1.174, respectively, with a heritability of 0.5. Dominance and epistasis were not considered for the QTLs. This population was analyzed using both CIM and SMACO. We examined the effect of using various numbers and different types of cofactors to control genetic background noise. QTLs were searched using SMA with 1 (SMACO_1), 3 (SMACO_3), 5 (SMACO_5), 7 (SMACO_7), and all other loci (SMACO_all) as cofactors. Note that SMACO_all was the multiple marker analysis. The CIM was also used with 1 (CIM_1), 3 (CIM_3), 5 (CIM_5), 7 (CIM_7), and all other loci (CIM_all) as cofactors. All the markers on chromosomes 1 and 2 were examined in groups: unlinked markers (UL), markers linked to nuisance QTL (LNQ), markers not closely linked to nuisance QTL (NCLNQ), and flanking markers (FL). The UL were the markers in a linkage group other than the one where the QTL of interest was located. For example, when QTL2 and QTL4 on chromosome 2 were examined, the markers on chromosome 1 were regarded as unlinked markers. The LNQ were the markers located within 25 cM of the nuisance QTL, and those located farther than 25 cM from the QTL were regarded as NCLNQ. The FL were the markers flanking the marker interval examined.
Populations 2 to 6 were simulated to illustrate the influence of heritability level on QTL detection. QTL2 and QTL4 on chromosome 2 were located as in population 1, but QTL1 was not simulated. Additionally, QTL3 was simulated at 90 cM on chromosome 2, so it was only 9 cM apart from QTL4. The additive effects of QTL2, QTL3, and QTL4 varied among the five populations while the same amount of environmental variance was used. The heritability of the five populations ranged from 0.1 to 0.9.
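The grouping of candidate cofactor markers into UL, FL, LNQ, and NCLNQ can be sketched as a small classification rule. This is only an illustration under stated assumptions: "flanking" is taken to mean the markers immediately outside the tested interval, a marker exactly 25 cM from a nuisance QTL is counted as linked, and FL takes priority over LNQ when both apply. None of these tie-breaking choices is specified in the text, and the function name is hypothetical.

```python
def classify_cofactor(chrom, pos, test_chrom, interval, nuisance_qtl_pos,
                      spacing=10.0, near_cm=25.0):
    """Return 'UL', 'FL', 'LNQ' or 'NCLNQ' for one candidate cofactor marker.

    chrom, pos        : chromosome and position (cM) of the candidate marker
    test_chrom        : chromosome carrying the interval under test
    interval          : (left, right) marker positions of the tested interval
    nuisance_qtl_pos  : positions (cM) of the other QTLs on test_chrom
    spacing           : marker spacing in cM (assumed even)
    near_cm           : distance defining "linked to a nuisance QTL"
    """
    if chrom != test_chrom:
        return "UL"  # marker lies in another linkage group
    left, right = interval
    # Assumption: flanking markers are the first markers outside the interval.
    if abs(pos - (left - spacing)) < 1e-9 or abs(pos - (right + spacing)) < 1e-9:
        return "FL"
    if any(abs(pos - q) <= near_cm for q in nuisance_qtl_pos):
        return "LNQ"
    return "NCLNQ"


if __name__ == "__main__":
    # Testing the interval 30-40 cM on chromosome 2 (QTL2 at 35 cM),
    # with QTL4 at 99 cM acting as the nuisance QTL.
    for chrom, pos in [(1, 60.0), (2, 20.0), (2, 60.0), (2, 90.0)]:
        print(chrom, pos, classify_cofactor(chrom, pos, 2, (30.0, 40.0), [99.0]))
```

With 10 cM spacing and the interval 30 to 40 cM under test, the markers at 20 and 50 cM come out as FL, the marker at 90 cM as LNQ (9 cM from QTL4), and the marker at 60 cM as NCLNQ.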
Various marker densities were also generated for population 4 to investigate the influence of marker spacing on QTL mapping. The numbers of markers used per chromosome were 61, 31, 13, 7, 5, and 4, evenly distributed on each of the two chromosomes. The corresponding marker spacings were 2 cM, 4 cM, 10 cM, 20 cM, 30 cM, and 40 cM, respectively.
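The correspondence between marker number and marker spacing follows from placing markers evenly on a 120 cM chromosome. The sketch below assumes that the first and last markers sit at the two chromosome ends, which is consistent with the numbers given; the function names are hypothetical.

```python
def marker_spacing(chrom_length_cm: float, n_markers: int) -> float:
    """Spacing (cM) between adjacent markers when n_markers are evenly
    placed with one marker at each end of the chromosome."""
    if n_markers < 2:
        raise ValueError("at least two markers are needed to define a spacing")
    return chrom_length_cm / (n_markers - 1)


def marker_positions(chrom_length_cm: float, n_markers: int) -> list:
    """Positions (cM) of the evenly spaced markers, starting at 0."""
    step = marker_spacing(chrom_length_cm, n_markers)
    return [k * step for k in range(n_markers)]


if __name__ == "__main__":
    for n in (61, 31, 13, 7, 5, 4):
        print(n, marker_spacing(120.0, n))
```

For example, 13 markers define 12 intervals, so the spacing is 120/12 = 10 cM.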
A total of 50 replicates were simulated for each population. The simulated data were analyzed by the proposed methods, and the calculated LRs were compared to genome-wide threshold values at a 0.05 significance level. The threshold values were obtained with 1000 replicates by permutation tests (Churchill and Doerge, 1994).
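The permutation approach to a genome-wide threshold can be sketched as follows. This is a simplified illustration, assuming single marker regression without cofactors on markers coded 1 and -1, with no missing genotypes and no monomorphic markers; the thresholds in this study were obtained for the proposed methods, not with this code. The function names are hypothetical.

```python
import numpy as np


def single_marker_lr(y, markers):
    """LR at each marker from regressing y on one marker at a time.

    y       : (n,) phenotypes
    markers : (n, m) marker genotypes coded +1 / -1
    """
    y = np.asarray(y, dtype=float)
    markers = np.asarray(markers, dtype=float)
    n = y.shape[0]
    yc = y - y.mean()
    xc = markers - markers.mean(axis=0)
    rss0 = yc @ yc                      # residual SS of the intercept-only model
    sxy = xc.T @ yc
    sxx = (xc ** 2).sum(axis=0)
    rss1 = rss0 - sxy ** 2 / sxx        # residual SS after fitting each marker
    return n * np.log(rss0 / rss1)


def permutation_threshold(y, markers, n_perm=1000, alpha=0.05, seed=0):
    """Genome-wide LR threshold: the (1 - alpha) quantile of the maximum LR
    over all markers, across phenotype permutations."""
    rng = np.random.default_rng(seed)
    max_lr = np.empty(n_perm)
    for p in range(n_perm):
        # Shuffling phenotypes breaks any marker-trait association
        # while keeping the marker correlation structure intact.
        max_lr[p] = single_marker_lr(rng.permutation(y), markers).max()
    return float(np.quantile(max_lr, 1.0 - alpha))


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    toy_markers = rng.choice([-1.0, 1.0], size=(300, 13))  # toy data only
    toy_y = rng.normal(size=300)
    print(permutation_threshold(toy_y, toy_markers, n_perm=200))
```

Taking the maximum LR over all markers within each permutation is what makes the resulting threshold genome-wide rather than pointwise.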

Single marker analysis with or without cofactors
In population 1, QTL1 was detected by SMA regardless of whether cofactors were included (Figure 1). Incorporating cofactors increased the likelihood ratio (LR) at markers closely linked to the QTL, except for SMACO_all, and decreased it at markers far from the QTL. The LR estimates showed that SMACO_3 worked most effectively. Table 2 shows that the estimates of the QTL1 effect using SMA and SMACO_all differed (p<0.01) from the input value. Using SMACO_1, SMACO_3, SMACO_5, and SMACO_7 reduced the difference (p<0.05). The estimation of the QTL effect was thus improved by introducing cofactors.
All the SMA with or without cofactors, except for SMACO_all, were able to detect the two QTLs on chromosome 2 (Figure 2). Using ordinary SMA, the LR estimates at all markers were significant at the 0.05 genome-wide significance level. Introducing 3 to 7 cofactor loci dramatically reduced the genetic background noise.
With multiple QTLs, incorporating cofactors in the analytical models also improved the estimation of QTL effects (Table 2). Estimates of the QTL2 and QTL4 effects using ordinary SMA differed greatly (p<0.01) from their input values. The overestimation resulted from the genetic background noise due to the segregation of the linked QTL. Incorporating cofactors absorbed much of this genetic background noise in estimating QTL effects. The estimates of the QTL2 and QTL4 effects using SMACO_5 did not differ (p>0.01) from their input values. However, using SMACO_all led to underestimation (p<0.05) of the QTL with the smaller effect and overestimation (p<0.01) of the QTL with the larger effect.

Interval mapping with or without cofactors
The use of CIM in QTL detection produced results similar to those obtained by SMACO but offered more accurate estimates of QTL effects than SMACO. When chromosome 1 was analyzed with cofactor(s), LR estimates increased at the marker intervals flanking or closely linked to QTL1, except for CIM_all, but decreased at other intervals (Figure 3). The LR estimates showed that the use of three cofactors (CIM_3) worked most effectively, as was the case with SMACO_3. The estimation of the QTL1 effect was improved by introducing cofactor(s) (Table 3).
In the analyses of multiple QTLs on chromosome 2, LR estimates were significant at all marker intervals in IM (Figure 4). Incorporating cofactor markers produced smaller LR estimates at unlinked marker intervals (Figure 4). The effects of QTL2 and QTL4 were overestimated (p<0.01) with IM, and the position estimates were different from the input values (Table 3). However, they were greatly improved by introducing cofactor(s); in particular, the estimates of the QTL2 and QTL4 effects using CIM_5 did not differ (p>0.01) from their input values.

Types of cofactors
When CIM was applied to mapping QTLs on chromosome 2, the estimates of QTL effects and positions using CIM with UL corresponded to those using IM (Table 4). However, estimation of the genetic effects was improved using linked markers such as NCLNQ, LNQ, and FL. In particular, estimates of genetic effects with LNQ did not differ (p>0.05) from their input values. Furthermore, incorporating such linked markers dramatically reduced the LR estimates at the control marker interval while the LR

Heritability level
Populations 2 to 6 with various heritabilities were used to investigate how heritability level influenced QTL detection with cofactors, and the results for chromosome 2 using CIM are shown in Figure 5. All analyses failed to distinguish the two closely linked QTLs, QTL3 and QTL4, and this jointly detected QTL was referred to as QTLJ. Larger LR estimates at the QTL regions were obtained with a larger heritability. QTL2 and QTLJ were detected in all the replicates when the heritability was 0.5 or larger. The results using SMACO showed similar patterns (data not presented). With a heritability of 0.1, SMACO failed to discover one of the two QTLs in 45 out of 50 replicates, and it failed in 17 out of 50 replicates with a heritability of 0.3. When the heritability was 0.5 or higher, these two QTLs were discovered at the significance level of 0.05 in all replicates.
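The replicate counts above can be expressed as empirical detection rates. The sketch below is a simple illustration; the binomial standard error is an added assumption (independent replicates) and is not a quantity reported in this study. The function names are hypothetical.

```python
import math


def detection_rate(n_failed: int, n_replicates: int) -> float:
    """Proportion of replicates in which both QTLs were detected."""
    return (n_replicates - n_failed) / n_replicates


def binomial_se(rate: float, n_replicates: int) -> float:
    """Standard error of a proportion, assuming independent replicates."""
    return math.sqrt(rate * (1.0 - rate) / n_replicates)


if __name__ == "__main__":
    # (heritability, number of failed replicates out of 50) as given in the text
    for h2, failed in [(0.1, 45), (0.3, 17), (0.5, 0)]:
        rate = detection_rate(failed, 50)
        print(h2, rate, round(binomial_se(rate, 50), 3))
```

Under these assumptions, failing in 45 and 17 of 50 replicates corresponds to detection rates of 0.10 and 0.66, with standard errors of roughly 0.04 and 0.07, so 50 replicates resolve the detection rate only to within several percentage points.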

Marker spacing
Population 4 was used to examine how marker spacing influenced QTL detection with cofactors, and the results using CIM are shown in Figure 6. QTL3 and QTL4 were detected jointly when the marker spacings ranged from 4 cM to 40 cM. They were detected separately only when the marker spacing was 2 cM. In attempts to distinguish two very closely linked QTLs (less than 5 cM apart), both CIM and SMACO failed even when the marker spacing was reduced to 1 cM (data not presented).
The LR peak for QTL2 was clearly identified when the marker spacing ranged from 2 cM to 20 cM. It became hardly detectable when the marker spacing was 30 cM or larger, and its estimated position began to shift.

Reduction of genetic background noise by cofactors
An ordinary SMA has a theoretical problem of producing underestimated QTL effects (Liu, 1998), and so does SMACO (Table 2). However, the genetic background noise attributable to other QTLs was dramatically absorbed when well-chosen markers were used as cofactors.
Applying single-QTL methods such as SMA and IM to multiple QTLs was complicated by the confounded effects of those QTLs, because segregation of other QTLs also contributed to the phenotypic variance. Therefore, neither SMA nor IM could discern whether significant effects at several linked markers were due to a common QTL or to several linked QTLs. The presence of multiple QTLs introduced serious biases into QTL estimation with these approaches. It often led to a confusing situation where all marker loci were significant. This problem was especially serious with SMA. However, this did not mean that SMA was incapable of searching for multiple QTLs. The current study showed that the accuracy of QTL mapping increased when the linked QTLs were separated. Using SMACO and CIM, the effects of QTLs elsewhere in the linkage group were efficiently controlled. As a result, LRs increased for markers closely linked to a QTL but decreased for markers far from the QTL. The results indicated that both CIM and SMACO were powerful in dealing with multiple QTLs.

Choice of cofactors
This study indicated that the choice of markers as cofactors was of great importance in mapping multiple QTLs. The genetic background noise was dramatically absorbed with linked markers, but not with unlinked markers. This study showed that flanking markers could be desirable cofactors when the heritability was high. Piepho and Gauch (2001) also suggested that flanking markers were the best controls because they were very efficient at absorbing genetic background noise. Note that precaution should be taken with closely linked flanking markers. The flanking markers tended to over-absorb the QTL effect, especially with a small heritability. As a result, the effects of weak QTLs were underestimated. Another concern in CIM and SMACO was the number of cofactors. This study showed that 3 to 5 cofactors led to good results in mapping multiple QTLs. Note that over-fitting became a problem when the heritability was low and the number of regressor variables was large. No more than 5 markers were recommended for a low heritability.
Notes to Table 4: 1 A control non-QTL interval (60~70 cM) was also examined to reflect false positive results. 2 Four cofactors were used in all analyses except for NONE. Since the QTLs on chromosome 2 were being examined, all the markers located on chromosome 1 were regarded as UL. LNQ were those located within 25 cM from the nuisance QTL, and those located greater than 25 cM from the QTL were regarded as NCLNQ. FL were flanking the interval being examined. The standard errors were empirically obtained from 50 replicates. 3 * p<0.05, ** p<0.01.

Heritability level
This study showed that the heritability level influenced QTL mapping. A higher level of heritability led to more significant LRs and more accurate estimates of QTL effects and positions. The accuracy of marker-trait association was increased with a large heritability, enhancing the ability to detect and map QTLs (Belknap, 1998; Wu, 1999; Kearsey and Farquhar, 1998). The current study suggested that reliable results in QTL mapping were obtained when heritability was 0.3 or larger. This concurred with the findings of Williams and Blangero (1999), who investigated the asymptotic power of the LR test for detecting linkage to a QTL and found a minimum detectable QTL heritability of 0.35. Knowing such a threshold level of heritability was important for optimizing the experimental design and the sample size for QTL mapping (Weller et al., 1990; Moreno-Gonzalez, 1993).

Marker spacing
The results from the current study showed that the performance of CIM and SMACO in the search for multiple QTLs did not differ among marker spacings from 4 to 20 cM. Darvasi et al. (1993) reported that the power of detecting a QTL was virtually the same for a marker spacing of 10 cM as for an infinite number of markers and was only slightly decreased for marker spacings of 20 to 50 cM. They also found that reducing marker spacing below the resolving power, defined as the 95% confidence interval of the map position, did not narrow the confidence interval further. Similar evidence was presented by Piepho (2000): the power of QTL detection and the standard errors of genetic effect estimates were affected little by increasing marker density beyond one marker per 10 cM. The present study showed that the location of QTL2 tended to be shifted with marker spacings from 30 cM to 40 cM. This was in agreement with Martinez and Curnow (1992) and Haley and Knott (1992), who reported spurious QTLs in their search for multiple QTLs using IM with such large marker spacings. On the other hand, reducing the marker spacing to 2 cM allowed the two QTLs (QTL3 and QTL4) that were 9 cM apart to be detected separately.