Haplotype-based genome wide association study using a novel SNP-set method : RAINBOW

Background Diffculty in detecting rare variants is one of the problems in conventional genome wide association studies (GWAS). The problem is closely related to the complex gene compositions comprising multiple alleles, such as haplotypes. Several single nucleotide polymorphism (SNP) set approaches have been proposed to solve this problem. These methods, however, have been rarely discussed in connection with haplotypes. In this study, we developed a novel SNP-set GWAS method named “RAINBOW” and applied the method to haplotype-based GWAS by regarding a haplotype block as a SNP-set. Combining haplotype block estimation and SNP-set GWAS, haplotype-based GWAS can be conducted without prior information of haplotypes. Results We prepared 100 datasets of simulated phenotypic data and real marker genotype data of Oryza sativa subsp. indica, and performed GWAS of the datasets. We compared the power of our method, the conventional single-SNP GWAS, the conventional haplotype-based GWAS, and the conventional SNP-set GWAS. The results of the comparison indicated that the proposed method was able to better control false positives than the others. The proposed method was also excellent at detecting causal variants without relying on the linkage disequilibrium if causal variants were genotyped in the dataset. Moreover, the proposed method showed greater power than the other methods, i.e., it was able to detect causal variants that were not detected by the others, especially when the causal variants were located very close to each other and the directions of their effects were opposite. Conclusion The proposed method, RAINBOW, is especially superior in controlling false positives, detecting causal variants, and detecting nearby causal variants with opposite effects. By using the SNP-set approach as the proposed method, we expect that detecting not only rare variants but also genes with complex mechanisms, such as genes with multiple causal variants, can be realized. RAINBOW was implemented as the R package and is available at https://github.com/KosukeHamazaki/RAINBOW.


Background
With the decreasing cost and increasing throughput of next-generation sequencing, the number of accessions that can be used for genome wide association study (GWAS) is increasing [1,2,3]. Using such large sequencing data, GWAS is now widely used not only in human but also in plant and animal genetics and breeding, and has identified novel genes related to important agronomic traits [4,5,6]. One example of large next-generation sequencing data is that of "the 3,000 rice genomes project" as used in this study [7,8], data from which are available in the "Rice SNP-Seek Database" [9,10,11]. GWAS results using these data have already been reported [12].
Despite the enhancement of such public data, however, the conventional GWAS method still faces obstacles in the detection of unknown candidate genes. One common example is its difficulty in detecting rare alleles or rare variants. One problem caused by rare variants is that the non-causal markers that have a strong linkage disequilibrium (LD) with one causal rare variant indicate a higher detection power than the true causal rare variant, which may interfere with the detection of the true causal variant. This phenomenon is known as synthetic association , and often happens when the minor allele frequency (MAF) of the non-causal marker is higher than that of the true rare variant [13]. This problem is closely related to the complex gene compositions comprising multiple alleles such as haplotypes because genes related to important agronomic traits often consist of multiple rare alleles, and this is why haplotypes are hard to detect using GWAS [14].
Several methods have been proposed to solve this problem. The sequence kernel association test (SKAT) is one of the methods used to detect rare variants, and has been used mainly in human genomics [15]. The SKAT employs a single nucleotide polymorphism (SNP) set approach, which tests multiple SNPs in each SNP-set at the same time. The SKAT evaluates the significance of the variance explained by a SNP-set of interest as a random effect using a mixed effect model approach [16,17]. The fatal drawback of the original SKAT is that the model does not take the effects of family relatedness into account as a random effect, which results in it generating false positives for GWAS in materials with a strong population structure or family relatedness, such as a world collection of rice germplasm as used in this study. In addition, several methods were also proposed to overcome another SKAT drawback, that a weighting scheme of the SKAT for rare and common variants can lead to loss of power of common variants, but their models also do not include the term for correcting the confounding effects of family relatedness [18,19].
To solve the fatal drawbacks of the original SKAT, several methods whose models include the term of family relatedness as random effects to control false positives have been previously proposed [20,21,22]. From a statistical point of view, these methods usually perform the score test, which is the computationally efficient method because it requires variance component estimation only for the null model. In terms of the detection power, however, the score test is not necessarily the best method for testing the random effects in the mixed effects model. The likelihood-ratio (LR) test is another candidate used to test the variance of a SNPset of interest, and several methods have been proposed that use the LR test for SNP-set GWAS in family samples [23,24]. In particular, Lippert et al. implemented a computationally efficient SNP-set GWAS method using the LR test, and reported that the LR test showed greater power than the score test [24]. Despite being such an efficient method, however, Lippert et al. mainly used the linear kernel for the covariance matrix of each SNP-set, and therefore there were some restrictions on its use.
However, only a few methods for haplotype-based GWAS have so far been proposed. In plant genomics, Yano et al. performed a haplotype-based GWAS by testing the effects of haplotypes while regarding dummy variables of haplotype groups as fixed effects, and found new candidate genes related to heading date for rice [25]. Other approaches have been proposed in animal genomics, which estimated ancestral haplotype effects by regarding them as random effects whose covariance matrix is a k k identity matrix where k is the number of ancestral haplotypes [26,27]. However, these conventional haplotype-based GWAS methods require haplotype information a priori, and it is not so easy to apply these methods at the genome-wide level.
In this study, we extended the multi-kernel mixed effects model more generally to take family relatedness into account, while enabling computational speed-up for some limited cases, and developed a novel SNP-set GWAS approach named RAIN-BOW (Reliable Association INference By Optimizing Weights). We also estimated haplotype blocks from genome-wide marker genotype data, and used them as SNPsets for analysis with RAINBOW to enable haplotype-based GWAS without prior haplotype information.

Methods for RAINBOW
In this subsection, we describe the basic idea of RAINBOW.

RAINBOW model
The RAINBOW model can be written as where y is a n × 1 vector of phenotypic values, Xβ is a n × 1 vector of fixed effects including an intercept, a term to correct the population structure and other covariates, Z c u c and Z ri u ri are n × 1 vectors of random effects, and ϵ is a n × 1 vector of residual errors. Here β is a p × 1 vector of fixed effects, where p is the number of fixed effects. u c and u ri are m c × 1 and m ri × 1 vector of genotypic values respectively, where m c is the number of genotypes for additive polygenetic effects and m ri is the number of genotypes for i th SNP-set of interest. X, Z c and Z ri are n × p, n × m c and n × m ri design matrices that correspond to β, u c and u ri respectively. As the following formula Eq. (2), we assume that the polygenetic effect u c follows the multivariate normal distribution whose variance-covariance matrix is proportional to the additive numerator relationship matrix K c .
where σ 2 c is the additive genetic variance to be estimated by solving the mixed effect model Eq. (1), and here m c × m c matrix K c = A, where A is the known additive genetic relationship matrix estimated from marker genotype data W c [28].
We also assume that the random effects from i th SNP-set of interest u ri follows the multivariate normal distribution whose variance-covariance matrix is proportional to the Gram matrix K ri .
where σ 2 ri is the genetic variance for i th SNP-set to be estimated by solving the mixed effect model Eq. (1) and K ri is the known m ri × m ri Gram matrix estimated from marker genotype data W ri belonging to the i th SNP-set. We offer a linear, an exponential and a gaussian kernel for the Gram matrix K ri , and faster computation can be realized for the linear kernel case (Supplementary Note in Additional file 1) [24].
Finally, the residual term is assumed to identically and independently follow a normal distribution as shown in the following equation.
where I n is a n × n identity matrix and σ 2 e is estimated by solving the mixed model.

Estimation of variance components
The variance components were estimated by maximum-likelihood (ML) and restricted maximum-likelihood (REML). Here we explain how to obtain ML and REML estimates of Eq. (1) for the general K ri . First we estimated the weights (we define w c and w ri ) between the genetic variances (σ 2 c and σ 2 ri ) by the following algorithm.
1 Setting initial parameters for w c and w ri : 2 Computing the following n × n matrix K s : 3 Solving the following single-kernel linear mixed model (LMM) by using EMMA (efficient mixed model association) or GEMMA (genome wide efficient mixed model association) [29,30]. where 4 Computing the full log likelihood (l F ) or the restricted log likelihood (l R ) of Eq. (7) by using estimated parameters;β,σ 2 s andσ 2 e : l R (y;σ s ,δ) = l F (y;β,σ s ,δ) HereĤ iŝ whereV is a phenotypic variance-covariance matrix andδ =σ 2 e /σ 2 s . 5 Optimizing w c and w ri over maximization of the full/restricted log likelihood by using L-BFGS optimization method through repeating step 2-4 [31].
After estimating the weights w c and w ri , we estimated the variance components (σ 2 s and σ 2 e ) of the model Eq.

Likelihood ratio test for GWAS
To test the significance of each SNP-set, we performed the LR test of whether σ 2 ri = 0 or not. As a null hypothesis, the following model, which does not include the term of SNP-set effects was assumed.
In contrast, as an alternative hypothesis model, the multi-kernel linear mixed model (MKLMM) of Eq. (1) was assumed. Therefore, we computed the following deviance after the estimation of variance components for each SNP-set.
wherel R,model is the maximum of the restricted log likelihood for the model of Eq.
(1) andl R,null is the maximum of the restricted log likelihood for the model of Eq. (12). Finally, we tested the significance of σ 2 ri and calculate the p-value by assuming that the deviance in Eq. (13) followed the mixture of two chi-square distributions with different degrees of freedom.
where π 0 is the mixture parameter and here we used π 0 = 1/2.

Materials and simulations
Genotype data In this study, 414 accessions of Oryza sativa subsp. indica were collected from "the 3,000 rice genomes project" (Table S1 in Additional file 2) [7]. We used a marker genotype consisting of core SNPs defined by the Rice SNP-Seek Database as "404k CoreSNP Dataset". Imputations were imputed using Beagle version 5.0 [32,33]. We analyzed only bi-allelic sites over all accessions with a MAF ≥ 0.025 by using VCFtools version 0.1.15 [34]. In the following analysis, genotypes are represented as -1 (homozygous of the reference allele), 1 (homozygous of the alternative allele) or 0 (heterozygous of the reference and alternative alleles). As a result of this data processing, marker genotypes with 112,630 SNPs were used for the following simulation study.

Estimation of haplotype block
To perform haplotype-based GWAS by regarding each haplotype block as a SNPset, haplotype blocks were estimated from marker genotype data by using PLINK 1.9 [35,36,37]. As a result of estimation, we obtained 15,275 haplotype blocks consisting of 78,237 SNPs.

Simulation of phenotype data
We considered two scenarios to validate our novel haplotype-based GWAS approach. In both models, phenotypic values were simulated as follows.
where y is the vector of simulated phenotypic values of 414 accessions, X 1 , X 2 and X 3 correspond to three quantitative trait nucleotides (QTNs) scored as -1, 0 or 1 (hereinafter, referred to as "QTN1", "QTN2" and "QTN3" respectively), β 1 , β 2 and β 3 are scalars representing the effects of the three QTNs, u is the vector of polygenetic effects, and e is the vector of the residuals.
Here, QTN1 and QTN2 were randomly selected from all genome-wide SNPs to satisfy that they belonged to the same haplotype block that harbored more than 4 SNPs. QTN3 was randomly selected from all the SNPs. We assumed that the effects of QTN1 and QTN2 had a variance 4 times greater than that of the effects of QTN3 to mainly check the detection power for the haplotype block.
The polygenetic effects u in Eq. (16) were sampled from the multivariate normal distribution whose variance-covariance matrix was the additive genetic relationship matrix A, and were normalized so that their variance was equal to that of three QTN effects.
G = Aσ 2 A is the genetic covariance matrix, and the additive genetic variance σ 2 A is automatically determined from the relationship with the heritability. In this study, the additive numerator relationship matrix was estimated based on the marker genotype of 112,630 SNPs using the "A.mat" function of the R package "rrBLUP" version 4.6 [38].
The residual e in Eq. (17) was sampled identically and independently from a normal distribution, and then was normalized so that the narrow-sense heritability was equal to 0.6.
where I is an identity matrix, and σ 2 e is the residual variance determined so that the heritability is equal to 0.6.
The difference between two scenarios is based on the directions of the two QTN effects β 1 and β 2 . Scenario 1 assumed that the directions of two effects were identical. That is, where ρ 12 is the correlation between X 1 and X 2 . We call this model as "coupling". Conversely, scenario 2 assumed that the directions of the two effects were opposite. That is, We call this scenario 2 as "repulsion".

Comparison of four methods
To validate our novel approach, we compared the following four methods: Single-SNP GWAS [39], haplotype-based GWAS introduced by Yano et al. (hereinafter, referred to as "HGF") [25], the SKAT [15] and our novel approach, RAINBOW. For all methods, to account for the population structure, the two eigen vectors (which correspond to the top two eigen values) of the additive genetic relationship matrix were included in the model as fixed effects.
Single-SNP GWAS The single-SNP GWAS model was similar to Eq. (12), but this method regarded each SNP as a fixed effect and tested the significance of each SNP effect one by one [39]. In this study, we used the method called "P3D", which first estimated σ 2 c and σ 2 e by solving the mixed model without SNP effects (Eq. (12)) by REML, and then tested each marker using the information of the estimated varianceσ 2 c andσ 2 e [29,40,41]. We used the "GWAS" function of the R package "rrBLUP" version 4.6 [38].
HGF method HGF tests the significance of each haplotype block effect. In this method, all accessions were divided into k ri groups based on the haplotypes. The haplotype groups were then represented by dummy variables. The significance of the haplotype effect was tested by regarding a matrix of dummy variables as fixed effects [25]. In this study, the k-medoids method (the "pam" function of the R package "cluster" version 2.0.6) [42] or UPGMA (unweighted pair group method with arithmetic mean, "hclust" and "cutree" function of the R package "stats") [43] were employed to group haplotypes. We set the number of groups k ri as 2,3 or 4. Therefore, we applied 6 different HGF models in total. To apply HGF models, we modified the R package "rrBLUP" version 4.6 to enable the use of a matrix of dummy variables as fixed effects [38].
SKAT The SKAT is a SNP-set GWAS method, and is widely used in human genetics [15]. This method was invented to detect rare alleles, but the SKAT model does not include the random effects term that accounts for family relatedness, which may cause false positives for GWAS in materials with a strong population structure or family relatedness. In this study, we used the "SKAT" function of the R package "SKAT" version 1.3.2.1 and performed SNP-set GWAS by regarding each haplotype block as each SNP-set.

RAINBOW
The methods for RAINBOW were described above. In this study, we performed haplotype-based GWAS for RAINBOW by regarding each haplotype block as each SNP-set. We used the linear kernel of each haplotype block for the Gram matrix K ri .

Evaluation of the simulation results
The value of − log 10 (p) of each marker or haplotype block was calculated by the four GWAS methods 100 times for the two simulated scenarios, coupling and repulsion. In this study, the following summary statistics were used to evaluate the simulation results.
− log 10 (p) and adjusted −log 10 (p) The first summary statistic is − log 10 (p) of each causal SNP or haplotype block itself. For haplotype-based GWAS methods, HGF, SKAT and RAINBOW, the significance of β 1 and β 2 was represented by − log 10 (p) of the causal haplotype block to which X 1 and X 2 belong. In the single-SNP GWAS method, the − log 10 (p) of β 1 and β 2 were calculated separately, even though these SNPs were in the same haplotype. To compare the single-SNP GWAS method with the haplotype-based GWAS methods, the − log 10 (p) values were averaged over β 1 and β 2 .
As some of these methods showed the results of inflated − log 10 (p), we defined the following summary statistic to evaluate the degree of inflation.
where p f alse,l is the l th of p-values for false positives arranged in decreasing order. In this study, L was set as 10. Then we adjusted − log 10 (p) of the causal by using the inflator (Eq. (20)) as follows.
adjusted − log 10 (p) = − log 10 (p) − inflator Here, we calculated each summary statistic in two ways. The first method is to calculate each summary statistic by directly using − log 10 (p) of each causal SNP / haplotype block. The other method is to calculate the summary statistics by regarding multiple SNPs or haplotype blocks within the extent of the LD as one set. In this study, we defined SNPs or haplotype blocks that satisfy the condition that they are within 300 kb from the focused SNP or haplotype block and the condition that their square of the correlation coefficients with the focused SNP or haplotype block are 0.35 or more as one set considering the LD. The highest value of − log 10 (p) in the LD region was assumed to represent the values of the SNPs or haplotype blocks within the extent of the LD.

Recall, Precision and F-measure
We calculated the recall, precision and F-measure means as other summary statistics to evaluate the GWAS results. These summary statistics were calculated from the numbers of SNPs or haplotype blocks that were true positives, false positives, false negatives and true negatives. Here, we regarded a SNP or haplotype block as "positive" when that SNP or haplotype block exceeded the threshold. In this study, the value of − log 10 (p) so that the FDR (false discovery rate) was 0.01 was set as the threshold by using the Benjamini-Hochberg method [44,45]. In addition, these three summary statistics, recall, precision and F-measure, were calculated by assuming that the highest value of − log 10 (p) in the LD region represented the values of the SNPs or haplotype blocks within the extent of the LD. Therefore, recall represents the proportion of causals detected by GWAS. In contrast, precision represents the ratio of the detected SNPs or haplotype blocks that were causals. Finally, F-measure was calculated as the harmonic mean of the recall and the precision, which evaluates the GWAS results comprehensively. The greater these three summary statistics, the better the results of GWAS are. Here we simply took the average of each summary statistic from all the 100 simulation results.

AUC for regions around causals
We calculated the mean of the AUC (area under the curve) for regions around the causals as a summary statistic. AUC refers to the area under the ROC (receiver operating characteristic) curve obtained by plotting the false positive rate on the horizontal axis and the true positive rate on the vertical axis when the threshold is varied. In this study, the AUC was calculated for the SNPs or haplotype blocks near the causal SNP / haplotype block (QTN1 and QTN2). In other words, the non-causal markers that had a strong LD with the causal SNP / haplotype block were regarded as false positives under this summary statistic. Therefore, this summary statistic indicates the extent to which the causal itself can be detected by GWAS without relying on the LD. Here, when taking the average of the AUCs obtained from the simulation results, two methods were used, either using all the 100 results or only using the results whose QTN1 and QTN2 were detected . Here, QTN1 and QTN2 were regarded as "detected" if adjusted − log 10 (p) ≥ 1.5 for each method.

Results
The detection power of four methods The detection power of the four methods was evaluated by the value of − log 10 (p) and adjusted −log 10 (p) of QTN1 and QTN2 for the two models, coupling and repulsion (Fig. 1). RAINBOW outperformed the other methods when the significance was evaluated by the causal itself (Fig. 1a,c,e,g). However, when the significance was evaluated by the highest values of SNPs or haplotypes within the extent of the LD, other methods, e.g., HGF (k = 2, k-medoids method), showed a greater detection power than RAINBOW (Fig. 1b). When the detection power was evaluated by taking the extent of inflation into account, RAINBOW showed as great a power as HGF (k = 2, 3) even if the significance was evaluated by the unit of the LD block (Fig. 1d). Moreover, although the detection power of all the GWAS methods for the repulsion scenario was less than that for the coupling scenario, the tendency for RAINBOW to outperform the other methods was clearer for the repulsion scenario than the coupling scenario (Fig. 1). Finally, as compared with the other haplotype-based GWAS methods, RAINBOW showed smaller variation among iterations, indicating that the causal variants can be stably detected (Fig.  1).
The detection power for QTN3 was also evaluated. The single-SNP GWAS method showed a greater power than RAINBOW when evaluated by − log 10 (p) (Fig. S1a,b,e,f in Additional file 3). However, if the detection power was evaluated by adjusted − log 10 (p), RAINBOW showed as great a power as single-SNP GWAS (Fig. S1c,d,g,h). Contrary to the results for QTN1 and QTN2, the detection power of all the GWAS methods for the repulsion scenario was greater than for the coupling scenario (Fig. S1).

Recall, Precision and F-measure
The characteristics of each GWAS method were evaluated by the recall, precision and F-measure means (Fig. 2). For the mean of recall, the HGF methods and SKAT showed higher values than RAINBOW and single-SNP GWAS for both scenarios (Fig. 2). However, the haplotype-based GWAS methods other than RAINBOW showed low precision. That is, these methods may cause too many false positives. In contrast, RAINBOW and single-SNP GWAS showed higher precision than the remainders, and RAINBOW showed the highest precision among all the scenarios. From the results for the three summary statistics, RAINBOW also showed the highest value for F-measure among the methods. In particular, for the repulsion scenario, the recall of RAINBOW was also higher than that of single-SNP GWAS, which resulted in the large difference of F-measure between these two methods (Fig.  2b). A similar tendency was also confirmed when changing the criterion of how to determine the threshold for the Bonferroni s correction for the significance level α = 0.01 (Fig. S2 in Additional file 4).
To compare the three summary statistics of the two scenarios in more detail, these values for each QTN were also calculated (Fig. S3 in Additional file 5). For both scenarios, RAINBOW showed the highest recall for QTN1 and QTN2 among the methods (Fig. S3ab). In particular, RAINBOW outperformed the other methods in all summary statistics for QTN1 and QTN2 for the repulsion scenario. However, it showed lower recall for QTN3 than the other methods (Fig. S3cd). In particular, the recall of RAINBOW for QTN3 was 0 for the coupling scenario (Fig. S3c).In addition, the three summary statistics of QTN3 for the repulsion scenario were greater than those for the coupling scenario in almost all the methods (Fig. S3cd). Regarding these results, a similar trend was confirmed even when changing the criterion of how to determine the threshold for the Bonferroni s correction for the significance level α = 0.01 (Fig. S4 in Additional file 6).

AUC for regions around causals
To evaluate how the causal itself can be detected by GWAS without relying on the LD, the AUC means for regions around the causals (QTN1 and QTN2) were compared (Fig. 3). The mean of AUC was almost the same when using all simulation results or using only the cases in which QTN1 and QTN2 were detected, although the value of the latter was slightly larger than that of the former in some methods. The results show that RAINBOW outperformed the other methods in both models (Fig. 3). Especially, the AUC mean of the single-SNP GWAS method in the repulsion scenario was much smaller than that in the coupling scenario, while RAINBOW was able to maintain a high AUC even in the repulsion scenario (Fig.  3b).

Examples in the repulsion scenario
Of the 100 simulations for the repulsion scenario, there were 7 cases in which QTN1 and QTN2 were detected only by RAINBOW. These cases were selected to satisfy three conditions that adjusted −log 10 (p R ) ≥ 1.5, adjusted −log 10 (p O ) ≤ 1.2 and the recall for QTN1 and QTN2 equals to 1. Here, p R represents the p-value of RAIN-BOW and p O represents the p-value of all the other methods. Although the same analysis was done for the other methods, no method satisfied the three conditions described above. One example of these cases (iteration 48) was shown by comparing the four GWAS methods; RAINBOW, single-SNP GWAS, HGF (the number of groups is 2, the grouping method is UPGMA) and SKAT (Fig. 4). The Manhattan plot shows that RAINBOW succeeded in detecting the causal haplotype block (of QTN1 and QTN2) that was not detected by the other methods. Although both QTN1 and QTN2 were also detected by the single-SNP method in one case (iteration 85), the same trend as the results for iteration 48 was seen for the remaining five results (Fig. S5 in Additional file 7).

Discussion
As shown in Results section, when − log 10 (p) was evaluated by the LD block unit for the coupling scenario, RAINBOW did not necessarily outperform the other methods. However, if we considered the inflation level of each result and evaluated the results with the adjusted − log 10 (p), RAINBOW showed as great a detection power as other methods (Fig. 1d), which means RAINBOW succeeded in controlling false positives compared with other haplotype-based GWAS methods. This can also be seen from the fact that the precision of RAINBOW was much higher than the other GWAS methods including single-SNP GWAS (Fig. 2).
Moreover, − log 10 (p) of RAINBOW was the highest when evaluated by that of the causal SNP/haplotype block itself, which implies that RAINBOW can detect causal haplotype blocks themselves without relying on the LD beyond the scope of the haplotype block. This can also be confirmed by the results that showed that the AUC for the regions around the causal was larger in RAINBOW than in any other methods (Fig. 3).
In addition, for the repulsion scenario, RAINBOW succeeded in detecting causal haplotype blocks that were not able to be detected by any other methods including single-SNP GWAS (Fig. 4). This result affected other results that RAINBOW outperformed the other methods especially when evaluated by the detection power, recall, precision and F-measure in the repulsion scenario. This fact suggests that RAINBOW is good for detecting the causal haplotype block with multiple causal variants. For example, RAINBOW can be applied to the detection of genes that have more than one variant. Therefore, for future analysis, RAINBOW can be used for gene-set GWAS (which regards one gene as one SNP-set) by using gene annotation information.
The only drawback of RAINBOW is that the detection power for the causal with small effects (QTN3) was not so high (Fig. S3cd). The drawback may be related to the fact that RAINBOW succeeded in detecting QTN1 and QTN2 well. In other words, RAINBOW cannot account for the loci of large effects well when testing other loci, and the loci of relatively small effects may be concealed by these loci of large effects. This drawback, however, can be easily resolved by using methods that condition the loci of large effects, such as composite interval mapping for QTL analysis [46, 47] or a multi-locus mixed model for GWAS [48]. For future analysis, we will implement this function to condition the loci of large effects when testing other loci of small effects.

Conclusion
We proposed a new method named "RAINBOW", which is a haplotype-based GWAS method using a novel SNP-set approach, and showed that it outperformed the conventional single-SNP GWAS method, the conventional SNP-set method, and the conventional haplotype-based method. Our method is especially superior in controlling false positives, detecting the causal itself, and detecting causal variants under complex conditions, e.g., allelic effects that are in a repulsion phase (i.e., the opposite direction of effects of closely linked causal variants). By using the SNP-set approach as our method, we expect that detecting not only rare alleles but also genes with complex mechanisms, such as genes with multiple causal variants, can be realized.  H2k-H4p : HGF methods. The numbers in the method names correspond to the numbers of the groups they assume. The last letters of the methods are "k" or "p". "k" corresponds to the k-medoids method and "p" corresponds to UPGMA method for the grouping method. SK : SKAT.  Additional file 2 - Table S1 Supplementary table for accession information used in this study. (CSV 19kb) Additional file 3 - Fig. S1 Supplementary figure for − log 10 (p) and adjusted − log 10 (p) of QTN3 for each method. How to view this figure (including legends and abbreviations) is the same as that of Fig. 1. (PNG 1275kb)