Associations between the orexin (hypocretin) receptor 2 gene polymorphism Val308Ile and nicotine dependence in genome-wide and subsequent association studies

Many genetic and environmental factors are involved in the etiology of nicotine dependence. Although several candidate gene variations have been reported by candidate gene studies or genome-wide association studies (GWASs) to be associated with smoking behavior and the vulnerability to nicotine dependence, such studies have been mostly conducted with subjects with European ancestry. However, genetic factors have rarely been investigated for the Japanese population as GWASs. To elucidate genetic factors involved in nicotine dependence in Japanese, the present study comprehensively explored genetic contributors to nicotine dependence by using whole-genome genotyping arrays with more than 200,000 markers in Japanese subjects. The subjects for the GWAS and replication study were 148 and 374 patients, respectively. A two-stage GWAS was conducted using the Fagerström Test for Nicotine Dependence (FTND), Tobacco Dependence Screener (TDS), and number of cigarettes smoked per day (CPD) as indices of nicotine dependence. For the additional association analyses, patients who underwent major abdominal surgery, patients with methamphetamine dependence/psychosis, and healthy subjects with schizotypal personality trait data were recruited. Autopsy specimens with various diseases were also evaluated. After the study of associations between more than 200,000 marker single-nucleotide polymorphisms (SNPs) and the FTND, TDS, and CPD, the nonsynonymous rs2653349 SNP (located on the gene that encodes orexin [hypocretin] receptor 2) was selected as the most notable SNP associated with FTND, with a p value of 0.0005921 in the two-stage GWAS. This possible association was replicated for the remaining 374 samples. This SNP was also associated with postoperative pain, the initiation of methamphetamine use, schizotypal personality traits, and susceptibility to goiter. Although the p value did not reach a conventional genome-wide level of significance in our two-stage GWAS, we obtained significant results in the subsequent analyses that suggest that the rs2653349 SNP (Val308Ile) could be a genetic factor that is related to nicotine dependence and possibly pain, schizotypal personality traits, and goiter in the Japanese population.


(Continued from previous page)
Conclusions: Although the p value did not reach a conventional genome-wide level of significance in our two-stage GWAS, we obtained significant results in the subsequent analyses that suggest that the rs2653349 SNP (Val308Ile) could be a genetic factor that is related to nicotine dependence and possibly pain, schizotypal personality traits, and goiter in the Japanese population.

Background
Despite the worldwide spread of cigarette smoking, many tobacco users are reportedly unaware of the harmful chemicals in tobacco products and tobacco smoke and the wide spectrum of specific illnesses that are caused by tobacco use [1]. They also frequently do not know that smoking can cause cancer (other than lung cancer), heart disease, stroke, and many other diseases [2]. Direct tobacco smoking is currently responsible for the death of approximately 5 million people worldwide each year, with many of these deaths occurring prematurely [3,4]. In Japan, the proportion of deaths that are attributable to tobacco is estimated to be 17 % for ages 30 and over [3]. Therefore, decreasing the frequency of cigarette smoking is imperative to prevent diseases caused by tobacco use. However, many smokers are not easily able to quit smoking because of the addictive property of nicotine, a major component of tobacco.
The heritability of nicotine dependence is estimated to range from 0.40 to 0.75 [5][6][7], and many genetic and environmental factors have been implicated in the etiology of the disorder. To date, several candidate gene variations have been reported to be associated with smoking behavior and the vulnerability to nicotine dependence. These were found mostly by investigating single-nucleotide polymorphisms (SNPs) on each gene related to specific phenotypes or genome-wide association studies (GWASs), which evaluate genetic markers in the entire genome. By taking the former candidate gene approach, several SNPs have been found to be associated with smoking behavior or nicotine dependence [8,9]. The latter genome-wide approach has also successfully identified numerous candidate SNPs that may contribute to the etiology of nicotine dependence or smoking-related behavior [10][11][12][13][14][15][16][17][18][19][20][21][22][23]. The strongest signal of association appears to be SNPs at 15q25, represented by a synonymous rs1051730 SNP or nonsynonymous rs16969968 SNP in the region of CHRNA5-CHRNA3-CHRNB4, which encodes neuronal nicotinic acetylcholine receptor subunits [16][17][18][19][20][21]. However, these studies were conducted with subjects with mostly European or African ancestry [10][11][12][13][14][15][16][17][18][19][20][21][22][23]. Although a recent study that used Korean samples found loci that influence smoking initiation and nicotine dependence other than those identified in studies with samples of European origin [24], the applicability of the results to Japanese or other Asian populations is unknown.
To elucidate the genetic factors that are involved in nicotine dependence in the Japanese population, the present study comprehensively explored genetic contributors to nicotine dependence using whole-genome genotyping arrays with more than 200,000 markers in Japanese subjects.

Results
Genome-wide association study identifies a locus associated with nicotine dependence in a Japanese population In the first exploratory stage analysis of the two-stage GWAS, 11,968, 10,865, and 11,244 SNPs showed p values less than the threshold (p = 0.05) and were selected as candidates for the second-stage analysis for the FTND, TDS, and CPD. In the second stage analysis and LD-based SNP pruning, 273, 250, and 248 SNPs showed p values less than the threshold (p = 0.05) for the FTND, TDS, and CPD, respectively. These SNPs were considered potent candidates, and SNPs for the replication study were selected from among these SNPs. The top 50 candidate SNPs with the lowest p values for the FTND, TDS, and CPD are presented in Tables 1, 2, and 3, respectively. The top 51-100 candidate SNPs for the FTND, TDS, and CPD are presented in Additional file 1: Table S1,  Additional file 2: Table S2, and Additional file 3: Table  S3, respectively. Several candidate SNPs after the second-stage analysis were found to be located within or around the gene regions that were already identified as candidate loci associated with nicotine dependence or smoking behavior (Additional file 4: Table S4). Additionally, several SNPs or related genes that were selected after the second-stage analysis commonly appeared in the lists of candidates that resulted from the analysis for different phenotypic measures of nicotine dependence (Additional file 4: Table S4).
Although the p values for the top candidate SNPs were not sufficiently low when considering the conventional genome-wide level of significance, the list of the top 50 candidate SNPs (Tables 1, 2, 3) was found to include a few nonsynonymous SNPs that involve amino-acid changes and are expected to have a greater impact on the functions of gene products than other SNPs. Therefore, the best of these candidates for the FTND and CPD, rs2653349 (Rank 50) and rs726016 (Rank 1), respectively, were considered for the further confirmatory replication study (Tables 1  and 3). In the replication study, of the two SNPs, only the       rs2653349 SNP showed a significant association with FTND score (χ 2 = 5.094, q = 0.048; Table 4). This SNP was within the region of the HCRTR2 gene, which encodes orexin (hypocretin) receptor 2. Imputation analysis was further conducted for SNPs around the rs2653349 SNP for the GWAS samples. As a result, several SNPs whose genotypes were imputed exhibited strong associations similarly to the rs2653349 SNP. These SNPs (rs2653342, rs2653343, rs2653344, and rs2653347) were located at a slightly upstream region of the rs2653349 SNP (Fig. 1). However, our database search with SNPinfo Web Server (see Additional file 5: Supporting Information) revealed that all of these SNPs were in absolute linkage disequilibrium with the rs2653349 SNP (D′ = r 2 = 1). Given that genotype call ratios by whole-genome genotyping or imputation for the rs2653342, rs2653343, rs2653344, rs2653347, and rs2653349 SNPs were 99.0 %, 97.7 %, 100.0 %, 98.0 %, 100.0 %, and 100.0 %, respectively, the apparently stronger associations that were observed for the rs2653342, rs2653343, and rs2653347 SNPs than for the rs2653349 SNP were merely attributable to the incomplete genotype data for these SNPs. Therefore, the contribution of the rs2653342, rs2653343, rs2653344, and rs2653347 SNPs to nicotine dependence was regarded as statistically equal to that of the rs2653349 SNP. Considering that rs2653349 is a nonsynonymous SNP and the others are located in the intronic region of the HCRTR2 gene, one could assume that this SNP is most likely the causative SNP.

Association between rs2653349 SNP and postoperative pain in patients who underwent major open abdominal surgery
To examine whether the rs2653349 SNP that was identified in our GWAS affects individual differences in the sensitivity to some drugs and pain, we investigated the association between the rs2653349 SNP and postoperative opioid requirements and pain in another cohort who underwent major open abdominal surgery (Additional file 6: Table S5). The subjects included 112 patients who underwent major open abdominal surgery under combined general and epidural anesthesia (Additional file 6: Table S5). Significant associations were not found for the frequency of analgesic administration (U = 495, p = 0.1373; Additional file 6: Table S5) or the total dose administered (U = 480, p = 0.1091; Additional file 6: Table S5). However, a significant difference in pain intensity at rest during the first 24-h postoperative period, reflected by scores on a numerical rating scale (NRS; i.e., a 5-point verbal rating scale, in which 0 = no pain, 1 = mild pain, 2 = moderate pain, 3 = severe pain, and 4 = extremely severe pain), was found between subjects with the A/G genotype and subjects with the G/G genotype in the rs2653349 SNP.
Furthermore, carriers of the G/G genotype complained more about pain than carriers of the A/G genotype (U = 212, p = 0.0017; Additional file 6: Table S5), suggesting that G/G carriers are more sensitive to postoperative pain than A/G carriers.

Association between rs2653349 SNP and the initiation of drug use in patients with METH dependence
We also investigated the association between the rs2653349 SNP and two measures of the severity of METH dependence: age at first use of the drug (years) and number of drugs used, including METH. Significant associations were not found for the number of drugs used (U = 1894, p = 0.2601; Additional file 7: Table S6). However, a significant difference was found in the first use of METH between the subjects with the A/G genotype and subjects with the G/G genotype in the rs2653349 SNP. Carriers of the A/G genotype first used METH at a significantly younger age than carriers of the G/G genotype (U = 1097, p = 0.0056; Additional file 7: Table S6), suggesting that A/G carriers might harbor a high risk of initiating METH use and becoming regular users at younger ages than G/G carriers.

Association between rs2653349 SNP and schizotypal personality trait in healthy subjects
To further examine the impact of the rs2653349 SNP on individual differences in the vulnerability to develop other psychiatric diseases in healthy subjects, we examined the possible effect of the rs2653349 genotype in the HCRTR2 gene on total Schizotypal Personality Questionnaire (SPQ) scores [25][26][27] using a one-way analysis of covariance (ANCOVA). To control confounding factors, age, sex, and years of education were used as covariates [28]. This analysis indicated a significant effect of genotype (F 1,306 = 5.80, p = 0.017; Additional file 8: Table S7). We then investigated the effect of genotype on the three SPQ factors: cognitive/perceptual, interpersonal, and disorganization [29]. Significant effects of genotype were found on the cognitive/perceptual factor (F 1,306 = 4.37, p = 0.037) and disorganization factor (F 1,306 = 4.19, p = 0.042; Additional file 9: Table S8), but no effect of genotype was found on the interpersonal factor (p > 0.09). Subjects with the risk A/G genotype of the rs2653349 SNP had higher schizotypal trait scores than subjects with the G/G genotype (Additional file 9: Table S8).

Association between rs2653349 SNP and goiter in samples of autopsy specimens
We also investigated whether the rs2653349 SNP affects the susceptibility to diseases other than psychiatric diseases in another cohort by evaluating the association between this SNP and a total of 104 diseases/phenotypes using autopsy specimens (Additional file 10: Table S9). Nominally significant associations were found between goiter, aortic aneurysm, myeloma, and amyotrophic lateral sclerosis and the rs2653349 SNP. Carriers of the A/G genotype were more frequently in the disease group than in the control group (Additional file 11: Table S10). The association between this SNP and goiter was much stronger than the associations with the other three diseases (Additional file 11: Table S10). These results suggest that A/G carriers may harbor a higher risk of developing these diseases than G/G carriers.

Discussion
Previous GWASs on nicotine dependence that used samples of mainly European origin identified numerous candidate SNPs [10][11][12][13][14][15][16][17][18][19][20][21][22][23]. Several of the candidate SNPs in the present study after the second-stage analysis were found to be located within or around the gene regions that have already been identified as candidate loci associated with nicotine dependence or smoking behavior (Additional file 4: Table S4). These candidate genetic factors may be common to some populations other than Japanese. Additionally, several SNPs or related genes that were selected after the second-stage analysis commonly appeared in the lists of candidates that resulted from the analysis of different phenotypic measures of nicotine dependence (Additional file 4: Table S4). These SNPs/genes might be candidates that possibly contribute to several aspects of dependence simultaneously. Among these candidate SNPs/genes were CSMD1 and IMMP2L, which were selected as candidate genes to test associations with the FTND, TDS, and CPD (Additional file 4: Table S4). CSMD1, a tumor suppressor gene [30], is a known candidate for nicotine dependence [11]. The SNPs in the region of CHRNA5-CHRNA3-CHRNB4 at 15q25, which have been reported to show strong associations with CPD in studies with subjects with European ancestry [16][17][18], were not included in our list of the top 100 candidates (Tables 1, 2, 3, Additional file 1:  Table S1, Additional file 2: Table S2, Additional file 3: Table  S3). Although definitive conclusions cannot be drawn until further studies are conducted with larger sample sizes, the SNPs in this region might not greatly affect the severity of nicotine dependence in the Japanese population. The best candidate SNP, rs2653349, is located within the HCRTR2 gene on chromosome 6 ( Table 4). The HCRTR2 gene encodes orexin receptor 2. The orexin receptor has two subtypes (orexin receptor 1 [HCRTR1] and orexin receptor 2 [HCRTR2]) and two known neuropeptide ligands (orexin A, which binds both HCRTR1 and HCRTR2, and orexin B, which binds only HCRTR2). Orexin peptides, whose mRNA initially identified them as a hypothalamus-specific peptides with neuroexcitatory activity [31], and their receptors are distributed in various tissues, including the hypothalamus and pituitary [32][33][34]. They are well known to regulate feeding behavior and states of sleep/wakefulness [32,35]. Orexin knockout mice were reported to exhibit an attenuation of morphine dependence [36]. The hypocretin receptor 1 antagonist SB334867 and hypocretin receptor 1/2 antagonist almorexant both dose-dependently reduced nicotine self- Fig. 1 Fine mapping of the candidate region after the imputation-based association analysis. The circle plot represents the results of the association analysis in a trend model using the GWAS samples. The range for the X-axis represents the genomic position from 55,100,000 to 55,400,000 on chromosome 6 in the HapMap database (http://hapmap.ncbi.nlm.nih.gov/index.html.ja; accessed July 15, 2015). The green object above the plot illustrates the schematic structure of the human HCRTR gene (Gene Accession no. NM_001526.3), based on the NCBI database (http:// www.ncbi.nlm.nih.gov/gene/; accessed July 15, 2015). The gene is illustrated between the 5'-and 3'-flanking regions, corresponding to left and right sides, respectively. The green vertical lines and horizontal line that connects them represent the exon and intron, respectively administration [37], suggesting the involvement of the orexin system in the rewarding effects.
Our further study that examined whether the rs2653349 SNP affects individual differences in the susceptibility to other diseases or phenotypic traits suggested that A/G genotype carriers of this SNP are less sensitive to postoperative pain (Additional file 6: Table S5), have a high risk of initiating METH use regularly at younger ages (Additional file 7: Table S6), have higher schizotypal trait scores (Additional file 9: Table S8), and harbor a higher risk of developing goiter, aortic aneurysm, and myeloma than G/G carriers (Additional file 11: Table S10). For this SNP [38], several previous reports demonstrated an association with cluster headache [39][40][41]. Intriguingly, carriers of the G/G genotype have been reported to have a higher disease risk than the other genotypes. Given the modulation of antinociceptive effects by the orexin system in various animal models of pain [42], one speculation is that G/G carriers are generally more sensitive to pain because of dampened activity of the orexin system. The severer levels of nicotine dependence in A/G genotype carriers of this SNP may be attributable to increased rewarding effects related to the orexin system, possibly leading to a higher risk of initiating METH use regularly at younger ages. The higher schizotypal trait scores in A/G carriers than in G/G carriers in the present study may be explained by orexin-induced activation of the mesolimbic dopamine system, which may underlie some of its actions in schizophrenia [43]. Previous reports indicated that nicotine dependence was more frequent in smokers with schizophrenia compared with the general population [44]. Chronic schizophrenia patients also have lower visual analogue scale postoperative pain scores than control patients [45], which appears to be consistent with the trend observed in the present study. Less understood is the involvement of the orexin system in goiter, aortic aneurysm, myeloma, and amyotrophic lateral sclerosis. However, the increased frequency of cigarette smoking associated with severer levels of nicotine dependence in A/G carriers of this SNP may cause an elevation of thiocyanate synthesis [46], which would then enhance the risk of goiter [47][48][49].
Our database search estimated that the possible impact of the amino-acid substitution from valine to isoleucine by the base substitution from G to A in the rs2653349 SNP on the structure and function of human protein is predicted to be "benign." Indeed, few studies have investigated the functional alterations caused by the rs2653349 SNP. A previous report only showed that the SNP, located in the exon 5 region, had no effect on orexin binding sites, but the corresponding residue Val308Ile might be at the dimer interface and influence the dimerization process of the receptor [50]. Further functional studies will be needed to provide a clearer explanation for the associations observed in human studies between the SNP and phenotypic traits.
With regard to our initial GWAS that was conducted as a consecutive two-stage analysis, the lowest exploratory p value for the entire sample was >~1 × 10 −6 ( Tables 1, 2, 3), which would have been deemed genome-wide nonsignificant if only a single-stage analysis was performed to calculate conventional Bonferroni-or FDR-corrected p values for the total samples to determine statistical significance. The situation would not have been improved even if only non-synonymous SNPs had been considered in the analysis as in previous studies [51,52] because a total of 1,413 non-synonymous SNPs were included among the 225,591 SNPs that were used in our two-stage GWAS, and the conventionally corrected p value of the rs2653349 SNP would have exceeded 0.05. However, "significant" results that are obtained as conventionally corrected p values may not always represent true associations, meaning that the results may not be necessarily replicated in other studies, and vice versa. For example, data from the National Human Genome Research Institute GWAS catalog (as of January 31, 2009) showed 1,321 entries of discovered associations with p values < 10 −5 , but only 550 of these entries had p values < 5 × 10 −8 [53], which is known as a conventionally corrected conservative threshold for declaring a significant association in a GWAS [54,55]. In both cases, truly potent candidate SNPs may be included in the outcome of the studies. Furthermore, inconsistencies between genome-wide significance and truly meaningful "significance" have been exemplified by numerous studies. For example, Bierut et al. [19] identified SNPs in the β3 nicotinic cholinergic receptor gene as candidates that contribute to the development of nicotine dependence based on p values that were genome-wide nonsignificant. Thorgeirsson et al. [18] subsequently performed a genomewide association meta-analysis using independent samples and found that SNPs in this gene region were genomewide significantly associated with CPD (p < 5 × 10 −8 ). Altogether, these reports suggest that conventionally corrected p values for combined samples may not be the sole criteria for identifying true associations between SNPs and specific phenotypes. Nevertheless, we did not obtain results with markedly low p values in the present GWAS. Future studies are required to draw more definitive conclusions about associations between the rs2653349 SNP and nicotine dependence and the other traits that were examined in the present study.

Conclusions
Although the p value did not reach a conventional genome-wide level of significance in our two-stage GWAS, we obtained results in the subsequent analyses that suggest that carriers of the A/G genotype of the nonsynonymous rs2653349 SNP are more likely to exhibit more severe levels of nicotine dependence than carriers of the G/G genotype in the Japanese population. This result suggests that this SNP may be a genetic factor that is associated with the exacerbation of nicotine dependence and may serve as a marker that predicts the severity of nicotine dependence in the Japanese population.

Subjects
The participants in the present study were a total of 999 patients who were ambulatory, able to communicate orally, and 60 years of age or older. Of these 999 patients, 148 were included in the two-stage GWAS to explore the association between polymorphisms in the entire human genome and the severity of nicotine dependence. Numerous participants in this study had various smoking habits and completed a questionnaire that contained questions about lifestyle, including alcohol consumption, smoking, diet, and cancer history [56,57]. The detailed demographic and clinical characteristics of the subjects are provided in Additional file 12: Table S11.
The subsequent study that investigated the associations between the candidate SNPs and sensitivity to pain and analgesia included 112 patient subjects who underwent major open abdominal surgery (age, 28-80 years; 60 males and 52 females), mostly gastrectomy for gastric cancer and colectomy for colorectal cancer [58,59]. Additionally, 203 patients with methamphetamine (METH) dependence (age, 18-69 years; 165 males and 38 females) who were recruited for a collaborative study of the Japanese Genetics Initiative for Drug Abuse (JGIDA) [60,61] were also enrolled in the present study. Healthy subjects (age, 18-66 years; 129 males and 182 females) were also recruited for the association study between candidate SNPs and schizotypal personality traits. To investigate the impact of the candidate SNPs on the susceptibility to various diseases, samples of 2,305 consecutive Japanese autopsy cases (1,266 male and 1,039 female; mean age at death, 80.6 ± 8.8 years) who were registered in the Internet database of Japanese SNPs for geriatric research [62] were also used. The demographic and clinical data of all of these subjects are described in the Supporting Information (Additional file 6: Table S5, Additional file 7: Table S6, Additional file 8: Table S7, Additional file 10: Table S9) and previous reports [58][59][60][61][62][63][64][65].

Smoking behavior data
The results of the questionnaire, especially questions related to smoking, were used in the analysis of the severity of nicotine dependence. Some of them included the Fagerstrӧm Test for Nicotine Dependence (FTND; a test that yields a continuous measure of nicotine dependence) [66] and Tobacco Dependence Screener (TDS; a screening questionnaire for tobacco/nicotine dependence according to the International Statistical Classification of Diseases and Related Health Problems, 10th revision [ICD-10], Diagnostic and Statistical Manual of Mental Disorders, 3rd edition, revised [DSM-III-R], and DSM-IV), which consists of 10 questions [67]. The questionnaire also included questions about the number of cigarettes smoked per day (CPD) and other smoking-related traits (see Additional file 5: Supporting Information). In the present study, the FTND, TDS, and CPD were used as measures of nicotine dependence (Additional file 12: Table S11).

Genotyping
Whole-genome genotyping and quality control A total of 300 DNA samples from the patients with smoking behavior data who visited Iwata City Hospital were used for genotyping. Whole-genome genotyping was performed using the Infinium assay II (Illumina, San Diego, CA, USA) with HumanCytoSNP-12 v2.0 (total markers: 301,232) BeadChips (see Additional file 5: Supporting Information).
A total of 225,602 SNP markers survived the initial filtration process as described in the Additional file 5: Supporting Information, and a total of 225,591 SNP markers were finally used for the GWAS after filtration using the Hardy-Weinberg equilibrium test (Additional file 13: Figure S1).

Genotyping for specific candidate SNPs
The TaqMan allelic discrimination assay (Life Technologies, Carlsbad, CA, USA) was adopted for a total of 112 and 203 DNA samples from patients who underwent major abdominal surgery and patients with METH dependence/psychosis, respectively. Additional file 14: Table  S12 provides the genotype distribution of the candidate SNPs. The details of genomic DNA extraction and the TaqMan allelic discrimination assay are described in the Additional file 5: Supporting Information and previous reports [68].
Array-based genotyping procedure for data extraction of a specific SNP To genotype the rs2653349 SNP for a total of 311 and 2,305 DNA samples from healthy subjects with schizotypal personality trait data and autopsy specimens, respectively, array-based genotyping was conducted. The genotype data for this SNP were extracted from the overall results. In the former case, genotyping was performed using Affymetrix Genome-Wide Human SNP Array 6.0 (Affymetrix, Santa Clara, CA, USA) as previously described [69]. In the latter case, all of the samples were genotyped using Illumina Infinium HumanExome Beadchips (Illumina, CA, USA, USA) with iScan. The details are described in the Additional file 5: Supporting Information.

Statistical analysis Genome-wide association study
A two-stage GWAS was conducted for patients with smoking behavior data to initially investigate the association between genetic variations and the severity of nicotine dependence as an exploratory study to narrow down candidate SNPs. Among the 300 subjects genotyped, approximately half lacked smoking behavior data that were related to nicotine dependence. Therefore, a total of 148 subjects were used for our two-stage GWAS (68 and 80 subjects for the first-and second-stage analyses, respectively). In our two-stage GWAS, 225,591 SNPs were selected for the analyses.
The GWAS procedure is summarized in Additional file 13: Figure S1. In the first-stage analysis with 68 samples, the SNPs that showed p < 0.05 were selected as candidate SNPs for the second-stage analysis among the 225,591 SNPs. For these SNPs, the second-stage analysis was conducted with 80 samples, and the SNPs that showed p < 0.05 for the single analysis of this stage and combined analysis of the first and second stages were considered potent candidates. The SNPs for further replication analysis were selected from these SNPs after linkage disequilibrium (LD)-based SNP pruning to remove redundant SNPs due to strong LD with each other (see Additional file 5: Supporting Information). In the confirmatory replication study to corroborate the possible association between the SNPs that were selected after the second stage and phenotypic trait, the q values of the false discovery rate were calculated to correct for multiple testing, in addition to p values based on previous reports [72,73]. The SNPs that showed q < 0.05 in the analysis were considered statistically significant. A log quantile-quantile (QQ) p-value plot based on the results of the GWAS for the combined samples was subsequently drawn to check the pattern of the generated p-value distribution (Additional file 15: Figure S2).

Association study for candidate SNP
For all of the genotype data used in the additional analyses after the GWAS, the distributions were checked using the χ 2 test, and the absence of significant deviation from the theoretical distribution expected from Hardy-Weinberg equilibrium was confirmed (Additional file 14: Table S12).
To explore the association between the rs2653349 SNP and phenotypes, the Mann-Whitney U-test, the χ 2 test, and Fisher's exact test were performed. The statistical methods and software used in the analyses for the candidate SNP are provided in the Additional file 5: Supporting Information. The criterion for significance was set at p < 0.05.
analysis. SK assisted with data analysis and interpretation of findings. DN, KO, RH, TA, SM, MS, MNM, YY, MikY, NoS, MM, MaT, and KI drafted the manuscript. All authors critically reviewed content and approved final version for publication.