Introduction

Both environmental and host susceptibility factors are relevant to the etiology of nonhereditary cancers. Although it has been difficult to estimate host susceptibility factors accurately and comprehensively in cancer epidemiology, the analysis of genetic polymorphisms offers a new opportunity to address this difficulty. Information on variation in genes that may modify the dose dependency or effects of carcinogenic exposure will help to identify more accurately which risks are amenable to intervention and to establish tailor-made prevention measures.

The objective of this study was to describe the allele frequencies of single nucleotide polymorphisms (SNPs) in selected genes that may be important in future gene-environment studies in Japan. Genes were selected based on their possible involvement in gene-environment/life-style interactions and comprise genes encoding xenobiotic-metabolizing enzymes, DNA repair enzymes, and other stress-related proteins.

Although many studies have reported allele frequencies of various SNPs in Japan, few have selected their subjects by a strict random sampling method. Here we report allele frequencies of a large number of SNPs typed in samples obtained from middle-aged Japanese men randomly selected from areas served by five public health centers in different prefectures.

Subjects and methods

Subjects

The selection of the five study areas and subjects has been described previously (Tsugane et al. 1992a; Tsugane et al. 1992b). Briefly, each study area had a population of approximately 100,000 people and was covered by a single health center that supervised the health administration of the several cities, towns, and villages in the area. Men aged 40–49 years were selected by random sampling from the publicly available resident registration rolls of the following municipalities: Ninohe, Iwate Prefecture (n=175), Yokote, Akita Prefecture (n=170), Katsushika-kita, Tokyo (n=195), Saku, Nagano Prefecture (n=170), and Ishikawa, Okinawa Prefecture (n=170). These study areas differ in their characteristic cancer mortality rates (Tsugane et al. 1992a). Each selected individual was sent a letter, accompanied by a prepaid reply postcard, explaining the purpose of the study and requesting voluntary participation. To achieve a sufficient response rate, follow-up letters, telephone calls, and home visits were used to encourage participation.

In addition to providing blood and urine samples, the subjects gave information about their life style and health-related condition in a questionnaire-based interview conducted by registered nurses or nutritionists who, as local public employees, are bound by a professional obligation of strict confidentiality under the Local Public Service Law. The overall participation rate was 72% (634 out of 880). The surveys were done in 1989 for Ninohe and Ishikawa, in 1990 for Yokote and Saku, and in 1991 for Katsushika-kita.

DNA samples

A total of 25 ml of blood was drawn by venipuncture and divided into three tubes for extraction of plasma, buffy coat, and serum. The 11-ml heparinized sample was immediately centrifuged for 10 min at 2,500–3,000 rpm to obtain plasma and a buffy coat layer. Genomic DNA was extracted from the buffy coat layer using a commercial kit (Wako, Osaka, Japan). Some samples from Iwate, Okinawa, and Tokyo were used in another study (Sugimura et al. 1998), leaving 537 DNA samples (98 from Akita, 121 from Iwate, 111 from Nagano, 105 from Okinawa, and 102 from Tokyo). A total of 339 extractions yielded 0.5 μg or more DNA and were used for the present study (77 from Akita, 77 from Iwate, 104 from Nagano, 45 from Okinawa, and 36 from Tokyo), but some DNA samples were exhausted before the completion of the analyses, leaving a sample size of 207 (64 from Akita, 54 from Iwate, 67 from Nagano, 14 from Okinawa, and 8 from Tokyo), for which all SNPs were analyzed. Because the substantial variation in the amount of DNA extracted from the buffy coat layer was most likely due to technical reasons and should have nothing to do with the genotype per se, it is reasonable to suppose that random sampling is well preserved in the present study.
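The sample attrition described above can be tabulated as a quick consistency check. The following is a minimal sketch that uses only the per-area counts stated in the text; the stage labels and function names are illustrative and are not part of the original study.

```python
# Per-area sample counts as stated in the text, at each stage of DNA processing.
# The stage names are illustrative labels, not terms used by the authors.
AREAS = ["Akita", "Iwate", "Nagano", "Okinawa", "Tokyo"]
STAGES = {
    "available": [98, 121, 111, 105, 102],     # DNA samples left after the earlier study
    "sufficient_dna": [77, 77, 104, 45, 36],   # extractions yielding 0.5 ug or more DNA
    "all_snps_typed": [64, 54, 67, 14, 8],     # samples for which all SNPs were analyzed
}


def stage_totals(stages):
    """Return the total number of samples at each stage."""
    return {name: sum(counts) for name, counts in stages.items()}


def retention_by_area(stages, first="available", last="all_snps_typed"):
    """Fraction of the samples at stage `first` that remain at stage `last`, per area."""
    return {
        area: done / start
        for area, start, done in zip(AREAS, stages[first], stages[last])
    }


totals = stage_totals(STAGES)
retention = retention_by_area(STAGES)

for area in AREAS:
    print(f"{area:8s} retained {retention[area]:.2f}")
print(totals)
```

The stage totals reproduce the figures of 537, 339, and 207 samples given in the text.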

Ethical issues

All DNA samples were rendered anonymous by removing links with specific individual information, i.e., any ID, name, or address. The protocol was approved by the ethics review committee of the National Cancer Center (protocol number G12-02).

SNP analyses

We developed 289 SNP typing assays for 44 genes using a mass spectrometry-based technique, MassARRAY (Sequenom, San Diego, CA, USA; Ross et al. 1998).

The following 44 genes were selected from genes encoding xenobiotic-metabolizing enzymes, DNA repair enzymes, and other stress-related proteins: cytochrome P450 genes (CYP1A1, CYP1A2, CYP1B1, CYP2A6, CYP2C9, CYP2C19, CYP2D6, CYP2E1, CYP3A4, CYP17A1, CYP19A1), aryl hydrocarbon receptor gene (AHR), estrogen receptor genes (ESR1, ESR2, ESRRG), progesterone receptor gene (PGR), epoxide hydrolase genes (EPHX1, EPHX2), hydroxysteroid (17-beta) dehydrogenase genes (HSD17B2, HSD17B3), glutathione S-transferase genes (GSTM2, GSTM3, GSTT2, GSTP1), N-acetyltransferase genes (NAT1, NAT2), catechol-O-methyltransferase gene (COMT), alcohol dehydrogenase genes (ADH1A, ADH1B, ADH1C), aldehyde dehydrogenase gene (ALDH2), nitric oxide synthase genes (NOS2A, NOS3), interleukin genes (IL1A, IL1B), repair genes for oxidative DNA damage (OGG1, NUDT1 [MTH1]), dopamine receptor genes (DRD2, DRD3, DRD4), serotonin transporter gene (SLC6A4), glucocorticoid receptor gene (NR3C1 [GCCR]), folate-metabolizing enzyme gene (MTHFR), and quinone oxidoreductase gene (NQO1). For each gene, between five and seven SNPs were chosen from public databases and published papers.

Results and discussion

Of the 289 SNPs in 44 genes for which MassARRAY assays were initially designed, the assays could not be optimized for 14 SNPs, including all six SNPs from CYP2D6. Among the remaining 275 SNPs in 43 genes that were successfully genotyped, 122 SNPs and three genes had to be excluded for one of the following reasons: monoallelism, a minor allele frequency of less than 1% in our study populations, deviation from Hardy-Weinberg equilibrium at the 5% significance level, or revisions of the GenBank database that relocated some SNPs to other genes.
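The minor allele frequency and Hardy-Weinberg criteria above can be illustrated with a short sketch. The text does not state which Hardy-Weinberg test was used, so the chi-square goodness-of-fit test with one degree of freedom shown here is an assumption, and the function names and genotype counts are hypothetical.

```python
import math


def allele_frequencies(n_aa, n_ab, n_bb):
    """Return (freq_A, freq_B) computed from the genotype counts AA, AB, and BB."""
    n_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_a = (2 * n_aa + n_ab) / n_alleles
    return freq_a, 1.0 - freq_a


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test (1 df) against Hardy-Weinberg proportions.

    Assumption: the source does not say whether a chi-square or an exact test was used.
    """
    n = n_aa + n_ab + n_bb
    p, q = allele_frequencies(n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def passes_filters(n_aa, n_ab, n_bb, min_maf=0.01, alpha=0.05):
    """Apply the exclusion criteria: monoallelic or rare SNPs, then HWE deviation."""
    maf = min(allele_frequencies(n_aa, n_ab, n_bb))
    if maf < min_maf:  # also excludes monoallelic SNPs (maf == 0)
        return False
    _, p_value = hwe_chi_square(n_aa, n_ab, n_bb)
    return p_value >= alpha


# Hypothetical genotype counts, for illustration only.
print(passes_filters(49, 42, 9))    # genotype counts in exact HWE proportions
print(passes_filters(60, 20, 20))   # strong heterozygote deficit
print(passes_filters(207, 0, 0))    # monoallelic
```

For SNPs with small expected genotype counts, an exact Hardy-Weinberg test may be preferable to the chi-square approximation.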

Allele frequencies of the remaining 153 SNPs in 40 genes (Table 1) were calculated by combining the allele frequencies from the five geographical regions. There was no evidence of significant differences in allele frequencies among the five geographical regions (Fisher’s exact test with Bonferroni correction), although such differences might be detected in an analysis with a larger sample size. All of these SNPs were biallelic and in Hardy-Weinberg equilibrium, and their minor allele frequencies were more than 1%. The allele frequencies of 46 of the SNPs analyzed in this study were also reported in the JSNP project (Hirakawa et al. 2002), and significant differences were found for only four SNPs (p<0.05, Fisher’s exact test). In the JSNP study, allele frequencies were determined for 752 unrelated Japanese volunteer subjects; further demographic details are not available.
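The allele frequency comparisons described above can be sketched as a two-sided Fisher’s exact test on a 2×2 table of allele counts, followed by a Bonferroni adjustment. This is an illustration under stated assumptions: the text does not specify whether the regional comparison used pairwise 2×2 tables or a single table covering all five regions, nor how many tests entered the Bonferroni correction. The function names and allele counts are hypothetical.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Rows are the two groups being compared; columns are the minor and
    major allele counts.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x minor alleles in group 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    p_value = sum(
        p for p in (prob(x) for x in range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7)
    )
    return min(1.0, p_value)


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: each p multiplied by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical minor/major allele counts for one SNP in two groups.
p_single = fisher_exact_2x2(40, 88, 35, 99)
# Hypothetical raw p-values from three SNPs, adjusted for three tests.
adjusted = bonferroni([0.004, 0.03, 0.6])
print(p_single, adjusted)
```

With many SNPs tested, the Bonferroni adjustment is conservative, which is consistent with the caveat in the text that regional differences might still be detected with a larger sample size.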

Table 1 Single nucleotide polymorphism (SNP) allele frequencies

Among the 50 polymorphisms in 241 noncancer Japanese outpatients at the Aichi Cancer Center Hospital reported by Hamajima et al. (2002), the following minor allele frequencies were reported for the ten SNPs that overlapped with those in our study: Glu487Lys in ALDH2 (rs671), 0.278; Val158Met in COMT (rs4680), 0.346; T-34C in CYP17 (rs743572), 0.435; Ser326Cys in OGG1 (rs1052133), 0.471; C-889T in IL1A (rs1800587), 0.085; C-31T and C-511T in IL1B (rs1143627 and rs16944), 0.450 and 0.441, respectively; Ala222Val and Glu429Ala in MTHFR (rs1801133 and rs1801131), 0.405 and 0.193, respectively; and Pro187Ser in NQO1 (rs1800566), 0.421. Only the allele frequency for ALDH2 was significantly different from that observed in the present study (p<0.05, Fisher’s exact test). Because the study by Hamajima et al. was hospital-based, it might not be comparable with our population-based study using a random sample; further studies will be needed to determine the reason for the difference in the allele frequencies of ALDH2.

The enrollment of representative subjects is essential for studying allele frequencies in the population of a given area. However, few studies have used a strict random sampling method in Japanese populations. In a study using a random sample of 445 Japanese rural residents in Hyogo Prefecture, Lwin et al. (2002) reported that the minor allele frequency of C677T in the MTHFR gene (rs1801133) was 0.40, a frequency not significantly different from that in our study.

Here we report allele frequencies of SNPs in genes that may modify the dose dependency or effects of exposure to cancer risk factors in a population-based study with random sampling in Japan. The primary aim of this study is to offer basic information on the genetic background of the Japanese population, which is highly useful for designing genome-based epidemiological studies, especially in the field of cancer research. The data may also be used as an alternative control population in exploratory genetic association studies that screen for disease-related genes. More in-depth analyses are underway, including assessment of possible differences in allele/genotype frequencies among the populations from the five prefectures, of population stratification in Japan, and of gene-life style interactions.