Introduction

Obesity is a major public health problem, and the World Health Organization defines it in terms of body mass index (BMI, weight/height², kg/m²).1 BMI is a convenient, simple and widely adopted measure for evaluating obesity. In the Chinese population, a BMI of 18.5–23.9 is considered optimal, 24.0–27.9 overweight, and 28.0 or above obese.2, 3 Over the years, obesity-related mortality has consistently increased.4

Obesity is a multifactorial and heterogeneous condition that results from alterations of various genes;5 a genetic contribution of at least 40% has been established for human obesity.6 Earlier linkage studies,7 candidate gene studies8 and recent genome-wide association studies9 have identified several genomic regions and candidate genes that harbor genetic variants contributing to obesity. However, none of these genes or genomic regions has been found to explain more than 10% of the variation in any obesity phenotype. The genetic factors underlying obesity therefore remain largely unknown.9

Copy number variation (CNV) is a change in the copy number of DNA fragments ranging from 1 kilobase (kb) to several megabases (Mb). Many CNVs and several hundred CNV regions have been identified in human populations.10, 11, 12 Recent studies showed that CNVs occur frequently in individuals predisposed to diseases such as mental retardation and autism.13, 14 Investigation of CNVs may therefore help to clarify the genetic basis of variation in biological functions and phenotypes. However, it is unknown whether CNVs can be used as genetic markers to locate genes associated with BMI.

In this research, we performed a genome-wide CNV analysis in 597 elderly Chinese Han subjects using the Affymetrix GeneChip Human Mapping 500 K Array Set, which has been used successfully to detect changes in genomic structure.15, 16 Using the resulting genome-wide set of CNVs, we performed, for the first time, an association analysis, which suggested that CNVs may be associated with BMI variation in the Chinese population.

Materials and methods

Research subjects

The study was approved by the local institutional review boards of all the participating institutions. After signing an informed consent form, subjects completed a structured questionnaire covering anthropometric variables, lifestyle and medical history. The sample for the genome-wide CNV analyses consisted of 597 (258 males and 339 females) elderly Chinese Han subjects. All the subjects were unrelated northern Chinese Han adults living in the city of Xi’an and its vicinity.

Phenotype

Total body weight was measured in a standardized fashion after the removal of shoes and heavy outer clothing using a calibrated balance beam scale. Height was measured after removal of shoes using a stadiometer and recorded to the nearest 0.1 cm. The average weight of the 597 subjects was 59.4±11.4 kg, and the average height was 160.8±8.9 cm. BMI (kg/m²) was calculated as the subject's weight in kilograms divided by the square of the height in meters.
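The BMI calculation and the Chinese cut-offs quoted in the Introduction can be sketched as follows. This is a minimal illustration rather than the software used in the study; the function names and the label for values below 18.5 are assumptions introduced here.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index in kg/m^2 from weight in kg and height in cm."""
    height_m = height_cm / 100.0
    return weight_kg / (height_m ** 2)


def classify_bmi_chinese(value: float) -> str:
    """Category under the Chinese cut-offs (18.5-23.9, 24.0-27.9, >=28.0)."""
    if value < 18.5:
        # The text does not name this range; the label is an assumption.
        return "below optimal range"
    if value < 24.0:
        return "optimal"
    if value < 28.0:
        return "overweight"
    return "obese"


# A hypothetical subject at the sample's mean weight and mean height.
example = bmi(59.4, 160.8)
print(round(example, 1), classify_bmi_chinese(example))  # 23.0 optimal
```

Note that the BMI computed from the mean weight and mean height is not the same quantity as the sample's mean BMI, which is reported in Table 1.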

Genome-wide genotyping

Genomic DNA was extracted from peripheral blood leukocytes using standard protocols. Affymetrix Human Mapping 500 K array sets (Affymetrix, Santa Clara, CA, USA), which consisted of two chips (Nsp and Sty) with 250 000 single nucleotide polymorphisms (SNPs) each, were used to genotype each subject from the Chinese sample according to the Affymetrix protocol. Briefly, 250 ng of genomic DNA was digested with the restriction enzyme NspI or StyI. Digested DNA was adaptor-ligated and PCR-amplified for each enzyme-digested sample. Fragmented PCR products were then labeled with biotin, denatured and hybridized to the arrays. Arrays were then washed and stained using phycoerythrin on the Affymetrix Fluidics Station FS450, and scanned using the GeneChip Scanner 3000 7G. Data management and analyses were conducted using the Affymetrix GeneChip Operating System. Genotyping calls were determined from the fluorescent intensities using the Dynamic Modeling algorithm with a 0.33 P-value setting,17 as well as the Bayesian Robust Linear Model with Mahalanobis Distance (BRLMM) algorithm.18 After repeated experiments, all the samples had a call rate of 95% and were thus all included in the subsequent analyses. The final mean BRLMM call rate reached 99.02%.

Assessment of genetic background

The program STRUCTURE 2.2,19 and the method of genomic control20 were applied to detect possible population stratification in the Chinese sample. Two thousand SNPs that were in Hardy–Weinberg equilibrium were randomly selected genome-wide to cluster all the subjects. The program uses a Markov chain Monte Carlo algorithm to cluster individuals into different cryptic subpopulations based on multilocus genotype data. Potential substructure was estimated under the a priori assumption of K=2 discrete subpopulations. For genomic control, we estimated the inflation factor (λ) on the basis of genome-wide SNP information.
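For readers unfamiliar with genomic control, one common estimator of the inflation factor divides the median of the observed 1-degree-of-freedom chi-square association statistics by the median expected under the null hypothesis (about 0.455). The sketch below assumes this median-based estimator; the text does not state which estimator was used, and the statistics shown are hypothetical.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(chi2_stats):
    """Genomic-control lambda: observed median statistic over expected median."""
    if not chi2_stats:
        raise ValueError("need at least one test statistic")
    return median(chi2_stats) / CHI2_1DF_MEDIAN


# Hypothetical per-SNP chi-square statistics, for illustration only.
stats = [0.02, 0.10, 0.30, 0.46, 0.90, 1.60, 3.10]
lam = inflation_factor(stats)
print(f"lambda = {lam:.3f}")
```

A value close to 1 indicates little inflation of the test statistics and therefore little evidence of stratification.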

CNVs and CNVRs determination

DNA CNVs were called with the Affymetrix GeneChip Chromosome Copy Number Analysis Tool 4.0, which implements a hidden Markov model-based algorithm to identify chromosomal gains and losses by comparing the signal intensity of each SNP probe set for each test subject against a reference set. As an initial analysis, we used 299 random subjects as the reference set. In calling CNVs for these 299 subjects, the subject serving as the test sample was excluded from the reference set. A CNV was called when at least three consecutive SNPs showed consistent deletion or duplication. As it was not possible to pinpoint the boundaries of each CNV using genome-wide SNP genotyping arrays, we used the positions of SNPs as approximate boundaries. After putative CNV intervals were identified in each individual, we used the following criteria to determine the boundaries of each CNV region (CNVR). If two individual CNVs overlapped, we merged them into a CNVR, taking as boundaries the SNPs from these two CNVs that gave the maximum interval. When the interval of the next overlapping individual CNV exceeded this CNVR, the boundaries were extended accordingly.21 In short, a CNVR represented the union of overlapping CNVs.
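The merging rule described above amounts to taking the union of overlapping intervals. A minimal sketch follows, assuming each CNV is represented as a (start, end) pair of SNP positions on a single chromosome and that CNVs sharing a boundary SNP count as overlapping; the function name and the positions are hypothetical.

```python
def merge_into_cnvrs(cnvs):
    """Merge overlapping (start, end) intervals on one chromosome into regions."""
    regions = []
    for start, end in sorted(cnvs):
        if regions and start <= regions[-1][1]:
            # Overlaps the current region: extend its right boundary if needed.
            regions[-1][1] = max(regions[-1][1], end)
        else:
            # No overlap with the current region: start a new one.
            regions.append([start, end])
    return [tuple(r) for r in regions]


# Hypothetical SNP-position boundaries (bp) from several individuals.
calls = [(100, 250), (200, 400), (390, 420), (900, 1000)]
print(merge_into_cnvrs(calls))  # [(100, 420), (900, 1000)]
```

Sorting by start position first means each CNV only needs to be compared with the most recently built region.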

Association analysis between CNV and BMI

For association analyses, we used the following procedure to redefine the CNVs (for those with frequencies exceeding 5%) contained in the CNVRs. We divided each complex CNVR (illustrated in Plot C in Figure 1), which included individual CNVs with discordant boundaries but overlapping regions, into several sub-CNVRs, so that the resultant sub-CNVRs had the same configuration as in Plot A or B in Figure 1. Thus, every CNVR or sub-CNVR contained only one kind of CNV, with the same boundaries as the corresponding CNVR or sub-CNVR. CNVs with frequencies >5%, defined by the above procedure, were selected for association analyses. Multiple regression analyses were used to evaluate the effects of assumed covariates (sex, age, sex×age and age²), and only significant terms (age and age², P<0.05) were included as covariates to adjust the raw BMI values for the subsequent association analyses. We used SPSS software (SPSS Inc., Chicago, IL, USA) to perform analysis of variance to test for associations between CNVs and BMI. P-values <0.05 in our study were considered nominally significant, and were further subjected to Bonferroni correction to account for multiple comparisons.
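To make the test concrete, the F statistic of a one-way analysis of variance across copy-number groups can be computed as below. This is a plain-Python sketch for illustration, not the SPSS procedure used in the study; it omits the covariate adjustment, and the group values are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    groups = [list(g) for g in groups if len(g) > 0]
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more values than groups")
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical adjusted BMI values for loss, diploid and gain groups.
loss = [26.0, 27.0, 25.0]
diploid = [23.0, 22.0, 24.0]
gain = [21.0, 22.0, 23.0]
f_stat, df_b, df_w = one_way_anova_f([loss, diploid, gain])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The P-value is then obtained from the F distribution with the two degrees of freedom returned, a step left to a statistics package here.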

Figure 1

Copy number variation (CNV) redefined for association analyses. CNV regions (CNVRs) were divided into several sub-CNVRs with the same configuration as A or B, so that every sub-CNVR contained only one kind of CNV for association analyses. (A) All the individual CNVs in a CNVR had the same boundaries. (B) All the individual CNVs in a CNVR differed by at most one single nucleotide polymorphism (SNP) on each side of the boundaries. (C) CNVR with complex overlapping regions. CNVRs of this kind were divided into several sub-CNVRs with the same configuration as A or B. (D) Precise structure of redefined CNV 10q11.22. The numbers at the start and the end of the CNVs are the physical positions of each CNV on chromosome 10.

Results

Basic characteristics of the Chinese sample, including age, weight, height and BMI, are summarized in Table 1. The STRUCTURE program showed that all Chinese subjects clustered together as one homogeneous sample. The estimated inflation factor (λ) was 1.03. These results indicated that there was no detectable population stratification in the Chinese sample.

Table 1 Basic characteristics of the Chinese sample

Combining the CNV data of all subjects, we selected 24 CNVs (Table 2) with frequencies of more than 5% from the total of 1395 CNVs for association analyses between BMI and CNVs. The selected 24 CNVs covered 9 Mb with a mean length of 387 kb. One of the 24 CNVs was associated with BMI at nominal significance (P=0.011). However, it did not remain significant after strict Bonferroni correction. This CNV, illustrated in Plot D in Figure 1, was located in 10q11.22 at physical positions 46 363 383 bp to 46 557 002 bp (named CNV 10q11.22 in Table 2).
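The Bonferroni step can be checked with simple arithmetic. The sketch below assumes that the correction is applied across the 24 CNVs tested, which the text implies but does not state explicitly.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


n_tests = 24      # CNVs tested for association (assumed number of comparisons)
alpha = 0.05      # nominal significance level used in the study
p_value = 0.011   # reported P-value for CNV 10q11.22

threshold = bonferroni_threshold(alpha, n_tests)
survives = p_value < threshold
print(f"{threshold:.5f}", survives)  # 0.00208 False
```

Under this assumption the reported P-value exceeds the corrected threshold, consistent with the statement that the association did not survive correction.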

Table 2 Characteristics of the 24 selected CNVs for association analyses with BMI in the Chinese population

CNV 10q11.22 included four genes: SYT15 (synaptotagmin XV), GPRIN2 (G protein regulated inducer of neurite outgrowth 2), PPYR1 (pancreatic polypeptide receptor 1) and LOC728643 (heterogeneous nuclear ribonucleoprotein A1 pseudogene). In our sample, 12 subjects had a CNV 10q11.22 loss (CN=0 or 1) and 18 subjects had a CNV 10q11.22 gain (CN=3, 4 or more). Association analyses showed that CNV 10q11.22 loss was associated with higher BMI at nominal significance. Compared with the 567 subjects with two copies (normal diploid), subjects with CNV 10q11.22 loss had a 12.4% higher BMI, and subjects with CNV 10q11.22 gain had a 5.4% lower BMI (Figure 2). Regression analysis showed that CNV 10q11.22 accounted for 1.6% of BMI variation.
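The reported group sizes can be checked against the sample size and the 5% frequency threshold. The sketch below assumes that the CNV frequency counts loss and gain carriers together, which the text does not state explicitly.

```python
# Group sizes reported for CNV 10q11.22.
loss, gain, diploid = 12, 18, 567

total = loss + gain + diploid
carriers = loss + gain  # assumption: frequency pools loss and gain carriers
frequency = carriers / total

print(total, carriers, f"{frequency:.2%}")  # 597 30 5.03%
```

Under this assumption the three groups sum to the 597 subjects studied, and the carrier frequency lies just above the 5% selection threshold.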

Figure 2

Comparison of body mass index (BMI) values by copy number at copy number variation (CNV) 10q11.22 in the Chinese sample. The P-value was estimated by analysis of variance.

Discussion

In our study, we tested the association between CNVs and BMI. We found that CNV 10q11.22, as a genetic marker, was associated with BMI. CNV 10q11.22 also overlaps with CNVs previously reported in the Database of Genomic Variants (http://projects.tcag.ca/variation/). Wong et al.,22 Sebat et al.,10 Pinto et al.23 and Jakobsson et al.24 reported the existence of CNVs in this region using array comparative genomic hybridization, representational oligonucleotide microarray analysis, Affymetrix 500K SNP Mapping Array and Illumina HumanHap Map 550 SNP Array.

CNV 10q11.22 covers four genes: PPYR1, SYT15, GPRIN2 and LOC728643. It is well established that PPYR1 is related to obesity. The PPYR1 gene is a key regulator of energy homeostasis and is directly involved in the regulation of food intake.25 PPYR1 (pancreatic polypeptide receptor 1), also known as neuropeptide Y receptor Y4, is a member of the seven-transmembrane-domain G-protein-coupled receptor family. Genetic variation studies have reinforced the potential influence of PPYR1 on body weight in humans.26 Pancreatic polypeptide is the preferential PPYR1 agonist.27 Peripheral administration of pancreatic polypeptide inhibits gastric emptying and decreases food intake in humans.28, 29 Recently, 7TM Pharma (Horsholm, Denmark) reported that a selective PPYR1 agonist peptide, TM30339, reduced food intake and body weight.26 These findings indicate a promising role for PPYR1 and its agonists in the treatment of human obesity. On the basis of these findings, a patent has been filed in Europe to use this gene as a potential target to treat human obesity (European Patent EP1362926). Differential expression of genes within copy number variants may lead to phenotypic variation.30 In our study, PPYR1 copy number gain was associated with lower BMI. Subjects with a PPYR1 copy number gain may produce more of the gene product, which could regulate energy homeostasis through agonists or other pathways and thereby inhibit obesity.

Animal experiments have given more complicated results. Sainsbury et al.25 reported that PPYR1 knockout mice displayed lower body weight and reduced white adipose tissue, accompanied by increased plasma levels of pancreatic polypeptide. However, deletion of PPYR1 in mice on the ob/ob background had no effect on the hyperphagia, obesity or type II diabetic phenotype.25, 31, 32

The difference between the human study and the mouse experiments may be explained as follows. First, mice may not be the most appropriate model for understanding the function of PPYR1 in humans, as mice express a functional Y6 receptor, and this receptor has also been invoked to interpret receptor knockout results.27 Second, the mouse PPYR1 amino acid sequence is 76% identical to that of human PPYR1;33 the high variability of PPYR1 across species may explain why its exact roles differ among species.27 Third, in contrast to the 597 subjects studied, PPYR1 knockout (PPYR1−/−) mice have no copy of the gene and no corresponding gene expression products.

The other three genes, SYT15, GPRIN2 and LOC728643, have not been reported to be related to any obesity phenotype. It is unknown whether interactions among the four genes contribute to BMI variation. Further functional studies are needed to identify their potential roles in obesity.

In conclusion, our genome-wide CNV study suggested that one CNV may be associated with BMI variation in the Chinese Han population. An important obesity-related gene, PPYR1, is located in this CNV. Our results suggest that CNVs may be important for BMI variation and may be used as genetic markers to locate genes associated with BMI in the Chinese population.