Detailed analysis of association between common single nucleotide polymorphisms and subclinical atherosclerosis: The Multi-ethnic Study of Atherosclerosis

Previously identified single nucleotide polymorphisms (SNPs) in genome wide association studies (GWAS) of cardiovascular disease (CVD) in participants of mostly European descent were tested for association with subclinical cardiovascular disease (sCVD), coronary artery calcium score (CAC) and carotid intima media thickness (CIMT) in the Multi-Ethnic Study of Atherosclerosis (MESA). The data in this Data in Brief article correspond to the article "Common Genetic Variants and Subclinical Atherosclerosis: The Multi-Ethnic Study of Atherosclerosis" [1]. This article includes the demographic information of the participants analyzed in that article, as well as graphical displays and data tables of the association of the selected SNPs with CAC and of the meta-analysis across ethnicities of the association between CIMT-c (common carotid), CIMT-i (internal carotid), CAC-d (CAC as a dichotomous variable, CAC > 0) and CAC-c (CAC as a continuous variable, the log of the raw CAC score plus one) and the CVD SNPs. The data tables corresponding to the 9p21 fine mapping experiment, as well as the power calculations referenced in the article, are also included.


Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Specifications
Experimental factors: Genetic association studies controlling for CVD risk factors
Experimental features: The program R was used to perform genetic association studies
Data source location: Multi-Ethnic Study of Atherosclerosis locations across the US
Data accessibility: Data is within this article

Value of the data
Genetic variations play an important role in the atherosclerotic process. The data shows novel associations between genetic variations and atherosclerosis. The data also shows that previously described genetic associations with atherosclerosis vary considerably depending on ethnicity.
More research is needed to further elucidate the effect of ethnic-specific genetic variation in cardiovascular disease.

Data
Previously identified single nucleotide polymorphisms (SNPs) in genome wide association studies (GWAS) of cardiovascular disease (CVD) in participants of mostly European descent were tested for association with subclinical cardiovascular disease (sCVD), coronary artery calcium score (CAC) and carotid intima media thickness (CIMT) in the Multi-Ethnic Study of Atherosclerosis (MESA).

Study design
The MESA study, which has been described previously, was designed to investigate the impact of sCVD and CVD risk factors on the development of clinically overt CVD [2]. Approximately 38% of the recruited participants are Caucasian (EUA), 12% Chinese (CHN), 28% African American (AFA) and 22% Hispanic (HIS). Table 1 describes the demographic characteristics of the participants.

Genotype data
The 66 single nucleotide polymorphisms (SNPs) included in this study (Table 2) were obtained from the Affymetrix 6.0 GWAS dataset (MESA and MESA family data) on 8224 consenting MESA participants (2329 EUA, 691 CHN, 2482 AFA, and 2012 HIS) from the National Heart, Lung, and Blood Institute SNP Health Association Resource (SHARe) project. Absent SNPs were imputed using IMPUTE v2.2.2 [3] with the 1000 Genomes cosmopolitan Phase 1 v3 panel as a reference. Genotypes were filtered for SNP level call rate <95% and individual level call rate <95%, and monomorphic SNPs as well as SNPs with heterozygosity >53% were removed. Allele frequencies were calculated separately within each racial/ethnic group, and only those SNPs with minor allele frequencies >0.01 were included in genetic association analyses. We further filtered imputed SNPs based on imputation quality >0.5, using the observed versus expected variance quality metric, and filtered genotyped SNPs for Hardy-Weinberg equilibrium P-value ≥10⁻⁵.
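The SNP-level filters above can be summarized as a single pass/fail rule. The following Python sketch is illustrative only and is not the authors' pipeline: the record layout (SnpQc), the function name passes_qc and the decision to apply the call rate, heterozygosity and Hardy-Weinberg filters to genotyped SNPs only are assumptions made for the example. The thresholds themselves are those stated in the text.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds taken from the text; the constant names are illustrative.
MIN_CALL_RATE = 0.95   # SNP level call rate below 95% is filtered out
MAX_HET = 0.53         # heterozygosity above 53% is removed
MIN_MAF = 0.01         # only MAF > 0.01 is analyzed (also drops monomorphic SNPs)
MIN_INFO = 0.5         # imputation quality must exceed 0.5
MIN_HWE_P = 1e-5       # genotyped SNPs need HWE P-value >= 1e-5


@dataclass
class SnpQc:
    """Per-SNP summary within one racial/ethnic group (hypothetical layout)."""
    snp_id: str
    imputed: bool
    maf: float
    call_rate: Optional[float] = None       # genotyped SNPs only (assumption)
    heterozygosity: Optional[float] = None  # genotyped SNPs only (assumption)
    info: Optional[float] = None            # imputed SNPs only
    hwe_p: Optional[float] = None           # genotyped SNPs only


def passes_qc(snp: SnpQc) -> bool:
    """Return True if the SNP survives the filters described in the text."""
    if snp.maf <= MIN_MAF:
        return False
    if snp.imputed:
        return snp.info is not None and snp.info > MIN_INFO
    # Genotyped SNP: a missing statistic is treated as a failure (assumption).
    if snp.call_rate is None or snp.heterozygosity is None or snp.hwe_p is None:
        return False
    return (
        snp.call_rate >= MIN_CALL_RATE
        and snp.heterozygosity <= MAX_HET
        and snp.hwe_p >= MIN_HWE_P
    )
```

Because allele frequencies were calculated within each racial/ethnic group, such a rule would be evaluated once per group, so a SNP may be retained in one group and excluded in another.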

sCVD measurement
The imaging outcomes in the present study are coronary artery calcium [CAC, analyzed either as a continuous variable, the log of the raw Agatston CAC score plus one (CAC-c), or as a dichotomous variable (CAC-d) with CAC > 0] and carotid artery intima-media thickness [CIMT; internal carotid intima media thickness (CIMT-i), common carotid intima media thickness (CIMT-c)]. CAC was measured by either electron-beam tomography or multi-detector computed tomography, as described previously [4]. All scans were read at the Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center. Measurements of CAC were adjusted between the different field centers and imaging machines by using a standard calcium phantom of known density, which was scanned with each participant. CAC was calculated as described previously [5], and the mean value from two scans was used for analysis.
CIMT measurements were performed by B-mode ultrasonography of the right and left, near and far walls, and images were recorded using a Logiq 700 ultrasound device (General Electric Medical Systems, Waukesha, WI). Maximal CIMT-i and CIMT-c were measured as the mean of the maximum values of the near and far walls of the right and left sides at a central ultrasound reading center (Department of Radiology, New England Medical Center, Boston, MA), as described previously [6].

Statistical analyses
Given their skewed distributions, the common (CIMT-c) and internal (CIMT-i) IMT values were log transformed. CAC was analyzed as a continuous variable by obtaining the log of the raw CAC score plus one (CAC-c) or as a dichotomous variable (CAC-d) with CAC > 0. Analyses were first performed stratified within each racial/ethnic group. For analyses involving EUA and CHN, an unrelated subset of individuals was constructed by selecting at most one individual from each pedigree. Among AFA and HIS, phenotypes with a substantial familial component were analyzed using a linear mixed-effects model (continuous variables) or generalized estimating equations (dichotomous variables). Associations between each SNP and each individual phenotype were determined using separate multiple linear regressions (continuous variables) or logistic regressions (dichotomous variables) assuming an additive model.
Two models were used to analyze the data. Model 1 accounted for age, sex, site of ascertainment, and principal components. Model 2 included the Model 1 covariates plus HDL cholesterol (HDL-C), LDL cholesterol (LDL-C), triglycerides, body mass index (BMI), hypertension status (self-report of physician-diagnosed hypertension along with use of antihypertensive medication, or systolic blood pressure of 140 mm Hg or greater and/or diastolic blood pressure of 90 mm Hg or greater), diabetes status (fasting blood glucose of 126 mg/dL or greater or use of diabetes medications), and current smoking (self-reported smoking within the past 30 days). Fixed effect meta-analysis, as implemented in METAL [23], was used to combine results across all four race/ethnic groups. Fig. 1 shows associations of CAC-c by ethnicity. Fig. 2 shows SNP associations with sCVD in a meta-analysis across ethnicities.
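The outcome transformations and the fixed effect combination of stratified results can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions, not the METAL implementation: the natural logarithm is assumed for "the log", the pooling uses standard inverse-variance weights on per-group effect estimates and standard errors, the P-value uses a normal approximation, and all function names are hypothetical.

```python
import math
from statistics import NormalDist
from typing import Sequence, Tuple


def cac_continuous(raw_cac: float) -> float:
    """CAC-c: log of the raw CAC score plus one (natural log assumed)."""
    return math.log(raw_cac + 1.0)


def cac_dichotomous(raw_cac: float) -> int:
    """CAC-d: 1 if any coronary calcium is present (CAC > 0), else 0."""
    return 1 if raw_cac > 0 else 0


def fixed_effect_meta(
    betas: Sequence[float], ses: Sequence[float]
) -> Tuple[float, float, float]:
    """Inverse-variance fixed effect pooling of per-group estimates.

    Returns the pooled effect, its standard error and a two-sided P-value
    based on a normal approximation.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return pooled_beta, pooled_se, p_value
```

In this scheme each race/ethnic group contributes one effect estimate and one standard error per SNP and phenotype, and groups with smaller standard errors receive proportionally more weight in the pooled estimate.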
Fine mapping of the 9p21 region (100 kb upstream or downstream from SNPs rs1333049, rs4977574, and rs16905644) was performed for each ethnic group by selecting all SNPs in the chromosome 9 imputation set (NCBI Build 37) between positions 21997022 and 22225503. A total of 3282 SNPs were identified (598, 631, 1256 and 797 SNPs in EUA, CHN, AFA and HIS, respectively). This list was supplemented with novel SNPs identified by deep sequencing efforts in this region [24,25]. Because each ethnicity has its own LD structure, we used an eigen-decomposition to estimate the effective number of independent SNPs in each race/ethnic group, and thereby accounted for multiple comparisons in each of the race/ethnic-specific analyses [26]. Table 3 shows the association for SNPs in the 9p21 region and CAC-c in EUA and HIS. Table 4 shows the association for SNPs in the 9p21 region and sCVD across ethnicities.
Significance was defined by Bonferroni correction, dividing an alpha of 0.05 by the number of SNPs tested (p < 7.6 × 10⁻⁴ for the initial analysis of 66 SNPs, i.e. 0.05/66, with a greater number of SNPs used for the correction in the fine mapping effort). To assess genetic heterogeneity seen in stratified analyses of the four MESA race/ethnic groups, we used the I² heterogeneity metric to quantify the proportion of total variation across studies attributable to heterogeneity rather than chance [27]. Tables 5 and 6 show power calculations for dichotomous and quantitative traits.
Figure caption (opening words missing from the source): ... and CAC-c (log of the raw CAC score plus one) and CVD and sCVD SNPs. A linear regression assuming an additive model and controlling for age, gender, site of ascertainment, principal components, HDL cholesterol (HDL-C), LDL cholesterol (LDL-C), triglycerides, BMI, hypertension status, diabetes status and tobacco was performed in each ethnic group as described above. The program METAL was used to conduct a fixed effect meta-analysis to combine estimated effects and standard errors from stratified analyses. The dots represent CVD and sCVD SNPs identified in prior GWAS, as detailed in Table 1. The y-axis represents the −log10 of the p-value and the dotted line the Bonferroni corrected significance threshold.
Table 6. Power to detect a genetic additive effect assuming a type I error rate of 7.6 × 10⁻⁴ given 66 SNPs tested (0.05/66) for a quantitative trait with a population standard deviation of 0.11 as a function of SNP effect size (beta) and minor allele frequency (MAF). The estimation of standard deviation as well as SNP effect size are based on published IMT and genetic association data.
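The Bonferroni threshold, the I² metric and the kind of power calculation summarized for quantitative traits can be sketched in a few lines of Python. This is a hedged illustration rather than the authors' code: I² is computed from Cochran's Q in its usual form, the power function uses a normal approximation for a two-sided test of an additive SNP effect under Hardy-Weinberg proportions, and the sample size argument n is a hypothetical input that the text does not specify. All function names are illustrative.

```python
import math
from statistics import NormalDist
from typing import Sequence


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold, e.g. 0.05 / 66 for the 66 SNPs."""
    return alpha / n_tests


def i_squared(betas: Sequence[float], ses: Sequence[float]) -> float:
    """I^2 (in percent) from Cochran's Q across stratified estimates."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    if df <= 0 or q <= 0.0:
        return 0.0
    # Negative values are truncated at zero, as is conventional.
    return max(0.0, (q - df) / q) * 100.0


def additive_power(beta: float, maf: float, n: int, sd: float, alpha: float) -> float:
    """Approximate power for a quantitative trait under an additive model.

    Genotype variance is 2 * maf * (1 - maf); sd is the trait standard
    deviation (0.11 in the Table 6 caption); n is a hypothetical sample size.
    """
    normal = NormalDist()
    shift = abs(beta) * math.sqrt(2.0 * maf * (1.0 - maf) * n) / sd
    z_crit = normal.inv_cdf(1.0 - alpha / 2.0)
    return normal.cdf(shift - z_crit) + normal.cdf(-shift - z_crit)
```

For example, bonferroni_threshold(0.05, 66) gives the 7.6 × 10⁻⁴ threshold used for the initial analysis, and the same value can be passed as alpha to additive_power to explore how power changes with effect size and minor allele frequency for an assumed sample size.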