Association study of PCSK9 SNPs (rs505151 & rs562556) and their haplotypes with CVDs in Indian population

Abstract Background: Cardiovascular disease (CVD) has emerged as the most prevalent cause of death in India. The Proprotein Convertase Subtilisin/Kexin Type 9 (PCSK9) gene has been found to be associated with lipid levels and is considered a biomarker for susceptibility to CVD. Aim: To study the association of PCSK9 SNPs rs505151 & rs562556 and their haplotypes with CVDs in the Indian population. Subjects & methods: The present study comprised 102 angiographically proven CVD patients & 100 healthy subjects. Polymorphisms were genotyped by the Polymerase Chain Reaction-Restriction Fragment Length Polymorphism (PCR-RFLP) method. Biochemical parameters were analysed by enzymatic methods or automated analysers. Haplotype analysis was done using SHEsis software. Results: For rs505151, the dominant genetic model showed an increased risk of CVDs, with an odds ratio (confidence interval) of 4.71 (2.59–8.5) (p value = .0001). However, the rs562556 (I474V) variant was not found to be associated with clinical parameters or risk of CVDs (p value > .05). Out of four haplotypes, H3 (G-A) was found to be associated with CVDs (OR = 3.137, p value = .0001). Conclusion: The G allele of the rs505151 SNP (PCSK9) and the H3 (G-A) haplotype of rs505151 & rs562556 were found to be risk factors for CVDs in the Indian population.


Introduction
Cardiovascular diseases (CVDs) are the major cause of mortality and morbidity globally. Many previous studies have shown that CVDs are related to various risk factors such as age, gender, smoking, alcohol intake, obesity, stress, diabetes, and physical inactivity (Pletcher et al. 2005; Wu et al. 2015; Keto et al. 2016; Csige et al. 2018). Genetic factors, especially genes that play important roles in lipid metabolism, have also been found to be associated with the risk of cardiovascular diseases; the PCSK9 gene is one of them. Seidah et al. discovered the Proprotein Convertase Subtilisin/Kexin type 9 (PCSK9) gene in 2003 in a French family with hypercholesterolaemia. They also found that PCSK9 belongs to the proteinase K subfamily of the subtilase family. Its product has both extracellular and intracellular effects. It is mainly expressed in the liver (Park et al. 2004). The PCSK9 gene is considered a genetic marker for cardiovascular diseases, and inhibitors of its product may improve CVD patients' health and life expectancy (Hartgers et al. 2018). A PCSK9 genetic variant has also been reported to be correlated with the risk of atherosclerotic stroke (Abboud et al. 2007). Previous meta-analyses have shown the association of PCSK9 gene variants with increased levels of plasma total cholesterol (TC) and low-density lipoprotein cholesterol (LDL-C) and with increased risk of CVDs (Zhang et al. 2013; Adi et al. 2015). Cohen et al. (2005) reported an association between low LDL levels and loss-of-function mutations in the PCSK9 gene. The main function of the PCSK9 gene product is to promote degradation of low-density lipoprotein receptors (LDLR) through a complex mechanism. High levels of LDL contribute crucially to atherosclerotic CVDs, and the risk of CVDs can be reduced by LDL-lowering therapies (Ference et al. 2017).
Different polymorphic sites of the PCSK9 gene (rs2483205, rs11591147, rs505151, rs562556) have been studied in recent years (Qiu et al. 2017; Gai et al. 2021). Gain-of-function (GOF) mutations were found at the PCSK9 SNPs rs505151 (E670G), located in exon 12, and rs562556 (I474V), located in the exon 9 region within the linker domain, and these are found to be associated with the risk of CVDs (Ferreira et al. 2020). Recently, a study conducted in North India reported the association of the E670G (rs505151) polymorphism with coronary artery disease (Reddy et al. 2018), but no haplotype studies have been reported in the Indian population. The current study was carried out to explore the association of PCSK9 gene variants and their haplotypes with CVDs in the Indian population.

Sample selection
One hundred and two angiographically proven CVD patients and 100 healthy controls were selected from different regions of Haryana (India) for the present study. Subjects suffering from infectious diseases, metabolic diseases, malignant tumours, autoimmune disorders, and other diseases were excluded from the study as these may affect the results. Subjects without complete information were also excluded from the study. The control group comprised people who were free of any major disease and had no history of cardiovascular diseases. Pregnant women were also excluded from both groups. The included subjects were receiving regular health examinations. The age for both case and control groups ranged from 18 to 65 years. The research proposal was approved by the Institutional Human Ethical Committee (IHEC), Kurukshetra University, Kurukshetra, India (vide letter no. DZ/17/IHEC/444), and was carried out in compliance with guidelines of the Indian Council of Medical Research (ICMR), India.

Sample collection and biochemical measurements
Blood samples (3-5 ml) were collected in K2EDTA vials with prior consent from each subject involved in this study. The blood plasma was used for the measurement of biochemical parameters, and DNA was isolated from white blood cells (WBCs). Most of the patients were on antihypertensive and lipid-lowering drugs. Biochemical parameters, namely total cholesterol (TC), triglycerides (TG), high-density lipoprotein (HDL), low-density lipoprotein (LDL), and very low-density lipoprotein (VLDL), were measured by enzymatic methods. The concentration of potassium was measured by an autoanalyser.

Genomic DNA isolation
Genomic DNA was extracted by the method of Miller et al. (1988) with slight modifications. For visualisation of the DNA, 1% agarose gel electrophoresis and a UV transilluminator were used. DNA purity was assessed by measuring the absorbance ratio at 260/280 nm. The ratio was found to be between 1.6 and 2.0, and the isolated DNA was stored at −20 °C for further use.

DNA amplification (PCR) and restriction fragment length polymorphism (RFLP) of variant rs505151
For genotyping of DNA, the PCR-RFLP method was used. Amplification was done using a specific set of primers for a specific sequence of the PCSK9 gene. The following primers were used: forward primer 5′-CACGGTTGTGTCCCAAATGG-3′, reverse primer 5′-GAGAGGGACAAGTCGGAACC-3′ (He et al. 2016). The expected size of the PCR product was 440 bp. The volume of the PCR reaction mixture was 50 µl. The reaction mixture was composed of 5.0 µl PCR buffer (10×), 3.0 µl MgCl2 (25 mM), 0.5 µl dNTPs (10 mM), 0.5 µl each of forward primer and reverse primer (10 pmol), 1.2 µl Taq DNA polymerase (1.2 units), 1 µl DNA (50 ng), and 38.3 µl of nuclease-free water. Tubes containing the reaction mixture were vortexed and placed in the thermo-cycler. The PCR programme involved initial denaturation at 95 °C for 5 min, followed by 35 cycles of denaturation at 94 °C for 45 s, annealing at 62 °C for 45 s, and extension at 72 °C for 45 s, then a final extension at 72 °C for 5 min and a final hold at 4 °C. The 440 bp PCR product was checked by 1.5% agarose gel electrophoresis (Figure 1(A)).
The PCR product of the PCSK9 gene was digested with the restriction enzyme (RE) Eam1104I (New England Biolabs) by the RFLP method. The total volume of the reaction mixture was 10 µl, containing 8.5 µl PCR product, 0.5 µl RE (2-5 U), and 1 µl of 10× buffer solution. The incubation time was 1 h at 37 °C. Gel loading dye was used to prepare the samples for loading into the wells of a 2% agarose gel. The fragments obtained after digestion were 440 bp, 290 bp, and 150 bp (AG genotype); 290 bp and 150 bp (AA genotype); and 440 bp (GG genotype). The results were confirmed by using a 100 bp DNA ladder (Figure 1(B)).
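The component volumes listed for the rs505151 PCR can be checked against the stated total with a few lines of Python. This is only an illustrative sketch: it reads the volumes as microlitres, and the `master_mix` helper, the 10% pipetting overage, and the choice to add template DNA per tube are assumptions, not part of the published protocol.

```python
import math

# Per-reaction volumes for the rs505151 PCR as listed in the text,
# read here as microlitres (assumption about the unit).
REACTION_UL = {
    "PCR buffer (10x)": 5.0,
    "MgCl2 (25 mM)": 3.0,
    "dNTPs (10 mM)": 0.5,
    "forward primer": 0.5,
    "reverse primer": 0.5,
    "Taq DNA polymerase": 1.2,
    "template DNA": 1.0,
    "nuclease-free water": 38.3,
}


def total_volume(mix):
    """Sum of all component volumes in one reaction."""
    return sum(mix.values())


def master_mix(mix, n_reactions, overage=0.10, per_tube=("template DNA",)):
    """Scale shared components for n reactions (hypothetical helper).

    Components named in per_tube are left out because they are added
    to each tube individually; overage is an assumed pipetting reserve.
    """
    factor = n_reactions * (1.0 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in mix.items()
        if name not in per_tube
    }


# The listed components should add up to the stated 50 µl reaction.
assert math.isclose(total_volume(REACTION_UL), 50.0)
```

Calling `master_mix(REACTION_UL, 10)` would give the shared-component volumes for ten reactions under those assumptions.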

Primer designing and optimisation of rs562556
For amplification of the specific sequence of the PCSK9 gene at SNP rs562556, primers were designed from the FASTA-format gene sequence using the Primer3 web software. A set of primers was selected from the software output, and the UCSC In-Silico PCR online tool was then used for virtual amplification to confirm the results. For virtual restriction digestion with the FokI restriction enzyme, Restriction Mapper version 3 software was used.
[Figure caption fragment, rs505151 RFLP gel: lanes 5 and 6, AA genotype (150 bp and 290 bp); lanes 1, 2, and 3, AG genotype (150 bp, 290 bp, and 440 bp); and lane 4, GG genotype (440 bp).]

PCR and RFLP of variant rs562556
The specific primers designed for PCR were: forward 5′-TGCAGGACTGTATGGTCAGC-3′ and reverse 3′-ACACACGAAGGAGGGGTACA-5′, yielding a product of 160 bp after amplification. To perform PCR, 11.5 µl of master mix (2×) (G-Biosciences), 0.5 µl each of forward and reverse primers (10 pmol), 1 µl DNA (50 ng), and 11.5 µl nuclease-free water were used to make a final volume of 25 µl. Initial denaturation at 95 °C for 5 min was followed by 35 cycles of denaturation at 94 °C for 45 s, annealing at 52.8 °C for 45 s, and extension at 72 °C for 45 s, and a final extension at 72 °C for 10 min. The 160 bp PCR product was analysed by agarose gel electrophoresis (1.5%) (Figure 2(A)) and visualised with a UV transilluminator.
The PCR product was digested with the FokI restriction enzyme (New England Biolabs) using the RFLP method. For RFLP, a final volume of 10 µl was prepared from 8.0 µl PCR product, 1.0 µl RE (5 U), and 1 µl of 10× buffer solution. Incubation was done for 1 h at 37 °C. After digestion, bands of 51 bp, 109 bp, and 160 bp (AG genotype); 51 bp and 109 bp (AA genotype); and 160 bp (GG genotype) were obtained, which were identified by 2.5% agarose gel electrophoresis and visualised with ultraviolet illumination. To confirm the size of the bands, a 100 bp DNA ladder was used as a marker (Figure 2(B)).
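The band patterns described for the two digests map directly onto genotypes, which can be written as a small lookup. The sketch below is illustrative only: the fragment sizes come from the text, while the `call_genotype` function and its `tolerance_bp` parameter (an allowance for reading band sizes off a gel) are hypothetical and not part of the authors' workflow.

```python
# Expected fragment sizes (bp) after digestion, as listed in the text:
# Eam1104I for rs505151 (440 bp amplicon), FokI for rs562556 (160 bp amplicon).
PATTERNS = {
    "rs505151": {"AA": (150, 290), "AG": (150, 290, 440), "GG": (440,)},
    "rs562556": {"AA": (51, 109), "AG": (51, 109, 160), "GG": (160,)},
}


def call_genotype(snp, observed_bands, tolerance_bp=10):
    """Return the genotype whose expected bands match the observed sizes.

    tolerance_bp is an assumed allowance for gel-reading error.
    Returns None when no pattern fits (e.g. a failed or partial digest).
    """
    observed = sorted(observed_bands)
    for genotype, expected in PATTERNS[snp].items():
        if len(expected) != len(observed):
            continue
        # Compare band by band, smallest to largest.
        if all(abs(e - o) <= tolerance_bp for e, o in zip(sorted(expected), observed)):
            return genotype
    return None
```

For example, `call_genotype("rs505151", [440, 290, 150])` returns `"AG"`, matching the heterozygote pattern given above.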

Statistical analysis
Data were analysed statistically using IBM SPSS 21.0 software. For comparison of means between the two groups, Student's t-test was used. A chi-square (χ²) test was used to compare the categorical variables, and also to compare genotype and allele frequencies between the case and control groups. The direct gene counting method was used for the estimation of allele and genotype frequencies. In addition, logistic regression was used to obtain the OR (odds ratio) and CI (confidence interval) values in order to assess the risk. Statistical significance was defined as a p value < .05. The haplotype analysis and the evaluation of association and linkage disequilibrium of rs505151 (E670G) & rs562556 (I474V) were done using SHEsis online software (Shi and He 2005). The computed power of the study was 89.5% at 95% CI.

Demographic characteristics and biochemical parameters analysis
The present study comprised 102 angiographically proven CVD patients and 100 healthy subjects as controls. The demographic characteristics are shown in Table 1. The proportions of smokers and drinkers were significantly higher among case subjects than among controls (p value < .001). Different biochemical parameters were analysed, which revealed that the levels of TC, TG, LDL, and VLDL were higher in cases than in the control group (p value = .0001, < .001, .004, and .453, respectively). TC and LDL levels were found to be significantly increased in cases compared to the control group (p value = .0001 and .004). Significantly higher levels of HDL were found in the control group than in the cases (p value = .002). The level of potassium (K+) in cases was found to be significantly elevated compared to the control group (p value = .004) (Table 1).

Amplification and restriction digestion of variants rs505151 and rs562556 (PCSK9)
PCSK9 genetic variants rs505151 and rs562556 genotypes distribution in cases and control group
It was observed that in the rs505151 SNP, the frequencies of the AG (62.74%) and GG (1.96%) genotypes were higher in cases than in the control group (AG 28% and GG 0%). The AA genotype frequency (35.29%) was found to be lower in the cases than in the control group (72%). The frequencies of the AG and GG genotypes of the rs505151 (E670G) SNP were found to be significantly higher in case subjects than in the control group (p value = .0001). The G allele was found to be more prevalent in the case group than in the control group (p value < .001). The AA genotype and A allele frequencies were found to be significantly higher in the control group than in the cases (p value = .0001) (Table 2).
In the rs562556 SNP (I474V), the AA and GG genotype frequencies were found to be slightly higher in the control group than in the cases (p value = .490). The frequency of the AG genotype was higher in the case group than in the control subjects, but not significantly (p value = .490). The A and G allele frequencies were found to be similar in the case and control groups (p value = .844) as assessed by chi-square test. No deviations were observed from HWE (p > .05) (Table 2).
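The allele comparison for rs505151 can be illustrated with a direct gene count followed by a Pearson chi-square test on the resulting 2×2 table. This is a sketch under stated assumptions: the genotype counts (36/64/2 of 102 cases, 72/28/0 of 100 controls) are reconstructed from the reported percentages rather than taken from Table 2, the function names are hypothetical, and no continuity correction is applied, so the statistic need not match the SPSS output exactly.

```python
import math


def allele_counts(aa, ag, gg):
    """Direct gene counting: each AA carries two A, each AG one A and one G."""
    return 2 * aa + ag, 2 * gg + ag


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Genotype counts (AA, AG, GG) reconstructed from the reported percentages.
CASES = (36, 64, 2)
CONTROLS = (72, 28, 0)

case_a, case_g = allele_counts(*CASES)
ctrl_a, ctrl_g = allele_counts(*CONTROLS)
chi2, p_value = chi_square_2x2(case_a, case_g, ctrl_a, ctrl_g)
```

Under these reconstructed counts the G allele accounts for 68 of 204 case chromosomes and 28 of 200 control chromosomes, and the resulting p value is below .001, in line with the result reported above.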
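A Hardy-Weinberg equilibrium (HWE) check compares observed genotype counts with those expected from the allele frequencies. The sketch below is a generic goodness-of-fit version, not the authors' procedure; because rs562556 counts are not given in the text, it is applied to rs505151 control counts (72/28/0) reconstructed from the reported percentages, and the function name is hypothetical. With an expected count this small for one genotype the chi-square approximation is rough, so the result is indicative only.

```python
import math


def hwe_chi_square(aa, ag, gg):
    """Chi-square goodness-of-fit test (1 df) for Hardy-Weinberg proportions."""
    n = aa + ag + gg
    p = (2 * aa + ag) / (2 * n)  # frequency of allele A
    q = 1.0 - p                  # frequency of allele G
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (aa, ag, gg)
    # Skip genotypes with zero expectation (monomorphic locus).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# rs505151 control genotype counts (AA, AG, GG), reconstructed from percentages.
hwe_chi2, hwe_p = hwe_chi_square(72, 28, 0)
```

For these counts the p value is above .05, which is consistent with the absence of HWE deviation stated in the text.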

Association of PCSK9 rs505151 & rs562556 polymorphisms with CVDs
To further explore the association of the rs505151 SNP with CVDs, a logistic regression analysis was performed. The AG and GG genotypes under the dominant genetic model showed a significant correlation with CVD [OR = 4.71, CI: 2.59–8.5, p value = .0001]. The AA genotype of the rs505151 polymorphism acted as a protective factor against the risk of CVDs [OR (CI) = 0.212 (0.116–0.385), p value < .0001]. In contrast, no significant association was observed between the rs562556 variant and risk of CVDs under the dominant model (OR: 0.839, CI: 0.449–1.566, p value = .581) (Table 2).
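The dominant-model odds ratio can be illustrated with a crude 2×2 calculation and a Wald confidence interval. This is a sketch, not the authors' logistic regression: the counts (66 carriers of AG or GG and 36 AA among cases; 28 and 72 among controls) are reconstructed from the reported genotype percentages, the function name is hypothetical, and `z=1.96` assumes a 95% interval.

```python
import math


def odds_ratio_wald(exposed_cases, unexposed_cases,
                    exposed_controls, unexposed_controls, z=1.96):
    """Crude odds ratio with a Wald confidence interval on the log scale."""
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls
    )
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(
        1 / exposed_cases + 1 / unexposed_cases
        + 1 / exposed_controls + 1 / unexposed_controls
    )
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Dominant model for rs505151: AG + GG (exposed) vs. AA (unexposed),
# counts reconstructed from the reported percentages.
or_dom, ci_low, ci_high = odds_ratio_wald(66, 36, 28, 72)
```

With these counts the sketch gives an odds ratio of about 4.71 with an interval of roughly 2.60 to 8.56, close to the values reported above; swapping the exposed and unexposed groups gives the reciprocal, about 0.21, for the AA genotype.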

Inter-genotypic comparison of E670G (rs505151) & I474V (rs562556) variants with biochemical parameters of case and control groups
Comparative analysis of rs505151 SNP genotypes with biochemical parameters in cases was performed using the dominant genetic model (AG + GG vs. AA). The levels of TC and LDL were significantly higher in the AG + GG genotypes than in the AA genotype in the patient group (p value: .014 and .045, respectively), while no significant difference was observed in levels of TG, HDL, VLDL, and potassium between the AG + GG and AA genotypes (p value > .05). No significant association was observed between rs505151 SNP genotypes AA and AG + GG and biochemical parameters in the control group (p value > .05) (Table 3).
The comparative analysis of genotypes of rs562556 variant with clinical parameters in cases and the control group under the dominant genetic model did not reveal any significant results (p value >.05) (Table 4).

Discussion
Cardiovascular diseases are a worldwide challenge and cause an economic burden in both developed and developing countries. Genetic variants have a major role in the prevalence of cardiovascular diseases. Therefore, the current study was carried out to investigate the association of the PCSK9 gene E670G (rs505151) and I474V (rs562556) variants and biochemical parameters with cardiovascular diseases.
In the present study, rs505151 SNP genotyping showed a significantly higher frequency of the AA genotype in the control group and higher AG and GG genotype frequencies in the patient group (p value = .0001). Similar to our study, the AA genotype was more frequent in the control group than in the case group in a study performed by He et al. (2016), but Lin et al. (2019) found both AA and GG genotypes to be more frequent in the case group than in the control group. Some previous studies by Anderson et al. (2014), Han et al. (2014), Jeenduang et al. (2015), and Cai et al. (2018) in Brazilian, Han & Uygur, Thai, and Chinese populations, respectively, reported no significant difference in the distribution of genotypes (rs505151) between the groups (p value > .05, .537 & .399, .478, and .277, respectively), which differs from the results of the present study.
Higher levels of TC and LDL-C were observed in the AG + GG genotypes than in the AA genotype in the patient group. However, no significant association was found between the AG + GG genotypes and biochemical parameters of the control group in the current study. The G allele was found to be more prevalent in CVD patients than in control subjects, which shows the linkage of the G allele with risk of CVDs and is in accordance with the studies conducted by He et al. (2016) and Lin et al. (2019) in the Chinese population, while Hsu et al. (2009) reported a lower frequency of the G allele in the patient group than in the controls in the Chinese population. Qiu et al. (2017) found an association of the G allele with increased levels of LDL-C and TG in a meta-analysis of a Caucasian population, and Zhang N. (2014) also reported that cases with the G allele had high levels of LDL-C, TG, and TC, but in the present study the G allele was not found to be correlated with TG. Reddy et al. (2018) examined the association of PCSK9 rs505151 with CAD in a North Indian population and found an association of the G allele with risk of CAD. In contrast to our study, some studies do not report any association between LDL-C levels and risk of coronary heart disease, which may be due to gene-environment interaction or to heterogeneity of the populations studied (Scartezini et al. 2007; Polisecki et al. 2008; Huang et al. 2009).
In our study, the dominant genetic model (AG + GG vs. AA) of the PCSK9 rs505151 SNP was associated with the risk of CVDs (OR: 4.71, CI: 2.59–8.5, p value = .0001). Lin et al. (2019) reported similar results in their study (OR: 2.91, CI: 1.15–7.36, p value = .019), showing increased risk of CAD.
In the present study, the distribution of genotypes of the rs562556 variant was not found to be significantly different between the case and control groups (p value = .490). Our results agree with those reported by Shioji et al. (2004), Miyake et al. (2008), Anderson et al. (2014), Jeenduang et al. (2015), and Wanmasae et al. (2017). In contrast to our study, Gai et al. (2021) found significant differences in genotype frequency between CAD and control groups in the Chinese population (p value = .020). In our study, no association was found between clinical parameters and the different alleles (A/G) of the rs562556 variant in the two groups. The dominant genetic model did not reveal any correlation between the rs562556 variant and CVDs (OR (CI): 0.839 (0.449–1.566), p value = .581). Contrary to our results, Gai et al. (2021), under the dominant model, found the GG genotype to be a protective factor against the risk of CAD (OR: 0.57, CI: 0.34–0.95, p value: .032) and the rs562556 variant to be associated with a reduced level of LDL-C. Similarly, Shioji et al. (2004) reported an association between the rs562556 variant and reduced levels of LDL-C & TC in their studied population in Japan, but Kotowski et al. (2006), Norata et al. (2010), and Anderson et al. (2014) did not find any association of the rs562556 variant with plasma levels of LDL-C and TC. They suggested that PCSK9 can be a major determinant of plasma lipid levels. Studies performed in Caucasian Canadians and in the general population of Copenhagen found rs562556 to be associated with lower LDL-C values and reduced risk of heart disease (Mayne et al. 2013; Benn et al. 2019, respectively). The studies discussed above indicate that PCSK9 gene variants modulate lipid levels and hence are associated with risk of CVDs.
In the present study, the haplotype analysis of SNPs rs505151 (E670G) and rs562556 (I474V) of the PCSK9 gene showed an association of the H3 (G-A) haplotype with risk of CVDs (OR = 3.137, CI: 1.864–5.278, p value = .0001), and this analysis is reported for the first time in the Indian population. In a previous study, the E670G & I474V variants were found to be in linkage disequilibrium (LD) (D′ = 0.93) in a UK population (Scartezini et al. 2007), whereas Chen et al. (2005) reported partial LD between the E670G & I474V variants in the Lipoprotein Coronary Atherosclerosis Study (LCAS) in a Caucasian population. Another study of the E670G & I474V variants (PCSK9) in the Brazilian population did not report any LD (D′ = 0.1) between the two variants. Also, the frequencies of the four haplotypes constructed were found to be similar in the hypercholesterolemic (HC) and normolipidemic (NL) groups (p value > .05) (Anderson et al. 2014).
Other studies have examined different genetic variants of the PCSK9 gene, such as a large-scale phenome-wide association study of the variant rs11591147, according to which the rs11591147 SNP plays a protective role against the risk of ischaemic stroke (Rao et al. 2018). Another study, carried out to find novel genetic variants in the exon 7 region of the PCSK9 gene, did not report any pathogenic variation in the Indian population (ArulJothi et al. 2016). Differences and similarities among studies may be due to different populations, geographical differences, and ethnicity. However, genetic factors cannot be neglected.
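The D′ values quoted above measure linkage disequilibrium between two biallelic loci and can be computed from the four haplotype frequencies. The sketch below shows the standard D, D′, and r² definitions; it is not the SHEsis implementation, the function name is hypothetical, and the frequencies in the example call are toy values chosen for illustration, not data from this or any cited study.

```python
def ld_stats(p_aa, p_ag, p_ga, p_gg):
    """D, D' and r^2 for two biallelic loci from haplotype frequencies.

    Arguments are frequencies of haplotypes written as
    (allele at locus 1)-(allele at locus 2): A-A, A-G, G-A, G-G.
    """
    total = p_aa + p_ag + p_ga + p_gg
    p_aa, p_ag, p_ga, p_gg = (x / total for x in (p_aa, p_ag, p_ga, p_gg))
    p1 = p_aa + p_ag  # frequency of allele A at locus 1
    p2 = p_aa + p_ga  # frequency of allele A at locus 2
    d = p_aa - p1 * p2
    # D' scales D by its maximum possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p1 * (1 - p2), (1 - p1) * p2)
    else:
        d_max = min(p1 * p2, (1 - p1) * (1 - p2))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p1 * (1 - p1) * p2 * (1 - p2)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Toy frequencies (not study data): only A-A and G-G occur, i.e. complete LD.
d, d_prime, r2 = ld_stats(0.5, 0.0, 0.0, 0.5)
```

A D′ near 1, as in this toy case, means that at least one of the four possible haplotypes is essentially absent, while a D′ near 0 indicates that the alleles at the two sites combine almost independently.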

Conclusion
The present study concludes that individuals with the G allele of the PCSK9 rs505151 SNP are more prone to cardiovascular diseases (CVDs) than those carrying the A allele. The A allele was found to be a protective factor against the risk of CVDs. No association of the rs562556 variant with CVD risk was found. The haplotype analysis revealed that the H3 (G-A) haplotype of the rs505151 (E670G) and rs562556 (I474V) variants of the PCSK9 gene was associated with the risk of CVDs, whereas the H1 (A-A) haplotype was found to be a protective factor against the risk of cardiovascular diseases. However, large cohort studies are required to evaluate the association of the PCSK9 rs505151 genetic variant with the development of CVDs.

Table 1. Comparative analysis of demographic characteristics and biochemical parameters between CVD patients and control group.

Table 2. Genotype and allele frequency of PCSK9 rs505151 and rs562556 SNPs in cases and the control group.

Table 3. Inter-genotypic variations (rs505151) of biochemical parameters in the cases and control group.

Table 4. Inter-genotypic variations (rs562556) of biochemical parameters in the cases and control group.