Genetic variation of the ABC transporter gene ABCC1 (Multidrug resistance protein 1 – MRP1) in the Polish population

Multidrug resistance-associated protein 1 (MRP1), encoded by the ABCC1 gene, is an ATP-binding cassette transporter mediating efflux of organic anions and xenobiotics; its overexpression leads to multidrug resistance. In this study, 30 exons (from 31 in total) of the ABCC1 gene as well as and their flanking intron sequences were screened for genetic variation, using the High Resolution Melting (HRM) method, for 190 healthy volunteers representing the Polish population. Polymorphism screening is an indispensable step in personalized patient therapy. An additional targeted SNP verification study for ten variants was performed to verify sensitivity of the scanning method. During scanning, 46 polymorphisms, including seven novel ones, were found: one in 3’ UTR, 21 in exons (11 of them non-synonymous) and 24 in introns, including one deletion variant. These results revealed some ethnic differences in frequency of several polymorphisms when compared to literature data for other populations. Based on linkage disequilibrium analysis, 4 haplotype blocks were determined for 9 detected polymorphisms and 12 haplotypes were defined. To capture the common haplotypes, haplotype-tagging single nucleotide polymorphisms were identified. Targeted genotyping results correlated well with scanning results; thus, HRM is a suitable method to study genetic variation in this model. HRM is an efficient and sensitive method for scanning and genotyping polymorphic variants. Ethnic differences were found for frequency of some variants in the Polish population compared to others. Thus, this study may be useful for pharmacogenetics of drugs affected by MRP1-mediated efflux.


Background
The human multidrug resistance-associated protein 1 (MRP1) is a member of the ATP-binding cassette (ABC) transporter superfamily and is encoded by the ABCC1 gene [1]. MRP1 was first described by Cole et al. [2] who cloned the overexpressed transporter from lung cancer cell line H69AR. The gene is mapped to chromosome 16p13.1. Its length is around 200.000 base pairs; the gene contains 31 exons encoding a protein of 1,531 amino acids, with molecular weight of 190 kDa [3,4]. The protein has 3 hydrophobic transmembrane domains (TMDs), also called membrane-spanning domains (MSDs), named TMD0 (Nterminal), TMD1 (middle) and TMD2 (C-terminal). They contain 17 transmembrane helices (TM), 5 in TMD0 and 6 in both TMD1 and TMD2. The intracellular region of the protein contains 2 hydrophilic nucleotide binding domains (NBDs), NBD1 associated with TMD1 and NBD2 associated with TMD2 [5,6]. They are both involved in binding and hydrolyzing ATP, which is indispensable for substrate transport. Each of NBDs contains highly conserved motifs: Walker A motif which binds the β-phosphate of ATP and Walker B motif that interacts with Mg 2+ ions. There is also a third, 13 amino acid sequence motif, between Walker A and Walker B called the ABC signature [1,7].
MRP1 is ubiquitously expressed in many human tissues and physiological barriers like blood-brain or blood-testis barriers. High levels of MRP1 expression were detected in lung, kidney, heart, testis, placenta, adrenal glands and skeletal muscles, while lower levels of expression were found in intestine, colon, brain, peripheral blood mononuclear cells and liver [1,3,8]. With the exception of brain cells, MRP1, in contrast to other ABC transporters, is expressed at the basolateral membrane of polarized epithelial cells where it plays a protective role against xenobiotics and their metabolites [3,9].
Many drugs are good substrates for MRP1, so its overexpression leads to multidrug resistance (MDR), especially during cancer chemotherapy. This effect can be observed in many types of cancer cells including solid tumors (lung cancer, breast cancer, gastric and colon carcinomas, melanoma, prostate cancer, neuroblastoma) as well as in various types of leukemias [3,9]. MRP1-overexpressing cells are capable of transporting a large group of substrates, including anticancer drugs like folate-based antimetabolites, anthracyclines, Vinca alkaloids, antiandrogens and even some inorganic anions such as arsenite and arsenate. MRP1 is also an active transporter of a broad range of endogenous compounds including glutathione (GSH), glucuronate and sulfate-conjugates of organic anions, such as inflammatory mediator leukotriene C 4 or estradiol-17-β-D-glucuronide (E 2 17βG). Therefore, MRP1 plays a crucial role in efflux of organic anions, anionic drugs, xenobiotic conjugates and additionally in the maintenance of redox balance by participation in cellular response to oxidative stress [3,7,[9][10][11].
A large number of single nucleotide polymorphisms (SNPs) in ABCC1 gene were detected so far in studies of various populations, especially Asian and Caucasian. Saito et al. [12] reported 779 genetic variations in eight ABC genes in Japanese population, 95 among them concerning ABCC1. Subsequently, Fukushima-Uesaka et al. [13], screening ABCC1 in the same population, identified 86 genetic variations including 31 novel ones. Similar screening of Chinese population revealed 32 SNPs [14]. Leschziner et al. [15] screened five ABC transporter genes in Caucasian individuals and detected 221 variants, including 61 for ABCC1, among them 22 novel for this gene. They identified also large blocks of high linkage disequilibrium and low haplotype diversity across all the genes in the examined population. Comparative study including polymorphisms from four different populations: Chinese, Malay, Indian and Caucasian showed that apart from population-specific SNPs, only 6 of 71 detected SNPs altered amino acid sequence, so ABCC1 is considered to be a highly conserved gene [16]. Majority of previously reported genetic variations were found in intron sequences. However, there is evidence in literature that some SNPs change protein functioning and effectiveness of cancer chemotherapy. Such cases were reported for neuroblastoma [17], breast cancer [18], ovarian cancer [19] and other serious diseases like: chronic obstructive pulmonary disease (COPD) [20,21] cystic fibrosis [22] and major depression [23].
Previously mentioned studies have shown importance of ABCC1 polymorphisms and its genetic variation. These data can have great relevance for pharmacogenetics in personalized patient therapy. Based on the data described above for several populations, there exist ethnic differences in the frequency of ABCC1 polymorphic variants. Some of them are so strongly linked to each other that they are always inherited together, forming haplotypes. Haplotype is a combination of polymorphic variants inherited on the same chromosome and sometimes has stronger clinical relevance for drug response or adverse events than a single polymorphism [24]. Moreover, there is a large haplotype diversity across different populations [25]. Therefore, prior comprehensive analysis of the whole gene is required in the specific population before application of genetic variation data in pharmacotherapy.
Our previous studies performed during the project TESTOPLEK in which we have examined genetic variation of multidrug transporter genes in the Polish population indicated that ABCC1 polymorphisms were present at significant level in a sample of Polish individuals. However, various SNPs in the ABCC1 gene differed significantly in their frequency in tested DNA samples [26]. This prompted us to verify the whole coding sequence of ABCC1 gene. We conducted high throughput analysis and screened 30 exons (from 31 in total) of ABCC1 gene for polymorphism variants in the Polish population (which is exclusively Caucasians). This is the first study of genetic variation of ABCC1 gene in the Polish population.

Materials and genomic DNA samples
Human genomic DNA samples were derived from anonymous Polish unrelated volunteers who declared to be healthy. Samples were randomly selected from the "normal Polish population" genetic collection at the Biobank Lab, Department of Molecular Biophysics, University of Lodz. Genetic material for this collection was sampled in 2011 -2012 within the EU-funded TESTOPLEK project. All subjects gave their written informed consent to participate in the study. This study was approved by the relevant regional ethical committee (Research Bioethics Commission, University of Lodz -Decision no. 8/KBBN-UŁ/II/ 2014 and Statement of the Research Bioethics Commission, University of Lodz from 17th of June, 2010) and all procedures were performed in accordance with the Declaration of Helsinki.
Saliva was collected into Oragene OG-500 DNA collection/storage receptacles (DNA Genotek, Kanata, Canada) and genomic DNA was subsequently isolated by the MagNA Pure LC DNA Isolation Kit -Large Volume (Roche, Basel, Switzerland) with final concentration normalized to 200 pg/μl [27]. A total of 190 samples were enrolled in the scanning study and other 380 in genotyping.

Screening of ABCC1 by High Resolution Melting (HRM) Method
Investigation of ABCC1 genetic variation was conducted using High Resolution Melting method. The method is based on precise measurements of DNA melting profile. The dsDNA melting temperature depends on length and nucleotide composition of the PCR product. Even a single base variation generates a different melting curve, and many variants can be detected in this way. The list of primers used for HRM scanning of all the areas is presented in Table 1. The single reaction mixture (10 μl) was prepared using Janus® Automated Workstation (Perkin Elmer Inc., Waltham, USA) and composed of GoTaq® Colorless Master Mix (2×) (Promega, Madison, USA), LC Green Plus® dye (10x) (BioFire Defense Inc., Salt Lake City, USA), 0.5 μl of 10 μM primers mixture, 3 μl DNA, and filled up to the final volume with water. Reaction was performed on 384-micro well plates using CFX384™ realtime PCR system (Bio-Rad Laboratories Inc., Hercules, USA) on duplicate samples. The reaction conditions were as follows: initial denaturation at 95°C for 3 min, 50 amplification cycles of denaturation at 95°C for 30 s and annealing at specific temperature depending on the primers used, for 30 s. The plate was read after each cycle. Directly afterwards, melting curve was determined, the plate being incubated at 90°C for 60 s, 40°C for 60 s and from 65°C to 95°C with an increment of 0.2°C for 10 s with plate reading. The obtained data were analysed with the Bio-Rad Precision Melt Analysis Software, version 1.2 (Bio-Rad Laboratories Inc., Hercules, USA). Based on HRM melting curve analyse, confirmation of genetic variation for every melting cluster was obtained by direct sequencing method for several samples selected from each cluster (or all samples from clusters with less than 4 samples). Study workflow design for all described methods is presented in Fig. 1.

HRM genotyping
Independently of the scanning study, we performed an additional genotyping study. We focused on highly polymorphic regions of the ABCC1 gene existing in databases, often examined in recent publications. We also tested some interesting variants observed in literature that may have clinical significance. The selection provided us with 17 SNPs to check. To identify common genetic variation of selected SNPs in the ABCC1 gene, we used the HRM analysis method according to conditions described above. Sequences of primers used for genotyping are described in Table 1. Based on HRM melting curve analysis, confirmation of genetic variation for every melting cluster was obtained by direct sequencing method for several samples selected from each cluster (or all samples from clusters less than 3 samples). We used the data obtained from genotyping to compare them with HRM scanning results (comparison of minor allele frequencies -MAFs) and thereby to validate the accuracy of the screening method.

DNA sequencing
For sequencing, selected samples were prepared by PCR. Reaction mixture filled with water to a final volume of 50 μl per sample was composed using GoTaq® Green Master Mix (2×) (Promega, Madison, USA), 5 μl of 10 μM primers mix and 5 μl DNA. Reaction conditions were as follows: an initial denaturation at 95°C for 3 min, 45 amplification cycles of denaturation at 95°C for 30 s, annealing at specific temperature depending on the primers used for 30 s and extension at 72°C for 45 s, followed by final extension at 72°C for 5 min (list of primers for exon sequencing is shown in Table 1). Afterwards, PCR products were purified using NucleoSpin® Gel and PCR Clean-up kit (Macherey-Nagel GmbH & Co. KG, Düren, Germany). DNA concentration and size of PCR product was determined by agarose electrophoresis of 2 μl of each sample. Based on the intensity of bands the samples were diluted from 5 to 50 times and applied for sequencing reaction using BigDye Terminator V3.1 (Applied Biosystems, USA) according to the manufacturer's protocol. The PCR-sequencing product was purified using BigDye X-Terminator kit following the manufacturer's protocol (Applied Biosystems, USA). Further, 30 μl of each purified sample was applied to 96 wells of titration plates and analyzed in 3500 Genetic Analyzer (Applied Biosystems, USA).

Polymorphism detection
The ABCC1 genomic DNA sequence (NG_028268.1) was obtained from GenBank (http://www.ncbi.nlm.nih.gov) and used as a reference sequence during analysis of sequencing results by CodonCode Aligner software (Codon-Code Corporation, Centerville, USA). Sequencing results of selected samples were compared with respective HRM clusters. Based on this, genotype for each melting cluster was established and genetic variation for each sample was verified.

Availability of supporting data
For each of the polymorphisms detected in this study in ABCC1 gene, the following parameters were assigned: dbSNP IDs (rs numbers) (http://www.ncbi.nlm.nih.gov/ SNP/), nucleotide position within or relative to the coding sequence based on the NM_004996.3, and amino acid position in the protein for SNPs in the coding sequence    104* -upstream from the START codon. 21** -downstream from the STOP codon. Bold variants means these ones which genetic variation was detected during genotyping method based on the reference sequence NP_004987.2. These data were obtained from GenBank (http://www.ncbi.nlm.nih. gov) and were used in this paper as the nomenclature of variants.
The obtained data as supporting materials are presented in details in Additional files 1, 2, 3 and 4.

Prediction of impact of SNPs upon protein functionality
Using PolyPhen-2 v2.2.2r398 tool (http://genetics.bwh. harvard.edu/pph2/), we performed automatic prediction of effects of missense SNPs and amino acid substitutions for protein functionality and structure. PolyPhen-2 uses damaging alleles datasets, like HumDiv or HumVar, to calculate naïve Bayes posterior probability and to classify a mutation deleterious effect as benign, possibly damaging or probably damaging. This ternary classification depends on thresholds of false positive rate (FPR) denoting the chance that mutation would be falsely qualified as deleterious. For HumDiv-trained model, the thresholds are 5 % and 10 % FPR while for HumVar-trained model they are 10 % and 20 % FPR, respectively [28].

Linkage disequilibrium and haplotype blocks analysis
The observed genotype distribution was determined for all the detected polymorphisms by performing the Hardy-Weinberg equilibrium (HWE) exact test assuming consistency for P-value higher than 0.001. Linkage disequilibrium (LD) and haplotype block analysis was performed by Haploview 4.2 software (http://www. broad.mit.edu/mpg/haploview/). The LD analysis was made using |D'| and r 2 parameters for each variation pair. |D'| coefficient was preferentially used to model recombination rates in examined population, and r 2 parameter to model association power [29,30].
Studying haplotypes in the human genome requires definition of the location of haplotype blocks, which are regions with little evidence for historical recombination and within which only a few common haplotypes are observed [31]. To determine haplotype blocks in ABCC1 for the Polish population, we used the method described previously by Gabriel et al. [31] based on confidence bounds on D'. For pairs of SNPs being in strong linkage, 95 % upper confidence bound D' was higher than 0.98 and the lower bound was above 0.7 while for pairs of SNPs with strong evidences of historical recombination upper bound on D' is lower than 0.9. According to this model, polymorphisms with MAF [22] lower than 0.05 were excluded during the creation of blocks and 16 polymorphic variants with higher MAF were included. Haplotypes with frequency lower than 1 % were grouped as "others".

ABCC1 genetic variation detected in this study
We screened coding regions and exon/intron boundaries as the most important gene areas (Additional file 1). To assure optimal detection level, the maximal length of tested PCR product was never longer than 250 and 125 base pairs for scanning and genotyping, respectively. For scanning, most of exons had to be divided into separate scanned areas, which provided 51 fragments for analysis. We performed complete polymorphism screening in 30 exons in total. We were unable to scan exon 1 because of problems with amplification during PCR, caused by extremely GC-rich sequence region.
In the scanning study on 190 Polish volunteers, we found 46 different SNPs in the scanned areas, and seven of them being novel and not reported yet (Table 2) We used the data from genotyping study to validate our scanning results and compare accuracy of HRM methods (Additional file 2). For genotyping, we tested a subset of 17 SNPs, selected according to the literature or detected during scanning. Seven of them have not been found during scanning in examined samples and also did not show any genetic variation by genotyping method. Genotype validation by sequencing was performed for all of examined SNPs (Table 3). We calculated MAFs to enable us to assess the accuracy of scanning method compared to genotyping. Results obtained for eight of SNPs were quite similar and the differences were not statistically significant (Table 3). We found statistically significant difference in MAF values obtained in the both our studies for only two locic.1062 T > C (p.Asn354+) and c.2012G > T (p.Gly671Val).
All the amino acids altered by non-synonymous variants were located in cytoplasmic domains of the pro-

Linkage disequilibrium analysis
Based on full genotype sets of 44 polymorphic variants confirmed by Hardy-Weinberg equilibrium exact test (Table 2), linkage disequilibrium analysis using r 2 and |D'| statistics was performed (Additional file 4). Accordingly to the r 2 parameter, perfect linkage (r 2 = 1) was de-

Haplotype block analysis
Based on the model of Gabriel et al. [31] a total of 12 common haplotypes were identified and their frequencies were defined in four blocks (Table 4). In block 1, variants c.809 + 54C > A and c.809 + 64C > G were included, and the most frequent haplotype in this block was that containing major alleles (frequency: 0.686), the next most frequent one was the haplotype with both minor alleles (frequency: 0.231). Block 2 spanned SNPs c.825 T > C (p.Val275=), c.1062 T > C (p.Asn354=) and c.1218 + 8A > G at a distance of 1 kb, with the most frequent haplotype containing all major alleles (frequency: 0.659), subsequently the haplotype with all minor alleles (frequency: 0.300), two other ones had low frequency (set alleles TCG: 0.021 and set alleles TCA: 0.011). Block 3 had two haplotypes composed of SNPs c.1678-37G > A and c.1684 T > C (p.Leu562=). The first haplotype was the set with major alleles (frequency: 0.870) and the second one the set with minor alleles (frequency: 0.130). Block 4 was created of SNPs c.4126-45G > A and c.4487 + 18G > A at a distance of around 2 kb. The most frequent haplotype was that of alleles combination AG (frequency: 0.447), the next ones with major alleles (frequency: 0.405) and allele combination GA (frequency: 0.148). The frequency of recombination was also calculated between blocks as a value of multiallelic D' coefficient, and equaled to 0.65 between block 1 and 2, 0.55 between block 2 and 3 and 0.27 between block 3 and 4. HaploView predicted 51 possible connections of haplotypes for recombination between blocks higher than 1 %. However, recombination between blocks higher than 10 % occurred only between the most frequent haplotype from block 1 and the most frequent one from block 2, between three most frequent haplotypes from block 2 and the most frequent one from block 3, and between the most frequent one from block 3 and three haplotypes from block 4. Apart from c.1684 T > C (p.Leu562=), the remaining SNPs in blocks were tagged in silico as haplotype tag SNPs (htSNPs).

Discussion
In this paper we highlighted the fact that the High Resolution Melting method is a powerful technique both for targeted genotyping of selected SNPs and for scanning of long parts of gene sequences. We were unable to compare them with the ones from the only other publication on ABCC1 variant distribution in Caucasians because of unavailability of the raw data from that study [15].
Comparing our genetic variation results obtained using the scanning method with data for Asian populations, 18 variants were detected in the Polish population which were present also in the Japanese one (n = 153) [13], while 13 variants from the Polish population were also detected in the Chinese one (n = 27) [14]. Some of these variants exhibited significant differences in MAF values, confirmed by chi-square test for alleles (with Yates' correction if any allele quantity was below 5) and showed as their P-values, assumed significant if <0.05. Higher MAF for the Polish population was detected for variant c.809 + 54C > A (MAF = 0.231, compared to 0.059 for Japanese (P < 0.001) and 0.093 for Chinese (P = 0.020) populations), synonymous SNP c.  We performed analysis of linkage disequilibrium (which is defined as non-random allelic association of different loci) to summarize our results for those 44 polymorphisms which were found to be consistent with the Hardy-Weinberg model. It turned out that, as expected, the r 2 coefficient was most useful for analysis, and we detected polymorphic variant pairs with strong LD: our results are very similar even to those obtained for Japanese population [13]. It can be concluded on this basis that the specific linkages between particular variants are highly evolutionary conserved and occur in every population. Human genome is organized in haplotype blocks with high LD inside them, separated one from another by regions of low LD. Blocks show no evidence of historical recombination inside. This definition of haplotype blocks could be applied to 17 SNPs which had MAF higher than 0.05, which was the lower threshold in this analysis. We found four blocks and 12 common haplotypes including nine SNPs detected during this study. We determined htSNPs which should be sufficient to identify common haplotypes in population. Low haplotype diversity of ABCC1 in Polish population is consistent with the conclusion of another study on Caucasians [15]. Although blocks for our results are small and the number of SNPs used for haplotype construction is low, the main core of blocks and their borders are retained across populations regardless of the blocks definitions used. Our blocks 1 and 2, usually connected as one in other studies, extend from intron 7 to intron 9. Our block 3 include SNPs from vicinity of exon 13 and block 4 contains SNPs from intron 28 to intron 30. It corresponds to block borders detected in other studies and suggest points of recombination which had taken place in intron 12 and in intron 27 or 28 [13][14][15].
Analysis using PolyPhen-2 which predicts the influence of detected non-synonymous SNPs on protein functionality suggested that six of them are suspect. However, each analysis is only a statistical calculation based on a full dataset and some experimental data do not support these results. Létourneau et al. [33] examined 10 missense SNPs of ABCC1 and almost all of them, classified by PolyPhen-2 software as deleterious, affected neither expression nor functionality of the protein. A similar observation was also reported previously for the SNP c.2012G > T (p.Gly671Val), detected also by us. Although this substitution affects a residue near the highly conserved motif Walker A, it reduced mRNA expression but did not change transport properties of the protein, although PolyPhen-2 analysis clearly showed its damaging impact [34]. Recently, other data confirmed these observations and none of the amino acid substitutions: p.Arg633Gln, p.Gly671Val, p.Arg723Gln, detected also in our study, was found to change functionality of MRP1 transporter. Two substitutions altered drug resistance and inhibitor sensitivity and we found one of them, c.4002G > A (p.Ser1334=) [35].
On the other hand, there are other papers showing contrasting data for above-mentioned variants, proving their clinical association or experimental impact on protein functionality. Thus, it was demonstrated that variant c.2168G > A (p.Arg723Gln) reduced resistance activity of MRP1-overexpressing cells to drugs like daunorubicin, doxorubicin, etoposide, vinblastine and vincristine in vitro [36]. Although correlation between carrying this SNP and chemotherapy response was not observed in lung cancer patients [37], this variant was linked to the response to chemotherapy in patients with ovarian cancer [19]. Two other polymorphic variants detected in this study c.825 T > C (p.Val275=) and c.2012G > T (p.Gly671Val), were correlated with febrile neutropenia as an effect of FEC-induced hematological toxicity in breast cancer patients [38]. The major allele variant c.825 T > C (p.Val275=), in combination with another one c.*543C > T (beyond our scanning), was also associated with anthracycline-induced cardiotoxicity after chemotherapy in children with acute lymphoblastic leukemia [39]. The discrepancy observed in different studies on clinical significance of c.2012G > T (p.Gly671Val) and c.2168G > A (p.Arg723Gln) is of unclear origin. Cellular conditions in vitro in which mutants were expressed could be inadequate to reveal all their real functions, e.g. a change in mRNA structure could be more essential than the amino acid change caused by SNP [34]. Conseil and Cole [35] concluded from this discordance that comprehensive biochemical and pharmacological characterization of the respective MRP1 mutants should be performed to identify consequences of ABCC1 SNPs on phenotype because these effects might be quite selective. Additional nucleic acid-based methods should be also used to clarify the impact of individual SNPs.
These examples show that only comprehensive examination of polymorphisms and studying their phenotypes can tell more about their real influence. Therefore, novel variants like c.596C > T (p.Ser199Leu), detected in this study and qualified in silico as damaging, need to be checked in an experimental investigation. Additionally, the fact that we found no novel mutation in important nucleotide binding motifs, nor any of the mutations identified previously in vitro as deleterious in NBD loops, proved that these highly conserved regions in the protein structure are unlikely to be affected by polymorphisms occurring in the healthy population.

Conclusions
We scanned 30 exons and short flanking intron sequences of ABCC1. HRM is an efficient and sensitive method for scanning and genotyping polymorphic variants. We compared results of long-range scanning to targeted genotyping of selected SNPs showing that both methods are highly sensitive tools to study genetic variation. During our study we found 45 polymorphic variants, some of them rare and eight of them novel not previously reported. This is the first study about ABCC1 genetic variation in the Polish population and we found some ethnic differences in frequency of genetic variants compared to other populations. These results need further complementation with a larger number of probands and new data including also introns and untranslated regions as sequences with polymorphic variants which may have clinical relevance as well.