Identification of novel mutations in congenital afibrinogenemia patients and molecular modeling of missense mutations in Pakistani population

Background: Congenital afibrinogenemia (OMIM #202400) is a rare coagulation disorder that was first described in 1920. It is transmitted as an autosomal recessive trait and is characterized by the absence of fibrinogen (factor I) in plasma. Consanguinity in Pakistan and its neighboring countries has resulted in a higher number of cases of congenital fibrinogen deficiency in their respective populations. This study focused on the detection of mutations in the three fibrinogen genes [fibrinogen alpha (FGA), beta (FGB) and gamma (FGG)] by DNA sequencing, and on molecular modeling of the missense mutations, in Pakistani patients.
Methods: This descriptive, cross-sectional study was conducted in Karachi and Lahore and fully complied with the Declaration of Helsinki. Patients with fibrinogen deficiency were screened for mutations in the FGA, FGB and FGG genes by direct sequencing. Molecular modeling was performed to predict the putative structural and functional impact of the missense mutations identified in this study.
Results: Ten patients had mutations in FGA, followed by three mutations in FGB and three mutations in FGG. Twelve of these mutations were novel. The missense mutations were predicted to result in a loss of stability because they break ordered regions and cause clashes in the hydrophobic core of the protein.
Conclusions: Congenital afibrinogenemia is a rapidly growing problem in regions where consanguinity is frequently practiced. This study illustrates that mutations in FGA are relatively more common in Pakistani patients. Molecular modeling of the missense mutations showed damaged protein structures, which have a profound effect on the bleeding phenotype of these patients.


Background
Hemostasis is the normal physiological response that prevents blood loss following vascular injury. It depends on an intricate series of events involving platelets and specific coagulation factors. Inherited bleeding disorders can be grouped into abnormalities of primary and secondary hemostasis. Fibrinogen (factor I) deficiency can originate from congenital or acquired causes. Congenital afibrinogenemia (OMIM #202400) is a rare coagulation disorder that was first described in 1920 [1]. It is inherited as an autosomal recessive trait characterized by the absence of fibrinogen (factor I) in plasma [2]. The disease has a worldwide prevalence of 1-2 per million in the general population [3]. Fibrinogen is a 340 kDa hexameric protein of hepatic origin with multiple functions, including roles in platelet aggregation and platelet plug formation, and it is an acute phase reactant [4]. It is secreted as a zymogen, similar to all other clotting factors, and needs to be activated before it participates in the coagulation cascade. It consists of three pairs (Aα, Bβ and Gγ) of polypeptide chains [5] encoded by three genes (FGA, FGB and FGG) clustered in a region of approximately 50 kb on chromosome 4q28-q31 [6,7]. The normal plasma level of fibrinogen is 4 g/l [8,9] and its half-life is approximately 100 h (about 4 days) [10]. The main role of fibrinogen in hemostasis is to strengthen the platelet plug after thrombin converts it into its insoluble polymeric form, fibrin [11]. The fibrin meshwork traps red blood cells and platelets to form a plug that stops bleeding from the site of injury. The absence of fibrinogen may result in excessive blood loss after trauma. Moreover, spontaneous bleeding events can occur. Fibrinogen defects are classified as quantitative (hypofibrinogenemia and afibrinogenemia, depending on the partial or complete absence of fibrinogen) or qualitative (dysfibrinogenemia and hypodysfibrinogenemia) [12].
The most common symptom associated with fibrinogen deficiency is umbilical stump bleeding with other secondary bleeding manifestations including epistaxis, gum bleeding, cutaneous bleeding, muscle hematoma and haemarthrosis [13].
Congenital fibrinogen deficiency is considered a rare coagulation disorder, but its incidence is rising in regions where consanguineous partnerships are common [14,15]. Pakistan has a high rate of consanguinity, which results in increasing numbers of rare inherited bleeding disorders, including congenital afibrinogenemia. Our aim was to identify the mutations and to assess their possible structural and functional impact on the affected protein using molecular modeling and in silico analysis tools. In addition, the study provides insight into the possible mutational spectrum of the most frequently involved fibrinogen gene, which may contribute to future prenatal diagnosis and to the detection of carriers of these defects in the Pakistani population.

Patient inclusion and exclusion criteria
This study, involving human subjects, was performed according to the Declaration of Helsinki, 1975, revised in 2000, and was approved by the relevant institutional Ethical Committee. Patients with congenital afibrinogenemia, i.e., absent or undetectable levels of fibrinogen antigen (0-0.1 g/dl) and of its activity in plasma, were selected for this study. Acquired causes of fibrinogen deficiency, such as liver disease, consumptive coagulopathies, leukemia or other factor deficiencies, were excluded. Patients from across Pakistan were recruited from centers including Karachi (Sindh) and Lahore (Punjab). Written informed consent was obtained from patients, and from guardians in the case of minors. Sampling was performed independent of sex or age. A comprehensive questionnaire was completed containing information about the patient's demographics and disease symptoms. The diagnosis was made on the basis of history and quantitative analysis. All subjects were registered at the Hemophilia Society of Pakistan. Samples from all centers were collected, initially processed and stored at the National Institute of Blood Diseases (NIBD) for coagulation profile and biochemistry tests, including liver profile and viral markers. DNA sequencing was performed in the NIBD genome department, Karachi.

Sample collection and lab assays
Blood samples from patients were collected in 3.2% sodium citrate for the coagulation profile, in serum (RST) tubes for biochemistry analysis, including liver profile and viral profile (HBsAg, anti-HCV and HIV), and in K2EDTA for complete blood count and for DNA extraction, amplification and sequencing. All sampling was performed with supportive infusion of cryoprecipitate to avoid bleeding. Platelet-poor plasma was collected by centrifugation of citrate tubes at 4000×g for 10 min, and the coagulation profile was performed, including PT, APTT and a fibrinogen assay using the Clauss method. Liver function tests (direct and indirect bilirubin, ALT, AST and alkaline phosphatase) and viral markers (HBsAg, anti-HCV and HIV) were performed to exclude any acquired cause of afibrinogenemia.
Genetic analysis was performed after isolation of genomic DNA using standard protocols. Exons and intron-exon junctions of the fibrinogen genes were amplified by polymerase chain reaction [16] and sequenced [17] as previously described.

Pathogenicity scoring
Pathogenicity scoring was performed with five prediction tools to predict the possible structural and functional impact of the identified novel missense mutations on the affected protein. PolyPhen-2 (Polymorphism Phenotyping v2; http://genetics.bwh.harvard.edu/pph2/, accessed on 20th April 2015) was used to assess the possible impact of a substitution on structure and function in human SNPs (single nucleotide polymorphisms). MUPRO (prediction of protein stability changes upon mutations; http://mupro.proteomics.ics.uci.edu/, accessed on 20th April 2015) uses an SVM (support vector machine) model to predict changes in stability resulting from single-site mutations, primarily from sequence information and, optionally, from provided structural information. The result only predicts whether the alteration of a single amino acid will lead to destabilization or not. MUPRO predictions are reported with a confidence score (C score): a positive score indicates higher stability, whereas a negative score indicates that the mutation decreases protein stability. SNPs&GO (single nucleotide polymorphisms and GO terms; http://snps.biofold.org/snps-and-go, accessed on 20th April 2015) and SIFT (Sorting Intolerant from Tolerant; http://sift.jcvi.org, accessed on 20th April 2015) are algorithms that predict whether an amino acid substitution will affect protein function based on sequence homology and the physical properties of amino acids. A substitution with a SIFT score of less than 0.05 is predicted to be deleterious, and a substitution with a score greater than or equal to 0.05 is predicted to be tolerated (http://www.exeterlaboratory.com/molecular-genetics/).
PROVEAN (http://provean.jcvi.org/about.php, accessed on 27th January 2015) has a default threshold of −2.5: if the score of a variant is equal to or below this threshold, the mutation is predicted to be deleterious, and if the score is above −2.5, the variant is predicted to have a neutral effect. Protein accession numbers were provided by UniProt (Universal Protein Resource, http://www.uniprot.org/), and the wild-type color FASTA sequence (http://pga.gs.washington.edu/data/fga/fga.Colorfasta.html) was first accessed on 27th January 2015 and later on 20th April 2015.
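The score cut-offs described above can be expressed as a minimal Python sketch. The function names are hypothetical and the scores in the usage note are illustrative values, not results from this study. The handling of a MUPRO score of exactly zero is an assumption, because the text only defines positive and negative scores.

```python
# Cut-offs as stated in the text above.
SIFT_CUTOFF = 0.05      # SIFT score < 0.05 -> deleterious
PROVEAN_CUTOFF = -2.5   # PROVEAN score <= -2.5 -> deleterious (default threshold)


def classify_sift(score: float) -> str:
    """Return the SIFT call for a substitution score."""
    return "deleterious" if score < SIFT_CUTOFF else "tolerated"


def classify_provean(score: float) -> str:
    """Return the PROVEAN call for a variant score."""
    return "deleterious" if score <= PROVEAN_CUTOFF else "neutral"


def classify_mupro(score: float) -> str:
    """Return the MUPRO stability call from the sign of the confidence score."""
    if score < 0:
        return "decreased stability"
    if score > 0:
        return "increased stability"
    # Assumption: the text does not say how a score of exactly 0 is reported.
    return "no predicted change"


def summarize(sift: float, provean: float, mupro: float) -> dict:
    """Collect the three calls for one variant into a dictionary."""
    return {
        "SIFT": classify_sift(sift),
        "PROVEAN": classify_provean(provean),
        "MUPRO": classify_mupro(mupro),
    }
```

For example, `summarize(0.01, -3.1, -0.8)` with these hypothetical scores labels the variant as deleterious by SIFT and PROVEAN and as destabilizing by MUPRO. PolyPhen-2 and SNPs&GO are omitted because their cut-offs are not stated in the text.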

Structural analysis of novel missense mutations using molecular modeling
Among the six reported novel missense mutations from this study, four mutations were located in an area of the alpha chain that has no resolved crystal or NMR (nuclear magnetic resonance)-based structure. Thus, to assess the putative structural effect of these mutations, we modeled this region on the I-TASSER (Iterative Threading ASSEmbly Refinement) threading modeling server (http://zhanglab.ccmb.med.umich.edu/I-TASSER/; accessed on 12th November 2014). The model for this region was then joined to the remaining beta chain, for which the structure has already been determined and deposited in the Protein Data Bank (PDB ID: 3GHG; 2.9 Å resolution; every PDB entry has a unique 4-character identifier). Model joining was performed by replacing the last two amino acid residues common to the model and the crystal structure (PDB ID: 3GHG; chain A), downloaded from the protein structure database (http://rcsb.org/pdb/home/home.do; accessed on 20th November 2014), so that the dihedral angles of the full model remained the same at the point of joining. The complete model was refined by a short solvated simulation lasting 500 ps as described in Krieger et al., 2004 (force field: YAMBER3; periodic boundary conditions; temperature: 298 K; water density: 0.997 g/mL; pH: 7.4). The local neighborhood of the wild-type residue corresponding to each reported mutation was investigated to establish a logical hypothesis for the effect of the mutation. One additional missense mutation (p.Trp432Arg), in the beta chain, lies in the structurally resolved region of the PDB file (3GHG; chain B). The local molecular environment of this wild-type residue was inspected in the same way. All structural analysis and image rendering were performed with YASARA (Yet Another Scientific Artificial Reality Application) version 12.8.6 (www.yasara.org/).

Results
Mutations were identified in all 13 patients. Most of the identified mutations were in the FGA gene, which was the most frequently mutated gene in our study population. The ten patients with mutations in FGA were unrelated probands.
Mutations in the FGB gene were less frequent than those in FGA.
In the FGA gene, eight mutations were novel and the remaining two had been reported previously. The eight novel mutations comprised five missense mutations, one nonsense mutation and two frameshift mutations, including a homozygous and a compound heterozygous frameshift mutation. The two nonsense mutations in FGA have been reported in the literature. One more previously reported mutation was found in proband C3, who had a compound heterozygous mutation consisting of a novel frameshift mutation and a reported nonsense mutation.
We identified three mutations in FGB, including one novel missense mutation (C9) and two homozygous nonsense mutations found in siblings.
Mutations in the FGG gene were the rarest among the three fibrinogen genes. We detected three novel mutations in different exons of FGG, including two identical nonsense mutations in siblings and one frameshift mutation in an unrelated proband (Table 1).

Structural analysis of novel missense mutations using molecular modeling

A) Alpha chain missense mutations
All four novel missense mutations from the α-chain reported in this study were present in a region (residues 220-860) of the α-chain that has no resolved crystal structure. The region surrounding the reported mutations (residues 300-400) was relatively poorly conserved, with most of it missing from some fibrinogen homologues.
Among the mutated residues, p.Pro302 was present in all homologues that contained this part. The p.Ser325 residue was also conserved in all homologues with the exception of Mus musculus, where it was substituted by an Asn. The two Thr residues, p.Thr305 and p.Thr331, were relatively variable and were substituted by Ser or Asp in a few homologues. Only in the homologue from Canis lupus familiaris was one of the Thr residues (p.Thr305) substituted by an Ala residue, which is the mutated residue reported for both Thr residues in our study.
Table notes: ** siblings; NA, not available; s, seconds. The fibrinogen levels in all patients were equal to or lower than 0.1 g/l (normal range 2-4 g/dl), PT was more than 120 s (normal range 9-11 s), aPTT was more than 180 s (normal range 24-27 s) and thrombin time was prolonged (normal range 10-13 s). The ethnicity distribution reflects the high frequency of affected patients from Punjab, the largest and most thickly populated province of Pakistan.
Modeling of this region showed that it could be split into two central cores, each of which is organized as a beta sheet surrounded by flexible coils (Fig. 1). The two cores are connected by a long central helix. The first core, apart from being surrounded by flexible coils, also contains a few short helices. Three of the four reported mutated residues were located on these short helices; the exception is p.Ser325, which is located on a short loop connecting one of the short helices to the central core. The residues p.Pro302, p.Thr305 and p.Thr331 are partially buried, with the p.Pro302 and p.Thr305 side chains oriented toward the central core beta sheets. The residues p.Pro302 and p.Thr305 participate in intra-helical hydrogen bonds with each other and with p.Ser399 and p.Arg308, respectively. The residue p.Thr331 lies at the edge of a short helix and also participates in intra-helical hydrogen bonding (with p.Gly327).
Interestingly, the fold on which all four of these mutations reside also contains a lysine residue, p.Lys322, which is known to be cross-linked to α2-antiplasmin, and a glutamine residue, p.Gln347, which participates in inter-chain cross-links during clot formation.

B) Beta chain missense mutations
The one novel missense mutation (p.Trp432Arg) reported in the beta chain occurs in a highly conserved region (Fig. 2). The residues are completely conserved in the homologues that were used for the present alignment. The p.Trp432 residue lies completely within the densely packed hydrophobic core of the C-terminal region of the chain. This densely packed hydrophobic core consists of a number of other aromatic amino acids, which are in close proximity to p.Trp432 (p.His400, p.His438, p.Trp433 and p.Tyr434). The p.Trp432 residue makes hydrogen bond contacts with p.Tyr434 and p.Ser406.

Pathogenicity score
Pathogenicity scoring of the six novel missense mutations identified in FGA and FGB was performed with five different pathogenicity scoring tools (Table 3). Of the five missense mutations in FGA, two were found to have a damaging effect and decreased protein stability by two of the tools (MUPRO and PROVEAN). The other tools did not show a deleterious effect for these two mutations, which were identified in two unrelated probands. In the FGB gene, the missense mutation was found to be damaging or deleterious and showed decreased structural stability. The damaging effect and the loss of protein stability may lead to the bleeding manifestations in patients, which can vary from mild to severe.

Discussion
Fibrinogen deficiency is a rare inherited bleeding disorder that is characterized by two subtypes, with either reduced or completely absent levels of fibrinogen in the blood [18]. FGA is documented as the most frequently affected gene in the literature [19,20], and we found the largest share of mutations in the FGA gene in our data set. A total of 169 mutations in fibrinogen are reported in the Human Gene Mutation Database (http://www.hgmd.cf.ac.uk/ac/index.php; accessed August 12, 2014). The spectrum of causative mutations for afibrinogenemia is interesting, as FGA appears to stand out from the two other fibrinogen genes [21]. In our results, the predominant inheritance pattern was homozygous, with a high proportion of nonsense mutations followed by missense mutations. A frameshift mutation (p.Glu262AspfsX158) in FGA exon 5 reported in one study is predicted to produce a truncated polypeptide. It is associated with an exceptionally long stretch of abnormal residues in a homozygous patient with congenital afibrinogenemia [22]. The frameshift mutations (p.Gln282Thr fsx83*) and (p. Lys (AAA) 48Arg fs9*) are novel compound heterozygous mutations that involve deletions along with frameshift defects. The bleeding phenotype is severe, as these mutations worsen the symptoms through the combined effect of the compound mutation and truncation of the polypeptide chain. Three missense mutations (Pro302Ala, Thr305Ala and Thr331Ala) in the alpha chain reside on short helices surrounding a central beta sheet core. All these mutations are non-conservative in nature, i.e., the Pro302Ala substitution results in the replacement of a rigid imino acid with a smaller, more flexible residue, and the Thr305Ala and Thr331Ala substitutions result in the replacement of polar side chains by smaller but hydrophobic side chains.
In addition, the introduction of alanine in these regions will most likely disrupt some of the intra-helical hydrogen bonds, thereby breaking the helical structure surrounding the central core. Because these short helices provide order and stability around an otherwise disordered coiled-coil region, their disruption might result in a loss of stability for this region and for the alpha chain. The fourth mutation in this chain, Ser325Gly, is also non-conservative, i.e., it results in the substitution of a polar residue by a very small and flexible Gly residue. Because the wild-type residue already lies on a flexible loop, the introduction of a small residue will make this region more disordered and therefore unstable. Moreover, all four missense mutations belong to a fold of the alpha chain that might interact with Factor XIII (this fold also contains the Lys and Gln residues that participate in inter-chain cross-linking and in cross-linking to α2-antiplasmin), so conformational changes induced by these mutations might interfere with the interaction of the fibrinogen alpha chain with Factor XIII. The one beta chain missense mutation resides in a highly conserved region of the beta chain, most likely because many of the residues of this region contribute to the stability of its densely packed hydrophobic core. The p.Trp432Arg substitution occurs in the middle of the hydrophobic core. The introduction of a large, polar, positively charged residue in place of a hydrophobic aromatic one would destabilize the hydrophobic core of this region. Thus, the mutation affects the stability of the beta chain by disrupting its C-terminal hydrophobic core.
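The side-chain reasoning in this discussion can be illustrated with a small Python sketch that parses three-letter substitution names and compares the classes of the wild-type and mutant residues. The class labels are a deliberately simplified assumption covering only the residues discussed here (other classification schemes exist), and the function names are hypothetical.

```python
import re

# Assumption: simplified side-chain classes for the residues in this study only.
RESIDUE_CLASS = {
    "Pro": "rigid imino acid",
    "Thr": "polar",
    "Ser": "polar",
    "Ala": "small hydrophobic",
    "Gly": "small flexible",
    "Trp": "aromatic hydrophobic",
    "Arg": "positively charged",
}

# Matches names such as "p.Trp432Arg" or "Pro302Ala" (three-letter codes).
MISSENSE = re.compile(r"^(?:p\.)?([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def parse_missense(name: str) -> tuple:
    """Split a substitution name into (wild-type, position, mutant)."""
    match = MISSENSE.match(name)
    if match is None:
        raise ValueError(f"not a simple missense substitution: {name!r}")
    wild, position, mutant = match.groups()
    return wild, int(position), mutant


def changes_class(name: str) -> bool:
    """True if wild-type and mutant residues fall into different classes."""
    wild, _, mutant = parse_missense(name)
    return RESIDUE_CLASS[wild] != RESIDUE_CLASS[mutant]


def describe(name: str) -> str:
    """One-line summary of the class change for a substitution."""
    wild, position, mutant = parse_missense(name)
    return (f"{wild}{position}{mutant}: "
            f"{RESIDUE_CLASS[wild]} -> {RESIDUE_CLASS[mutant]}")
```

Under these assumed classes, `describe("p.Trp432Arg")` gives `Trp432Arg: aromatic hydrophobic -> positively charged`, and `changes_class` is true for Pro302Ala, Thr305Ala, Thr331Ala, Ser325Gly and Trp432Arg. A class change is only a rough proxy for a non-conservative substitution and does not replace the structural analysis.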

Conclusions
Rare inherited bleeding disorders, specifically congenital afibrinogenemia, have a growing incidence, especially in regions such as Pakistan where consanguinity is common. Our study is based entirely on Pakistani patients with congenital afibrinogenemia. It shows that FGA was the most frequently affected gene in our set of patients. We have documented the pathogenicity scores for the missense mutations as a description of protein stability and functional defects. We have also performed molecular modeling to examine the structural defects and their impact on the clinical manifestations of patients. In this way, the genotype correlated well with the phenotype of these patients.