Sequence and Structure Analysis of CRP of Lung and Breast Cancer Using Bioinformatics Tools and Techniques

Published by Oriental Scientific Publishing Company © 2018. This is an Open Access article licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (https://creativecommons.org/licenses/by-nc-sa/4.0/), which permits unrestricted Non Commercial use, distribution and reproduction in any medium, provided the original work is properly cited.

C-reactive protein (CRP) is one of the most important proteins in the medical field for identifying disease states associated with inflammation. It belongs to the pentraxin family of proteins [1]. The cytogenetic location of the CRP gene is 1q23.2, that is, position 23.2 on the long (q) arm of human chromosome 1 [2]. CRP is produced in the liver in response to interleukin-6 (IL-6) [3], a pro-inflammatory cytokine [4]. CRP has been associated with cancer size and stage of disease, and it can also be used as a "cancer marker" in determining disease activity [5]. Chronic inflammation may be a causal factor in a variety of cancers; in general, long-term exposure to inflammation increases the risk of cancer. The link between inflammation and cancer is that IL-6 plays a central role in regulating the inflammatory and immune response, and it also plays important roles in cancer progression related to proliferation, migration and angiogenesis [6]. The National Cancer Institute considers CRP a marker for cancer [7].

Cancer is one of the most important health problems of the current era and a leading cause of death. It can be defined simply as a malignant tumor or malignant neoplasm, and it includes a group of diseases involving abnormal cell growth with the potential to invade or spread to other parts of the body. It can also be defined as a group of disorders characterized by uncontrolled division of cells and the ability of these abnormal cells to spread [8]. The most frequent cancer among women is breast cancer; it is the female malignant neoplasia with the highest incidence in the world [9]. Breast cancer remains a major problem of women's health and its incidence is increasing in developing countries [10]. Another important type of cancer is lung cancer, which is among the most common cancers in terms of both incidence and mortality; it is reported as the second most commonly occurring cancer in the world and a leading cause of cancer-related death [11].

The aims of this research paper are to investigate the possibility of using CRP as a marker for cancer, and to carry out a computational analysis of CRP, covering sequence and structural analysis of CRP from breast and lung cancer patients, using bioinformatics methods.

Blood Sample collection
The sample comprised 40 human patients with breast or lung cancer. Three milliliters of whole blood were drawn from each patient under aseptic conditions by venipuncture using a disposable syringe. The whole blood was divided into two parts: the first part (about two milliliters) was collected in a sterile EDTA tube for the genetic test; the second part was collected in a gel tube and separated by centrifugation at 3000 rpm for 10 minutes for the qualitative test. The blood and serum samples were stored frozen at -20 ºC [12].

Qualitative test
Serum samples from the patients with lung and breast cancer were tested for human CRP, giving positive or negative results in the qualitative test, following the protocol of the latex kit for C-reactive protein (Cat. No.: NS 514001, Salucea company).

DNA extraction from blood samples
Human DNA was extracted from the patients' whole blood following the protocol of the gSYNC™ DNA Extraction Kit (Cat. No.: GS100, Geneaid company).

Estimation of DNA concentration and purity
The concentration and purity of the DNA extracted from whole blood were estimated using a NanoDrop instrument (Acta gen, USA): one microliter of the extracted DNA was loaded onto the instrument and the concentration was read in ng/µl. A purity value of 1.6-1.8 indicates DNA of high purity [13].
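The purity criterion above can be sketched as a simple range check. This is a minimal illustration only: it assumes the purity value is the A260/A280 absorbance ratio (the text does not name the ratio), and the sample readings are hypothetical, not data from this study.

```python
def classify_purity(ratio, low=1.6, high=1.8):
    """Classify a purity ratio against the 1.6-1.8 window quoted in the text.

    Assumption: `ratio` is the A260/A280 absorbance ratio.
    """
    if ratio < low:
        return "below range"
    if ratio > high:
        return "above range"
    return "high purity"


# Hypothetical readings: (sample id, concentration in ng/µl, purity ratio)
readings = [("S01", 45.2, 1.72), ("S02", 30.8, 1.45), ("S03", 61.0, 1.91)]

report = {sample_id: classify_purity(ratio) for sample_id, _conc, ratio in readings}
for sample_id, verdict in report.items():
    print(sample_id, verdict)
```

Only samples falling inside the window would be carried forward to PCR under this reading of the criterion.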

Agarose Gel Electrophoresis
After genomic DNA extraction from blood, agarose gel electrophoresis was used to confirm the presence and integrity of the extracted DNA [13]. The samples were carefully loaded into the individual wells of the gel, and the power was then turned on at 56 V for 30 minutes, so that the DNA migrated from the cathode (-) towards the anode (+). The ethidium bromide stained bands in the gel were visualized using a UV transilluminator at 365 nm and photographed.

Primer design for CRP gene
The full CRP gene sequence and annotation were taken from the Genome database of the National Center for Biotechnology Information (NCBI), reference sequence NC_000001.11. Primers for the three exons of CRP were designed using the Primer3 software (http://bioinfo.ul.ee/primer3) [14,15]; Table 1 shows the sequences of the designed forward and reverse primers. The primers were supplied by Bioneer Company (Korea) in lyophilized form and used in PCR.
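Two quantities commonly checked when evaluating a designed primer are its GC content and melting temperature. The sketch below is a rough illustration, not the calculation Primer3 performs: it uses the simple Wallace rule, Tm = 2(A+T) + 4(G+C), as a stand-in for Primer3's thermodynamic model, and the primer sequence is hypothetical (the Table 1 sequences are not reproduced here).

```python
def gc_content(primer):
    """Return the percentage of G and C bases in a primer."""
    primer = primer.upper()
    gc = primer.count("G") + primer.count("C")
    return 100.0 * gc / len(primer)


def wallace_tm(primer):
    """Approximate melting temperature (ºC) by the Wallace rule.

    This is a rough rule of thumb for short oligonucleotides, used here
    in place of Primer3's own Tm calculation.
    """
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


# Hypothetical 20-mer, for illustration only
example_primer = "ATGCATGCATGCATGCATGC"
print(gc_content(example_primer), wallace_tm(example_primer))
```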

PCR
The reaction mix consisted of 30 µM KCl, 10 µM Tris-HCl (pH 9.0), 1.5 µM MgCl2, 250 µM of each dNTP, 2 µl of each primer, 1 U of Taq DNA polymerase, 5 µL of the DNA template (sample) and ultrapure deionized sterile distilled water up to 20 µL. The thermocycler program used to amplify the target DNA with the CRP1, CRP2 and CRP3 primers is given in Table 2. Electrophoresis (1 h at 75 V) was performed with 5 µL of the reaction solution in a 2% agarose gel with ethidium bromide. PCR products were visualized under UV light and photographed. A 100 bp DNA ladder was used to estimate the molecular size of the bands.
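The water volume needed to bring a reaction to 20 µL follows from the liquid volumes listed above. The sketch below is a worked example under one assumption: only the template and the two primers contribute measurable volume, and the remaining components are treated as adding none (for example, if supplied as a dried premix). The text does not state this.

```python
def water_volume(total_ul, component_volumes_ul):
    """Return the volume of water (µL) needed to reach the total volume."""
    water = total_ul - sum(component_volumes_ul.values())
    if water < 0:
        raise ValueError("components exceed the total reaction volume")
    return water


# Volumes taken from the reaction mix description; other components
# are assumed (hypothetically) to add no liquid volume.
components = {"template DNA": 5.0, "forward primer": 2.0, "reverse primer": 2.0}
water = water_volume(20.0, components)
print(water)
```

Under this assumption each 20 µL reaction would take 11 µL of water.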

DNA sequencing
PCR (polymerase chain reaction) products from patients with breast and lung cancer that gave a clear band for the CRP gene were sent to Macrogen (Korea) for sequencing of the three exons of the CRP gene in each sample. The CRP gene sequences of the samples were compared with the CRP gene retrieved from NCBI using the BLAST server (https://blast.ncbi.nlm.nih.gov/Blast.cgi) [16] to locate mutations in the patients' CRP. The DNA sequences of the CRP gene were translated to amino acid sequences using the BLASTX server (https://blast.ncbi.nlm.nih.gov/Blast.cgi) [16].
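The translation step can be illustrated with a minimal sketch that reads one forward frame using the standard genetic code. It is not a reimplementation of BLASTX, which translates the query in all six reading frames and aligns the results against a protein database. The input fragment is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string in frame 1, stopping at the first stop codon.

    Codons containing unrecognized characters are reported as 'X';
    trailing bases that do not form a full codon are ignored.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical fragment, for illustration only
print(translate("ATGGAGAAGTAA"))
```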

Physicochemical properties Prediction
The online tool ProtParam (http://us.expasy.org/tools/protparam.html) was used to predict the physicochemical properties of CRP from the patients with lung and breast cancer, and of the CRP retrieved from NCBI. The parameters computed by ProtParam include the molecular weight, theoretical pI, amino acid composition, atomic composition, instability index, aliphatic index and grand average of hydropathicity (GRAVY) [17].
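Two of the parameters listed above, amino acid composition and GRAVY, are simple enough to sketch directly. The example below is an illustration, not ProtParam's code: GRAVY is computed as the mean Kyte-Doolittle hydropathy value over all residues, and the peptide used is hypothetical.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values for the 20 standard amino acids
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition(seq):
    """Return the percentage of each amino acid present in the sequence."""
    seq = seq.upper()
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


def gravy(seq):
    """Grand average of hydropathicity: mean hydropathy per residue."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# Hypothetical peptide, for illustration only
peptide = "MEKLLSSV"
print(composition(peptide)["S"], gravy(peptide))
```

A positive GRAVY value indicates a more hydrophobic sequence on average, a negative value a more hydrophilic one.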

Protein Structure Prediction
Secondary structure was predicted using the PSIPRED online tool (http://bioinf.cs.ucl.ac.uk/psipred/psiform.html) [18,19], with the FASTA format of the sequence given as input. It provides the structural information of the protein sequence in the form of coils and helices. The PHYRE2 software (http://www.sbg.bio.ic.ac.uk/phyre2/html) was used for 3D structure prediction [20]. The modeling involves four basic steps: searching for structures showing homology with the target, selecting the best template with maximum identity to the target sequence, aligning it with the target, and modeling the structure. In addition, SWISS-MODEL (http://swissmodel.expasy.org/), an automated system for modeling the structure of a protein from its amino acid sequence using homology modeling techniques, was used to model the quaternary structure [21].

Table 2 (fragment): Annealing, 58 ºC (2), 57 ºC (3), 45 seconds; 4. Extension, 72 ºC (1), 1 minute; 5. Final extension, 72 ºC (1), 5 minutes, 1 cycle; 6. Storage, 4 ºC (1).

RESULTS AND DISCUSSIONS
Serum specimens were tested for the presence of CRP using the latex kit for CRP. In the qualitative test, 1 (3.3%) patient with breast cancer gave a positive result and 29 (96.7%) gave a negative result for CRP, in contrast to the study of [22], which showed that CRP acts as a predictive marker for identifying the risk of breast cancer. Two (20%) patients with lung cancer gave a positive result and 8 (80%) gave a negative result for CRP, whereas the study of [23] proposed that serum CRP levels are increased in patients with lung cancer. These results agree with the study of [24], which showed that CRP is a non-specific marker of inflammation. Genomic DNA was extracted from the blood samples following the standard protocol of the extraction kit. The results demonstrated that the purity of the extracted DNA in all samples was sufficiently high for PCR analysis.

PCR
The present study used PCR on DNA isolated from the patients' blood to amplify the three exons of the CRP gene, using three primer sets (CRP1, CRP2 and CRP3) that were designed with the Primer3 software and obtained from Bioneer Company (Korea). The PCR results were interpreted by the presence or absence of specific bands of the amplified gene on 2% agarose. The three primer sets were screened using 40 DNA samples extracted from the whole blood of patients. The first (CRP1F and CRP1R), second (CRP2F and CRP2R) and third (CRP3F and CRP3RR) primer sets gave amplified fragments of 238 bp, 597 bp and 485 bp, respectively, each visible as a clear band after electrophoresis on a 2% agarose gel at 60 V for 90 minutes in the 40 DNA samples, as shown in Figure 1. The second primer set (CRP2F and CRP2R) gave the 597 bp fragment in all patients and controls, as shown in Figure 2.
The third primer set (CRP3F and CRP3RR) gave the 485 bp fragment in all patients and controls, as shown in Figure 3.
The CRP gene amplification by PCR showed that all patients gave a positive result (band formation): the amplification products of the first (CRP1F and CRP1R), second (CRP2F and CRP2R) and third (CRP3F and CRP3RR) primer sets of the CRP gene were found in all patients.

Sequencing of C-reactive protein gene
The current study used the forward primers for direct sequencing of samples from two patients: the first with lung cancer (Figure 4) and the second with breast cancer (Figure 5).
The comparison between the patients' DNA sequences and the reference sequence is shown in Figure 6. Genotype analysis of the CRP gene PCR products indicated numerous genetic alterations. Analysis of the effect of the mutations on translation revealed five missense mutations and four deletion mutations in the patient with breast cancer, and four missense mutations, six deletion mutations and eight insertion mutations in the patient with lung cancer (Table 3).
Finally, a thymine-to-adenine transversion was found at site 159714441 in the first exon of the CRP gene on the long arm of chromosome 1. This mutation was recorded in NCBI, DDBJ and ENA under the numbers LC276937 and LC276938. The point mutation affects translation and causes a missense mutation in which methionine is replaced by lysine. It appeared in all patients at the same site with the same effect, which suggests a relationship between this mutation and cancer. Because the mutation was also detected in multiple tumor samples, it can be considered a cause of cancer. This interpretation is in line with other diseases known to be caused by gene mutations: Tay-Sachs disease is caused by a genetic mutation in the HEXA gene on chromosome 15 [25]; sickle-cell anemia is caused by a point mutation in the β-globin chain of hemoglobin that replaces glutamic acid with valine at the sixth position [26]; neurofibromatosis is caused by point mutations in the neurofibromin gene [27,28]; and cystic fibrosis is caused by a mutation in the CFTR gene [29].
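The thymine-to-adenine change described above can be illustrated with a small sketch that classifies a substitution as a transition or transversion and shows its effect on a codon. One assumption is made loudly here: the affected codon is taken to be ATG read on the coding strand, with the substitution at its second base. The text does not give the codon, but ATG is the only methionine codon and AAG (lysine) is the only codon reachable from it by a single T-to-A change.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(ref, alt):
    """Classify a single-base substitution as transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    if not {ref, alt} <= (PURINES | PYRIMIDINES):
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        return "no change"
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def mutate(codon, position, alt):
    """Return the codon with the base at `position` (0-based) replaced."""
    return codon[:position] + alt + codon[position + 1:]


# Only the two codons needed for this illustration
CODONS = {"ATG": "Met", "AAG": "Lys"}

# Assumed codon and position (not stated in the text)
mutated = mutate("ATG", 1, "A")
print(substitution_type("T", "A"), CODONS["ATG"], "->", CODONS[mutated])
```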

Primary structure of C-reactive protein
The CRP sequences of the patients with lung and breast cancer consist of 219 and 211 amino acids, respectively, while the CRP sequence retrieved from NCBI consists of 224 amino acids.

Sequence analysis
Primary structure analysis provided the physicochemical properties of CRP. The molecular weights of CRP from the patients with lung and breast cancer were 24338.88 and 23508.82, respectively, while the molecular weight of the CRP retrieved from NCBI (FASTA format, protein) was 25038.58. The most abundant amino acids in CRP from the lung and breast cancer patients and in the CRP retrieved from NCBI, respectively, were Ser (S) at 9.6%, 9.5% and 10.3%; Gly (G) at 8.7%, 8.5% and 8.0%; Leu (L) at 8.7%, 8.5% and 8.9%; and Val (V) at 8.2%, 8.5% and 8.5%. A protein whose instability index is smaller than 40 is predicted to be stable, while a value above 40 predicts that the protein may be unstable [20]. The ProtParam server predicted that CRP from the patients with lung and breast cancer was stable, as was the CRP retrieved from NCBI. The isoelectric point of a protein is an important property because it is at this point that the protein is least soluble [20]. The computed isoelectric points of CRP from the patients with lung and breast cancer were 5.75 and 5.26, respectively, and the pI of the CRP retrieved from NCBI was 5.76; all are below 7, so the proteins are likely to precipitate in acidic buffers. This result agrees with the studies of [30,31], which demonstrated that mutations causing large changes in the sequence affect the physicochemical properties of a protein, especially its stability.
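As a quick consistency check on the figures above, the reported molecular weights can be divided by the reported sequence lengths (219, 211 and 224 residues) to give an average mass per residue. This is a worked calculation on the values quoted in the text, assuming the weights are in daltons; it is not an output of ProtParam.

```python
def average_residue_mass(molecular_weight, n_residues):
    """Average mass per amino acid residue."""
    return molecular_weight / n_residues


# (number of residues, molecular weight) as reported in the text
proteins = {
    "lung": (219, 24338.88),
    "breast": (211, 23508.82),
    "NCBI reference": (224, 25038.58),
}

averages = {
    name: round(average_residue_mass(mw, n), 1)
    for name, (n, mw) in proteins.items()
}
print(averages)
```

All three values come out close to 111 per residue, near the commonly quoted average of about 110 Da for an amino acid residue, so the reported weights and lengths are mutually consistent.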

Structure analysis
The secondary structure prediction gave alpha helix values of 6.06%, 5.71% and 6.06% for CRP of the patients with lung cancer, breast cancer and the CRP retrieved from NCBI, respectively. The predicted β-turn values were 42.42%, 42.86% and 42.42%, and the predicted coil values were 52.5%, 51.52% and 51.52%, for CRP of the patients with lung cancer, breast cancer and the CRP retrieved from NCBI, respectively (Figure 7).
For each protein, the template with the highest percentage identity among all templates returned by the PHYRE2 server was used to predict the 3D structure: 94% for CRP of the patient with lung cancer, 99% for CRP of the patient with breast cancer and 100% for the CRP retrieved from NCBI (Figure 8).
Likewise, the template with the highest percentage identity among all templates returned by the SWISS-MODEL server was used to predict the quaternary structure: 94.53% for CRP of the patient with lung cancer, 98.96% for CRP of the patient with breast cancer and 100% for the CRP retrieved from NCBI (Figure 9).
The mutations in the CRP gene of the patients with lung and breast cancer recorded in the present study had different effects on the structure of the protein compared with the CRP retrieved from NCBI, as shown by the structure analysis results, such as the numbers and positions of alpha helices, β-turns and coils (Figures 7, 8 and 9). This result agrees with the studies of [30,31], which demonstrated that mutations causing large changes in the sequence affect the structure of a protein. Furthermore, the pathways of protein folding appear to be largely unaffected by changes in the sequence.
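Percentages like those reported for the secondary structure can be derived from a per-residue prediction string. The sketch below assumes PSIPRED's three-state output, in which each residue is labelled H (helix), E (strand) or C (coil); treating the E state as corresponding to the "β-turn" category used in the text is an assumption. The prediction string is hypothetical.

```python
from collections import Counter


def ss_composition(prediction):
    """Return the percentage of H, E and C states in a prediction string."""
    prediction = prediction.upper()
    counts = Counter(prediction)
    return {
        state: round(100.0 * counts.get(state, 0) / len(prediction), 2)
        for state in "HEC"
    }


# Hypothetical 20-residue prediction, for illustration only
example = "CCCEEEEECCHHHHCCEEEC"
print(ss_composition(example))
```

Comparing such dictionaries for the patient and reference sequences gives a compact view of where the predicted helix, strand and coil content differs.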

CONCLUSIONS
In the present study, CRP was a non-specific marker for patients with lung and breast cancer. The genotype at site 159714441 was found to be associated with cancer. The mutations in the CRP gene of patients with lung and breast cancer affected the physicochemical properties of C-reactive protein, such as molecular weight, pI, amino acid percentages and protein stability, when compared with the CRP retrieved from NCBI. The mutations also affected the structures of CRP by changing the number and position of alpha helices, β-turns and coils; this affected CRP through the loss of the two Ca2+ ions of each subunit, leading to loss of the host defense function of CRP in patients with lung and breast cancer.

Fig. 5. Chromatogram of the forward nucleotide sequence amplified with the first primer for the patient with breast cancer, obtained by DNA sequencer

Fig. 6. Part of the comparison between the patients' DNA sequences and the reference sequence by BLAST

Table 3.

Fig. 7. Secondary structure of CRP for the patient with lung cancer (A), breast cancer (B) and the CRP retrieved from NCBI (C), respectively, by PSIPRED

Fig. 8. 3D structure of CRP for the patient with lung cancer (A), breast cancer (B) and the CRP retrieved from NCBI FASTA format-protein (C), by Phyre2 online software. The N-termini of the proteins are colored blue and the C-termini red

Fig. 9.

Table 1. Names and sequences of the primers used for CRP gene amplification

Table 2. The program used to amplify the target DNA with the CRP1, CRP2 and CRP3 primers