The Relationship between Gastroduodenal Pathologies and Helicobacter pylori cagL (Cytotoxin-Associated Gene L) Polymorphism

Background: Polymorphisms in the region between amino acids 58 and 62 of the 194-amino acid CagL protein (the CagL hypervariable motif) affect the binding affinity of CagL to the integrin α5β1 (ITGA5B1) receptor in host epithelial cells and have an effect on the development of various gastrointestinal diseases. We aimed to evaluate the associations of gastroduodenal pathologies with polymorphisms of the cagL gene of Helicobacter pylori (H. pylori), as well as the associations between vacA genotypes and cagL polymorphisms. Methods: A total of 19 gastric cancer, 16 duodenal ulcer, and 26 non-ulcer dyspepsia patients were included in this case-control study. All cases had H. pylori. A 651-bp fragment of the cagL gene (hp0539) and the cagA and vacA genes were amplified by polymerase chain reaction. Purified polymerase chain reaction products were sequenced by Sanger sequencing, and nucleotide sequences were translated into amino acid sequences. Results: All of the H. pylori strains had the cagL and cagA genes. The D58 amino acid polymorphism was significantly more frequent in the gastric cancer cases (16; 84%) than in the non-ulcer dyspepsia cases (4; 15.4%) (P = .029), and the D58/K59 amino acid polymorphism was significantly more frequent in the gastric cancer cases (12; 63.1%) than in the non-ulcer dyspepsia cases (1; 3.85%) (P = .008). D58/K59 and DKIGQ (n = 10; 52.63%) were the most common polymorphisms in gastric cancer and were associated with the vacA genotype s1/m2 (P = .022 and P = .008, respectively). The D58/K59 amino acid polymorphism had a significant odds ratio (OR) of 8.9 (P = .0017) in multivariate logistic regression analysis. Conclusions: The risk of gastric cancer development is 8.9 times higher with the D58/K59 polymorphism.


INTRODUCTION
Helicobacter pylori (H. pylori) is one of the most common causes of cancer-related deaths 1 and is thought to infect half of the world's population. 2,3 H. pylori can contribute to the development of various gastrointestinal diseases. [4][5][6] Cytotoxin-associated gene L protein (CagL) is a virulence factor used for attachment to epithelial cells and is encoded by the 165-bp gene region in the CagPAI region. CagL is located in the pilus of the type 4 secretion system and attaches to the integrin α5β1 (ITGA5B1) receptor in epithelial cells. The Arginine-Glycine-Aspartate (RGD) motif, the adjacent RGD helper sequence (RHS), and the Phenylalanine-Glutamate-Alanine-Asparagine-Glutamate (FEANE) motif mediate the binding of CagL. 1,[7][8][9][10][11][12][13] It has been suggested that polymorphisms in the region between amino acids 58 and 62 of the CagL protein, which is called the CagL hypervariable motif (CagLHM), affect the binding affinity of CagL to ITGA5B1 and may have an effect on the development of various gastrointestinal diseases. 1,[7][8][9][10][11][12] CagL has 6 chains numbered from α1 to α6 in its crystal structure, and the CagLHM region is located in the hinge region between the α1 and α2 chains. 1,12 The CagL protein is thought to be a good vaccine target because of its surface expression and the presence of various motifs. 14,15 CagA, another virulence factor of H. pylori, undergoes tyrosine phosphorylation after entering the host cell. 16,17 The fourth amino acid (tyrosine) of the EPIYA motif, which is located at the C-terminus, is the main site of this phosphorylation. 16 After H. pylori binds to the host epithelium via CagL, CagA, which has the main oncogenic effect, is translocated into the cell. 17 Various amino acid polymorphisms between amino acids 58 and 62 in the CagLHM region might contribute to the development of different gastrointestinal pathologies. [7][8][9][10][11][12][13]
The VacA toxin of H. pylori is synthesized to form selective membrane channels. 18 The vacA gene is divided into groups with different alleles according to the signal (s) region at the amino terminus and the middle (m) region. It can contain one of the s1a, s1b, s1c, and s2 alleles in the s region and one of the m1, m2a, and m2b alleles in the m region. [18][19][20] We aimed to evaluate the associations of gastroduodenal pathologies with the amino acid polymorphisms of the CagL protein detected in H. pylori DNA isolated from patients. We also discuss whether our findings might be specific to the Turkish setting and investigate the relationship between cagL polymorphisms and vacA genotypes.

MATERIALS AND METHODS
Study Design and Patients
This case-control study was conducted between 2019 and 2020. A total of 61 subjects were enrolled: 2 patient groups (19 gastric cancer [GC] and 16 duodenal ulcer [DU] patients; mean age, 57.316 years for GC and 42.5 years for DU patients) and a control group of 26 individuals with non-ulcer dyspepsia (NUD) (mean age, 51.77 years; age range, 22-76 years). All subjects had H. pylori. The control group was matched with the patient group (P > .05). The antrum and corpus biopsies were used for molecular studies. We excluded patients who were younger than 18 years old, had undergone previous gastric surgery or H. pylori eradication treatment, or had a history of therapy with antibiotics, antisecretory drugs, bismuth salts, or sucralfate in the month prior to sampling.
Collected biopsies were transferred immediately to the laboratory in Brucella broth. The study was approved by the Clinical Research Ethics Board of Istanbul University Cerrahpasa Faculty of Medicine (Ethical approval No: A-08/2019) and conformed to the standards of the Declaration of Helsinki. All patients gave informed consent to participate in the study.

Molecular Methods
ureC Gene Detection in H. pylori
The presence of H. pylori was determined histopathologically from biopsy samples. DNA was isolated using the QIAamp DNA Mini Kit (Qiagen GmbH, Hilden, Germany). To verify the presence of H. pylori DNA, the ureC gene region (glmM) of H. pylori was detected by qPCR using a Fluorion device (Iontek, Istanbul, Turkey) and the H. pylori-QLS 1.0 kit (Iontek, Istanbul, Turkey).

Amplification of the H. pylori cagA and vacA Genes
The cagA, vacA s1/s2, and vacA m1/m2 genotypes were determined by PCR with specific primers. All primer sets used were selected from published studies and are shown in Table 1. 7,10,20,21 The thermal cycling protocol was as follows: initial denaturation at 95°C for 2 minutes, followed by 45 cycles of 95°C for 30 seconds, 53°C for 45 seconds, and 72°C for 45 seconds. The final elongation was performed for 5 minutes at 72°C. [...] for 1 minute; and 1 cycle at 72°C for 7 minutes. Each reaction included a positive control (DNA from strain 26695) and a negative control. All reactions were performed in a Mastercycler Ep gradient thermocycler (Eppendorf, Hamburg, Germany). Polymerase chain reaction products were analyzed by 1.5% agarose gel electrophoresis and stained with ethidium bromide. A second PCR was performed with DNA from the strains that were negative in the first reaction, using primers cagLFwd-2 and cagL-16, which amplified a 165-bp product.
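As a rough check on run length, the fully stated cycling program above (2 minutes at 95°C, 45 three-step cycles, 5 minutes at 72°C) can be written down as a small data structure. This is a minimal sketch in Python; the names Step, CYCLE, and total_hold_seconds are illustrative and are not part of the study protocol, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int   # hold temperature in degrees Celsius
    seconds: int  # hold time in seconds


# Program as stated in the text (illustrative variable names).
INITIAL_DENATURATION = Step(95, 120)
CYCLE = (Step(95, 30), Step(53, 45), Step(72, 45))  # denature, anneal, extend
N_CYCLES = 45
FINAL_ELONGATION = Step(72, 300)


def total_hold_seconds() -> int:
    """Sum of programmed hold times; ramping between temperatures is ignored."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return (INITIAL_DENATURATION.seconds
            + N_CYCLES * per_cycle
            + FINAL_ELONGATION.seconds)


if __name__ == "__main__":
    print(f"{total_hold_seconds() / 60:.0f} minutes of programmed hold time")
```

Under these assumptions the programmed hold time is 5820 seconds, about 97 minutes, before any ramping is counted.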

Purification and Sequencing of Polymerase Chain Reaction Products
Polymerase chain reaction products were cleaned with ExoSAP. To purify the cagL-positive PCR products from the first and second PCR stages, 4 μL of PCR product and 1 μL of ExoSAP (Exonuclease 1 and Shrimp Alkaline Phosphatase enzymes) were mixed to remove unbound primers and dNTPs from the reaction tube. The Exonuclease 1 enzyme digested the unbound primers in the medium, and the Shrimp Alkaline Phosphatase enzyme digested the unbound dNTPs. The tubes were placed in the PCR instrument, the ExoSAP program was selected, and a 30-minute run was performed.
Purified PCR products were sequenced by the Sanger sequencing method on an ABI 3730XL (Applied Biosystems, Foster City, California, USA) using the Big Dye™ Terminator v3.1 Cycle Sequencing Kit (Thermo Fisher, Massachusetts, USA). The nucleotide sequences obtained were aligned using MEGA 7.0 (Mega software, USA) and translated into amino acid sequences using ExPASy (Swiss Institute of Bioinformatics, Lausanne, Switzerland).
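The translation step can be illustrated with a short Python sketch. It is not the ExPASy tool itself: it assumes the standard genetic code, an in-frame sequence starting at the first base, and uses the hypothetical function name translate. The codons in the usage lines are illustrative and are not taken from the study isolates.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence; unknown codons become 'X'."""
    dna = dna.upper().replace("U", "T")
    usable = len(dna) - len(dna) % 3  # drop a trailing partial codon
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )


if __name__ == "__main__":
    # Illustrative codons only; they spell two motifs discussed in the text.
    print(translate("GATAAAATTGGTCAA"))  # DKIGQ
    print(translate("AATGAAATTGGTCAA"))  # NEIGQ
```

Ambiguous base calls from the sequencing trace (for example N) are reported as X rather than guessed.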
Evolutionary analysis was performed using the neighbor-joining method. 22 The sum of the branch lengths of the optimal tree was 0.32397031. The evolutionary distances were computed using the Poisson correction method. 23 The analysis included 48 amino acid sequences. All positions with gaps and missing data were eliminated. There were a total of 194 positions in the final data set. All evolutionary analyses were performed using MEGA7 software. 24
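The Poisson correction converts the observed proportion of differing amino acid sites, p, into an estimated number of substitutions per site, d = -ln(1 - p). The sketch below is a minimal Python illustration, not the MEGA7 implementation: the function names are hypothetical, and gap positions are dropped per pair of sequences, whereas the analysis above removed gapped positions across the whole data set.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned protein sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore columns where either sequence has a gap.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable positions")
    return sum(a != b for a, b in pairs) / len(pairs)


def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-corrected distance d = -ln(1 - p)."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        return math.inf  # distance is undefined when every site differs
    return -math.log(1.0 - p)


if __name__ == "__main__":
    # Two five-residue motifs differing at 2 of 5 sites (p = 0.4).
    print(round(poisson_distance("NEIGQ", "DKIGQ"), 4))
```

For two motifs that differ at 2 of 5 sites, p is 0.4 and the corrected distance is about 0.51, slightly larger than p because multiple substitutions at the same site are allowed for.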

Determination of cagL Gene Polymorphisms
The amino acid sequences obtained were aligned in the MEGA 7.0 program and compared with the H. pylori 26695 reference strain to determine the amino acid polymorphisms of each DNA sample. The H. pylori 26695 (ATCC 700392; NCBI:txid85962) reference strain was used because it is a well-known strain with a fully sequenced genome and well-characterized virulence properties.
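The comparison against the reference can be sketched as follows. This is a minimal Python illustration under stated assumptions: the sequences are already aligned, alignment columns coincide with reference numbering (no gaps in the reference), and the function names are hypothetical. The toy sequences are placeholders and are not the 26695 CagL sequence or any study isolate.

```python
HM_START, HM_END = 58, 62  # CagLHM positions, 1-based and inclusive


def call_polymorphisms(reference: str, sample: str) -> list:
    """List substitutions as <sample residue><position>, e.g. 'D58'."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    calls = []
    for pos, (ref_aa, sample_aa) in enumerate(zip(reference, sample), start=1):
        if "-" in (ref_aa, sample_aa):
            continue  # skip alignment gaps
        if ref_aa != sample_aa:
            calls.append(f"{sample_aa}{pos}")
    return calls


def hypervariable_motif(sample: str) -> str:
    """Return the five residues at positions 58 to 62."""
    return sample[HM_START - 1:HM_END]


# Placeholder sequences for illustration only.
TOY_REFERENCE = "A" * 57 + "NEIGQ" + "A" * 8
TOY_SAMPLE = "A" * 57 + "DKIGQ" + "A" * 8

if __name__ == "__main__":
    print(call_polymorphisms(TOY_REFERENCE, TOY_SAMPLE))  # ['D58', 'K59']
    print(hypervariable_motif(TOY_SAMPLE))                # DKIGQ
```

Naming each call by the residue found in the sample and its position matches the notation used in this paper (D58, K59, and the five-letter motif such as DKIGQ).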

Statistical Analyses
Statistical analyses were performed using the Statistical Package for the Social Sciences version 25.0 (IBM Corporation, Armonk, NY, USA). Fisher's exact test was used to compare CagL amino acid polymorphisms between the groups, and the results are shown as Benjamini-Hochberg corrected P-values. In the context of determining the cause and [...] Table 2.
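For readers who want to see how the two named procedures work, the following is a minimal Python sketch of a two-sided Fisher's exact test for a 2 x 2 table and of the Benjamini-Hochberg adjustment. It is not the SPSS implementation used in the study, the function names are hypothetical, and results may differ slightly from those of a statistics package in how ties and the two-sided rule are handled.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided P for the table [[a, b], [c, d]] with fixed margins.

    Sums the hypergeometric probabilities of all tables that are as
    probable as, or less probable than, the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    total = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)  # small tolerance for float ties
    )
    return min(1.0, total)


def benjamini_hochberg(p_values: list) -> list:
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P to the smallest
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # D58 counts reported in this paper: 16 of 19 GC cases, 4 of 26 NUD cases.
    print(fisher_exact_two_sided(16, 3, 4, 22))
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

The adjustment multiplies each P-value by the number of tests divided by its rank and then enforces monotonicity, which is why corrected values can be noticeably larger than the raw ones when many polymorphisms are tested.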
RESULTS
The sequences of 61 cagL (+) H. pylori DNA isolates were aligned with the sequence of strain H. pylori 26695 (ATCC 700392) (Figure 1). The Arginine-Glycine-Aspartate (RGD) and RGD helper sequence (RHS) motifs were identical in all strains. The phylogenetic tree prepared on the basis of amino acid polymorphisms is shown in Figure 1.
All of the 61 amino acid sequences are shown in Figure 2. Table 3 shows the comparisons of the study and control cases for single amino acid polymorphisms. The D58 polymorphism was detected in 16 (84%) of the GC cases and 4 (15.4%) of the NUD cases (OR: 29, P = .0001).
The D58 polymorphism was detected in 8 DU (50%) and 4 NUD (15.4%) cases (OR: 5.5, P = .032). It was detected in 24 (68.6%) of the GC + DU cases, significantly more often than in the NUD cases (OR: 12, P = .0001). Table 4 shows the comparison of the study and control cases in terms of the polymorphisms at amino acids 58 and 59. The D58/K59 polymorphism was detected in 12 (63.1%) of the GC cases and in 1 (3.85%) NUD case, a significant difference between the groups (OR: 42, P = .0001). There was also a significant difference between the GC + DU cases (15; 42.9%) and the NUD cases (1; 3.8%) (OR: 18.75, P = .001).
The DKIGQ polymorphism was detected in 10 (52.63%) of the GC cases but in none of the NUD cases (OR: 58.5, P = .0065). When the GC + DU and NUD cases were compared, the DKIGQ polymorphism was detected in 11 (31.43%) of the GC + DU cases and in none of the NUD cases (OR: 24.8, P = .0290).
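The crude odds ratios above follow directly from the reported counts, as the following minimal Python sketch shows. The function name odds_ratio is illustrative. Adding 0.5 to every cell when a cell is zero (the Haldane-Anscombe correction) is an assumption, not something stated in the Methods; it is used here because an uncorrected odds ratio is undefined when no control carries the polymorphism.

```python
def odds_ratio(pos_cases: int, n_cases: int,
               pos_controls: int, n_controls: int) -> float:
    """Crude odds ratio from carrier counts and group sizes."""
    a, b = pos_cases, n_cases - pos_cases
    c, d = pos_controls, n_controls - pos_controls
    if 0 in (a, b, c, d):
        # Assumed Haldane-Anscombe correction: add 0.5 to every cell.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return (a * d) / (b * c)


if __name__ == "__main__":
    # Counts as reported in the Results (controls: 26 NUD cases).
    print(odds_ratio(16, 19, 4, 26))   # D58, GC vs NUD
    print(odds_ratio(8, 16, 4, 26))    # D58, DU vs NUD
    print(odds_ratio(24, 35, 4, 26))   # D58, GC + DU vs NUD
    print(odds_ratio(15, 35, 1, 26))   # D58/K59, GC + DU vs NUD
    print(odds_ratio(10, 19, 0, 26))   # DKIGQ, GC vs NUD (zero cell)
```

With these counts the D58 comparisons give about 29.3, exactly 5.5, and exactly 12, and the D58/K59 comparison for GC + DU gives 18.75. With the assumed correction, the DKIGQ comparison for GC gives about 58.6.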
The most frequently detected polymorphism outside of the CagLHM region was I134, followed by the K122, I175, T72, T41, A112, G140, and I203 polymorphisms. I134, the most frequent of these polymorphisms in the GC cases, was also detected in 4/26 (15.38%) NUD cases, but the difference was not significant (P = .09).
All polymorphisms detected outside of the CagLHM region are shown in Supplementary Material 1 and in Figure 2.

DISCUSSION
It is of utmost importance for public health to catch the development of GC at the earliest pre-atrophy stages and to direct the treatment strategy accordingly. It has been proposed that various polymorphisms at amino acids 58-62 of CagL contribute to the development of various gastrointestinal pathologies. 1,[7][8][9][10][11][12],25 The D58 polymorphism was significantly more frequent in our GC (OR: 29), DU (OR: 5.5), and GC + DU cases (OR: 12) than in the controls. However, this polymorphism lost its significance in the multivariate analysis. The D58/K59 polymorphism was also significantly more frequent in the GC cases. In the study by Yadegar et al, 8 [...] D58 polymorphism stands out significantly in our GC cases (OR: 29) compared to the control cases. The worldwide variation in CagLHM polymorphisms can be explained by regional geographical differences. Gorrell et al 26 observed significant geographic diversity in the CagLHM sequences and suggested that it might have arisen specifically as a result of the co-evolution of H. pylori and the host. In addition, the same researchers drew attention to the strong positive correlation between the I60 and E59 polymorphisms and GC, while emphasizing that the D58 and K59 polymorphisms are detected at a high rate in GC cases and that this might be more related to the diversity of locally circulating H. pylori and their adaptation to the host.
The Q62 polymorphism was the most frequent in our GC cases (100%), with the I60 polymorphism ranked second (89%). In our univariate analysis, D58 (84.2%) was significantly associated with GC (OR: 29) but was not significant in logistic regression analysis. The D58/K59 polymorphism was the most common, occurring in [...] In line with the hypothesis of Yeh et al, 10 we propose that atrophy with hypochlorhydria and the development of GC may be triggered in gastroduodenal pathologies related to infection with H. pylori strains carrying the D58/K59 polymorphism. In our study, the most common motif, DKIGQ, was detected in 10 (52.6%) GC cases (OR: 58.5). Similar results were also found in our GC + DU cases. However, the D58/K59/I60/G61/Q62 combination lost its significance in multivariate analysis. In the study of Yadegar et al, 8 the NEIGQ and NKIGQ polymorphisms were found in 66.6% and 33.3% of GC cases, respectively, whereas in peptic ulcer disease (PUD) cases, a different polymorphism, NKMGK, was detected in 3 (42.8%) cases.
Yadegar et al 8 reported that the NEIGQ polymorphism is more common in GC cases. In the review of Gorrell et al, 26 the most common polymorphisms worldwide were NEIGQ, NKIGQ, DKMGE, and DKIGK. In that report, the prevalence of DKMGE in Africa suggests that it is an ancestral sequence for the motif in this region; the same motif is often seen in the United States, although the picture differs slightly there, while strains of European origin predominantly have the NEIGQ and NKIGQ motifs. However, the same researchers pointed out that there is a different pattern for each region in Asia, that DKMGE is rarely seen there, that the CagLHM region is highly variable, and that polymorphisms of the CagLHM region arise mainly from polymorphisms at the 2 amino acids 58 and 59.
The amino acid polymorphic sequence diversity of the CagLHM region suggests that CagL has a versatile pathogenic role. The detection of polymorphisms in the