Determining Risk Factors for Gastric and Esophageal Cancers between 2009-2015 in East-Azarbayjan, Iran Using Parametric Survival Models

Background: Esophageal cancer (EC) and gastric cancer (GC) have been identified as two of the most common cancers in the northeastern regions of Iran. The increasing rates of these cancers require attention. This study aims to assess the potential risk factors for these two cancers and then determine the risk factors they share in a population of Iranian patients using parametric survival models.
Methods: This retrospective cohort study was conducted using 127 patients with EC and 184 patients with GC in East Azarbaijan, Iran, who were diagnosed and registered during the years 2009-2010 in Iran's National Cancer Control Registration Program and were followed for five years. Parametric survival models were used to identify risk factors for survival. The Akaike Information Criterion was used to identify the best parametric model. Interaction analysis was used to determine shared risk factors between EC and GC.
Results: The mean (± standard deviation) ages at diagnosis for EC and GC were 66.92 (± 11.95) and 66.5 (± 11.5) years, respectively. Survival time ranged from 0.07 to 70.33 months for GC patients and from 0.10 to 69.03 months for EC patients. The multivariable log-logistic model showed that being married (OR=2.25, 95% CI: 1.33-3.81) for EC patients and esophagectomy surgery for EC (OR: 1.62, 95% CI: 1.04-2.55) and GC (OR: 1.60, 95% CI: 1.02-2.53) had significant effects on survival. Age at the time of diagnosis, job status, and esophagectomy surgery were statistically comparable in their magnitude of effect on survival in the two cancers (all Ps>0.05).
Conclusion: Esophagectomy surgery was an important prognostic factor in both EC and GC, and being married was an important prognostic factor in EC. The log-logistic model was the most appropriate statistical approach for identifying significant risk factors for survival in both cancers.


Elaheh Zarean 1, Payam Amini 2, Mehdi Yaseri 3*, Morteza Hajihosseini 4, Tara Azimi 4, Mahmoud Mahmoudi 3

[...] cancers of the upper gastrointestinal (GI) tract. Previous studies have shown that GC remains the seventh most common cancer in the United States and is still the most common cancer in northern regions of Iran (Malekzadeh et al., 2009; Hu et al., 2012). Usually, patients are referred to hospitals in the advanced stages of the disease. Although the incidence of GC has decreased in recent years, approximately 990,000 people are diagnosed with GC each year worldwide and about 738,000 of them die due to GC (Karimi et al., 2014). According to reports from the Iranian Ministry of Health and recent related studies, GI cancers such as EC and GC are the most common cancers in the East-Azarbaijan province (Naghavi, 2001; Somi et al., 2014; Darabi et al., 2016), but few studies have been conducted on the occurrence of GI cancers, their survival rates, and their related risk factors. Also, since the incidence of GI cancers in East-Azarbaijan Province is high (Somi et al., 2014), identifying the potential risk factors of these fatal diseases with appropriate tools is necessary.
Due to the increasing use of survival analysis, including semi-parametric and parametric multivariable survival models, in medical studies and especially in cancer research, efficient and more flexible models are needed. Despite the popularity of semi-parametric models such as the Cox proportional hazards model, parametric approaches can be better alternatives in some circumstances (Ghadimi et al., 2011). One of the most critical assumptions of the Cox model is the proportional hazards (PH) assumption, yet in several clinical settings this assumption does not hold. In such situations, accelerated failure time parametric survival techniques can be used to model risk factors for rare diseases (Kleinbaum and Klein, 2012; Cox, 2018).
This study aimed to assess the potential risk factors of patients with EC and GC in the East Azarbaijan province of Iran using the best parametric survival model. Also, by using proper statistical methods, the shared risk factors between the two cancers were then determined for the first time in a sample of Iranian patients.

Materials and Methods
This retrospective cohort study used information on patients with gastrointestinal cancers who were registered during the years 2009-2010 in Iran's National Cancer Control Registration Program. In this national program, all pathology centers, health centers, and hospitals in the provinces are obligated to report their data to the Cancer Office of Disease Control and Prevention. The data sets of this study comprised 127 patients with EC and 184 patients with GC who lived in the cities of East-Azarbaijan Province. Patients who were referred to health centers and hospitals in this province were followed up for five years until 2015, and their information was extracted from their records. The patients were contacted by phone to gather information about their health and survival. The start of follow-up was defined as the date of the pathologic diagnosis of cancer. The study outcome was death due to EC or GC. Survival time was calculated as the difference between the date of death and the date of the first pathology report of the cancer. Patients who were still alive at the end of the study were considered right censored.
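The survival-time and censoring rules described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the example dates, and the days-per-month conversion are assumptions, since the paper does not state how days were converted to months.

```python
from datetime import date

# Average month length in days. This conversion is an assumption;
# the paper does not state which convention it used.
DAYS_PER_MONTH = 30.4375


def survival_record(diagnosis, death, study_end):
    """Return (time_in_months, event) for one patient.

    event is 1 when death was observed by study_end and 0 when the
    patient is right censored (still alive at the end of follow-up).
    """
    if death is not None and death <= study_end:
        end, event = death, 1
    else:
        end, event = study_end, 0
    months = (end - diagnosis).days / DAYS_PER_MONTH
    return months, event


# Hypothetical patients, not taken from the registry.
died = survival_record(date(2009, 3, 1), date(2010, 1, 15), date(2015, 3, 1))
alive = survival_record(date(2010, 6, 1), None, date(2015, 3, 1))
```

Each patient contributes a pair (time, event), which is the input format expected by both the log-rank test and the parametric models used later.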
The two cancer sites included in this study were defined according to the International Classification of Diseases, 10th revision. The 184 GC patients were defined by code C16 and the 127 EC patients by code C15. In order to assess the potential risk factors of EC and GC, patients with prior cancers were excluded from this study. There was no loss to follow-up in this study. In addition, the current study data were extracted from an MSc thesis that was checked and approved by the Ethics Committee of the TUMS (IR.TUMS.DDRI.REC.1396.4148).
The current study included three types of information: demographic, biological and socioeconomic data.
The demographic variables were age at the time of diagnosis, gender, educational status, marital status, and job status. The biological variables were non-communicable disease (NCD) status, esophagectomy surgery, chemotherapy, and radiotherapy. In addition, socioeconomic status (SES) was obtained from a checklist of wealth and social position characteristics such as household fuel consumption, residential facilities, personal family facilities, household appliances used by the family, total monthly household income, education status, and job status. Principal component factor analysis was applied to obtain the socioeconomic status score (EC: KMO=0.722, Bartlett's sphericity test p-value<0.001; GC: KMO=0.788, Bartlett's sphericity test p-value<0.001). The extracted score was dichotomized at the median into low and high levels.
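As a rough illustration of how a single SES score can be extracted from checklist items and split at the median, the sketch below scores each household on the first principal component of the standardized items. It is a simplification under stated assumptions: the paper used principal component factor analysis in its statistical software, whereas this sketch uses plain power iteration, and the item values and function names are hypothetical.

```python
from statistics import mean, median, pstdev


def standardize(rows):
    """Z-score each column of a list of equal-length rows."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]


def first_component(z, iters=200):
    """Leading eigenvector of the correlation matrix via power iteration."""
    n, p = len(z), len(z[0])
    corr = [[sum(r[i] * r[j] for r in z) / n for j in range(p)] for i in range(p)]
    v = [1.0] * p
    for _ in range(iters):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


def ses_levels(rows):
    """Score each household on the first component and split at the median."""
    z = standardize(rows)
    v = first_component(z)
    scores = [sum(a * b for a, b in zip(row, v)) for row in z]
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]


# Hypothetical checklist items: monthly income, appliances owned, years of education.
households = [
    [200, 2, 5],
    [350, 3, 8],
    [500, 5, 12],
    [900, 7, 16],
]
levels = ses_levels(households)
```

Starting the iteration from a vector of ones keeps the loadings positive when all items are positively correlated, so a higher score corresponds to a higher SES in this example.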

Statistical Analysis
Descriptive characteristics of the patients are shown as mean (± standard deviation) and frequency (percentage) for continuous and categorical variables, respectively. The log-rank test was performed to assess differences in survival distributions across the levels of each variable. The Cox proportional hazards model was fitted for both EC and GC, and the PH assumption was checked to determine whether this model was appropriate for the current study. The Akaike Information Criterion (AIC) was used to compare the performance of the parametric survival models (AICs for EC: log-logistic=337.94, exponential=343.90, Weibull=343.82; AICs for GC: log-logistic=577.77, exponential=621.44, Weibull=601.01).
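The model comparison above amounts to choosing the distribution with the smallest AIC for each cancer. The short sketch below applies that rule to the AIC values reported in the text; the `aic` helper shows the standard definition and is included only for reference, since the paper does not report the underlying log-likelihoods.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (smaller is better)."""
    return 2 * n_params - 2 * log_likelihood


# AIC values as reported in the text above.
reported = {
    "EC": {"Log-logistic": 337.94, "Exponential": 343.90, "Weibull": 343.82},
    "GC": {"Log-logistic": 577.77, "Exponential": 621.44, "Weibull": 601.01},
}

# Pick the distribution with the smallest AIC for each cancer.
best = {cancer: min(models, key=models.get) for cancer, models in reported.items()}
```

For both cancers the log-logistic distribution has the smallest reported AIC, which matches the model choice described in the text.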
Univariate parametric models were used to assess the risk factors in EC and GC patients. Since the log-logistic model was identified as the best model, all the variables were entered into a multivariable log-logistic model to find the adjusted effects of the factors on patients' survival. Chemotherapy was removed from the multivariable model due to its high collinearity with the other predictors. The results of the log-logistic models are presented as odds ratios (OR). Comparing the estimated coefficients (beta) from the two cancer-specific models can indicate whether a factor affects survival differently in the two cancers. We therefore fitted another multivariable log-logistic model to a merged file of the EC and GC datasets. The type of cancer and its interaction with each of the factors were added to this new model. These interaction terms assess whether the magnitude of the effect of each factor differs between the cancers. A non-significant interaction estimate indicates that the factor has a comparable magnitude of effect on survival in the two cancers. All analyses were performed using Stata (version 12), and a p-value<0.05 was considered statistically significant.
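The interaction approach described above can be written out as follows. This is a sketch in one common parameterization of the log-logistic model, in which exponentiated coefficients are survival odds ratios; the symbols and the 0/1 coding of cancer type are assumptions made for illustration, and the sign convention of the software output may differ.

```latex
% Log-logistic survival function and survival odds, with shape parameter p
% and linear predictor \eta
S(t \mid x) = \frac{1}{1 + t^{p}\,\exp(-\eta)}, \qquad
\frac{S(t \mid x)}{1 - S(t \mid x)} = t^{-p}\,\exp(\eta)

% Linear predictor for the merged EC + GC data, with c = 1 for GC and c = 0 for EC
% (this coding is an assumption)
\eta = \beta_0 + \gamma c + \sum_j \beta_j x_j + \sum_j \delta_j \,(c \cdot x_j)

% Survival odds ratios for a one-unit change in x_j, and the test of equal effect
\mathrm{OR}_j^{\mathrm{EC}} = \exp(\beta_j), \qquad
\mathrm{OR}_j^{\mathrm{GC}} = \exp(\beta_j + \delta_j), \qquad
H_0 : \delta_j = 0
```

Under this parameterization the odds ratio does not depend on t, and failing to reject H_0 for factor j corresponds to a comparable magnitude of effect in the two cancers.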

Results
The following results were obtained from 184 GC and 127 EC patients. The mean (± standard deviation) age of the 184 GC patients was 66.5 (± 11.5) years. The survival time ranged from 0.07 to 70.33 months, and the mean and median survival times were 16.8 (95% CI: 13.6-19.9) and 8.33 (95% CI: 5.9-10.6) months, respectively. One-, three-, and five-year survival probabilities were 40.16%, 11.18%, and 11.02% for EC and 34.8%, 13%, and 11.4% for GC, respectively. A total of 163 (86.6%) individuals experienced death due to GC by the end of the study. Moreover, the mean (± standard deviation) age of the 127 patients with esophageal cancer was 66.92 (± 11.95) years. The survival time ranged from 0.10 to 69.03 months, and the mean and median survival times were 16.99 (95% CI: 13.46-20.52) and 10.06 (95% CI: 6.49-13.63) months, respectively. A total of 113 patients (89%) experienced death due to esophageal cancer by the end of the study. The Kaplan-Meier survival estimates of patients with esophageal and gastric cancers are shown in Figure 1.

Because the PH assumption was not met for EC (p-value=0.035) and GC (p-value=0.044), parametric models can be selected as useful alternatives in this study. According to the AICs of the multivariable parametric models, the log-logistic model had the best-fitting distribution compared to the other parametric models.

The patients' characteristics, log-rank test results, and univariate log-logistic models for EC and GC are shown in Tables 1 and 2, respectively. According to the log-rank results, EC survival time was significantly longer among females, unemployed cases, those who had esophagectomy surgery, and patients with high socioeconomic status. GC survival time was influenced by esophagectomy surgery and chemotherapy. Accordingly, the results of the univariate log-logistic models showed that survival in patients with EC is affected by being male (OR=0.45; 95% CI: 0.28-0.73), unemployed (OR=1.81; [...]

Based on the multivariable 95% CIs for the odds ratios (Table 3), married EC patients were 2.25 (95% CI: 1.33-3.81) times more likely to survive than unmarried patients. EC patients who had esophagectomy surgery had a 62% increase in the odds of survival (OR: 1.62, 95% CI: 1.04-2.55). However, survival after diagnosis of GC was affected only by esophagectomy surgery (OR: 1.60, 95% CI: 1.02-2.53).
Based on the p-values in Table 3, gender, smoking habit, and radiotherapy affected GC survival more than EC survival. In contrast, EC survival was more affected by education, marital status, NCD diagnosis status, and SES compared with GC. Age at the time of diagnosis, job status, and having esophagectomy surgery showed a similar magnitude of effect in both cancers.
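The Kaplan-Meier estimates and the one-, three-, and five-year survival probabilities reported in this section come from the product-limit estimator. The sketch below shows how such quantities can be computed from (time, event) pairs; the follow-up times are hypothetical and the function names are illustrative, not taken from the authors' analysis.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct death time (product-limit estimator)."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= leaving
    return curve


def survival_at(curve, t):
    """Step-function lookup: S(t) is the last estimate at or before t."""
    s = 1.0
    for time, surv in curve:
        if time > t:
            break
        s = surv
    return s


# Hypothetical follow-up times in months (1 = death, 0 = right censored).
times = [2, 5, 5, 9, 14, 30, 60, 60]
events = [1, 1, 0, 1, 1, 0, 0, 0]
curve = kaplan_meier(times, events)
one_year = survival_at(curve, 12)
three_year = survival_at(curve, 36)
```

Patients censored at a given time are counted in the risk set for deaths at that same time, which is the usual convention for tied observations.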

Discussion
In this paper, we investigated the possible association between the survival of patients with EC and GC and several of the most common prognostic factors. We assessed the impact of age at the time of diagnosis, gender, education, marital status, job status, smoking habit, NCD diagnosis status, esophagectomy surgery, chemotherapy, radiotherapy, and socioeconomic status (SES). In the current study, being male was a significant indicator of prognosis in both univariate and multivariable analyses in patients with EC. This finding supports findings from Chen et al., where being male was found to significantly reduce patients' survival rates (Eil et al., 2014). In our study, high socioeconomic status had a positive influence on the survival of EC patients in the univariate log-logistic model, which is similar to the finding of Gammon et al. (1997). Our findings also indicated that esophagectomy surgery had a significant effect on the survival odds of patients with EC in both the univariate and multivariable log-logistic approaches. Previous reports have suggested that esophagectomy surgery increases the odds of survival among EC patients. In the current study, married patients in the EC group had significantly higher odds of survival in the multivariable log-logistic model. Previous studies have also demonstrated that a patient's survival is affected by marital status (Ernster et al., 1979; Aizer et al., 2013). The current study found conflicting results between the univariate and multivariable approaches regarding radiotherapy. In an intergroup phase III randomized clinical trial, Al-saraf et al. (1997) compared the effect of combined chemotherapy-radiotherapy versus radiotherapy alone in patients with locally advanced EC. They found that the median survival of patients receiving combined chemotherapy-radiotherapy was higher than that of patients receiving radiotherapy alone.
Our study has shown that esophagectomy surgery is an important prognostic indicator of GC survival in both the univariate and multivariable log-logistic models. Improved survival following surgery was assessed by Hallissey et al. (1994) in a group of British patients with GC. They concluded that surgery is the standard treatment for GC. However, our findings are not consistent with results from the prospective randomized controlled trial by Allum et al. (1989), which evaluated the effects of surgery with adjuvant radiotherapy and chemotherapy in patients with operable GC. Neither form of adjuvant therapy was associated with the survival of GC patients, and surgical treatment remained the principal treatment.
Parametric and semi-parametric survival models have been used widely in fitting survival data. This might be because parametric approaches such as the log-logistic, Weibull, and exponential models provide accurate estimates under some predetermined assumptions (Efron, 1977). The AIC scores in our study revealed that the log-logistic model was the best-fitting model in both the GC and EC datasets. The findings of Ghadimi et al. (2012) in patients with GC in the city of Babol, Iran, were consistent with the findings of the present study. Our results indicated that, in the multivariable log-logistic model, the odds ratio of survival among patients aged ≥50 is lower in GC than in EC. Roshanaei et al. (2010) studied the survival of patients with GC who underwent surgery and found that gender, pathologic stage, age at diagnosis, and weight loss were significantly related to survival in the multivariable analysis. The effect of age at diagnosis has been discussed by Zare et al., who demonstrated that age at diagnosis has a significant effect on the survival of patients with GC who have undergone surgery (Zare et al., 2014). Moreover, Greenstein et al. found that being over 70 reduced the chance of survival for patients diagnosed with esophageal cancer (Greenstein et al., 2008). This study showed that GC patients are more likely to survive after radiotherapy compared to EC patients. Moreover, SES has a greater effect on survival time among EC patients compared to patients with GC. The present study showed that females in both cancers have higher odds of survival compared to males. This might be related to riskier lifestyles among males. However, males with EC live shorter lives than those with GC. In another study in China, Chen et al. revealed that the rate of mortality and incidence of EC in males was higher than those of patients with GC (Chen et al., 2016).
Our study showed that married people have higher odds of survival. Moreover, married patients with EC have a longer survival time than those with GC. Lagergren et al. assessed the impact of marital status, education, and income on the risk of esophageal and gastric cancers. They showed that patients with long marriages have lower incidence rate ratios compared to those with shorter marriages or those who were never married, remarried, or divorced. The ratios were lower among EC patients in comparison to patients with GC (Lagergren et al., 2016).
This study had some limitations, including the relatively small sample size. The most important limitation was the absence of clinical information, including the type and stage of the esophageal and gastric cancers. Since this was a retrospective cohort study, we did not have access to information on the exposures that patients encountered.
We conclude that marital status and esophagectomy surgery were potential risk factors for the survival of EC patients. Surgical techniques may be a useful method to increase the survival rate of patients with esophageal cancer. Radiotherapy is an appropriate treatment and may decrease death caused by EC. In patients who have already had GC surgery, chemotherapy and radiotherapy are alternative treatment approaches to increase the survival chances of patients with gastric cancer. We found that the log-logistic model could be a suitable approach for the statistical analysis of risk factors in patients with EC and GC.