Associations between genetic liabilities to smoking behavior and schizophrenia symptoms in patients with a psychotic disorder, their siblings and healthy controls

healthy controls in a six-year follow-up prospective cohort study. Associations between smoking behaviors, PRS and schizophrenia symptoms were explored using linear mixed-effect models. The mean number of cigarettes smoked per day was 18 for patients, 13 for siblings and 12 for controls. In the overall sample, PRSs-smoking initiation (i.e


Introduction
Smoking is probably the single most unhealthy human behavior (Revicki, Sobal, & DeForge, 1991). Although the prevalence of tobacco smoking in the general population has decreased in the past decades, smoking rates among patients with schizophrenia are still very high (up to 70%) (Lasser et al., 2000). The odds that patients with schizophrenia smoke are more than three times higher than those for the general population worldwide (de Leon & Diaz, 2005; Zeng et al., 2020). Such high rates of smoking lead to tobacco-related diseases that substantially increase early mortality rates in patients with schizophrenia. For example, the standardized mortality rate for chronic obstructive pulmonary disease-related deaths is 9.9 (CI 9.6-10.2) (Olfson, Gerhard, Huang, Crystal, & Stroup, 2015).
Increased rates of smoking are observed in first-degree relatives of individuals with psychotic disorders, albeit to a lesser degree than in patients (Lyons et al., 2002; Vermeulen et al., 2018). This raises the possibility of genetic overlap between psychosis and smoking behaviors, which is supported by findings from hypothesis-generating genetic studies. In a genome-wide association study (GWAS), single-nucleotide polymorphisms (SNPs) in the human CHRNA5-A3-B4 cluster, a promising candidate region for smoking behaviors, were significantly associated with schizophrenia (Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014; Trubetskoy et al., 2022). This gene cluster, located on chromosome 15, encodes nicotinic acetylcholine receptor (nAChR) subunits and is associated with an increased risk for heavy smoking but not with smoking initiation (Liu et al., 2019). Moreover, polygenic risk scores (PRSs), weighted sums of trait-associated alleles, of the nicotine metabolite cotinine are significantly associated with PRSs for schizophrenia (Chen et al., 2016). In addition, PRSs for age at smoking initiation were significantly associated with schizophrenia in a Japanese sample (Ohi et al., 2020). Finally, positive genetic correlations have been established between schizophrenia and age at onset of smoking (rg = 0.19; p = 0.037) and between schizophrenia and rate of cigarettes smoked (rg = 0.14; p = 0.049) using LD score regression (http://ldsc.broadinstitute.org/).
To our knowledge, these initial observations on the possible shared genetic susceptibility between schizophrenia symptom clusters and smoking behaviors have not been substantiated by family studies examining polygenic liabilities. Findings from such studies can provide further insight into the role of smoking in the causal chain leading to psychosis and to the subclinical course of psychotic symptoms, which is not fully understood (Quigley & MacCabe, 2019). Healthy (unaffected) siblings of patients with a psychotic disorder share environmental risks and half of the genetic risks with the patients, have genetic liabilities to psychosis that lie between those of patients (i.e., fully affected) and healthy controls, and are free of illness-specific confounding factors. Including unaffected siblings of patients with a psychotic disorder can thus help to disentangle the associations between genetic underpinnings of smoking behaviors and psychosis symptom clusters.
Here, we therefore investigated 1) baseline differences in smoking behavior between patients with a psychotic disorder, unaffected siblings and healthy controls; 2) associations of smoking behavior PRSs with psychotic disorder case-control status; and 3) associations of smoking behavior PRSs with schizophrenia symptom clusters. To that end, we generated PRSs for age at onset of smoking, age at first regular smoking, number of cigarettes smoked per day, and schizophrenia. We then evaluated associations with case-control status and psychosis symptom clusters in a longitudinal cohort of patients with non-affective psychosis, unaffected siblings of patients with psychosis, and healthy controls. Based on the shared-vulnerability hypothesis, we expected higher genetic risk for smoking behaviors to be associated with higher symptom levels in patients and siblings, owing to shared genetic and environmental factors, and to a lesser extent in healthy controls.

Study population and study design
This study was performed within the Genetic Risk and Outcome of Psychosis (GROUP) study, a naturalistic, multi-center cohort study (Korver et al., 2012). The full sample consisted of 3,684 participants: 1,119 patients with a diagnosis within the non-affective psychotic spectrum, 1,059 unaffected siblings, 920 parents of patients with psychotic disorders, and 586 unrelated healthy controls. Study design, power calculations, recruitment procedure and baseline characteristics of participants have been described in detail previously (Korver et al., 2012). In short, patients aged between 16 and 50 years and diagnosed with non-affective psychosis according to the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) (Diagnostic and Statistical Manual of Mental Disorders: DSM-IV-TR, 2000) were recruited by clinicians from four university medical centers and 36 associated mental health care facilities in representative geographical areas in the Netherlands and Belgium between 2004 and 2014. The age range was chosen to allow for inclusion of early-onset cases as well as long-term follow-up (older age would have incurred higher risks of mortality and thus loss to follow-up). Siblings and controls were included if not affected by a psychotic disorder. All patients, unaffected siblings and controls took part in the baseline assessment (T0) and were invited for follow-up assessments three (T1) and six (T2) years after inclusion and the baseline visit. Patients were included in the analyses when genetic data and at least one of the outcomes of interest were available.
This study was approved by the Medical Ethics Committee of the University Medical Center of Utrecht. Written informed consent was obtained before inclusion.

Genetic data and quality control steps
Genotype data for 2,812 individuals were generated on a customized Illumina IPMCN array with 570,038 SNPs. This chip contains ~250K common SNPs, ~250K rare, exomic, non-synonymous SNPs [minor allele frequency (MAF) < 1%], and ~50K psychiatric-related variants. Quality control (QC) procedures were performed using PLINK v1.9 (Purcell et al., 2007) as previously reported (Pazoki et al., 2020); detailed quality control steps are shown in the Supplementary method, p.4 of the Supplement. The quality-controlled SNPs were imputed on the Michigan server (Das et al., 2016) using the HRC r1.1 2016 reference panel with European samples after phasing with Eagle v2.3. Post-imputation QC involved removing SNPs with an estimated r² (Rsq) info score < 0.3, SNPs with a MAF < 0.01, SNPs that had a discordant MAF (MAF difference > 0.15) compared to the reference panel, strand-ambiguous AT/CG SNPs and multi-allelic SNPs, leaving a total of 2505 subjects and 14,132,467 SNPs for final analyses. We excluded the patients' parents (n=700) from the sample because they lacked phenotypic information (smoking behavior and CAPE score), resulting in 1805 individuals for analysis, including n=706 patients, n=731 siblings and n=368 healthy controls.
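The post-imputation filters listed above can be illustrated with a minimal sketch. This is not the pipeline used in the study (which relied on PLINK and the Michigan imputation server); the `Variant` record, its field names and the comma-separated encoding of multi-allelic sites are hypothetical and serve only to make the thresholds concrete.

```python
from dataclasses import dataclass

# Hypothetical record for one imputed variant; field names are assumptions.
@dataclass
class Variant:
    snp_id: str
    ref: str
    alt: str              # assumed comma-separated when multi-allelic, e.g. "G,T"
    rsq: float            # imputation quality (Rsq / info score)
    maf: float            # minor allele frequency in the study sample
    ref_panel_maf: float  # minor allele frequency in the reference panel

# Allele pairs whose strand cannot be resolved (A/T and C/G SNPs).
STRAND_AMBIGUOUS = {("A", "T"), ("T", "A"), ("C", "G"), ("G", "C")}

def passes_post_imputation_qc(v: Variant) -> bool:
    """Return True if the variant survives the filters described in the text."""
    if v.rsq < 0.3:                            # poorly imputed
        return False
    if v.maf < 0.01:                           # too rare
        return False
    if abs(v.maf - v.ref_panel_maf) > 0.15:    # discordant with reference panel
        return False
    if "," in v.alt:                           # multi-allelic site
        return False
    if (v.ref.upper(), v.alt.upper()) in STRAND_AMBIGUOUS:
        return False
    return True

variants = [
    Variant("snp_ok", "A", "G", 0.95, 0.20, 0.22),
    Variant("snp_low_rsq", "A", "G", 0.20, 0.20, 0.22),
    Variant("snp_ambiguous", "A", "T", 0.95, 0.20, 0.22),
]
kept = [v.snp_id for v in variants if passes_post_imputation_qc(v)]
```

In this toy input only `snp_ok` is retained; the other two variants fail the imputation-quality and strand-ambiguity filters, respectively.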

Levels of (sub)clinical symptom clusters
Psychosis-related symptoms in patients, siblings and controls were assessed with the Community Assessment of Psychic Experiences (CAPE; Mossaheb et al., 2012), a self-report questionnaire that assesses schizophrenia symptom clusters. At baseline (T0), the CAPE assessed lifetime symptoms; at follow-up (T1 and T2), it assessed symptoms over the past three years. Each item is rated in terms of frequency on a scale from 0 (absent) to 3 (severe). Mean item scores (range 0-3) for positive symptoms (20 items), negative symptoms (14 items) and depressive symptoms (8 items) were calculated for patients, siblings, and healthy controls.
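As an illustration of the scoring described above, the following sketch computes a subscale mean item score. The function name and the strict handling of incomplete or out-of-range responses are assumptions made for this example; the text does not describe how missing items were treated.

```python
from statistics import mean

# Number of items per CAPE subscale, as stated in the text.
CAPE_ITEM_COUNTS = {"positive": 20, "negative": 14, "depressive": 8}

def cape_subscale_mean(subscale: str, item_scores: list[int]) -> float:
    """Mean item score (0-3) for one CAPE subscale.

    Assumption for this sketch: all items must be answered and each
    response must be an integer from 0 (absent) to 3 (severe).
    """
    expected = CAPE_ITEM_COUNTS[subscale]
    if len(item_scores) != expected:
        raise ValueError(f"{subscale} subscale expects {expected} items")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("item scores must be integers from 0 to 3")
    return mean(item_scores)

# Hypothetical responses to the 8 depressive items.
depressive_score = cape_subscale_mean("depressive", [0, 1, 2, 3, 0, 1, 2, 3])
```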

Statistical analyses
Demographic variables, age of smoking initiation and number of cigarettes smoked per day were compared across groups (patients, siblings, and healthy controls) using ANOVA and t-tests (p-value threshold for significance < 0.05).
To test associations with psychotic disorder case-control status, we performed a logistic regression using cross-sectional data (time point = T0; N = 1074, including 706 patients and 368 healthy controls), with case-control status (Y) as the outcome and PRS-smoking behaviors, the first 3 genetic principal components (PCs), age, and sex as fixed effects.
For the associations with symptom levels, we first performed the analyses in the entire study population and then in patients, siblings, and controls separately. The first 3 genetic PCs, time, age, sex, and sibling effects (0 = control, 1 = sibling and 2 = patient) were added to the association models as fixed-effect covariates. In addition, for association tests in the entire cohort, we added intercepts for subjects and by-subject random slopes for the effects of time and family structure as random effects. In the subgroups (patients, siblings, and controls), the equation was:
$$Y_{1..3} = \beta_0 + \beta_1 \cdot \mathrm{PRS} + \beta_2 \cdot \mathrm{Age} + \beta_3 \cdot \mathrm{Sex} + \beta_4 \cdot \mathrm{PC1} + \beta_5 \cdot \mathrm{PC2} + \beta_6 \cdot \mathrm{PC3} + Z \cdot \mathrm{FamilyID} + Z \cdot \mathrm{Time} + \varepsilon$$
All variables were added en bloc and models were fitted by restricted maximum likelihood (REML). To report how much variance is explained by the risk score itself, the delta r² for fixed effects was calculated by comparing linear mixed models with and without PRSs. P-values were calculated with the Kenward-Roger approach (Kuznetsova, Brockhoff, & Christensen, 2017). As the PRSs were calculated at 12 p-value thresholds, the association p-values in all PRS analyses were adjusted for the false discovery rate (FDR) across these 12 PRSs (p-value threshold for significance < 0.05) (Benjamini & Hochberg, 1995). To test whether the PRS-smoking associations would remain when accounting for genetic liability to schizophrenia, we added PRS-SCZ (pt=0.05) as a covariate to the abovementioned models.
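The FDR adjustment across PRS thresholds follows the Benjamini-Hochberg step-up procedure, which can be sketched as below. The function name and the example p-values are hypothetical; the values are not results from this study.

```python
def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p-values, e.g. one per PRS p-value threshold.
raw_p = [0.01, 0.04, 0.03, 0.005]
fdr_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in fdr_p]
```

Each raw p-value is multiplied by the number of tests divided by its rank, and the running minimum taken from the largest rank downwards keeps the adjusted values monotone.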
Following suggestions put forward during the peer review process, we also checked whether the results for patients on olanzapine, clozapine or both were similar to the results for the entire group of patients by restricting our analyses to such patients as a subgroup.
We then split samples by PRS (pt=0.05) tertiles into low, middle and high PRS groups. Symptoms were then compared between low, middle and high PRS-SI, PRS-AI and PRS-CPD groups in the entire cohort using the Wilcoxon rank test (p-value threshold for significance < 0.05). Finally, we tested interaction effects of patient/sibling/control status (group status) with PRSs (pt=0.05) to check whether the PRS association results remained. GROUP database release 5.0 was used for all analyses.
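The tertile split can be sketched as follows. This is an illustration only: the function name, the use of `statistics.quantiles` with its default cut-point method, and the rule that values equal to a cut point fall into the lower group are assumptions, and the scores are made up.

```python
from statistics import quantiles

def assign_prs_tertiles(prs_scores: list[float]) -> list[str]:
    """Label each score as 'low', 'middle' or 'high' by sample tertile."""
    # Two cut points that divide the sample into three equal-sized groups.
    lower_cut, upper_cut = quantiles(prs_scores, n=3)
    labels = []
    for score in prs_scores:
        if score <= lower_cut:
            labels.append("low")
        elif score <= upper_cut:
            labels.append("middle")
        else:
            labels.append("high")
    return labels

# Hypothetical standardized PRS values for nine participants.
example_scores = [9, 1, 5, 2, 8, 4, 3, 7, 6]
example_labels = assign_prs_tertiles(example_scores)
```

With real data, ties at a cut point can make the three groups slightly unequal in size.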

Smoking behavior differences between cases, unaffected siblings and healthy controls
In total, genotype data of 706 patients with non-affective psychosis, 731 unaffected siblings and 368 healthy controls (total N = 1805) were available after we executed the genetic quality control steps, as described above.
The percentage of smokers was 64% (n=454) among patients, 38% (n=277) among siblings and 23% (n=85) among controls.
CAPE scores in patients showed a mean (SD) item score of 0.67 (0.49) for positive symptoms, 1.01 (0.53) for negative symptoms and 0.97 (0.56) for the depressive subscale. These and other characteristics for the three groups (patients, unaffected siblings, healthy controls) are reported in Table 1.
For the association analyses with schizophrenia symptoms, varying by the completeness of CAPE subscale scores, we used data on 686-687 patients with 1601-1602 observations, 575-579 siblings with 966-971 observations and 298 controls with 538 observations. The specific sample sizes for association tests are listed in each respective table (Supplementary Tables S2-5).

Association analyses between smoking behavior PRSs, psychotic disorder case-control status and symptom levels
The most significant association was found between psychotic disorder case-control status and PRS-CPD at pt = 5 × 10⁻³ (beta = 0.105, SE = 0.053, PFDR = 0.003; Supplementary Figure S1). However, when PRS-SCZ was added to the model as a fixed-effect covariate, the association between PRS-CPD and psychotic disorder case-control status was no longer significant.

Association analyses between smoking behavior PRSs and psychotic symptoms in patients, siblings and controls separately
The strongest associations between PRS-SI and psychotic symptoms were found in healthy controls, while no significant associations were found in patients (with similar results in patients using clozapine, olanzapine, or both); siblings showed intermediate strengths of associations (Suppl. Table S3A; Fig. 3).
When adding PRS*patient/sibling/control status to the model (Suppl. Table S6), the main associations of PRS-SI with positive, negative, and depressive symptoms remained (P<0.05), with no significant interactions detected. Correlations between PRSs are reported in Suppl. Table S7.

Discussion
Here, we found that unaffected siblings of patients with psychotic disorders show smoking behaviors at an intermediate level between patients and healthy controls. Furthermore, we demonstrate that polygenic risk scores for smoking initiation and age of smoking initiation (PRS-SI and PRS-AI) are associated with schizophrenia symptom clusters only in unaffected siblings and healthy controls. In patients, no associations were found between genetic liabilities for smoking phenotypes and schizophrenia symptom levels. These findings suggest that genetic liabilities to smoking behaviors are differentially related to schizophrenia symptom clusters.
Our results suggest that many SNPs related to smoking initiation with small individual effect sizes together contribute 3.1% and 1.9% to the variance of positive, negative and depressive symptom levels in individuals without illness-related confounders. Possible explanations for this finding could include a shared underlying biological pathway such as the cholinergic receptor or horizontal pleiotropy (Quigley & MacCabe, 2019). A possible explanation for the absence of findings in patients may relate to dominant non-smoking-related biological pathways underlying schizophrenia case-control status. In addition, in patients, intervention changes the symptom course and thereby the association with neurobiological variables (like PRSs for smoking). These associations are likely to differ from those in siblings and healthy controls, who do not receive interventions.
Figure 1A. Violin and box plots of age of smoking initiation in healthy controls, siblings and patients (years). Figure 1B. Violin and box plots of number of cigarettes smoked per day in healthy controls, siblings and patients. The dashed line denotes the mean age of smoking initiation or cigarettes per day in all participants, respectively. The diamonds denote the mean age of smoking initiation and cigarettes per day in each group, respectively. Mean comparisons between pairs were examined using t-tests: ns: p > 0.05; *: p < 0.05; **: p < 5.0 × 10⁻³; ****: p < 5.0 × 10⁻⁵.
There are contrasting, non-mutually exclusive hypotheses that try to explain the high rates of smoking in patients with psychotic disorders (Quigley & MacCabe, 2019). The shared-vulnerability hypothesis proposes that shared genetic and environmental factors render people more vulnerable to both tobacco use and psychotic disorders (Chambers, Krystal, & Self, 2001). Furthermore, accumulating evidence supports the hypothesis that smoking is a causal risk factor for the development of severe mental illness (Quigley & MacCabe, 2019). For example, a Mendelian randomization study and a meta-analysis of prospective observational studies found evidence for a causal relationship between smoking and schizophrenia (Gurillo, Jauhar, Murray, & MacCabe, 2015; Wootton et al., 2018). However, genetic variation also captures environmental risk factors. For example, offspring genetic risk for schizophrenia was associated with prenatal environmental factors such as maternal smoking during pregnancy (Krapohl et al., 2017). Although explained variances were generally small, the results of the current study provide support for a partly shared vulnerability hypothesis of genetic risk for smoking and subclinical levels of psychosis symptoms only in unaffected siblings of schizophrenia patients. Future research is needed to disentangle the complex interplay between smoking and severe mental illness, to identify plausible pathophysiological mechanisms via which genetic risks for smoking could affect mental health symptoms (such as the expression of cholinergic receptors), and to elucidate whether this relationship reflects a causal process.
Strengths of our study include the large cohort of patients, siblings and controls and the use of extensive follow-up phenotype data (enabling multi-cross-sectional associations). Several limitations should nonetheless be noted. First, large samples are required for polygenic risk scores and our study might be underpowered to pick up some small effects in the patient group (Dudbridge, 2013). Second, due to the observational character of our study and the current sample size, causal analyses (e.g., one-sample Mendelian randomization) could not be conducted. Future research is needed to explore possible residual confounding effects of environmental factors. Third, patients participating in GROUP reported relatively low levels of psychotic symptoms. This may have influenced our findings and may limit the generalizability of the findings to patients with a more severe course of their psychotic disorder. Finally, some phenotypic substance use data, e.g. the Fagerström test for nicotine dependence, have not been collected and may enrich future analyses in other cohorts.
Figure 2A, 2B and 2C. The dashed line denotes the mean CAPE subscale scores in all participants. The diamonds denote the mean CAPE subscales in each group. Mean comparisons were examined using t-tests: ns: p > 0.05; *: p < 0.05; **: p < 5.0 × 10⁻³; ****: p < 5.0 × 10⁻⁵. As can be appreciated from the graphs, all PRS-SI upper tertiles consistently contained the highest symptom levels relative to the lowest PRS-SI tertiles. CAPE = Community Assessment of Psychic Experiences; PRS-SI = polygenic risk scores for smoking initiation.
[...] and the y-axis represents symptom scales. The blue line represents the regression line: the slope (effect size of the regression model) decreases from controls to siblings to patients with a psychotic disorder, showing that PRS has larger association results in controls or siblings than in patients. The intercepts increase from controls to siblings to patients, reflecting significant mean symptom score differences between the groups.
In conclusion, our findings indicate that smoking behaviors in unaffected siblings lie at an intermediate level between those of patients and healthy controls. We observed that genetic susceptibility for smoking behavior is associated with schizophrenia symptom expression in a population of unaffected siblings of patients with a psychotic disorder. Further research into biological pathways is needed to elucidate the underlying mechanisms for this association.

Author statement
JJL and BDL conceived the study. BDL and JMV conducted the analyses. BDL, JMV and JJL wrote the first draft. All authors were involved in data collection and critically revised the final draft and approved of it. JJL supervised the study.

Ethics
This study was approved by the Medical Ethics Committee of the University Medical Center of Utrecht. Written informed consent was obtained before inclusion.

Declaration of Competing Interest
None.