Identification of Sweetness Preference-Related Single-Nucleotide Polymorphisms for Polygenic Risk Scores Associated with Obesity

Our study aimed to identify sweetness preference-associated single-nucleotide polymorphisms (SNPs), characterize the related genetic loci, and develop SNP-based polygenic risk scores (PRS) to analyze their associations with obesity. For genotyping, we utilized a pooled genome-wide association study (GWAS) dataset of 18,499 females and 10,878 males. We conducted genome-wide association analyses and functional annotation, and used a weighted method to calculate the PRS from 677 sweetness preference-related SNPs. We used Cox proportional hazards modeling with time-varying covariates to estimate age-adjusted and multivariable hazard ratios (HRs) and 95% confidence intervals (CIs) for obesity incidence. We also tested the interaction effects of the PRS and environmental factors, including smoking and dietary components, on obesity. Our results showed that in males, the TT genotype of rs4861982 significantly increased obesity risk compared to the GG genotype in the Health Professionals Follow-up Study (HPFS) cohort (HR = 1.565; 95% CI, 1.122–2.184; p = 0.008) and in the pooled analysis (HR = 1.259; 95% CI, 1.030–1.540; p = 0.025). Protein tyrosine phosphatase receptor type O (PTPRO) was identified as strongly associated with sweetness preference, indicating a positive correlation between sweetness preference and obesity risk. Moreover, each 10 pack-year increment in smoking was significantly associated with an increased risk of obesity in the HPFS cohort (HR = 1.024; 95% CI, 1.000–1.048) in males but not in females. In conclusion, significant associations between rs4861982, sweetness preference, and obesity were identified, particularly among males, where environmental factors like smoking are also correlated with obesity risk.


Introduction
The prevalence of obesity in the United States increased to 42.4% in 2018, and obesity-related metabolic syndromes, including type 2 diabetes, cardiovascular diseases, and cancer, are among the leading causes of premature death in this country [1]. Genome-wide association studies (GWAS) have revealed over 40 genetic variants associated with obesity since 2006, and most cases of obesity have multifactorial causes stemming from complex interactions between the associated genes and environmental factors, including dietary components [2]. Because the sense of taste plays a central role in the development of obesity (as it contributes to food selection) and, consequently, body weight, studies have been conducted to identify genetic predispositions to obesity in terms of polymorphisms that influence taste receptors. In humans, taste sensations arise when molecules from foods bind to taste receptors in the taste buds and gut cells [3]. The stimulation of taste receptors can also occur in extra-oral tissues [4], focusing attention on the relationship between taste receptors and obesity. However, not everyone experiences the same taste sensations due, in part, to individual genetic differences that influence food preferences and consumption. Accumulating evidence has shown that susceptibility to obesity is governed by various genetic variants of taste receptors and their interactions with environmental factors [5,6], highlighting the importance of taste genetics. Taste-related single-nucleotide polymorphisms (SNPs) that are associated with taste perceptions in individuals with metabolic syndrome have been identified, revealing a strong inverse association between overall taste perceptions and body mass index (BMI) [7]. Although our understanding of obesity genetics has advanced significantly over the past few decades, the link between sweetness preference-associated receptors and obesity and the molecular mechanisms underlying the disease remain largely unknown. Additionally, although sex differences have been discovered in terms of caloric expenditure, body fat mass, and the onset of menopause [8], the understanding of how sex differences and SNPs in sweetness preference-associated receptors correlate with obesity is lacking. Given that obesity is a chronic disease due to an increase in body fat accumulation, high-sugar diets related to sweetness preference can lead to obesity. This is because sugar consumption increases body adiposity without causing a dramatic increase in body weight [9,10]. Therefore, our main goal was to perform a GWAS to identify sweetness preference-associated SNPs, characterize novel candidate genetic loci related to sweetness preference, and develop polygenic risk scores (PRS) based on sweetness preference-related SNPs to analyze their associations with obesity.
Our hypothesis was that sweetness preference-associated receptor polymorphisms are correlated with sex-dependent obesity and that the correlations between sweetness preference-associated receptor SNPs and environmental factors like smoking affect the pathogenesis of obesity. Thus, novel candidate loci associated with obesity and sweetness preference were identified and characterized within the American Nurses' Health Study (NHS1) and Health Professionals Follow-up Study (HPFS) cohorts to verify the hypothesis. The results will have implications for diagnosing obesity, refining trial recruitment strategies, treating the disease, and tailoring personalized nutrition.

Study Populations
The study population included two large prospective cohorts (the NHS1 and HPFS cohorts). We followed up with the participants in both cohorts with biennial questionnaires regarding their medical histories and lifestyles, and semi-quantitative food-frequency questionnaires (FFQs) were completed every 4 years. The baseline year in both cohorts was 1986, when detailed information regarding the participants' dietary habits and lifestyles was available. This study included 18,499 females and 10,878 males of European ancestry who had complete baseline information and available genotype data for GWAS [11–13]. All participants were free from diabetes and cancer at baseline. We excluded both females and males with a BMI > 30 kg/m² at baseline, as well as individuals with missing or implausible FFQ responses. Following exclusion based on these criteria, the final cohort used for data analysis included 12,098 females and 7555 males.
This study was approved by the institutional review board of Keimyung University (approval number 40525-202002-BR-087-01), and the protocol was reviewed and approved by the Brigham and Women's Hospital and the Harvard T. H. Chan School of Public Health.

Assessment of Obesity and Covariates
Height and body weight data were self-reported in the questionnaires administered at enrollment and at each follow-up. BMI was calculated as body weight in kilograms divided by the square of height in meters (kg/m²), and subjects with a BMI greater than 30 kg/m² were classified as obese. We converted the average time spent per week participating in physical activities (e.g., walking, running, and biking) to metabolic equivalent hours (MET-hours) per week [14]. Alcohol intake was assessed on the FFQ every 4 years, and trans fat and total energy intakes were updated from these questionnaires.
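The obesity definition above can be illustrated with a minimal sketch. The function names and the example weight and height are hypothetical and are not taken from the study data; only the formula and the cutoff (BMI greater than 30 kg/m²) come from the text.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def is_obese(bmi_value: float, threshold: float = 30.0) -> bool:
    """Apply the cutoff stated in the text: BMI strictly greater than 30."""
    return bmi_value > threshold


# Hypothetical participant, for illustration only.
example_bmi = bmi(weight_kg=95.0, height_m=1.75)
print(round(example_bmi, 2), is_obese(example_bmi))
```

Because the text defines obesity as a BMI greater than 30 kg/m², a value of exactly 30.0 is not classified as obese in this sketch.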

Dietary Assessment and Phenotype Definitions
The participants were asked about their frequencies of consuming specific foods and beverages during the past year, and the participants' food-intake frequencies were quantified using nine categories: "never/almost never", "one to three times a month", "once a week", "two to four times a week", "five to six times a week", "once a day", "two to three times a day", "four to five times a day", and "more than six times a day" [15]. Sweetness preference was classified into two groups by calculating the cumulative average sum intake of sweet foods such as chocolate, cookies, brownies, donuts, cake, and jam. In this manner, two groups were created: "low" sweetness preference (<5 servings per month) and "high" sweetness preference (>50 servings per month) to determine the sweet-taste preference phenotype [16]. The grouping rationale was based on previous findings showing that the recalled sweet taste intensity was associated with self-reported liking and habitual intake of commonly consumed sweet foods [17].
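The phenotype definition above can be sketched as follows. The two cutoffs come from the text; the function names, the input layout (one summed sweet-food intake per FFQ cycle, in servings per month), and the decision to leave intermediate intakes unclassified are assumptions made for illustration.

```python
from statistics import mean
from typing import Optional, Sequence

LOW_CUTOFF = 5.0    # servings per month, from the text
HIGH_CUTOFF = 50.0  # servings per month, from the text


def cumulative_average(intakes: Sequence[float]) -> float:
    """Average the summed sweet-food intake over all FFQ cycles to date."""
    return mean(intakes)


def sweetness_group(intakes: Sequence[float]) -> Optional[str]:
    """Return 'low', 'high', or None for intermediate intakes."""
    avg = cumulative_average(intakes)
    if avg < LOW_CUTOFF:
        return "low"
    if avg > HIGH_CUTOFF:
        return "high"
    # Assumption: participants between the cutoffs are not assigned a group.
    return None


# Hypothetical intakes across three FFQ cycles.
print(sweetness_group([2.0, 4.0, 3.0]))
print(sweetness_group([60.0, 55.0, 70.0]))
```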

Genotyping and Calculating PRS
Genotyping and merging were performed with the pooled GWAS dataset, which was generated using five different platforms (Illumina HumanHap array, Illumina OmniExpress array, Humancore array, Oncoarray, and Affymetrix 6.0 array), as described in detail by Lindström et al. [18]. Samples with a missing call rate >5% (with any platform) during the merging process were excluded from further analysis. SNPs with a minor allele frequency <1% or an imputation quality (r²) <0.5 were also excluded [19]. Missing genotypes were imputed using the Haplotype Reference Consortium as the reference panel [20].
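The exclusion thresholds above can be expressed as simple filters. This is a sketch under stated assumptions, not the pipeline used in the study: the `SnpStats` record, the function names, and the example identifiers are hypothetical, and only the three thresholds (missing call rate >5%, minor allele frequency <1%, r² <0.5) come from the text.

```python
from dataclasses import dataclass


@dataclass
class SnpStats:
    rsid: str        # hypothetical identifier, for illustration only
    maf: float       # minor allele frequency
    info_r2: float   # imputation quality (r^2)


def passes_snp_qc(snp: SnpStats, min_maf: float = 0.01, min_r2: float = 0.5) -> bool:
    """Keep a SNP unless its MAF is below 1% or its r^2 is below 0.5."""
    return snp.maf >= min_maf and snp.info_r2 >= min_r2


def passes_sample_qc(missing_call_rate: float, max_missing: float = 0.05) -> bool:
    """Keep a sample unless its missing call rate exceeds 5%."""
    return missing_call_rate <= max_missing


snps = [
    SnpStats("rs_example_1", maf=0.20, info_r2=0.95),
    SnpStats("rs_example_2", maf=0.005, info_r2=0.95),  # fails MAF
    SnpStats("rs_example_3", maf=0.20, info_r2=0.40),   # fails r^2
]
kept = [s.rsid for s in snps if passes_snp_qc(s)]
print(kept)
```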
We used a weighted method to calculate the PRS based on 677 SNPs (p < 1.0 × 10⁻³). Each SNP was weighted by its relative effect size (β coefficient) on sweetness preference. We calculated the PRS using the equation

$$\text{weighted PRS} = \left(\beta_1 \times \mathrm{SNP}_1 + \beta_2 \times \mathrm{SNP}_2 + \dots + \beta_{677} \times \mathrm{SNP}_{677}\right) \times \frac{677}{\sum_{i=1}^{677} \beta_i},$$

where SNP_i is the number of risk alleles of the i-th SNP; the weighting is intended to ensure that the score accurately reflects the risk [21].
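The weighted PRS equation can be sketched in a few lines. The function name and the toy inputs are hypothetical; the study used 677 SNPs with β coefficients estimated from the GWAS, which are not reproduced here.

```python
from typing import Sequence


def weighted_prs(betas: Sequence[float], allele_counts: Sequence[int]) -> float:
    """Weighted PRS = (sum of beta_i * SNP_i) * (number of SNPs / sum of betas)."""
    if len(betas) != len(allele_counts):
        raise ValueError("betas and allele_counts must have the same length")
    beta_sum = sum(betas)
    if beta_sum == 0:
        raise ValueError("sum of beta coefficients must be non-zero")
    weighted_sum = sum(b * g for b, g in zip(betas, allele_counts))
    return weighted_sum * len(betas) / beta_sum


# Toy example with three SNPs (risk-allele counts are 0, 1, or 2).
toy_betas = [0.10, 0.20, 0.30]
toy_counts = [2, 1, 0]
print(weighted_prs(toy_betas, toy_counts))
```

The rescaling factor (number of SNPs divided by the sum of the β coefficients) places the score on the scale of risk-allele counts: when all β coefficients are equal, the weighted PRS reduces to the plain count of risk alleles.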

Functional Annotation and Gene Mapping
Functional annotation was performed using Functional Mapping and Annotation of Genome-Wide Association Studies (FUMA GWAS, an online platform for the functional mapping of genetic variants) [22]. Additionally, Combined Annotation Dependent Depletion (CADD) [23] scores (with scores > 12.37 indicating the deleteriousness of an SNP) and RegulomeDB [24] scores (with a lower score indicating a higher probability of having a regulatory function) were annotated to SNPs. The nearest gene to each SNP was identified using HaploReg v4.1, a tool for exploring annotations of the noncoding genome in variants on haplotype blocks, using the RefSeq genes [25].

Statistical Analysis
Genome-wide association analyses were performed using PLINK [26,27] and a logistic-regression model for the two pooled GWAS datasets. A meta-analysis was conducted using METAL software [28]. We used Cox proportional hazards modeling with time-varying covariates to estimate age-adjusted and multivariable hazard ratios (HRs) and 95% confidence intervals (CIs) for incident obesity. Each participant's person-time of follow-up was calculated from the return date of the baseline questionnaire (i.e., 1986 for the NHS1 and HPFS) to the date of obesity diagnosis based on the BMI value (>30 kg/m²), death, or the end of follow-up, whichever occurred first [29]. Males and females who reported a BMI > 30 kg/m², had cancer or diabetes, or who died were excluded from the subsequent follow-up. Among the covariates, age, level of physical activity, trans fat level, and total energy intake were added to the model as continuous variables. Smoking status and alcohol consumption level were added as categorical variables. Interaction effects of the PRS and environmental factors on incident obesity were also tested by including the interaction terms in the regression models. Adjusted multivariable HRs for both cohorts were pooled using fixed-effect meta-analysis, using either SAS software (version 9.3; SAS Institute Inc., Cary, NC, USA) or R software (version 4.0.0; R Foundation for Statistical Computing, Vienna, Austria) [30].
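The pooling step can be illustrated with a minimal sketch of inverse variance-weighted, fixed-effect meta-analysis on the log-HR scale. This is not the SAS or R code used in the study; the function names are hypothetical, the standard errors are back-calculated from the 95% CIs (an assumption that the intervals are Wald intervals), and the two input estimates are invented for illustration.

```python
import math
from typing import Sequence, Tuple

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def se_from_ci(lower: float, upper: float) -> float:
    """Standard error of log(HR), assuming a Wald 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


def pool_fixed_effect(
    estimates: Sequence[Tuple[float, float, float]]
) -> Tuple[float, float, float]:
    """Pool (HR, CI lower, CI upper) tuples; return pooled (HR, lower, upper)."""
    weights = [1.0 / se_from_ci(lo, hi) ** 2 for _, lo, hi in estimates]
    log_hrs = [math.log(hr) for hr, _, _ in estimates]
    total_weight = sum(weights)
    pooled_log = sum(w * x for w, x in zip(weights, log_hrs)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - Z_95 * pooled_se),
        math.exp(pooled_log + Z_95 * pooled_se),
    )


# Hypothetical cohort-specific estimates (not values from the study).
cohort_a = (1.20, 0.90, 1.60)
cohort_b = (1.50, 1.10, 2.05)
pooled = pool_fixed_effect([cohort_a, cohort_b])
print(tuple(round(v, 3) for v in pooled))
```

Each cohort contributes in proportion to the inverse of its variance, so the more precise estimate pulls the pooled HR toward itself and the pooled CI is narrower than either input interval.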

Characteristics According to PRS between Males and Females
Baseline age-adjusted descriptive statistics were determined according to the highest, intermediate, and lowest thirds of the PRS in the NHS1 and HPFS cohorts (Table 2). Females were more likely to be current smokers at baseline. Males generally consumed higher levels of alcohol, had higher total energy intake, a higher level of trans fat intake, and greater total fiber intake, whereas females consumed more coffee, fruits, and sweetened beverages at baseline. Males and females had similar healthy eating indexes and sleep times. However, the males had higher glycemic load levels than the females and consumed more ice cream, chocolate, and cake at baseline. The thirds of the PRS values ranged from 631 (low) to 665 (high), and the participants with higher PRS had higher intakes of chocolate, ice cream, and cake.

Genotype of SNP rs4861982 and Obesity
Among the SNPs associated with sweetness preference, rs4861982 was associated with obesity in the NHS1 and HPFS cohorts (Table 3). In males, the TT genotype of rs4861982 significantly increased the risk of obesity as compared to the GG genotype in the HPFS cohort in Model 1 (HR = 1.596; 95% CI, 1.145–2.225; p = 0.006) and Model 2 (HR = 1.565; 95% CI, 1.122–2.184; p = 0.008). Similar results were obtained in the pooled analysis in Model 1 (HR = 1.262; 95% CI, 1.032–1.543; p = 0.023) and Model 2 (HR = 1.259; 95% CI, 1.030–1.540; p = 0.025). In females, however, the TT genotype was not significantly associated with obesity. The association of the TG genotype of rs4861982 with obesity was negligible in both males and females, as well as in the pooled analysis. A regional association plot for the reference SNP rs4861982 is shown in Figure 2. SNPs are plotted with the negative logarithm of the associated P-value as a function of the genomic position ranging from chr4:181966644 to chr4:182966644 (Genome Reference Consortium Human Build 37, GRCh37). The most strongly associated SNP was found on chromosome 4 at nucleotide position 182466644 (rs4861982, purple diamond) [31].
Notes to Table 3: Model 2 is adjusted for age, smoking status, alcohol consumption, physical activity, trans fat, total energy intake, and race. Follow-up in NHS1 was from 1986 to 2014. Follow-up in HPFS was from 1986 to 2014. Results of the two cohorts are pooled by means of inverse variance-weighted, fixed-effects meta-analysis (all P-values for heterogeneity).
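As a consistency check on the reported estimates, a two-sided P-value can be approximately recovered from an HR and its 95% CI. The sketch below assumes the intervals are Wald intervals on the log-HR scale, which the text does not state explicitly; the function name is hypothetical. The inputs are the Model 2 estimates for the TT genotype in males.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_p_from_hr(hr: float, lower: float, upper: float) -> float:
    """Two-sided P-value implied by an HR and its 95% CI (Wald assumption)."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    z = math.log(hr) / se
    # For a standard normal Z, P(|Z| > z) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


p_hpfs = wald_p_from_hr(1.565, 1.122, 2.184)    # HPFS, Model 2
p_pooled = wald_p_from_hr(1.259, 1.030, 1.540)  # pooled, Model 2
print(round(p_hpfs, 3), round(p_pooled, 3))
```

Under this assumption, the recovered values agree with the reported p = 0.008 and p = 0.025 after rounding to three decimal places.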


Correlation Effects of PRS and Environmental Factors on Obesity
Different socioeconomic or cultural factors [32] and family environments can increase the prevalence of obesity through food supply, caloric intake, and physical activity [33]. Therefore, we further investigated the ability of genetic analysis, in the form of a PRS, to identify individuals who are at high risk of obesity. Correlation analyses between the PRS and environmental factors, such as smoking and dietary components, are summarized in Figure 3. Including the Alternate Healthy Eating Index (AHEI), glycemic load (GL), other dietary factors, and physical activity in our models did not show significant PRS interactions with the risk of obesity. No consistent correlation effects were found between any type of dietary intake and the PRS regarding obesity risk. However, each 10 pack-year increment of smoking, in the interaction with PRS per additional 10 risk alleles, was associated with an increased risk of obesity in the HPFS cohort (HR = 1.024; 95% CI, 1.000–1.048), as shown in Figure 3. These findings suggest that smoking modifies the association between the PRS and obesity in males.


Discussion
CTNND2, the gene identified here, is expressed within proliferating neuronal progenitor cells of the neuroepithelium and in the dendritic compartment of postmitotic neurons [34]. Genetic variation in CTNND2 has been reported to be involved in neuroplastic processes in the olfactory pathways of rats [35] and is associated with human neurodevelopmental phenotypes, such as autism [34] and intellectual disability [36]. CTNND2 is identified as an adhesion-related molecule in human periodontal ligament cells [37] and is located at 5p15.2, predominantly expressed in the brain with distinct regional expression patterns [38]. Despite several genome-wide studies implicating polymorphisms within CTNND2 [39], a definitive role of CTNND2 in the pathogenesis of obesity has not yet been determined.
Several genes have been reported to be related to obesity in previous studies. Among these genes, WBP1L was found to be related to BMI in a meta-analysis of the epigenome-wide association study in REGICOR (REgistre GIroní del COR) [40]. The expression of the cell-proliferation marker MKI67 was also significantly increased in the endometrial polyps of postmenopausal females with obesity, suggesting that the BMI influences the proliferation marker [41]. The SNP most strongly associated with sweetness preference, with the smallest P-value (p = 9.68 × 10⁻⁸), is rs1457538 on chromosome 12 (Table 1). The nearest gene to rs1457538 is PTPRO, which is a member of the R3 subfamily of receptor-like protein tyrosine phosphatases and is associated with adipose tissue. The PTPRO gene is upregulated in the adipose tissues of obese individuals [42]. The roles of PTPRO have been reported to involve the control of glucose and lipid metabolism, obesity-induced systemic inflammation [43], and the inactivation of the insulin receptor [44]. Although the heterogeneity (I²) for rs1457538 is relatively high, the expression of PTPRO may contribute to the risk of obesity, implying a positive correlation between sweetness preference and obesity risk.
One important finding of this study is that the closest gene to rs4861982 (p = 2.68 × 10⁻⁶; implicated in this study) is LINC00290, a long intergenic non-protein-coding RNA gene. Interestingly, LINC00290 interacts with sodium arsenite, a naturally occurring component of sediment and groundwater, which makes human exposure inevitable [45]. Both arsenic exposure and obesity are prevalent and widespread [46], and arsenic exposure can affect gene regulation at both the transcriptional-initiation and splicing levels [45]. Coincidentally, our epidemiological analysis showed that rs4861982, whose nearest gene reportedly interacts with sodium arsenite, significantly increased the risk of obesity in males but not in females. Given that this cohort study was based on a subset of the NHS1 and HPFS with an age range of 50–67 years, it is plausible that the finding that the TT genotype was significant only in males was partially influenced by the onset of menopause, postmenopausal hormonal differences between females and males, and differences in socioeconomic status between females and males [47,48].
Studies have shown that energy imbalance and metabolic disorders can lead to obesity by affecting other contributing or predisposing factors. The interplay between obesity and environmental factors, such as obesogenic infectious agents, toxic chemicals, genetic influences, epigenetic influences, the gut microbiome, and brown or beige fat, can make certain groups more susceptible to obesity [9]. Environmental factors like smoking can affect energy balance and metabolism, thereby increasing the risk of obesity. We found a significant and unique gene-environment interaction of the PRS with smoking in males, in which each 10 pack-year increment in smoking, per additional 10 risk alleles, was associated with an approximately 2.4% higher risk of obesity (HR = 1.024). This result suggests that smoking, which typically results in weight loss, masks the PRS effect and contributes to a mere 2.4% increase in the obesity risk, whereas the TT genotype of rs4861982 is associated with a substantially higher risk of obesity (56.5%; HR = 1.565 in the HPFS cohort). Further, cross-sectional studies indicate that the mean BMI tended to be lower among smokers than among nonsmokers in many populations [49]. We found no significant interaction between the PRS and smoking in females. This may be because the TT genotype was significant only in males, partially due to the onset of menopause and postmenopausal hormonal changes, as mentioned above [47,48]. An important aspect of this study was the discovery that genome-wide PRS, along with a single SNP, can quantify hereditary obesity and identify adults at risk for obesity based on sex. While many studies have examined the association of a single SNP or genotype with obesity, this study uniquely investigated the effect of PRS on obesity as well as the correlated effects of environmental factors.
A limitation of this study is that measurement errors of self-reported behaviors are inevitable. The inherent biases associated with self-reporting require educating participants on how to use the devices involved in data acquisition. Because BMI is an indirect estimate of body fat, individuals of different heights or body builds with different proportions of total body fat may exhibit similar BMI scores [50], which is a further limitation of this study. Additionally, the limitations that stem from cohort specificity, environmental factors, and gender disparity need to be addressed to overcome the heterogeneity of SNPs and their nearby genes related to obesity. This could be achieved by maximizing the sample size and refining samples by focusing on specific variables at onset and recurrence. The molecular mechanisms by which SNPs related to sweetness preference and their related genes participate in the pathogenesis of obesity need to be studied further.

Conclusions
This study verified that sweetness preference-related polymorphisms are associated with sex-dependent variation in obesity and that correlations between sweetness preference-related SNPs and environmental factors affect obesity. Specifically, significant associations between rs4861982 and both sweetness preference and obesity were identified. Compared to the GG or TG genotype, the TT genotype of rs4861982 significantly increased the risk of obesity among males. Among environmental factors, smoking was significantly associated with the correlations between the PRS and obesity in males but not in females.

Figure 1. Manhattan plot displaying significant SNPs according to their −log10 P-values (shown on the y-axis). The SNPs are ordered by their chromosomal position along the x-axis. Each dot on the Manhattan plot signifies an SNP, and the SNP with the strongest association (rs1457538), i.e., with the smallest P-value (p = 9.68 × 10⁻⁸), was on chromosome 12.


Figure 2. Regional association plot for the reference SNP, rs4861982. SNPs are plotted by displaying the negative logarithm of the associated P-value as a function of the genomic position. The SNP rs4861982 is located on chromosome 4 at nucleotide position 182466644 (based on GRCh37); it was identified as the most strongly associated SNP on chromosome 4 (rs4861982, purple diamond).


Figure 3. Correlation of PRS with environmental factors (such as lifestyle and dietary components) in the HR of obesity. Different colors represent distinct variables, as indicated in the figure. The forest plots show HRs and 95% CIs for interactions between the PRS (per 10 risk alleles) and changes in environmental factors (10-increment servings per month). The results are adjusted for the same set of variables shown in Table 3. The results for two cohorts are pooled by means of inverse variance-weighted, fixed-effects meta-analysis. All P-values for heterogeneity are >0.05.


Table 2. Age-standardized characteristics according to PRS in thirds among US males and females in the NHS1 and HPFS.

Table 3. Adjusted HR (95% CI) of obesity for genotypes of the SNP rs4861982 in the NHS1 and HPFS.