Mapping the FACT-G to EQ-5D Utility Index in Cancer: Data From a Cross-sectional Study in China

This study aimed to develop a function for mapping a cancer-specific instrument (FACT-G) to the utility index of a preference-based measure (EQ-5D-3L) for HRQoL, with utility scores generated using the Chinese value set. The data come from a cross-sectional survey of 243 patients in China with different cancer types. Patients who completed both the EQ-5D-3L and the FACT-G questionnaires were included, together with their demographic and clinical characteristics. EQ-5D-3L utility index values were predicted from the four FACT-G subscale scores using ordinary least squares (OLS), generalized linear model (GLM), censored least absolute deviations (CLAD), Tobit, and two-part model (TPM) regression approaches. The performance and predictive power of each model were evaluated using r², adjusted r², mean absolute error (MAE), and root mean squared error (RMSE).


with demographic characteristics, and results are similar when repeated measures of preferences are used in different populations in different countries [9,10]. In the past, most researchers used the United Kingdom value set to estimate health utility values. Today, however, most countries have established country-specific value sets because a utility algorithm based on one population does not necessarily apply to others. Liu et al. developed Chinese population-specific EQ-5D-3L health state values using the time trade-off (TTO) method in 2014 [11]. Thus, we used the value set based on the Chinese general population in this study.
However, preference-based instruments are not always included in clinical trials because many of their dimensions may not be relevant or sensitive to therapeutic effects. Disease-specific instruments are used more often than generic preference-based measures to assess HRQoL because they provide more specific detail about patients' assessment of a particular disease, but they do not yield health utility scores or QALYs directly. Moreover, they provide only ordinal-level measurement scales, which limits their usefulness in health economic analysis [12]. The Functional Assessment of Cancer Therapy-General (FACT-G) is one of the most widely used cancer-specific HRQoL instruments and is a non-preference-based measure [13]. The FACT-G is used in clinical trials to assess quality of life (QoL), with higher values representing better QoL, and its reliability and validity have been established [14]. Unlike preference-based instruments, however, it does not produce the health utility index needed for economic evaluation. The lack of health utility values limits health economics research. One solution is to develop a mapping algorithm that converts HRQoL scores collected with non-preference-based instruments into scores on generic preference-based instruments [15]. A mapping function allows health utility values to be predicted when only non-preference-based measures are available and utility data were not collected directly. A growing body of literature suggests that disease-specific instruments can be mapped to generic preference-based measures using regression models for health economic evaluation [4,15-17].
Mapping algorithms are developed using data from both instruments, and predicted values are compared with observed values. Such functions have been developed not only for cancer-specific HRQoL instruments (including lung, breast, prostate, colorectal, and melanoma cancer [16,18-21]) but also in other areas, such as HIV [22], cystic fibrosis [4], and genital warts [5]. However, few studies have developed and evaluated a mapping from the FACT-G to the EQ-5D-3L using regression analysis, because most studies use responses from patients with a single cancer type. A study from Canada mapped the FACT-G to both the EQ-5D-3L and SF-6D health utility indices [23]. Another study evaluated the validity of the FACT-G and of preference-based instruments (including the EQ-5D-3L, SF-6D, HUI-2, and HUI-3) in assessing cancer severity levels in Canadian patient data [24].
Another mapping from the FACT-G to the EQ-5D-3L health utility index, conducted in Singapore, showed that a single equation can be applied to different versions of the FACT-G [25]. However, no studies have converted the FACT-G to the EQ-5D-3L with mapping algorithms in the Chinese population, and utility value sets are inconsistent across countries [26].
Therefore, it is necessary to develop a health utility mapping from the FACT-G to the EQ-5D-3L for Chinese patients. Some studies have shown that adding socio-demographic and clinical factors can improve the accuracy of mapping models, thus affecting the health utilities used in cost-utility analysis [10]. These studies also compared different regression methods to identify the more accurate models [16].
The objective of the present study was to develop a mapping algorithm that estimates EQ-5D-3L utility values from the FACT-G for economic evaluation in the Chinese population. We used five regression models to account for ceiling effects and for possible violations of normality and homoscedasticity, to better estimate patients' health status, and to provide recommendations for future mapping studies.

Methods And Materials
Study design and data collection
The Cancer Screening Program in Urban China, a major public health service project supported by the central government of China since August 2012, was designed to provide screening for lung, breast, colorectal, liver, stomach, and esophageal cancers [27]. A multicenter cross-sectional study was conducted in 12 provinces between September 2013 and December 2014, with appropriate screening interventions targeted at specific types of cancer [28]. The study protocol was approved by the Institutional Review Board of the Cancer Hospital of the Chinese Academy of Medical Sciences (Approval No. 15-071/998). All participants gave written informed consent. The EQ-5D-3L is referred to as EQ-5D in the rest of the article.
This study included 243 subjects who met the following criteria: aged 40-74 years; able to provide written informed consent; diagnosed with lung, breast, stomach, esophageal, colorectal, or liver cancer; and completed both the EQ-5D and the FACT-G, including its subscales. Exclusion criteria were refusal to sign the consent form, non-cancer subjects, missing or duplicate questionnaire responses, and inability to understand the questions or record evaluations. The questionnaire survey was conducted through face-to-face interviews between the investigator and the followed-up subjects.
Patients completed a health and demographic questionnaire covering age, sex, marital status, level of education, family size, employment, family financial pressure, and significant life events. The questionnaires were completed face-to-face with community doctors, who had been trained by research assistants. The scales completed by patients were the EQ-5D and the FACT-G.

EQ-5D Scale
The EQ-5D is a generic preference-based instrument that provides a simple and universal health measurement method for clinical and economic evaluation. It is a two-part questionnaire. The EQ-5D descriptive system consists of five dimensions (mobility, self-care, usual activities, pain/discomfort, and anxiety/depression), each with three levels indicating no problems, some or moderate problems, and extreme problems [29]. EQ-5D index scores are calculated using an algorithm based on societal preferences from a general population valuation. We calculated EQ-5D utility values with the algorithm of Gordon G. Liu et al., which was derived from a TTO survey of 1147 Chinese respondents [11]. The EQ-5D utility index ranges from -0.149 to 1, where 1 indicates full health, 0 indicates a state equivalent to death, and a negative value indicates a health state considered worse than death [30].
No negative values were observed in this study, however. In the second part of the questionnaire, respondents rate their current health on a 20-centimeter visual analog scale (VAS) that ranges from 0 (worst imaginable health state) to 100 (best imaginable health state) [14].

FACT-G Scale
Participants also completed the FACT-QoL questionnaire in the version specific to their tumor type. The FACT-G is scored by summing the individual subscale scores, with higher scores indicating better QoL. The FACT-G produces four subscale scores that reflect the patient's QoL: physical well-being (PWB) (7 items), social/family well-being (SFWB) (7 items), emotional well-being (EWB) (6 items), and functional well-being (FWB) (7 items) [31]. The additional scales differ by cancer type.
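The summation rule described above can be illustrated with a minimal Python sketch. It assumes that each item is coded 0 to 4 and that negatively worded items have already been reverse-scored; the function name `score_fact_g` and the example responses are hypothetical, and handling of missing items is left out.

```python
from typing import Dict, List

# Subscale sizes as stated in the text: PWB 7, SFWB 7, EWB 6, FWB 7 (27 items).
SUBSCALE_ITEMS: Dict[str, int] = {"PWB": 7, "SFWB": 7, "EWB": 6, "FWB": 7}


def score_fact_g(responses: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum item scores per subscale, then sum the subscales into a total.

    Assumption: items are coded 0-4 and already reverse-scored where needed,
    so a higher value always means better quality of life.
    """
    scores: Dict[str, int] = {}
    for name, n_items in SUBSCALE_ITEMS.items():
        items = responses[name]
        if len(items) != n_items:
            raise ValueError(f"{name} expects {n_items} items, got {len(items)}")
        if any(not 0 <= x <= 4 for x in items):
            raise ValueError(f"{name} items must lie between 0 and 4")
        scores[name] = sum(items)
    scores["FACT-G total"] = sum(scores[name] for name in SUBSCALE_ITEMS)
    return scores


# Hypothetical responses for one patient (illustration only).
example = {
    "PWB": [4, 4, 3, 4, 4, 3, 4],
    "SFWB": [3, 3, 2, 3, 3, 2, 3],
    "EWB": [3, 4, 3, 3, 4, 3],
    "FWB": [3, 3, 3, 2, 3, 3, 3],
}
print(score_fact_g(example))
```

Under this coding the FACT-G total ranges from 0 to 108.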
The cancer-specific HRQoL instruments consist of the 27-item FACT-G plus several items from an additional-concerns scale specific to the cancer type: a breast cancer subscale, a lung cancer subscale, an esophageal cancer subscale, a colorectal cancer subscale, a gastric cancer subscale, or a hepatocellular carcinoma subscale. All items were rated on a 5-point Likert scale, with higher scores indicating better HRQoL.
Previous studies have suggested that the OLS model may not be appropriate when preference-based scores are highly skewed [23]. The ceiling effect may also invalidate the normality assumption of OLS [32]. The GLM with a Gamma family and identity link relaxes the distributional assumption of OLS and thus accommodates the skewed distribution of utility values.

The Tobit model is an alternative that accounts for the ceiling effect and keeps predictions within a credible range. However, it is sensitive to non-normality and heteroscedasticity. The CLAD model minimizes the sum of absolute differences between observed and predicted values; it relies on the median, which is more resistant than the mean to ceiling effects, and is therefore also a possible solution to the heteroscedasticity problem [22,32]. The TPM is specifically designed for limited dependent variables and divides the data into two parts: respondents in perfect health and those who are not.
In the first part of the TPM, logistic regression predicts the probability that the EQ-5D utility is at the ceiling; in the second part, a truncated OLS predicts the EQ-5D index for individuals whose utility is below the ceiling. The two parts are combined to obtain the overall utility value [19,22].
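How the two parts of a TPM can be combined into a single prediction is sketched below in Python. This is an illustration under assumptions, not the fitted model from this study: the combination rule (probability of being at the ceiling times 1, plus the complementary probability times the below-ceiling prediction) is one common choice, and the function name `tpm_predict` and all coefficients are hypothetical.

```python
import math
from typing import Sequence


def logistic(z: float) -> float:
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def linear(coefs: Sequence[float], x: Sequence[float]) -> float:
    """Intercept (coefs[0]) plus the dot product of slopes and predictors."""
    return coefs[0] + sum(b * v for b, v in zip(coefs[1:], x))


def tpm_predict(
    x: Sequence[float],
    logit_coefs: Sequence[float],
    ols_coefs: Sequence[float],
    ceiling: float = 1.0,
) -> float:
    """Expected utility from a two-part model.

    Part 1: logistic regression gives P(utility is at the ceiling).
    Part 2: a linear model gives the utility expected below the ceiling.
    Assumed combination: p * ceiling + (1 - p) * below-ceiling prediction.
    """
    p_ceiling = logistic(linear(logit_coefs, x))
    below = linear(ols_coefs, x)
    return p_ceiling * ceiling + (1.0 - p_ceiling) * below


# Hypothetical coefficients and a single hypothetical predictor value.
prediction = tpm_predict(x=[20.0], logit_coefs=[-2.0, 0.1], ols_coefs=[0.4, 0.02])
print(round(prediction, 3))
```

In this toy case the probability of being at the ceiling is 0.5 and the below-ceiling prediction is 0.8, so the combined prediction lies halfway between 0.8 and 1.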
Five model specifications were used to develop the mapping functions, and each was estimated with the OLS model, GLM, Tobit model, CLAD model, and TPM. We added squared terms and interaction terms to improve model accuracy, as suggested in the literature [4]. Model 1 regresses the EQ-5D utility on the FACT-G overall score.
We calculated the goodness of fit of each model to assess how well the responses to the FACT-G predicted EQ-5D utility. Goodness of fit was measured using the mean absolute error (MAE), mean square error (MSE), root mean square error (RMSE), mean absolute percentage error (MAPE), Akaike information criterion (AIC), and Bayesian information criterion (BIC), which summarize the differences between observed and predicted EQ-5D utility; lower values indicate better model performance. The coefficient of determination, R², and adjusted R² indicate how well the model explains the values in OLS, but they are not available for the other regression models. Instead, we computed the square of the correlation coefficient (r) between the observed and predicted values of each model, with r² being equivalent to R² in OLS [26]. To penalize model complexity, we defined the adjusted r² as follows:
$$\text{adjusted } r^2 = 1 - \frac{(1 - r^2)(n - 1)}{n - p - 1}$$
where n represents the sample size and p is the number of parameters in the model. Predictive ability was evaluated with a paired t-test comparing the observed and mapped EQ-5D utility scores. Differences in EQ-5D utility scores from the different models across demographic and clinical features were examined by non-parametric analysis. We selected the models with the lowest MAE and RMSE and the highest r² and adjusted r² as the best performing. Observed and predicted EQ-5D utility values were compared across patients with different demographic and clinical characteristics in the different models.
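The error-based criteria and the r² measures described above can be computed as in the following Python sketch. The function name `fit_metrics` and the toy data are hypothetical, and the adjusted r² uses the common form 1 - (1 - r²)(n - 1)/(n - p - 1), which is an assumption here. AIC and BIC are omitted because they depend on each model's likelihood.

```python
import math
from typing import Dict, Sequence


def fit_metrics(
    observed: Sequence[float], predicted: Sequence[float], n_params: int
) -> Dict[str, float]:
    """Goodness-of-fit summaries for observed versus predicted utilities."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # MAPE divides by the observed value, so it requires non-zero observations.
    mape = 100.0 * sum(abs(e / o) for e, o in zip(errors, observed)) / n

    # Squared Pearson correlation between observed and predicted values.
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    r2 = cov * cov / (ss_o * ss_p)

    # Assumed penalty for model complexity (n_params plays the role of p).
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_params - 1)
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape,
            "r2": r2, "adj_r2": adj_r2}


# Toy data for illustration only; these are not study results.
obs = [1.0, 0.8, 0.6, 1.0]
pred = [0.9, 0.8, 0.7, 1.0]
for name, value in fit_metrics(obs, pred, n_params=1).items():
    print(f"{name}: {value:.4f}")
```

Lower MAE, MSE, RMSE, and MAPE indicate a closer fit, while higher r² and adjusted r² indicate that predictions track the observed values more closely.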
All statistical analyses were performed in Stata version 14.1. All hypothesis tests were two-tailed, and a p-value < 0.05 was considered statistically significant.

Results
A total of 243 cancer patients were included in this analysis. Demographic and clinical features are summarized in Table 1.
Table 1 note: p values are from Wilcoxon rank-sum tests for two categories or Kruskal-Wallis tests for more than two categories.
The distributions of the EQ-5D utility index and the FACT-G total and subscale scores are described in Table 2. The mean EQ-5D utility index and the mean FACT-G total score were 0.935 and 82.7, respectively. The EQ-5D utility ranged from 0.364 to 1.000, and the median was 1.000, with 62.7% of the subjects having the highest score. Ceiling effects were also observed for the FACT-G scores: PWB (27.2%), EWB (6.2%), SFWB (7.8%), and FWB (5.8%), whereas the ceiling effect for the FACT-G total score was negligible (0.8%). A Cronbach's α of at least 0.7 was considered satisfactory.
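The ceiling percentages and Cronbach's α reported in this section can be computed along the following lines. This Python sketch uses only the standard library; the function names and the toy inputs are hypothetical, and α is computed with the usual formula k/(k - 1) × (1 - sum of item variances / variance of the total score).

```python
from statistics import variance
from typing import Sequence


def ceiling_share(scores: Sequence[float], max_score: float) -> float:
    """Percentage of respondents scoring at the maximum of the scale."""
    at_ceiling = sum(1 for s in scores if s >= max_score)
    return 100.0 * at_ceiling / len(scores)


def cronbach_alpha(items: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha; items[i][j] is respondent i's score on item j."""
    k = len(items[0])
    item_vars = [variance([row[j] for row in items]) for j in range(k)]
    total_var = variance([sum(row) for row in items])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)


# Toy inputs for illustration only; these are not study data.
utilities = [1.0, 1.0, 0.8, 0.6]
print(ceiling_share(utilities, max_score=1.0))

item_scores = [[1, 1], [2, 2], [3, 3]]  # two perfectly consistent items
print(cronbach_alpha(item_scores))
```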
All scales exceeded the threshold for good reliability (α ≥ 0.8), except for the SFWB subscale (α = 0.683). Both the EQ-5D utility and the FACT-G scores were negatively skewed. Spearman's rank correlation coefficients between the EQ-5D and the FACT-G (total and four subscales) are shown in Table 3. Most of the scores showed moderate to high correlations. The correlation coefficient between the EQ-5D utility index and the FACT-G total score was 0.5382. SFWB showed negligible correlations with the other scores, except for the FACT-G total, EWB, and FWB scores. All correlations were significant at the 0.05 level after Bonferroni correction.
The distributions of the observed and predicted utility values are summarized in Table 6. Table 7 presents the mean observed and predicted EQ-5D utility values by the statistically significant demographic and clinical characteristics for the best models from the different regression algorithms. Compared with TPM 3 and Tobit 3, the estimated health utilities from OLS 5, GLM 5, and CLAD 5 were closer to the observed EQ-5D values. We also found that all models tended to overpredict at the top and bottom ends of the EQ-5D utility range, because few respondents in the data set reported severe problems. In addition, the CLAD model predicted more accurately than the other models in the presence of heteroscedasticity and various misspecification and confounding factors.

Discussion
Several mapping studies have been developed to predict generic preference-based scores from non-preference-based scores in cost-utility analyses. We developed an algorithm that maps the FACT-G to the EQ-5D health utility index for the general cancer population, based on data collected in a cross-sectional study in China.
The current study suggests that the predicted and observed EQ-5D utility scores were reasonably consistent across the five modeling approaches. It also confirms that non-preference-based FACT-G scores can be used to estimate EQ-5D health utility scores through a mapping function. Our findings suggest that the OLS model, GLM, and CLAD model are better for predicting EQ-5D utilities than the Tobit model and TPM in terms of goodness of fit and model performance.
In this study, the coefficients of PWB, EWB, and FWB were significant in most models for all the regression algorithms, whereas the coefficient of SFWB was not significant in most models. Hence, the SFWB score of the FACT-G was not included in the regression model. We found that the SFWB was not statistically significant and showed weak correlations with the EQ-5D utility index compared with the other FACT-G subscales and the EQ-VAS. Previous research showed that mapping studies tended to confirm the predictive ability of health utility more easily when exploring the correlation between the EQ-5D and FACT-G scales [3,32,33]. Furthermore, the SFWB was not statistically associated with the EQ-5D utility index due to the lack of social-related functions [23]. Some previous studies also showed that the coefficients of SFWB were negative in regression models, in mapping studies of the FACT-P [8], FACT-L [32], FACT-G [25], and FACT-B [19]. That is to say, social/family well-being has no direct relevance to the individual's health utility, as it is unlikely that better social well-being would decrease health utility.
In this study, the OLS model had the largest r² and adjusted r² among the regression models. The r² and adjusted r² values of all models were larger than 0.5, except for Model 1, indicating good explanatory power. A systematic review of mapping studies showed that a more complex approach, including interaction terms, squared terms, and non-health-related variables (e.g., socio-demographics), was feasible and improved model accuracy as measured by r² and adjusted r² [34]. Therefore, we used squared terms and interaction terms in Models 4 and 5 to improve model performance and prediction accuracy. In previous studies, explanatory power in terms of r² ranged from 0.417 to 0.909 [15]. The r² of OLS 5 reached 0.623, indicating that the model performed well. Predictive performance was assessed by examining the difference between predicted and observed values using the MAE and RMSE. The OLS model had the lowest MAE, and the TPM had the lowest RMSE. Taking the MAE as the predictive criterion, the OLS model, GLM, and CLAD model had similar goodness of fit, whereas the Tobit model and TPM did not perform well in the current study. A literature review [34] showed that MAE values ranged from 0.0011 to 0.19, representing a margin of error of up to 15% of the range of the preference-based measure, which reflects the uncertainty of the mapping estimation [35].
Our findings demonstrate that the OLS model had the best goodness of fit of the models compared. This result is consistent with previous studies mapping a disease-specific instrument to a preference-based instrument [15,19,36]. The EQ-5D showed a high ceiling effect in this study (62.7%), similar to a previous study of the general population in the United States, in which a relatively high proportion of EQ-5D index scores (50%) were at the ceiling [37]. Most mapping functions have been estimated with the CLAD model, Tobit model, and TPM because the collected data suffer from high ceiling effects. One study found that, of the OLS and CLAD models, the CLAD model produced values closer to the observed health utility values when mapping the FACT-G to EQ-5D health utility [25]. Previous studies showed that CLAD and OLS models perform better than the Tobit model in mean prediction when developing a mapping algorithm [38].
Although the Tobit and CLAD models allow the preference-based data to be censored and censor the predicted values at 1, they perform poorly under serious heteroscedasticity and non-normality and should not be used for estimating the health utility index in economic evaluation [39]. We also used the GLM with the Gamma family and an identity link function in place of the Gaussian family, and it produced results similar to those of the OLS model. However, contrary to previous findings [19,20,22], the TPM in this study showed worse predictive performance than the other models [4]. In addition, there may be a multicollinearity problem in the TPM, and the specific data distribution may lead to misspecification. It is also possible that the model's predictions were too low because few patients in poor health responded to the survey. Nevertheless, although the TPM did not perform well here, it can deal with the boundary problem in most situations when the second part of the TPM uses a more stringent specification.
Most previous studies have explored the predictive performance of models when developing a mapping function to estimate EQ-5D preference-based health utility values from the disease-specific FACT-G for cancer patients in cost-utility analysis. These studies show that mapping work focuses on the selection criteria for model performance and predictive ability. The most appropriate model depends on the data and the way they are applied. Some mapping studies have assessed the predictive performance of disease-specific measures with statistical criteria based on the MAE, MSE, and RMSE, which compare the predicted and observed values [2,4,23]. However, although the values of r², MAE, and RMSE were higher than those of previous studies, the overall predictive ability was not satisfactory. We used the mean scores of the FACT-G instead of the patient's prediction when predicting the mean of the EQ-5D utility index [40]. In addition, uncertainties or errors in the economic assessment may affect the accuracy of the utility value, leading to an incorrect estimation of the patient's HRQoL, and further research is needed to assess their impact on the mapping algorithm [2].
Our current study demonstrated that the OLS model predicted EQ-5D utility from the FACT-G effectively in the Chinese cancer population. Although the OLS model is a common mapping algorithm and its predicted values are close to the true values, it requires very strict assumptions, namely normal distribution and homogeneity of variance. In addition, previous studies have shown that OLS can have low predictive ability. As in previous studies, utility was overestimated for those in poor health and underestimated for those in good health, as shown in Table 6 [15,41]. Therefore, as reported by Zhang et al., the mapping algorithm is suited to predicting the average utility index at the population level in cost-utility analysis, rather than utility at the individual level [25]. When the data distribution is heavily skewed, it is important to consider the proportion of cancer patients who are in poor health.

There are several limitations to this study. First, the study suffers from a high ceiling effect in the health utility index, which was 62.7% for Chinese cancer patients, leading to a high mean EQ-5D value of 0.9353. This may be due to the small proportion of people in poorer health. Moreover, we did not observe any negative EQ-5D utility values, which limits the generalizability of the outcomes to more severely ill patients. That is, the study did not cover the full potential range of scores of the Liu et al. algorithm. Recently published studies have suggested that an increasing number of countries are using the EQ-5D-5L as a preference-based measure instead of the EQ-5D-3L because it reduces ceiling effects and improves sensitivity [42]. In addition, the Chinese value set of the EQ-5D-5L was published in 2017 [43]. Second, the EQ-5D-3L value set we used is based on a general Chinese population preference sample, which may not apply to other countries due to cultural and other differences. Therefore, a mapping algorithm should be based on a value set developed according to the country's specific preferences for economic evaluation. Finally, the study collected a relatively small sample, and the data consisted of Chinese patients with five different types of cancer. Therefore, future studies need a larger sample size and external data to verify the generalizability of this study.

Conclusion
We developed a FACT-G to EQ-5D mapping algorithm for the economic evaluation of cancer patients in China. Among all the regression models, the OLS model, GLM, and CLAD model performed well and showed good predictive ability when observed and predicted EQ-5D utility scores were compared, so these three models are preferred for predicting EQ-5D values in mapping studies. The findings of this study may provide policymakers and researchers with references for the economic evaluation of specific health conditions in cost-utility analysis when estimating the health utilities of cancer patients.

Consent for publication
Not applicable. All results are reported as aggregated data.

Availability of supporting data
The datasets used and/or analyzed during the current study are available.

Competing interests
The authors declare that they have no competing interests.

Funding