The Journal of the Economics of Ageing

Health misperception and healthcare utilisation among older Europeans

Health misperception can have serious consequences for health. Despite its relevance, the role of such biases in determining healthcare utilisation is severely underexplored. Here we study the relationship between health perception and doctor visits for the population aged 50+ in Europe. We conceptualise health misperception as arising from either overconfidence or underconfidence, where overconfidence is measured as overestimation of health and underconfidence is measured as underestimation of health. Comparing objective performance measures and their self-reported equivalents from the Survey of Health, Ageing and Retirement in Europe, we find that individuals who overestimate their health visit the doctor 17.0% less often than individuals who correctly assess their health, which matters because doctor visits are crucial for preventive care such as screenings. In contrast, individuals who underestimate their health visit the doctor more often (21.4% more). Effects are similar for dentist visits, but we find no effects on hospital stays. The results are robust to several sensitivity tests and, more importantly, to various conceptualisations of the health perception measure.


Introduction
Biased perception of one's own ability is a hallmark of human nature. The literature in psychology, economics, and evolutionary biology has repeatedly demonstrated this phenomenon. Zell and Krizan (2014) conducted a meta-synthesis across different scientific areas and concluded that people have only moderate knowledge of their ability. Johnson and Fowler (2011) presented an evolutionary model of one such bias, namely, overconfidence, and the conditions under which it prevails. Over- and underconfidence biases have significant implications for education, labour market outcomes, savings, investment choices, and political decisions (Anderson et al., 2017; Ortoleva and Snowberg, 2015; Reuben et al., 2017). Moreover, the limited research available suggests that perception biases are particularly relevant for health, as they can directly affect risk for accident and injury (Preston and Harris, 1965; Sakurai et al., 2013) and have serious long-lasting effects on wellbeing and mortality. Recent work in this domain also shows that overconfidence is related to engagement in risky health behaviours (Arni et al., 2021), mental health (Nie et al., 2022), and adaptive behaviour during the COVID-19 pandemic (Spitzer et al., 2022).
Despite the relevance of health misperception, its role in healthcare seeking is largely unexplored. Here we study the relationship between misperception of one's own health and healthcare utilisation. We categorise misperception as arising from either overconfidence or underconfidence, which we derive from the objective performance measures in the Survey of Health, Ageing and Retirement in Europe (SHARE). We analyse differences between subjective and objective health based on individuals' self-reported and tested ability to stand up from a chair. Individuals who subjectively report being able to stand, but objectively are unable to do so, are classified as overconfident, whereas those who subjectively report being unable to stand, but objectively are able to, are classified as underconfident. Individuals who do not differ in their subjective report and objective assessment are classified as concordant. Prior research has shown the chair stand test to be a good predictor of overall objective health (Ferrer et al., 1999; Sainio et al., 2006; Pinheiro et al., 2016; Spitzer and Weber, 2019); nevertheless, we conduct robustness analyses based on alternative conceptualisations of health perception, using differences in subjective and tested cognition and walking ability.
Our study focuses on 15 countries in Europe, which provides an interesting setting for the analysis of health misperception and healthcare utilisation. Utilising health services is conditional on having access to such services; a fair comparison of utilisation requires no significant difference in accessibility among the entities being compared. Universal coverage in European countries ensures that everyone has a certain minimum level of access to the health system, unlike in the United States (OECD and European Commission, 2018). Also, Europe is a policy-relevant setting because of its rapidly ageing population (Lutz et al., 2003; Eurostat, 2019) and fiscal pressures to reduce expenditures and unnecessary care (Christensen et al., 2009; European Commission, 2018).
To assess utilisation, we use data on the annual number of doctor visits, which includes emergency room visits and outpatient clinic visits. Employing count models and a rich set of controls, we find that relative to individuals who achieve concordance (i.e., those who estimate their health accurately), individuals who underestimate their health visit the doctor more often (1.2 more visits per year). In contrast, individuals who overestimate their health visit the doctor less often (1.8 fewer visits per year). We find similar effects for dentist visits, but not for hospital stays, which often require a referral by a physician. Our results are robust to controlling for differences in health status and frailty as well as other individual characteristics, such as education, age, employment, or marital status. They also hold in further robustness analyses that consider different model specifications, estimation methods, sample constructions, and alternative measures of health perception.
The contribution of this study is twofold. First, we contribute to the growing literature that explores individuals' health misperception (Beaudoin and Desrichard, 2011; Coman and Richardson, 2006; Furnham, 2001), analyses heterogeneities (Spitzer and Weber, 2019), and assesses its relationship with outcomes such as health behaviours (Arni et al., 2021; Spitzer et al., 2022). This literature reports variation in health misperception by sociodemographic characteristics such as age (Crossley and Kennedy, 2001; Oksuzyan et al., 2019; Srisurapanont et al., 2017), gender (Merrill et al., 1997; Schneider et al., 2012), country of residence (Capistrant et al., 2014; Spitzer and Weber, 2019), education (Black et al., 2017), and race (Jackson et al., 2017). It also evaluates a closely related yet distinct concept termed 'reporting bias', 'reporting heterogeneity', or 'response error' (Ziebarth, 2010; d'Uva et al., 2008; Jürges, 2007; Choi and Cawley, 2017). Some sub-groups of the population may respond differently to subjective health questions in a systematic manner due to, for example, cultural differences, interpretation of the subjective questions, different reference points in evaluation of own health, age, gender, and education. Since distinguishing misperception from reporting bias empirically is not an easy exercise (Arni et al., 2021), we acknowledge the possibility of such an alternative interpretation of the misperception measure, i.e. misreporting, and allow a broader definition incorporating over- and underconfidence.
Second, the paper also contributes to the literature on the determinants of healthcare use. The difference between subjective and predicted survival probability affects healthcare utilisation (Bíró, 2016a), and individuals with higher expected longevity are more likely to go for cancer screening (Picone et al., 2004), suggesting that health perception affects healthcare utilisation. However, in explaining variation in health expenditures and healthcare utilisation, this literature focuses on either the supply side (i.e., provider confidence and precision) (Baumann et al., 1991; Berner and Graber, 2008; Cutler et al., 2013; Meyer et al., 2013) or easily observable demand characteristics (e.g., age, gender, income, social class, employment and education) (Bíró, 2013; Cameron et al., 2010; Tavares and Zantomio, 2017; Vallejo-Torres and Morris, 2013; Doorslaer et al., 2004; Zhang et al., 2018). Our paper makes a novel contribution by extending this research to assess a difficult-to-observe demand side variable that has consistently been shown to affect health.
The remainder of this paper is structured as follows. In Section ''Data and descriptive statistics'', we describe the data and variables. In Section ''Method'', we introduce our methodology. Section ''Results'' presents and discusses the results along with heterogeneity analyses and robustness tests, and Section ''Conclusion'' concludes the paper. Additional summary and output tables are provided in the Supplementary material.

Data and descriptive statistics
We analyse the relationship between health misperception and healthcare utilisation using the SHARE survey, a representative cross-country panel study of noninstitutionalised individuals aged 50 and older (Börsch-Supan et al., 2013). The survey provides rich information on health, healthcare utilisation, socioeconomic background, and social networks based on about 380,000 interviews. It is particularly well suited for studying European countries, as the data are ex-ante harmonised. Also, because it focuses on older individuals, who generally have higher healthcare needs than the young, it is the ideal data source for our analyses. SHARE was previously used to analyse healthcare utilisation by, among others, Bíró (2014), Bolin et al. (2009), Paccagnella et al. (2013), and Tavares and Zantomio (2017). Details on the sampling design and response rates are provided by, for example, Luca and Rossetti (2018), Bergmann et al. (2017), and Lynn et al. (2013).

Sample construction
The main analyses are based on SHARE Wave 2 (2006/2007) and Wave 5 (2013), because these waves include the chair stand test, which we use to determine our main measure of over- and underconfidence. For robustness, we use additional measures of health perception from other waves (Section ''Different dimensions of health perception''). After excluding all observations based on proxy respondents, a sample of 84,743 observations from 15 European countries remains, namely, Austria, Belgium, Czechia, Denmark, Estonia, France, Germany, Italy, Luxembourg, The Netherlands, Poland, Slovenia, Spain, Sweden, and Switzerland. For sensitivity analyses, we drop individuals with Alzheimer's disease, dementia, or another serious memory impairment, as their survey answers might not be reliable (Section ''Accounting for response reliability'').

Outcome variables: healthcare utilisation
In line with the literature, we use the annual number of doctor visits as our main measure of healthcare utilisation (see d'Uva and Jones, 2009; Bíró, 2016b; Bolin et al., 2009; Lugo-Palacios and Gannon, 2017; Tavares and Zantomio, 2017; Zhang et al., 2018, among others). By analysing this number, we are able to capture the relationship between health perception and public expenditures, as doctor visits are frequently subsidised by the public. In addition, doctor visits are good indicators of healthcare seeking in general, and preventive healthcare and screenings in particular.
The annual number of doctor visits is ascertained by answers to the following question: ''Now please think about the last twelve months. About how many times in total have you seen or talked to a medical doctor or qualified/registered nurse about your health? Please exclude dentist visits and hospital stays, but include emergency room or outpatient clinic visits''. The healthcare utilisation measure thus includes emergency and outpatient visits. The survey question is phrased almost identically in Waves 2 and 5; however, the words ''or qualified/registered nurse'' are excluded in Wave 2. We thus run separate estimations for each wave as a sensitivity analysis (Section ''Consistency across different waves''). The number of doctor visits is top-coded at 98 visits per year. On average, individuals in our sample visit the doctor 7.1 times per year. The median, however, is lower (4 times), which demonstrates the variable's strong right-skewness (Table 2).
In addition to doctor visits, i.e. outpatient visits, we also analyse the effect of health perception on inpatient visits. They are operationalised as the number of times a survey participant has been a patient in a hospital overnight during the last twelve months, as well as the number of nights spent in hospital altogether during the last twelve months. We expect health beliefs to be less important in determining this type of healthcare utilisation, as hospital stays often require a referral by a physician who evaluates whether a hospital stay is necessary or not. Furthermore, we analyse the effect of health perception on dental care. SHARE does not survey the number of yearly dentist visits but only asks ''During the last twelve months, have you seen a dentist or a dental hygienist?'', thus only allowing for a binary outcome variable. For this reason, our main analysis focuses on doctor visits rather than dental care.

Explanatory variable: health perception
Following the literature in psychology, our measure of misperception relates to the most common interpretation of over- and underconfidence, namely, over- and underestimating one's performance, actual ability, chance of success, or level of control (Moore and Healy, 2008). Assuming an underlying true level of health, we group individuals according to their perception of their health status. More specifically, we differentiate among individuals who perceive their health status correctly (concordance), those who believe that they are healthier than they really are (overestimation), and those who believe that they are unhealthier than they really are (underestimation). The true level of health is proxied by objective performance measures based on physical tests. This objective information about the respondent's health is matched with the respondent's subjective assessment of his or her health, thus revealing whether that individual's beliefs are correct or not. This is a common way to measure misperception and has been previously used in Arni et al. (2021) and Ning et al. (2016). SHARE provides several objective performance measures that can be utilised as proxies for true health. The measure most suited to analysing differences between objective and subjective health is the ability to stand up from a chair, as this self-assessed variable relates most directly to its tested equivalent. This measure was introduced recently by Spitzer and Weber (2019). In the robustness analyses, we show that the misperception measure is consistent across time (Sections ''Panel analysis and consistency of the health perception over time'' and ''Consistency across different waves'') and different dimensions of health perception, i.e. the ability to stand up from a chair, cognition and walking ability (explained in detail in Section ''Different dimensions of health perception'').
To evaluate subjective ability to get up from a chair, survey participants are first asked whether they have difficulties getting up from a chair after sitting for long periods. Fig. A.1 in the Supplementary material provides the detailed survey question. Unfortunately, it is not possible to exclude individuals with temporary limitations only, but we control for a set of health shocks that might result in such temporary limitations. Individuals are considered subjectively impaired if they report difficulties getting up from a chair, and subjectively unimpaired if they do not. Overall, 16.7% (unweighted: 17.3%) of the survey participants in our sample are considered subjectively impaired.
In the objective assessment, individuals are asked to physically stand up from a chair. The chair stand test is introduced with the interviewer saying, ''The next test measures the strength and endurance in your legs. I would like you to fold your arms across your chest and sit so that your feet are on the floor; then stand up keeping your arms folded across your chest. Like this ...'' Following this, the interviewer makes sure that it is safe for the participant to try the chair stand test by asking ''Do you think it would be safe for you to try to stand up from a chair without using your arms?'' The exact sequence of questions leading to the chair stand test is shown in Supplementary Fig. A.2. We use information from the single chair stand test, for which participants are asked to stand up only once. Individuals are considered objectively unimpaired if they stand up without using their arms and objectively impaired if they are not able to stand up from the chair, if they have to use their arms to stand up, or if they think it is unsafe to try to stand up from the chair. We provide sensitivity analyses for which individuals that have to use their arms are considered objectively unimpaired, since the subjective question does not directly refer to the usage of arms, and also utilise the repeated chair stand test for robustness, considering only those as unimpaired who are able to stand up from a chair five times in a row (Section ''Sensitivity to different specification of objective impairment''). In Wave 2, the chair stand test was only conducted among those younger than 76 years. We provide wave-specific estimations to analyse whether this restriction affects the results (Section ''Consistency across different waves''). Overall, 18.1% of the survey participants in our sample are considered objectively impaired.
Following the subjective report of impairment (i.e., unimpaired or impaired) and the subsequent objective test, individuals can either achieve concordance, overestimate their own health, or underestimate their own health. If participants subjectively report being unimpaired but are objectively impaired, they overestimate their health. Likewise, if they subjectively report being impaired but are objectively unimpaired, they underestimate their health. Although the categorisation of over- and underestimation is straightforward, the categorisation of concordance (i.e., accurate beliefs about their health status) requires further consideration. Given true (objective) health, it is important to distinguish between two types of concordance. Individuals with a poor health status (i.e., objectively impaired) are classified as ''negative concordance'' if they also subjectively report being impaired. Likewise, individuals with a good health status (i.e., objectively unimpaired) are classified as ''positive concordance'' if they also subjectively report being unimpaired. The four health perception outcomes are shown in Table 1.
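The four-way classification described above can be illustrated with a minimal Python sketch. The names `Perception` and `classify_perception` are hypothetical and chosen here for illustration only; they are not part of SHARE or of the authors' code, and the sketch assumes that both impairment indicators have already been coded as booleans.

```python
from enum import Enum


class Perception(Enum):
    """Four health perception outcomes (hypothetical labels for illustration)."""
    POSITIVE_CONCORDANCE = "positive concordance"
    NEGATIVE_CONCORDANCE = "negative concordance"
    OVERESTIMATION = "overestimation"
    UNDERESTIMATION = "underestimation"


def classify_perception(subjectively_impaired: bool,
                        objectively_impaired: bool) -> Perception:
    """Cross the subjective report with the objective chair stand result."""
    if objectively_impaired:
        # Objectively impaired: either agrees (negative concordance)
        # or reports no difficulties (overestimation).
        if subjectively_impaired:
            return Perception.NEGATIVE_CONCORDANCE
        return Perception.OVERESTIMATION
    # Objectively unimpaired: either reports difficulties (underestimation)
    # or agrees (positive concordance).
    if subjectively_impaired:
        return Perception.UNDERESTIMATION
    return Perception.POSITIVE_CONCORDANCE
```

Branching on the objective result first mirrors the text's point that each type of misperception is only defined within one objective health group.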
Distinguishing between the two types of concordance ensures that we use the appropriate reference category for over- and underestimation in regression analyses. Overestimation can only be measured in the group whose objective health is impaired yet who subjectively report being unimpaired. Therefore, an appropriate group of individuals to compare to are those who are also objectively impaired (i.e., negative concordance). Underestimation can only be measured in the group whose objective health is unimpaired yet who subjectively report being impaired. The appropriate comparator for these individuals is the group that is also objectively unimpaired (i.e. positive concordance). This separation of the concordance group also provides an important empirical advantage; it ensures that we compare like with like in terms of true initial health, thereby controlling for one source of endogeneity, namely variation in health that can determine utilisation. Nevertheless, in our estimations, we control for different health measures including a composite indicator of frailty, which is particularly relevant for older individuals. Given that the objective assessment is performed towards the end of the survey, which on average takes 67 min in Wave 2 and 76 min in Wave 5 (Jürges, 2005; Bristle, 2015), it is reasonable to presume that respondents were sitting for a long period of time before being asked to stand up from a chair. Therefore, we argue that the objective measure captures to a large degree what the subjective question assesses. We acknowledge, however, that the subjective question refers to difficulties in getting up from a chair, while the chair stand test assesses the ability to stand up from a chair. This discrepancy does not affect overestimating, since individuals that report no difficulties in getting up from a chair but are unable to stand up from a chair during the test are clearly overestimating.
The mismatch might, however, affect underestimating if individuals are able to stand up from the chair, but with difficulties. We thus provide two robustness analyses to account for this discrepancy. First, as mentioned above, we consider individuals that have to use their arms to stand up from the chair (i.e. have difficulties) as objectively unimpaired (Section ''Sensitivity to different specification of objective impairment''). Second, we analyse additional health perception measures based on differences in subjective and tested cognition and walking ability (Section ''Different dimensions of health perception'').
Another concern that may arise is related to the objective measure used, which might be prone to error and can impact the subjective assessment of the ability to stand up from the chair. This might be particularly problematic if the single chair stand test we rely on is unable to capture all the variation in true health. If that is indeed the case, then the underestimating group may be more impaired than captured by the single objective assessment. The analysis requires the assumption that there is no correlation between the error in the objective assessment and the subjective reporting of the same. We acknowledge that this assumption may be implausible. To address this concern, we conduct further robustness tests using the repeated chair stand test (Section ''Sensitivity to different specification of objective impairment''). This is similar to the single chair stand test, except that the respondent is asked to get up from the chair five times in a row.
As shown in Table 1, in the objectively impaired group, 57.0% overestimate their health status; in the unimpaired group, only 12.1% underestimate. The large number of people reporting overconfidence is not surprising, as it has been documented in psychology and evolutionary theory as being favoured by natural selection and providing adaptive gains. Individuals tend to be overconfident because it increases morale and ambition and may thus improve potential (Johnson and Fowler, 2011). Furthermore, our sample consists of older people, among whom overconfidence is particularly prevalent (Idler, 1993; Spitzer and Weber, 2019) and is seen as a resilience strategy to maintain a positive self-image (Brandtstädter and Greve, 1994).
A detailed description of how overestimation, underestimation and concordance based on the chair stand test vary across sociodemographic groups is provided in Spitzer and Weber (2019). Overall, concordance decreases with age and is higher among the well-educated. Southern Europeans are more prone to overestimating their health, while underestimating is more common in Central and Eastern European countries. Differences between genders are less pronounced but indicate that women are more likely to underestimate their health than men.

Additional control variables
Ideally, we would randomly assign health perception to individuals to elicit causal effects of (mis)perception on healthcare utilisation. In the absence of such random assignment, we control for a rich set of variables to account for confounding effects. Summary statistics for these control variables are provided in Table 2, and cross-tabulations of control variables, doctor visits and health perception are provided in Supplementary Tables A.1 and A.2.
Most importantly, we control for other health factors, since they directly affect healthcare utilisation. In particular, we include the number of limitations in activities of daily living (ADLs) and in instrumental activities of daily living (IADLs). ADLs that we consider are difficulties dressing, walking across a room, bathing or showering, eating and cutting up food, getting in or out of bed, and using the toilet. IADLs include difficulties using a map, preparing a hot meal, shopping for groceries, making a telephone call, taking medications, doing work around the house or garden, and managing money. We also account for frailty by including an adaptation of the well-established indicator introduced by Fried et al. (2001). Based on this indicator, individuals are considered frail if they suffer from three or more of the following dimensions: exhaustion, weakness, slowness, shrinking and low activity levels. The indicator was adapted for SHARE by Santos-Eggimann et al. (2009), and we follow their operationalisation exactly, with one deviation: instead of a binary variable, we operationalise frailty using a score that ranges from zero to five, depending on the number of dimensions an individual suffers from. Finally, we add indicators for chronic conditions and health shocks, in particular heart problems, high blood pressure or hypertension, high blood cholesterol, strokes or cerebral vascular disease, diabetes, chronic lung diseases, cancer, stomach or duodenal ulcers, Parkinson's disease, cataracts, hip or femoral fractures, other fractures, and Alzheimer's disease.
We also control for sociodemographic characteristics, as they are expected to influence health perception as well as healthcare utilisation (Avitabile et al., 2011; Lange, 2011; Spitzer and Weber, 2019). In particular, we include an interaction term for gender and age, as well as educational attainment according to the International Standard Classification of Education (Eurostat, 2018). Because pensioners appear to have higher healthcare utilisation (Bíró, 2016b; Zhang et al., 2018), we also consider whether an individual is retired as opposed to all other employment options (employed, self-employed, unemployed, permanently sick or disabled, homemaker, other). Furthermore, we control for whether the survey participant is married or in a registered partnership as opposed to never married, divorced, or widowed. In robustness analyses, we also account for differences in access to healthcare as well as supplementary health insurance (Section ''Robustness to further controls'').
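As a rough illustration of the frailty operationalisation described above, the following Python sketch counts the five dimensions to obtain a score from zero to five and, for comparison, applies the binary three-or-more rule. The dictionary keys and function names are hypothetical and do not correspond to SHARE variable names; how each dimension is measured is outside the scope of the sketch.

```python
# Hypothetical keys for the five frailty dimensions named in the text.
FRAILTY_DIMENSIONS = ("exhaustion", "weakness", "slowness",
                      "shrinking", "low_activity")


def frailty_score(indicators: dict) -> int:
    """Number of frailty dimensions present (0 to 5)."""
    return sum(1 for dim in FRAILTY_DIMENSIONS if indicators[dim])


def is_frail(score: int) -> bool:
    """Binary rule from the original indicator: three or more dimensions."""
    return score >= 3


respondent = {"exhaustion": True, "weakness": True, "slowness": False,
              "shrinking": False, "low_activity": False}
score = frailty_score(respondent)
```

Using the count rather than the binary flag keeps the distinction between, say, two and zero dimensions, which the three-or-more threshold would discard.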
The effects of economic resources on healthcare utilisation are considered via equivalised household income. Because there are many missing values for household income in SHARE, the data set comes with two additional imputed variables. We use one of these imputed variables in our model and conduct a robustness analysis with the other (Section ''Robustness to different specification of the income variable''). We equivalise household income by using the square root scale, in which household income is divided by the square root of household size. Using the OECD equivalence scale is not feasible, as children cannot be identified unambiguously. Furthermore, we use a cube root transformation to normalise the skewed income distribution (Cox, 2011). Standard log normalisation is not feasible because of the substantial number of zero values; however, results are robust to dropping observations with zero values in household income. Moreover, we run a robustness analysis in which we use equivalised household income that was not normalised (Section ''Robustness to different specification of the income variable'').
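The two income transformations described above can be sketched in a few lines of Python. The function names are hypothetical, and the example values are invented for illustration; they are not taken from SHARE.

```python
import math


def equivalised_income(household_income: float, household_size: int) -> float:
    """Square root scale: divide household income by sqrt(household size)."""
    if household_size < 1:
        raise ValueError("household_size must be at least 1")
    return household_income / math.sqrt(household_size)


def cube_root(value: float) -> float:
    """Cube root transformation; unlike the log, it is defined at zero."""
    # copysign keeps the sign so that negative inputs are handled as well.
    return math.copysign(abs(value) ** (1.0 / 3.0), value)


# Invented example: a four-person household with an income of 40,000.
income_eq = equivalised_income(40_000.0, 4)   # 40,000 / sqrt(4)
income_norm = cube_root(income_eq)
```

The cube root maps zero income to zero, which is why it can be applied without dropping the zero-income observations that rule out a log transformation.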

Method
The main outcome variable, annual doctor visits, is strongly skewed to the right, yet without severe mass at zero. To accommodate this, we use a negative binomial model with mean dispersion, which is used frequently in the healthcare literature. We refrain from using a simple Poisson model, as the variance in the outcome variable is much larger than its mean. However, we perform robustness analyses using different models commonly used in the healthcare literature (Supplementary Figs. A.3 and A.4) as well as Ordinary Least Squares. Thus, the number of doctor visits of individual $i$ at wave $t$ ($\text{DOCTOR}_{i,t}$) is assumed to follow a Poisson distribution but with a negative binomial specification for which each individual unit has a separate, Gamma-distributed mean. More specifically,

$$\text{DOCTOR}_{i,t} \sim \text{Poisson}(\mu_{i,t}), \qquad \mu_{i,t} = \exp\left(\beta\,\text{HEALTH PERCEPTION}_{i,t} + \gamma'\,\text{HEALTH}_{i,t} + \delta' X_{i,t} + \nu_{i,t}\right),$$

and $\exp(\nu_{i,t}) \sim \text{Gamma}(1/\alpha, \alpha)$. $\text{HEALTH PERCEPTION}_{i,t}$ is a binary variable that indicates whether individual $i$ achieves concordance or misperceives his or her health at wave $t$. The vector $\text{HEALTH}_{i,t}$ includes the number of ADLs and IADLs in period $t$, the frailty score from period $t$, and a range of indicators for chronic conditions and health shocks. The vector of control variables $X_{i,t}$ includes an interaction term between the individual's gender and age, educational attainment, retirement and marital status, household income, and control dummies for the survey wave as well as for the country of residence. The terms $\beta$, $\gamma$, and $\delta$ represent coefficients. When analysing the effect of health perception on dental care, we use logistic regressions, since visits to the dentist are operationalised using a binary variable that indicates whether the respondent has seen a dentist or a dental hygienist in the last twelve months.
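To make explicit why the Gamma-mixed specification accommodates a variance larger than the mean, the following short derivation may help. It is a sketch under stated assumptions: $y$ denotes the count outcome, $\lambda$ the exponentiated linear predictor without the heterogeneity term, $\nu$ the heterogeneity term, and $\alpha$ the dispersion parameter, with the Gamma distribution parameterised by shape $1/\alpha$ and scale $\alpha$. These symbols are chosen here for illustration.

```latex
\begin{align}
y \mid \nu &\sim \mathrm{Poisson}\!\left(\lambda e^{\nu}\right),
  \qquad e^{\nu} \sim \mathrm{Gamma}(1/\alpha,\ \alpha), \\
\mathbb{E}\!\left[e^{\nu}\right] &= \tfrac{1}{\alpha}\cdot\alpha = 1,
  \qquad \mathrm{Var}\!\left[e^{\nu}\right] = \tfrac{1}{\alpha}\cdot\alpha^{2} = \alpha, \\
\mathbb{E}[y] &= \mathbb{E}\!\left[\lambda e^{\nu}\right] = \lambda, \\
\mathrm{Var}[y] &= \mathbb{E}\!\left[\mathrm{Var}(y \mid \nu)\right]
  + \mathrm{Var}\!\left(\mathbb{E}[y \mid \nu]\right)
  = \lambda + \alpha \lambda^{2}.
\end{align}
```

Under these assumptions the variance exceeds the mean whenever $\alpha > 0$, and the model collapses to the Poisson case as $\alpha \to 0$, which is consistent with the motivation given for preferring the negative binomial model.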
As discussed earlier, the sample is split into individuals who are overconfident (i.e., overestimate their health status) and individuals who are underconfident (i.e., underestimate it). The regression coefficients are therefore interpreted relative to those who estimate their health correctly (i.e., achieve concordance). For heterogeneity analyses, we further split the sample by gender, country, and number of chronic diseases (Section ''Heterogeneity of effects'').
Health perception is expected to affect healthcare utilisation, but the opposite mechanism, that healthcare utilisation precedes health perception, appears plausible too. For example, individuals who frequently visit the doctor are more likely to achieve concordance, as they receive more information about their health status. To demonstrate the robustness of our results concerning this issue, we conduct a robustness test where we analyse the relationship between current health perception (wave $t$) and future healthcare utilisation (wave $t+1$) (Section ''Accounting for circular effects'').
Despite the rich set of controls that we utilise, scope for unobserved individual heterogeneity remains and can still affect both health perception and healthcare utilisation, or the reporting of healthcare utilisation. Since we do not exploit exogenous variation in health perception, it is impossible to account for all such unobserved variation that can potentially bias our estimates, which remains a potential limitation of the empirical analysis. Nevertheless, we run further robustness tests using SHARE's panel dimension and provide within-estimations based on individual fixed effects; while these do not fully address concerns related to unobserved confounders and therefore do not imply causality, they do grant further confidence in the relationship between the outcomes and independent variable of interest (Section ''Panel analysis and consistency of the health perception over time''). The main analysis is, however, based on cross-sectional data for two reasons. First, the chair stand test (and all other potential objective measures) is only collected in two waves, most of which are not consecutive. Thus, the panel consists of two time points only, with up to four years in between. Only 18.3% of the observations participate in both waves. Second, health perception appears time constant between the two waves, hence inclusion of individual fixed effects leaves little variation in health perception. We therefore rely on cross-sectional estimates to draw our main conclusions.

Results
Healthcare utilisation is measured by the annual number of doctor visits. Supplementary Table A.1 shows that overall, individuals who overestimate their health have fewer doctor visits (8.3 visits) compared to their reference group (i.e., negative concordance = 13.9 visits). Similarly, those who underestimate their health have significantly more doctor visits in a year (10.4 visits) compared to their relevant reference group (i.e., positive concordance = 5.8 visits). Table A.3 in the Supplementary material shows the regression results. Columns 1 and 3 provide baseline results, Columns 2 and 4 show the main results for the two groups (i.e., overestimators and underestimators categorised based on the objective health status as impaired or unimpaired). All coefficients are to be interpreted relative to the concordance category.
We find a strong and significant association between health misperception and healthcare utilisation. Individuals who underestimate their health visit the doctor 21.4% more often than individuals who achieve concordance (100 × (exp(0.194) − 1)). Computing average marginal effects shows that this results in 1.2 additional doctor visits per year (Fig. 1). Table A.5 in the Supplementary Material shows results with the full set of controls in a stepwise manner, where we first add demographic controls such as age and gender, and their interaction, followed by several health indicators including ADLs, IADLs, frailty, and specific diagnosed diseases such as stroke or cardiac conditions. Finally, we add education, income, retirement status, and marital status. It is interesting to note that the coefficient size decreases only slightly when demographic variables are controlled for; however, it drops by half in magnitude as soon as the health indicators are included and remains stable even when further controls are added. This indicates a strong potential role of health status as a mediator in the relationship between health perception and healthcare utilisation. The other control variables do not affect the point estimates significantly. The control variables themselves have coefficients that are statistically significant and in the expected direction.
We also find a strong and significant link between overestimation and the annual number of doctor visits. Individuals who overestimate their health go to the doctor less often than those who achieve concordance. Overestimating health results in 17.0% fewer doctor visits compared to perceiving one's health correctly (100 × (exp(−0.186) − 1)). The average marginal effect of overestimating health on healthcare utilisation is 1.8 fewer doctor visits per year (Fig. 1). Similar to the underestimation results, Table A.6 in the Supplementary Material shows results with the full set of controls in a stepwise manner. While the addition of demographic controls does not change the coefficient size, it decreases slightly when ADLs and IADLs are added to the regression, and decreases further when the frailty score is included. It drops by more than half in magnitude when the full set of health variables is added and then remains quite stable, again alluding to the potential role of health status as a mediator in the relationship between health perception and healthcare utilisation. While we do control for as many health variables as the data allow, we cannot rule out unobserved heterogeneity of other forms. The results are therefore to be interpreted as associations and do not imply causality.
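The percentages quoted above follow from the log link of a count model: a coefficient on a binary indicator scales the expected count by exp(coefficient). The sketch below reproduces that arithmetic and illustrates one common way to obtain an average marginal effect for a dummy variable. The function names and the toy linear predictors are hypothetical and are not taken from the paper's estimation code; only the coefficients 0.194 and −0.186 come from the text.

```python
import math

def percent_change(beta: float) -> float:
    """Percentage change in the expected count implied by a log-link coefficient."""
    return 100.0 * (math.exp(beta) - 1.0)

def average_marginal_effect(beta: float, baseline_linear_predictors: list) -> float:
    """Average discrete change in expected visits when a dummy switches from 0 to 1.

    Assumes an exponential conditional mean, E[y | x] = exp(x'b). Each entry of
    baseline_linear_predictors is x'b evaluated with the dummy set to 0.
    """
    changes = [math.exp(eta) * (math.exp(beta) - 1.0)
               for eta in baseline_linear_predictors]
    return sum(changes) / len(changes)

# Coefficients reported in the text.
print(round(percent_change(0.194), 1))   # underestimation: 21.4
print(round(percent_change(-0.186), 1))  # overestimation: -17.0

# Hypothetical baseline predictors (log of expected visits) for three individuals.
toy_eta = [math.log(4.0), math.log(6.0), math.log(8.0)]
print(round(average_marginal_effect(0.194, toy_eta), 2))
```

Because the marginal effect depends on each individual's baseline expected count, the same percentage effect can translate into different numbers of visits in the impaired and unimpaired samples.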
The effect of health perception on dental care is similar to that described above: individuals who underestimate their health are more likely to visit the dentist, while individuals who overestimate their health are less likely to go to the dentist (Table A.4, Columns 1 and 2). We find, however, no statistically significant effects of health beliefs on the number of inpatient stays (Table A.4, Columns 3 and 4) or the overall number of nights spent in hospitals (Table A.4, Columns 5 and 6). These results confirm our prior: since a referral by a physician is often needed for hospital stays, health beliefs may be less important for this type of healthcare utilisation.

Heterogeneity of effects
We assess the heterogeneity of our main results in several ways. In particular, we consider gender differences, country specificities, and differences by health status. Overall, results do not differ across the analysed groups (Fig. 1).

Gender differences
The literature has shown differences in health perception by individual characteristics, most importantly by gender (Merrill et al., 1997; Schneider et al., 2012). Gender differences in the association between health beliefs and healthcare utilisation may partly explain the well-documented differences in healthcare seeking between men and women, as men tend to have lower healthcare use (Galdas et al., 2005; Mansfield et al., 2003; Schlichthorst et al., 2016). Thus, we assess whether the relationship between health (mis)perception and utilisation also differs between men and women. As noted above, Supplementary Table A.1 shows that, overall, women have slightly more doctor visits annually compared to men; this is true also within the misperception category, but the difference is not large. Furthermore, both under- and overestimating men and women have more doctor visits relative to their respective concordant comparators.
Regression analyses by gender reveal that the association between health misperception and the annual number of doctor visits is slightly larger in magnitude for men than for women (Supplementary Table A.7). A Wald test, however, reveals that the coefficients for women and men are not statistically different from each other. This is also shown in Fig. 1.
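A Wald test of equal coefficients across two subsamples can be sketched as follows. The text does not spell out the exact procedure, so this version assumes the two estimates come from separate regressions on disjoint samples (so their covariance is zero). The numbers in the example are invented for illustration and are not the paper's estimates.

```python
import math

def wald_test_equal_coefficients(b1, se1, b2, se2):
    """Wald test of H0: b1 == b2 for estimates from independent samples.

    Returns the chi-square statistic (1 degree of freedom) and its p-value.
    """
    variance_of_difference = se1 ** 2 + se2 ** 2  # covariance assumed to be zero
    statistic = (b1 - b2) ** 2 / variance_of_difference
    # For 1 degree of freedom, P(chi2 > w) = erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

# Invented example values (hypothetical coefficients and standard errors).
stat, p = wald_test_equal_coefficients(0.22, 0.05, 0.17, 0.04)
print(f"W = {stat:.2f}, p = {p:.3f}")
```

If the subsamples were estimated jointly in one interacted model, the covariance between the two coefficients would no longer be zero and would need to enter the variance of the difference.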

Country specificity
Differences in reporting behaviour by country are well documented (Capistrant et al., 2014; Jürges, 2007; Spitzer and Weber, 2019). To ensure that our findings are not driven by differential reporting due to cultural biases in reporting health or the oversampling of certain countries in the SHARE survey, we rerun our analyses for each country separately. By and large, we find similar results for all countries, for both under- and overestimation, with the exception of a few countries for which we do not find statistically significant results because of small sample sizes (Supplementary Tables A.8 and A.9).

Differences by health status
The descriptive statistics in Supplementary Table A.2 indicate a slight decrease in concordance as the number of chronic diseases increases; however, this trend is far from obvious and might also be due to the correlation between health and age. To disentangle these effects, we run separate regressions for those individuals who do not have any chronic diseases (healthy) and those who report one or more chronic diseases (unhealthy). The results are reported in Supplementary Table A.10 and shown in Fig. 1. Although health perception is similarly associated with the number of doctor visits of impaired individuals with and without chronic diseases, underestimation has a stronger association with doctor visits for those without chronic diseases than for those with chronic diseases; this is confirmed by a Wald test. Because we categorise based on health (in other words, fix health at the same level), we can conclude that the results are not driven by health differences: both the healthy group's and the unhealthy group's healthcare utilisation is affected by their health perception in the same direction.

Different dimensions of health perception
For the main analyses, health perception is operationalised based on tested and self-reported ability to stand up from a chair. In this subsection, we show that the misperception measure is consistent across different dimensions of health perception, i.e. the ability to stand up from a chair, cognition and walking ability (Fig. 1).
Cognition. Similar to previous work, we use the difference between subjective and objective cognition as an additional measure of health perception (Spitzer and Weber, 2019). Objective cognition is operationalised based on a memory test, which is conducted in Waves 4 to 6. In particular, individuals are asked to recall a list of 10 words in any order within a minute.
Subjective cognition is based on the question ''How would you rate your memory at the present time?'' which is answered on a Likert scale with the categories excellent, very good, good, fair, and poor. Because the subjective cognition variable has more than 80% missing values in Wave 6, we only utilise data from Waves 4 and 5. Hence, the estimates for cognition are based on a different sample.
Defining cognitive impairment is not as straightforward as defining the ability to stand up from a chair. Whereas the chair stand variables are binary and therefore clearly indicate whether an individual is impaired, both the subjective and objective cognition variables are categorical. Thus, we rely on previous literature to define the threshold marking cognitive impairment. Participants are considered objectively impaired if they recall three words or fewer (Grodstein et al., 2001; Purser et al., 2005). In addition, in robustness analyses, individuals are considered impaired if they recall two words or fewer. Individuals are considered subjectively impaired if they report having a fair or poor memory (Gardner et al., 2017).
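The thresholds just described map each respondent into one of three perception categories. The following is a minimal sketch of that mapping, assuming that overestimation means being objectively impaired while reporting no impairment, and underestimation the reverse. Function and variable names are hypothetical and do not correspond to SHARE variable names.

```python
# Likert categories treated as subjective impairment (fair or poor memory).
SUBJECTIVELY_IMPAIRED_RATINGS = {"fair", "poor"}

def classify_perception(objectively_impaired: bool, subjectively_impaired: bool) -> str:
    """Compare tested and self-reported impairment."""
    if objectively_impaired == subjectively_impaired:
        return "concordance"
    if objectively_impaired:
        # Tested as impaired but reports no impairment.
        return "overestimation"
    # Tested as unimpaired but reports impairment.
    return "underestimation"

def cognition_perception(words_recalled: int, memory_rating: str, threshold: int = 3) -> str:
    """Classify perception from the 10-word recall test and the memory self-rating.

    threshold=3 mirrors the main cognition specification; threshold=2 mirrors
    the robustness specification.
    """
    objectively_impaired = words_recalled <= threshold
    subjectively_impaired = memory_rating.lower() in SUBJECTIVELY_IMPAIRED_RATINGS
    return classify_perception(objectively_impaired, subjectively_impaired)

print(cognition_perception(3, "good"))               # overestimation
print(cognition_perception(3, "good", threshold=2))  # concordance
print(cognition_perception(6, "poor"))               # underestimation
```

The same `classify_perception` helper applies unchanged to any other pair of binary tested and self-reported impairment indicators.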
Supplementary Table A.12 provides regression results for this new specification of health perception and average marginal effects are shown in Fig. 1. The results confirm our earlier findings. Individuals who underestimate their cognitive ability are more likely to visit the doctor than individuals who achieve concordance between objective and subjective measures of cognition. By contrast, survey participants who overestimate their health have fewer annual doctor visits than those who achieve concordance. Modifying the threshold for objective impairment from three to two words changes the magnitude of the coefficient for overestimation but not its sign. The magnitude of the coefficient for overestimation remains virtually identical.
Overall, health perception measures based on the ability to stand up from a chair and cognition yield strikingly similar results also on the individual level. In Wave 5, both the chair stand test and the memory test were conducted, thus allowing us to compare outcomes for both measures for each participant. A total of 64.2% of the participants in Wave 5 have identical, non-missing outcomes for health perception based on the ability to stand up from a chair and cognition. Combining the impaired and unimpaired groups for Wave 5, 81.2% (72.7%) achieve concordance based on the ability to stand up from a chair (cognition), 8.9% (7.7%) overestimate their health and 10.0% (19.6%) underestimate their health.
Walking ability. We also operationalise health perception based on walking ability. Objective walking ability is based on a walking speed test in which participants have to walk a distance of 2.5 m. Individuals are considered objectively impaired if their walking speed is 0.4 m per second or slower. This threshold is in line with the previous literature (Jürges, 2007; Steel et al., 2003). Because the test is only conducted in Waves 1 and 2, the analysis is restricted to those waves (Börsch-Supan, 2019). The walking speed test is supposed to be conducted only for individuals older than 75 years. However, the data set includes information for those 75 and younger too. The variable has many missing values (∼90%) and thus needs to be handled with caution.
Subjective walking impairment is based on the following question: ''Please look at card [. . . ]. We need to understand difficulties people may have with various activities because of a health or physical problem. Please tell me whether you have any difficulty doing each of the everyday activities on card [. . . ]. Exclude any difficulties that you expect to last less than three months''. Participants are coded as having subjectively impaired walking ability if they report difficulty walking 100 m.
When analysing health perception based on walking ability, we do not control for frailty, as the ability to walk 100 m is considered in the frailty measure. Also, the second imputed income variable is used for this analysis, as the first one is not available in Wave 1. The robustness analysis in Section ''Robustness to different specification of the income variable'' shows, however, that both income variables produce the same results.
Of those with objectively impaired walking ability, 14.9% overestimate their health, while 8.6% of those with objectively unimpaired walking ability underestimate their health. Walking ability is never collected for the same group of participants at the same time as the chair stand test. However, for a small number of participants (1244 observations), we can compare health perception based on walking ability from Wave 2 with health perception based on the ability to stand up from a chair in Wave 5. A total of 63.8% of all participants have the same outcome for health perception on either measure. Combining the impaired and unimpaired groups shows that 81.2% (81.4%) achieve concordance based on the ability to stand up from a chair (walking ability), 8.9% (11.3%) overestimate their health and 10.0% (7.2%) underestimate their health.
Results for the association between health perception and the annual number of doctor visits based on walking ability are provided in Supplementary Table A.11 and Fig. 1. The coefficients in Supplementary Table A.11 confirm once again that individuals who underestimate their health have more annual doctor visits than those who assess their health correctly. The results also show that those who overestimate their health have fewer doctor visits. Thus, our results are robust to different specifications of health perception.

Accounting for response reliability
We account for response reliability and exclude anyone diagnosed with Alzheimer's disease, dementia, or another serious memory impairment, as their survey answers might not be reliable (Supplementary Tables A.13 and A.14, Column 2). The results remain robust, perhaps also because the number of observations with severe cognitive impairments in the survey is small.

Consistency across different waves
We also separate the sample by survey wave to explore whether the slight change in the phrasing of the survey question about doctor visits in Wave 5 (Section ''Outcome variables: healthcare utilisation'') or the restriction of the chair stand test to those younger than 76 years in Wave 2 (Section ''Explanatory variable: health perception'') affects the results. The estimates in Fig. 1 and Supplementary Table A.15 reveal that the association between health misperception and healthcare utilisation is slightly stronger in Wave 5 than in Wave 2; however, the difference is not statistically significant according to a Wald test.

Panel analysis and consistency of the health perception over time
From a total of 85,207 survey respondents, only 18.3% participated in both Waves 2 and 5. Based on this subsample, we can explore how health perception varies over time. Descriptive analyses show that health perception is constant between Wave 2 and Wave 5 for the large majority of observations. Only 35.4% of the individuals change their health perception over time, while the remaining 64.6% have time-constant beliefs about their health. The within-individual variation of health perception between Waves 2 and 5 for the unimpaired group is only 0.20; for the impaired group, it is even smaller (0.08). Speculatively, a potential driver of changes in health perception could be health shocks like heart attacks, strokes, hip fractures and cancer. Such health shocks, however, rarely occur between Waves 2 and 5 for the analysed participants. For example, the within-individual variation for heart attacks and hip fractures in the unimpaired group is only 0.08 and 0.03 respectively. For the impaired group, the within-individual variation is only 0.05 and 0.04 respectively.
Despite the low number of individuals who participated in both Waves 2 and 5 and the little variation in health perception within individuals between the two waves, we provide within-estimations based on individual fixed effects as robustness analyses. They allow us to consider the possibility that unobserved individual heterogeneity could affect both health perception and healthcare utilisation, or the reporting of healthcare utilisation. Results based on a fixed effects negative binomial model as well as a fixed effects Poisson model are provided in Supplementary Tables A.13 and A.14, Columns 3 and 4. Both models lead to estimates that show the expected sign, i.e. underestimation increases doctor visits and overestimation decreases doctor visits. Standard errors are, however, large and results are thus not statistically significant for the unimpaired sample based on the Poisson model and for the impaired sample based on both models. Most of the other coefficients are also not statistically significant. Future work based on longer panels and more observations might enable more reliable analyses based on within estimators.

Sensitivity to different specification of objective impairment
In the main analysis, participants are considered physically impaired if they have to use their arms to stand up from the chair. For robustness, we apply a more lenient threshold for which individuals are not considered physically impaired when they have to use their arms. The estimated coefficients are virtually identical under the new specification of objective impairment (Supplementary Tables A.13 and A.14, Column 5). We also find nearly identical results using the repeated chair stand test for the operationalisation of objective impairment (Tables A.13 and A.14, Column 6).

Robustness to further controls
We also conduct robustness analyses by accounting for healthcare accessibility and supplementary health insurance. These controls are, however, only collected in Wave 5, and thus the robustness analyses are based on Wave 5 only (Supplementary Table A.16). First, we analyse whether differences in access to healthcare are associated with the number of doctor visits. For this, the household respondent is asked ''How easy is it to get to your general practitioner or the nearest health center? Would you say it is very easy, easy, difficult or very difficult?'' We dichotomise the variable by comparing the first two and the last two possible answers and add it to the model (Columns 2 and 5). The coefficients show, however, that the results do not depend on access to healthcare. Second, we investigate whether the results are robust to individuals purchasing supplementary health insurance. Although supplementary insurance increases healthcare utilisation (Moreira and Pita, 2010; Paccagnella et al., 2013), we find no significant changes in our results (Columns 3 and 6) when controlling for this variable.

Robustness to different specification of the income variable
We operationalise economic resources in various ways for robustness analyses (Supplementary Tables A.13 and A.14). First, we exchange the first imputed income variable provided by SHARE with the second imputed income variable (Column 6), and we use income that is not normalised with the cube root method but only equivalised (Column 7). These adjustments have no effects on the results. We also replace income with wealth (Column 8), and the results remain robust.

Accounting for circular effects
We provide robustness analyses to account for potential bias due to reverse causation. Health perception is expected to affect healthcare utilisation, but the opposite mechanism, that healthcare utilisation precedes health perception, appears plausible too. For example, individuals who frequently visit the doctor are more likely to achieve concordance, as they receive more information about their health status. To address this reverse effect, we analyse the association between current health perception (wave t) and future healthcare utilisation (wave t + 1) (Supplementary Tables A.13 and A.14, Column 9).
For this sensitivity analysis, the dependent variable is taken from wave t + 1, which is why we drop all observations that do not provide information on doctor visits at wave t + 1. This affects 32% of observations in the unimpaired sample and 39% in the impaired sample. The subsequent wave for Wave 2 is Wave 4, because SHARE Wave 3 focuses on people's life histories and thus does not provide the variables needed for our analysis. The subsequent wave for Wave 5 is Wave 6. All explanatory variables are taken from wave t. Results based on this sensitivity analysis confirm the initial findings.
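The sample construction for this sensitivity analysis can be sketched as below: each observation from Wave 2 or Wave 5 keeps its own explanatory variables and is paired with the doctor visits reported in the subsequent usable wave, and observations without a follow-up report are dropped. The record layout and field names are hypothetical and do not correspond to SHARE variable names.

```python
# Subsequent usable wave for each analysis wave (Wave 3 is skipped, as described above).
NEXT_WAVE = {2: 4, 5: 6}

def build_lead_sample(records):
    """Pair current-wave perception with doctor visits from the subsequent wave.

    records: list of dicts with keys person_id, wave, perception, doctor_visits.
    Observations without doctor visits in the subsequent wave are dropped.
    """
    visits = {(r["person_id"], r["wave"]): r["doctor_visits"] for r in records}
    sample = []
    for r in records:
        if r["wave"] not in NEXT_WAVE:
            continue  # follow-up waves only supply the outcome
        future_visits = visits.get((r["person_id"], NEXT_WAVE[r["wave"]]))
        if future_visits is None:
            continue  # no follow-up information on doctor visits
        sample.append({
            "person_id": r["person_id"],
            "wave": r["wave"],
            "perception": r["perception"],
            "doctor_visits_next": future_visits,
        })
    return sample

# Invented toy data: person 1 is followed up, person 2 is not.
toy = [
    {"person_id": 1, "wave": 2, "perception": "underestimation", "doctor_visits": 5},
    {"person_id": 1, "wave": 4, "perception": None, "doctor_visits": 7},
    {"person_id": 2, "wave": 5, "perception": "concordance", "doctor_visits": 3},
]
print(build_lead_sample(toy))
```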

Different estimation methods
The results for doctor visits in Supplementary material

Conclusion
We utilise rich data from 15 European countries from SHARE to explore the association between health (mis)perception and healthcare utilisation. We operationalise misperception as arising due to overconfidence or underconfidence. Following the literature in psychology, overconfidence is measured as overestimation of one's health, whereas underconfidence is defined analogously as underestimation of one's own health. Healthcare utilisation is measured as the annual number of doctor visits. Our results based on count models suggest that individuals who underestimate their health visit the doctor more often than those who assess their health correctly. By contrast, survey participants who overestimate their health visit the doctor less often. Similar effects were found for dental care, but not for inpatient visits.
While our data and empirical approach limit a causal interpretation of our findings, we control for a rich set of demographic and health indicators that could potentially confound our results. Our results are also robust to a range of sensitivity analyses with different model specifications, sample compositions, and estimation methods. In addition, we exploit the panel structure of our data and conduct an individual fixed-effects analysis to further demonstrate the systematic relationship between misperception and utilisation.
Besides the limitation around a causal interpretation of our results, a second important limitation of this study is that few tested health variables are available, meaning that the control variables are also based on self-reports and might thus be subject to misperception. Moreover, tested health is not available in every survey wave, which prevents us from building a long-enough panel and providing reliable within-individual estimations. Future work could fruitfully explore the long-term effects of health misperception on healthcare utilisation, for example, by exploiting national panel data collected over a longer period of time than SHARE data. Longer panels would also make it possible to further analyse the interdependencies between health status, health perception, and healthcare utilisation, as our analyses have identified health status as an important mediator for the effect of health beliefs on doctor visits. Furthermore, as discussed in Section ''Explanatory variable: health perception'', there is a slight discrepancy between the subjective and objective health dimension that could affect underestimation. We account for this discrepancy, however, by providing robustness analyses based on different specifications of objective impairment as well as alternative conceptualisations of health perception, using differences in subjective and tested cognition and walking ability.
Finally, as outlined earlier, an error in the objective measurement allows an alternative interpretation of our findings. Specifically, individuals who are objectively unimpaired, i.e., able to perform the chair stand test once, but report subjective impairment might actually be less healthy than the single trial captures. In this case, the higher utilisation of this group might simply be due to underlying health differences. While we do split our sample according to tested health, control for several indicators of health, and show robustness using the repeated chair stand test, we cannot completely rule out other unobserved differences in health.
Nevertheless, the results of this paper are informative for policy making. First, addressing rising health expenditures has been a top priority on policymakers' agenda in many countries. Excessive hospital admissions use more than 37 million bed days across the European Union every year, significantly increasing public expenditures (OECD and European Commission, 2018). Containing sources of waste and inefficiency in healthcare on either the demand or supply side is important in this regard. Our paper provides new insights, highlighting the role of demand-side misperception in determining health expenditures. While the implications are mixed, our findings provide an opportunity to explore these further. If individuals who believe they are unhealthier than they actually are visit the doctor more frequently, all else equal, such visits may be unnecessary and hence increase costs to the public system. However, individuals who underestimate their health may also end up having earlier screening and diagnosis of diseases due to frequent doctor visits, and may affect costs differently by preventing further deterioration of health. Similarly, individuals who believe they are healthier than they actually are could delay doctor visits even when necessary and turn up sicker in later stages of the illness. While this might result in short-term savings, in the long run, health gets worse if left untreated and results in more serious illness leading to higher costs. Overconfident individuals might also be less inclined to use preventative and screening services for early detection of diseases. Arni et al. (2021) show that individuals who overestimate their health are more inclined to exercise less, to eat unhealthily, to drink alcohol daily, and to be obese. In the long run, this can have implications for health itself and therefore result in higher costs to the system.
Second, if individuals' own perception of health is what drives healthcare demand beyond actual health and other socioeconomic characteristics, then equipping them through personalised or public health campaigns with the necessary tools and information to accurately assess their health and determine when to seek healthcare is perhaps a valuable long-term strategy for reducing unnecessary healthcare use. This is a particularly relevant measure in countries with ageing populations that suffer from cognitive dissonance and thereby increased health misperception (Brandtstädter and Greve, 1994; Frieswijk et al., 2004; Henchoz et al., 2008; Idler, 1993; Spitzer and Weber, 2019). Reaching out to those who overestimate their health by providing information about the benefits of screening and preventive care might also improve their health and thus prevent suffering and costs in the long run. Initiatives to increase health literacy, such as the National Action Plan on Health Literacy, are already in place in Germany (Vogt et al., 2018). Other countries can follow similar approaches to evaluate health literacy levels and take strategic action to educate people.
Third, waiting time is often used as a non-price rationing measure in healthcare by policymakers (Barzel, 1974;Iversen and Siciliani, 2011). Identifying patients with health misperception and reducing unnecessary visits to the doctor can have important implications for the effectiveness of such rationing mechanisms. Not only will they free up physician capacity, but they can also directly ensure timely care for other patients who are in need of urgent intervention. Moreover, with the advent of artificial intelligence and technology, providing individuals with the option to use online physician chatbots and telephone consultations will further reduce the burden of unnecessary doctor visits due to misperception rather than true health need.
Finally, the study also relates to the ongoing COVID-19 crisis with regard to individuals' underestimation of the risk of contracting the disease, overestimation of their own immunity levels, and adaptive behaviour (Spitzer et al., 2022). Identifying characteristics of persons over- and underestimating such risks can provide valuable insights for public health campaigns.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.