Minimum sample size for the survey measurement of a wealth-dependent parameter with the UK VPF as exemplar

Measurement of an economic good by opinion survey constitutes a variant of the political opinion polls widely familiar from news reporting. The paper relates the minimum sample size needed for the survey measurement of a wealth-dependent parameter to the smallest sample for a political poll giving the same precision. Measuring a strongly wealth-dependent parameter by survey requires a sample size of ≈2000 or more to provide precision equivalent to the 3% margin of error customary in UK political opinion polls. It is shown that the survey measurement of the "value of a prevented fatality" (VPF) used in the UK as a health and safety spending yardstick requires ≈3000 people to be questioned. The analysis shows the actual sample size used, 167, to be inadequate. This adds to the problems besetting the UK VPF, as the method the surveyors used to interpret their data has already been shown invalid.

© 2019 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Measurement history in the social sciences
It is widely acknowledged that metrology has played an enormous role in the development of the physical sciences that have seen such success over the last 150 years. Less well appreciated is the fact that measurement has been applied to commerce since the earliest days of man's civilisation [1,2]. Furthermore, increasing attention is now being paid to the processes by which human attributes such as perceptions, judgements and valuations might be measured. There can be no doubt, however, that measuring the properties or attributes of human behaviour is a difficult task.
The economist Edward Glaeser argues [3] that the social sciences are still young, claiming: "The widespread application of scientific methods to the study of human society - rigorous formal theories, serious empirical testing - occurred only during the twentieth century, mostly since World War II." He highlights the work of "great measurers" of the 20th century such as the economist Simon Kuznets, whose economic tradition he expects will continue to expand in the coming decades. Kuznets stressed that reliable results can be derived only through large numbers of observations, and his Nobel prize winning work in economics provided the key to measuring gross national product [4,5]. The need for large numbers of observations will be a theme to which this paper will return.
But measurement on a large scale in the field of social science dates back at least to the middle of the nineteenth century, when the first public examinations for schools were introduced in the United Kingdom. The move was in response to a demand for some way of judging pupils' levels of attainment [6], and in 1858 the University of Cambridge Local Examinations Syndicate set and marked papers over a wide range of topics for school students aged under 16 and under 18. The subjects were similar to those examined today: English Language and Literature, History, Geography, Geology, Greek, Latin, French, German, Political Economy and English Law, Mathematics, Arithmetic, Chemistry, Physical Sciences, Zoology, Drawing, Music and Religious Knowledge. The issues involved in measuring proficiency from the answers to graded questions have been the subject of a number of recent research studies [7,8]. Meanwhile a good summary of the historical development of measurement thinking since the 1930s for more general psychological attributes is given in [9]. Deriving from the success of measurement in the physical sciences, a philosophical and theoretical basis was developed for the universal process of measurement [10]. A proposal for a formal theory of a general measurement system has recently been put forward [11].
The need to determine what can and cannot be claimed for measurements in the social sciences has stimulated the re-examination of the principles of metrology. Stevens made a notable contribution just after the Second World War [12] by analysing scales of measurement and classifying them as nominal, ordinal, interval and ratio. The ratio scale is the most comprehensive of these, in the sense that it can accommodate the four important quantitative relations: equality, rank-order, equality of intervals and equality of ratios. At about the same time, von Neumann and Morgenstern [13] examined the principles of measurement in relation to the concept of utility widely used in economics then and now. They concluded that utility could be claimed correct up to a positive linear transformation, i.e. Stevens's interval scale. Subsequently Thomas [14,15] was able to build on this work by including the notions concerning utility introduced by Pratt [16] and Atkinson [17] to show that a ratio scale for utility against wealth can be constructed, with risk-aversion as parameter.
Schley and Peters [18] have proposed that an individual's risk-aversion is influenced by the subject's imprecise perception of numerical magnitudes: "inexact mappings of symbolic numbers onto mental magnitudes". Differences in the curvatures of people's utility functions can then be explained, at least in part, by differing levels of numerical acuity. Those with higher powers of numerical discrimination display an effective risk-aversion that is lower than that of those with a less well developed ability to distinguish between numbers of similar but different magnitude. Meanwhile, different situations call for different risk-aversions. Thomas and Chrystal [19] suggest that a consumer will be open to quantity retail promotions such as "buy one get one free" provided his/her risk-aversion lies between 0 and 1. When commodities such as eggs are sold in different pack sizes, satisfying the desires of the average cautious consumer (risk-aversion = 0.5) will then result in a ratio of successive pack sizes equal to the square of the golden ratio, namely 2.62, while the price-ratio will be the golden ratio, 1.62. The "three for the price of two" promotion (where, for example, 18 eggs are offered for the price of two half-dozen packs) emerges as a reasonably close approximation, with an implied risk-aversion of 0.37.
Thomas [15] laid out the challenge of measuring risk-aversion, an important parameter lying squarely in the field of social science, at the interface between psychology and economics. Further progress was made when it was shown recently that this psycho-economic property can be measured empirically for decisions on life extension using data collected from the lives of billions of people throughout the world [20]. See also [21]. The measurement became possible as the result of the application of a new theory, the J-value [22,23], to decisions on extending life. The study by Thomas and Waddington utilised the results of the continuing "natural experiment" that is taking place in all societies at all times to find the optimal trade-off between promoting longer life (for example through better public health, enhanced health care and improved industrial protection) and the resources that people are prepared to devote to the task, under the constraint that different nations have different levels of resource.

Problems facing measurement by survey
Subjectivity and bias are two obvious hazards lying in wait for those attempting measurements in non-physical fields. Finkelstein [2] suggests that the observers/analysts in social sciences may well not be objective, but operate "on the basis of ideologically motivated theories". Mari et al. [24] follow Karl Popper [25] in regarding objectivity and inter-subjective testability as the critical features for the reliability and dependability of physical and non-physical measurements. Transgressions against those precepts can arise in one particular method of measurement used extensively in the social sciences, namely an opinion survey, by which the consolidated view of the wider population is sought by questioning only a sample. The intention is usually to measure a subjective parameter, and this may bring its own problems, as discussed later.
Rossi and Berglund [9], writing in the context of human perceptions of smell, introduce the important concept of the "person as a measuring instrument" and propose that "the screening and testing of participants, as measuring instruments, are prerequisites for reliable and valid psychological measurement". An example would be the wine tasters employed in wineries and restaurants, who, by nature and experience, have developed more sophisticated and discerning palates than the average person and are therefore able to pass a more expert verdict. Their judgements will influence the market into which a particular wine is sold, as well as its price. The wine-tasting process might be automated eventually, perhaps using a version of the electronic nose discussed in [9], possibly acting in conjunction with analytic chemical instrumentation.
After "sentencing" by the experts, the wine will subsequently be tasted by many people, and each will make his or her own evaluation of it. It is important to note that, in this case, each person will be making a measurement of the same thing, namely the qualities of the wine in question. However, this contrasts strongly with many survey measurements made in the socio-politico arena, where the person is certainly the measuring instrument, but what is being measured is not common to all but is, instead, unique to the person.
For example, by the theory of the utility model for the value of a prevented fatality (VPF) (see Section 3.1 for details), the individual is required to possess a maximum acceptable price (MAP) that he or she is prepared to pay to avert an injury and a minimum acceptable compensation level (MAC) that he/she is prepared to accept to make up for receiving that injury. It should be noted that the MAP and the MAC are specific to the individual and cannot be regarded as the individual's subjective measurement of a feature which the other respondents have access to and can measure. In these surveys, there is no better expert on the opinion of the individual than the individual himself/herself. Moreover, screening out individuals in such a case is not justified unless it can be proved in advance that the person is not qualified to offer an opinion, for example by reason of insanity or through being too young to have acquired the necessary experience (e.g. being below voting age). There is no correct view in such a survey, and in an open and civilised society, there should be no discrimination in favour of certain, selected opinions. It is with such surveys that the present paper is concerned.
The least problematic and simplest of such surveys is probably the political opinion poll conducted among adults qualified to vote. This is because in a democracy, the respondent has exclusive control of his or her opinion, which he or she may later convert into a vote. But even here the process of conversion from opinion to vote is not straightforward. For example, the person's view can change between the date of the survey and the date of the election, and, of course, the person may not vote on the day, either through choice or circumstance. Moreover, the respondent may not want to divulge his/her true opinion to the pollster and may instead express a view calculated to avoid the interviewer's presumed censure.
It is obvious that, as a minimum, the subjective opinions of the surveyor should not be allowed to affect the result of the political opinion survey. There are two correctives against this. In the first instance, it is unlikely in a free and competitive society that there will be only one political polling organisation, so that a fairly immediate check on results will be at hand. Moreover there will eventually be a definitive check when the result of the official ballot becomes known. But it is important to realise that such checks and balances may well be absent in surveys beyond the political arena, and this increases the requirement for the highest levels of impartiality among surveyors and analysts.
The need to avoid analysts' opinions encroaching on the results of their opinion surveys led to the development in 2014 of the new criterion of Structural View Independence [26]. The mathematics prove that it is possible for the analyst to select in advance a plausible method to consolidate the survey results into a single figure so that the final value is guaranteed to be biased either low or high. For example, the geometric mean will always produce a figure that is too low, while the root-mean-square value will always give a value that is too high. These are instances where a general nonlinear, increasing and differentiable transformation is applied, then the mean found for the transformed sample and finally the result back-transformed to give the consolidated figure. Structural View Independence requires the study to be structured to ensure that the weighting given to each person's opinion is independent of its content. Only in this way can the views of all respondents be accorded the equal treatment they deserve. Out of all the transforms considered, this criterion is satisfied only by linear transformations. This implies that the consolidated view should be found simply by taking the arithmetic average. In one important case, a team of surveyors employed a further, non-differentiable transform to interpret opinion surveys, and the results were used to justify reducing the spending against multi-fatality accidents on the UK's railways by a factor of three. However it was shown in [27] that this new transformation discriminated systematically in favour of low valuations (and hence low spendings) and thus violated the principle of Structural View Independence. This meant that it was unsafe to make use of the survey results.
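The effect of the chosen transform on the consolidated figure can be illustrated with a small numerical sketch (the respondent valuations below are hypothetical). By the geometric-arithmetic-RMS mean inequality, back-transforming the mean of log-transformed responses (the geometric mean) can only understate the arithmetic average, while back-transforming the mean of squared responses (the root-mean-square) can only overstate it:

```python
import numpy as np

# Hypothetical stated valuations from five respondents (illustrative only)
views = np.array([1.0, 2.0, 4.0, 8.0, 16.0])

arithmetic = views.mean()                   # linear transform: equal weight to every view
geometric = np.exp(np.log(views).mean())    # log transform, back-transformed: biased low
rms = np.sqrt((views ** 2).mean())          # square transform, back-transformed: biased high

print(geometric, arithmetic, rms)
```

Only the arithmetic mean weights each stated view independently of its content, which is why it alone satisfies Structural View Independence among the transforms considered.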
It is obvious from the discussion above that there are significant pitfalls associated with the application of opinion surveys for general parameter estimation. Even so they are often used in economics to put a value on public goods, such as clean air or the continued survival of a rare species of plant or animal.
It is generally accepted [28] that the ways to measure the value of any good are, in order of preference:
1. the market value, if a free market in the good exists (this is the best way);
2. the value deduced from revealed preferences, in line with John Locke's dictum "I have always thought the actions of men the best interpreters of their thoughts" [29];
3. the value deduced from stated preferences, typically from opinion surveys, where Fujiwara and Campbell [30] have uttered the reservation: "respondents in stated preference surveys may have an incentive to deliberately misrepresent their true preferences in order to achieve a more desirable outcome for themselves . . . individuals may overstate their valuations of the good if they believe their responses influence its provision and are unrelated to the price they will be charged for it".
It is when methods 1 and 2 prove either impossible or very difficult to arrange that option 3 is sometimes chosen.
As a minimum, the criterion of Structural View Independence needs to be satisfied if an opinion survey is to be used as a measurement tool. It is equally clear that the most rigorous standards must be applied to the task of statistical inference if the survey results are to have meaning. Even then, of course, the Fujiwara and Campbell caveat suggests that high accuracy cannot be expected.

Survey measurement of the "value of a prevented fatality"
The present author has highlighted a number of instances of poor practice in survey measurements, where lack of rigour has had notable adverse implications [31]. One ethically significant and practically important case concerns the measurement of the UK Government's "value of a prevented fatality" (VPF). This is defined as the maximum amount that it is notionally reasonable to pay for a safety measure that will reduce by one the expected number of preventable premature deaths in a large population. But it is pointed out in [32] that the VPF constitutes only a crude estimate of what is lost when someone is deprived of his or her life. For example, an 18 year-old generally has many more years of life ahead than a 78 year-old. Moreover, the UK VPF, which is based on an opinion survey carried out over 20 years ago [33], has been shown to contain major flaws as a result of the use of an invalid method of interpreting its survey evidence [34].
The success in validating the J-value method against pan-national data [20] enables the statement to be made that the UK VPF in current use, £1.83M (2016 £s), is a factor of 4 to 5 below what would be reasonable. To the extent that the concept has validity, the VPF should lie a lot closer to the equivalent figure announced by the U.S. Department of Transportation in 2016, namely $9.6M, based on stated preference studies [35]. Using J-value analysis for valuing human life, the VPF, now interpreted as the amount that should be spent to preserve the life expectancy of the average person in the UK, ought to be £8.59M (2015 £s). The clear disparity between this figure and that still used by the UK Government has obvious, negative implications for the priority being assigned to the safety of UK citizens.
Although the research team responsible for the VPF survey made two attempts to defend its interpretation method, [36,37], the team's defence was refuted in detail [38,39]. See also [40] for a review of the multiple flaws besetting the UK VPF and [41] for a demonstration of the lack of justification for the surveyors rejecting their initial survey, which would have produced a much higher figure [42].
It has been pointed out previously that the validity of a survey estimate for the VPF "rests critically on the sample population reflecting closely the probability density for wealth of the target population as a whole" (last paragraph of Appendix A.5 of [34]). But the average wealth of the participants in the Carthy survey may be calculated using the surveyors' own method as between £3568 and £7136, less than 10% of the average net wealth of UK adults at the time of the survey.
A further factor contributing to the glaring mismatch between the apparently low levels of wealth amongst the respondents and the much higher figures in the UK population as a whole might well be the small sample size used in the VPF survey: just 167 people.
The important general question is raised: what is the minimum sample size needed to measure a wealth-dependent economic parameter by opinion survey? In addressing this issue, the well documented case of the survey measurement of the VPF is used as an exemplar. Such a survey is a variant of the political opinion polls widely familiar from news reporting. A significant difference is that the measurement of the economic good usually involves the characterisation of people's choices from a continuum rather than from two or more discrete options.

Structure of the paper
This section has provided a brief review of the history of measurement in the social sciences, of the difficulties encountered in using surveys as measurement tools and of some of the problems that have been observed in attempts to measure the "value of a prevented fatality" (VPF).
New data on the distribution of personal wealth in the UK allow an assessment to be made, in Section 2, of the challenge to the measurement of a wealth-dependent parameter that is posed by the very great variation in wealth across a developed country such as the United Kingdom. Section 3 develops two diverse models for the survey measurement of the VPF: the Utility Model and the Multiplier Model.
In Section 4, the smallest sample size needed to guarantee a specified precision in a political opinion poll is established using the DeMoivre-Laplace Limit Theorem. This is then related via the Central Limit Theorem to the lowest number of people required in a survey measurement of a wealth-dependent parameter at the same precision. A degenerate lower limit is found for the ratio of the VPF minimum sample size to that of a political opinion poll at the same level of precision. This limiting ''numbers ratio" is found to be the same under both models for the VPF survey.
Section 5 contains the discussion and Section 6 the conclusions. Appendix A derives the mean and variance when data for cumulative probability are given in tabular form. Appendix B introduces the DeMoivre-Laplace Limit Theorem to explain the margin of error used in political opinion polls. Appendix C sets out how, for an opinion poll, the confidence interval results from a minimisation procedure applied to the interval containing the population average. Appendix D introduces the Central Limit Theorem as a generalisation of the DeMoivre-Laplace theorem and explains the derivation of a confidence interval for the population average of a continuous parameter.
The final appendix, Appendix E, details how a random multiplier may be incorporated into the Utility Model to produce the Extended Utility Model, thereby allowing for the likely variation in stated personal VPF amongst individuals with similar personal wealth.

The challenge of measuring a wealth-dependent parameter by survey
Figures from the UK's Office for National Statistics (ONS) show people's wealth varies over a very wide range [43]. While a small fraction, between 1% and 2%, of UK households have negative assets, the richest person in Britain is said to be worth £21bn [44]. Other developed countries are likely to exhibit similarly large variations in the individual assets of their citizens.
Approximate values for individual wealth percentiles may be found by dividing the household wealth percentiles provided by the Office for National Statistics (Fig. 3 of [43]) by the average number of people per household, namely 2.4 [45]. See Table 1 below. A small fraction (<2%) of the UK population was living with the burden of net debt in 2014-2016, but the average personal wealth was £198,112 (a figure found by dividing the ONS figure for total wealth, £12.8 tn, by the 2014 UK population of 64.6M). Meanwhile, slightly more than 10% of UK citizens had assets of £0.5M or above, and more than 1 in 50 were millionaires. The cumulative probability for individual wealth in the UK is illustrated in Fig. 1, while Fig. 2 gives the probability density for wealth in the UK in 2014-2016. The irregularity of the probability density is striking, showing a very considerable variation within small ranges as well as over the full span.
The starting wealth in Table 1 (a debt of £4933) is found by extrapolation from the ONS data. The final point is estimated after assuming that the probability density is uniform between percentiles, as in Appendix A. Using Eq. (A.5), £1,796,407 emerges as the necessary balancing wealth needed to ensure that the average wealth from Table 1 matches the figure of £198,112 calculated in the last paragraph. This provides an approximate match to the rising trend observed in the preceding percentiles (Fig. 3). Of course, given the £21 bn of wealth ascribed to one UK citizen and the number of other known billionaires in the population, the limiting wealth is clearly a notional figure only. Its relatively low value, less than £2M, suggests that the variance on wealth will be understated as a result.
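The balancing-wealth calculation can be sketched numerically. Assuming, as in Appendix A, a uniform probability density between tabulated percentiles, each interval contributes its midpoint weighted by its probability mass to the mean; the unknown final wealth value is then chosen so that the overall mean hits the target. The table below is a hypothetical stand-in, not the ONS data, and the variable names are ours:

```python
import numpy as np

# Hypothetical cumulative wealth table (F, w): illustrative only, NOT the ONS figures.
F = np.array([0.00, 0.25, 0.50, 0.75, 0.99, 1.00])             # cumulative probability
w = np.array([-5000.0, 20000.0, 60000.0, 150000.0, 500000.0])  # wealth at each F bar the last

target_mean = 198_112.0   # population-average wealth the completed table must reproduce

# Uniform density between percentiles: each interval contributes
# 0.5*(w_lo + w_hi)*dF to the mean.
dF = np.diff(F)
partial = np.sum(dF[:-1] * 0.5 * (w[:-1] + w[1:]))

# Solve for the balancing wealth w_end closing the final interval:
# target = partial + dF[-1] * 0.5 * (w[-1] + w_end)
w_end = 2.0 * (target_mean - partial) / dF[-1] - w[-1]
print(w_end)
```

Because the final interval carries little probability mass, the balancing wealth is highly sensitive to the target mean, which is why the paper treats its £1,796,407 figure as notional.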
Even so, the great variation in wealth across the nation remains evident, and this poses a significant challenge to those hoping to measure, by opinion survey, the average of a wealth-dependent parameter. To put the challenge in context, the ONS felt it necessary to set the size of the sample in their wealth survey at 18,000. This is very much larger than the samples used in most political opinion polls, which tend to be in the low thousands.
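The customary poll sample sizes can be sketched from the margin-of-error formula. This is a minimal illustration, not the full treatment of Section 4: assuming a 95% confidence level (z ≈ 1.96) and the worst case p = 0.5, the margin of error of a proportion estimated from n respondents is z/(2√n), and inverting gives the smallest n meeting a specified margin (the function name is ours):

```python
import math

def min_poll_sample(margin, z=1.96):
    """Worst-case (p = 0.5) minimum sample size for a simple random poll
    whose margin of error at the given z-level must not exceed `margin`."""
    return math.ceil((z / (2.0 * margin)) ** 2)

print(min_poll_sample(0.03))  # 3% margin at 95% confidence -> 1068
```

A 3% margin thus needs roughly 1100 respondents, consistent with political polls sampling in the low thousands and far below the ONS wealth survey's 18,000.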
The paper will examine the VPF as an exemplar of a continuous parameter that is strongly conditioned by the wealth of the individual making the response.

Models for a survey measurement of the VPF
This Section will derive two possible mathematical models of the survey measurement of the VPF. The first of these will provide a justification for the assumption that an individual's personal VPF is heavily dependent on his or her wealth. The second provides a general, and hence transferable, representation of a parameter that is strongly wealth dependent.

Utility Model of VPF survey
The first approach accords with that presented by Carthy et al. [33] as analysed by Thomas and Vaughan (Appendix A of [34]). Under this model the person will be prepared to sacrifice some of his/her wealth to avoid injury as long as his/her expected utility stays constant.
Let $w_i$ be the starting wealth of the $i$th individual and $U_i(w_i)$ be his/her utility of wealth when he/she is in good health. Now consider an injury, $k$, that is assumed to reduce the individual's utility of wealth to $I_{ki}(w_i)$. Let the new utility be a fraction, $c_{ki}$, of the person's utility of wealth when in good health:

$$I_{ki}(w_i) = c_{ki}\,U_i(w_i), \qquad 0 \le c_{ki} \le 1 \tag{1}$$

(This equation, attributed to Viscusi and Evans [46], differs from the form, $I_{ki}(w_i) = U_i(w_i) - a_{ki}$, $a_{ki} \ge 0$, proposed by Carthy et al. [33], but those later authors clearly regarded it as a legitimate alternative (see Note 5 of [33]).) Now suppose that, over some period of time, an individual faces two mutually exclusive possibilities: (i) he/she may incur injury $k$, with probability $q_k$, or (ii) he/she may continue in full health. (The simplifying assumption is made that the probability of death from causes other than the specified injury is negligible.) The individual's utility of wealth will now depend on whether or not the injury occurs and hence will be a random variable, $Z_i(w_i)$, with an expected value, $z_i$, given by:

$$z_i = q_k\,I_{ki}(w_i) + (1 - q_k)\,U_i(w_i) = \left[1 - q_k(1 - c_{ki})\right]U_i(w_i) \tag{2}$$

It may be possible for the individual to reduce the probability, $q_k$, of injury $k$ over some period by expending money on protection, leading to a fall in his/her wealth. Alternatively, he/she might accept a higher injury probability but receive compensation, with the result that his/her wealth, $w_i$, rises. Thus changes to $q_k$ will be accompanied by related changes in $w_i$, and the limiting condition occurs when the expected utility stays the same, viz. $z_i = \text{constant}$, which implies that $\partial z_i/\partial q_k = 0$.
Carrying out the necessary partial differentiation of Eq. (2) with respect to $q_k$ and setting the result to zero gives the rate of change, $\partial w_i/\partial q_k$, of wealth, $w_i$, with the probability, $q_k$, of injury $k$, for a constant expected utility:

$$\left.\frac{\partial w_i}{\partial q_k}\right|_{z_i = \text{const}} = \frac{(1 - c_{ki})\,U_i(w_i)}{\left[1 - q_k(1 - c_{ki})\right]U_i'(w_i)} \tag{3}$$

Now the probability, $p_k$, of not receiving injury $k$ is $p_k = 1 - q_k$, so that $dp_k/dq_k = -1$ and $\partial w_i/\partial p_k = -\partial w_i/\partial q_k$. It is reasonable to assume that a person would be prepared to trade some of his/her wealth, $w_i$, for a higher probability of not receiving injury $k$. An indifference curve should then exist, on which the person's utility stays constant. The expression $-\left.\partial w_i/\partial p_k\right|_{z_i = \text{const}}$ quantifies the trade-off at a general point, $(p_k, w_i)$, on the indifference curve: wealth is decreased by a small, positive amount, $\delta w_i$, from $w_i$ to $w_i - \delta w_i$, in order that the non-injury probability, $p_k$, should increase to $p_k + \delta p_k$, where $\delta p_k$ is small and positive. The negative of the partial derivative of wealth with respect to non-injury probability may be described as the marginal rate of substitution, $m_{ki}$, of non-injury probability, $p_k$, in place of wealth, $w_i$:

$$m_{ki} = -\left.\frac{\partial w_i}{\partial p_k}\right|_{z_i = \text{const}} = \frac{(1 - c_{ki})\,U_i(w_i)}{\left[1 - q_k(1 - c_{ki})\right]U_i'(w_i)} \tag{4}$$

[Table 1: Cumulative probability distribution for individual wealth, $F_W(w)$, for the UK, 2014-2016.]

It is argued in [14,15] that only utility functions from the Power family can be true descriptions of human experience. Hence the utility of wealth may be written:

$$U_i(w_i) = \begin{cases} a_i\,\dfrac{w_i^{1-\varepsilon} - 1}{1 - \varepsilon} & \text{for } \varepsilon \ne 1 \\[1.5ex] a_i \ln w_i & \text{for } \varepsilon = 1 \end{cases} \tag{5}$$

where the risk-aversion, $\varepsilon$, is a constant and $a_i$ is also a constant. This conforms with the ideas of von Neumann and Morgenstern [13] and the clarification brought out in [14,15] that the Power utility formulation of Eq. (5) gives the utility relative to the utility of one unit of wealth, thus producing a ratio scale for utility of wealth.
Differentiating equation pair (5) with respect to wealth gives, in both cases:

$$U_i'(w_i) = a_i\,w_i^{-\varepsilon} \tag{6}$$

Thus, substituting from Eqs. (5) and (6) into Eq. (4) gives:

$$m_{ki} = \begin{cases} \dfrac{(1 - c_{ki})\left(w_i - w_i^{\varepsilon}\right)}{(1 - \varepsilon)\left[1 - q_k(1 - c_{ki})\right]} & \text{for } \varepsilon \ne 1 \\[2ex] \dfrac{(1 - c_{ki})\,w_i \ln w_i}{1 - q_k(1 - c_{ki})} & \text{for } \varepsilon = 1 \end{cases} \tag{7}$$

Now let $k$ denote a fatal injury, $D$, in which case equation set (7) becomes:

$$m_{Di} = \begin{cases} \dfrac{(1 - c_{Di})\left(w_i - w_i^{\varepsilon}\right)}{(1 - \varepsilon)\left[1 - q_D(1 - c_{Di})\right]} & \text{for } \varepsilon \ne 1 \\[2ex] \dfrac{(1 - c_{Di})\,w_i \ln w_i}{1 - q_D(1 - c_{Di})} & \text{for } \varepsilon = 1 \end{cases} \tag{8}$$

Eq. (8) depends both on the individual's fractional loss, $1 - c_{Di}$, of utility through dying prematurely and on the probability, $q_D$, of being killed. However, we may judge that nearly all utility is lost on death for the average person, so that $c_{Di} \to 0$.
Presumably $c_{Di}$ will attain exactly that figure for an individual without close relatives and uninterested in charitable giving. Moreover, in many cases, for example when an improvement is being considered for an industrial protection system, the probability of death is already low: $q_D \ll 1$. Hence equation set (8) may be approximated by the personal VPF, $v_i$:

$$v_i = \begin{cases} \dfrac{w_i - w_i^{\varepsilon}}{1 - \varepsilon} & \text{for } \varepsilon \ne 1 \\[1.5ex] w_i \ln w_i & \text{for } \varepsilon = 1 \end{cases} \tag{9}$$

A significant failing of the model just presented is the lack of any dependence on age, even though what is at stake for a typical young person is patently very different from what is at stake for an elderly individual: the VPF values the life of a 20-year-old with 61 years of expected life to come the same as that of a 90-year-old with about 4 years of life expectancy. This reflects a major philosophical weakness in the VPF concept, as has been pointed out previously by Sunstein [47] as well as Thomas and Vaughan [32]. By the theory just presented, and from Eq. (9) in particular, the individual's VPF is determined entirely by his or her wealth as soon as the risk-aversion, $\varepsilon$, is specified, and randomness is introduced only by the process of selecting a person from the population. The absence of variation in stated VPFs amongst people with the same wealth is clearly a simplification that will reduce the variance predicted by the model. (This issue is explored in Appendix E.)

A table of percentiles for the VPF may be built up by applying equation set (9) to the percentiles of wealth given in Table 1. The methods of Appendix A may then be applied to find the population-average VPF and the variance of personal VPF.
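The wealth-to-VPF mapping of equation set (9) can be sketched in code. This is a minimal illustration under the Power-utility assumptions of the text, in which the approximate personal VPF is (w − w^ε)/(1 − ε) for ε ≠ 1 and w·ln w for ε = 1; the function name is ours, and non-positive wealth is mapped to zero, the simpler of the two options discussed in the text for negative percentiles:

```python
import math

def personal_vpf(wealth, eps):
    """Sketch of the approximate personal VPF under a Power utility with
    risk-aversion eps. Non-positive wealth is mapped to zero (the simpler
    of the two treatments of indebted respondents discussed in the text)."""
    if wealth <= 0.0:
        return 0.0
    if abs(eps - 1.0) < 1e-12:
        return wealth * math.log(wealth)          # limiting case, eps = 1
    return (wealth - wealth ** eps) / (1.0 - eps)

# Illustrative only: a respondent holding the UK average personal wealth
print(personal_vpf(198_112.0, 0.95))
```

Applying such a function to every percentile of a wealth table and averaging, as in Appendix A, would yield the model's expected VPF for a given ε.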
The first two percentiles of wealth are negative, which poses a problem in computing the corresponding figures for VPF from Eq. (9). Two potential ways forward may be considered. In the first, $|w_i|$ and $|w_i|^{\varepsilon}$ are substituted in place of $w_i$ and $w_i^{\varepsilon}$ respectively in Eq. (9), and the result is multiplied by a negative factor, $-\alpha$, with $\alpha \approx 2$. This takes up the notion, introduced by Kahneman and Tversky [48] and discussed further in [49], that the disutility of a debit is numerically greater than the utility of the same sum in credit. Hence

$$v_i = -\alpha\,\frac{|w_i| - |w_i|^{\varepsilon}}{1 - \varepsilon}, \qquad \alpha \approx 2 \tag{10}$$

which produces negative values for $v_1$ and $v_2$. The other approach is simply to set $v_1 = v_2 = 0$. In fact it matters little which of these methods is used, as the changes in the calculated values of $\mu_V$ and $\sigma_V^2$ are confined to the 4th significant figure. Hence it is reasonable to choose the simpler second option. Such a course avoids the awkward implication that those in debt will regard death as a financially attractive option, although it can still be inferred that they regard death as financially neutral.
The validation of the J-value method [20] means that it is possible to find a good estimate of the value to be placed on the average life to come for UK citizens. Carrying out this exercise on actuarial and GDP data for 2015 gives a figure of £8,587,700. Taking this to be the actual value of the VPF, an implied value for $\varepsilon$ may be found, namely that which causes the expected value, $E(V)$, to coincide with this figure. $E(V)$ may be found by calculating the personal VPF, $v_i$, for each wealth, $w_i$, listed in Table 1 and applying the averaging methods of Appendix A. The implied risk-aversion is consistent with the value recommended in the Treasury's current Green Book [50]. It conforms also to the Treasury's previous advice in its Green Book, namely that $\varepsilon$ was likely to be just below or just above unity [51].

Multiplier Model of VPF survey
An alternative model assumes that the individual's stated VPF is some multiple, R, of his/her wealth: the ''Multiplier Model". Rather than assigning the multiplier the same value for all people, it is assumed that R is a random variable, independent of individual wealth. This provides for likely variations in the responses of people with similar wealth, allowing for the presence of other, unspecified influences such as differences in psychological makeup.
The individual's VPF, V, is then regarded as a continuous random variable obeying the relation:

V = RW          (12)

where W is wealth, with W and R taken to be continuous random variables, independent of one another. Let the multiplier, R, take the form:

R = r_0 + Q          (13)

where r_0 ≥ 0 is deterministic and Q is a random variable distributed uniformly over the interval (0, s). Thus the probability density function, f_Q(q), for Q is

f_Q(q) = 1/s for 0 ≤ q ≤ s, and zero otherwise          (14)

The expected value of the multiplier is therefore:

E(R) = r_0 + s/2          (15)

based on the standard result that the expected value for a uniform distribution is its central value. The maximum value, r_0max, of the deterministic component, r_0, is that which causes the random band to degenerate to a point of zero length, when E(R) = r_0max. Given that R and W are modelled as independent, the expected value of their product is simply the product of their expected values:

E(V) = E(R) E(W)          (16)

The average value of individual wealth is calculated using Eq. (A.5) of Appendix A as:

E(W) = £198,112          (17)

As was the case with the Utility Model, E(V) is set to the monetary value assigned to population-average life expectancy by J-value analysis, namely £8,587,700. Hence from Eq. (16), the expected value of the multiplier, R, is:

E(R) = E(V)/E(W) = 43.35          (18)

Using Eq. (15), the extent of the random band, s, is then

s = 2(E(R) − r_0)          (19)

Since r_0 is deterministic, the variance of R is found by applying the variance operator to Eq. (13), revealing it to be simply the variance of Q:

var(R) = var(Q) = s²/12          (20)

where the last step uses a standard result for a uniform distribution. Meanwhile the variance of individual wealth, var(W), may be found from the tabulated data of Table 1 (Eq. (21)). The variance of V is that of the product of the two independent random variables, R and W, which may be written [52]

var(V) = var(R) var(W) + var(R) (E(W))² + var(W) (E(R))²          (22)

with the square root giving the associated standard deviation:

σ_V = √var(V)          (23)

Varying the deterministic component, r_0, of the random multiplier, R, will cause the length, s, of the random band to vary.
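As a numerical check on the chain of results above, the sketch below reproduces the Multiplier Model moments in Python. E(W) and E(V) are the figures quoted in the text; the standard deviation of wealth, σ_W ≈ £261,600, is back-calculated from the σ_V = r_0max σ_W = £11.34M quoted in the next paragraph and should be treated as approximate.

```python
E_W = 198_112.0          # average individual wealth, Eq. (17)
sigma_W = 261_600.0      # approximate std dev of wealth (back-figured assumption)
E_V = 8_587_700.0        # J-value figure for the population-average VPF

E_R = E_V / E_W          # Eq. (18): ~43.35
r0 = 0.0                 # fully random multiplier (lower end of the range)
s = 2.0 * (E_R - r0)     # Eq. (19): extent of the uniform band
var_R = s ** 2 / 12.0    # Eq. (20): variance of a uniform distribution
var_W = sigma_W ** 2

# Eq. (22): variance of the product of two independent random variables
var_V = var_R * var_W + var_R * E_W ** 2 + var_W * E_R ** 2
sigma_V = var_V ** 0.5   # ~14M pounds, the maximum quoted in the text
```

Setting r0 = E_R instead drives var_R to zero and recovers the fully deterministic limit.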
At the upper limit for r_0, r_0 = r_0max, the multiplier will be wholly deterministic, while at the lower end of the range, r_0 = 0, the multiplier, R, will be uniformly distributed over the full range, 0 to 2r_0max. Fig. 4 shows the effect on the probability distribution, f_Q(q), for the random component, Q, of the multiplier as the deterministic component is decreased. The changes in the deterministic component, r_0, will be reflected in alterations to the standard deviation, σ_V, for the VPF. Fig. 5 shows the standard deviation, σ_V, plotted against the value selected for the deterministic component, r_0, over the range of possible values, 0 ≤ r_0 ≤ r_0max, where r_0max = E(R) = 43.35. The standard deviation for VPF takes its highest value, £14M, when the multiplier, R, is uniformly random between 0 and 2E(R), viz. between 0 and 86.7. Most of the variability comes from the variation of individual wealth in the population, as can be seen from the last point on the graph, where the multiplier is fixed at r_0max. At this point σ_V = r_0max σ_W = £11.34M. This is only 19% down on the £14M figure for the standard deviation of the VPF that is associated with a maximally random model for R.

The numbers of respondents in a political opinion poll and in a survey measurement of a general continuous parameter, A
For compactness, the term, ''b-level precision" is introduced in this paper as a measure of the precision afforded by the average value derived from a survey. It is taken to denote the length of the b% confidence interval for the population mean divided by the best available estimate of that mean.
Appendix B examines the mathematical representation of political opinion polls and sets out the relationships amongst the confidence interval, the margin of error and the b-level precision. Appendix D explains the relation between the b-level precision and the confidence interval for a general, continuous parameter measured by survey.
The b-level precision, k_A^(b), for the measurement of a general continuous variable, A, is found from Eq. (D.9) of Appendix D. Particularising σ, μ and n by assigning the subscript A gives:

k_A^(b) = 2 z_(1+b)/2 σ_A / (μ_A √n_A)          (24)

while for a political opinion poll with the vote evenly split (Appendix B):

k_poll^(b) = 2 z_(1+b)/2 / √n_poll          (25)

where n_poll is the number of respondents to the opinion poll. Let us now compare two different survey measurements, one attempting to measure a continuous parameter, A, and the other a political opinion poll. If they exhibit the same b-level precision, k_A^(b) = k_poll^(b), then, from Eqs. (24) and (25):

n_A / n_poll = σ_A² / μ_A² = var(A) / (E(A))²          (26)

Two features of Eq. (26) are noteworthy: (i) if the survey measurement of a parameter, A, and the political opinion poll are to have the same b-level precision, the ratio of the number of respondents in the survey for A to the number in the political opinion survey will depend solely on the distribution of A in the population; (ii) the result applies to all confidence levels, b.

The margin of error and the ''12% rule"
Most people are familiar with opinion pollsters quoting a ''margin of error" to indicate the likely precision of their political predictions. The margin of error may be defined as half the length of the b = 95% confidence interval when a binary choice is given and the probability of either option being supported in the actual vote is about 50% (see Appendices B and C). Such conditions were met, for example, in the UK's EU Referendum of 2016 [54], where 51.9% of the voting public chose the option of leaving the European Union while 48.1% wanted to remain.
Political opinion pollsters have reached a de facto consensus in the UK that the margin of error should be no greater than 3%. This figure implies that the length, w, of the 95% confidence interval should be no more than 6% in absolute terms. Referred to the fraction of people voting for a particular option, about 0.5 when the vote is close, this implies a b-level precision, k_poll^(b), of 12% or better for a political opinion poll with b = 95%: k_poll^(0.95) ≤ 0.12. Such a b-level precision may be seen as corresponding to a b-level tolerance of ±6% or better at b = 95%.
It seems reasonable to require that a survey estimate of any important population parameter, A, should match the precision of the political opinion polls frequently carried in newspapers and the broadcast media. This implies that the b-level precision at b ¼ 95% should be 12% or better for the survey measurement of A.
Applying the 12% rule to Eq. (25) with b = 0.95 requires n_poll ≥ (2 z_0.975 / 0.12)² = 1067.1, and thus

n_poll ≥ 1068          (29)

after allowing for the integer requirement. The sample size for the VPF survey giving the same b-level precision is, using Eq. (26),

n_V = (var(V)/(E(V))²) n_poll          (30)

For the Utility Model of the VPF survey, the numerical values of E(V) and var(V) are given in Eq. (11), so that, combining Eq. (30) with inequality (29), the minimum sample size when the 12% rule is obeyed is

n_V,min = 2648          (31)

A range of values of E(V) and var(V) is possible for the Multiplier Model of the VPF survey, corresponding to different values of the fixed component, r_0, of the multiplier, R. Allowing r_0 to span its full range from 0 to r_0max = 43.35 gives the following interval for the minimum sample number, n_V,min, that will obey the 12% rule and thus provide a b-level precision of k_V^(0.95) ≤ 0.12:

1860 ≤ n_V,min ≤ 2835          (32)

Fig. 6 graphs the minimum sample size, n_V,min, when the 12% rule is obeyed for the Multiplier Model of the VPF survey. The fixed component, r_0, of the multiplier, R, ranges from 0.0, where R is fully stochastic, to r_0max, where the multiplier becomes entirely deterministic. Also marked is the minimum number of respondents needed by the Utility Model of the VPF survey when the 12% rule is obeyed.
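The arithmetic of Eqs. (25), (26) and (29)-(32) can be checked in a few lines of Python. The wealth standard deviation, σ_W ≈ £261,600, is an approximate figure inferred from the text, so the Multiplier Model bounds emerge a few respondents away from the quoted 1860 and 2835.

```python
import math

z = 1.96   # two-sided 95% normal quantile
k = 0.12   # b-level precision demanded by the 12% rule at b = 0.95

# Eq. (25) rearranged: minimum poll sample for a 3% margin of error
n_poll = math.ceil((2.0 * z / k) ** 2)   # 1067.1 -> 1068

E_W, sigma_W, E_V = 198_112.0, 261_600.0, 8_587_700.0   # sigma_W approximate
E_R = E_V / E_W                                          # ~43.35

def numbers_ratio(var_R):
    """var(V)/E(V)^2 from Eq. (22), for a given variance of the multiplier."""
    var_W = sigma_W ** 2
    var_V = var_R * var_W + var_R * E_W ** 2 + var_W * E_R ** 2
    return var_V / E_V ** 2

n_low = math.ceil(numbers_ratio(0.0) * n_poll)                    # r0 = r0max, ~1860
n_high = math.ceil(numbers_ratio((2 * E_R) ** 2 / 12) * n_poll)   # r0 = 0, ~2835
```

The two extremes of r_0 thus bracket the Utility Model's 2648 figure, as noted below.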
Comparing inequalities (31) and (32), it is clear that the two models for VPF survey produce similar estimates for minimum sample size, with the figure for the Utility Model falling within the range set by the Multiplier Model.
It may be concluded that the sample size for a VPF survey needs to be set much higher than for a political opinion poll if both surveys are to satisfy the 12% rule.
The numbers ratio n_V/n_poll for the Utility Model for VPF survey

Under the Utility Model, the covariance of W and W^e may be expressed in terms of the correlation coefficient, ρ: cov(W, W^e) = ρ σ_W σ_{W^e} (36). When the risk-aversion is e = 1.1605, the correlation coefficient is 0.9955 for the range of wealths listed in Table 1; see Fig. 7. Combining Eqs. (34)-(36) gives:

var(V)/(E(V))² = [var(W) + var(W^e) − 2ρ σ_W σ_{W^e}] / (E(W) − E(W^e))²
              ≈ [var(W) + var(W^e) − 2 σ_W σ_{W^e}] / (E(W) − E(W^e))²          (37)

where ρ ≈ 1 has been used in the second step. Factorising the numerator as a square and using Eq. (26), the ratio of the sample size for a VPF survey to that for a political opinion poll giving the same b-level precision is:

n_V/n_poll = (σ_W − σ_{W^e})² / (E(W) − E(W^e))²          (38)

Since the distribution of W^e is conditioned by the distribution of wealth, W, Eq. (38) shows that, under the Utility Model of VPF survey, the ratio of the numbers of VPF respondents and political poll respondents giving the same b-level precision, for any b, depends only on the distribution of wealth.
The numbers ratio, n_V/n_poll, may be seen to reach a minimum over all non-negative risk-aversions, e ≥ 0, at the point where e = 0. The decrease in the numbers ratio continues as the risk-aversion falls below 0.0 and moves into the risk-confident region, but the difference from the value at e = 0 affects only the 6th significant figure as e decreases over the interval 0 to −0.5.
There is a limiting value of the ratio, var(V)/(E(V))², for the degenerate case when the risk-aversion is zero. Inserting e = 0 into either Eq. (38) or the first line of Eq. (37) gives:

var(V)/(E(V))² = var(W)/(E(W) − 1)² ≈ var(W)/(E(W))²          (39)

where the last step follows from the fact that E(W), at £198,112, is 5 orders of magnitude greater than £1.
Each person's individual VPF will decrease as e drops and, by Eq. (9), when e = 0, then v_i = w_i − 1 ≈ w_i. Hence the average value of the VPF will fall to approximately the average value of wealth, £198,112 (strictly, to £198,111).
On combining Eqs. (39) and (30), it is clear that, no matter what value of risk-aversion is employed in the utility function (and hence whatever the VPF produced), the number of respondents needed to measure the VPF by survey must be at least 74% bigger than the number consulted in a political opinion poll giving the same b-level precision.
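The 74% figure can be verified directly: at e = 0 the personal VPF collapses to w_i − 1, so the numbers ratio of Eq. (39) involves only the wealth moments. σ_W ≈ £261,600 is again an approximate value inferred from the text.

```python
E_W = 198_112.0       # average individual wealth (from the text)
sigma_W = 261_600.0   # approximate standard deviation of wealth (assumption)

# Eq. (39): numbers ratio in the degenerate e = 0 case
ratio = sigma_W ** 2 / (E_W - 1.0) ** 2   # ~1.74, i.e. ~74% more respondents
```

Multiplying this ratio by the 1068-person poll minimum reproduces the ~1860 figure quoted elsewhere in the paper.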
It will be shown in Section 4.3.2 that the Multiplier Model produces the same degenerate limiting value for the numbers ratio, n V =n poll , when the b-level precision is the same for the VPF survey as for a political opinion poll.

The numbers ratio n_V/n_poll for the Multiplier Model for VPF survey
Dividing Eq. (22) by (E(V))² and using Eq. (16) produces

var(V)/(E(V))² = var(R)/(E(R))² + var(W)/(E(W))² + [var(R)/(E(R))²][var(W)/(E(W))²]          (40)

In the degenerate case, as r_0 → r_0max, the variance, var(R), disappears and the multiplier becomes entirely deterministic. The outcome is

var(V)/(E(V))² = var(W)/(E(W))²          (41)

a result for the Multiplier Model that mirrors Eq. (39) for the Utility Model when the risk-aversion is set to zero. Eq. (41) will be valid for all E(R) ≠ 0 and thus will apply to any positive expected value, E(R), set for the multiplier. This conclusion holds irrespective of the initial probability distribution for the multiplier, R. The resulting condition gives the numbers ratio as:

n_V/n_poll = var(W)/(E(W))²          (42)

if the b-level precision of the VPF survey is to match that of the political opinion poll.
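The independence of Eq. (42) from E(R) is easy to demonstrate numerically: once var(R) = 0, the ratio of Eq. (40) is the same whatever expected multiplier, and hence whatever VPF, is assumed. A minimal sketch, with σ_W again an approximate assumption:

```python
E_W, sigma_W = 198_112.0, 261_600.0   # sigma_W is an approximate assumption

def numbers_ratio(E_R, var_R):
    """var(V)/E(V)^2 via Eqs. (16), (22) and (40)."""
    var_W = sigma_W ** 2
    var_V = var_R * var_W + var_R * E_W ** 2 + var_W * E_R ** 2
    return var_V / (E_R * E_W) ** 2

# Deterministic multiplier: identical ratio for very different values of E(R),
# i.e. for very different assumed VPFs (Eq. (41)).
r1 = numbers_ratio(43.35, 0.0)
r2 = numbers_ratio(5.0, 0.0)
```

Both calls return var(W)/(E(W))², confirming that the degenerate lower limit is set by the wealth distribution alone.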
The fact that there is no dependence on E R ð Þ means that Eq. (42) will hold whether the VPF is assumed to be £8.5M or £1M or £2M or £20M or any other value.
The generality of these arguments means that they may be applied to the survey measurement in the UK of any parameter, A, that is strongly wealth-dependent. Hence the numbers ratio will obey the condition:

n_A/n_poll ≥ var(W)/(E(W))²          (43)

Fig. 9 gives another perspective by plotting, against sample size, the b-level precisions for the Utility and the Multiplier Models for VPF survey, as well as the b-level precision for a political opinion poll, all at b = 95%. The curve for the Multiplier Model of VPF survey incorporates the assumption that r_0 takes its central value, r_0max/2. It is clear that, when the number of respondents is the same, the precision of the opinion poll prediction will be significantly better than that of either model of VPF survey. Conversely, the Utility and Multiplier Models for VPF survey require a significantly greater sample size to achieve the same level of precision as a political opinion poll.
It is also evident that the b-level precision of the VPF survey will deteriorate increasingly rapidly for sample sizes less than about 500, irrespective of which model of the VPF survey is used.

Fig. 9 may also be used as a test for a biased sample. Suppose that a survey is carried out to measure a strongly wealth-dependent parameter but that the number of opinions canvassed is low, say 100 people. Unbiased estimates might still be made of the mean and the standard deviation based on the opinions in the 100-strong sample. If these were used to calculate a b-level precision at b = 95% that was much lower than the curves in Fig. 9, at say 14%, this would suggest that the opinions surveyed showed too little diversity, implying that the sample had not been chosen randomly across the board. Achieving a b-level precision at b = 95% that was similar to those predicted for this sample size by the Utility or Multiplier Model would, however, be no guarantee that people with the necessary spread of wealths had been canvassed.
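The bias check just described can be sketched as a small helper that estimates the b-level precision from a sample's own mean and standard deviation; the sample figures below are illustrative assumptions, not survey data.

```python
import math

def b_level_precision(sample_mean, sample_sd, n, z=1.96):
    """Length of the 95% CI for the mean, divided by the estimated mean."""
    return 2.0 * z * sample_sd / (sample_mean * math.sqrt(n))

# A wealth-like spread (sd/mean ~= 1.32, cf. the numbers ratio of 1.74)
# canvassed from only 100 people:
k = b_level_precision(8.6e6, 1.32 * 8.6e6, 100)   # ~0.52, i.e. ~52%
```

A reported precision of, say, 14% from such a sample would be suspiciously good against this ~52% benchmark, hinting that the respondents' wealths were insufficiently diverse.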

Discussion
Statistics from the UK's Office for National Statistics enable the range of wealth held by individuals in the United Kingdom to be assessed and the distribution to be characterised by the percentiles listed in Table 1. Extrapolation has been needed for the starting and finishing points (0% and 100%), and the top figure for personal wealth in the table is accepted as being too low. While Table 1 encompasses an extensive range of wealth, the variance of individual wealth in the UK will tend to be somewhat understated even so.
The size of sample needed to give adequate precision may be appreciated by reference to the ''margin of error" used by political opinion surveyors. All other things being equal, the greater the sample size the better the precision of the opinion poll's prediction. General agreement appears to have emerged amongst the UK's practitioners that a margin of error of 3% is the highest figure at which a reasonable compromise may be struck between the expense of the survey and the commercially valuable reputation of the polling organisation. A margin of error of 3% or better will require the political opinion pollster to question at least 1068 randomly selected people.
The concept of ''b-level precision", k b ð Þ , can be used to generalise the margin of error to other types of survey. The ''12% rule" extends the 3% margin of error or better (k 0:95 ð Þ 0:12) customary for a political opinion poll to surveys of continuous parameters. Measurement exercises conducted in line with this rule will encompass the true mean within AE6% of the estimated average about 95% of the time, which seems a reasonable aim.
The number of respondents needed in a survey measurement of a general continuous parameter, A, may be related to the size of the sample required for a political opinion poll when both have to satisfy the same b-level precision. The ''numbers ratio" of the sample sizes in the two surveys then equals the variance of A divided by the square of its expected value. This result, valid at all values of b and b-level precision, means that the sample size for survey measurement of a continuous parameter that is strongly dependent on wealth will need always to be substantially greater than that of a political opinion poll.
Applying this finding to individual wealth in the UK gives the numbers ratio as 1.74, so that the number of people needing to be consulted in the determination of average wealth needs to be 1860 in order just to satisfy the 12% rule: k_W^(0.95) ≤ 0.12.

The ''value of a prevented fatality" is an example of a population parameter that is conditioned by wealth. Two diverse mathematical descriptions are offered of the process of measuring the population average VPF by opinion survey. The first, the Utility Model, accords with the description presented in the VPF paper by Carthy et al. [20], as analysed in [34]. The Utility Model provides a justification for the intuition that the individual's personal VPF will depend strongly on his/her wealth.

Fig. 9. b-Level precision at b = 95% as a function of sample size for the two models for VPF survey and for the opinion poll. The 12% rule is also marked.
Building on this confirmation, a second, conceptually simpler mathematical description is developed for the VPF survey, namely the ''Multiplier Model". Now the person's VPF is found by multiplying his/her wealth by a random factor that is uniformly distributed between limits. Two sources of random variation are present in this model, first in the process by which a respondent is selected and second in the size of the multiplier.
The parameters of the two models may be adjusted so that each produces a population average VPF to match the average value of life to come for a UK citizen found by the J-value, viz. £8,587,700 (2015 £s). For the Utility Model of the VPF survey, this means that the risk-aversion parameter should take the value e ¼ 1:1605, which is well within ranges for risk-aversion suggested by the UK Treasury. Matching the same figure for the VPF in the Multiplier Model requires the expected value of the multiplier to be 43.35.
Similar variances for the individual's VPF then emerge from the two independent systems of equations. There is overlap in the values of standard deviation predicted by the two models, with equality occurring at the point where the deterministic factor, r 0 , in the multiplier, R, is about 10% of its maximum value, r 0max , in the Multiplier Model, implying that the multiplier then contains only a small deterministic component.
Under the 12% rule, the minimum sample size is 2648 for the Utility Model of the VPF survey. Meanwhile the minimum number of people needing to be consulted is estimated under the Multiplier Model to lie between 1860 and 2835, a range that encompasses the figure coming from the Utility Model.
There is a degenerate lower limit for the numbers ratio of the minimum sample size in a VPF survey to the corresponding figure in a political opinion poll at matching b-level precisions. This degenerate lower limit is the same for both the Utility Model and the Multiplier Model, and is found to depend only on the distribution of wealth in the population. It is thus clear that the need for high sample sizes is driven mainly by the variation in individual wealth in the nation.
In the case of the Utility Model of VPF survey measurement, the degenerate lower limit for minimum sample size, 1860, occurs when the risk-aversion, e, declines to zero. The population average VPF will decrease in tandem, and will be to all intents and purposes identical to the average individual wealth, £198,000, by the time risk neutrality has been reached. At this point the survey measurement of the VPF will share the characteristics of wealth measurement by survey. This explains why the sample size has fallen to exactly that needed to find the average wealth under the 12% rule, as discussed in paragraph 5 of this Section.
The same degenerate lower limit for sample size, 1860, obtains in the Multiplier Model when the deterministic component of the multiplier, r_0, has reached its maximum value, r_0 = r_0max = E(R), and the multiplier is fully deterministic. The result is independent of the value previously specified for the expectation, E(R), of the multiplier, R. Thus at least 1860 people would always need to be consulted in order for the measurement to conform to the 12% rule, no matter what value is chosen for E(R), and hence irrespective of the value of the VPF under the Multiplier Model.
The convergence of the two diverse models at a degenerate lower limit for minimum sample size suggests that the number of people surveyed will always need to be at least 1860, irrespective of the value of the VPF.
The next agreement between the two VPF survey models occurs at a minimum sample size of 2648 people, the figure found using the Utility Model when set up to give a VPF to match the J-value generated figure of £8.59M. This sample size is, in fact, likely to be an underestimate as it does not allow for the variation in response likely amongst people with very similar levels of wealth. Including this realistic possibility would increase the necessary sample size. The effect has been quantified in Appendix E by extending the Utility Model through the introduction of a multiplier to represent the variation likely between the stated VPFs for people with very similar levels of wealth. Allowing for a random variation of AE50% on each person's calculated personal VPF takes the minimum sample size to 2958.
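The 2958 figure can be reproduced from the standard moment identity for a product of independent random variables, here applied to a perturbing multiplier M uniform on (0.5, 1.5), so that E(M) = 1 and var(M) = 1/12. This is a sketch consistent with the quoted numbers rather than the full Appendix E calculation.

```python
import math

n_poll = 1068                 # minimum poll sample under the 12% rule
ratio_V = 2648 / n_poll       # Utility Model numbers ratio, var(V)/E(V)^2

# M uniform on (0.5, 1.5): E(M) = 1, var(M) = (band width 1)^2 / 12
var_M = 1.0 / 12.0

# For independent M and V with E(M) = 1:
# var(MV)/E(MV)^2 = (1 + var(M)) * (1 + var(V)/E(V)^2) - 1
ratio_ext = (1.0 + var_M) * (1.0 + ratio_V) - 1.0
n_ext = math.ceil(ratio_ext * n_poll)   # -> 2958
```

The ±50% perturbation thus raises the minimum sample from 2648 to 2958, matching the figure in the text.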
Applying the 12% rule, the range of minimum sample sizes suggested by the Multiplier Model is 1860 to 2835, while the Extended Utility Model gives the overlapping range 2648 to 3887. The midpoint of the extremal values is thus (1860 + 3887)/2 ≈ 2874. It is possible to conclude, in broad terms, that the minimum sample size for measuring the VPF by survey is likely to be 3000 or more randomly chosen people in order to comply with the 12% rule. This level of precision, equivalent to a tolerance of ±6% at the 95% confidence level, seems a reasonable objective. For comparison, this sample size is still only a sixth of the 18,000 people questioned by the Office for National Statistics in its survey of household wealth.
While the Utility Model is specific to the VPF, the Multiplier Model provides a general paradigm that can be applied to the survey measurement of any continuous parameter that is strongly dependent on wealth. In all such cases the minimum sample size must be at least 1860 if the 12% rule is to be obeyed, and it may well be that more people will need to be questioned.
As mentioned in the Introduction, the sample size used in the opinion survey [33] on which the UK VPF is based was only 167.
[In fact Carthy et al. reduced their sample size to 149 (Table 3 of [33]) to find the set of ''trimmed means" that influenced their finally recommended VPF value; meanwhile, in generating their Table 7, they reduced the number of people in their sample to 135]. It is clear from the minimum sample sizes derived in this paper and discussed above that the sample size on which the UK VPF is based was at least an order of magnitude too small, and hence wholly inadequate. Obviously any similar exercise in the future would need to be scaled up very significantly so that the views of 3000 or more randomly chosen individuals could be sampled. Table 2 shows the sample size figures for the Multiplier Model for VPF, the Utility Model and the Extended Utility Model discussed in Appendix E. These are compared and contrasted with the sample size used by the ONS for measuring individual wealth [43] and the sample size used to generate the UK VPF [33]. Small scale exercises in opinion surveying to find a VPF are clearly ruled out for the future.
A further feature of the Carthy study [33] is the anomalously low level of the average wealth of its respondents, as revealed in [34]. The problem of obtaining proper representation for wealthier people in the statistics is recognised by the ONS. Its statisticians stress in the commentary on their survey of national wealth that special care is needed to include an adequate number of richer respondents, saying: ''As wealth is known to be unevenly distributed, addresses more likely to contain wealthier households were sampled at a higher rate to improve the efficiency of the sample. These addresses were identified utilising data from HMRC." - Section 8: Quality and methodology of [43].
Equal care is clearly needed in the survey of any population parameter, such as the VPF, known to be strongly dependent on wealth.
The figures for sample size derived in the paper are predicated on the assumption that the sample is chosen randomly across the whole population. As noted by the ONS, it will generally not be easy in practice to guarantee that the selection is truly random across the full spectrum of individual wealth.
The statistics for individual wealth pertain to the United Kingdom in the period 2014-2016. Clearly there will be differences in detail between different nations and they will change over time.
Nevertheless the numbers found for minimum sample size are likely to give good guidance to those intending to undertake a survey measurement of any parameter that is strongly wealth dependent in a developed country.
The measurement by survey of household wealth, as carried out by the Office for National Statistics, enjoys a significant advantage over the estimation of a general, non-market parameter, in that it is possible to break down a household's wealth into assets for which a market exists: bank accounts, shares, property etc. This will allow a reasonably accurate quantification of the monetary value of each class of holding, and, as long as all assets are identified, will permit a similarly accurate and testable quantification of total wealth. Unfortunately such a methodical mode of proceeding is not available to those attempting a stated preference survey, for example a survey estimation of the VPF. In such a case, while choosing a sample truly randomly and ensuring it is of adequate size can provide the desired b-level precision for the survey, the accuracy of the resulting figure cannot be relied upon, for the reasons outlined by Fujiwara and Campbell [30], as discussed in Section 1.2 above.

Conclusions
The large variation in individual wealth in a developed country like the UK means that the sample size for the survey measurement of a continuous parameter that is strongly dependent on wealth will always need to be at least ~75% larger than the sample size of a political opinion poll giving the equivalent level of precision. This sets a degenerate lower limit of about 2000 on the minimum sample size for the survey measurement of such a parameter at an achieved precision equivalent to the 3% margin of error common in political opinion polls.
In the case of the UK ''value of a prevented fatality", the sample size needs to be about 3000, based on the Extended Utility Model, which is likely to be the most realistic, if the precision of the survey is to match that of the political opinion polls that people are used to seeing reported in their national media.
It should be emphasised that the selection of the sample needs to be fully random across the whole population for the survey to have validity. Particular care is needed if the opinions of wealthy people are to be captured adequately.
It is clear that small scale opinion surveys cannot provide adequate precision in the measurement of continuous parameters, such as the VPF, that have a strong dependence on wealth.
No confidence can be placed in the figure for the VPF used as a safety yardstick by the UK government because the sample size used in its foundational opinion survey, 167, is less than a tenth of what would be required to give reasonable precision. This adds to the catalogue of well documented problems besetting the UK VPF, the use of which is not justified.
The Fujiwara and Campbell caveat needs always to be borne in mind, of course, that stated preferences do not necessarily represent true preferences. Hence even when a survey has included a wide enough spread of randomly chosen respondents to provide good precision, the test of accuracy may still not be passed.

Declaration of Competing Interest
The author declares the following financial interests/personal relationships which may be considered as potential competing interests: the author is employed by the University of Bristol as Professor of Risk Management; he is a director of Michaelmas Consulting Ltd, which has sponsored open-access publication of this article.

Appendix A

The difference in cumulative probability between successive quantiles is

ΔF_X = 1/n          (A.1)

Here, for example, n = 4 if the data are given in quartiles, n = 10 if given in deciles and n = 100 for a table of percentiles such as Table 1.
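The quantile bookkeeping of Eq. (A.1) can be sketched as follows: with each of the n inter-quantile intervals carrying probability 1/n, a piece-wise uniform density places each interval's conditional mean at its midpoint, so the overall mean is simply the average of the midpoints. The function name below is illustrative, not the paper's notation.

```python
def quantile_mean(x):
    """E(X) from n+1 quantile points x[0]..x[n] spanning 0% to 100%."""
    n = len(x) - 1
    # Each interval carries probability Delta F_X = 1/n (Eq. (A.1)); a
    # uniform density within it puts its conditional mean at the midpoint.
    return sum((x[i - 1] + x[i]) / 2.0 for i in range(1, n + 1)) / n

# Quartiles of the uniform distribution on (0, 1) recover the exact mean:
print(quantile_mean([0.0, 0.25, 0.5, 0.75, 1.0]))   # 0.5
```

The same recipe, applied to the percentiles of Table 1, yields the E(W) figure used in the body of the paper.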
The probability density function, f_X(x), may be calculated for each interval of x after assuming a piece-wise uniform increase in cumulative probability, F_X(x), between quantiles. Thus

f_X(x) = ΔF_X / (x_i − x_{i−1})   for x_{i−1} ≤ x < x_i,  i = 1, 2, ..., n          (A.2)

in which 0 is the index of the first data point, (x_0, F_X(x_0)), and n is the index of the last data point, (x_n, F_X(x_n)), where F_X(x_0) = 0 and F_X(x_n) = 1. The expected value of X is then found by integrating x f_X(x) over each interval and summing.

But another way of proceeding is to assume that the contest is going to be close, as in the case of the EU Referendum, implying x/n ≈ 0.5. This leads directly to Eq. (B.13).
While the exact nature of the derivation of Eq. (B.15) is mathematically satisfying, its usefulness in practice rests on the de facto approximation x/n ≈ 0.5.

Eq. (E.9) may be evaluated for n_Vm,min as the deterministic component, r_0, varies between r_0 = r_0min = 0 and r_0 = r_0max = 1. It is found that

2648 ≤ n_Vm,min ≤ 3887          (E.10)

and that the middle of the range for r_0 gives a value of n_Vm,min that is just below 3000. Fig. 12 plots the minimum sample size for the Extended Utility Model, n_Vm,min, against the ratio r_0/r_0max. This allows the results for the Utility Model and the Multiplier Model to be presented on the same graph. (In the case of the Extended Utility Model, r_0max = 1 and so r_0/r_0max = r_0.)