Differences in vision performance in different scenarios and implications for design

To design accessibly, designers need good, relevant population data on visual abilities. However, currently available data often focuses on clinical vision measures that are not entirely relevant to everyday product use. This paper presents data from a pilot survey of 362 participants in the UK, covering a range of vision measures of particular relevance to product design. The results from the different measures are compared, and recommendations are given for relative text sizes to use in different situations. The results indicate that text needs to be 17–18% larger for comfortable rather than perceived threshold viewing, and a further 20% larger when users are expected to wear their everyday vision setup rather than specific reading aids. © 2016 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
Visual ability is often critical in product and service use, affecting many aspects such as the capability to read text, see warning signs and recognise icons. It is thus important to consider the visual ability of the target population when designing products and services. Otherwise, users may struggle or may even be excluded from using the product. This is particularly important in the context of accessibility and inclusive design, which aim to meet the needs of a wide range of users, and reduce the numbers of those who would be excluded (Keates and Clarkson, 2003).
To design appropriately, designers need good population data on visual abilities and how they relate to product use. However, the currently available data often focuses on just a few vision measures, which are appropriate for some but not all design situations.
Population-based surveys commonly use distance visual acuity to reflect visual function. This is important information for designing signage and advertising viewed at a distance, but products are often viewed close-up. Near vision ability is distinct from (not correlated well with) that at a distance (Lovie-Kitchin and Brown, 2000), and in older patients requires different refractive correction (Pointer, 1995).
Furthermore, surveys of visual ability typically measure best corrected vision. However, there are many product use situations where users may not want to or be able to change their glasses, such as in the middle of cooking or on a date. It is also important that people should be able to discover and read warning labels and critical information without first putting on their reading glasses. Further, not everyone has spectacles that provide best correction, even in developed countries such as the UK (Evans and Rowlands, 2004).
Lack of best correction can often be compensated for to some extent by changing the working distance of the task, i.e. the distance at which the items are viewed during the task. For example, people with uncorrected age-related long sightedness may hold text at arm's length. The distance does not matter as long as the text can be read at that distance without difficulty. Therefore near vision tests that examine physical print size, allowing the user to choose the working distance, are most relevant to product design. However, in clinical assessment, reading ability is usually assessed at standardised working distances (Bailey and Lovie, 1980; Mansfield et al., 1996) to determine the angular size of print that can be read.
Vision studies typically measure threshold performance, often examining the smallest letters that can consistently be correctly identified (Bailey, 1998). This is often chosen as it is a standardised measure that can easily be compared across groups. However, from the perspective of product design, it is more important to understand what people can see comfortably (Porter et al., 2004; Legge and Bigelow, 2011). Comfort (or lack of comfort) can impact perceptions of and emotional response to a product, as well as the effective use of the product. For example, if users cannot read text on a product comfortably, they may not read it carefully, resulting in misreading of information. Kenagy and Stein (2001) explain that problems with medicine labels and packaging can result in serious medical errors, citing as an example "two vials that appear to be virtually identical (except for the drug name, in 8-point type)". For reading, the smallest print size that supports the maximum reading speed is termed the critical print size, and is often taken to indicate the print size that can be comfortably read (Whittaker and Lovie-Kitchin, 1993). However, the size of print perceived as comfortable by an individual is different to measured values of critical print size (Friedman et al., 1999; Szlyk et al., 2001; Tejeria et al., 2002; Latham and Usherwood, 2010).
A further issue of visual function studies is that vision measures are typically collected under clinical conditions with optimal lighting levels. However, products are commonly used in the varying and often poor lighting conditions of people's homes (Farrell, 1991; Percival, 2007). Since visual ability declines with reduced illumination (Hecht, 1927; Elton et al., 2013), a design that is usable in a clinical environment may not be usable in practice.
These issues indicate that clinical measures of visual function may not correspond to visual ability as it relates to product use in the real world. This paper aims to address some of these issues, by presenting and comparing data on vision measures that have been intentionally chosen to be relevant to real-life product use situations. Vision measures were collected using printed vision charts in participants' own homes, which is a typical setting for product use. They include near as well as distance visual ability; perceived comfort as well as perceived threshold vision ability; and near vision with the vision aids participants wear on an everyday basis, and with the setup they choose for reading. Recommendations are given for relative text sizes to use in different design situations.

Survey as a whole
A survey was conducted examining a wide range of human capabilities and characteristics related to product use, including, but not limited to, vision. Items were a mix of self-report questions and performance tests. The survey was conducted face-to-face in participants' homes so that the testing environment would be similar to that in which most products are typically used. For pragmatic reasons, the in-house testing environment was used for all tests, even though some of the measures (e.g. distance vision measures) could be more applicable to an outdoor environment.
The survey was a pilot in preparation for a full national survey. There were 362 participants, with the sample taken to represent the general adult population living in private households (see below). It can therefore provide useful data and enable preliminary conclusions. The survey is described in more detail by Tenneti et al. (2013). The resultant dataset is publicly available online (Clarkson et al., 2012).

Sample and weighting
The sampling strategy was designed to obtain a representative sample of the general population in England and Wales aged 16+ and living in private households. 990 postcode addresses were drawn from 30 primary sampling units across England and Wales. At responding households, interviewers selected one individual aged 16+ at random. The response rate was 37% of the issued sample or 40% of the eligible sample. 362 responses were obtained (53.6% female). The age distribution was: 16–39 (31%), 40–65 (47%) and over 65 (22%).
Weighting factors were applied to the results to account for just one person being interviewed per address, even though some addresses had multiple people at them. The weights also accounted for household non-response based on a logistic regression model with various demographic variables. They were further adjusted so that the weighted sample best matched the population in terms of age, sex and region. The results reported in this paper use these weights. More details can be found in Collingwood et al. (2010).

Vision charts
The tests were conducted using logarithmic progression letter charts as used by Elton et al. (2013). The distance charts used LogMAR progression, while the near vision charts were based on a logarithmic progression with the letter sizes rounded to the nearest 0.1 mm. The charts were printed at 300 dpi, and matt laminated.
The tests took place under the variable lighting levels present in the participants' homes. Vision performance declines with reduced lighting, e.g., Elton et al. (2013) found that "VA decreased by 0.2 log units between … overcast and street lighting conditions". The variable lighting levels therefore affected the results, with some participants measuring at a poorer visual acuity because of low lighting levels. Nevertheless, lighting levels were not controlled in the study because lighting is not controlled in users' homes in practice. Designers do not typically design for a particular lighting level but simply for use in the "real-world" (as described in Section 1).
The interviewers coded 97.5% of the tests as taking place in at least "adequate" lighting, based on their personal judgement. This method of coding matches the situation in real world product use, where the lighting levels are typically chosen by users based on personal judgement. A more objective measure of lighting, such as a light meter, may have been desirable but was not feasible within the constraints of the study. The vision tests were part of a larger battery of tests, and were carried out in multiple areas of the UK by a team of interviewers from an external agency. The amount of time available to train interviewers on the vision module was limited, and introducing additional new equipment that they had not seen before was not feasible.
This paper describes results from tests using two charts: (i) a distance vision chart with very high contrast (90%) letters, and (ii) a handheld near vision chart with 70% contrast letters. A 90% contrast level was used for the former because this closely matches the standard vision chart for distance vision. A 70% contrast level was chosen for the near vision chart because Elton et al. (2013) demonstrated no significant difference in near vision readability between 70% and 90% contrast, and 70% is more typical of text and graphics used in product design.
The distance chart had nine rows, and the near chart had twelve rows with eight letters per row, consistent with the Regan acuity chart (Hazel and Elliott, 2002). Stroke width was one fifth of letter height. The capital letters used were: D, E, F, H, K, N, P, U, V and Z, presented in 5 × 5 format (as used by Elton et al., 2013). The letter sizes on each row of the charts are given in Table 1.
The distance test was chosen to closely match a standard distance visual acuity test. The near vision tests used scaled versions of the same letter charts. This was chosen for compatibility with the distance test and because this is the standard test for fine detail visual acuity. Perceiving fine detail is one of the critical factors involved in a range of visual tasks, including reading words and letters, and discriminating icons. However, note that this differs from clinical near vision charts which tend to assess word (Bailey and Lovie, 1980) or text (Ahn et al., 1985) reading, rather than letter recognition. Reading letters requires discrimination of the strokes that make up the letter, rather than detecting the shape of a word, which is quite a different visual task (Kitchin and Bailey, 1981). Assessment of letter recognition was chosen so that the results would be more applicable to a range of visual elements, such as numbers and icons, rather than just words. Discriminating these visual elements critically relies on visual acuity, although other factors also have an impact.

Distances
The distance tests were conducted at a standard 3 m, and results are therefore given in LogMAR, the standard measure of angular size. In 27 cases, where there was insufficient space in a participant's home, the test was omitted and "missing data" was recorded. For the near vision tests, respondents were asked to hold the charts at a comfortable reading distance to correspond more closely with reading in practice (see Introduction). As a result, near vision measures indicate physical text sizes that participants can read, rather than angular print sizes resolvable by the eye.
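To illustrate how an angular LogMAR value relates to a physical letter size at the 3 m test distance, the following is a minimal sketch rather than the procedure used to produce the charts. It assumes the usual definition of LogMAR as the base-10 logarithm of the minimum angle of resolution in arcminutes, and that a letter is five stroke widths high (stroke width was one fifth of letter height on these charts). The function name `letter_height_mm` is hypothetical.

```python
import math


def letter_height_mm(logmar: float, distance_m: float = 3.0) -> float:
    """Approximate physical letter height (mm) for a LogMAR value at a viewing distance.

    Assumes LogMAR = log10(stroke width in arcminutes) and a letter height
    of five stroke widths.
    """
    stroke_arcmin = 10 ** logmar            # angle subtended by one stroke
    letter_arcmin = 5 * stroke_arcmin       # whole letter is five strokes high
    angle_rad = math.radians(letter_arcmin / 60)
    # Height of an object subtending angle_rad at the given distance.
    return 2 * distance_m * 1000 * math.tan(angle_rad / 2)


if __name__ == "__main__":
    for logmar in (0.0, 0.1, 0.2, 0.3):
        print(f"{logmar:.1f} LogMAR at 3 m: {letter_height_mm(logmar):.2f} mm")
```

Because the near vision tests let participants choose their own working distance, this conversion applies only to the distance chart; the near vision results are reported as physical sizes instead.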

Procedure
For each chart, participants first identified the smallest row they found comfortable to read. This was used in the calculation of a comfort vision measure (see below). It also helped to reduce the amount of time taken in the test because participants started from this row rather than the top of the chart. If participants read this row successfully, they then read down the chart (smaller letters) until they failed to read a row or they gave up. Success on a row was defined as making one error or no errors on it. If they did not read their starting row successfully, they read up the chart (larger letters) until they did read a row successfully. The smallest row read successfully provided a measure of "perceived threshold". A "perceived comfort" measure was also calculated, corresponding to the smallest row that participants said they could read comfortably and could actually read correctly.
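The scoring rules above can be sketched in code. This is an illustrative simulation, not the survey's own scoring software: the function name `score_chart` and the input format (a list of error counts per row, largest letters first) are assumptions made for the example. It also assumes that when the starting row is failed, the first larger row read successfully counts as both the perceived threshold and the perceived comfort row.

```python
from typing import Optional, Sequence, Tuple


def score_chart(errors_per_row: Sequence[int],
                comfort_start: int) -> Tuple[Optional[int], Optional[int]]:
    """Return (perceived_threshold_row, perceived_comfort_row) as 0-based indices.

    errors_per_row[i] is the number of letters misread on row i, where row 0
    has the largest letters. comfort_start is the row the participant named
    as the smallest they found comfortable to read.
    """
    def success(row: int) -> bool:
        return errors_per_row[row] <= 1      # one error or none counts as success

    if success(comfort_start):
        threshold = comfort_start
        row = comfort_start + 1
        # Read down the chart (smaller letters) until a row is failed.
        while row < len(errors_per_row) and success(row):
            threshold = row
            row += 1
        return threshold, comfort_start

    # Starting row failed: read up the chart (larger letters) until a success.
    row = comfort_start - 1
    while row >= 0 and not success(row):
        row -= 1
    if row < 0:
        return None, None                    # no row was read successfully
    return row, row                          # assumption: comfort equals threshold here


if __name__ == "__main__":
    print(score_chart([0, 0, 1, 0, 1, 3, 8, 8], comfort_start=2))
```

In the usage line, the participant starts at row 2, continues successfully through row 4, and stops at row 5, so the perceived threshold is row 4 and the perceived comfort row is 2.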
These procedures differ from the standard protocol for measuring threshold acuity. Firstly, the line assignment scoring used here is coarser than standard letter by letter scoring procedures, resulting in less sensitivity of the measure to change (Bailey et al., 1991; Vanden Bosch and Wall, 1997). Also, terminating the test on a line with more than one error differs from usual clinical tests, where subjects are asked to continue, guessing if necessary, until close to an entire line is read incorrectly (Carkeet, 2001).
The procedures used were chosen so that tests could be performed quickly, by interviewers without a clinical background. The aim was not to produce a clinically reproducible threshold measure, but to indicate the number of people who cannot accurately distinguish letters of a certain size (Elton et al., 2013). This difference is highlighted by the terminology in this paper, where the measures are referred to as 'perceived threshold' and 'perceived comfort' ability, rather than 'threshold visual acuity'.
The measure of "perceived comfort" used in this paper also differs from the typical measure of comfort vision as the critical print size below which reading speed sharply declines (e.g. Legge et al., 1985). Rather, it relates to what participants themselves feel is comfortable, which is important in product use. In addition, it can be used for identifying visual features in general and not just reading text.

Use of vision aids
Tests were done with (i) the participant's "everyday" vision setup, i.e. the vision aids (if any) that the participant used for the majority of the day; and (ii) the specialised vision setup used for near or distance vision (if different, and if it was available).
In this paper, the "reading setup" refers to the vision setup the participant generally used for near vision, such as reading a book. The "distance vision setup" refers to the setup for distance vision, such as watching a film in a cinema. For example, a participant might usually wear glasses, but remove them for reading, or might only put on glasses for reading.
Note that the numbers of those who changed their vision setup for distance are relatively small (13.3%). Therefore, this paper does not report results on how vision changes between the everyday and distance vision setup, but rather focuses on changes in vision with the reading setup.

Measures
The following vision measures are used in this paper:
Perceived threshold ability: corresponding to the smallest row read successfully.
Perceived comfort ability: corresponding to the smallest row that participants said they could read comfortably and could actually read correctly. Perceived comfort is relevant to design because ideally users should be able to see design elements both comfortably and accurately.

Results and discussion
The survey results are used to examine how the size of text and other visual elements should change to address different situations. For example, a company may have data from user trials indicating the font size to use for threshold viewing. However, later on they may realise the importance of designing for comfort viewing. The results in this paper indicate how much larger the text should be in this situation.
The results reported in this paper are weighted (see above) and exclude missing cases, resulting in different sample sizes (n) for different variables.

Distance vision: perceived threshold vs. perceived comfort vision
Distance vision is important for seeing text and graphics at distances over 3 m, e.g. signage and advertising. The differences between individuals' perceived threshold and perceived comfort measures when wearing their distance vision setup (e.g. distance glasses) are shown in Fig. 1. Perceived threshold indicates what users can read if they try hard, while perceived comfort corresponds to the text size that participants can read both comfortably and accurately. In some applications, users may be prepared to push themselves to read the text. However, in most cases, user response will be formed by the level of perceived comfort. Approximately one third of participants (33.7%) had a difference between their measures, indicating that they needed text larger than their perceived threshold row for comfortable distance viewing. Most of these had a difference between the measures of 0.1 LogMAR. While this is not a large difference in vision terms, it can still make it awkward to use a product in practice. Furthermore, 12.4% of the sample had a difference of 0.2 LogMAR or more.
There was a wide variation between individuals, as shown in Fig. 1. This means that a single figure cannot be given indicating how much larger letters need to be for individuals to read them comfortably rather than at perceived threshold. However, it is possible to examine how much larger letters need to be for a similar proportion of the sample to see them comfortably rather than at threshold. Doing this fits with theories of design exclusion, in which the proportion of the population who can (or cannot) do a task is used as a measure of a design's inclusivity or accessibility (Keates and Clarkson, 2003).
The amount larger that letters need to be varies depending on the proportion of the sample chosen. 95% was chosen because it is commonly used in ergonomics (Pheasant, 1999, p28). Thus, for each measure (perceived comfort and perceived threshold), we calculated the size of letters needed for 95% of the sample to be able to read them.
The calculations were done by calculating a weighted sum of all the individuals with distance visual ability equivalent to a given letter size or better. This corresponds to those individuals who would be able to read that letter size. Linear interpolation was used between measured sizes to calculate the size that 95% of the population would be able to read (as shown in the example in Fig. 2). Note that the line-by-line scoring method used in the vision tests only allows vision ability to be measured to the nearest 0.1 LogMAR for any individual. Similarly, the letter sizes measured were in 0.1 LogMAR increments. However, interpolating between these values for group measurements (e.g. means) is feasible mathematically and helpful pragmatically. Nevertheless, the results need to be interpreted with some caution. For example, we are not claiming that any individuals actually had 0.23 LogMAR vision ability. Rather, we are saying that, in practice, using letters of size 0.23 LogMAR means that about 95% of the sample would be able to read them. More details of the calculations can be found in Goodman-Deane et al. (2011).
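The weighted cumulative calculation with linear interpolation can be sketched as follows. This is one plausible reading of the description above, not the authors' code (see Goodman-Deane et al., 2011, for the actual method). The function name `size_readable_by` is hypothetical, and the scores and weights in the usage example are invented purely for illustration; they are not survey data.

```python
from typing import Sequence


def size_readable_by(scores: Sequence[float],
                     weights: Sequence[float],
                     proportion: float = 0.95) -> float:
    """Letter size that a given weighted proportion of the sample can read.

    scores[i] is the smallest letter size individual i can read (e.g. in
    LogMAR, so larger values mean poorer vision); weights[i] is that
    individual's survey weight. Interpolates linearly between measured sizes.
    """
    total = sum(weights)
    prev_size, prev_frac = None, 0.0
    for size in sorted(set(scores)):
        # Weighted fraction of people whose ability is this size or better.
        frac = sum(w for s, w in zip(scores, weights) if s <= size) / total
        if frac >= proportion:
            if prev_size is None:
                return size                  # nothing smaller to interpolate from
            step = (proportion - prev_frac) / (frac - prev_frac)
            return prev_size + (size - prev_size) * step
        prev_size, prev_frac = size, frac
    return max(scores)


if __name__ == "__main__":
    # Invented example: four ability levels with unequal weights.
    example_scores = [0.0, 0.1, 0.2, 0.3]
    example_weights = [50, 30, 10, 10]
    print(round(size_readable_by(example_scores, example_weights), 3))
```

In the invented example, 90% of the weighted sample can read 0.2 LogMAR letters and 100% can read 0.3 LogMAR letters, so the interpolated 95% point falls halfway between the two.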
Applying these calculations, we found that 95% of the sample would be able to read 0.23 LogMAR letters when trying hard (i.e. at perceived threshold). In comparison, 95% of the sample could read 0.30 LogMAR letters comfortably. Thus letters need to be 0.07 LogMAR larger (18% larger) for the same proportion of people to be able to read them comfortably instead of at perceived threshold.
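The conversion from a difference in log units to a relative letter size follows from the logarithmic scale: a difference of d log units corresponds to a size ratio of 10 to the power d. A minimal sketch, with the hypothetical helper name `relative_size`:

```python
def relative_size(log_difference: float) -> float:
    """Size ratio corresponding to a difference in log10 units (e.g. LogMAR)."""
    return 10 ** log_difference


if __name__ == "__main__":
    # Difference between the comfort (0.30) and threshold (0.23) values.
    ratio = relative_size(0.30 - 0.23)
    print(f"Letters need to be about {ratio - 1:.1%} larger")
```

With the rounded values 0.30 and 0.23 this gives roughly 17.5%, in line with the approximately 18% reported above.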
We advocate that all designs should aim for comfortable viewing, whenever possible.
These results cannot be compared with other studies because we are not aware of any other studies looking at the difference between comfort and threshold for distance vision. Threshold acuity is generally all that is measured.

Near vision: perceived threshold vs. perceived comfort vision
Near vision is also very important for design, affecting the perception and use of household products, packaging and reading matter. The differences between individuals' perceived threshold and perceived comfort measures for near visual ability with the reading setup (e.g. reading glasses if used) are shown in Fig. 3. The results shown are for the 70% contrast test chart. Over a third of participants (37.3%) had a difference between their measures, indicating that they needed letters larger than their perceived threshold row for comfortable viewing. As explained above, these measures were taken at a reading distance chosen by the participant, in order to be typical of reading in practice. Similar calculations to those for distance vision were performed to determine the letter sizes that 95% of the sample would be able to read. Note that, although fractions of a row do not actually exist, they are appropriate in these calculations because the letter sizes in the rows follow an approximate logarithmic progression. As before, the results should be interpreted with some caution. We are not claiming that any individuals could actually read the equivalent of row 6.4, but saying that, in practice, using letters of size equivalent to row 6.4 means that about 95% of the sample would be able to read them.
Fig. 1. Histogram of the difference between individuals' perceived threshold and perceived comfort distance visual ability (distance vision setup). Weighted data, excluding missing data (weighted n = 287).
Fig. 2. Linear interpolation.
We found that 95% of the sample would be able to read letters equivalent in size to row 6.4 when trying hard (i.e. at perceived threshold). In comparison, 95% would be able to read letters equivalent to row 7.1 comfortably (perceived comfort). Thus letters need to be about 0.7 rows larger (17% larger) for the same proportion of people to be able to see them comfortably instead of at perceived threshold.
It is difficult to compare the difference between perceived threshold and comfort (reserve capacity) in the current study to previous findings because of differences in the ways in which threshold and comfort were measured. The reserve capacity in our study is smaller than typically found in the literature. For example, Latham and Tabrett (2012) indicate that for most people with visual impairment, print should be at least 2 times larger (3 rows on a LogMAR chart) than the threshold measure in order to allow reading at close to maximum reading speed. This difference is probably because perceived threshold was measured differently, and comfort is not usually measured on a letter chart at all. In particular, perceived threshold in our study is likely to represent bigger letters than clinically measured threshold because participants were not pushed to their limit.

Near vision: everyday vs. reading vision setups
The near vision analysis thus far has considered function with the reading setup, i.e. the vision aids (if any) usually used for reading. The survey also measured near ability with the participant's "everyday setup", i.e. the setup used for the majority of the day (see above). This is also relevant to design because people may not want to, or be able to, change their glasses in order to use a product or service. For example, it may not be convenient to put reading glasses on in the middle of cooking to read the text on a food packet.
24.3% of the sample changed their vision setup for reading. However, this proportion varied with age. Only 5.1% of those under 40 used a different setup, compared to 32.7% of those aged 40 to 65, and 43.3% of those over 65. Note that, although only 43.3% of over 65s changed their setup, almost all will require a different prescription for reading than for everyday use (Pointer, 1995). However, many of these use varifocals or bifocals (43.7% of the over 65s in the study) and so do not need to change their glasses for reading. Some of the remaining 12.9% will have non-optimal prescriptions. Fig. 4 shows the differences in perceived comfort near vision with the everyday and reading vision setups. Numbers are given as percentages of those who changed their vision setup. This is because, by definition, those who did not change their setup had zero difference between the setups. As this accounts for 75.7% of the sample, this would swamp the histogram and hide actual variation amongst those who do use a specialised vision setup for reading.
If designing for the population as a whole, we need to consider the 95th percentile of the whole sample. 95% of the whole sample would be able to read row 7.5 with their reading setup, and row 8.2 with their everyday setup. Letters therefore need to be about 0.75 rows larger (20% larger) when intended for reading with the everyday vision setup.
There is little previous literature comparing different vision setups. In fact, we are not aware of any other surveys comparing visual ability with a reading setup and a general use setup. Table 2 summarises the findings on the relative sizes of letters needed in different situations. Designs may often be based on threshold figures (what people can see if they try hard) or assume that users will use the most appropriate vision setup (e.g. reading glasses). This paper argues that it is better to design for comfort viewing, resulting in a more pleasant experience for the user, fewer errors and less difficulty. Furthermore, it is not always feasible or appropriate to change one's vision setup. The table shows how the text sizes used should change to take these aspects into account. The figures can be used in conjunction with each other. For example, visual elements on a piece of packaging should be approximately 40% larger (17% larger and then 20% larger) to be suitable for comfortable use without people having to change their vision setup. In practical terms, if the original text used letters that were 1.2 mm high, then the recommendations indicate that letters should now be 1.7 mm high.
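The combined figure arises because the two increases multiply rather than add. A minimal sketch of the arithmetic, using the hypothetical helper name `combined_scale`:

```python
def combined_scale(*percent_increases: float) -> float:
    """Overall scale factor from applying several percentage increases in turn."""
    scale = 1.0
    for percent in percent_increases:
        scale *= 1 + percent / 100
    return scale


if __name__ == "__main__":
    scale = combined_scale(17, 20)       # comfort viewing, then everyday setup
    original_height_mm = 1.2
    print(f"Scale factor: {scale:.3f}")
    print(f"New letter height: {original_height_mm * scale:.1f} mm")
```

Here 1.17 multiplied by 1.20 gives a factor of about 1.40, which takes a 1.2 mm letter to roughly 1.7 mm.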

Summary of design recommendations
However, note that making elements a certain size will not automatically make them accessible. There are many other factors to consider, including colour, whitespace, distinctiveness of shape, lighting, surface reflections and the use of visual aids. Therefore, it remains important to test designs with real users (Dong et al., 2007; Helen Hamlyn Centre for Design, undated).
These guidelines specifically focus on the 95th percentile. They assume that the current design is suitable for about 95% of the sample in the current situation, e.g. text designed for perceived threshold viewing. They recommend how letter size should be changed to ensure that this proportion continues to be included in the new situation, e.g. when aiming for comfort viewing. Note that it is sometimes necessary to aim for a different percentile, in which case the relative sizes required may vary. If desired, they can be derived from examination of the dataset from the survey (Clarkson et al., 2012). There may also be variations when designing for specific user groups, such as older people or people with vision impairments.
These design recommendations are distinct from those presented in the literature. Existing text size guidelines specify sizes to use in general, often with a focus on the needs of users with low vision. However, they do not examine how the sizes needed vary in different situations. For example, the RNIB's Clear Print guidelines specify that text size should be 12 to 14 pt Arial, preferably 14 pt (RNIB, 2006). These text sizes are for blocks of text, and take into account the needs of people with low vision. Similarly, UK government regulations say to "design to be as legible as possible, for example using a minimum 14 point text size" (Office for Disability Issues, 2014). Russell-Minda et al. (2006) also focus on the needs of people with low vision. They summarise the literature, saying that "perhaps the most accepted guideline for low vision reading materials is that type should be large, [preferably] at least 16 to 18 points". These guidelines are useful to provide a ball-park figure for text size, but do not address differing needs in different situations.
The findings in this paper apply directly to the recognition of individual printed letters. In practice, many designs use words rather than individual letters, while others use symbols, letters or icons. Discriminating these visual elements critically relies on visual acuity, although other factors also come into play, such as overall shape and colour. Given that the charts used in this survey test visual acuity, we would expect the principal differences between conditions examined here to be valid for these other visual elements as well. For example, visual elements also need to be larger for comfortable rather than perceived threshold viewing, and when users are not required to change their vision setup.
In some ways, these findings are not surprising. However, many designers are young, and easily overlook conditions such as age-related long sightedness (Zitkus et al., 2013). Furthermore, when assessing the visual demand of handheld products, typical design practice relies heavily on experience and personal judgement, typified by the phrase "I can see it, I think it is fine". This is known to be a misconception in the design industry (Cornish et al., 2015; Pheasant, 1999). It is important to provide the data and the tools to encourage designers to think actively about the needs of people with different vision abilities to their own. Part of the aim of this paper is to encourage designers to do this, and to consider a wider range of use scenarios and practical needs.
Clearly, the requirements for visual accessibility also have to be balanced against other constraints on design. For example, designs on packaging have to contend with many considerations such as limited space, legal requirements on product information and challenging packaging shapes. The information in this paper is not the full story but it does help to inform effective design decisions and to counteract assumptions that text is readable or an element is visible when, to many users, it is not.
Further research is required to develop more exact calibration methods to apply the results from letter recognition tests to a broad range of real world vision tasks. The authors are currently researching the use of impairment simulation to perform this kind of calibration (Goodman-Deane et al., 2014) and early indications are that this is both possible and extremely useful in practice.

Conclusions
Clinical vision measures provide a good indication of people's vision ability in ideal situations. However, in everyday product use, people do not always wear the right correction, and may be unwilling to push themselves hard to read text or distinguish graphical elements. This paper presented data from a range of measures which may be more appropriate for such situations. These were used to calculate recommendations for how the sizes of graphical and text elements should change when products are intended for comfortable viewing instead of perceived threshold and when users are not able to (or do not want to) change their vision setup for reading, e.g. by putting on reading glasses.
These recommendations are intended to encourage designers to think about the usage situation of their products, and the assumptions they make about how products are used, as well as providing actual figures for how much larger graphical elements should be.
The results reported in this paper are from a pilot survey (Tenneti et al., 2013) and we hope to obtain funding for a full UK representative survey which will give more comprehensive results.