Development of a whiteness formula for surface colors under an arbitrary light source.

Whiteness specification is critically important in the surface color industry, especially when fluorescent whitening agents (FWAs) are added to objects. The CIE whiteness formula, the most widely used whiteness formula, characterizes whiteness only under CIE standard illuminant D65, and thus ignores how whiteness changes under different light sources because of their spectral content. Though adopting the chromatic adaptation transform CAT02 in the CIE whiteness formula was found effective in recent studies, it did not allow a comparison across different correlated color temperatures (CCTs). In this study, a haploscopic viewing condition was employed, with a D65 simulator in the left booth, to evaluate the whiteness of eight samples under different light sources in the right booth. The whiteness of four samples under the D65 simulator served as a whiteness scale to aid the evaluation. Based on the experimental results, we propose characterizing the whiteness of a surface under an arbitrary light source using the CIE whiteness formula, with the sample chromaticities transformed using CAT02 and an adjusted degree of chromatic adaptation factor D.


Introduction
Whiteness appearance is an important colorimetric characteristic in the surface color industry. Consumers and users generally associate a whiter appearance with better quality and the absence of contaminants. For typical surface colors (e.g., NCS and Munsell color samples), a color with a lower chroma and a higher lightness generally appears whiter. The invention of fluorescent whitening agents (FWAs), however, makes it possible to produce an even whiter appearance. The FWAs contained in an object absorb the violet or ultraviolet radiation from the illumination and re-emit blue radiation, which enhances whiteness appearance by increasing the lightness and introducing a blue tint. Under a light source containing violet or ultraviolet radiation, a surface with a greater amount of FWAs appears whiter. Thus, the amount of FWAs is modulated to create different degrees of whiteness.
Currently, the degree of whiteness is typically specified under the International Commission on Illumination (CIE) standard illuminant D65, a standard illuminant for color specification in the surface color industry. The whiteness value of a sample is calculated from the chromaticities measured under a D65 simulator using the CIE whiteness formula [1], as shown in Eq. (1), together with the tint formula in Eq. (2):

$$W_{\mathrm{CIE}} = Y + 800\,(x_n - x) + 1700\,(y_n - y) \tag{1}$$

$$T_{10,\mathrm{CIE}} = 900\,(x_n - x) - 650\,(y_n - y) \tag{2}$$

where Y is the luminance factor of the sample, (x, y) and (x_n, y_n) are the chromaticity coordinates of the sample and of the perfect diffuser, and all values in both equations are calculated using the CIE 1964 10° color matching functions (CMFs). The CIE whiteness and tint formulas are intended for use only when W_CIE is between 40 and 5Y − 280 and T_10,CIE is between −4 and +2. Figure 1 illustrates the relationship between chromaticity shifts and the changes in W_CIE and T_10,CIE.
The weaknesses of the CIE whiteness formula have been widely documented. First, the range of validity of the CIE whiteness formula was found to be too small. Samples outside the range may still appear white [2]. Moreover, the specification of whiteness using the CIE whiteness formula ignores the influence of illumination on FWA excitation and whiteness appearance, so that the CIE whiteness value effectively becomes an indicator of the amount of FWAs contained in a surface. Recent psychophysical studies have clearly revealed that the influence of illumination cannot be ignored, especially for typical phosphor-converted white light-emitting diodes (LEDs), and that the CIE whiteness values failed to characterize the whiteness under some illuminants [3][4][5]. Even among D65 simulators rated BB grade or better by the CIE metamerism index, the CIE whiteness values can vary by 16 points [6].
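The CIE whiteness and tint calculation referred to in this paragraph can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the coefficients are those of the standard CIE formulas for the 10° observer, the perfect-diffuser chromaticities (0.3138, 0.3310) are the commonly tabulated CIE 1964 10° values for D65, and the function names are hypothetical.

```python
# Minimal sketch of the CIE whiteness and tint formulas (10° observer).
# Function names are illustrative; the white point below is the commonly
# tabulated CIE 1964 10° chromaticity of the perfect diffuser under D65.

XN_D65_10 = 0.3138
YN_D65_10 = 0.3310


def cie_whiteness(Y, x, y, xn=XN_D65_10, yn=YN_D65_10):
    """W_CIE = Y + 800 (xn - x) + 1700 (yn - y)."""
    return Y + 800.0 * (xn - x) + 1700.0 * (yn - y)


def cie_tint_10(x, y, xn=XN_D65_10, yn=YN_D65_10):
    """T_10,CIE = 900 (xn - x) - 650 (yn - y)."""
    return 900.0 * (xn - x) - 650.0 * (yn - y)


def within_cie_limits(Y, x, y, xn=XN_D65_10, yn=YN_D65_10):
    """Check the stated application range: 40 < W < 5Y - 280 and -4 < T < +2."""
    w = cie_whiteness(Y, x, y, xn, yn)
    t = cie_tint_10(x, y, xn, yn)
    return (40.0 < w < 5.0 * Y - 280.0) and (-4.0 < t < 2.0)
```

For the perfect diffuser itself (Y = 100 with the chromaticities of the white point), the sketch gives a whiteness of 100 and a tint of 0. A chromaticity shift towards blue (smaller x and y) raises the whiteness value, which is the mechanism by which FWAs increase W_CIE.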
Efforts have been made to revise the range of the CIE whiteness formula. Uchida extended the CIE whiteness formula using 5Y − 275 as a base point, as described in Eqs. (3) to (5) [7]. Ma et al. [8] and Vik et al. [9] proposed revising the CIE tint limit to −5 < T_10,CIE < +5 and to −4 < T_10,CIE < +1, respectively, for samples under 6500 K illuminants. Recently, Wei et al. [10] proposed ellipsoids to define the whiteness boundary for surface colors under different CCT levels.
where Y and (x, y) are the luminance factor and the chromaticity coordinates of a sample under a light source; (x_0, y_0) are the chromaticity coordinates of the light source; ω = 1800 is the sensitivity of whiteness to saturation; η is the angle between the x-axis and the direction from (x_0, y_0) to (0.1152, 0.1090) (i.e., the chromaticities of a 470 nm monochromatic light); and φ = 16.6° is a small angle between the direction of enhanced whiteness and the direction towards the chromaticities of the 470 nm monochromatic light. Later, Ma et al. [8] used a linear regression method to fit the coefficients in the CIE whiteness formula, as shown in Eqs. (7) to (9), so that the optimized formula correlated with the experimental data under different CCT levels. Meanwhile, both Ma et al. [8] and Wei et al. [4] found that adopting CAT02 to transform the chromaticities of the samples to D65 in the CIE whiteness formula can also provide a good prediction of the perceived whiteness of surface colors.
Most of these recent studies [4,8,10], including those comparing the performance of different whiteness formulas or developing new formulas, employed a similar experimental design in which the observers evaluated the whiteness of individual samples under each light source by comparison with memory, using a rating scale between 0 and 100%. In several other studies [3,12,13], the observers were presented with a series of samples simultaneously and ranked the samples according to their whiteness. As these studies used either the rating scale between 0 and 100% or the average rank as the dependent variable, they only allowed an analysis of the correlation between the calculated and perceived whiteness. Furthermore, such analyses assumed that observers were always completely adapted under any light source and that the whiteness appearance of a surface color was not affected by the light source chromaticity, assumptions that were not supported by recent studies [4,14]. With these in mind, this study was purposely designed to address these two points using a haploscopic viewing condition, which allowed the observers' two eyes to be individually adapted to two light settings [15], so that the whiteness of surface colors under an arbitrary light source could be evaluated using a whiteness scale under a CIE D65 simulator.

Apparatus, light settings, and whiteness samples
Two viewing booths, with dimensions of 60 cm (width) × 60 cm (depth) × 60 cm (height), were placed side by side. The interiors of the booths were painted with Munsell N7 spectrally neutral paint. A 14-channel spectrally tunable LED device (i.e., LEDCube from THOUSLITE, Changzhou, China) was placed above each booth to provide uniform illumination to the booth floor, where the whiteness samples were placed. The front side of the booths was partially covered to prevent the observers from seeing the LED devices. Figure 2 shows the experimental setup. A chin rest was mounted outside the two booths to align the observer's sagittal plane with the dividing panel between the two booths, so that a haploscopic viewing condition and a 0°:45° viewing geometry were maintained for all the observers. The haploscopic viewing condition assumes that the two eyes are separately adapted to two conditions, which allows a direct comparison across adapting conditions and allows the results of color-appearance studies to approach the precision of traditional color-matching experiments [16].
Though such an assumption is believed to hold for the sensory chromatic-adaptation mechanism and may not hold for the cognitive chromatic-adaptation mechanism, the haploscopic viewing condition is still the most widely used experimental technique in color appearance research for developing color appearance models and chromatic adaptation transforms [15].
The intensities of the 14 channels in the LED devices, with peak wavelengths between 350 and 700 nm, were carefully adjusted to produce seven light settings, one for the left booth and the other six for the right booth, each providing a horizontal illuminance of 1000 ± 20 lx. The light setting for the left booth was created to simulate a high-quality CIE D65 illuminant as a reference light setting, with CIE metamerism indices M_v of 0.836 and M_u of 0.692. The six light settings for the right booth comprised three levels of CCT (i.e., 3000, 4000, and 5000 K) and two levels of violet radiation (i.e., low and high). The two levels of violet radiation were created to produce different degrees of whiteness for the eight whiteness samples in the right booth while keeping the samples white in appearance. The SPDs of the light settings, as measured using a calibrated JETI specbos 1811UV spectroradiometer with a calibrated reflectance standard purchased from LabSphere, are shown in Fig. 3, with the colorimetric characteristics summarized in Table 1.
Twelve diffuse acrylic whiteness samples, four in the left booth and eight in the right booth, were purchased from Avian Technologies and contained different amounts of FWAs. These whiteness samples had a reflectance factor of around 0.85 in the visible range when the fluorescent effect was excluded.
As the light setting used in the left booth was close to CIE standard illuminant D65, the whiteness of the four samples in the left booth could be characterized with the CIE whiteness values (i.e., 84.3, 90.7, 122.1, and 142.9), which were calculated using the CIE whiteness formula, the chromaticities of the light setting, and the chromaticities of the samples under illumination. These four samples were carefully selected so that adjacent samples had a 10-, 30-, and 20-point difference in the whiteness value. Figure 4 shows the chromaticities of the whiteness samples under different light settings, calculated using the CIE 1964 10° CMFs. The Y values of all the samples were between 88.5 and 93.9, so the whiteness differences between the samples were mainly due to the chromaticity shifts.

Whiteness appearance evaluation
The four whiteness samples under the 6500 K light setting in the left booth were employed to aid the observer in evaluating the whiteness appearance of the eight samples in the right booth. During the experiment, the observers were informed that the whiteness values of the four samples in the left booth were 84, 91, 122, and 143, respectively. The observers were asked to evaluate the whiteness appearance of each sample in the right booth and to scale its whiteness value in comparison to the four samples in the left booth, with a greater value for a whiter appearance. The observers were reminded that the whiteness value was not limited to the range between 84 and 143.

Observers
Fifteen naïve observers (4 females and 11 males), between 20 and 25 years of age (mean = 21.4, std. dev. = 1.40), participated in the experiment. All the observers had normal color vision, as tested using the Ishihara Color Vision Test, and none of them had prior knowledge about the study.

Experimental procedures
Upon arrival, the observer completed a general information survey and the Ishihara Color Vision Test. The experimenter explained the general procedure of the experiment to the observer and answered any questions raised by the observer. Then the observer was escorted to the viewing booths and was seated in front of the two booths, with his or her chin fixed on the chin rest, so that a similar haploscopic viewing geometry was formed for all the observers. The general lighting in the experiment space was then turned off. Under each pair of the light settings, the observer was asked to observe the samples placed in the two booths for two minutes for chromatic adaptation. The left booth was always under the 6500 K reference light setting, while the right booth was under one of the other six light settings. A rating sheet, with the whiteness values of the four samples in the left booth labeled as 84, 91, 122, and 143, was placed in the right booth. The observer was asked to compare the whiteness of each sample in the right booth to the four samples in the left booth, and to rate its whiteness based on the four whiteness values given. The experimenter reminded the observer that the four samples in the left booth had a 10-, 30-, and 20-point difference in the whiteness value and that the rating was not limited to the range between 84 and 143. Among the six pairs of the light settings, two pairs were randomly selected for each observer to perform repeated evaluations. The same procedure was repeated for each pair of the light settings, with a new rating sheet being used for each pair. The order of the eight pairs (the six pairs plus the two repeated pairs) was randomized for each observer, and the entire procedure took around 45 minutes.

Inter- and intra-observer variations
Both the inter- and intra-observer variations were characterized using the Standardized Residual Sum of Squares (STRESS). Specifically, the STRESS values of the inter-observer variation were calculated by comparing all the whiteness ratings given by an observer and the whiteness ratings given by an average observer (i.e., the mean ratings); the STRESS values of the intra-observer variation were calculated by comparing the repeated evaluations made by each observer under the two light settings that were randomly selected for each observer.
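The STRESS computation described above can be sketched as follows. The text does not spell out the formula, so this sketch assumes the commonly used definition from the color-difference literature, STRESS = 100 · sqrt(Σ(a_i − F·b_i)² / Σ(F·b_i)²) with the scaling factor F = Σa_i² / Σ(a_i·b_i); the function name and the example ratings are hypothetical.

```python
import math


def stress(a, b):
    """STRESS (0 to 100) between two equally long sequences of ratings.

    Assumed definition (not stated in the text):
        F      = sum(a_i^2) / sum(a_i * b_i)
        STRESS = 100 * sqrt(sum((a_i - F*b_i)^2) / sum((F*b_i)^2))
    A value of 0 means the two sequences agree up to a scaling factor.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    sum_aa = sum(x * x for x in a)
    sum_ab = sum(x * y for x, y in zip(a, b))
    sum_bb = sum(y * y for y in b)
    f = sum_aa / sum_ab  # scaling factor F
    numerator = sum((x - f * y) ** 2 for x, y in zip(a, b))
    denominator = f * f * sum_bb
    return 100.0 * math.sqrt(numerator / denominator)


# Hypothetical ratings: one observer versus the mean observer.
observer_ratings = [95.0, 110.0, 128.0, 140.0]
mean_ratings = [98.0, 108.0, 125.0, 145.0]
inter_observer_stress = stress(observer_ratings, mean_ratings)
```

For the inter-observer case, `a` would hold one observer's ratings and `b` the mean ratings; for the intra-observer case, `a` and `b` would hold the first and repeated evaluations of the same observer.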
For the inter-observer variation, the STRESS values ranged between 8.81 and 21.65, with a mean of 12.62 and a standard deviation of 3.84. For the intra-observer variation, the STRESS values ranged between 5.05 and 15.98, with a mean of 10.36 and a standard deviation of 3.45. These STRESS values were lower than those in past studies [4,8,10], suggesting high reliability of the experiment. Figure 5 shows the scatter plot of the whiteness rating given by the observers versus the whiteness value of each sample under each light setting, which was calculated using different whiteness formulas. W_CIE was calculated using the chromaticities of the sample under the 6500 K (Ref) light setting, so its value remained constant for each sample regardless of the light setting. Thus, samples with a greater amount of FWAs had a higher W_CIE value.

Performance of various whiteness formulas
It can be observed that all the whiteness formulas had good correlations between the perceived and the calculated whiteness for the samples under each light setting, as shown in Fig. 5 and Table 2. The correlations, however, decreased when all the light settings were considered together, as shown in Table 2. Such a discrepancy was caused by the fact that all the light settings in this study contained violet radiation to excite the FWAs in the samples. When the samples were under the same light setting, those with a greater amount of FWAs were always rated as whiter. When all the samples were considered across different light settings, it can be observed that the CCT of the light setting had a significant effect on whiteness appearance, as shown in Fig. 5. All the formulas overestimated the whiteness of samples under the light settings, especially under those at 3000 K. In other words, a sample under a light setting with a lower CCT was rated as less white than one under a light setting with a higher CCT when both had a similar calculated whiteness value, which can be clearly observed in Fig. 5.
As the observers rated the whiteness of the samples on the same whiteness scale created by the four samples under the D65 simulator, the perceived and the calculated whiteness values of each sample should not only correlate with each other, but also be close to each other (i.e., the data points should lie close to the diagonal line in Fig. 5). Thus, the Pearson correlation coefficient is no longer appropriate, as the underlying model $y_i = a + b x_i$ allows a non-zero intercept [17,18]. The root-mean-square error (RMSE) between the whiteness ratings and the calculated whiteness values can better characterize the performance of the formulas than the Pearson correlation coefficient. As shown in Table 3, the RMSE values are higher under a lower CCT.
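The difference between the two metrics can be illustrated with a small sketch. The numbers below are hypothetical and chosen only to show the effect: a formula that overestimates every rating by a constant offset still yields a Pearson correlation of 1, whereas the RMSE reports the offset directly.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rmse(x, y):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Hypothetical data: calculated values exceed the ratings by 15 points.
calculated = [100.0, 110.0, 120.0, 130.0]
perceived = [value - 15.0 for value in calculated]

r_value = pearson_r(calculated, perceived)   # perfect correlation
error = rmse(calculated, perceived)          # equals the 15-point offset
```

Because the ratings in this study were anchored to a fixed scale, a systematic offset of this kind is a genuine prediction error, which is why RMSE is the more informative measure here.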

Degree of chromatic adaptation and whiteness
As all the samples had a very similar lightness level (i.e., Y = 88.5 to 93.9) and direction of chromaticity shift, as shown in Fig. 4, a similar magnitude of chromaticity shift was expected to result in a similar level of perceived whiteness. A similar magnitude of chromaticity shift, however, did not lead to a similar level of whiteness, as shown in Fig. 6, and the discrepancy was even more obvious after applying CAT02. Though the chromaticities of the samples under the 3000 K light settings had larger shifts after being transformed to those under CIE standard illuminant D65, as shown in Fig. 7(a), these samples were rated as less white under the 3000 K light settings. Such a phenomenon was likely caused by a lower degree of chromatic adaptation under a light setting with a lower CCT, which has been identified in several recent studies [4,14]. In particular, Wei et al. [4] found that a series of nominally white samples appeared less white under illuminants with a lower CCT when FWA excitation did not happen. Similarly, Zhai and Luo [14] found that the whitest sample under an illuminant needed to have a chromaticity shift towards blue when the CCT of the illuminant was low.
Thus, we optimized the degree of chromatic adaptation factor D to minimize the RMSE between the perceived whiteness and W_CIE,CAT02,D. The optimized D shifted the chromaticities of some samples towards a lower whiteness direction, as shown in Fig. 7(b), which resulted in a much smaller difference between the perceived whiteness and W_CIE,CAT02,D for all the CCT levels, as shown in Fig. 8(b) and Table 4. The degree of chromatic adaptation under 3000 K in this study was a little lower than that in Wei et al. [4], which was likely because the haploscopic viewing condition used in this study only allowed the sensory chromatic-adaptation mechanism, but not the cognitive chromatic-adaptation mechanism, to operate separately in the two eyes [16]. Nevertheless, the finding that a higher CCT illuminant introduced a higher degree of chromatic adaptation corroborated those in recent studies [4,14]. The D values of 0.720, 0.752, and 0.772 for the 3000, 4000, and 5000 K illuminants were optimized using a 6500 K illuminant as a reference, and may require further validation.
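The optimization of D can be sketched as follows, under several stated assumptions. The CAT02 matrix and the two-illuminant von Kries scaling with an incomplete-adaptation factor D follow the published CAT02 form; the use of the adapted Y in the whiteness formula, the grid search, the 3000 K white point, and the sample values are all illustrative assumptions and are not taken from the experiment.

```python
import math

# CAT02 matrix (XYZ to sharpened RGB), as published with CIECAM02.
M = ((0.7328, 0.4296, -0.1624),
     (-0.7036, 1.6975, 0.0061),
     (0.0030, 0.0136, 0.9834))


def mat_vec(m, v):
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))


def inverse_3x3(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return (((e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det),
            ((f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det),
            ((d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det))


M_INV = inverse_3x3(M)
D65_XN, D65_YN = 0.3138, 0.3310  # CIE 1964 10° chromaticity of D65


def xyY_to_XYZ(x, y, Y=100.0):
    return (x / y * Y, Y, (1.0 - x - y) / y * Y)


def cat02(xyz, white_src, white_ref, d):
    """Transform xyz from the source white to the reference white with factor d."""
    rgb = mat_vec(M, xyz)
    rgb_w = mat_vec(M, white_src)
    rgb_wr = mat_vec(M, white_ref)
    scale = white_src[1] / white_ref[1]
    adapted = tuple(c * (d * scale * cr / cw + 1.0 - d)
                    for c, cw, cr in zip(rgb, rgb_w, rgb_wr))
    return mat_vec(M_INV, adapted)


def whiteness_cat02(xyz, white_src, d):
    """CIE whiteness of the CAT02-adapted sample (assumption: adapted Y is used)."""
    X, Y, Z = cat02(xyz, white_src, xyY_to_XYZ(D65_XN, D65_YN), d)
    total = X + Y + Z
    return Y + 800.0 * (D65_XN - X / total) + 1700.0 * (D65_YN - Y / total)


def fit_d(samples, white_src, ratings, steps=1000):
    """Grid search for the D in [0, 1] that minimizes the RMSE to the ratings."""
    def rmse(d):
        errors = [(whiteness_cat02(s, white_src, d) - r) ** 2
                  for s, r in zip(samples, ratings)]
        return math.sqrt(sum(errors) / len(errors))
    return min((i / steps for i in range(steps + 1)), key=rmse)


# Hypothetical 3000 K white point and samples; ratings are synthesized at D = 0.72.
src_white = xyY_to_XYZ(0.4369, 0.4041)
samples = [xyY_to_XYZ(0.430, 0.398, 90.0),
           xyY_to_XYZ(0.425, 0.393, 92.0),
           xyY_to_XYZ(0.420, 0.388, 93.0)]
ratings = [whiteness_cat02(s, src_white, 0.72) for s in samples]
best_d = fit_d(samples, src_white, ratings)  # recovers the synthetic D
```

With D = 1 the source white maps exactly onto the D65 white and receives a whiteness of 100; with D < 1 the adapted chromaticities retain part of the warm cast of the source, which lowers the calculated whiteness in the direction the observers reported.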

Conclusion
A psychophysical experiment was carried out to investigate the whiteness of surface colors under an arbitrary light source. The observers evaluated the whiteness of samples under a haploscopic viewing condition. Four samples containing different amounts of FWAs were placed under a high-quality D65 simulator to create a whiteness scale, which was used to help the observers evaluate the whiteness of eight samples under six light settings, two at each CCT level (i.e., 3000, 4000, and 5000 K). Various whiteness formulas were found to predict the whiteness of samples well under each light setting but failed to predict the whiteness across the light settings at different CCTs. With the same chromaticity shift, a sample under a light setting with a lower CCT was perceived to be less white, which suggested a lower degree of chromatic adaptation. Building on recent studies suggesting the effectiveness of using CAT02 with the CIE whiteness formula, it was found that an optimized degree of chromatic adaptation factor D in CAT02 can significantly reduce the discrepancy between the perceived and calculated whiteness values across different CCTs. In short, it is proposed that the whiteness of a surface color under an arbitrary light source can be characterized using the CIE whiteness formula, with the sample chromaticity transformed to the D65 illuminant using CAT02 and an adjusted degree of chromatic adaptation factor D.

Funding
The National Natural Science Foundation of China (No. 61705191).