Observer metamerism to display white point using different primary sets

Abstract: Displays with different primary sets were found to introduce perceived color mismatch between stimuli that are computationally metameric and to affect the variations of the perceived color difference of metameric stimuli among observers (i.e., observer metamerism). In this study, computational analyses and psychophysical experiments were carried out to investigate the possibilities of increasing the color gamut area of a commercially available liquid crystal display (LCD) system using 16 three-primary sets, so that the perceived color difference of the white point between the system and the reference display and observer metamerism can be minimized. It was found that the primary set with the peak wavelengths of 450, 525, and 665 nm was able to increase the sRGB color gamut by 72.1% in the CIE 1931 chromaticity diagram, which was found to have a strong correlation to the color volume of wide color gamut displays, while introducing the minimal color mismatch to the white point of the reference display and the minimal observer metamerism. The small white point color mismatch could be due to the similar wavelengths of the blue and green primaries in comparison to the reference display. In addition, the experiment results suggested that the CIE 2006 2° Color Matching Functions (CMFs) had better performance in characterizing the color match of the white point than the CIE 1931 2°, 1964 10°, and 2006 10° CMFs, which could be due to the fact that the stimulus used in the experiment only had a field of view (FOV) around 3.8°.


Introduction
The color gamut is a critically important characteristic of a display. It determines the range of colors that the display can produce, with a larger color gamut allowing a wider range of colors. The successive releases of new recommendations for standard system colorimetries, from sRGB [1], Adobe RGB [2], and DCI-P3 [3] to Rec. 2020 [4], clearly suggest a preference for larger color gamuts in the display industry. The color gamut of a display is fundamentally determined by its primaries. A primary set with chromaticities closer to the boundary of a chromaticity diagram (i.e., the spectrum locus) results in a larger color gamut.
Based on Grassmann's laws [5], stimuli with the same chromaticities in the CIE chromaticity diagram but composed of different primary sets should not be expected to introduce color mismatch, and the color-matching data should be easily transformed between different primary sets [6]. Various psychophysical studies, however, have reported failures of color matches made using different primary sets [7][8][9][10][11][12][13]. Specifically, stimuli with the same chromaticities but different spectra were found to have a perceived color difference [7,9,11], and stimuli that were visually matched using different primaries were found to have different chromaticities [8,10,12,13]. The failure of color matches using different primary sets was fundamentally caused by the failure of Grassmann's laws and the fact that the CIE chromaticity diagram does not represent the underlying cone mechanisms in the human visual system [14]. Moreover, changing the primary set was also found to cause observer metamerism, a phenomenon in which a pair of color stimuli composed of two primary sets appears to match to one observer but to mismatch to other observers. It was found that the degree of observer metamerism (i.e., the variation of the color differences perceived by different observers) varied with the primary sets [15][16][17][18][19][20][21].
Four CIE Color Matching Functions (CMFs), namely the CIE 1931 2°, CIE 1964 10°, CIE 2006 2°, and CIE 2006 10° CMFs, were developed to characterize the color match of an average color-normal observer [22][23][24], as shown in Fig. 1. The CIE 1931 2° CMFs were derived based on the results of two color matching experiments [25][26][27]; the other three CMFs were derived based on the works of Stiles and Burch [28] and Speranskaya [29]. In addition, a model was developed in 2006 to characterize the effects of field size and age on cone fundamentals [23], which allowed transformations to different corresponding CMFs [24]. In this article, we refer to the two sets of CMFs in [24] as the CIE 2006 2° and 10° CMFs. Various factors, such as lens optical density, macular pigment optical density, and photopigment optical density, were also found to cause variations in CMFs. In 2015, Asano proposed an individual colorimetric observer model [30] based on the CIE 2006 physiological observer model. The individual colorimetric observer model characterizes the possible variations caused by eight physiological parameters, together with age and field of view (FOV), to derive the individual cone fundamentals and CMFs. Though the CMFs vary from person to person, it is critically important to investigate the performance of the CMFs for an average color-normal observer due to their wide adoption in manufacturing and quality control processes. The main objective of this study was to investigate the possibility of increasing the color gamut of a display system using three-primary sets, with the goal of minimizing the observer metamerism and the perceived color difference of the white point between the system and a commercially available liquid crystal display (LCD). The a priori hypothesis was that certain primary sets can increase the color gamut while having the smallest observer metamerism of the white point in comparison to a commercially available LCD.
Computational analyses based on 1000 individual CMFs were performed, and psychophysical experiments, in which the human observers performed color match of the white point using 16 primary sets, were carried out for validation.

Apparatus
The experiment was carried out using two devices, which were placed side by side to produce the color stimuli. An iPad Air 2, whose LCD was covered by a piece of black paper with a 4 cm × 4 cm opening cut in the center, was used to produce the reference stimulus. An 11-channel spectrally tunable LED device, together with a diffusing chamber, was used to produce the matching stimuli. The interior of the diffusing chamber was painted white and a diffuser was placed inside the chamber. An additional diffuser was placed at the opening of the chamber to make the matching stimulus close to a Lambertian stimulus. Similarly, a piece of black paper with a 4 cm × 4 cm opening cut in the center was placed in front of the diffuser, so that the matching stimulus had the same size as the reference stimulus. The distance between the two stimuli was 22 cm. A chin rest was mounted 60 cm in front of the two stimuli, with the observer's sagittal plane aligned with the center between the two stimuli during the experiment. Therefore, each stimulus occupied an FOV around 3.8° and the viewing angle between the two stimuli was around 20°. The observers were free to move their eyes and to view the stimuli successively or simultaneously during the experiment. Figure 2 shows a schematic illustration of the experimental setup and a photograph taken at the observer's eye position during the experiment.
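The stated viewing geometry can be checked with basic trigonometry. The following is a minimal sketch, assuming that each extent is centered on the line of sight and that the 22 cm separation is measured between the centers of the two stimuli (the text does not say whether it is center-to-center or edge-to-edge); the function name is illustrative.

```python
import math

def subtended_angle_deg(extent_cm: float, distance_cm: float) -> float:
    """Full angle (degrees) subtended at the eye by an extent centered on the line of sight."""
    return math.degrees(2.0 * math.atan((extent_cm / 2.0) / distance_cm))

# 4 cm x 4 cm opening viewed from the chin rest at 60 cm
stimulus_fov = subtended_angle_deg(4.0, 60.0)

# Assumption: 22 cm is the center-to-center separation of the two stimuli
separation_angle = subtended_angle_deg(22.0, 60.0)

print(f"stimulus FOV: {stimulus_fov:.2f} deg")
print(f"angle between stimuli: {separation_angle:.1f} deg")
```

Under these assumptions the results are close to the values of around 3.8° and around 20° given in the text.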

Reference stimulus and primary sets
The reference stimulus produced by the iPad was calibrated to have chromaticities around D65 with a luminance L10 around 250 cd/m², with the measured spectral power distribution (SPD) producing a correlated color temperature (CCT) of 6485 K, a Duv of +0.0041, and a luminance L10 of 253 cd/m² (note: the subscript 10 refers to calculations using the CIE 1964 10° CMFs). Eight of the 11 channels in the spectrally tunable LED device were used as the primaries of the matching stimulus: three with peak wavelengths in the longer wavelength region (labeled R1-R3), two in the middle wavelength region (labeled G1 and G2), and three in the shorter wavelength region (labeled B1-B3). Figure 3 shows the SPDs of the reference stimulus and the eight primaries at full intensity, which were measured using a JETI specbos 1211UV spectroradiometer at the observer's eye position. Table 1 summarizes the colorimetric and photometric characteristics of the eight primaries.
These eight channels resulted in 18 primary sets, with each set including one from R1 to R3, one from G1 and G2, and one from B1 to B3. The primary sets were calibrated using a piecewise linear interpolation assuming constant chromaticity coordinates (PLCC) model and the CIE 1964 CMFs. The performance of the reverse model (i.e., converting X10Y10Z10 values to RGB values) was evaluated using the 24 colors in the Macbeth ColorChecker, with an average color difference of 3 in ΔE00 units. Two of the 18 sets (i.e., R1G1B3 and R1G2B3) were discarded in the experiment, since they were not able to cover a large gamut between 3000 and 10000 K at a luminance of 250 cd/m² due to the hardware design. Therefore, only 16 primary sets were used in the experiment. For each primary set, a customized program was developed based on the PLCC model, so that the four arrow keys on a Bluetooth keyboard could remotely adjust the chromaticities of the matching stimulus in the CIE 1976 u'10v'10 chromaticity diagram at a constant L10 value of 250 cd/m². The left and right arrow keys allowed adjustments along the blackbody locus with a step of 2 mired (i.e., 10⁶ K⁻¹); the up and down arrow keys allowed adjustments perpendicular to the blackbody locus with a step of 0.0008 u'10v'10 units. All the adjustments started from the stimulus with a CCT of 7000 K and a Duv of +0.04, which had a significant color difference from the reference stimulus. Figure 4 shows the chromaticities of the eight primaries, the reference stimulus, and the starting point, together with the chromaticity ranges of the 16 primary sets with the L10 at 250 cd/m². Table 2 summarizes the color gamut areas of the 16 primary sets, the reference display, and the four standard color gamuts in the CIE 1931 chromaticity diagram and the CIE 1976 UCS chromaticity diagram.
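The construction of the 16 primary sets and the 2 mired adjustment step can be illustrated with a short sketch. It assumes only the channel labels and the two discarded sets named above; the helper name step_cct is hypothetical, and which arrow key corresponds to a positive or negative mired step is not stated in the text.

```python
from itertools import product

reds = ["R1", "R2", "R3"]
greens = ["G1", "G2"]
blues = ["B1", "B2", "B3"]

# Every combination of one red, one green, and one blue channel: 3 x 2 x 3 = 18 sets
all_sets = ["".join(combo) for combo in product(reds, greens, blues)]

# The two sets reported as discarded due to the hardware design
discarded = {"R1G1B3", "R1G2B3"}
used_sets = [name for name in all_sets if name not in discarded]

def step_cct(cct_kelvin: float, mired_step: float) -> float:
    """Move a CCT by a step expressed in mired (reciprocal megakelvin, 1e6 / CCT)."""
    mired = 1e6 / cct_kelvin
    return 1e6 / (mired + mired_step)

print(len(all_sets), len(used_sets))
# One +2 mired step from the 7000 K starting point lowers the CCT by roughly 97 K
print(f"{step_cct(7000.0, 2.0):.1f} K")
```

Because the step is uniform in mired rather than in kelvin, one key press changes the CCT by a larger amount at high CCTs than at low CCTs.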

Computational analysis
The computational analyses were performed using 1000 individual CMFs, based on the individual colorimetric observer model proposed by Asano [30]. These CMFs were used to simulate the color matching experiment as if it had been performed by 1000 observers. The CMFs were derived using the Monte Carlo method, with the input age fixed at 25, which was estimated to be the average age of the observers in the experiment, and an FOV of 4°, which was similar to our experimental setup. Figure 5 shows these 1000 CMFs. Each of these 1000 CMFs was used to derive three scalars for the matching stimulus, one for each primary in a primary set, so that the reference and matching stimuli had the same tristimulus values under this CMF. Thus, a total of 1000 matching stimuli were derived for each primary set. The one-standard-deviation ellipses [31] were fitted based on the chromaticities calculated using these 1000 SPDs and the four CMFs (i.e., the CIE 1931 2°, CIE 1964 10°, CIE 2006 2°, and CIE 2006 10° CMFs), as shown in Fig. 6. The areas of the ellipses and the chromaticity distances between the reference stimulus and the average of the 1000 chromaticities are shown in Fig. 7, with the numerical values summarized in the Appendix (Table 3).
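Deriving the three scalars for one individual CMF amounts to solving a 3 × 3 linear system in which each column holds the tristimulus values of one primary at full intensity. The sketch below assumes SPDs and CMFs sampled on a common wavelength grid and linearly additive primaries; the function names and the three-sample toy data are illustrative only and are not taken from the study.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def tristimulus(spd, cmf, step_nm=1.0):
    """Integrate an SPD against the three CMF curves with a rectangular sum."""
    return [sum(s * c for s, c in zip(spd, curve)) * step_nm for curve in cmf]

def primary_scalars(primary_spds, reference_spd, cmf):
    """Scalars for three primaries so that their mixture matches the reference XYZ."""
    target = tristimulus(reference_spd, cmf)
    columns = [tristimulus(p, cmf) for p in primary_spds]  # XYZ of each primary
    a = [[columns[j][i] for j in range(3)] for i in range(3)]
    d = det3(a)
    scalars = []
    for j in range(3):
        # Cramer's rule: replace column j with the target tristimulus values
        aj = [row[:] for row in a]
        for i in range(3):
            aj[i][j] = target[i]
        scalars.append(det3(aj) / d)
    return scalars

# Toy data (hypothetical): three wavelength samples, one "observer" CMF
toy_cmf = [[1.0, 0.2, 0.0], [0.3, 1.0, 0.1], [0.0, 0.1, 1.0]]
toy_primaries = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
toy_reference = [0.5, 0.8, 0.3]
toy_scalars = primary_scalars(toy_primaries, toy_reference, toy_cmf)
print(toy_scalars)
```

Repeating this solve with each of the 1000 individual CMFs would give 1000 matching SPDs per primary set. A negative scalar would indicate a reference outside the gamut of that primary set, which this sketch does not handle.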
The area of the ellipse can be used to characterize whether a certain primary set caused large inter-observer variations. It can be observed that three primary sets (R1G2B2, R2G2B2, and R3G2B2) resulted in the smallest variations among the observers, regardless of the CMFs. In addition, the primary sets including B3 always caused much larger inter-observer variations than the other primary sets. Moreover, the sizes and orientations of the four ellipses for each primary set were always similar, suggesting that the variations among individuals were similar in comparison to an average observer as characterized using a typical CMF.
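As an illustration of how an ellipse area can summarize inter-observer variation, the sketch below computes the area of a one-standard-deviation ellipse from the sample covariance of a set of chromaticities. The fitting procedure actually used follows [31] and is not detailed in this section, so the covariance-based definition (area = π·sqrt(det Σ)) is an assumption, and the function name is hypothetical.

```python
import math

def one_sigma_ellipse_area(points):
    """Area of the 1-sigma covariance ellipse of 2-D points: pi * sqrt(det(cov))."""
    n = len(points)
    mean_u = sum(p[0] for p in points) / n
    mean_v = sum(p[1] for p in points) / n
    # Unbiased sample covariance terms
    s_uu = sum((p[0] - mean_u) ** 2 for p in points) / (n - 1)
    s_vv = sum((p[1] - mean_v) ** 2 for p in points) / (n - 1)
    s_uv = sum((p[0] - mean_u) * (p[1] - mean_v) for p in points) / (n - 1)
    det = s_uu * s_vv - s_uv ** 2
    return math.pi * math.sqrt(max(det, 0.0))

# Toy example: four points at the corners of a 2 x 2 square
toy_area = one_sigma_ellipse_area([(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)])
print(f"{toy_area:.4f}")
```

The semi-axes of such an ellipse are the square roots of the eigenvalues of the covariance matrix, so the area does not depend on the orientation of the ellipse.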
In addition, the computational results suggested that shifting the red primary from R1 to R2 and to R3 can increase the color gamut by 27.8% and 32.9%, respectively, in the CIE 1931 chromaticity diagram, as shown in Fig. 8, while introducing little observer metamerism and metamerism failure. Using the primary set R3G2B2 can increase the color gamut area of sRGB by 72.1%. It is worthwhile to mention that the color gamut calculated in the CIE 1931 chromaticity diagram was found to have a higher correlation to the gamut volume than those calculated in other chromaticity diagrams for wide color gamut displays [31].
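The gamut area of a three-primary set in a chromaticity diagram is the area of the triangle spanned by the primaries, and a percentage increase follows from the ratio of two such areas. A minimal sketch, using the standard sRGB primary chromaticities in the CIE 1931 xy diagram; the function names are illustrative, and the chromaticities of the study's primaries (Table 1) are not reproduced here.

```python
def triangle_area(p1, p2, p3):
    """Area of a triangle from three (x, y) chromaticities (shoelace formula)."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2.0

def percent_increase(area, reference_area):
    """Relative gamut area increase over a reference gamut, in percent."""
    return 100.0 * (area / reference_area - 1.0)

# sRGB primaries (red, green, blue) in CIE 1931 xy
SRGB_XY = [(0.64, 0.33), (0.30, 0.60), (0.15, 0.06)]
srgb_area = triangle_area(*SRGB_XY)

# Area implied by a 72.1% increase over sRGB
implied_area = srgb_area * 1.721

print(f"sRGB area: {srgb_area:.5f}")
print(f"area implied by +72.1%: {implied_area:.5f}")
```

The same functions can be applied in the CIE 1976 u'v' diagram by converting the primary chromaticities first; the resulting percentages generally differ between diagrams.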
The distance between the average adjusted chromaticities and the chromaticities of the reference stimulus can be used to characterize whether a certain CMF can be applied to characterize the color match of the stimuli produced using the two primary sets (i.e., the one used for color match and the one used for producing the reference stimulus). The average chromaticity differences of the 16 primary sets were 0.0160, 0.0101, 0.0117, and 0.0115 in u'v' units for the CIE 1931 2°, CIE 1964 10°, CIE 2006 2°, and CIE 2006 10° CMFs, respectively, suggesting no significant differences among the four CMFs on average. The three primary sets (R1G2B2, R2G2B2, and R3G2B2) also had the smallest chromaticity distances, in addition to the smallest ellipse areas. This suggested that the white point of a display using these three primary sets would have the smallest color differences from the reference stimulus, if it were calibrated using the four CMFs.
Fig. 6. One-standard-deviation ellipses fitted based on the 1000 chromaticities, which were calculated using 1000 SPDs and the four CMFs, under each primary set and the chromaticities of the reference stimulus calculated using the four CMFs. These 1000 SPDs were derived using each of the 1000 individual CMFs to achieve a mathematical color match to the reference stimulus. For illustration purposes, the blackbody loci are plotted using the CIE 1964 10° CMFs.
Fig. 7. (a) Areas of the ellipses shown in Fig. 6 for the different primary sets. The larger the ellipse area, the larger the inter-observer variation when using a certain primary set; (b) Chromaticity distances between the center of the ellipses and reference stimulus shown in Fig. 6 for the different primary sets. The smaller the chromaticity distance, the better the performance of a CMF in characterizing the color match. The three primary sets having the smallest areas and distances are highlighted with a red *.
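The chromaticity distances in u'v' units reported in this section are Euclidean distances in the CIE 1976 UCS diagram. A minimal sketch of the conversion from xy chromaticities and of the distance calculation is given below; the function names are illustrative, and the D65 coordinates are the standard CIE 1931 2° values, not the measured reference stimulus. The same formulas apply to chromaticities computed with any of the four CMFs.

```python
import math

def xy_to_uv_prime(x, y):
    """Convert xy chromaticities to CIE 1976 u'v' chromaticities."""
    denom = -2.0 * x + 12.0 * y + 3.0
    return 4.0 * x / denom, 9.0 * y / denom

def uv_distance(uv_a, uv_b):
    """Euclidean distance between two points in the u'v' diagram."""
    return math.hypot(uv_a[0] - uv_b[0], uv_a[1] - uv_b[1])

# Standard D65 white point in CIE 1931 xy
d65_uv = xy_to_uv_prime(0.3127, 0.3290)
print(f"u' = {d65_uv[0]:.4f}, v' = {d65_uv[1]:.4f}")

# Distance between two hypothetical chromaticities 0.003 apart along u' and 0.004 along v'
example_distance = uv_distance((0.2000, 0.4700), (0.2030, 0.4740))
print(f"{example_distance:.4f}")
```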

Psychophysical experiments
The experiment was carried out in the Color and Illumination Laboratory at The Hong Kong Polytechnic University. The procedure and protocol of the experiment were approved by the Institutional Review Board at The Hong Kong Polytechnic University.

Observers
Eight observers (two females and six males) between 21 and 31 years of age (mean = 25, std. dev. = 3.2) completed the experiment. All the observers had normal color vision, as tested using the Ishihara Color Vision Test, and none of them had prior knowledge of the study.

Experimental procedures
Upon arrival, the observer completed the general information survey and the Ishihara Color Vision Test. Then the experimenter explained the procedure and task to the observer and escorted the observer to the equipment. The observer was asked to fix his or her chin on the chin rest throughout the experiment. The experimenter read the instructions and answered questions raised by the observer. Then the general illumination in the space was switched off and the observer was adapted in the dark condition for two minutes.
The reference stimulus, which was produced by the iPad, was presented on the left; the matching stimulus, which was produced by the spectrally tunable LED device with the starting chromaticities described above, was presented on the right. Both devices had been switched on for 30 minutes prior to the experiment for stabilization. The observer was asked to adjust the color appearance of the matching stimulus to match that of the reference stimulus as closely as possible using the four arrow keys on the keyboard. As described in Section 2.2, the left and right arrows allowed adjustments along the blackbody locus; the up and down arrows allowed adjustments perpendicular to the blackbody locus. The observer was allowed to take as much time as he or she needed. Once the observer confirmed the best match, he or she was asked to press the enter key to proceed to the next primary set. The same procedure was followed for each primary set. Each observer completed the color match using four of the 16 primary sets twice for evaluating the intra-observer variations, giving a total of 20 matches. The order of the 20 matches was randomized. The entire experiment took around 30 minutes for each observer. In total, 160 matches were made.

Results
The SPDs of all the matched stimuli that were adjusted by the observers were measured after the completion of the experiment using the JETI spectroradiometer from the observer's eye position. These SPDs were used to derive photometric and colorimetric values. It was found that the values of L10 derived from the measured SPDs were on average 2.2% higher than the values calculated using the PLCC model. Figure 9 shows the chromaticities adjusted by all the observers using each primary set, together with the one-standard-deviation ellipses. The intra- and inter-observer variations were characterized using the mean color difference from the mean (MCDM) in the CIE 1976 u'10v'10 chromaticity diagram. The intra-observer variation was evaluated using the differences between the chromaticities that were adjusted by each observer twice using each of the four primary sets, which resulted in MCDM values between 0.0012 and 0.0033, with a mean of 0.0020 in u'10v'10 units. The chromaticities and the fitted ellipses of these repeated adjustments are also shown in Fig. 9.
Fig. 9. Chromaticities of the stimuli adjusted by the observers, together with the fitted one-standard-deviation ellipse, using each primary set to match the color appearance of the reference stimulus whose chromaticities are labeled with +. The chromaticities and ellipses in blue are the repeated adjustments.
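The MCDM used here is the mean Euclidean distance between each adjusted chromaticity and the mean of all the adjusted chromaticities. A minimal sketch of that common definition follows; the function name and the toy chromaticities are hypothetical, and the exact grouping of the matches used in the study is not reproduced.

```python
import math

def mcdm(points):
    """Mean color difference from the mean for a list of (u', v') chromaticities."""
    n = len(points)
    mean_u = sum(p[0] for p in points) / n
    mean_v = sum(p[1] for p in points) / n
    return sum(math.hypot(u - mean_u, v - mean_v) for u, v in points) / n

# Two repeated matches by one observer (toy values): the MCDM is half their separation
repeat_mcdm = mcdm([(0.2000, 0.4700), (0.2000, 0.4740)])
print(f"{repeat_mcdm:.4f}")
```

For the intra-observer case with two repeated matches, this reduces to half the distance between the two chromaticities; for the inter-observer case, the points would be the chromaticities adjusted by the eight observers for one primary set.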

Intra- and inter-observer variations
The inter-observer variation was evaluated using the differences between the chromaticities adjusted by each observer and the average chromaticities adjusted by all the observers (i.e., an average observer) for each primary set, with a mean MCDM value of 0.0057 in u'10v'10 units. Figure 10 shows the inter-observer variation of each primary set, based on the MCDM value and the area of the fitted one-standard-deviation ellipse. Both the intra- and inter-observer variations were comparable to past studies [8,32,33]. Figure 11 shows the one-standard-deviation ellipses that were fitted using the chromaticities of the stimuli adjusted by the observers, which were calculated using the four CMFs, with the ellipse area and the distance between the average chromaticities of the adjusted and reference stimuli being shown in Fig. 12 and the numerical values being summarized in the Appendix (Table 3).
Fig. 11. One-standard-deviation ellipses fitted based on the chromaticities of the matched stimuli, which were calculated using the SPDs adjusted by the observers and the four CMFs, and the chromaticities of the reference stimulus calculated using the four CMFs. For illustration purposes, the blackbody loci are plotted using the CIE 1964 10° CMFs.
Fig. 12. (a) Areas of the ellipses shown in Fig. 11; (b) Chromaticity distances between the center of the ellipses and reference stimulus shown in Fig. 11. The three primary sets having the smallest areas and distances are highlighted with a red *.

Effects of primary sets and CMFs on color matching
The experiment results showed that the four primary sets using B3 (i.e., R2G1B3, R3G1B3, R2G2B3, and R3G2B3) resulted in much larger variations among the observers, which was similar to the computational analyses. The average chromaticity distances between the matched stimuli using different primary sets and the reference stimulus were 0.0123, 0.0109, 0.0072, and 0.0124 in u'v' units for the CIE 1931 2°, CIE 1964 10°, CIE 2006 2°, and CIE 2006 10° CMFs, respectively, which suggested that the CIE 2006 2° CMFs had the best performance in characterizing the color matches using these primary sets. Similar to the computational results, the three primary sets (R1G2B2, R2G2B2, and R3G2B2) resulted in the smallest variations among the observers and also had the smallest color differences in comparison to the reference stimulus as characterized using these four CMFs.

Comparisons between experiment results and computational analyses
It can be observed that the centers of the ellipses in Fig. 11 were closer to the starting point than those in Fig. 6. This could be due to the fixed starting point used in the experiment, while a recent study suggested using multiple starting points that were uniformly distributed around the expected end point to avoid a starting point bias [34]. In addition, the ellipses in Figs. 6 and 11 had similar orientations for most primary sets. Figure 13 compares the ellipse areas and chromaticity distances derived from the experiment with those derived from the computational analyses. As shown in Fig. 13(a), the observer variations were only underestimated in the computational analyses when the primary set R1G2B2 was used. For the other 15 primary sets, the ellipses derived from the experiment generally had similar areas in comparison to those derived from the computational analyses, with all four CMFs resulting in an average difference of only 1%. Similarly, the primary set R1G2B2 also caused the largest differences between the experiment and the computational analyses, in terms of the chromaticity difference between the stimuli adjusted by an average observer and the reference stimulus, as shown in Fig. 13(b). In comparison to the computational analyses, the average chromaticity differences of all the 16 primary sets derived from the experiment were 19% and 34% smaller when the CIE 1931 2° and CIE 2006 2° CMFs were used, while they were 13% and 14% larger when the CIE 1964 10° and CIE 2006 10° CMFs were used. Therefore, the individual colorimetric observer model was generally a good tool for estimating individual color matches, but may not always be accurate for different primary sets.

Effects of primary sets
The best performance of the three primary sets (i.e., R1G2B2, R2G2B2, and R3G2B2), as found in the computational analyses and the experiment, was likely due to the similar wavelength ranges of the blue and green primaries between the primary sets and the reference stimulus, as shown in Fig. 3. When the spectra of two stimuli match, CMFs would no longer significantly affect the color match. The insignificant effect of the red primary could be due to the fact that the CMFs in the red wavelength region generally have much gentler slopes (i.e., the first-order derivative of the CMF value with respect to wavelength is small) and are therefore less sensitive to shifts in this region. On the other hand, the wavelengths of the primaries of these three sets, especially B2 and G2, were also found to allow good color matches in a recent study [8], which used a smooth broadband source as a reference stimulus. In addition, the wavelengths of B2 and G2 and those of the reference stimulus were close to the "prime-color" (PC) regions identified by Thornton [35,36].
With the stimuli and primaries used in this study, it was difficult to explain why the three primary sets had the best performance. In order to better understand the underlying good performance of the primaries in these wavelength regions, reference stimuli with various SPDs (e.g., broadband SPDs and SPDs composed of primaries at various wavelength regions) should be used in future studies.
The effects of primary sets can also be investigated by looking at how the change of an individual primary affected the adjusted chromaticities. Figure 14 shows the shifts of the average adjusted chromaticities that were calculated using the four CMFs when only one primary was switched. For example, the first column in Fig. 14 shows the average chromaticity shifts when the red primary was switched from R1 to R2 and from R2 to R3, while the green and blue primaries were fixed. It can be observed that the two 2° CMFs, shown in Figs. 14(a) and 14(c), had similar chromaticity shift directions in response to the change of a primary, while the two 10° CMFs, shown in Figs. 14(b) and 14(d), had similar shifts. More importantly, it can be observed that the change of the blue primary generally introduced the largest shifts, while the change of the red primary introduced the smallest shifts.

Individual observers
The variations of individual observers merit further discussions. For each primary set, the sizes and orientations of the four ellipses, which were derived based on the chromaticities calculated using the four CMFs, were generally similar, as shown in Fig. 11. The shifts from the average chromaticities (i.e., the average observer) to the chromaticities of the adjusted stimuli are plotted for each observer in Fig. 15. It can be observed that the chromaticities of the stimuli adjusted by some observers were consistently shifted towards a certain direction, in comparison to an average observer. For example, the chromaticities adjusted by Observer 3 were consistently shifted towards the direction of positive u' and negative v' in comparison to an average observer.

Experiment setup
Last but not least, the experiment setup used in the psychophysical experiments merits further comments. During the experiment, the observer was free to look at the reference and matching stimuli successively or simultaneously, as the stimuli were not next to each other. This could lead to the stimuli being imaged on different retinal regions and cause differences due to the non-uniform distributions of the macular pigment at different retinal locations [14]. It is difficult to predict how this affected the results, but it needs to be considered in future experiments.
Fig. 14. [...] CMFs; (d) CIE 2006 10° CMFs. (Note: some combinations only had the primary shifts once. For example, when G1 and B1 were fixed, R was switched from R1 to R2 to R3; when G1 and B3 were fixed, R was only switched from R2 to R3.)
Fig. 15. Shifts from the average chromaticities adjusted by the observers (i.e., an average observer) to the chromaticities of the stimuli adjusted by each observer using the 16 primary sets. The calculations were also performed using four CMFs. In total, there are 64 arrows in each figure, with 16 arrows in each color (i.e., each CMF set).

Conclusion
Computational analyses and psychophysical experiments were performed to investigate the possible increase of the color gamut of a commercially available LCD system by changing the primaries, while minimizing the perceived color differences of the white point and observer metamerism. Sixteen real primary sets that were composed of three blue, two green, and three red primaries were considered to match the white point of a reference display. The computational analyses employed 1000 individual CMFs that were derived using the individual colorimetric observer model [30] to achieve a computational color match to the white point of the reference display. The psychophysical experiments involved eight human observers performing color matches using stimuli with an FOV around 3.8°. The computational analyses and psychophysical experiments generally suggested similar effects of primary sets and CMFs, while significant differences existed for some specific primary sets. It was found that the primary set with the peak wavelengths of 450, 525, and 665 nm was able to increase the sRGB color gamut by 72.1% in the CIE 1931 chromaticity diagram [37] while introducing the minimal color mismatch to the white point of the reference display and the minimal observer metamerism. In addition, the psychophysical experiment results suggested that the CIE 2006 2° CMFs on average had the best performance in predicting the perceived color difference of the white point between these 16 primary sets and the reference display. Future experiments involving more observers and stimuli with different types of SPDs and primaries are needed to better understand observer metamerism.