Effect of color gamut and luminance on observer metamerism in HDR displays

Observer metamerism (OM) is one of the potential issues in HDR displays because of the required wide color gamuts and high peak luminance levels. A simulation was performed using hypothetical displays to investigate how OM in HDR displays would vary with changes in color gamuts and peak luminance levels. In this work, a robust metric, observer metamerism magnitude (OMM), is introduced to quantify the OM of paired displays. The OMM of a display pair depends on the similarity in spectral bandwidth between the two displays. Also, the effect of changes in peak luminance on OM was found to be small, increasing OMM by 7 to 8% when peak luminance doubles.


Introduction
Observer metamerism (OM) refers to the phenomenon that color matches for one observer do not hold for another observer under the same viewing condition. This phenomenon is intrinsically attributed to the fact that human color vision differs from person to person [1]. Various studies have pointed out that narrow-band primary displays increase the possibility of metameric failures [1,2,3], which can be a severe issue in color-critical applications such as color grading. As HDR displays are ultimately expected to cover the full extent of the Rec.2020 color gamut, it is inevitable for such displays to use spectrally extremely narrow light sources or color filters and/or to add more primaries [4,5]. Therefore, it is not surprising that OM is regarded as a potential issue of HDR displays [6]. Nonetheless, no extensive studies have been performed to investigate how OM varies with expansions in the color gamut of displays, because of the lack of ideal HDR displays and the difficulty of performing color matching experiments for OM. This paper introduces a simulation-based analysis of the effect of color gamut changes on OM in HDR displays. The significance of the effect of peak luminance changes on OM is also presented.

Procedures
Simulating OM in displays requires three ingredients: observers, color stimuli, and the displays to be evaluated. In particular, because inter-observer variability is a crucial factor causing OM, color matching functions (CMFs) representing the color-normal population are required in addition to suitable color stimuli and displays. This section describes, in turn, how the three factors were determined.

Color Matching Functions
In this study, 1,000 2° CMFs were generated according to the latest age distribution reported by the United Nations [7], based on the Asano model [8,9] and using the LMS-to-XYZ transformation proposed by the author. The 1,000 CMFs are meant to represent the CMFs of the color-normal population aged between 20 and 80, which is plausible because the Asano model was developed to account for physiologically probable distributions of observer variability. The number of observers also seemed sufficient as a sample to represent the color-normal population.

A set of peak luminance levels and chromaticity gamuts was selected in order to investigate observer metamerism magnitude (OMM). In terms of peak luminance, 500, 1,000, 2,000, and 4,000 cd/m² were chosen, taking the standards and recommendations for HDR displays [10] into account.

28th Color and Imaging Conference Final Program and Proceedings

For chromaticity gamuts, the three chromaticity gamut standards widely used in the display industry, Rec.709, DCI.P3, and Rec.2020, were considered as a starting point. Because various approaches to realizing chromaticity gamuts wider than DCI.P3 have been proposed [4,5], it was necessary to select several chromaticity gamuts covering the extent between DCI.P3 and Rec.2020. These intermediate chromaticity gamuts were selected based on the following assumptions and rules. First, considering that DCI.P3 covers approximately 72.0% of Rec.2020 in the xy chromaticity diagram, the intermediate chromaticity gamuts were defined to cover 75%, 80%, 85%, 90%, and 95% of Rec.2020, respectively. Specifying gamut coverage in xy chromaticity follows display industry practice and has been shown to correlate well with gamut volume [11]. Second, possible xy coordinates of each primary were chosen along curved lines passing through the xy coordinates of the corresponding primary in the three standards. The curved lines were determined using a shape-preserving piecewise cubic interpolation algorithm. Then, the xy coordinates of each primary of the intermediate chromaticity gamuts were determined to be precisely equidistant from each other, as shown in Figure 2. Some xy coordinates of the primaries of the intermediate chromaticity gamuts slightly exceeded the spectrum locus because of the curvature of the lines. Hence, the deviating coordinates were corrected by moving them to the nearest points on the spectrum locus.
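The coverage figure quoted above can be approximated with a few lines of Python. This is a minimal sketch, not the procedure used in the paper: it compares triangle areas in the xy diagram using the primaries published in the respective standards, and the function names are hypothetical. An area ratio equals true coverage only when the smaller triangle lies inside the reference triangle, so the result is only indicative.

```python
# Primaries (x, y) in R, G, B order, as published in the respective standards.
GAMUTS = {
    "Rec.709": [(0.640, 0.330), (0.300, 0.600), (0.150, 0.060)],
    "DCI.P3": [(0.680, 0.320), (0.265, 0.690), (0.150, 0.060)],
    "Rec.2020": [(0.708, 0.292), (0.170, 0.797), (0.131, 0.046)],
}


def triangle_area(primaries):
    """Area of the RGB triangle in the xy diagram (shoelace formula)."""
    (x1, y1), (x2, y2), (x3, y3) = primaries
    return 0.5 * abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))


def area_ratio(name, reference="Rec.2020"):
    """Ratio of two gamut areas; a proxy for coverage, not an intersection."""
    return triangle_area(GAMUTS[name]) / triangle_area(GAMUTS[reference])


if __name__ == "__main__":
    for name in GAMUTS:
        print(f"{name}: {100 * area_ratio(name):.1f}% of the Rec.2020 area")
```

With these coordinates the DCI.P3 area ratio comes out close to the 72.0% stated in the text.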

Spectral definitions
The next step in building hypothetical displays that cover the target chromaticity gamuts was to determine the spectral power distributions (SPDs) of the displays. For the sake of convenience, all hypothetical displays were assumed to have three primaries, red, green, and blue, and the SPD of each primary was assumed to be a Gaussian function following Equation (1), where M indicates a primary of a given display and λ denotes the wavelength range. The wavelength range, λ, was limited to 390 to 780 nm in 1 nm steps, considering the wavelength range of human sensitivity. The parameter µ is the peak wavelength of a primary, while σ modulates the spectral bandwidth of the SPD. Also, the white point for the chromaticity gamuts was assumed to be D65, (x, y) = (0.3127, 0.3290), according to the international HDR standard [12]. Importantly, the two parameters, µ and σ, determine the xy coordinates of the primaries of a display, but in some regions of color space numerous combinations of µ and σ meet the target xy coordinate. Also, only quantized xy coordinates were generated, with discrete values of µ and σ, to reduce the computational cost. For this reason, the ranges of µ for each primary were limited to [600 nm : 0.

I Compute all possible SPDs for a given primary using Equation (1) with a nested loop over µ and σ.
II Compute xy coordinates for the computed SPDs using the CIE 1931 standard observer, and calculate the Euclidean distances between the computed xy coordinates and the target xy coordinates, so that each pair of µ and σ is indexed with its Euclidean distance.
III Sort the µ and σ pairs in descending order of Euclidean distance, and then sort the pairs again in descending order of σ. After these sorts, the first pair indicates the broadest SPD with a relatively short distance.
IV Because the Gaussian function in Equation (1) with a pair of µ and σ is not scaled to the luminance level (Y) of a given primary, a scalar ($D_M$) for a given SPD ($S_M$, an N × 1 matrix, where N is the length of λ) is computed using Equation (2). Because the matrix to be inverted is not square, its pseudo-inverse (the + operator) should be used to compute the scalar.
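Steps I to IV can be sketched as follows. This is a simplified illustration under several assumptions, not the authors' implementation: the unit-peak Gaussian is an assumed form of Equation (1), the pseudo-inverse scaling is an assumed form of Equation (2), the two-pass sort of step III is replaced by a tolerance threshold followed by picking the largest σ, and `approx_cmfs` is a rough analytic stand-in with illustrative constants rather than the tabulated CIE 1931 standard observer. All function names are hypothetical.

```python
import numpy as np

WL = np.arange(390.0, 781.0, 1.0)  # 390 to 780 nm in 1 nm steps, as in the text


def approx_cmfs(wl):
    """Rough analytic stand-in for CMFs (assumption, NOT the CIE tables)."""
    x = (1.065 * np.exp(-0.5 * ((wl - 595.8) / 33.33) ** 2)
         + 0.366 * np.exp(-0.5 * ((wl - 446.8) / 19.44) ** 2))
    y = 1.014 * np.exp(-0.5 * ((np.log(wl) - np.log(556.3)) / 0.075) ** 2)
    z = 1.839 * np.exp(-0.5 * ((np.log(wl) - np.log(449.8)) / 0.051) ** 2)
    return np.vstack([x, y, z])  # shape (3, N)


def gaussian_spd(mu, sigma, wl=WL):
    """Assumed form of Equation (1): unit-peak Gaussian primary."""
    return np.exp(-0.5 * ((wl - mu) / sigma) ** 2)


def xy_of(spd, cmfs):
    """Chromaticity (x, y) of an SPD for the given CMFs."""
    xyz = cmfs @ spd
    return xyz[:2] / xyz.sum()


def broadest_match(target_xy, mus, sigmas, cmfs, tol=0.002):
    """Among grid points within `tol` of the target xy, return the broadest."""
    best = None
    for mu in mus:                      # step I: nested loop over mu, sigma
        for sigma in sigmas:
            dist = np.linalg.norm(xy_of(gaussian_spd(mu, sigma), cmfs) - target_xy)
            if dist <= tol and (best is None or sigma > best[1]):
                best = (mu, sigma, dist)  # steps II and III, simplified
    return best


def scale_to_target(spd, cmfs, target_xyz):
    """Step IV (assumed form): least-squares scalar via the pseudo-inverse."""
    a = (683.0 * cmfs @ spd).reshape(3, 1)  # 3 x 1, hence not square
    return (np.linalg.pinv(a) @ target_xyz).item()


if __name__ == "__main__":
    cmfs = approx_cmfs(WL)
    target = xy_of(gaussian_spd(530.0, 20.0), cmfs)
    print(broadest_match(target, np.arange(520.0, 541.0, 2.0),
                         np.arange(5.0, 41.0, 5.0), cmfs))
```

The tolerance `tol` is a hypothetical parameter; the paper instead applies a color-difference decision rule after scaling.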
Because of the discrete values of µ and σ and the requirement to select the broadest possible SPD for a given primary, a decision rule was applied. The rule allowed a scaled SPD to be chosen for a given primary if its color difference from the target is less than 0.1 ∆E00. However, in some cases, for example the red primary for Rec.709, no pair of µ and σ was able to meet the decision rule. In such cases, the scaled SPD ($S'_M$) was corrected by adding appropriate amounts of the SPDs of the other primaries, such as green and/or blue. This correction can be expressed as Equation (4), which yields the corrected SPDs, where $M_C$ is a 3 × 3 correction matrix. The correction matrix, $M_C$, can be computed using Equation (5), where $M_T$ indicates a 3 × 3 matrix of the tristimulus values of the target primaries. It should be noted that the elements of the correction matrix must not be negative. For that reason, the MATLAB function lsqlin, a linear least-squares solver with linear constraints, was used. Figure 3 shows the final SPDs of the 8 hypothetical displays, and the spectral specifications of the displays are given in Table 1. It is not surprising that the primary spectra of most of the displays get narrower as the chromaticity gamut expands. In particular, the spectra of the primaries of the Rec.2020 display approach the monochromatic primaries that Rec.2020 natively specifies, and their peak wavelengths coincide precisely with the spectral description of Rec.2020 given in the standard [12].
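The non-negativity constraint on the correction matrix can be illustrated with a small NumPy sketch. It does not reproduce lsqlin; it swaps in plain projected gradient descent, which is adequate for a 3 × 3 problem. The formulation (find a non-negative $M_C$ such that the tristimulus matrix of the scaled primaries times $M_C$ approximates $M_T$) is an assumed reading of Equations (4) and (5), and the matrix values below are made up for illustration.

```python
import numpy as np


def nonneg_lstsq(a, b, iters=5000):
    """Minimise ||a @ x - b|| subject to x >= 0 by projected gradient descent."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    step = 1.0 / np.linalg.norm(a, 2) ** 2    # 1 / largest eigenvalue of a.T @ a
    x = np.zeros((a.shape[1], b.shape[1]))
    for _ in range(iters):
        grad = a.T @ (a @ x - b)              # gradient of 0.5 * ||a @ x - b||^2
        x = np.maximum(x - step * grad, 0.0)  # gradient step, then clip to x >= 0
    return x


if __name__ == "__main__":
    # Hypothetical tristimulus values of the scaled primaries (columns R, G, B).
    m_s = np.array([[0.6, 0.3, 0.2],
                    [0.3, 0.6, 0.1],
                    [0.0, 0.1, 0.9]])
    # Hypothetical targets: mostly the same primaries plus a little cross-mixing.
    m_t = m_s @ np.array([[1.00, 0.05, 0.00],
                          [0.10, 1.00, 0.00],
                          [0.00, 0.02, 1.00]])
    m_c = nonneg_lstsq(m_s, m_t)
    print(np.round(m_c, 4))
```

Under this reading, column j of the result says how much of each scaled primary is mixed into corrected primary j, so the corrected SPDs would be the N × 3 matrix of scaled SPDs multiplied by $M_C$.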

Color Stimuli
A large set of uniformly distributed color stimuli for simulation was created based on the ICtCp color space [13]. The color stimuli set was designed to utilize the entire 10,000 cd/m² and Rec.2020 container with 10-bit precision. First, the RGB colors on the surface of the entire three-dimensional color volume were selected. Then, lightness (I), chroma (C), and hue (H) values corresponding to the selected RGB colors were computed through the ICtCp conversion. This conversion resulted in 97 hue slices. Notably, because of the limited precision and non-uniformity of the RGB color space, the hue slices are not completely uniformly spaced. Subsequently, the lightness range between the minimum (0) and the maximum (1) was divided into 101 levels, and the chroma range, which varies with hue, was divided into 21 levels. Because the entire container delivers a huge luminance range of up to 10,000 cd/m², the lightness range was divided much more finely than the chroma range. For each hue, ICH values within the given triangular extent were sampled based on the division rule. As an exception, stimuli on the boundary were included in the color stimuli set even if they did not comply with the division rule. Figure 4 shows the selected color stimuli set, which consists of 102,044 colors, in two different color spaces, ICtCp and CIE xyY.
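For readers unfamiliar with the conversion, the following sketch maps linear Rec.2020 RGB (in cd/m²) to I, chroma, and hue using the PQ constants and matrices of Recommendation ITU-R BT.2100. The chroma and hue definitions (polar coordinates of Ct and Cp) are an assumption, since the text does not spell them out, and the function names are hypothetical.

```python
import numpy as np

# PQ (SMPTE ST 2084) constants as given in ITU-R BT.2100
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32

RGB_TO_LMS = np.array([[1688, 2146, 262],
                       [683, 2951, 462],
                       [99, 309, 3688]]) / 4096
LMS_TO_ICTCP = np.array([[2048, 2048, 0],
                         [6610, -13613, 7003],
                         [17933, -17390, -543]]) / 4096


def pq_encode(nits):
    """PQ inverse EOTF: absolute luminance (0 to 10,000 cd/m^2) to a 0..1 signal."""
    y = np.clip(np.asarray(nits, dtype=float) / 10000.0, 0.0, 1.0)
    return ((C1 + C2 * y ** M1) / (1 + C3 * y ** M1)) ** M2


def rgb2020_to_ich(rgb_nits):
    """Linear Rec.2020 RGB in cd/m^2 to (I, chroma, hue in degrees)."""
    lms = RGB_TO_LMS @ np.asarray(rgb_nits, dtype=float)
    i, ct, cp = LMS_TO_ICTCP @ pq_encode(lms)
    chroma = np.hypot(ct, cp)                        # assumed chroma definition
    hue = np.degrees(np.arctan2(cp, ct)) % 360.0     # assumed hue definition
    return i, chroma, hue


if __name__ == "__main__":
    print(rgb2020_to_ich([10000.0, 10000.0, 10000.0]))  # container white
    print(rgb2020_to_ich([1000.0, 0.0, 0.0]))           # a saturated red
```

Sampling 101 levels of I and 21 levels of chroma per hue slice, as described above, would then operate on these coordinates.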

Simulation Procedure
The simulation procedure consists of two phases. The first phase computes metameric pairs for a display pair from the color stimuli set for given CMFs. The second phase calculates color differences between the metameric pairs for the CIE 1931 standard observer. An alternative approach, computing each individual's color difference for a pair of displays that are metameric to the standard observer, was considered, but it would result in non-standard color differences that may not be legitimately combined in the analysis. In the method employed, all of the color differences are in the same, standard units. In fact, both computations were made, and the results are nearly identical.
A subsampling technique was devised to minimize computational cost, because this simulation aimed at computing OM for 28 display pairs using 1,000 CMFs. The subsampling selected about 5,000 color stimuli at random from the large color stimuli set generated in the previous section, uniformly distributed within the intersection of the gamuts of a given display pair. The number of stimuli resulting from the subsampling differed between display pairs because of the difference in gamut size. This subsampling technique also enabled statistically identical color stimuli to be used for simulation, meaning that the simulation results were not skewed or biased by the random subsample selection. This was confirmed by statistical analysis using ANOVA, which revealed that the three magnitudes of OM between a display pair for three different sets of random color stimuli are statistically the same (p = 0.421, α = 0.05). The metric used to compute the OMM is introduced at the end of this section.
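The bookkeeping behind the 28 display pairs and the random subsample can be sketched as below. This is an illustration only: the display labels for the 75% and 80% gamuts are assumed by analogy with the labels used elsewhere in the text, the in-gamut mask is taken as given, and a plain uniform random draw stands in for whatever selection scheme the authors used.

```python
from itertools import combinations

import numpy as np

# Eight hypothetical displays; some labels are assumed for illustration.
DISPLAYS = ["Rec.709", "DCI.P3", "Rec.2020 75%", "Rec.2020 80%",
            "Rec.2020 85%", "Rec.2020 90%", "Rec.2020 95%", "Rec.2020 100%"]


def display_pairs(names=DISPLAYS):
    """All unordered pairs of distinct displays (8 choose 2 = 28)."""
    return list(combinations(names, 2))


def subsample(in_both_gamuts, n_target=5000, seed=0):
    """Pick up to `n_target` stimulus indices at random from the in-gamut ones.

    `in_both_gamuts` is a boolean mask over the full stimulus set marking the
    colors that lie inside the gamuts of both displays of a pair.
    """
    candidates = np.flatnonzero(in_both_gamuts)
    rng = np.random.default_rng(seed)
    n = min(n_target, candidates.size)
    return np.sort(rng.choice(candidates, size=n, replace=False))


if __name__ == "__main__":
    print(len(display_pairs()))  # number of display pairs
```

Changing `seed` yields the different random stimulus sets that an ANOVA check such as the one described above would compare.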
First, a given color stimulus should be reproduced on two spectrally different displays as a metameric pair for a given individual observer. To this end, the color stimulus is first reproduced on one display, the reference display, using the CIE 1931 standard observer. Then, the SPDs of the other display, the test display, are modulated accordingly to produce a color match for each of the different observers. The specific procedure is as follows.
I Let the two displays be ref and test, with native SPDs $S_{ref}$ and $S_{test}$.
II For the XYZ value of a given color stimulus, p, compute the required RGB intensity for the display ref using the CIE 1931 standard observer ($C_{std}$) with Equation (6).
III Using the SPDs of the display ref tuned by the required RGB intensity ($S_{ref,p}$), compute the tristimulus value of the color stimulus, p, for an individual observer ($C_{ind}$):
$$\begin{bmatrix} X_{ref,ind,p} \\ Y_{ref,ind,p} \\ Z_{ref,ind,p} \end{bmatrix} = 683 \cdot C_{ind} \cdot S_{ref,p}$$
IV As in Equation (6)

As a result, the two tuned SPDs, $S_{ref,p}$ and $S_{test,p}$, are a metameric pair for the individual observer. However, they are likely to produce a color mismatch for other observers, including the CIE 1931 standard observer. The above steps (I to V) were repeated for the 1,000 generated individual observers and a set of color stimuli. As noted earlier, although the number of stimuli in a color stimuli set varies slightly with the gamut extents of a display pair, it consists of about 5,000 colors. Thus, the simulation resulted in about 5,000,000 (1,000 × 5,000) SPD pairs for each display pair.
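The matching procedure can be sketched numerically as follows. Everything here is a toy under stated assumptions: `toy_cmfs` is an analytic stand-in for CMFs, the individual observer is mimicked by a hypothetical 3 nm wavelength shift rather than the Asano model, the primaries are unit-peak Gaussians, and the final matching step on the test display is inferred from the surrounding description because the corresponding equations are not available.

```python
import numpy as np

WL = np.arange(390.0, 781.0, 1.0)


def toy_cmfs(wl, shift=0.0):
    """Analytic stand-in for CMFs; `shift` (nm) mimics an individual observer."""
    w = wl - shift
    x = (1.065 * np.exp(-0.5 * ((w - 595.8) / 33.33) ** 2)
         + 0.366 * np.exp(-0.5 * ((w - 446.8) / 19.44) ** 2))
    y = 1.014 * np.exp(-0.5 * ((np.log(w) - np.log(556.3)) / 0.075) ** 2)
    z = 1.839 * np.exp(-0.5 * ((np.log(w) - np.log(449.8)) / 0.051) ** 2)
    return np.vstack([x, y, z])  # shape (3, N)


def primaries(peaks, sigma):
    """N x 3 matrix of unit-peak Gaussian primaries (columns R, G, B)."""
    return np.stack([np.exp(-0.5 * ((WL - mu) / sigma) ** 2) for mu in peaks], axis=1)


def rgb_for_xyz(xyz, cmfs, prim):
    """Drive values that make a display reproduce `xyz` for the observer `cmfs`."""
    return np.linalg.solve(683.0 * cmfs @ prim, xyz)


def metameric_pair(xyz_target, prim_ref, prim_test, c_std, c_ind):
    rgb_ref = rgb_for_xyz(xyz_target, c_std, prim_ref)   # step II
    spd_ref = prim_ref @ rgb_ref                         # tuned SPD of ref
    xyz_ind = 683.0 * c_ind @ spd_ref                    # step III
    rgb_test = rgb_for_xyz(xyz_ind, c_ind, prim_test)    # match for the individual
    spd_test = prim_test @ rgb_test                      # tuned SPD of test
    return spd_ref, spd_test


if __name__ == "__main__":
    c_std, c_ind = toy_cmfs(WL), toy_cmfs(WL, shift=3.0)
    ref = primaries((610.0, 540.0, 460.0), 30.0)    # broad-band display
    test = primaries((630.0, 532.0, 467.0), 5.0)    # narrow-band display
    xyz = 683.0 * c_std @ ref @ np.array([0.3, 0.5, 0.4])
    s_ref, s_test = metameric_pair(xyz, ref, test, c_std, c_ind)
    print("individual:", 683.0 * c_ind @ s_ref, 683.0 * c_ind @ s_test)
    print("standard:  ", 683.0 * c_std @ s_ref, 683.0 * c_std @ s_test)
```

The two SPDs agree in XYZ for the individual observer but not for the standard observer, and that residual is what the color difference in the second phase measures.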
In order to assess OM between two displays, a color difference formula based on the CIEDE2000 color difference formula was devised. The formula was modified by eliminating the term that computes differences in lightness. In practice, a plausible case is that a pair of displays used for color reproduction or color grading work is placed side by side with a significant separation. The elimination is therefore reasonable, because humans tend to become less sensitive to differences in lightness when there is a large separation between the stimuli [3]. The mean color difference (according to the standard observer) between metameric pairs for an individual observer was computed using the modified color difference formula as expressed in Equation (10), where $\Delta E'_{C\dagger,i}$ refers to the mean color difference of the i-th observer across all colors in the color stimuli set, and P denotes the number of colors in the color stimuli set. $\Delta E'_{C\dagger}$ indicates the modified color difference function, while $L^*a^*b^*_{ref,std,p}$ and $L^*a^*b^*_{test,std,p}$ are the CIELAB values derived from the XYZ values computed using Equation (7). Observer metamerism magnitude (OMM) was defined by Equation (11), where the function prctile90th returns the mean color difference of the 90th percentile observer. The 90th percentile was chosen to prevent OMM from being biased by a single peculiar observer who reports the maximum mean color difference. One important point here is the reference white, which is essential as an adapting point when converting CIE XYZ values to CIE L*a*b* values. Typically, the white point with the highest luminance of the display has been used for SDR displays as the reference white for such conversion [6]. However, the peak luminance of HDR displays is usually reserved for presenting specular highlights in scenes [14]. What the reference white level for HDR content should be is, in fact, controversial [16,17], although ITU-R BR.2480 [15] suggests 200 cd/m² with the D65 chromaticity coordinate. This issue is also quite important in computing OMMs, because the magnitudes can be exaggerated or understated for some colors depending on the reference white level. For this reason, the ITU recommendation is adopted first as the reference white level in this work, but the effect of changes in reference white level is also briefly addressed, showing how OMMs vary with different reference white levels.
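The aggregation described by Equations (10) and (11) can be sketched as a mean over stimuli followed by a percentile over observers. This is a minimal sketch with a hypothetical function name; it takes the per-stimulus color differences as given and uses NumPy's default percentile interpolation, which may differ slightly from MATLAB's prctile.

```python
import numpy as np


def omm(delta_e, percentile=90.0):
    """Observer metamerism magnitude from an (observers x stimuli) array.

    `delta_e[i, p]` is the modified color difference, seen by the standard
    observer, between the pair that matches for observer i at stimulus p.
    """
    mean_per_observer = np.asarray(delta_e, dtype=float).mean(axis=1)  # cf. Eq. (10)
    return float(np.percentile(mean_per_observer, percentile))         # cf. Eq. (11)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    fake = rng.gamma(shape=2.0, scale=1.0, size=(1000, 5000))  # synthetic data only
    print(round(omm(fake), 3))
```

Using the 90th percentile rather than the maximum keeps a single extreme observer from dominating the index, which is the rationale given in the text.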

Results & Discussion
The simulations were performed to understand how OM varies with expansions in chromaticity gamut and increases in peak luminance, using the hypothetical displays covering 8 different chromaticity gamuts with peak luminance levels of 500, 1,000, 2,000, and 4,000 cd/m². The simulations using the pairwise comparison method resulted in OMM indices for 28 pairs of displays at each peak luminance level. The simulation results show the effects of chromaticity gamut, peak luminance, and reference white level, which are analyzed in the following sections. Figure 5 shows a pairwise comparison matrix, which represents how the OMM index varies with changes in the chromaticity gamut for a peak luminance level of 1,000 cd/m². Each element in the matrix represents the OMM index between a pair of displays whose names are given in the row header and column header, respectively. Note that the main diagonal elements are all zero, because a spectral match (a pair of identical displays) results in zero OM. For this reason, the diagonal elements were excluded from this analysis; for example, they were not considered when identifying the pair of displays with the smallest OMM index. Because the matrix is symmetric, the upper-right elements above the main diagonal are highlighted with chromatic colors to make the differences in the OMM index noticeable. As mentioned, the proposed index is based on the CIELAB color space, and the reference white level for the color space was 200 cd/m². The impact of changes in the reference white level on the magnitude of OM is addressed later.

Effect of Color Gamut
The simulation results show that OMM rises steeply with increasing differences in spectral width between the paired displays. For example, in the first row of the matrix in Figure 5, the largest OMM index appears between the displays Rec.709 and Rec.2020 100%, while the smallest appears between the displays Rec.709 and DCI.P3. As noted in Table 1, the display Rec.2020 100% has the narrowest bandwidth, whereas the displays Rec.709 and DCI.P3 have the broadest spectra. It is noteworthy that the OMM indices induced by the display Rec.2020 100% are nearly twice as large as those induced by the display Rec.2020 90%, regardless of the paired display, except when each is paired with the display Rec.2020 95%. This exception is reasonable because the spectral width of the display Rec.2020 95% is closer to that of the display Rec.2020 100% than Rec.2020 90%. More notably, however, the differences in the induced OMM index between the displays Rec.2020 100% and Rec.2020 90% tend to increase as the primaries of the paired displays get narrower. For example, the OMM index between the displays Rec.2020 85% and Rec.2020 90% is 1.06, while that between the displays Rec.2020 85% and Rec.2020 100% is 4.42. Presumably, this is because the display Rec.2020 100% has monochromatic primaries that amplify inter-observer variability. These results imply that the monochromatic primaries of the display Rec.2020 100% would cause a large magnitude of OM even when it is paired with a conspicuously narrow-band display, for example, the display Rec.2020 90%.

Effect of Peak Luminance Level
The simulation was run for several different peak luminance levels, and the resulting pairwise matrices (not shown) are very similar. In fact, an important finding from the simulation is that the OMM index between a pair of displays increases by only a small fraction of the ratio of increase in peak luminance. For example, the OMM index between the displays Rec.709 and Rec.2020 100% with a peak luminance of 500 cd/m² is 4.97, while the OMM index between the same display pair with a peak luminance of 1,000 cd/m² is 5.39. The relationship between the OMM index at one peak luminance level (500 cd/m²) and that at the other peak luminance levels is shown graphically in Figure 6. Each set of colored circles is specified by the OMM indices for the display pairs with a peak luminance of 500 cd/m² and those for the same display pairs with another peak luminance level. For example, the red circles are defined by the OMM indices for the display pairs at 500 and 1,000 cd/m². The colored lines are derived using a linear regression for each set of colored circles. Interestingly, the linear regressions show that the OMM indices at one peak luminance level correlate highly with those at another peak luminance level: R² = 0.9997 for the pair 500 and 1,000 cd/m², R² = 0.9996 for the pair 500 and 2,000 cd/m², and R² = 0.9990 for the pair 500 and 4,000 cd/m². These relationships imply that if the peak luminance level of a given display pair is doubled, the OMM index between the display pair increases by only about 7 to 8%. Moreover, these linear relationships suggest that the OMMs between display pairs at different peak luminance levels can be predicted from those between the same display pairs at one known peak luminance level, without additional complicated computations.
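The arithmetic behind these statements is easy to check. The sketch below computes the relative increase for the one display pair quoted above and provides a small helper for the kind of straight-line fit and R² summarized in Figure 6. The helper names are hypothetical, and the fit is only demonstrated on made-up, exactly linear numbers because the full set of OMM indices is not listed in the text.

```python
import numpy as np


def percent_increase(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


def fit_line_r2(x, y):
    """Least-squares line y = slope * x + intercept and its R^2."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residual = y - (slope * x + intercept)
    r2 = 1.0 - (residual @ residual) / np.sum((y - y.mean()) ** 2)
    return slope, intercept, r2


if __name__ == "__main__":
    # Rec.709 vs Rec.2020 100%: 4.97 at 500 cd/m^2, 5.39 at 1,000 cd/m^2
    print(f"{percent_increase(4.97, 5.39):.2f}%")
    # Made-up, exactly linear data to exercise the fit helper
    print(fit_line_r2([1.0, 2.0, 3.0, 4.0], [1.1, 2.2, 3.3, 4.4]))
```

For this single pair the increase is roughly 8.5%, slightly above the 7 to 8% quoted, which presumably reflects the regression over all display pairs rather than any one of them.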

Effect of Reference White Level
Previous studies [16,17] pointed out that the reference white level in HDR content could differ from scene to scene. In this analysis, the mean color differences of individual observers at one reference white level (200 cd/m²) were compared to those at other reference white levels: 100, 500, and 1,000 cd/m². As illustrated in Figure 7, the mean color differences of the individual observers for a display pair computed at two different reference white levels are highly correlated, showing R² values of approximately 1. It is noteworthy that these linear relationships between two different reference white levels also appear for different display pairs, for example, the displays Rec.709 and Rec.2020 100%. This indicates that the effect of the reference white level can be regarded as a scalar. Therefore, if the OMM between a display pair is computed with one reference white level, then those between other display pairs for other reference white levels can be predicted using these linear relationships.
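Why the reference white behaves like a multiplicative factor can be seen in a simplified setting. The sketch below uses the plain CIELAB a*b* distance with a D65 white at a chosen luminance, not the modified CIEDE2000 formula of this work, so its scale factors are not expected to match those reported here. The XYZ values are hypothetical. For stimuli in the cube-root branch of CIELAB, changing the white luminance rescales every a*b* difference by the same factor.

```python
import numpy as np

D65_XY = (0.3127, 0.3290)


def white_xyz(y_white, xy=D65_XY):
    """XYZ of a white with chromaticity `xy` and luminance `y_white` (cd/m^2)."""
    x, y = xy
    return np.array([x / y, 1.0, (1.0 - x - y) / y]) * y_white


def xyz_to_lab(xyz, y_white):
    """CIELAB relative to a D65 reference white at luminance `y_white`."""
    t = np.asarray(xyz, dtype=float) / white_xyz(y_white)
    d = 6.0 / 29.0
    f = np.where(t > d ** 3, np.cbrt(t), t / (3 * d * d) + 4.0 / 29.0)
    return np.array([116 * f[1] - 16, 500 * (f[0] - f[1]), 200 * (f[1] - f[2])])


def ab_distance(xyz_a, xyz_b, y_white):
    """Chromatic (a*, b*) distance only; lightness is ignored."""
    la, lb = xyz_to_lab(xyz_a, y_white), xyz_to_lab(xyz_b, y_white)
    return float(np.hypot(la[1] - lb[1], la[2] - lb[2]))


if __name__ == "__main__":
    a, b = [95.0, 100.0, 108.0], [97.0, 100.0, 105.0]  # hypothetical XYZ pair
    base = ab_distance(a, b, 200.0)
    for y_w in (100.0, 200.0, 500.0, 1000.0):
        print(y_w, round(ab_distance(a, b, y_w) / base, 3))
```

In this simplified case the ratio depends only on the two white levels and not on the stimulus pair, which is consistent with the near-perfect linear relationships described above.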

Conclusion
The effects of changes in chromaticity gamut and peak luminance levels on potential OM in HDR displays were examined using simulation in this paper. For the simulation, 1,000 individual CMFs were created based on the latest age distribution reported by the United Nations. About 100,000 uniformly distributed color stimuli were generated, taking into account the possible colors in HDR content. Hypothetical displays with eight different chromaticity gamuts ranging between Rec.709 and Rec.2020 were generated, and these eight displays were simulated with peak luminance levels of 500, 1,000, 2,000, and 4,000 cd/m².
The simulation results revealed that the OMM of a display is determined relative to the display it is paired with, depending on the similarity in spectral bandwidth between the two displays. Notably, the display Rec.2020 100% tends to cause large OMMs even when paired with narrow-band primary displays. This result implies that displays with less narrow-band primaries than the display Rec.2020 100% might be a better option in applications where wide color gamut displays are required.
Surprisingly, it was found that the OMM between a pair of displays does not increase nearly as much as the peak luminance levels of the displays increase. The simulation results showed that the OMM increases by 7 to 8% when the peak luminance level of a display pair doubles. The simulation results also indicated that the effect of changes in reference white level on OM is a scalar, similar to the peak luminance result. The OMM increases by about 15% when the reference white level is halved. Conversely, it decreases to about 82% when the reference white level rises two-and-a-half times. Notably, however, it stays at about 70% even if the reference white level increases five times. These two results imply that the effects of changes in peak luminance and reference white on OM could be predicted without additional complicated computations.