Stereoscopic 3D Display with Color Interlacing Improves Perceived Depth

Temporal interlacing is a method for presenting stereoscopic 3D content whereby the two eyes' views are presented at different times and optical filtering selectively delivers the appropriate view to each eye. This approach is prone to distortions in perceived depth because the visual system can interpret the temporal delay between binocular views as spatial disparity. We propose a novel color-interlacing display protocol that reverses the order of binocular presentation for the green primary but maintains the order for the red and blue primaries: During the first sub-frame, the left eye sees the green component of the left-eye view and the right eye sees the red and blue components of the right-eye view, and vice versa during the second sub-frame. The proposed method distributes the luminance of each eye's view more evenly over time. Because disparity estimation is based primarily on luminance information, a more even distribution of luminance over time should reduce depth distortion. We conducted a psychophysical experiment to test these expectations and indeed found that less depth distortion occurs with color interlacing than temporal interlacing.


Introduction
Stereoscopic 3D (S3D) displays create a compelling sensation of depth by presenting slightly different images to the two eyes. The visual brain computes depth from those slight differences, which are known as binocular disparity. Various techniques are used to deliver distinct images to the eyes with one display screen.
In spatial interlacing, alternate rows on the screen have opposite polarizations. The viewer wears passive glasses that contain filters with opposite polarization states such that one eye sees the images presented on odd pixel rows while the other eye sees the images presented on even rows [1,2]. By sending information simultaneously to the two eyes, spatial interlacing is not as prone to temporal artifacts as other techniques [3]. But spatial interlacing reduces the number of pixels delivered to each eye by half and therefore reduces the effective resolution of the display when the viewer is at the recommended viewing distance (for example, three times picture height for high-definition television).
In temporal interlacing, all pixels are delivered to both eyes in temporal alternation. In one instantiation, a polarization switch at the display presents opposite polarization states over time in synchrony with the presentation of the left- and right-eye images. The viewer wears glasses containing filters with fixed but opposite polarization before the two eyes, so the images are delivered to the two eyes at different times [4,5]. Another instantiation of temporal interlacing uses a time-varying color filter at the display to deliver the left- and right-eye images over time. The viewer wears passive glasses that transmit the appropriate wavelengths for the left and right eyes [5,6]. Yet another instantiation uses active glasses that alternate between transmitting and blocking the images delivered to an eye [1,5]. These temporal-interlacing techniques present all pixels to each eye, thereby maximizing effective resolution, but they are prone to temporal artifacts such as flicker and distortions of perceived depth [3]. They also restrict the frame rate of each view to half of the display's native frame rate. Nonetheless, temporal interlacing is currently the predominant technique in cinema, television, and desktop displays [7], so we will focus on the perceptual outcomes associated with that technique.
Figure 1 illustrates how temporal interlacing can create distortions of perceived depth. The left panel is a space-time plot of a stimulus moving horizontally at constant speed. It has a nominal disparity of zero, so it should be seen as moving in the plane of the display screen. But it often appears to be in front of or behind the screen, depending on the direction of motion and which eye is stimulated first. This Mach-Dvorak effect was first reported in the 19th century [8,9].

Fig. 1. Disparity computation with temporal interlacing. Left: Space-time plot of a stimulus moving horizontally at constant speed on a temporal-interlacing display. The stimulus has a spatial disparity of zero and moves at a speed of Δx/Δt (displacements of Δx in presentations separated in time by Δt). Left- and right-eye presentations are represented by filled and unfilled symbols, respectively. In each frame, right-eye images are delayed by Δi relative to left-eye images. (In most protocols, Δi = Δt/2.) Right: Disparity estimation with weighted averaging over time. The abscissa represents the arrival time of each candidate match from the right eye relative to the reference image from the left eye. The left ordinate represents the disparity of each potential match. The black squares represent the disparities and the time differences for four candidate matches. The right ordinate represents the weight given to each match. The weights vary from 0 to 1. The estimated disparity is a weighted average of the disparities of the potential matches (Eq. (3)). In the example, the stimulus is moving rightward and the left image leads the right, so the erroneous disparity is crossed. Therefore the object should be seen closer to the viewer than intended.
To understand the cause of this depth distortion, we must consider how the visual system computes disparity. To make the computation, the visual brain has to solve the binocular-matching problem: Which image feature in one eye should be matched with a given feature in the other eye? With temporal interlacing, no features are presented simultaneously, so the neural mechanisms that perform the matching have an interesting problem: Should a given image in the left eye be matched with a later or earlier image in the right eye? There is presumably no way to know, so it makes sense that the brain would use a time-weighted average as depicted in the right panel of Fig. 1 [10]. The weighting function would give the highest weight to images that are delivered simultaneously and successively lower weights to images that arrive at increasingly different times. This is a weighted running average with the weights determined by the inter-ocular time difference.
To formalize this, we start with a reference image in the left eye, the one in the middle of the left panel of Fig. 1. Then the time differences (the leads and lags) between that image and ones delivered to the right eye are:

$$t_j = j\,\Delta t + \Delta i \qquad (1)$$

for j = −∞ to ∞, where Δt is the frame time and Δi is the delay of each frame's right-eye image relative to the reference image. Following Read and Cumming [10], we assume a Gaussian weighting function:

$$w(t_j) = \exp\left(-\frac{t_j^2}{2\tau^2}\right) \qquad (2)$$

where τ is the time constant for binocular matching; j = 0 for the right-eye image that is captured simultaneously with the reference image (e.g., the right-eye image at position 0 in the left panel of the figure). The estimated disparity is then:

$$\hat{d} = \frac{\sum_j w(t_j)\, j\,\Delta x}{\sum_j w(t_j)} \qquad (3)$$

The behavior of this weighted-averaging model is consistent with observed depth distortions. For example, Fig. 2 shows data from Hoffman et al. [3] in which a particular temporal-interlacing protocol was used to present a stimulus with a spatial disparity of zero that was moving at different speeds. There is close agreement between the predicted and observed depth distortions.

Fig. 2. Predicted and observed depth distortions. Hoffman et al. [3] presented stimuli using a temporal-interlacing protocol with Δt = 1/75 sec (frame rate of 75 Hz) and Δi = 1/150 sec. They added spatial disparity in order to eliminate the distortion in perceived depth. The nulling disparity is a direct measure of the magnitude of the depth distortion. That disparity is plotted as a function of the horizontal speed of the stimulus. The deviations of the data points from the horizontal line at 0 arcmin are manifestations of distortions in perceived depth. The dashed line represents the predictions of Eq. (3) with τ = 25 msec. (Data were also collected at speeds of −10 and 10 deg/sec, but those measurements were corrupted by an artifact in the measuring technique, so those points are not plotted here.)
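The weighted-averaging model can be sketched numerically. The following is a minimal illustration, not the authors' code. It assumes arrival times t_j = jΔt + Δi, Gaussian weights exp(−t_j²/2τ²) with τ treated as the standard deviation (the exact normalization is an assumption), and candidate disparities jΔx, with disparity expressed as right-eye position minus left-eye position so that negative values are crossed. The function name and the truncation of the infinite sum to ±50 frames are illustrative choices.

```python
import math

def predicted_disparity(speed_deg_per_s, dt=1/75, di=1/150, tau=0.025, n_frames=50):
    """Weighted-average disparity (arcmin) for a stimulus with zero spatial disparity.

    Candidate match j is the right-eye image of frame j. It arrives
    t_j = j*dt + di after the left-eye reference image and is displaced
    by j*dx, where dx = speed*dt is the displacement per frame.
    """
    dx_arcmin = speed_deg_per_s * 60.0 * dt
    num = 0.0
    den = 0.0
    for j in range(-n_frames, n_frames + 1):
        t_j = j * dt + di
        w = math.exp(-t_j ** 2 / (2.0 * tau ** 2))  # assumed Gaussian form
        num += w * j * dx_arcmin
        den += w
    return num / den

# Parameters of the Hoffman et al. protocol quoted in the Fig. 2 caption.
for speed in (-3.0, -1.0, 1.0, 3.0):
    print(speed, predicted_disparity(speed))
```

Under these assumptions, when Δi = Δt/2 the candidate matches are arranged symmetrically about j = −1/2, so the prediction reduces to −speed × Δi regardless of τ; the time constant only matters for other values of Δi.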
Distortions of perceived depth due to temporal interlacing are a serious problem because they lead to situations in which one depth cue, binocular disparity, contradicts another one, such as occlusion. Figure 3 (see Media 1) illustrates such a situation. The bike rider moves from left to right and should be seen as farther than the standing person. The depth distortion, however, causes the biker to appear closer than the person so it is somewhat startling (and annoying) to see him occluded by the person as he passes him. With the goal of minimizing such perceptual distortions, we [11] and Simon and Jorke [12] developed a temporal-interlacing protocol that uses a different procedure for presenting the three color primaries to the two eyes. Unlike conventional temporal interlacing, which presents all three primaries to an eye simultaneously, this protocol presents green to the left eye at the same time as red and blue to the right eye and then presents red and blue to the left eye at the same time as green to the right eye. The color-interlacing protocol is schematized in Fig. 4. Each frame is divided into two sub-frames in both conventional temporal interlacing and color interlacing. In conventional interlacing, all three primaries (R, G, and B) are presented simultaneously first to the left eye and then to the right. In color interlacing, the first sub-frame consists of G presented to the left eye and R and B to the right eye; the second sub-frame consists of R and B to the left eye and G to the right.
We know that luminance and color signals are segregated into different pathways early in visual processing: a luminance channel (black-white) and two color-opponent chromatic channels (red-green and blue-yellow) [13,14]. The effectiveness of the color-interlacing protocol in minimizing depth distortions will depend on whether disparity is calculated with signals in the luminance channels only or whether the estimated disparity also depends on signals in chromatic channels. Stereopsis is weaker, sometimes absent, when the stimulus is equiluminant (e.g., red dots on a green background) [9,15,16]. This suggests that disparity is indeed calculated primarily, perhaps exclusively, with signals in the luminance channels and not the chromatic channels. In color interlacing then, one should in theory be able to minimize or even eliminate depth distortions by presenting stimuli to the two eyes that are roughly equal in luminance. Specifically, by presenting roughly equiluminant G and R + B signals, G to one eye and R + B to the other, the stimulus presentation becomes effectively simultaneous to the two eyes, and this should minimize or even eliminate the Mach-Dvorak effect. If disparity were calculated from signals in chromatic channels, the color-interlacing technique should lead to the experience of depth distortions in opposite directions for different colors: i.e., the G component of a leftward moving object would appear behind the screen while the R + B component would appear in front of the screen.
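The two presentation schedules can be written down as a small data structure, which makes it easy to see how much luminance each eye receives in each sub-frame. This is a minimal sketch: the names CONVENTIONAL, COLOR_INTERLACED, and eye_luminance are hypothetical, and the luminance values in the example are illustrative, not measured.

```python
# Primaries delivered to each eye in sub-frame 1 and sub-frame 2.
CONVENTIONAL = [{"left": "RGB", "right": ""}, {"left": "", "right": "RGB"}]
COLOR_INTERLACED = [{"left": "G", "right": "RB"}, {"left": "RB", "right": "G"}]

def eye_luminance(schedule, lum):
    """Return the luminance each eye receives in each sub-frame.

    lum maps a primary name ("R", "G", "B") to its luminance in the image.
    """
    return [
        {eye: sum(lum[p] for p in primaries) for eye, primaries in sub.items()}
        for sub in schedule
    ]

# Illustrative stimulus in which G carries as much luminance as R + B.
lum = {"R": 0.25, "G": 0.5, "B": 0.25}
print(eye_luminance(CONVENTIONAL, lum))
print(eye_luminance(COLOR_INTERLACED, lum))
```

With the conventional schedule each eye alternates between full luminance and none; with color interlacing and this stimulus, both eyes receive the same luminance in both sub-frames, which is the condition under which presentation is effectively simultaneous.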
The primary goal in the work presented here is to compare the depth distortions that occur with conventional temporal interlacing to those that occur with color interlacing. If the color used in the video content is either saturated green or magenta, color interlacing becomes equivalent to temporal interlacing, and should therefore produce equivalent depth distortion. But colors in the natural environment are rarely saturated, so we predict that color interlacing will in most cases minimize the depth distortions that plague temporal interlacing. We tested that prediction by measuring depth perception while varying the saturation of chromatic stimuli.

Subjects
Three subjects, 22 to 32 years old, participated. All had normal or corrected-to-normal visual acuity according to a Snellen letter-acuity test and normal stereo acuity according to the Titmus stereo test. They all had normal color vision according to the Ishihara color-deficiency test. Two were authors; the other was unaware of the experimental hypotheses.

Apparatus
Stimuli were presented on one display (ViewSonic G225f CRT, 1280×960 resolution, 120 Hz refresh rate), the left half to the left eye and the right half to the right eye. Front-surface mirrors in the optical path to each eye allowed us to set the vergence distance to 107 cm, the same as the path length to the display from each eye. At that distance, each pixel subtended 1 arcmin.
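As a quick check on the viewing geometry, the pixel pitch implied by a 1 arcmin pixel at 107 cm can be computed directly. This is a worked arithmetic sketch; the variable names are illustrative, and the half-screen extent uses a small-angle approximation.

```python
import math

viewing_distance_mm = 1070.0   # 107 cm optical path, from the text
pixel_angle_deg = 1.0 / 60.0   # 1 arcmin per pixel, from the text

# Physical pixel size that subtends 1 arcmin at the viewing distance.
pixel_pitch_mm = viewing_distance_mm * math.tan(math.radians(pixel_angle_deg))

# Each eye sees half of the 1280-pixel-wide screen (small-angle estimate).
half_screen_width_deg = (1280 / 2) * pixel_angle_deg

print(round(pixel_pitch_mm, 3), round(half_screen_width_deg, 1))
```

The implied pitch is roughly 0.31 mm, and each eye's half of the screen spans a little under 11° horizontally.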

Stimuli and procedure
We presented five colors: i.e., different ratios of green and magenta intensities. We measured the luminances of the green and magenta components with a photometer (Minolta CS-100): they are l_G and l_M, respectively. From heterochromatic flicker photometry measurements [17], we found the values of l_G and l_M that were equally bright: they are E_G and E_M. For our five colors, l_G/l_M were E_G/0 (green only), 0.75E_G/0.25E_M, 0.5E_G/0.5E_M (equally bright green and magenta), 0.25E_G/0.75E_M, and 0/E_M (magenta only), corresponding respectively to saturated green, desaturated green, gray, desaturated magenta, and magenta stimuli (Fig. 5).
If disparity is estimated from luminance signals only, we expect the magnitude of depth distortions to depend on the relative luminances of the stimuli being presented. When the green and magenta components are equiluminant, depth distortions should be eliminated. This would occur when the stimulus is gray (middle panel in Fig. 5). When the green and magenta components are far from equiluminant (i.e., when l_G/l_M is ∞ or 0), the color-interlacing technique becomes identical to conventional temporal interlacing, so the magnitude of depth distortions should be the same in the two techniques. This would occur when the stimulus is either saturated green or saturated magenta (far left and right in Fig. 5). If chromatic signals contribute to disparity estimation, equiluminant stimuli should yield depth distortions in different directions, one for green and the other for magenta. In other words, instead of eliminating depth distortions, color interlacing would actually create an additional artifact that conventional temporal interlacing does not produce.
To test the effect of chromatic signals on disparity estimation, we added a brightness-equivalent protocol. This protocol maintained brightness, but eliminated hue variation in color interlacing. To this end, we calculated the luminances of gray that are as bright as a green and a magenta: a_G and a_M, where a_G is the luminance of the achromatic stimulus that is equal in brightness to green at luminance L_G, a_M is the luminance of the achromatic stimulus that is equal in brightness to magenta at luminance L_M, and 0.73 and 0.27 are the proportions of luminance that green and magenta had in the achromatic stimuli. In the brightness-equivalent protocol, we presented the achromatic stimulus with a luminance of a_G to the left eye and the achromatic stimulus with a luminance of a_M to the right eye during the first sub-frame. Similarly, during the second sub-frame, we presented the achromatic stimulus with luminances a_M and a_G to the left and right eyes, respectively. There were five conditions in the brightness-equivalent protocol whose brightness ratios (same as luminance ratios because there was no hue change) of a_G/a_M were 1/0, 0.75/0.25, 0.5/0.5, 0.25/0.75, and 0/1, each an analog to one of the five conditions in the color-interlacing protocol.
Fig. 6. Schematic of the stimulus. Two rows of disks (1° in diameter) moved horizontally at the same speed but in opposite directions. Disks were horizontally separated by 3°. The upper and lower rows were vertically separated by 2.3°. Three stationary crosses with zero disparity were presented to provide a reference for the distance of the screen.
The stimulus on each trial consisted of two groups of disks moving horizontally at constant speed in opposite directions (Fig. 6). The two groups should produce depth distortions of the same magnitude but opposite directions because the speeds are the same but the directions are opposite. The directions of the upper- and lower-row motions were random across trials, but always opposite from one another. We presented the stimuli with one of the two presentation techniques: color interlacing and brightness-equivalent. For each combination of color and presentation technique, we presented six speeds: −12, −7, −3, 3, 7, and 12 deg/sec. The stimuli were presented for 1.5 sec. After each presentation, the subject made a forced-choice judgment of whether the upper row of disks appeared nearer or farther than the lower row. No feedback was provided. Depending on the judgment on the previous trial, disparity of opposite signs was added to the upper and lower rows for the next trial. If the subject had indicated on the previous trial that the upper row was farther than the lower, crossed disparity was added to the upper row (making it appear closer) and uncrossed disparity of the same magnitude was added to the lower (making it appear farther). If the subject had indicated that the upper row was nearer than the lower, the addition of crossed and uncrossed disparity was the reverse. The magnitude of added disparity was varied according to a 30-trial, one-down/one-up staircase procedure. The resulting psychometric data were fit with a cumulative Gaussian using a maximum-likelihood criterion to find the disparity magnitude that made the upper and lower rows appear equidistant [18][19][20].
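The final fitting step can be illustrated with a small maximum-likelihood fit of a cumulative Gaussian. This is a sketch under stated assumptions, not the procedure of [18][19][20]: it uses a brute-force grid search instead of a numerical optimizer, the function names cum_gauss and fit_pse are hypothetical, and the response counts are synthetic data generated from a known curve.

```python
import math

def cum_gauss(x, mu, sigma):
    """Cumulative Gaussian: probability of one of the two responses at disparity x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def fit_pse(trials, mus, sigmas):
    """Grid-search maximum-likelihood fit of a cumulative Gaussian.

    trials: list of (added_disparity, n_responses, n_total) tuples.
    Returns the (mu, sigma) pair with the highest binomial log-likelihood;
    mu estimates the disparity at which the two rows appear equidistant.
    """
    best = None
    for mu in mus:
        for sigma in sigmas:
            ll = 0.0
            for x, k, n in trials:
                # Clip probabilities so the logarithms stay finite.
                p = min(max(cum_gauss(x, mu, sigma), 1e-9), 1.0 - 1e-9)
                ll += k * math.log(p) + (n - k) * math.log(1.0 - p)
            if best is None or ll > best[0]:
                best = (ll, mu, sigma)
    return best[1], best[2]

# Synthetic counts at seven added disparities (arcmin), true mu = 0.5, sigma = 1.
levels = [-3, -2, -1, 0, 1, 2, 3]
data = [(x, round(100 * cum_gauss(x, 0.5, 1.0)), 100) for x in levels]
mus = [i * 0.05 for i in range(-40, 41)]
sigmas = [0.2 + i * 0.05 for i in range(60)]
mu_hat, sigma_hat = fit_pse(data, mus, sigmas)
print(mu_hat, sigma_hat)
```

A grid search is used here only to keep the example free of dependencies; with real staircase data, where most trials cluster near the point of subjective equality, a proper optimizer and confidence-interval estimate would be the usual choice.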
There were 60 combinations of experimental parameters (five colors × two presentation methods × six velocities) requiring the presentation of roughly 1800 trials. The entire experiment was run in one session that was about an hour in duration.

Results
We measured the magnitude of depth distortions for the conditions described in the Methods. Figure 7 shows the predicted results for the color-interlacing protocol if disparity estimation is done on luminance signals only. When the stimulus is equally bright across sub-frames, the model predicts that no distortions of perceived depth should occur because the presentation to the two eyes becomes effectively simultaneous. This prediction is represented by the horizontal line at a nulling disparity of 0. When the stimulus is saturated green or magenta, the color-interlacing protocol becomes identical to conventional temporal interlacing (because each eye is stimulated in one sub-frame and not the other; left eye in first sub-frame, right eye in second), so depth distortions proportional to object speed are predicted. When the stimulus is desaturated green or magenta, smaller depth distortions are predicted because the brightness variation across sub-frames is smaller than with saturated colors.

Fig. 7. Predicted depth distortions with color interlacing. The predicted disparity that eliminates distortion of perceived depth is plotted as a function of stimulus speed. The dashed lines represent the predictions for different colors as indicated in the legend.
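The qualitative pattern of these predictions can be reproduced with a simple approximation. The sketch below is an assumption-laden illustration, not the computation behind Fig. 7: it treats each eye's effective presentation time as the brightness-weighted mean of its two sub-frame times, so that the interocular delay scales as (2g − 1) times the sub-frame delay, where g is the share of brightness carried by green. The function name, the sign convention, and the sub-frame delay of 1/120 sec (one refresh of the 120 Hz display) are all assumptions.

```python
def approx_distortion_arcmin(speed_deg_per_s, green_share, subframe_delay_s):
    """Approximate depth distortion (arcmin) under color interlacing.

    green_share: fraction of stimulus brightness carried by green (0 to 1).
    Assumption: the effective interocular delay is the brightness-weighted
    difference between the two sub-frame times, (2*green_share - 1) * delay.
    """
    effective_delay = (2.0 * green_share - 1.0) * subframe_delay_s
    return speed_deg_per_s * 60.0 * effective_delay

SUBFRAME_DELAY_S = 1 / 120  # assumed: one refresh of the 120 Hz display
SPEEDS = (-12, -7, -3, 3, 7, 12)  # deg/sec, from the Methods

for share in (1.0, 0.75, 0.5, 0.25, 0.0):
    row = [round(approx_distortion_arcmin(v, share, SUBFRAME_DELAY_S), 2) for v in SPEEDS]
    print(share, row)
```

Under this approximation the gray stimulus (g = 0.5) yields no distortion at any speed, the saturated colors yield the full distortion of conventional temporal interlacing, and the desaturated colors fall halfway between.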
Figure 8 plots the results from the three subjects and the predictions from Fig. 7. The circles represent the data with the color-interlacing protocol in which color and brightness may both change between sub-frames. The diamonds represent the data with the brightness-equivalent protocol in which only the brightness changes. The data from those two conditions are essentially identical, which is consistent with the idea that disparity estimation is done on luminance signals only. Both sets of data are generally quite consistent with the predictions, which is also consistent with the hypothesis that disparity estimation is based on luminance.
There is a possible alternative explanation for these data. Perhaps disparity estimation is done in part on chromatic signals so subjects saw one sign of depth specified by green and the opposite sign of depth specified by magenta. When seeing two rather different depths, perhaps subjects responded according to the average of the two. This is consistent with the plotted data, but completely at odds with subjects' depth percepts: They always perceived one depth, not two, and responded accordingly.
The results show that color interlacing greatly reduces or even eliminates distortions of perceived depth with desaturated colors. Distortions are still observed with saturated colors, but they are never larger than the ones with conventional temporal interlacing. Saturated colors are uncommon in natural video content, so the results show that color interlacing is a viable procedure for minimizing or eliminating depth distortion.

Summary of findings
The experimental results showed that color interlacing usually reduces depth distortion compared to temporal interlacing. In color interlacing, the severity of depth distortion for a moving object depends on the hue and saturation of the object's color, with worse depth distortion occurring with highly saturated colors, particularly magenta and green. However, colors in the natural environment are rarely saturated. This suggests that depth distortion should be minimal for typical content. To demonstrate the effect of color interlacing, we generated a color-interlacing version of the bike-rider demo, shown in Fig. 9 (see Media 2). Unlike Fig. 3, the disparity and occlusion are now in agreement: The bike rider appears at the appropriate place in space as he passes behind the standing person.

Implementation of color interlacing
The proposed color interlacing can be easily implemented by modifying an existing S3D presentation technique [12]. The existing technique employed by Dolby [21] uses two interference filters that transmit different spectral bands for each color primary (Fig. 10, left) [6]. These filters are applied to the viewer's glasses so that the transmission bands for the two eyes are mutually exclusive. The same filters are also applied to the color wheel in the projector such that the first and second sub-frames are shown only to the left and the right eyes, respectively. Color interlacing can be derived from the existing technique by modifying the interference filters in either the viewer's glasses or the color wheel. For example, switching the left- and right-eye transmission bands for the red and blue primaries in the glasses could accomplish this (Fig. 10, right).

Flicker
Color interlacing should also reduce the visibility of flicker compared to conventional temporal interlacing. The chromatic channels are not as sensitive to high temporal frequencies as the luminance channel, so flicker visibility is generally determined by the luminance channel [22]. Because color interlacing distributes light more evenly over time, flicker visibility should be reduced with this technique compared to conventional temporal interlacing.
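The claim that light is distributed more evenly over time can be quantified with a simple measure: the Michelson contrast of one eye's luminance across the two sub-frames. This is an illustrative measure only, not a model of flicker visibility, and the function name is hypothetical. The 0.73/0.27 split is the proportion of luminance quoted in the Methods for green and magenta in the achromatic stimulus.

```python
def modulation_depth(lum_subframe1, lum_subframe2):
    """Michelson contrast of one eye's luminance across the two sub-frames."""
    total = lum_subframe1 + lum_subframe2
    if total == 0:
        return 0.0
    return abs(lum_subframe1 - lum_subframe2) / total

# Conventional temporal interlacing: an eye sees everything, then nothing.
conventional = modulation_depth(1.0, 0.0)

# Color interlacing of an achromatic stimulus: green (0.73 of the luminance)
# in one sub-frame, red plus blue (0.27) in the other.
color_interlaced = modulation_depth(0.73, 0.27)

print(conventional, round(color_interlaced, 2))
```

On this measure the per-eye luminance modulation drops from 1.0 to roughly 0.46 for an achromatic stimulus, and it would fall further for colors in which green and magenta carry more nearly equal luminance.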

Color breakup
In color interlacing, the red and blue components are presented at a different time than the green component. This can yield the visual impression that colors in a moving object are spatially misaligned, a phenomenon called color breakup [23][24][25]. Color breakup occurs when colors are presented sequentially (as in single-chip DLP projectors [26,27]). Color interlacing differs from conventional color-sequential presentation by showing different color components to the two eyes at each moment in time. One might expect the two color components to add binocularly, yielding little color breakup. We investigated the salience of color breakup with the color-interlacing protocol in some pilot testing using the procedure described in Johnson et al. [25]. Color breakup was not observed when the object moved slowly but became more and more salient as the object moved faster. In a psychophysical experiment, we measured the speed at which color breakup becomes visible. That threshold speed was quite similar with the conventional and proposed color-sequential presentations. However, when the object moved faster than the threshold speed, color breakup was somewhat more visible with the proposed technique. We think that binocular rivalry might have made breakup more salient when it occurred.
Fig. 10. Illustration of interference filter design for temporal interlacing (left) and color interlacing (right). Transmittance of filters is plotted as a function of wavelength. Dashed and solid lines denote the transmittance of the glasses and color wheel, respectively. The transmittances of the glasses are the same in each column while the transmittances of the color wheel are the same in each row. During sub-frame 1 of temporal interlacing, the color wheel's interference filter has the same transmittance as that of the glasses for the left eye. Thus the left eye sees all color components, but the right eye sees none. During sub-frame 2, the color wheel's filter has the same transmittance as that of the glasses for the right eye, so the right eye sees all colors, but the left eye sees none. Color interlacing can be implemented with a simple modification of the filter design. The transmission bands for the red and blue primaries are exchanged between the two eyes, while the band for the green primary is not changed. As a result, the left eye sees green and the right eye sees red and blue during sub-frame 1, and vice versa for sub-frame 2.

Conclusion
In this paper, we verified that the proposed color-interlacing technique reduces depth distortion by keeping luminance per eye relatively constant over time. Another method for eliminating depth distortion would be to use two projectors with static interference filters that are mutually exclusive. But a two-projector approach is generally not favored because high-end movie theater projectors are expensive. Depth distortions could also be eliminated by capturing content with alternating capture, as opposed to simultaneous capture. This method obviously relies on the display using alternating presentation for the benefit to be realized; otherwise depth distortion would still occur. The proposed technique should enhance the S3D viewing experience by reducing depth distortion in a feasible way.

Fig. 3. Video illustrating the cue conflict created by depth distortion (see Media 1). Cross-fuse the two panels to see the image in stereo. A bike rider goes through the scene from left to right. Due to the depth distortion, he is seen as closer than the standing person. But when the biker passes by the person, the person occludes the biker, indicating that he is in fact farther than the person.

Fig. 4. Temporal interlacing with and without color interlacing. Left: The conventional temporal-interlacing protocol. Time proceeds from top to bottom. R, G, and B are presented simultaneously to the left eye in sub-frame 1 and then simultaneously to the right eye in sub-frame 2. Right: Color interlacing. In sub-frame 1, G is presented to the left eye and R and B to the right eye. In sub-frame 2, R and B are presented to the left eye and G to the right eye. Thus for most stimuli, both eyes are always being stimulated.

Fig. 8. Experimental results. Each panel shows the data from one of the three subjects. In each case, the disparity that eliminated perceived depth distortion is plotted as a function of stimulus speed. The dashed lines represent the predictions from Fig. 7. The circles represent the data when the stimuli were presented according to the color-interlacing protocol. The diamonds represent the data when stimuli of equivalent brightness but with no color variation were presented. Error bars represent 95% confidence intervals.

Fig. 9. Demo of color interlacing (see Media 2). Left and right panels correspond to right- and left-eye views (cross-fuse for stereoscopic effect). There is no depth distortion and the biker appears at the appropriate place in depth.