Stereoscopic Three-dimensional Optic Flow Distortions Caused by Mismatches Between Image Acquisition and Display Parameters

We analyzed the impact of common stereoscopic three-dimensional (S3D) depth distortion on S3D optic flow in virtual reality environments. The depth distortion is introduced by mismatches between the image acquisition and display parameters. The results show that such S3D distortions induce large S3D optic flow distortions and may even induce partial/full optic flow reversal within a certain depth range, depending on the viewer’s moving speed and the magnitude of S3D distortion. We hypothesize that the S3D optic flow distortion may be a source of intra-sensory conflict, which in turn could cause visually induced motion sickness. © 2019 Society for Imaging Science and Technology. [DOI: 10.2352/J.ImagingSci.Technol.2019.63.6.060412]


INTRODUCTION
With the recent growth of interest in virtual reality (VR), many solutions have been proposed to the visual discomfort that occurs while exploring VR worlds presented in stereoscopic three-dimensional (S3D) form. Such discomfort symptoms are considered to be critical issues that may prevent potential users from embracing VR devices, as once users experience these symptoms, they may be reluctant to try the devices again. Our work presented here is motivated by the need to eliminate or reduce visually induced motion sickness (VIMS) in S3D systems.
The most common explanation for motion sickness symptoms is based on the inter-sensory motion conflict theory. When motion signals from different sensory systems, such as visual and vestibular, are in conflict with each other, that may cause motion sickness. This theory has been used to explain many real-world motion sickness experiences such as carsickness and seasickness. This theory also may explain motion sickness in VR, as users are usually stationary while exploring VR worlds with full virtual and visual motion.
However, there is currently no tested theory or data to explain the finding that S3D presentations induce more motion sickness than two-dimensional (2D) presentations, even with the same content, and even though the visual information is considered to be truthfully replicated in S3D.
We have been exploring explanations based on intra-sensory motion conflict, where the visual motion information does not match the naturally expected visual motion information. We proposed that such intra-sensory conflict could increase overall motion sickness by adding another layer of conflict on top of the existing motion conflict among sensory systems [1]. Since the perception of self-motion (vection) through the visual system is mainly driven by the optic flow of the presented scene [2], we hypothesize that discrepancies between the optic flow generated in S3D and the optic flow experienced in the real world may be a source of additional VIMS in S3D VR.

Theories of Motion Sickness in VR
When a person walks and turns in the real world, his/her vestibular and proprioceptive sensory systems send motion signals to the brain in addition to the efferent copy of the motor commands, while the visual system monitors the optic flow, which confirms the individual's current state of motion. However, when a person ''walks'' in a VR environment, only the optic flow indicates self-motion, while the other sensory systems provide no motion signals. This conflict is believed to cause motion sickness.
The inter-sensory motion conflict theory [3,4] states that motion sickness is caused by conflicts between the motion signals received by two (or more) sensory systems, which arise during the information integration process that determines the state of self-motion. Although this theory suggests a plausible underlying mechanism for the onset of motion sickness in VR, it does not explain the reduction of motion sickness symptoms through repeated exposure (habituation).
The sensory rearrangement theory [5,6] expands on the inter-sensory motion conflict theory to include the concept of a comparator that matches current sensory inputs with the motion information expected or learned from prior experience. Repeated exposure to a certain type of motion signal conflict will eventually be registered as a new type of ''normal'' experience such that the comparator no longer generates a motion conflict signal. One example of such response is the vestibular gain adaptation under conditions of optical magnification [7,8].
The sensory rearrangement theory supports other possible causes of motion conflict, such as intra-sensory motion conflict [8], where the VIMS may be induced by a distorted optic flow that has never been experienced before or is not consistent with the expected optic flow.

Elevated VIMS in S3D Compared to 2D Viewing
Elevated VIMS symptoms resulting from S3D viewing compared to 2D viewing have been reported in numerous studies [9-16]. For example, in a study on motion sickness of movie viewers [11], one group watched a movie in 2D, while another group watched the same movie in S3D. More viewers in the S3D condition experienced motion sickness than did viewers in the 2D condition. This suggests that the stereoscopic depth added in the S3D viewing condition may lead to more VIMS.

Geometric S3D Space Distortions
Various image distortions in S3D [17,18] have been discussed in the context of the accuracy of images projected to each eye (e.g., straight lines should not look curved). Other studies [1,19] pointed out that the corresponding parameters between S3D capturing (or rendering) and displaying are frequently mismatched, and the mismatches introduce S3D space distortions that affect the perception of size-and-depth relations in the reconstructed world.
When such a space distortion is introduced, viewers may experience perceptual illusions such as those of the Alice in Wonderland syndrome [20], where the viewer perceives the displayed world to be either larger (macropsia), smaller (micropsia), closer (pelopsia), or farther (teleopsia) than it should be. However, if the viewer does not actively interact with the reconstructed S3D world (e.g., while watching a stationary scene), these perceptions may not cause any motion sickness symptoms because other 2D depth cues such as perspective, size, texture, and occlusion correctly follow the distorted space, although the reconstructed scene may look unfamiliar or unnatural in some sense.
Different types of distortions can co-exist in a single S3D reconstruction, and different levels of parameter mismatches may result in compounded non-linear changes of various distortion patterns. Figure 1 shows a type of S3D distortion caused by mismatch between the virtual camera convergence distance and viewer screen distance. Various other parameter mismatches potentially causing S3D distortion were discussed and demonstrated in [19].

Perception of Motion in Distorted S3D Space
Although the modeling works described above showed various patterns of S3D distortion in a stationary scene, they did not address how such S3D space distortions affect the perception of self-motion in the S3D world. For instance, when the user moves forward in a distorted S3D space, objects at near or far distances seem to approach the viewer slower or faster than the speed expected from the natural depth-to-size relationship. Furthermore, since the monocular depth cues remain veridical in the stereo depth distorted scene, the objects may appear to be compressed or expanded along the depth direction (depth wobbling) as the user moves forward, which violates the natural stability of (apparently) rigid objects. In this case, the viewer will experience an optic flow that rarely happens in the real world.
Note that unlike usual 2D projection distortions (e.g., lens distortions or keystone distortions), the stereo disparity based S3D spatial distortions are barely noticeable from a stationary viewpoint because the distortions occur in the depth direction so that straight lines still appear straight in the distorted S3D world when it is observed from a stationary viewpoint.
Figure 1. Patterns of S3D space distortion when the virtual camera convergence distance and viewer screen distance are mismatched (from [19]). The brown cubes are orthoscopic representations of a cube located at the screen distance in the original virtual world. The gray plane represents the display screen. The purple cubes represent the reproductions of the cube when the virtual camera convergence distance is (a) shorter than the display screen distance and (b) longer than the display screen distance. The cube will be displayed as an extended or compressed hexahedron, respectively.
In a pseudoscopic condition (i.e., reversed disparity condition), for example, where the left eye view is displayed to the right eye and the right eye view is displayed to the left eye, each individual view may show no 2D distortion. However, when those images are viewed in stereo, the viewer sees the extreme (reversed) depth distortion if the stereo cues dominate the percept [21].

Distorted Perception of Self-Motion and VIMS
We hypothesize that unfamiliar optic flow presented to the VR users in distorted S3D space may be a reason for an increased VIMS in S3D.
Similar issues may result from the prismatic (or other optical) effects of refractive ophthalmic lenses in real-world practices [22,23]. For example, when people are fit with new glasses that have a different optical power or lens design (i.e., base curve), they may need time to adapt to the new lenses (sometimes up to two weeks). During the adaptation period, they may experience motion-sickness-like symptoms such as nausea and dizziness. Such symptoms may be much more severe and require a longer period of adaptation when the patients are fit with progressive addition lenses, which introduce more complex and spatially variable geometric distortions. These are different from the S3D distortions as the optical lenses may distort both the stereo depth and monocular 2D images.
Since almost all S3D contents are produced by using a set of S3D parameters chosen by the producers and are not customized for each user or user's display environment, S3D space distortion may be frequently introduced. In addition, since users spend a limited amount of time (e.g., 2-3 hours for watching a movie) with a reconstructed S3D world, it is expected that full adaptation to the S3D distortions will not be achieved in that period. The initial discomfort caused by VIMS may be severe enough to prevent further use and, thus, fail to lead to habituation even if it is possible.
Here, we analyze and demonstrate the type of optic flow distortions in such mismatched S3D.

METHODS
Using the S3D space distortion models proposed in [19], we computed the S3D optic flow in distorted S3D worlds, which may occur while a VR user is in ''motion,'' and analyzed their effects. Figure 2 shows a schematic of the VR environment setup for the S3D optic flow analysis where 20 spherical objects are aligned along the forward depth direction 1 m to the left side of the viewer's path and spaced 1 m apart (as if they are on the wall of a corridor, or a road with trees on one side). The viewer is assumed to move forward at 1, 2, or 3 m/s. In this configuration, the objects initially cover the viewer's lateral visual field from 2.8° (the 20th, farthest object) to 45° (the 1st, nearest object).
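The layout described above can be checked with a short sketch. It assumes the viewer's eye is at the origin and that the nearest object is 1 m ahead in depth (which is what the stated 45° implies); the function and constant names are illustrative only. It recovers 45° for the nearest object and about 2.86° for the 20th, which the text reports as 2.8°.

```python
import math

LATERAL_OFFSET_M = 1.0   # objects sit 1 m to the left of the viewer's path
NUM_OBJECTS = 20         # spaced 1 m apart along the depth direction


def eccentricity_deg(depth_m, lateral_m=LATERAL_OFFSET_M):
    """Visual eccentricity of an object relative to the heading direction."""
    return math.degrees(math.atan2(lateral_m, depth_m))


# Assumption: the first object is 1 m ahead of the viewer, the last 20 m ahead.
depths = [float(k) for k in range(1, NUM_OBJECTS + 1)]
eccentricities = [eccentricity_deg(z) for z in depths]

print(f"nearest: {eccentricities[0]:.2f} deg, farthest: {eccentricities[-1]:.2f} deg")
```

Because eccentricity is an arctangent of the lateral offset over depth, equal steps in depth map to very unequal steps in eccentricity, which is why most of the objects crowd near the vanishing point.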

The Sample VR Environment
Note that since the objects are aligned along the depth direction (i.e., the viewer's moving direction), near and far objects are located at higher and lower eccentricities, respectively, where the vanishing point is located at the center of the visual field (i.e., eccentricity = 0°). This setup simplifies the full-scale optic flow analysis into an analysis of one-dimensional lateral motion of each object at different distances. For the full-scale optic flow, one may imagine that the indicated speed and space distortion may occur along all radial directions from the viewer.

Optic Flow Analysis for S3D Depth Distortion
In a conventional 2D optic flow diagram, the apparent object's motion is described by the angular velocity of objects in the visual field. However, such a 2D optic flow has a limited ability to illustrate individual object's shape transformations in the depth direction.
For example, if virtual objects move towards the viewer along the radial axis while simultaneously shrinking their size following the inverse of the linear perspective ratio, their apparent sizes will not be changed. The eccentricities of the objects will not change either because the object's motion occurs along the same eccentricity. In this case, an ordinary 2D optic flow diagram shows no apparent motion in the visual field. However, even in those cases, human stereoscopic vision fusing the two 2D optic flows can determine that the object is approaching because its angular disparity is increasing.
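The radial-approach example above can be illustrated numerically. The sketch below is a simplified geometric illustration, not the analysis used in this work: it treats the object as a point, places the eyes at (-IPD/2, 0) and (+IPD/2, 0), and assumes an interpupillary distance of 63 mm, a value not given in the text. The eccentricity stays constant while the angle subtended by the two eyes at the point (the vergence demand) grows as the point approaches.

```python
import math

IPD_M = 0.063  # assumed interpupillary distance; not specified in the text


def eccentricity_deg(x, z):
    """Direction of the point (x, z) seen from the midpoint between the eyes."""
    return math.degrees(math.atan2(x, z))


def vergence_deg(x, z, ipd=IPD_M):
    """Angle subtended at the point (x, z) by eyes at (-ipd/2, 0) and (+ipd/2, 0)."""
    from_left_eye = math.atan2(x + ipd / 2, z)
    from_right_eye = math.atan2(x - ipd / 2, z)
    return math.degrees(from_left_eye - from_right_eye)


# A point approaching the viewer along a fixed radial line at 10 deg eccentricity.
direction = math.radians(10.0)
for radial_distance in (8.0, 4.0, 2.0):
    x = radial_distance * math.sin(direction)
    z = radial_distance * math.cos(direction)
    print(f"r={radial_distance:.0f} m  ecc={eccentricity_deg(x, z):.2f} deg  "
          f"vergence={vergence_deg(x, z):.3f} deg")
```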
To overcome these limitations of the conventional 2D optic flow diagram, the linear velocities of objects were computed in xyz Euclidean coordinates, instead of angular velocities, and the computed velocities were then associated with the corresponding locations in the visual field. In this way, the speed changes in distorted depth (i.e., compressed or expanded space) can be compared with the object motions in S3D space without any depth distortion. An angular velocity based optic flow is still needed to fully describe the actual 3D motion of the object, but even without it, describing optic flow in linear speed is an effective way to show the impact of S3D depth distortion because the S3D depth distortion does not affect the angular position of the object but only its position along the depth direction.

Applying the S3D Optic Flow Analysis
The S3D optic flow analysis was applied to the following S3D rendering and display-related parameter pair mismatches between: (1) virtual camera separation and viewer eye separation; (2) camera field of view and virtual screen field of view; (3) camera convergence distance and virtual screen distance.
In the usual VR head-mounted display (HMD) condition, for instance, the virtual camera and viewer eye separation mismatch may be introduced when an S3D scene is rendered with a virtual camera separation matching an average interpupillary distance (IPD) but shown to viewers with a smaller or larger IPD without any corresponding rendering adjustment; the camera and virtual screen field of view mismatch may be introduced when an S3D scene is rendered with a certain fixed rendering camera frustum but is displayed on a different size of virtual screen in HMD hardware; the camera convergence and the virtual screen distance mismatch may be introduced if the virtual cameras render the scene with a certain convergence distance, but the virtual screen is located at a different distance.
Note that it is a common misconception that only the parallel-axis camera setup, but not the converging camera setup, properly renders the binocular scene, mainly driven by the concerns of inducing keystone distortions and accompanying vertical disparity [17]. The important point here is whether the camera axis and display screen orientation are matched or not [1]. If the convergence of camera (optical) axis is achieved by employing asymmetric frustums (for virtual world) or a sensor shift (for real world), the captured/rendered images do not induce any 2D projection distortions when presented on fronto-parallel display screens.
Those parameter mismatches do not require any display-viewer interactions. Therefore, eye movements within the S3D scene while the viewer is standing still do not affect the perception of the scene since the depth structure is maintained [1].
Figure 3(a) shows the distance from the viewer to the objects and the visual eccentricity of objects in the sample world of Fig. 2, where the nearer objects are located at higher eccentricities and farther objects are located at lower eccentricities. The distance-to-eccentricity relation in the sample VR world is non-linear in nature. For each of the objects, the displacement over a 0.1 s duration was computed when the viewer is assumed to move forward at 1, 2, and 3 m/s. In our sample VR world, as a viewer moves forward, the retinal projections of all objects move toward larger eccentricities, where the nearer objects (at larger eccentricities) appear to move faster than the farther objects (at smaller eccentricities), indicating that conventional motion parallax is in place (as shown in Fig. 3b). Fig. 3(c) shows the linear speed of objects (computed as the displacement divided by the duration) in the viewer's visual field in an ideal S3D world without any space distortion (i.e., orthoscopic reproduction). Both the angular (Fig. 3b) and linear (Fig. 3c) optic flow plots can be considered as a ground truth or baseline of the S3D optic flow that people experience in real-world conditions or in distortion-free reproductions of the S3D world.
Figure 4 shows the linear speed of the object motion in the same forward movements but with various S3D distortions introduced by the mismatched parameter pairs between the capture and display processes. The magnitude of parameter mismatch is marked as a ratio between the corresponding parameters, as described in [19]: (1) the camera and eye separations (Ks), (2) the camera and screen fields of view (FOVs) (Kw), and (3) the camera convergence and screen distances (Kd). In all cases, a ratio of 1 indicates a perfectly matched condition (as in Fig. 3c).
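The orthoscopic baseline of Fig. 3 described above can be sketched for a single object as follows. This is a minimal illustration under stated assumptions, not the computation used for the figures: the angular speed is the eccentricity change over the 0.1 s step, and the "linear speed" is taken here to be the rate of change of the viewer-to-object distance, which is one plausible reading of the text. The function name is hypothetical.

```python
import math

DT_S = 0.1               # duration used in the text for each displacement
LATERAL_OFFSET_M = 1.0   # objects are 1 m to the side of the viewer's path


def baseline_flow(depth_m, viewer_speed_mps, dt=DT_S, lateral_m=LATERAL_OFFSET_M):
    """Return (angular speed in deg/s, approach speed in m/s) with no distortion."""
    new_depth = depth_m - viewer_speed_mps * dt  # viewer advances, depth shrinks
    ecc_before = math.degrees(math.atan2(lateral_m, depth_m))
    ecc_after = math.degrees(math.atan2(lateral_m, new_depth))
    dist_before = math.hypot(lateral_m, depth_m)
    dist_after = math.hypot(lateral_m, new_depth)
    angular_speed = (ecc_after - ecc_before) / dt
    # Assumption: "linear speed" = change in viewer-to-object distance per second.
    approach_speed = (dist_before - dist_after) / dt
    return angular_speed, approach_speed


for speed in (1.0, 2.0, 3.0):
    for depth in (2.0, 6.0, 20.0):
        ang, lin = baseline_flow(depth, speed)
        print(f"V={speed:.0f} m/s  z={depth:4.1f} m  {ang:6.2f} deg/s  {lin:5.2f} m/s")
```

Under this reading, near objects show the largest angular speed (motion parallax), while the approach speed rises toward the viewer's own speed as eccentricity decreases.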
As can be seen in Fig. 4, the S3D space distortion affects the apparent speed of objects at all distances, and the amount of speed error increases as the viewer's moving speed increases. More dramatic optic flow distortion occurs in conditions with expanded S3D space (i.e., right column plots in Fig. 4, as shown in Fig. 1a) than in conditions with compressed S3D space (i.e., left column plots in Fig. 4, as shown in Fig. 1b). The larger effects occur at lower eccentricities, where the optic flow motion is slower than that at higher eccentricities.
However, the compressed S3D space conditions (Fig. 4a, 4c, 4e) violate the expected trend of depth motion: far objects (at smaller eccentricities) approach more slowly than objects at near/mid distances, whereas in the normal undistorted condition, as shown in Fig. 3(c), the linear speed of an approaching object is expected to increase monotonically as the eccentricity decreases. When the reconstructed S3D space is expanded, such a monotonic linear speed increase can be observed in most cases (such as Fig. 4b and 4d), except for the condition in which the space distortion is caused by the screen distance and convergence distance mismatch (Fig. 4f). In this case, the speed direction may be reversed (i.e., negative values), indicating that objects appear to move backward, away from the viewer who is moving forward toward the objects. In other words, the space expands along the depth direction faster than the viewer advances in a given time.
Note that the ''Dolly zoom'' or ''Contra-zoom'' technique used in 2D movies (e.g., Hitchcock's movie ''Vertigo,'' 1958) [24] induces a similar perceptual depth/speed distortion, where objects at certain distances move ''relatively'' slower (compressed) or faster (expanded) than they should. This is a monocular effect and is often applied to stationary scenes. Our analysis indicates that similar effects may occur in S3D motion scenes due to mismatched parameters.
To emphasize the impact of S3D space distortion on optic flow, the differences between the orthoscopic condition (Fig. 3c) and the non-orthoscopic conditions (Fig. 4) were computed (Figure 5). In those plots, positive and negative values indicate slower and faster optic flow in the distorted S3D world than what it would have been in the natural (orthoscopic) condition, respectively. They do not show the absolute speeds of the objects.
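The sign convention of this difference can be made concrete with a small sketch. It is deliberately generic: the depth maps below are toy uniform scalings chosen only for illustration and are NOT the distortion models of [19], only the depth component of motion is considered, and all function names are hypothetical.

```python
from typing import Callable

DT_S = 0.1  # same 0.1 s step used for the displacement computation in the text


def approach_speed(depth_map: Callable[[float], float], depth_m: float,
                   viewer_speed_mps: float, dt: float = DT_S) -> float:
    """Apparent approach speed along depth after mapping true to perceived depth."""
    before = depth_map(depth_m)
    after = depth_map(depth_m - viewer_speed_mps * dt)
    return (before - after) / dt


def flow_difference(depth_map: Callable[[float], float], depth_m: float,
                    viewer_speed_mps: float, dt: float = DT_S) -> float:
    """Orthoscopic minus distorted speed: positive = slower in the distorted world."""
    orthoscopic = approach_speed(lambda z: z, depth_m, viewer_speed_mps, dt)
    distorted = approach_speed(depth_map, depth_m, viewer_speed_mps, dt)
    return orthoscopic - distorted


# Toy depth maps (assumptions for illustration, not the models of [19]).
def compressed(z: float) -> float:
    return 0.5 * z


def expanded(z: float) -> float:
    return 1.5 * z


print(flow_difference(compressed, 6.0, 2.0))  # positive: slower than orthoscopic
print(flow_difference(expanded, 6.0, 2.0))    # negative: faster than orthoscopic
```

Replacing the toy maps with a depth mapping derived from an actual capture/display mismatch would make the difference vary with depth rather than stay constant.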
In the cases of the camera and eye separation mismatch (Fig. 5a) and the screen and camera FOV mismatch (Fig. 5b), the patterns of S3D optic flow distortion are similar. If the reconstructed S3D space is compressed or expanded, the optic flow of far objects (i.e., at small eccentricity in our sample VR world) is the most affected when a VR user moves. However, since the S3D optic flow distortion changes gradually along the depth direction and the apparent speed of objects is governed by the linear perspective rule, the S3D optic flow distortion of very far objects may not be noticeable, even if they are large.
The problem might occur at mid-range distances, where the distortion of the S3D optic flow may be noticeable. Note that in our sample VR world, the eccentricity of an object located 1 m in front of the viewer is 45.0°, while the eccentricity of an object 6 m away is 9.5° (Fig. 3a). Over this depth range, the speed of viewer motion may also become a dominant factor because a higher moving speed increases the amount of distortion until it surpasses the viewer's perceptual threshold, allowing the viewer to detect the S3D optic flow distortion (Fig. 5a, b).
The patterns of S3D optic flow distortion are more complex in the case of the mismatch between the screen distance and virtual convergence distance (Fig. 5c). When a user moves slowly in the virtual world (i.e., V = 1 m/s), the effect of S3D space compression (Kd < 1) causes relatively little apparent speed distortion for objects at all depths, as they all move slightly slower than expected. However, when the same viewer motion happens in the expanded (Kd > 1) S3D world condition, the S3D space distortion causes large S3D optic flow distortions for near objects. As the user's speed increases, a larger speed error will be introduced along the depth direction for near objects, and even larger errors will be introduced in the expanded S3D space. In addition, an abrupt speed sign change (a vertical asymptote in Fig. 4f and Fig. 5c) occurs at about 11° eccentricity, which is approximately at a distance of 6 m in our sample VR world.
The asymptote shown in this condition is not unique to the screen and convergence distance mismatch condition; in fact, such an asymptote occurs in all mismatch conditions discussed here, depending on the magnitude of parameter mismatch. With the given (and usual) parameter mismatch ranges, the asymptotes are positioned at much farther distances. However, the problem of the relative optic flow reversal (i.e., a reversal of space expansion direction relative to the viewer motion) in the screen distance and convergence distance mismatch is particularly important because such a large amount of optic flow distortion can easily be introduced in practice. In many HMD designs, the virtual screen distance is optically set at a few meters or at infinity, while the convergence distance of the virtual camera is configured to be either parallel axis (i.e., converged to infinite distance) or converged axis (i.e., converged to a certain distance) to render the VR scene. With these parameters, the ratio between the two parameters can be extremely low (i.e., Kd ≈ 0) or high (i.e., Kd → ∞). Note again that the parameter ratio for the orthoscopic condition is equal to one (Kd = 1).
Figure 6 shows a few snapshots of the optic flow simulation where the space is dynamically filled with a cloud of random dots. It is different from the sample VR world (Fig. 2) because the dots are floating at different distances and filling up the full visual field while the viewer is moving forward. In this simulation, for each dot, the speed and the direction of the apparent motion are marked by the length and the direction of the horn, respectively. The sizes of all dots are the same and designed to maintain their initial size regardless of their relative distance to the viewer (i.e., those dots do not follow the linear perspective rule) so that each dot represents a point in space and not an object with physical volume. The same size-change rule is applied to the horn construction. However, the length of the horn depends on the relative distance to the viewer (i.e., the horns follow the linear perspective rule) to illustrate the motion of the dots from the viewer's perspective.
The apparent effect of optic flow distortion is much more easily conveyed in a video. A demo video clip can be found on YouTube using the following link: https://www.youtube.com/watch?v=1Xj0RvkV-wY&feature=youtu.be.

CONCLUSION
Using S3D space distortion models, we demonstrated that the parameter mismatch between the capture and display procedures introduces S3D optic flow distortions, where, in general, the amount of S3D optic flow distortion (1) increases as the viewer's virtual moving speed increases and (2) is larger in S3D expansion than in S3D compression conditions. Also, the S3D optic flow distortion (3) may violate the linear perspective rule along the depth direction, and (4) in some extreme but practical conditions, the S3D optic flow direction may be reversed.
Since this kind of S3D optic flow distortion with respect to the viewer's self-motion hardly ever occurs in the real world, the viewer may experience a completely unfamiliar stereoscopic optic flow, which may conflict with the 2D optic flow behavior in the scene in each eye. Considering the intra-sensory conflicts with these visual motion signals (with an emphasis on vection), we postulate that the increased VIMS reported in S3D compared to 2D condition may be explained by S3D space distortions.
For the developers of the S3D HMD and corresponding application programming interfaces (APIs), this analysis highlights the importance of parameter matching between the optical configuration of the device and supporting APIs, where some viewer-initiated parameters for the display (e.g., eye separation, screen FOV, or screen distance) should be individually adjustable and directly affect the render parameters.
This hypothesis on the relation between S3D distortion and VIMS should be verified empirically. We are currently conducting a study in which we compare the level of VIMS induced by various S3D distortion conditions. The results of this study should assist future VR environment design and hopefully reduce VIMS.