Transmittance-based Extinction and Viewpoint Optimization

A long-standing challenge in volume visualization is the effective communication of relevant spatial structures that might be hidden due to occlusions. Given a scalar field that indicates the importance of every point in the domain, previous work synthesized volume visualizations by weighted averaging of samples along view rays or by optimizing a spatially-varying extinction field through an energy minimization. This energy minimization, however, did not directly measure the contribution of an individual sample to the final pixel color. In this paper, we measure the visibility of relevant structures directly by incorporating the transmittance into a non-linear energy minimization. For the first time, we not only perform a transmittance-based extinction optimization, but also concurrently optimize the camera position to find ideal viewpoints. We derive the partial derivatives for the gradient-based optimization symbolically, which makes the application of automatic differentiation methods unnecessary. The transmittance-based formulation gives a direct visibility measure that is communicated to the user to make them aware of potentially overlooked relevant structures. Our approach is compatible with any measure of importance, and its versatility is demonstrated on multiple data sets.


Introduction
Scientific visualization is frequently concerned with the visualization of three-dimensional spatial data. Depending on the application, different parts of a spatial domain might be more or less relevant to the user. Unfortunately, many of those interesting regions might be occluded and thereby hidden. In the literature, this is known as the occlusion problem [VG05, MMD10, GRT13, AZD17]. To reduce occlusions, two strategies are available: either the transparency of less relevant structures is increased to clear the view on meaningful structures, or a viewpoint is chosen in which less occlusion occurs. Either way, a measure is needed that determines how much of the relevant information is visible on the screen. Previous opacity and [...]
In the following, we first summarize existing work on transfer function, extinction, and viewpoint optimization, which is followed by a brief description of direct volume rendering. Afterwards, the transmittance-based optimization of extinction and viewpoint is introduced. The approach is evaluated on several data sets, covering application domains such as fluid dynamics, geophysics, and medicine. The paper is concluded with an outlook on future work.

Transfer Function Optimization
Direct volume rendering uses transfer functions to map from a scalar field to optical properties such as color and transparency [JSYR14]. How well information is perceived strongly depends on the choice of transfer function. To avoid tedious trial-and-error exploration, He et al. [HHKP96] stochastically mutated transfer functions from an initial pool based on interactive user preferences and quality metrics such as entropy, histogram variance, and edge energy. Using the scalar field and its first and second-order directional derivatives along the gradient, Kindlmann and Durkin [KD98] semi-automatically adjusted transfer functions to reveal material boundaries. Wu and Qu [WQ07] let users choose individual features from multiple direct volume renderings that have been made with different transfer functions, and then synthesized a view that shows or removes selected features. Correa and Ma [CM11] augmented a transfer function editor with a visibility histogram to communicate how well certain value ranges are represented in the image. Further, they represented the opacity transfer function with a Gaussian mixture model and optimized for an improved visibility. To minimize the informational divergence between data and view, Ruiz et al. [RBB*11] minimized the Kullback-Leibler distance between the viewed visibility distribution and a user-defined target distribution. All of the methods above modeled transfer functions as functions of the scalar value (and its derivatives). To address occlusions, the transfer function was varied spatially, which is discussed next.

Opacity and Extinction Optimization
Several approaches varied the opacity spatially along view rays in order to remove occlusions [VG05]. Early smart visibility approaches [VKG04, VKG05] displayed per pixel the object with maximum importance along the ray or switched to sparser object representations to support windowing. The opacity has also been adjusted in a context-dependent manner [BGKG05]. Further lines of research avoided occlusions through exploded views [APH*03] and spatial deformation [CSC07]. A finite-difference based optimization of discrete opacity values along rays was performed by Chan et al. [CWM*09], who improved the perception of direct volume renderings by considering visibility, structural shape and image variations, and perception theory. Instead, we utilize a continuous measure for pixel contribution and derive derivatives symbolically. Maximizing the contribution of the farthest relevant sample point on a ray, Marchesin et al. [MMD10] derived a blending weight for the samples. Ament et al. [AZD17] extended a linear visibility optimization from semi-transparent geometry [GRT13, GRT14, GSME*14] to volume data and realized that the optimization has a closed-form solution if smoothing is done separately in a post-process. This idea has later been picked up for geometric data [GTG17, BRGG20, ZRPD20]. The problem is that this formulation reduces the visibility of a voxel indirectly based on how much importance is gathered in front of or behind it, which does not measure how visible a certain voxel truly is. In this paper, we measure the actual contribution of a fragment to the final pixel color [CM11]. Unlike Correa and Ma [CM11], we optimize for an extinction field and not for a parametric transfer function. Further, Ament et al. [AZD17] performed a separate extinction optimization from the light source to illuminate relevant objects, which can lead to inconsistent lighting. In this paper, we optimize the extinction from camera and light source concurrently to obtain an illumination that is consistent with the visible volume.

Viewpoint Optimization
As scientific data sets increase in resolution quickly, the automatic generation of meaningful preview images or animation paths became a research interest early on. By decomposing a volume data set into a set of components and by using isosurface properties as feature descriptor, a globally optimal viewpoint has been searched for as a compromise between locally optimal viewpoints for each component [TFTN05]. Other optimization targets were an even opacity distribution, an even distribution of salient features, and more perceived curvatures [SJ06]. The opacity entropy [SJ06] has later been applied in a differentiable volume renderer to find best viewpoints [WW22]. To balance between global structures and local details, two descriptors were introduced [TLB*09]: a shape view descriptor, which evaluates the overall feature orientation, and a detail view descriptor, which measures the amount of visible details on boundary structures by estimating variances. A first information theoretic approach based on entropy measures accounted for the transfer function, the data distribution, the voxel visibility, view-likelihood, and view-stability [BS05]. Several following methods likewise expressed the quality of a viewpoint based on information theoretic measures [CJ10], including multi-scale entropy [VMN08]. The information theoretic approach was also applied to judge the image quality after a streamline generation in flow visualization [XLS10, LMSC11, TMWS13]. To guide users towards viewpoints that were chosen by experts, image similarity metrics have been evaluated by majority voting [TWC*16] and have been estimated by deep learning [YLLY19]. In this paper, we optimize for the extinction in the volume by measuring the amount of visible important information, as given by the user, and we select a suitable viewpoint simultaneously. None of the methods above attempted to address both problems at the same time.

Direct Volume Rendering
Direct volume rendering is a popular approach for the visualization of three-dimensional scalar fields [JSYR14]. For this, the scalar field s(x) : D → R, which is defined in a spatial domain D ⊆ R^3, is mapped via transfer functions to optical properties. Based on those, a participating medium is constructed, which can be rendered from a given viewpoint using light transport simulation methods [Jar08]. These computer graphics techniques may include illumination effects such as reflections and shadows to improve the spatial perception. The volume rendering equation, Eq. (1), determines the radiance L(x_0 ← ω) seen at point x_0 coming in from direction ω. The radiance is collected along the view ray x(t) = x_0 + t · ω, and the integrand consists of three terms: the view ray transmittance T(x_0 ↔ x), the scattering coefficient µ_s(x), and the incident radiance L_i(x ← ω). The subscript i in L_i distinguishes the incident radiance from the emitted radiance L_e, which is introduced later.
Transmittance T(x_1 ↔ x_2) describes the fraction of light that begins its journey at point x_1 and reaches point x_2 without being scattered or absorbed. The extinction coefficient µ_t(x) = µ_s(x) + µ_a(x) probabilistically describes the fraction of light that is either scattered (µ_s) or absorbed (µ_a) per unit distance. It is therefore a number that depends on the extent of the data set. This coefficient is used to define the optical thickness of a medium, i.e., setting a small value makes a voxel invisible, while a high value makes it appear opaque.
Scattering Coefficient µ_s(x) determines the visual appearance of a location in a participating medium by defining the fraction of photons that scatter per unit distance. In direct volume rendering, it is common to apply phenomenological scattering models [JSYR14], in which the color c(x) : D → [0, 1]^3 specifies per color channel the fraction of extinct light that is lost due to scattering (µ_s/µ_t). A low value means that light is absorbed (black), while a high value indicates scattering.
Incident Radiance L_i(x ← ω) determines the amount of light that arrives at point x and is scattered in direction ω. We apply a single-scattering model, which directly connects the scattering point x to a light source at point x_L that emits a radiance L_e, where the light ray transmittance T(x ↔ x_L) denotes the fraction of light reaching the scattering point from the light source. Thus, as shown in Fig. 2, two transmittance terms are considered: one from the camera in Eq. (1) and one from the light source in Eq. (4). The phase function f(x, ω_L → ω) describes how much incoming light from direction ω_L is scattered into the outgoing direction ω.
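The rendering model described above can be written compactly as follows. This is a sketch reconstructed from the definitions in the text and numbered to match the equation references; the integration bounds and the omission of a distance falloff for the point light are assumptions.

```latex
\begin{align}
L(\mathbf{x}_0 \leftarrow \omega) &= \int_{0}^{\infty} T(\mathbf{x}_0 \leftrightarrow \mathbf{x}(t)) \, \mu_s(\mathbf{x}(t)) \, L_i(\mathbf{x}(t) \leftarrow \omega) \, \mathrm{d}t \tag{1} \\
T(\mathbf{x}_1 \leftrightarrow \mathbf{x}_2) &= \exp\!\left( -\int_{0}^{\lVert \mathbf{x}_2 - \mathbf{x}_1 \rVert} \mu_t\!\left(\mathbf{x}_1 + t \, \tfrac{\mathbf{x}_2 - \mathbf{x}_1}{\lVert \mathbf{x}_2 - \mathbf{x}_1 \rVert}\right) \mathrm{d}t \right) \tag{2} \\
\mu_s(\mathbf{x}) &= \mathbf{c}(\mathbf{x}) \, \mu_t(\mathbf{x}) \tag{3} \\
L_i(\mathbf{x} \leftarrow \omega) &= f(\mathbf{x}, \omega_L \rightarrow \omega) \, T(\mathbf{x} \leftrightarrow \mathbf{x}_L) \, L_e \tag{4}
\end{align}
```

Substituting (3) and (4) into (1) yields an integrand that contains the extinction both as an explicit factor and inside the two transmittance terms, which is what makes the later optimization non-linear.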
Differentiable Volume Rendering. Inverse rendering refers to the reconstruction of scene parameters from a given image [KBM*20, RRN*20], which can be phrased as an optimization problem to which iterative gradient-based solvers can be applied. It is therefore necessary to determine how much a perturbation of a scene parameter affects the final image, i.e., a gradient of the output image with respect to the scene parameters is needed. In the field of direct volume rendering, reverse automatic differentiation with analytic inversion of blending functions has been proposed [WW22] and generative networks have been trained [BLL19]. For global illumination simulations, differentiable Monte Carlo renderers [NDVZJ19] have been proposed that approximate gradients [ZSGJ21] or calculate them in two passes [VSJ21]. To address visibility discontinuities, edge sampling towards silhouettes [LADL18], a bidirectional generation of boundary paths that connect sensors and emitters [ZMY*20], and a transformation from boundary to area integrals [BLD20] have been proposed. The path space formulation has recently been extended to optimize the boundary geometry of translucent volumetric objects [YZM*23]. In this paper, we model single-scattering light transport on differentiable scalar fields s(x), which allows us to compute the necessary derivatives symbolically.

Method
In volume visualization, visibility optimization of relevant information is a long-standing problem [VG05, CWM*09, CM11, AZD17]. In this paper, we introduce a concurrent optimization of the extinction field and the camera viewpoint to clear the view onto relevant structures. To assess the visibility of relevant information, we measure the transmittance of relevant objects in the scene directly. In the following section, we first formally introduce the variational formulation that is used to derive the required partial derivatives.

Problem Statement
Given is a 3D scalar field s(x) : D → R in the spatial domain D ⊆ R^3 that we want to visualize. Further, a differentiable 3D importance field g(x) : D → [0, 1] defines how important each region in the domain is. Lastly, transfer functions are provided to compute color c(x) : D → [0, 1]^3 and base extinction µ̄_t(x) : D → [0, µ_t^max] from the scalar field s(x), with µ_t^max being the majorant extinction, i.e., the highest possible extinction coefficient. The base extinction µ̄_t(x) is the extinction coefficient that our optimization will strive towards if a voxel does not occlude any relevant information. Thus, if no notion of relevance is given, our approach becomes a standard direct volume rendering with µ̄_t(x) being the extinction coefficient.
Our goal is to find a new extinction field µ_t(x) : D → [0, µ_t^max] as well as a perspective camera transformation g_V(x) : D → Y, which maps from world space D to view space Y, such that both together produce a direct volume visualization with single-scattering light transport in which the important regions are shown. This entails the adaptation of the extinction to reveal important structures and an optimization of the camera position.
Differentiable Volume Rendering. In order to optimize for the unknown functions µ_t(x) and g_V(x), a differentiable volume renderer is needed. Inserting Eqs. (3)-(4) into the volume rendering equation in Eq. (1) gives a single-scattering radiance integral, Eq. (6), which is differentiable and contains both unknowns, the extinction coefficient µ_t(x) and the camera transformation g_V(x). In this formulation, x_0 and x_1 are the entry and exit of a view ray. The maps g_V(x) : D → Y and g_L(x) : D → Z define perspective transformations from world space D into view space Y and light space Z, respectively. Later, we introduce the view space transmittance T_V(y) = T(x_0 ↔ g_V^−1(y)) and the light space transmittance T_L(z) = T(g_L^−1(z) ↔ x_L), which depend on the extinction µ_t(x).
Variational Approach. We express the search for the optimal extinction coefficient µ_t(x) and the camera transformation g_V(x) as a variational minimization problem. Variational calculus is a mathematical toolbox that allows for the optimization of functionals, i.e., functions which receive other functions as input [GF63]. The functional F that we will optimize later integrates a Lagrangian L over the world space domain D, see Eq. (7). In general, a necessary condition for the minimum of such a functional is given by the Euler-Lagrange equations [GF63]. These differential equations hold for every point in the domain. Note that δF/δµ_t(x) in Eq. (8) is scalar-valued and that δF/δg_V(x) in Eq. (9) is vector-valued. In the following, we describe the ingredients needed to define the functional that we minimize.
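In compact form, the functional and its necessary conditions can be sketched as follows. The argument list of the Lagrangian is an assumption, and the expanded Euler-Lagrange expressions are not reproduced here; the numbering follows the references in the text.

```latex
\begin{align}
F[\mu_t, g_V] &= \int_{D} \mathcal{L}\big(\mathbf{x}, \mu_t, g_V\big) \, \mathrm{d}\mathbf{x} \tag{7} \\
\frac{\delta F}{\delta \mu_t(\mathbf{x})} &= 0 \tag{8} \\
\frac{\delta F}{\delta g_V(\mathbf{x})} &= \mathbf{0} \tag{9}
\end{align}
```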

Camera Transformation
The first unknown of our optimization is the camera transformation g_V(x). In the following, we describe its degrees of freedom and provide partial derivatives with respect to these degrees of freedom.
Degrees of Freedom. For the efficient generation of view rays, a camera transformation is often defined by three basis vectors u, v, w ∈ R^3 and by a camera location o ∈ D in world space, as shown in Fig. 3. Instead of optimizing these vectors, we set the camera to a fixed location and apply a world space rotation R(α_1, α_2) ∈ SO(3) to the scene, which contains the unknown azimuthal and polar angles α_1 and α_2, keeping the camera upright. Thus, the camera position is restricted to a sphere that encloses the data set. Without loss of generality, the pivot point around which the camera rotates is set to the origin of the world coordinate system. During volume rendering, view rays are cast through the image coordinates (y_1, y_2) ∈ [−1, 1]^2.
Figure 3: The camera orientation is defined by a u, v, w frame, and the camera position is o. When optimizing for the optimal viewpoint, we rotate the world space by the azimuthal and polar angles α_1, α_2.
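As a minimal sketch of the two-angle parameterization, the following Python snippet builds a rotation from an azimuthal and a polar angle and applies it to a world space point. The axis convention (y as up axis), the composition order, and the helper names `matmul`, `rotation`, and `apply` are assumptions for illustration and are not taken from the paper.

```python
import math

def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation(alpha1, alpha2):
    """Rotation built from an azimuthal angle alpha1 and a polar angle alpha2.

    Assumed convention: tilt about the x axis by alpha2 first, then turn
    about the y (up) axis by alpha1.
    """
    ca, sa = math.cos(alpha1), math.sin(alpha1)
    cp, sp = math.cos(alpha2), math.sin(alpha2)
    azimuth = [[ca, 0.0, sa], [0.0, 1.0, 0.0], [-sa, 0.0, ca]]
    polar = [[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]]
    return matmul(azimuth, polar)

def apply(r, x):
    """Rotate the world space point x about the origin (the pivot point)."""
    return [sum(r[i][k] * x[k] for k in range(3)) for i in range(3)]
```

Because only two angles are free, rotating the scene this way is equivalent to moving the camera on a sphere around the pivot, which keeps the search space two-dimensional.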
Perspective Mapping. The perspective mapping g_V(x) transforms a world space coordinate x ∈ D to the camera space coordinate y. Under this perspective mapping, the world space depth range [z_n, z_f] along the w axis is mapped to a linear depth in [0, 1]. The inverse transformation g_V^−1(y) maps a camera space coordinate back to world space. Similarly, the transformation to the light space is given by a perspective mapping g_L(x) : D → Z. We chose to model a point light, although directional lights could easily be supported as well.
Derivatives. The partial derivative of the transformation g_V(x) with respect to the rotation angles α_i is computed symbolically, with ∂k/∂α_i = [...]. Using this derivative, we can express how much the camera transformation changes when varying the azimuthal and polar angle of the rotation.

View Space and Light Space Quantities
The second unknown is the extinction µ_t(x), which appears in the transmittance in Eq. (2). The volume rendering equation in Eq. (6) contains two transmittance terms: one is measured along view rays (T_V), the other along light rays (T_L). To accelerate computations, we discretize both onto perspective grids, as shown in Fig. 4.
Transmittance. We describe the view space transmittance T_V(y) : Y → [0, 1] and the light space transmittance T_L(z) : Z → [0, 1] in their respective view space and light space coordinates using the view space transformation g_V(x) : D → Y and the light space transformation g_L(x) : D → Z, with the camera being placed at y_0 and the light source being placed at z_0. The fields are discretized onto perspective grids with a resolution of X × Y × Z by looping once over each ray, where h is the grid spacing in view direction. Afterwards, the transmittance can be looked up in world space by transforming to the grid coordinates and by trilinearly interpolating the transmittance value. For light rays, this is known as a shadow volume [AZD17]. In computer graphics, the exponential is often first-order Taylor approximated, i.e., e^(−µ_t(x_i)·h) = e^(−α_i) ≈ 1 − α_i, which results in the common alpha blending equations [BM08].
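The single pass over a ray can be sketched in Python as follows. This is an illustration for one ray with uniformly spaced samples; the function names and the choice to store the transmittance in front of each sample are assumptions. The second function applies the first-order Taylor approximation with α_i = µ_t(x_i)·h.

```python
import math

def transmittance_exact(mu_t, h):
    """T[i] is the transmittance from the ray entry to the front of sample i."""
    values, t = [], 1.0
    for mu in mu_t:
        values.append(t)
        t *= math.exp(-mu * h)  # attenuation across one step of length h
    return values

def transmittance_alpha(mu_t, h):
    """Same accumulation with exp(-alpha) replaced by 1 - alpha."""
    values, t = [], 1.0
    for mu in mu_t:
        values.append(t)
        t *= 1.0 - mu * h  # alpha blending form, valid for small mu * h
    return values
```

The two variants agree closely when µ_t·h is small and drift apart for optically thick steps, which is one reason to keep the exponential when derivatives are taken symbolically.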
Visibility Integrals. How much an individual voxel strives to be seen depends on the amount of relevant visible information behind it. Visibility is thereby characterized by high extinction and transmittance. Thus, we compute visibility-weighted importance integrals behind a given point in view space (G_V) and light space (G_L). They are computed on an X × Y × Z grid by traversing the rays back-to-front whenever the camera changes its position, with h again being the step size. With this, the amount of occluded relevant information can be looked up at any world space position.
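A back-to-front accumulation along one ray might look as follows. The integrand g·µ_t·T is an assumption derived from the statement that visibility is characterized by high extinction and transmittance; the function name and the convention that a sample does not count itself are likewise illustrative.

```python
def visibility_integral(g, mu_t, transmittance, h):
    """G[i] accumulates g * mu_t * T * h over all samples behind sample i."""
    n = len(g)
    behind = [0.0] * n
    acc = 0.0
    for i in range(n - 1, -1, -1):  # traverse the ray back-to-front
        behind[i] = acc
        acc += g[i] * mu_t[i] * transmittance[i] * h
    return behind
```

A single reverse sweep per ray suffices, so the cost matches that of the transmittance pass.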

Optimization
Now we have all the ingredients needed to define the functional that we aim to minimize. Afterwards, we provide its functional derivatives and a gradient descent update for the optimization.
Energy. Similar to the opacity optimization energy [AZD17, GTG17], we decompose our variational energy into multiple terms F_p, F_q, and F_r, with each individual term serving a specific task, where the extinction µ_t(x) is subject to the constraint 0 ≤ µ_t(x) ≤ µ_t^max. F_p in Eq. (22) is a regularization term that, in the absence of all other terms, makes the unknown extinction µ_t(x) become equal to the given base extinction µ̄_t(x). Thus, if no extinction optimization is applied, we obtain a regular volume rendering.
F_q in Eq. (23) maximizes the extinction µ_t(x) of important structures (g(x) is high) such that they scatter more light, and it keeps the transmittance T_V(x) high, which makes sure that no object occludes the view up to x. Phrased as a minimization, the term is negated.
F_r in Eq. (24) does the same from the perspective of the light source. This ensures that light reaches the relevant structures.
Since µ_t(x) in Eqs. (23)-(24) is unbounded, the constraint in Eq. (25) is added. We set the upper bound µ_t^max to the maximum extinction that is present in the transfer function that produced µ̄_t(x). Thereby, the volume does not become more optically dense than the base volume rendering.
We set the same energy weights in all data sets, p = 10^−3, q = 10, r = 1, unless mentioned otherwise. A parameter study is presented in the supplemental material, where other choices are demonstrated. With our approach, it is possible to optimize for a single extinction field µ_t(x) that fulfills its visibility constraints from both the camera and light perspective. Alternatively, we may also optimize for two separate extinction fields: one for the camera constraint, and one for the light constraint. The latter approach follows Ament et al. [AZD17], which, however, may lead to inconsistent lighting. Later, in Section 4.2, we demonstrate and compare both approaches.
Partial Derivatives. Differentiation of Eqs. (22)-(24) according to the Euler-Lagrange equation in Eq. (8) with respect to the extinction µ_t(x) results in one functional derivative per energy term. Since only F_q depends on the camera transformation g_V(x), and thus on the angles α_i, there is only one functional derivative for the camera optimization, Eq. (29). Using µ_t(x) from the extinction optimization in Eq. (29) results in a coupled optimization, for which the convergence requires careful balancing of learning rates. Using a view-independent proxy µ_t(x) := µ_t^max · g(x) in Eq. (29) instead is more stable and empirically gave similar optimal camera positions. The functional derivatives are derived in the additional material. We estimate the energy derivative δF_q/δα_i with second-order central differences.
Gradient Descent. To compute the extinction field µ_t(x) that fulfills the Euler-Lagrange equation in Eq. (8), we discretize the field onto a world space grid with extinction values µ_t,i, which are optimized via gradient descent, starting at an initial guess µ_t,i^(0) := µ̄_t(x_i). Thus, if the optimization is disabled (q = r = 0), we already see the regular volume rendering. Likewise, to find the camera rotation angles α_i that fulfill the Euler-Lagrange equation in Eq. (9), we apply the chain rule using Eq. (14), integrate contributions from all points in the domain, and perform a gradient descent with step size h, starting from an initial guess α_i^(0). To satisfy the constraint in Eq. (25), the extinctions µ_t,i are clamped to the valid range after each gradient descent step. Since our functional is non-linear, the result depends on the initial condition, which is discussed later in Section 4.6.
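To illustrate the clamped gradient descent, the following Python sketch optimizes the extinction along a single view ray. It is a toy under stated assumptions, not the paper's solver: the energy uses a quadratic regularizer towards the base extinction and a negated term g·µ_t·T, the light term is omitted, and all function names and the fixed step size are hypothetical.

```python
import math

def transmittance(mu, h):
    """Transmittance in front of each sample of one ray."""
    values, t = [], 1.0
    for m in mu:
        values.append(t)
        t *= math.exp(-m * h)
    return values

def energy(mu, mu_base, g, h, p, q):
    """Assumed energy: regularizer minus visibility-weighted importance."""
    T = transmittance(mu, h)
    return h * sum(0.5 * p * (m - b) ** 2 - q * gi * m * t
                   for m, b, gi, t in zip(mu, mu_base, g, T))

def gradient(mu, mu_base, g, h, p, q):
    """Analytic gradient; 'behind' is the visibility integral behind sample k."""
    T = transmittance(mu, h)
    grad = [0.0] * len(mu)
    behind = 0.0
    for k in range(len(mu) - 1, -1, -1):
        grad[k] = h * (p * (mu[k] - mu_base[k]) - q * g[k] * T[k] + q * behind)
        behind += h * g[k] * mu[k] * T[k]
    return grad

def optimize(mu_base, g, h, mu_max, p=1e-3, q=10.0, step=0.5, iterations=200):
    """Gradient descent from the base extinction, clamped to [0, mu_max]."""
    mu = list(mu_base)
    for _ in range(iterations):
        grad = gradient(mu, mu_base, g, h, p, q)
        mu = [min(max(m - step * d, 0.0), mu_max) for m, d in zip(mu, grad)]
    return mu
```

For a ray with two unimportant samples in front of two important ones, for example `optimize([5.0] * 4, [0.0, 0.0, 1.0, 1.0], 0.25, 5.0)`, the front samples are driven to zero extinction while the important ones stay at the upper bound. In this toy, the gradient of a sample contains the visibility integral of everything behind it, which mirrors the role of G_V described earlier.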

Visualization System
We implemented our extinction and viewpoint optimization using CUDA on the GPU, setting ourselves the following requirements:
(R1) Support camera navigation during extinction optimization.
(R2) Provide feedback on how much relevant information is seen.
Interactivity. When disabling the viewpoint optimization, the user is able to change the camera transformation interactively in order to explore the data set. To reach interactive frame rates (R1), the iteration in Eq. (30) is deferred over multiple frames. Thus, during interactive camera navigation, the optimization needs a few seconds to adjust the visibility to the new viewpoint, see Fig. 5, which was similarly the case in early optimization algorithms [GRT13].
Feedback. Our visibility optimizations incorporate the transmittance of each object. To convey how much relevant information is seen (R2), we accumulate a visibility score S ∈ R+ over the domain, see Eq. (32). The score is high if a voxel is important (g(x) is high), the voxel is scattering light (µ_t(x) is high), and no other objects occlude the view (T_V(g_V(x)) is high). Fig. 6 shows quantitative measurements of the visibility score for different views.
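A form of the score that is consistent with the three factors named above can be sketched as follows. The plain product and the absence of a normalization are assumptions; the numbering follows the later reference to Eq. (32).

```latex
\begin{equation}
S = \int_{D} g(\mathbf{x}) \, \mu_t(\mathbf{x}) \, T_V\big(g_V(\mathbf{x})\big) \, \mathrm{d}\mathbf{x} \tag{32}
\end{equation}
```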
Multi-Resolution Solver on GPU. To meet the above requirements, we implemented the gradient-based optimization with a multi-resolution update of the grids of Section 3.3 on the GPU. Following 32 iterations on quarter resolution, the result is upscaled to half resolution, and after 64 iterations, the result is upscaled to the full resolution. The number of iterations per resolution level was chosen empirically. Each optimization step includes the following:
1. calculate derivatives of energies with respect to the unknowns
2. gradient step on extinction field and on camera parameters
3. recalculate the view space and light space grids, cf. Section 3.3
4. calculate energies and metrics after the optimization step
The gradient descent steps are carried out with adaptive estimates of lower-order moments by using the Adam method [KB17]. The lower resolution grids that we start the optimization from do not need additional memory, since they are stored at every second or fourth grid point in the full resolution grid. Upscaling to a higher resolution thereby becomes a simple interpolation problem.
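The two building blocks of this solver, in-place upscaling and an Adam update with clamping, can be sketched in Python as follows. This is a one-dimensional CPU illustration, not the CUDA implementation; the class and function names, the default hyperparameters, and the linear interpolation are assumptions.

```python
import math

def upscale_in_place(grid, stride):
    """Fill the points halfway between valid coarse samples by interpolation.

    Coarse values are assumed to live at indices 0, stride, 2 * stride, ...
    of the full resolution grid, so no extra memory is needed.
    """
    half = stride // 2
    for i in range(half, len(grid), stride):
        left = grid[i - half]
        right = grid[i + half] if i + half < len(grid) else left
        grid[i] = 0.5 * (left + right)

class Adam:
    """Adam update with bias-corrected first and second moment estimates."""

    def __init__(self, n, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m = [0.0] * n
        self.v = [0.0] * n
        self.t = 0

    def step(self, params, grads, lower, upper):
        """One descent step, followed by clamping to [lower, upper]."""
        self.t += 1
        for i, g in enumerate(grads):
            self.m[i] = self.beta1 * self.m[i] + (1.0 - self.beta1) * g
            self.v[i] = self.beta2 * self.v[i] + (1.0 - self.beta2) * g * g
            m_hat = self.m[i] / (1.0 - self.beta1 ** self.t)
            v_hat = self.v[i] / (1.0 - self.beta2 ** self.t)
            params[i] -= self.lr * m_hat / (math.sqrt(v_hat) + self.eps)
            params[i] = min(max(params[i], lower), upper)
```

Calling `upscale_in_place` with stride 4 and then stride 2 takes a grid from quarter to full resolution. A practical solver would probably also reset or resample the Adam moments when the resolution changes, a detail the text does not specify.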

Results
In the following, we apply our algorithm to a number of data sets from fluid dynamics, geophysics, and medicine. After introducing the data sets, we discuss the separate and joint extinction optimization, compare with previous work, study the impact of the grid resolutions, provide performance measurements, and discuss limitations. We refer to the supplemental material for an informal user study, a parameter study of the weights q and r, as well as additional comparisons with related work for a different camera view.

Test Data
The ROTATING MIXER contains a numerical simulation of a liquid in a cylindrical container that is stirred into motion by three rotating paddles. We show the vorticity [GT18] of this flow, highlighting regions with exceptionally high vortical motion.
The EARTH MANTLE simulation resolves the sinking of cold, dense material from the crust, and the rising of hot plumes from the core [GBT21]. We visualize the temperature anomaly, which shows the difference in temperature to the average temperature at a certain depth, and we highlight cold slabs with high importance.
The VISIBLE HUMAN data set (version 2.0) contains high-resolution CT and MR scans of the head and neck of a male human [RHGJ03]. We visualize the CT data set and assign bone structures a high importance.
The HEPTANE FLAME data set contains a single time step from a combustion simulation of a jet of heptane gas. The importance is set to reproduce the visualizations of Ament et al. [AZD17].

Separate vs Joint Extinction Optimization
When minimizing Eqs. (22)-(24), we may either solve for one extinction grid µ_t(x) that clears the view from the camera and the light, or we solve for two separate extinction grids by once setting q = 0 and once setting r = 0. A joint solver has a lighting that is consistent with the visible extinction field. In contrast, the separate solution displays the context around the region of interest independent of the illumination direction. Results for both approaches can be seen in Fig. 7, where the nose and forehead have been removed in the joint optimization to clear the path for the light. Unlike Ament et al. [AZD17], we have the freedom to choose between the two options, while they had to solve for both views separately.

Comparisons
Extinction Optimization. We compare our approach with three extinction optimization algorithms in Fig. 8. Methods that do not optimize the shadows are compared without rendering shadows, i.e., T_L = 1. Viola et al. [VKG04] segmented the domain into multiple objects and determined per object a sparseness level, which allows for a smooth transition from fully transparent to fully opaque object representations. We compare with their cylindrical maximum importance projection, which determines along the ray the most relevant object and then clips everything in front of this object. In our reimplementation, we consider objects to be determined by the g(x) = 0.5 level set, and start the regular ray marching at the first level set intersection of the object with highest value along the view ray. Being based on a maximum intensity projection, the algorithm cannot retain depth information. This is particularly noticeable in the VISIBLE HUMAN data set. Marchesin et al. [MMD10] computed an importance-weighted color average of samples along the view ray. Accordingly, the visibility score was equal-weighted along the ray. We determined the transmittance for the blending with the background by calculating the transmittance of the base extinction field µ̄_t(x). Due to the order-independence of the compositing, depth information cannot be maintained. While adding detail in some regions, less relevant structures receive too much weight, which is [...] MIXER. We reported the visibility scores S that were obtained by the various methods by sampling them to world space. Since we optimize towards this metric, our approach consistently reached the highest score and showed details not only in high importance ranges.
Viewpoint Optimization. In our search for the optimal viewpoint, the candidates are restricted to a sphere. In Fig. 9, we sampled the sphere of possible viewing directions to convey the smoothness of the objective function. We visualize the visibility score S of Eq. (32) on the Northern and Southern hemisphere, showing that there are distinct viewing directions that are significantly better or worse than others. Further, the best and worst views are shown to convey the expressiveness of the metric. In the best views, the HEPTANE FLAME and the EARTH MANTLE utilize the screen space well, while the worst views are those with the most occlusion.

Grid Resolution
We discretize the transmittance integrals T_V(y), T_L(z) and the visibility integrals G_V(y), G_L(z) in both view space and light space, and we discretize the unknown extinction field µ_t(x) in world space. In Fig. 10, we analyze the impact of different grid resolutions on the obtained visual quality. Lowering the resolution of the world space grid results in noticeable voxel artifacts. In comparison, lowering the resolution of the view space grids is less noticeable. Too strong a reduction shows grid artifacts in the shadows. As previously discussed in Section 3.5, we use a multi-resolution optimization that progresses along the diagonal of Fig. 10.

Performance
In the following, we report on the performance measurements and the memory consumption of our GPU implementation. All performance measures were taken on a workstation that is equipped with an AMD Ryzen 9 7950X CPU and an NVIDIA RTX 4090 GPU. We list the timings for the update of the view space and light space grids, a single optimization step, and the rendering time per frame in Table 1 for all data sets at a viewport resolution of 1024 × 1024 pixels. The reported resolutions of the world space and view space grids are the ones used for all images throughout the paper, unless mentioned otherwise. While the rendering is possible at interactive rates (about 15 fps), the optimization process takes several seconds at full resolution, since a single frame takes about half a second, including the grid updates. To achieve interactivity, the view is optimized and rendered at lower resolution when moving and continues at full resolution when the camera is standing still, as demonstrated earlier in Fig. 5. The current video memory consumption of up to 21 GB leaves opportunities for further research in the future.

Discussion
Non-Linearity.In contrast to previous methods [AZD17, GTG17], our non-linear extinction optimization is not guaranteed to have a unique solution.By further adding the viewpoint optimization on top, the problem becomes even harder to solve.Similar to Weiss and Westermann [WW22], the problem can be approached by starting multiple optimizations from different initial conditions, which results in a sampling problem.In Fig. 9, we visualized the visibility score of the visibility optimization on a sphere, showing that the objective function is smooth enough for a gradient-based solver.
Memory Consumption. Currently, we discretize the transmittance fields, the importance integrals, and the unknown extinction field onto dense grids. The maximal resolution is thereby constrained by the available GPU memory. To circumvent this limitation, several approaches are conceivable, including sparse or hierarchical representations, compression algorithms, or approximations by projection into a different function basis [BRGG20].
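To give a feeling for how quickly dense grids exhaust GPU memory, the following back-of-the-envelope sketch computes the footprint of a few grids. The resolutions, the single-channel layout, and the 32-bit float storage are hypothetical choices for illustration and are not the configurations reported in Table 1.

```python
def dense_grid_bytes(resolution, channels=1, bytes_per_value=4):
    """Memory of one dense grid, assuming 32-bit floats by default."""
    nx, ny, nz = resolution
    return nx * ny * nz * channels * bytes_per_value

def total_gib(resolutions):
    """Sum the footprint of several single-channel grids in GiB."""
    return sum(dense_grid_bytes(r) for r in resolutions) / 2**30

# Hypothetical setup: one world space grid plus two perspective grids each
# for view space and light space (e.g. transmittance and visibility).
world = (512, 512, 512)
perspective = (1024, 1024, 512)
grids = [world] + [perspective] * 4

print(f"world grid: {dense_grid_bytes(world) / 2**30:.2f} GiB")
print(f"all grids:  {total_gib(grids):.2f} GiB")
```

Doubling the resolution along every axis multiplies each term by eight, which is why sparse or hierarchical representations become attractive at higher resolutions.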
Light Source Optimization. In this paper, we optimized the position of the camera to find the best view. Likewise, it is conceivable to optimize the light source position. This would, however, result in a trivial but unwanted solution. Since the same energy is minimized from the camera and the light direction, cf. Eqs. (23)-(24), while the term in Eq. (22) has a comparatively low weight, the light source would likely be placed at the camera location, resulting in a headlight that minimizes shadowing. Headlights, however, are known to yield poor spatial perception. Instead, lateral lighting is preferable, which would have to be expressed as an additional energy term. We consider this an interesting direction for future research.
Perceptual Quality Metric. Our quality metric does not consider human perception. For example, our metric is invariant to changes in the emitted radiance L_e. From a perceptual point of view, the quality metric should depend on the emitted radiance, since human perception depends on the luminance. In the future, it would be interesting to apply visual quality metrics that include or agree with aspects of human visual perception [CWM*09, WQC*10, BRB*13].

Conclusions
Given a scalar-valued importance field, we proposed a variational formulation for the concurrent optimization of an extinction field and the camera viewpoint for direct volume rendering of three-dimensional scalar fields. For this, we measured the contribution of a voxel to the final pixel color directly by considering extinction and transmittance. We derived the necessary conditions for an optimal extinction field and for an optimal camera transformation.
The required functional derivatives that need to vanish have been calculated analytically.We employed an iterative multi-resolution gradient descent to find an optimum.The method has been applied to scalar fields from fluid dynamics, geophysics, and medicine.
In the future, it would be interesting to further investigate how the light source could be optimized, such that both visibility and perception are maximized. At present, our formulation requires the scalar field and the importance field to be differentiable. Supporting non-differentiable scenes would be an interesting avenue for future research. So far, we discretized the unknown extinction field onto a grid, which simplified the optimization. Other function representations could be explored to reduce the required memory footprint. Further, more specialized numerical solvers might help to improve the performance. It is also conceivable to combine our approach with existing smart visibility techniques [VG05, AZD17] to provide interactive feedback during navigation. Moreover, varying levels of sparseness [VKG05] could add depth cues that anchor the important objects in the scene. Lastly, finding camera paths that view relevant structures in time-dependent data would be an interesting challenge.

Figure 2: Single-scattering light transport accumulates all incoming light that gets reflected along the view ray towards the camera. The camera is placed at x_0, the light source at x_L, and one exemplary scattering point x is shown. The fraction of light arriving from the light source at the scattering point is the transmittance T(x ↔ x_L), and the fraction of light reaching the camera from the scattering point is denoted by the transmittance T(x_0 ↔ x).

Figure 4: To efficiently sample the transmittance up to a certain point and the amount of visible information behind that point, we discretize the transmittance and visibility integrals onto perspective grids, shown here for the view space (left) and the light space (right).

Figure 5: To reach interactivity, the extinction optimization is carried out over the course of multiple frames. Here, the delay of the visibility adjustment after interactive camera adjustment is shown. The grid resolution is adapted automatically to support interactivity.

Figure 6: To judge the visibility of relevant structures, we introduce a visibility score S. Here, we see the improvement achieved by our method (right) over direct volume rendering (left).

Figure 7: With our approach, we can either optimize two separate extinction fields for the view from the camera and from the light, which preserves more context, or optimize the two views jointly, which produces more consistent shadowing but removes nose and forehead.

Figure 10: We systematically vary the resolution of the world space grid for the extinction µ(x) (in the columns) and the resolution of the view space and light space grids for the transmittance T_V(y), T_L(z) and the visibility integrals G_V(y), G_L(z) (in the rows). Z denotes twice the world space resolution.

Table 1: Performance measurements for our method. All measurements were taken with an image resolution of 1024 × 1024.