Interactive desktop analysis of high resolution simulations: application to turbulent plume dynamics and current sheet formation

The ever increasing processing capabilities of the supercomputers available to computational scientists today, combined with the need for higher and higher resolution computational grids, has resulted in deluges of simulation data. Yet the computational resources and tools required to make sense of these vast numerical outputs through subsequent analysis are often far from adequate, making such analysis of the data a painstaking, if not a hopeless, task. In this paper, we describe a new tool for the scientific investigation of massive computational datasets. This tool (VAPOR) employs data reduction, advanced visualization, and quantitative analysis operations to permit the interactive exploration of vast datasets using only a desktop PC equipped with a commodity graphics card. We describe VAPORs use in the study of two problems. The first, motivated by stellar envelope convection, investigates the hydrodynamic stability of compressible thermal starting plumes as they descend through a stratified layer of increasing density with depth. The second looks at current sheet formation in an incompressible helical magnetohydrodynamic flow to understand the early spontaneous development of quasi two-dimensional (2D) structures embedded within the 3D solution. Both of the problems were studied at sufficiently high spatial resolution, a grid of 5042 by 2048 points for the first and 15363 points for the second, to overwhelm the interactive capabilities of typically available analysis resources.


Introduction
Over two decades of continual advancement in microprocessor technologies have fuelled the development of increasingly more powerful supercomputers, and computational scientists today enjoy unprecedented computing capability. Access to systems with teraflops, and in the near future, petaflops of performance has enabled numerical modellers to compute at extraordinary scales. A consequence of this unparalleled computing capability is the production of vast amounts of numerical data. Sadly, our ability to manage, analyse, and gain insight from these numerical outputs has not kept pace with our ability to generate them. Thus for many numerical modellers the greatest challenge in the scientific discovery process begins once the simulation has completed and the analysis process commences. The result is that often only limited scientific return is realized from substantial computational investment.
Many factors have contributed to this situation. Perhaps first and foremost is the contrasting natures of numerical simulation, which is well-suited to batch job submission, and data analysis, which is inherently interactive. This is exacerbated by the historical focus of computing centres on the delivery of batch computing cycles to the detriment of other computing needs; seldom are computing resources for analysis available that are on par with batch computing systems. Disparity in the advancement of various computing technologies plays a sizeable role as well. Microprocessor performance-the main driver behind supercomputer advancements-doubles roughly every 18 months in accordance with 'Moore's law' [1]. File input/output (IO) interconnect bandwidth performance, key to the interactive processing of very large datasets, is on a much more modest improvement curve. Finally, while numerical model codes are most frequently developed by scientists themselves, or their colleagues, and are therefore amenable 3 DEUTSCHE PHYSIKALISCHE GESELLSCHAFT to tuning and adaptation to supercomputing architectures, analysis software often comes from third parties, frequently commercial, who have little incentive to scale the application to meet the needs of the small niche market of high resolution computational data analysis.
Technology curves and the makeup of computing environments are beyond our control. Where we hope to have an impact is in the application software available for scientific data analysis. In this paper, we describe a desktop application developed specifically for the analysis of large computational datasets, VAPOR. Unique to VAPOR are its combination of domain specificity, its ability to explore datasets that are terabytes in size using only a ubiquitous PC, and its close coupling of advanced visualization with quantitative data analysis capabilities. We present our experiences with VAPOR in the context of the investigation of two relevant and challenging problems in geophysical and astrophysical fluid dynamics. The first example is motivated by stellar envelope convection and investigates the hydrodynamic stability of compressible thermal starting plumes as they descend through a stratified layer of increasing density with depth. The second example looks at the spontaneous formation of current sheets in a turbulent incompressible and conducting flow, and is motivated by observations of the solar wind and the Earth's magnetosphere. In both cases, spatial resolutions were larger than what interactive desktop visualization and analysis tools can typically handle, and VAPOR enabled new scientific findings that would have been impossible using standard methods in scientific supercomputing (e.g. run-time analysis, study of global quantities, spectra, etc).

VAPOR
Numerous freeware and commercial applications exist for analysing and visualizing large, timevarying gridded datasets. All suffer from pitfalls that significantly curtail their usefulness in the exploration of high resolution simulation data. Open source visualization applications such as Paraview [2] and Visit [3], and commercial applications such as CEI's Ensight, support advanced visualization algorithms appropriate for computational datasets, but these are lacking in quantitative analysis capabilities, are, in the authors'opinion, targeted more towards visualization experts than scientific end-users, and demand specialized parallel computing resources to handle large datasets. High-level, fourth-generation data languages such as ITT's IDL and Mathwork's Matlab were designed with the scientific end-user in mind, support a rich set of mathematical operators suitable for quantitative data analysis, but offer only limited visualization capability and only minimal scalability, restricting their use to moderate sized problems.
The VAPOR software environment attempts to address both of these classes of shortcomings while at the same time providing an intuitive user interface and feature set, the design of which is guided by a committee of computational physicists to ensure that the tool meets the needs of the user community. VAPOR's advanced visualization capabilities enable the rapid identification of features or spatial-temporal regions of interest (ROIs). Its seamless coupling to high-level analysis languages facilitates rigorous quantitative investigation and data manipulation on the reduced domain identified, and its hierarchical data representation scheme permits the investigator to make effective speed/quality trade offs in order to maintain a high degree of interactivity. In this section, we discuss each of these points and indicate how they benefit both visual and quantitative analysis and make possible the exploration of terabyte size datasets using only commodity hardware.

Hierarchical data representation
Perhaps the greatest challenge posed by the analysis of high resolution simulations is the scale of the datasets involved. Simply storing the data on rotating media where it can be randomly accessed may not be feasible, necessitating time-consuming shuffling of data between archival and disk-based storage. Even when sufficient on-line storage capacity exists, IO bandwidths are generally woefully inadequate to support interactive processing without complicated and costly parallel IO systems. Constraints on physical memory size, both for the CPU and graphics processor (GPU), form the next bottleneck in this storage hierarchy. Even if these issues were to be in some way addressed, the processing capabilities of the CPU and GPU place additional limitations on the amount of data that can reasonably processed interactively.
Predicated on the assumption that, while some analysis operations may require all of the detail present in the simulation output, a large class of useful operations are less sensitive to information loss, we address these difficulties through the use of hierarchical data representation. Simulation outputs are stored hierarchically with each level of the hierarchy providing a coarsened approximation of the data at the preceding level. The original data may be accessed in their entirety, without loss of information, or an approximation of the original field, sampled on a coarser grid, may be retrieved. Successive coarsening results in a halving of the resolution along each spatial axes. Thus for a three-dimensional (3D) data set each consecutive coarsening results in a factor of eight reduction in data, and a corresponding reduction in demands on analysis resources.
Hierarchical data representation in VAPOR is achieved using a wavelet decomposition [4]- [7], a two-parameter linear expansion given by where c(k) and d j (k) are real-valued coefficients, and the scaling expansion functions, φ(t), and wavelet expansion functions, ψ(t), are given by This nesting of expansion function leads to a multiresolution representation of f(t) where the first summation in equation (1) gives a low resolution, coarse approximation of f(t) and the second summation, for increasing j, provides finer and finer detail. Although a number of wavelet families are supported in VAPOR, for reasons of computational efficiency we primarily employ the Haar transform [8]. The Haar scaling function is a unit-height, unit-width box function with φ(t) defined by Thus scaling coefficients c j (k) are simply pair-wise averages of c j+1 (k) and c j+1 (k + 1). The corresponding Haar wavelet function ψ(t) is with d j (k) thus being the pairwise difference between neighbouring grid values. We could have accomplished a hierarchical data representation, without the complexity of a wavelet transform, by generating multiple copies of data at varying resolutions. This, however, would impose additional storage requirements. Saving the wavelet coefficients eliminates the storage penalty, and due to the aforementioned rapid advancement of microprocessor technology, the computational cost of both the forward and inverse transform are negligible compared to the IO involved in reading or writing the data. The data transform, in fact, costs very little. Further details on VAPOR's wavelet decomposition process are available from our previous work [9,10].
This data model permits the investigator to throttle the flow of data in accordance with the resources available, and thus control the level of interactivity. Similar to the ubiquitous and highly effective Google Earth application, which permits users to navigate terabytes of image data using only a modestly configured PC, users can browse coarsened representations of their global spatial-temporal domain, identifying features of interest. Once identified, the reduced domain may be refined to any desired level of detail up to the original sampling (e.g. figure 1). Thus massive datasets can be examined as a whole at reduced resolution and arbitrarily refined within the chosen spatial and/or temporal ROIs. Often both visual inspection and numerical analysis are fairly insensitive to rather dramatic data coarsening [10], allowing considerable savings in computational overhead during the early exploratory stages of investigation when interactivity is most crucial.
Moreover, the hierarchical data structure naturally facilitates storage and management of potentially terabytes of simulation data. Data are organized into a multi-file collection consisting of a single XML metadata file (viewable with any text editor or HTML browser) 6 DEUTSCHE PHYSIKALISCHE GESELLSCHAFT and multiple machine-portable binary files containing the wavelet coefficients themselves. The metadata file describes characteristics of the data that are constant across a single simulation run (e.g. grid resolution, variable names, physical coordinate systems and number of time steps output). Individual wavelet coefficient files contain a single time step, for a single variable and a single transformation level. The primary benefit of this decomposition is that the VAPOR Data Collection (VDC) readily supports incomplete datasets: not all time steps, variables, or even refinement levels need be present on disk for a VDC to be valid. If disk space is limited components of a VDC may be stored off-line (or their computation deferred) until needed.

Coupling visualization with quantitative data analysis
The understanding of massive datasets is an iterative process that can greatly benefit when highly interactive, qualitative, visual examination is combined with quantitative numerical analysis. Visualization can be used to identify salient features of the data, giving rise to hypotheses that can be validated or rejected through numerical study. Conversely, notable quantitative characteristics of the data require visualization to illuminate their associated geometric shapes and physical properties. Combining these qualitative and quantitative techniques is most effective when the process is seamless, enabling users to quickly transition back and forth between the two. The challenge with massive data is to maintain a sufficient level of interactivity throughout the analysis process. We address this difficulty by combining data culling, through a combination of ROI isolation and hierarchical representation as described earlier, with seamless coupling to already existing analysis packages.
Visualization within VAPOR is performed with intrinsic algorithms, while numerical analysis operations are currently performed using ITT's fourth-generation language, IDL. A user can simultaneously maintain active VAPOR and IDL sessions, visually identifying ROIs with VAPOR that may be exported to IDL for further study. Interactivity within IDL may be maintained if the ROI is sufficiently small or if the operation is sufficiently well-behaved over coarsened approximations of the data [10]. Newly calculated derived quantities can be imported back into the existing VAPOR session for continued visual investigation. By repeating this process, massive data can be interactively explored, visualized, and analysed without being subject to the delays inherent in reading and writing large data arrays en masse. This coupling of IDL and VAPOR is made possible by a library of data access routines, which can be invoked directly from IDL, permitting IDL to read and write arrays or sub-arrays directly from VAPOR's wavelet-encoded data representation. The data access library could of course be employed by other scientific data processing utilities as well, and so the approach is readily generalizable to the user's favourite analysis package.
We note that the foremost advantage of coupling visual data investigation with an array-based data analysis language is the ability to defer the often expensive calculation of derived quantities until needed and to compute them only over the domain of interest. The time and space required for computing such variables in advance and across the entire domain can easily overwhelm the resources available, delaying or preventing altogether further analysis. Moreover, some quantities can only be computed with reference to the location of a flow structure and are therefore not in principle a priori computable. The coupling between VAPOR and IDL facilitates the calculation of derived quantities on-the-fly, on an as-needed basis over potentially significantly smaller subregions, realizing considerable savings in storage space and processing time. This technique has proven invaluable to both of the investigations to be described later in the paper.

Visualization capabilities
The remainder of this section describes the more salient visualization capabilities of VAPOR, paying particular attention to the algorithmic details that are essential for understanding the transformation of numerical gridded data into graphical imagery and to the user interface features that are needed for producing meaningful and informative visualizations that aid in analysis.

Volume rendering.
Computational data generally consists of a collection of 3D arrays of single or double precision floating point data, with each array specifying the value of one variable at each spatial discretization point and at a single time step. The primary visualization method for exploring scalar 3D data arrays within VAPOR is direct volume rendering (DVR) [11]- [13]. Conceptually, DVR enables a scalar field to be visualized, by specifying a mapping that determines how the field interacts with light; possibly absorbing, reflecting, transmitting, or emitting radiant energy. In practice a simplified optical model is employed whereby the scalar field mapping consists of the assignment of transparency and colour values to a discrete range of data values. An image is generated by integrating colour and opacity along discrete lines of sight passing through the data volume. Thanks to recent improvements in graphics hardware, low-cost graphics cards are widely available which can quickly perform volume rendering on large data grids. Using graphics cards typically available on today's PCs, grids sized up to 512 3 can be rendered interactively.
The mapping of data values to colour and opacity is called a transfer function. Transfer functions in VAPOR are defined by first linearly quantizing a user-selected data interval into 8-bit integers and then mapping each of the resulting 8-bit quantities to colour and opacity. Colour is represented as three floating-point values (red, green and blue) and opacity is a single floating point value between 0.0 (transparent) and 1.0 (opaque).
Image generation is performed in the graphics card as follows: the volume is interpolated (using tri-linear interpolation) at discrete locations along 2D planes orthogonal to the view direction, generating a series of 2D images aligned orthogonal to the users viewpoint, in backto-front order. As these images are generated they are composited (from back to front) into a final image that will be displayed on the user's screen. The compositing operation used to compute opacity and colour at each step in the integration is given bŷ whereĉ i andα i are the accumulated colour and opacity, respectively, and c i and α i are the interpolated colour and opacity. The exponent x/ x is a correction term that allows for varying sampling rates. It is the ratio of the distance between the i and i − 1 sampling planes, and the distance between uniform samples with a sampling frequency of one sample per voxel. Since the entire rendering operation is performed in the graphics hardware, the resulting image usually appears instantaneously. By default, the volume rendered images in VAPOR do not take into account the physical location of light sources. The visual effect is as if the 3D grid were illuminated by a uniformly distributed diffuse light source. More realistic images which better convey geometric shape can be rendered by simulating a directional light source that is reflected or absorbed as it interacts with the volume. Lighting is available in VAPOR with consequent image quality improvement and some loss of rendering speed. When lighting is enabled, a normal vector is needed for each point in the volume. The normal is determined by approximating the gradient of the variable with a finite difference. The lighted colour associated with the point is sensitive to both the colour value at the point (specified by the transfer function) and the angle between the light direction vector and the normal vector. It consists of three components (see [14]): ambient, specular and diffuse. Ambient colour is unaffected by the light source and is a constant multiplied by the point colour value. Diffuse colour, based on the Lambertian lighting model, multiplies the colour value by the cosine of the angle between the light source direction and the normal vector. The specular component, useful in indicating shiny surfaces, is calculated as cos n : the cosine of the angle between the normal vector and the light direction vector, raised to a user-selectable exponent n. Larger values of n result in surfaces that appear more shiny. Additional detail about the lighting model used in VAPOR can be found in [14]. Figure 2 shows the difference between a lighted and unlighted volume rendering of a current sheet roll (discussed in section 4 of this paper).

Volume rendering user interface.
The construction of a transfer function that exposes and highlights scientifically interesting features of a 3D field, displayed as a 2D image, is a nontrivial task, and the development of new methods for transfer function generation is an active research area [15]. Transfer function construction in VAPOR is performed with a graphical editor that enables the user to specify a piecewise-linear mapping from data values to opacity and colour values. Users specify the location of control points (for either colour or opacity) numerically or interactively using the mouse. Important features in the data can be distinguished by applying higher opacity and/or by specifying an identifying colour value at specific control points. A histogram of the data distribution is presented in the transfer function editor so that the user can align graphical features with data values, using the frequencies of the data values shown by the histogram to identify important data intervals. Figure 3 illustrates a simple transfer function that has been used to distinguish regions with strongly positive and negative vertical momentum in a 3D simulation of turbulent compressible thermal convection.

Field-line integration.
To aid in understanding vector field data, VAPOR provides the capability of integrating and displaying field lines. VAPOR supports both steady (time-invariant) 9 DEUTSCHE PHYSIKALISCHE GESELLSCHAFT Figure 3. Histogram assisted transfer function editor. Displayed is the vertical momentum from a simulation of fully compressible turbulent convection. Positive and negative values have been assigned blue and orange colour tones respectively. The range of data values displayed is limited by the opacity value assigned, with very low data values having been eliminated from the scene by the assignment of zero to the opacity. and unsteady (time-varying) field-line integration. Field lines are calculated by adaptive line integration [16] using a fourth-order Runge-Kutta scheme. The values used in the integration are determined by a tri-linear interpolation of field values over cells in the grid; and, for timevarying integration, the field values are linearly interpolated between time steps.As the integration proceeds, the interval size is iteratively doubled or halved to ensure that the angular change between successive line intervals stays between 3 • and 15 • . Users can increase or decrease the integration accuracy by controlling the minimum and maximum length of the integrated line intervals. At the lowest accuracy level (0.0) the interval size lies between 4 and 10 grid cells; at the highest accuracy level (1.0), the step size is between 0.05 and 0.25 grid cells.

Field line visualization user interface.
The field-line integration starts at a set of seed points. The seed points can be explicitly positioned in space and time by the user. Explicit placement of individual seed points is facilitated with a data probe. The probe can be used to interrogate specific locations in the domain for their data value, construct 2D images (contour planes) of data values in the grid, and position points within such contour planes. The probe plane can be positioned and rotated arbitrarily within the data volume. Alternatively, the spatial positioning of seed points can be controlled by a rake, a 1D, 2D, or 3D axis-aligned box specifying the bounds of a region within which the seeds are either randomly or uniformly distributed. The rake is useful when seeding the region for a quick overview of the flow within it, while the probe integrating the momentum field in both positive and negative directions, starting with a set of seed points. Field line colours in this plot indicate distance along each flow line, but can also be chosen to be constant, to differ for each individual field line, or to display the magnitude of any field variable along the line. As described in the text, the seed points can be placed randomly in the vortex region using VAPOR's rake tool (left) or explicitly inserted in a cross-section of the core region of highest vertical vorticity using the probe tool (right).
allows precise control of seed placement. Figure 4 illustrates the use of these seeding options. Finally, a list of seed positions can be provided in a file by the user. This further increases the flexibility of the field line visualizer, particularly when used in combination with VAPOR's analysis tool coupling capabilities, by allowing seed distributions based on quantitative properties of the solution.

Instabilities of compressible thermal starting plumes
The first of two sample applications of VAPOR which we consider is the visualization and analysis of 3D compressible thermal starting plumes. The dynamics and stability of thermal starting plumes is important to turbulent transport in a variety of settings (e.g. [17,18]), but the work presented here is primarily motivated by solar and stellar envelope convection. Convection in the outer layers of stars is dominated by three physical attributes, very significant mean stratification of the atmosphere (the density of the solar convective envelope, for example, increases by a factor of nearly a million between the top and bottom), ionization of the principal plasma components (two-thirds of the energy transported by convection near the top of the solar convection zone is carried as hydrogen ionization energy), and a strongly temperaturedependent radiative opacity (dropping by 5.5 orders of magnitude from its maximum in the outer 3% of the solar convection zone). These characteristics lead to vigorous new downflow plume formation in the solar photosphere (e.g. [19]- [22]). The plumes persist with depth and, in numerical simulations, extend to the bottom of the computational domain. This apparent stability of the plumes with depth has inspired a number of authors to consider them essential to the understanding of thermal and angular momentum transport in the overshoot region below the solar convection zone, with consequent implications for the magnetic dynamo mechanisms thought to operate there (e.g. [23]- [28]).
But do such plumes actually extend to the base of stellar convective envelopes, as suggested by simulations? No observations, helioseismic or otherwise, are able to resolve downflow plumes within the solar interior, and no numerical simulation of an extended convective domain is able to resolve the secondary instabilities of the individual downflow plumes found within it. Rather than realistically modelling a broad region of solar or stellar convection, at the expense of underresolving the individual downflows within, we focus here on the dynamics of a single thermal starting plume generated by locally cooling the upper boundary of an otherwise adiabatically stratified layer. We employ a highly nonuniform grid, concentrated around the plume, in order to resolve any secondary instabilities of the flow which might be present.

Formulation
We consider a single thermal starting plume descending through an adiabatically stratified polytropic [29] layer of ideal gas. The fluid is confined between stress fress, impenetrable, and constant temperature horizontal boundaries and within a horizontally periodic domain. The atmosphere spans 6.0 pressure and 3.6 density scale heights, with corresponding pressure and density contrasts across the layer of 401 and 36.5. To this quiescent and neutrally stable polytropic atmosphere we apply a Gaussian temperature perturbation. The perturbation is cool, centred in the upper boundary plane, and ramped up in amplitude over a short time interval (two time units, where time is measured as the upper-boundary isothermal sound crossing time over the perturbation full width at half maximum) and subsequently held at a constant value through the remainder of the calculation. The full domain dimensions are 20 × 20 × 40, in units of the Gaussian perturbation's full width at half maximum, and the solution is advanced for 140 time units. Fluid motions develop within the domain as the thermal anomaly imposed on the boundary spreads inward. The motions are determined by numerical solution of the fully compressible equations of mass, momentum, and energy conservation for a medium with constant dynamic viscosity and thermal conductivity. The exact form of the equations solved can be found in [30], with values of the parameters corresponding to case E of that paper. Only the gridding parameters and plume onset time have been slightly altered. The horizontal extent of the 3D domain has been halved but 504 grid points were employed, instead of the corresponding 512 grid points used over the same physical range in the 2D calculation, and the boundary perturbation was ramped up over a small interval of time to its final value, as explained above, rather than being imposed instantaneously as was done in the 2D study.
The 3D solutions to the initial value problem outlined above were computed using fully explicit second-order finite-difference approximations to the spatial derivatives and thirdorder Runge-Kutta time integration [31]. A 504 × 504 × 2048 nonuniform grid was employed (arctan deformation), with the plume horizontally occupying only the very central portion of a horizontally periodic domain. Horizontal and vertical grid stretching were done independently with the two horizontal directions, x and y, treated identically. The nonuniformity of the grid is illustrated by figure 5 which shows on the left the fully developed plume (enstophy field The vertical boundaries of the domain are displaced a significant distance from the plume core (10 units to either side) to minimize their effect on the solution. We shall see that these boundaries nonetheless continue to influence the results.

Summary of previous 2D results
Similar studies have already been carried out for 2D planar plumes. In those studies the thermal starting plume takes the form of a laminar stem flow with a leading vortex cap. The structure is subject to two instability mechanisms (figure 6): successive pinch detachment of the leading vortex pair and sinuous instability of the plume stem [30].
The pinch instability was identified in [30] to be the result of dynamical pressure fluctuations behind the vortex cap structure at the head of the downflow plume. Since the medium is compressible, as the fluid in the cap spins up positive pressure fluctuations behind the cap impede flow into the vortex pair from the stem behind. This initiates secondary head formation within the stem and detachment of the leading pair. The process repeats itself, with successive detachment of vortex pairs each weaker than the preceding and travelling downward throughout the layer at a slower rate. The instability was observed over a range of thermal Prandtl numbers In a quiescent background, the most prominent leads to successive pinch detachment of the leading vortex pair (left). A sinuous shear mode of the stem flow can be excited by weak thermal fluctuations in the background medium (right). Images from [30].
with slight modification. Interestingly, recent laboratory experiments [32] have suggested that a similar pinch off process occurs in the 3D thermal starting plumes in the absence of significant compressibility. From the published work however, it is not clear how completely the leading vortex ring detaches, nor does the process occur repeatedly as observed in the 2D compressible plume simulations. Nonetheless, it may well be that the mechanism leading to pinch detachment in the compressible thermal starting plume is not the only one which can cause such detachment.A model for vortex ring formation at the front of a buoyant starting plume [33] raises the possibility that the pinch process is sensitive to the rate of plume initiation. This may have significance for the 3D numerical results to be discussed below, since the rate of plume initiation in these was slower than that in the 2D studies.
The 2D plume studies were conducted both with and without random background temperature perturbations of amplitude 10 −3 . When the background was quiescent (no initial random perturbations), only the pinch instability was realized. This is because the pinch detachment was not the result of the linear varicose mode of the stem shear flow, to which it bears some resemblance (such a mode likely grows at a rate slower than the linear sinuous mode to be discussed next), but instead an intrinsically nonlinear development due to Reynolds stresses. By contrast, when the domain into which the starting plume descends is seeded with weak random temperature fluctuations, a linear sinuous mode of the stem shear flow develops on a timescale somewhat slower than the nonlinear pinch which already occurs before the stem shear flow profile is fully developed. Figure 6 displays the difference between the solutions when the background is quiescent (left) or not (right). By conducting a local analysis of the stem flow profile, it can be shown that the sinuous instability of the thermal starting plume stem behaves as a convective linear shear mode, modified by the stratification of the domain [30].  figure 6, yet the behaviour is dramatically different. The background through which the plume is descending is quiescent in the simulation illustrated by the time series on the left and is seeded with low amplitude temperature fluctuations in that illustrated by the series on the right. Rather than pinch detachment of the leading vortex ring, down flowing fluid from behind passes through the vortex torus core and disrupts it.

Summary of 3D results
The goal of the 3D compressible thermal starting plume studies presented here is to determine whether either or both of the instabilities identified in the 2D flows occurs in 3D flows as well. For this purpose case E of [30] was duplicated as closely as possible (in a horizontally reduced spatial domain, as discussed above) both with and without randomly seeded background temperature fluctuations (as above). Visualizations of the temporal development of the flows under the two conditions are displayed in figure 7. It is immediately evident that the late time evolution of the 3D plume is fundamentally different from that of the 2D one.
The early stages of the 3D plume, by contrast, do not differ markedly from those of the 2D plume. A leading vortex torus, analogous to the leading vortex pair in the 2D, quickly develops after plume initiation and propagates downward through the domain ahead of the stem flow. The pinch process initiates, as in the 2D, with secondary head formation in the stem region behind the main vortex torus. Evidence for as many as three vortex regions, the leading torus and two weaker embedded tori in the stem, is seen, as it was in the 2D flow. Subsequent development, however, departs significantly from that expected. The pinch detachment never occurs, even when the background medium is quiescent. Instead, the fluid behind the leading vortex torus descends through it and disrupts it. The process repeats, eventually turning the head of the plume into a mass of vortex filaments ( figure 7). This significantly altered behaviour seems to have its origin in two properties of the 3D plume, both of which can be related to its 3D geometry. The amplitude of the vorticity within the leading vortex torus, and within each of the subsequent secondary tori, is much smaller than that achieved in the analogous 2D vortex pairs. While the thermal perturbation applied to the upper boundary of the 3D domain has the same amplitude as that applied in the 2D one, the downflow induced is columnar rather than planar and so the return flow in the torus can be of much smaller amplitude while still achieving mass conservation. This in turn means that the dynamical pressure fluctuations behind the torus are much lower than those behind the 2D vortex pair. Moreover, the overall structure of the plume head is fundamentally different in the 3D. Figure 8 displays a time series of volume renderings of the y component of the horizontal vorticity with the front surface of the volume being a plane of constant y through the centre of the plume. These VAPOR renderings clearly show that, unlike for the 2D flow, the initial head structure and the subsequent secondary vorticity structures behind it all contain a relatively strong counter rotating torus out in front of the main vortical component. It is this torus pair that nearly detaches in the 3D solution. Detachment fails because the main vortex component, responsible for the pressure fluctuations behind it which stagnate the oncoming plume stem flow, is weakened by viscous interaction and mixing with the component out front. Pressure fluctuations behind the main component decrease in amplitude allowing the previously retarded stem flow behind to accelerate through the torus centre. Geometry plays a critical role. While the value of the dynamic viscosity is identical in the 2D and 3D simulations, dissipation in the 3D case is much more effective because surfaces rather than lines of shear are generated.

Difficulties and uncertainties
These 3D simulations raise interesting questions about the role of viscosity in compressible plume stability. It appears that compressible thermal plume vortex ring detachment is governed by a delicate balance between induced dynamical pressure fluctuations and vorticity dissipation. The rapidity of plume onset, the magnitude of the thermal perturbation driving the plume, and the viscosity of the medium all likely play a role in determining the form of the instability realized. This is important because the structure of the consequent turbulent plume is sensitive to the underlying instability mechanism. Coherent turbulent plumes (conical in shape when stratification is weak) are favoured by leap-frog interactions between vortices, while the pinch instability, in the presence of stratification, yields isolated detached vortical structures. The limited set of 3D calculations performed so far is unable to address which of these variants is more likely to occur in the very inviscid highly stratified solar convective envelope.
Other, fundamental difficulties arise. Adaptive grid methodologies (in contrast to the fine but fixed grid methods used in the work presented here), when applied to the 2D compressible plumes, tend to produce small-scale shear instabilities along the plume stem even when the background through which the plume descends is quiescent. Whether this is a more or less accurate representation of the actual flow which would be physically realized is currently unclear. Additionally, the form of the 3D instability in the simulations discussed here is affected at late times by the rectangular nature of the computational domain. Weak differences in horizontal inflow velocity at the plume position occur depending on the distance to the domain boundary, even when that boundary is placed many plume distances away. The far field of the flow feels only the pressure field which varies depending on the distance between the plume centre and the horizontally periodic boundary, that being greater in the directions of the domain corners. This imposes a weak angular perturbation on the solution which effects its late time evolution (figure 9).
The questions raised, and preliminarily addressed, here require mechanistic not statistical studies for their answer. It is these types of studies for which VAPOR is a particularly useful tool. Local flow properties, force balance, and stability can be interactively interrogated. Since the site of instability is not a priori known, region-of-interest selection from the vast spatial-temporal data volume is essential.

Equations and numerical simulations
In this section, we use VAPOR to explore and analyse the properties of small-scale structures in 3D MHD turbulence. The MHD equations extend the momentum equation for a neutral fluid to include the coupling of a conducting fluid to magnetic fields [34]- [36]. This approximation is often used in astrophysics and geophysics to study the dynamics of conducting fluids as well as the large-scale dynamics of plasmas, e.g. in planetary cores, planetary magnetospheres, the solar wind, the solar convective region, and in the interstellar medium.
The MHD equations are known to develop, from smooth initial conditions, thin current and vorticity sheets. In these regions, magnetic reconnection takes place, a process in which the connectivity of magnetic field lines changes and magnetic energy is rapidly dissipated. In 2D [37,38] regions of strong current density and reconnection are associated with neutral x-points of the magnetic field, and quadrupolar structures in the vorticity. In 3D, the geometry of these regions is more complex, and it has been shown that sheets can form without the need of magnetic null points [39]. Moreover, at moderate Reynolds numbers these structures were observed to render the flow locally 2D.
We consider here data stemming from a 1536 3 free decaying MHD simulation. The incompressible MHD equations in dimensionless Aflvénic units [35] read Here, v is the velocity field, b is the magnetic induction, j = ∇ × b is the current density, P is the pressure, ρ 0 is the density (assumed constant in this section), ν is the kinematic viscosity, and η is the magnetic diffusivity. Equation (8) is the momentum equation with the term j × b (the Lorentz force) taking into account the force exerted by the magnetic field in the flow. For simplicity in this section, we consider incompressible flows, and so the continuity equation reads ∇ · v = 0 (ρ 0 = 1). The pressure P is the result of a Poisson equation obtained by taking the divergence of equation (8) and using the fact that the velocity field is divergence-free.
Equation (9) is the induction equation and derives from Maxwell's equations for subrelativistic velocities (displacement currents are neglected). The equation is accompanied by ∇ · b = 0, indicating the lack of magnetic monopoles. Equation (9) expresses simple physics: magnetic field lines in a conducting fluid behave as material lines, which are advected and stretched by the velocity field. This fact, and the presence of the Lorentz force in the momentum equation (8) (which acts as a restoring force for the fluid), imply the equations support waves (known as Alfvén waves) with u = ±b. It is worth remarking that these waves are also exact nonlinear solutions of the ideal (ν = η = 0) MHD equations. When u = ±b, all the nonlinear terms in equations (8) and (9) cancel, and there is no nonlinear transfer of energy to smaller scales.
The MHD equations have three ideal quadratic invariants under proper boundary conditions (e.g. periodic boundaries): where E is the total energy (kinetic plus magnetic), H C is the cross helicity, and H M is the magnetic helicity. The vector potential A is defined as b = ∇ × A. While in MHD turbulence the total energy has a direct cascade to small scales (as in hydrodynamic turbulence), the magnetic helicity can display an inverse cascade and be transferred to the largest scales available in the system. The cross helicity can be considered as a measure of alfvenization, since in a flow with u = ±b the absolute value of the cross helicity is maximal for fixed amplitude of the fields. For convenience, we can introduce the relative cross helicity ρ C = 2H C ( |v| |b| ) −1 , and the relative magnetic helicity ρ M = 2H M ( |A| |b| ) −1 . With these definitions, ρ C and ρ M can take values between −1 and 1. Note that while the magnetic helicity H M is gauge invariant, the relative magnetic helicity ρ M is not and we will consider the vector potential A in the Coulomb gauge. Equations (8) and (9) are solved using a pseudospectral method [40,41]. The MHD equations are integrated in a cubic box of side 2π with periodic boundary conditions. The equations are evolved in time using a second-order Runge-Kutta method, and the 2/3-rule is used for dealiassing [42]. The initial conditions for the run are given by where it is assumed complex conjugate terms are added to have real fields, v ABC denotes a helical ABC flow (see e.g. [43]) and φ v,k and φ b,k are random phases with Gaussian distribution. The amplitudes A(k) are chosen so that the kinetic and magnetic energy spectra ∼ k −3 exp [− 2(k/k 0 )] 2 at t = 0. In practice, the random modes have to be chosen to have divergence-free velocity and magnetic fields. Also, the phases for the v and b fields are chosen to have negligible correlation between the two fields. As a result of the superposition of helical and random components, the initial fields are helical (although not maximally helical), and there are no null points in the fields. At t = 0, E = 1 (E V = E M = 0.5 are respectively the kinetic and magnetic energy), ρ C ≈ 1 × 10 −4 , and ρ M ≈ 0.7. The system is allowed to freely decay in time, and no external forcing is applied. The kinematic viscosity and magnetic diffusivity are ν = η = 2 × 10 −4 , and these numbers can be considered in these dimensionless units as the reciprocals of the large-scale Reynolds numbers of the flow. The MHD equations (8) and (9) are solved on a regular grid using 1536 3 grid points. The code is parallelized using MPI, and the simulation we discuss here was done using 512 processors with 2 GB per processor on an IBM cluster with POWER5 processors. Storing in single precision one snapshot of one Cartesian component of the fields for visualization purposes requires 13 GB. As a result, one snapshot of the three components of the velocity and the magnetic field requires 78 GB of storage. In practice, we also store at selected times the three components of the current density, the vorticity ω = ∇ × v, and the vector potential, to speed up the analysis. The evolution of the system was computed up to t ≈ 2 and eight snapshots of the fields at different times were stored, totalling 1.3 TB of data. Visualization and analysis of these outputs were performed using VAPOR on desktop computers.

Visualization and data analysis
As previously mentioned, 3D simulations of turbulent MHD flows at moderate resolutions and Reynolds numbers showed that thin current and vorticity sheets are spontaneously formed as the system evolves in time. However, unlike the evolution of vorticity sheets in hydrodynamic turbulence, these structures do not roll to create filaments or 'tubes'. On the other hand, recent observations in the magnetosheath [44] reported the presence of structures reminiscent of 'Alfvén vortices' [45], cylindrical current tubes with the velocity and magnetic fields aligned (although with different amplitudes). The origin of these structures is unclear, and can be either attributed to kinetic plasma effects or to turbulence.
Recently, the spontaneous rolling of current sheets in turbulent MHD flows to create cylindrical structures was reported in [46]. Large resolutions and Reynolds numbers were required, since at moderate Reynolds magnetic tension tends to prevent the rolling of the sheets, and instead structures are locally disrupted [39]. Here, we use VAPOR to study in detail the geometrical properties of these structures, their early development, and their ensuing evolution.
Details of the global evolution of the system can be found in [46]. Here we list briefly the features that are relevant to the following analysis. The system starts from the smooth initial conditions at t = 0 and as time evolves small-scale current density and vorticity are produced. From t = 0 to t ≈ 0.6, the maxima of the current density and vorticity grow exponentially. This stage corresponds to a linear growth of unstable modes. Then, from t ≈ 0.6 to t ≈ 1.1 the maxima of the current density and vorticity grow as a power law ∼t 3 . This stage corresponds to a nonlinear self-similar process in which the current and vorticity sheets that contribute to the maxima get thinner and more intense. Finally, the maxima of the current density and vorticity saturate, while the global kinetic and magnetic energy dissipation rates keep increasing as more and more structures are formed in the flow. Figure 10 shows a rendering of the current density intensity in a slice of 1536 × 1536 × 150 grid points at t ≈ 1.6, after saturation of the maxima of the current density and vorticity takes place. At this resolution, it is impossible to allocate the entire array in memory for visualization, and as a result the hierarchical representation of data in VAPOR is extremely useful to explore the data and recognize structures. From left to right, visualizations at different resolutions are shown. At 1/32 of the resolution, a few regions of strong current density can be recognized, and as the resolution is increased it can be seen that current sheets permeate the entire volume. Moreover, at 1/4 of the resolution, the occurrence of folding and rolling of the current sheets can already be observed (see e.g. the white box to the right of figure 10).
The amount of information preserved (or lost) when visualizations at lower resolution are done can also be shown by computing probability density functions (pdfs). Figure 11(a) shows pdfs of j z , the z-component of the current density in the region shown in figure 10 at full resolution, 1/2, 1/4, 1/8, 1/16 and 1/32 resolution. The pdfs at full resolution, 1/2, and 1/4 resolution show the usual stretched exponential tails often associated with intermittency: strong events in the tails of the pdfs have a large probability of occurrence (compared against a Gaussian distribution). As the resolution is lowered further, the width and amplitude of the tails decrease significantly and the pdf of j z at 1/32 resolution is closer to Gaussian. This is to be expected since strong intermittent events take place in thin current sheets that cannot be resolved at the lowest resolutions.
In the following, we focus on the structures in the region indicated by the white box in figure 10. The evolution in time of this region at full resolution is shown both in figures 11(b) and 12. Figure 11(b) shows pdfs of j z in the region at different times. At early times the pdf is close to Gaussian and before t ≈ 0.6 no strong tails are observed. As the self-similar regime starts and nonlinearities prevail, the tails in the pdf grow, and at t ≈ 1.1 the pdfs are far from Gaussian. However, after this time the tails keep increasing and they saturate about t ≈ 1.5.  The counterpart of this evolution can be seen in the visualizations of current density intensity in figure 12. Up to t ≈ 1.1, current sheets get thinner and more intense, and the geometry of the structures is similar to what was reported in [39] at smaller Reynolds numbers. But after t ≈ 1.1 the current sheets fold and roll. Two of these structures can be seen at t ≈ 1.6, rolling in opposite directions. Figure 13. Above: visualization of current density intensity at t ≈ 1.6 (left) and magnetic field lines superposed to the current density intensity (right). Below: vorticity intensity at the same time (left) and velocity field lines superposed to the vorticity intensity (right).

DEUTSCHE PHYSIKALISCHE GESELLSCHAFT
In 2D, the vorticity in the vicinity of current sheets has a quadrupolar structure [37]. In 3D at moderate Reynolds numbers, it was observed that vorticity tends to form sheets with the same geometry than the current sheets [39]. What is the geometry of regions with strong vorticity here? Figure 13 shows visualizations of the current density and vorticity intensity. Current and vorticity sheets have the same shape and are collocated. Regions of strong magnetic field gradients coincide with regions of strong velocity field gradients. Figure 13 also shows magnetic and velocity field lines, superposed to the visualizations of current and vorticity intensity, respectively. For convenience, intensities are made transparent using the transfer function editor in VAPOR, so velocity field lines can be seen across the structures. Magnetic field lines are aligned with the rolls. This is to be expected since magnetic field lines along this direction cannot exert any tension to prevent the rolling from occurring. Velocity field lines are more complex. A large component of the velocity field can bee seen along the direction these structures are advected. But close to the current and vorticity sheets strong shear can be recognized. Moreover, field lines close to the roll in the centre of the region are entangled and deformed, and a component of the velocity field along the direction of the roll develops. This is important for the alignment of the two fields inside the current sheets, as will be shown next.
Magnetic reconnection, easier to characterize in 2D, is more difficult to study in 3D. Several definitions of the process for 3D flows have been attempted, e.g. as the process where plasma flows across a thin sheet that separates regions with different magnetic field topologies [47], or as the process by which the connectivity of magnetic field lines changes (see e.g. [38]). In this context, magnetic helicity is in 3D a quantity of interest. This ideal invariant is a topological quantity proportional to the number of links in the magnetic field lines. Figure 14 shows a visualization of the density of relative magnetic helicity ρ M (x) = A · b(|A||b|) −1 where A is in the Coulomb gauge. Only regions with |ρ M (x)| > 0.5 are shown using the transfer function editor in VAPOR, and regions with ρ M (x) = ±1 (1 corresponding to blue, −1 corresponding to red) are strongly helical. Note the two strong current sheets (in the upper left corner, and in the centre of the region) are separating regions of opposite sign of relative magnetic helicity. It can also be seen that in this subvolume positive ρ M dominates over negative ρ M , as can also be seen from the pdf of relative magnetic helicity in figure 15(a). Figure 14 also shows the relative cross helicity ρ C (x) = v · b(|v||b|) −1 . This quantity is the cosine of the angle between the velocity and magnetic fields as a function of space. Both current rolls show strong alignment between the two fields, and the current sheets are collocated with the local maxima of relative cross helicity. The current roll in the centre of the image has positive cross helicity, while the current sheet in the top left corner has negative cross helicity. However, although aligned, the fields in the current sheets are not alfvenic (u = ±b), since the amplitudes of the two fields are different. In this region negative cross correlation dominates over positive, as can be seen from the pdf of ρ C (x) in figure 15(b). Also, although alignment of the flows is rather strong in the vicinity of the sheets, globally the flows remain uncorrelated and ρ C ≈ 4 × 10 −4 at t ≈ 1.6.
The geometry and evolution of these structures is reminiscent of the MHD Kelvin-Helmholtz instability [48]- [50]. This instability has been observed before in numerical simulations and in the Earth's magnetosphere, although in simulations it was reported only in studies in which the initial profiles of the velocity and magnetic field were adjusted to observe the instability, and not spontaneously occurring in a turbulent flow. The difference is crucial, because in turbulent MHD flows it was believed magnetic tension would prevent the formation of such structures at small scales. To the best of our knowledge, the formation of these structures in a turbulent flow was reported for the first time in [46], and the identification of the structures at this resolution was facilitated by VAPOR. As in the previous section, the use of VAPOR to explore and analyse huge datasets enabled us to study geometrical properties of the small-scale structures, and their dynamical evolution. This significantly augments the usual set of statistical and spectral tools used to analyse data from numerical simulations of geophysical and astrophysical flows.

Discussion
Both of the analyses described in the preceding sections were conducted using relatively modest computing resources. The authors each had access to a Linux workstation, equipped with a commodity graphics card, and a high-bandwidth storage system with sufficient capacity to contain an entire simulation output as well as any derived quantities needed. We contrast this to the massively parallel computing clusters required to perform the initial simulations. Despite this computing capability mismatch and the interactive processing demands inherent in analysis work, we were able to successfully carry out our investigations.
Combining visualization with quantitative tools, in this case ITT's IDL, was instrumental in coping with the huge datasets. By employing visual identification to cull unessential regions of the data, attention and computing resources were focused on pertinent structures, localized in space and time. This was true for both the readily identifiable vortex structure of the compressible plume head and the more difficult to detect current roll embedded in the MHD study. Once structure identification was made and a reduced ROI chosen, the ability to selectively coarsen or refine the computational grid further aided our work by significantly reducing the time spent waiting for results. Full resolution volume rendering or analysis of either dataset discussed, while possible, is frustratingly slow. Yet, as we have seen, coarse approximations of the data (which may be far more readily explored interactively) still possess sufficient information to preserve, and allow the identification of, small but relevant structures. Full resolution analysis needs to be applied only to a significantly reduced sub-domain of data.
While volume rendering of scalar quantities proved to be an extremely useful way of exploring both datasets considered, other tools, such as field-line integration and lighting adjustment, also contributed to the identification and understanding of structures and their evolution. In section 4, field-line integration and visualization in conjunction with volume rendering was used to clarify how the structures are locally aligned with the velocity and magnetic fields. Lighting, although decreasing the overall speed of the renderer, proved useful in identifying small perturbations in the pressure field (with as yet undetermined but perhaps significant implications for overall stability) of the plume study in section 3. Lighting also proved invaluable in highlighting structural borders, determining the relative positions of structures, and producing publication-quality figures. When rendering speed is critical, lighting can be employed intermittently as needed for clarification.
Notwithstanding the progress we have made in the investigation of high resolution numerical simulation output, challenges still remain. The analysis model we have presented-visual detection of ROIs that may be subsequently processed with more quantitative methods-works well provided the features of interest are large enough to be preserved at coarsened resolutions. For fine-scale structures such as vortex tubes in hydrodynamic turbulence or current sheets and rolls in MHD turbulence, where features approach the grid resolution, our wavelet-based transforms, which applies a low-pass filter to the data, may be less useful. Other transforms, which preserve local maximum values, for example, may be more beneficial with grid cell sized structures. The lack of scalability of our quantitative tool of choice posed another challenge. IDL is currently limited to running on computing systems with a shared memory address space. This places an upper bound on the size of the arrays that can be processed without investing substantial coding efforts to develop so-called out-of-core algorithms, capable of operating on data larger than the physical memory size.
Despite these limitations, we have found our methods to be useful in the exploration and analysis of very large datasets. Anticipated future improvements are outlined in section 6 but the core capabilities of VAPOR appear robust. The ability to interactively explore and analyse flows, identify structures, follow their dynamical evolution, compute new derived quantities, and seamlessly integrate these into an ongoing VAPOR session, all complement and expand the set of statistical and spectral tools often used to analyse the output from large-scale numerical simulations. In many cases a deeper physical understanding of the evolving flow requires local analysis of the structures produced by and within it. This is a task that becomes increasingly difficult with each advance in supercomputer capabilities. The two datasets discussed here are a good example of this. Neither represents the largest computation currently possible on the most advanced machines of our day, yet both presented significant hurdles to analysis efforts. VAPOR enabled data exploration that may have been otherwise very much more difficult.

Future work
VAPOR's approach to large data handling is demonstrably effective with terabyte-size datasets. In practice, computational grids on the order of up 2048 3 are manageable with relatively modest computing resources. However, pushing beyond these resolutions, as some computational physicists already have (e.g. [51]), will require additional measures. Parallel implementations may open the door to significantly larger datasets, but only at the cost of requiring substantially more complex and costly computing resources. Lossy compression, readily afforded by the wavelet data encoding scheme currently employed, may also provide incremental improvement without the need for more hardware, particularly if the wavelet encoding process is performed in tandem with the simulation, and provided the desired analysis tolerates a degree of information loss. We intend to explore both of these avenues in our future work.
In examining time-varying MHD datasets, it is useful to understand how magnetic field lines evolve in a changing velocity field. We are implementing a field-line advection algorithm that will enable users to visualize the motion of field lines (e.g. magnetic) under the influence of a time-varying velocity field. The scheme works by first identifying points on steady field lines with maximal field magnitude. These points are advected in the velocity field and their positions at the next time step are used to calculate a new set of lines. Visualization of the field lines advected in this way can provide useful information about the evolving field in the presence of flow, provided that the diffusion of the field being visualized by the lines is small.
VAPOR currently limits itself to Cartesian grids with uniform sampling. Yet more general computational meshes are widely found in computational fluid dynamics and MHDs. For example, the data presented in section 3 was computed on a structured but highly nonuniform grid and required resampling before investigation. Such resampling is time consuming and error introducing. Many local domain models employ such nonuniform gridding to resolve processes inherent to boundary layers. Spherical grids, prevalent in atmospheric research and in global solar and other astrophysical computations, pose yet another challenge. Adaptive meshing techniques and the use of higher-order unstructured grid elements are becoming more common place as a means to improve computational efficiency and reduce dataset sizes. These more generalized meshes pose challenges throughout the VAPOR pipeline, from progressive data access schemes to visualization algorithms able to understand and efficiently operate on complex mesh elements. VAPOR will need to evolve to seamlessly support a wider collection of data types.